To address the challenge that engine operation data are multi-modal and effective engine life prediction is therefore difficult to achieve, a parallel-branch engine life prediction method based on multi-modal features was proposed, which integrates the latent relationships between images and engine operation time-series data. First, a sliding window was used to segment the engine operation data into sequence samples, and a Gramian Angular Field (GAF) was used to convert these sequence samples into images. Then, the sequence samples and the images were processed by a Bi-directional Long Short-Term Memory (BiLSTM) network and a Convolutional Neural Network (CNN), respectively, to extract latent inter-sensor relationship features such as trends and cycles. Finally, a Cross-Attention Mechanism (CAM) was introduced to fuse the features of the two modalities and predict the remaining life of the engine. Experimental results on the public C-MAPSS dataset show that the method achieves a coefficient of determination (R²) above 0.99 and a Root Mean Square Error (RMSE) below 1, indicating that the method improves computational efficiency while maintaining prediction accuracy.
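The abstract does not give formulas for the pipeline stages, so the following is a minimal NumPy sketch of three of them under stated assumptions: the sliding-window segmentation, the GAF image conversion (assuming the summation variant, GASF), and single-head scaled dot-product cross-attention. The function names are hypothetical, and the random arrays merely stand in for the BiLSTM and CNN feature outputs, which are not reproduced here.

```python
import numpy as np

def sliding_windows(series, width, stride=1):
    """Segment a 1-D sensor series into overlapping windows (sequence samples)."""
    return np.stack([series[i:i + width]
                     for i in range(0, len(series) - width + 1, stride)])

def gasf(window):
    """Gramian Angular Summation Field of one window.
    Rescale to [-1, 1], map values to angles phi = arccos(x),
    then GASF[i, j] = cos(phi_i + phi_j)."""
    x = 2 * (window - window.min()) / (window.max() - window.min() + 1e-12) - 1
    phi = np.arccos(np.clip(x, -1.0, 1.0))
    return np.cos(phi[:, None] + phi[None, :])

def cross_attention(q_feats, kv_feats):
    """Single-head cross-attention: queries come from one modality,
    keys/values from the other; softmax(Q K^T / sqrt(d)) V."""
    d = q_feats.shape[-1]
    scores = q_feats @ kv_feats.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ kv_feats

rng = np.random.default_rng(0)
# Synthetic sensor signal standing in for one C-MAPSS channel.
sensor = np.sin(np.linspace(0, 8 * np.pi, 200)) + 0.05 * rng.standard_normal(200)
windows = sliding_windows(sensor, width=30)    # sequence samples
image = gasf(windows[0])                       # 30x30 GAF image for the CNN branch
seq_feats = rng.standard_normal((30, 16))      # stand-in for BiLSTM branch output
img_feats = rng.standard_normal((30, 16))      # stand-in for CNN branch output
fused = cross_attention(seq_feats, img_feats)  # fused multi-modal features
print(windows.shape, image.shape, fused.shape)  # (171, 30) (30, 30) (30, 16)
```

In a full implementation, `fused` would feed a regression head that outputs the remaining-useful-life estimate; that head, and the exact GAF variant (GASF vs. GADF), are design choices the abstract leaves open.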