Fault detection method for batch process based on deep long short-term memory network and batch normalization

doi:10.11772/j.issn.1001-9081.2018061371

Abstract

Traditional data-driven fault detection methods for batch processes often require assumptions about the distribution of the process data, and they tend to produce false alarms and missed detections when monitoring nonlinear or otherwise complex data. To address this problem, a supervised learning method that combines a Long Short-Term Memory (LSTM) network with Batch Normalization (BN) was proposed; it requires no assumptions about the distribution of the original data. First, the raw batch process data were preprocessed by variable-wise unfolding and continuous sampling so that the processed data could be fed to the LSTM units. Then, an improved deep LSTM network was used for feature learning: by adding a BN layer and using a cross-entropy loss, the network extracted the features of the batch process data effectively and learned quickly. Finally, a simulation experiment was performed on a semiconductor etching process. The experimental results show that, compared with the Multilinear Principal Component Analysis (MPCA) method, the proposed method identifies more types of faults and reaches an overall fault detection rate above 95%. Compared with the traditional single-layer LSTM model, it is faster to build and raises the overall fault detection rate by more than 8 percentage points. The method is therefore well suited to fault detection in batch processes with nonlinear behavior and multiple operating conditions.

Key words: data-driven, deep learning, Long Short-Term Memory (LSTM) network, batch process, fault detection
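
The steps named in the abstract (variable-wise unfolding, continuous sampling, a BN layer, and a cross-entropy loss) can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the function names, the window length, the assumption that all batches have equal length, and the toy data are hypothetical, and the LSTM layers themselves are omitted. The abstract does not state how the windows are cut; here they are taken within each batch so that no window straddles two batches.

```python
from math import exp, log, sqrt

def unfold_variable_wise(batches):
    """(I batches, K samples, J variables) -> (I*K) rows of J variables."""
    return [list(sample) for batch in batches for sample in batch]

def continuous_windows(rows, batch_len, window, stride=1):
    """Cut overlapping sequences of `window` consecutive samples per batch."""
    windows = []
    for start in range(0, len(rows), batch_len):
        batch = rows[start:start + batch_len]
        for s in range(0, len(batch) - window + 1, stride):
            windows.append(batch[s:s + window])
    return windows

def batch_norm(features, gamma=1.0, beta=0.0, eps=1e-5):
    """Training-mode BN: normalize each feature over the mini-batch."""
    n = len(features)
    cols = list(zip(*features))
    means = [sum(c) / n for c in cols]
    variances = [sum((x - m) ** 2 for x in c) / n for c, m in zip(cols, means)]
    return [
        [gamma * (x - m) / sqrt(v + eps) + beta
         for x, m, v in zip(row, means, variances)]
        for row in features
    ]

def softmax_cross_entropy(logits, label):
    """Cross-entropy of softmax(logits) against an integer class label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + log(sum(exp(z - m) for z in logits))
    return log_sum - logits[label]

# Toy data (hypothetical): I=2 batches, K=4 samples, J=2 variables.
batches = [
    [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]],
    [[5.0, 50.0], [6.0, 60.0], [7.0, 70.0], [8.0, 80.0]],
]
rows = unfold_variable_wise(batches)
windows = continuous_windows(rows, batch_len=4, window=3)

normalized = batch_norm([[1.0, 2.0], [3.0, 6.0]])
loss = softmax_cross_entropy([0.0, 0.0], label=0)
```

Each element of `windows` is a short sequence of consecutive samples, which is the kind of input shape a recurrent layer expects. In a full model, the BN step would sit between network layers and the loss would be computed on the class scores for normal and faulty conditions.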

CLC Number: TP277

WANG Shuo, WANG Peiliang. Fault detection method for batch process based on deep long short-term memory network and batch normalization[J]. Journal of Computer Applications, 2019, 39(2): 370-375.

王硕, 王培良. 基于深层长短期记忆网络与批规范化的间歇过程故障检测方法[J]. 计算机应用, 2019, 39(2): 370-375.
