Video semantic detection based on topographic independent component analysis and Gaussian mixture model

doi:10.11772/j.issn.1001-9081.2016.03.770

Journal of Computer Applications, 2016, Vol. 36, Issue (3): 770-773. DOI: 10.11772/j.issn.1001-9081.2016.03.770

KONG Weiting, ZHAN Yongzhao

College of Computer Science and Telecommunication Engineering, Jiangsu University, Zhenjiang, Jiangsu 212013, China

Received: 2015-08-24; Revised: 2015-10-20; Online: 2016-03-17; Published: 2016-03-10
Supported by:
This work is partially supported by the National Natural Science Foundation of China (61170126).

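Corresponding author: KONG Weiting
About the authors: KONG Weiting (孔玮婷), born in 1991, female, native of Nanjing, Jiangsu, M.S. candidate; her main research interest is multimedia. ZHAN Yongzhao (詹永照), born in 1962, male, native of Zhenjiang, Jiangsu, Ph.D., professor, senior member of CCF; his main research interests are human-computer interaction, pattern recognition, and multimedia.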

Abstract

Abstract: To reduce the quantization error introduced by vector quantization in Bag of Words (BoW) models for video semantic detection, and to extract features automatically and effectively, a video semantic detection method based on Topographic Independent Component Analysis (TICA) and Gaussian Mixture Model (GMM) was proposed. Firstly, the TICA algorithm was used to extract features from each video clip, learning complex invariant features. Secondly, the feature distribution of each video clip was modeled by a GMM. Finally, a GMM supervector was constructed from the GMM parameters, and the supervector of each shot was used as the input of a Support Vector Machine (SVM) for video semantic detection. A GMM can be regarded as an extension of BoW to a probabilistic framework; it therefore has less quantization error and better retains the information in the original feature vectors. Experiments were conducted on the TRECVID 2012 and OV datasets. The results show that, compared with the BoW and SIFT (Scale Invariant Feature Transform)-GMM algorithms, the proposed method improves the mean average precision of video semantic detection on both datasets.

Key words: video semantic detection, Topographic Independent Component Analysis (TICA), Gaussian Mixture Model (GMM), Bag of Words (BoW) model, Support Vector Machine (SVM)
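The abstract does not say how the GMM supervector is built, so the following is only a minimal sketch of one common construction, not the authors' implementation. It assumes a shared diagonal-covariance GMM whose means are MAP-adapted to each clip and then stacked after scaling by the mixture weights and variances. The function names, the relevance factor `tau`, and the toy numbers are all hypothetical. The features are assumed to come from TICA, and the resulting vectors would then be passed to an SVM, which is not shown here.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log-density of a diagonal-covariance Gaussian at point x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def posteriors(x, weights, means, variances):
    """Soft assignment of x to every mixture component (sums to 1)."""
    logs = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def gmm_supervector(features, weights, means, variances, tau=20.0):
    """MAP-adapt the shared GMM means to one clip and stack them.

    tau is an assumed relevance factor: larger values keep the adapted
    means closer to the shared (prior) means.
    """
    n_comp, dim = len(weights), len(means[0])
    counts = [0.0] * n_comp
    sums = [[0.0] * dim for _ in range(n_comp)]
    for x in features:
        post = posteriors(x, weights, means, variances)
        for k, c in enumerate(post):
            counts[k] += c
            for d in range(dim):
                sums[k][d] += c * x[d]
    supervector = []
    for k in range(n_comp):
        for d in range(dim):
            adapted = (tau * means[k][d] + sums[k][d]) / (tau + counts[k])
            scale = math.sqrt(weights[k]) / math.sqrt(variances[k][d])
            supervector.append(scale * adapted)
    return supervector


# Toy shared GMM with two components in two dimensions (made-up numbers).
weights = [0.5, 0.5]
means = [[0.0, 0.0], [4.0, 4.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
# Toy per-clip features standing in for TICA outputs.
clip = [[0.1, -0.2], [3.9, 4.2], [2.0, 2.0]]

sv = gmm_supervector(clip, weights, means, variances)
print(len(sv))  # 4 (components x dimensions)
print(posteriors([2.0, 2.0], weights, means, variances))  # [0.5, 0.5]
```

The last line shows the contrast with BoW. A feature halfway between two components contributes equally to both, whereas hard vector quantization would force it into a single visual word. That forced choice is the quantization error the abstract refers to.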

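Abstract (Chinese edition): To address the quantization error in current Bag of Words (BoW) methods for video semantic concept detection, and to extract low-level video features automatically and more effectively, a video semantic concept detection algorithm based on Topographic Independent Component Analysis (TICA) and Gaussian Mixture Model (GMM) is proposed. First, features are extracted from video clips with the TICA algorithm, which can learn complex invariant features of the clips. Second, the visual features of the video are modeled with a GMM to describe their distribution. Finally, a GMM supervector is constructed for each video clip, and a Support Vector Machine (SVM) is used for video semantic concept detection. GMM is an extension of BoW under a probabilistic framework; it reduces quantization error and is robust. On the TRECVID 2012 and OV video datasets, the proposed method was compared experimentally with the traditional BoW and SIFT-GMM methods. The results show that the method based on TICA and GMM improves the accuracy of video semantic concept detection.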

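Key words (Chinese edition): video semantic detection, topographic independent component analysis, Gaussian mixture model, bag-of-words model, support vector machine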

CLC Number: TP391

KONG Weiting, ZHAN Yongzhao. Video semantic detection based on topographic independent component analysis and Gaussian mixture model[J]. Journal of Computer Applications, 2016, 36(3): 770-773.

孔玮婷, 詹永照. 基于拓扑独立成分分析和高斯混合模型的视频语义概念检测[J]. 计算机应用, 2016, 36(3): 770-773.


