http://scholars.ntou.edu.tw/handle/123456789/24747
Title: | Reordering video shots for event classification using bag-of-words models and string kernels | Authors: | Yung-Lun Chen; Shyi-Chyi Cheng; Yi-Ping Phoebe Chen |
Issue date: | Nov-2012 | Abstract: | This paper presents a novel approach to reordering video shots using the state-of-the-art bag-of-words (BoW) approach. The shot reordering approach eliminates the temporal ambiguity which is likely to degrade the performance of conventional video event recognition algorithms using support vector machine (SVM) classifiers with string kernels. A traditional BoW model constructs feature vectors for video frames, regarding the arrangement of the visual words in the 2D image space, to be histograms of visual words which do not consider spatial-temporal information. Our approach first segments the input video clip into a set of video shots, each of which is further divided into multiple three-dimensional video patches and cubes. In this paper we present a method to introduce spatial-temporal information into the BoW model by analytically extracting space-time features from individual 3D cubes. The system learns the BoW codebook from these 3D cubes. Every video shot in an input video sequence is represented as a BoW histogram, and the corresponding event is then modelled as a sequence of BoW histograms which are further reordered by the proposed normalization scheme. String kernels are finally adopted to train the SVM classifiers from a set of training samples. These classifiers are used to recognize the event type of a test video clip. Our framework presents a simple and effective way to infuse both temporal and spatial configurations for video events. Results show that the proposed method gives good performance on several publicly available datasets in terms of robustness and recognition rate. |
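The step in the abstract where each video shot becomes a BoW histogram, and an event becomes a sequence of such histograms, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `bow_histogram` and `shots_to_histogram_sequence` are hypothetical, the codebook is assumed to be already learned, each 3D cube is assumed to be summarized by a fixed-length descriptor, and nearest-codeword assignment by Euclidean distance is an assumption. The reordering (normalization) scheme and the string kernel are not shown because the abstract does not specify them.

```python
import numpy as np


def bow_histogram(descriptors, codebook):
    """Quantize cube descriptors against a codebook and return a normalized histogram.

    descriptors: (n_cubes, dim) array, one row per 3D cube (assumed representation).
    codebook:    (n_words, dim) array of visual words (assumed already learned).
    """
    descriptors = np.asarray(descriptors, dtype=float)
    codebook = np.asarray(codebook, dtype=float)
    # Squared Euclidean distance from every descriptor to every codeword.
    diff = descriptors[:, None, :] - codebook[None, :, :]
    dist2 = (diff ** 2).sum(axis=2)
    # Each descriptor votes for its nearest codeword.
    words = dist2.argmin(axis=1)
    counts = np.bincount(words, minlength=len(codebook)).astype(float)
    # Normalize so shots with different numbers of cubes are comparable.
    return counts / counts.sum()


def shots_to_histogram_sequence(shots, codebook):
    """Represent a clip as a sequence of per-shot BoW histograms, in shot order."""
    return np.stack([bow_histogram(shot, codebook) for shot in shots])


# Toy data: a 2-word codebook and a clip with two shots.
codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
shot_a = np.array([[0.0, 1.0], [9.0, 9.0], [10.0, 11.0], [1.0, 0.0]])
shot_b = np.array([[0.0, 0.0], [1.0, 1.0], [9.0, 9.0]])
sequence = shots_to_histogram_sequence([shot_a, shot_b], codebook)
```

Here `sequence` has one row per shot and one column per visual word. A sequence of this kind is what a downstream reordering step and a sequence-aware kernel would consume.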
URI: | http://scholars.ntou.edu.tw/handle/123456789/24747 | DOI: | 10.1145/2425836.2425876 |
Appears in Collections: | Department of Computer Science and Engineering |
Items in the IR system are protected by copyright, with all rights reserved, unless otherwise indicated.