  1. National Taiwan Ocean University Research Hub
  2. College of Electrical Engineering and Computer Science
  3. Department of Computer Science and Engineering
Please use this Handle URI to cite this item: http://scholars.ntou.edu.tw/handle/123456789/24747
DC Field: Value (Language)

dc.contributor.author: Yung-Lun Chen (en_US)
dc.contributor.author: Shyi-Chyi Cheng (en_US)
dc.contributor.author: Yi-Ping Phoebe Chen (en_US)
dc.date.accessioned: 2024-03-15T07:08:20Z
dc.date.available: 2024-03-15T07:08:20Z
dc.date.issued: 2012-11
dc.identifier.uri: http://scholars.ntou.edu.tw/handle/123456789/24747
dc.description.abstract: This paper presents a novel approach to reordering video shots using the state-of-the-art bag-of-words (BoW) approach. The shot-reordering approach eliminates the temporal ambiguity that is likely to degrade the performance of conventional video event recognition algorithms based on support vector machine (SVM) classifiers with string kernels. A traditional BoW model represents video frames as histograms of visual words, disregarding the arrangement of those words in the 2D image space, and thus fails to capture spatial-temporal information. Our approach first segments the input video clip into a set of video shots, each of which is further divided into multiple three-dimensional (3D) video patches, or cubes. In this paper we present a method for introducing spatial-temporal information into the BoW model by analytically extracting space-time features from individual 3D cubes. The system learns the BoW codebook from these 3D cubes. Every video shot in an input video sequence is represented as a BoW histogram, and the corresponding event is then modelled as a sequence of BoW histograms, which are further reordered by the proposed normalization scheme. String kernels are finally adopted to train SVM classifiers from a set of training samples, and these classifiers are used to recognize the event type of a test video clip. Our framework presents a simple and effective way to infuse both temporal and spatial configurations into video event models. Results show that the proposed method performs well on several publicly available datasets in terms of robustness and recognition rate. (en_US)
dc.language.iso: en_US (en_US)
dc.title: Reordering video shots for event classification using bag-of-words models and string kernels (en_US)
dc.type: conference paper (en_US)
dc.identifier.doi: 10.1145/2425836.2425876
item.openairecristype: http://purl.org/coar/resource_type/c_5794
item.cerifentitytype: Publications
item.languageiso639-1: en_US
item.fulltext: no fulltext
item.grantfulltext: none
item.openairetype: conference paper
crisitem.author.dept: College of Electrical Engineering and Computer Science
crisitem.author.dept: Department of Computer Science and Engineering
crisitem.author.dept: National Taiwan Ocean University, NTOU
crisitem.author.parentorg: National Taiwan Ocean University, NTOU
crisitem.author.parentorg: College of Electrical Engineering and Computer Science
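The pipeline in the abstract (quantize 3D-cube features into a BoW codebook, build one histogram per shot, reorder the shot sequence, then compare sequences with a kernel) can be sketched as follows. This is a minimal illustration on synthetic descriptors, not the authors' implementation: the sort-by-dominant-word reordering rule and the histogram-intersection sequence kernel are hypothetical stand-ins for the paper's normalization scheme and string kernel.

```python
import numpy as np

def learn_codebook(descriptors, k, iters=10, seed=0):
    """Lloyd's k-means over cube descriptors; centers act as visual words."""
    rng = np.random.default_rng(seed)
    centers = descriptors[rng.choice(len(descriptors), k, replace=False)]
    for _ in range(iters):
        dists = ((descriptors[:, None] - centers[None]) ** 2).sum(-1)
        labels = dists.argmin(1)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):
                centers[j] = members.mean(0)
    return centers

def shot_histogram(cube_descriptors, centers):
    """Quantize one shot's 3D-cube descriptors into a normalized BoW histogram."""
    dists = ((cube_descriptors[:, None] - centers[None]) ** 2).sum(-1)
    words = dists.argmin(1)
    hist = np.bincount(words, minlength=len(centers)).astype(float)
    return hist / max(hist.sum(), 1.0)

def reorder_shots(histograms):
    """Hypothetical normalization: order shots by their dominant visual word."""
    return sorted(histograms, key=lambda h: int(h.argmax()))

def sequence_kernel(seq_a, seq_b):
    """Stand-in for the string kernel: mean histogram intersection
    over aligned shot pairs of the two reordered sequences."""
    n = min(len(seq_a), len(seq_b))
    return sum(np.minimum(seq_a[i], seq_b[i]).sum() for i in range(n)) / n
```

The kernel values produced this way could feed a precomputed-kernel SVM for the final event classification step.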
Appears in Collections: Department of Computer Science and Engineering

Items in this IR system are protected by copyright, with all rights reserved, unless otherwise indicated in their copyright terms.
