National Taiwan Ocean University Research Hub

Passive/Active Semi-Supervised Clustering with Discriminative Random Fields


Details

Project title
Passive/Active Semi-Supervised Clustering with Discriminative Random Fields
Project code
NSC101-2221-E019-067
Chinese title
基於鑑別式隨機場域之被動/主動式半監督式分群法
 
Project coordinator
Chin-Chun Chang
Funding organization
National Science and Technology Council
 
Department/Unit
Department of Computer Science and Engineering
Website
https://www.grb.gov.tw/search/planDetail?id=2644274
Year
2012
 
Start date
01-08-2012
Expected completion
31-07-2013
 
Budget
586 thousand NTD
 
Research field
Information Science: Software
 

Description

Abstract
"半監督式分群可藉由少量的監督式資訊來增加資料分群的準確度。一般而言,監督資訊通常以must-link與cannot-link樣本對的型式呈現。在這個兩年計劃裡,我們將發展一個主動/被動式半監督式分群框架。此框架能夠無縫地與一個傳統的分群演算法整合。此框架尤其適合已選定特定分群演算法的應用,使其能藉由少量的監督式資訊來增加資料分群的準確度。研究主題如下。  第一年,我們將發展一個被動式半監督式分群框架。在這個個框架裡,鑑別式隨機場將被使用來描述分群結果與監督資訊間的一致性。因此,半監督式分群問題則被視為在鑑別式隨機場裡找一個有最大事後機率的資料標籤設定。我們將使用iterated conditional modes與距離學習演算法來找到有最大事後機率的資料標籤設定。  第二年,我們將在第一年所建立的框架上發展一個主動式半監督式分群法。我們將針對第一年所建立的框架的有效性來挑選詢問樣本對。由於從所有樣本對裡挑出詢問樣本對可能非常耗時,我們將從對鑑別式隨機場網路結構最具影響力的樣本,來考慮形成詢問樣本對。在這裡,我們擬用網路向心性(network centrality)度量來決定最有影響力的樣本。另外,我們將使用費雪資訊比例(the Fisher information ratio)的方式來計算每對樣本對於鑑別式隨機場裡的分類器的重要性。藉由這種方式,我們挑選出的詢問樣本對不但對其他樣本具有大的影響力,同時也有利於鑑別式隨機場裡的那個分類器的學習。 我們將研究的框架非常實用但在文獻裡比較少見,深具研究潛力。""Semi-supervised clustering exploits a small quantity of supervised information to improve the accuracy of data clustering. In general, supervised information is often specified by the must-link and the cannot-link constraint. In this two-year project, a framework for active/passive semi-supervised clustering will be developed. This framework is capable of integrating with a traditional clustering algorithm seamlessly, and particularly useful for the application where a traditional clustering is designated to use. The topics of this two-year project are as follows.  In the first year, we shall develop a framework for passive semi-supervised clustering. In this framework, discriminative random fields (DRFs) will be employed to model the consistency between the clustering result of a traditional clustering algorithm and the supervised information with the assumption of semi-supervised learning. The semi-supervised clustering problem is thus formulated as finding the label configuration with the maximum a posteriori (MAP) probability of the DRF. A procedure based on the iterated conditional modes algorithm and a metric-learning algorithm will be developed to find a suboptimal MAP solution of the DRF.  In the second year, we shall develop an active approach for semi-supervised clustering. 
Our goal is to select query sample-pairs effective to the framework developed in the first year. Because selecting query sample-pairs from all pairs of samples is often prohibited, we shall form query sample-pairs with the sample which is the most influential regarding the network structure of the DRF. In this project, the most influential sample will be determined by measures of network centrality. In addition, the Fisher information ratio will be used to gauge the importance of a sample pair to the classifier for the DRF. As a result, the query sample-pair will be not only important to the network structure of the DRF but also effective to learning of the classifier for the DRF. Since useful but rare in the literature, the aimed framework is worthy of investigation."
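
The first-year formulation (a base clustering refined by pairwise constraints through ICM) can be illustrated with a minimal sketch. This is not the project's DRF model: it assumes a simplified energy made of a squared-distance term to fixed cluster centroids plus a constant penalty for each violated must-link or cannot-link pair, and it omits metric learning entirely. The function name, the `weight` parameter, and the centroid-based unary cost are all hypothetical choices made for illustration.

```python
import numpy as np

def icm_constrained_labels(X, centroids, must_link, cannot_link,
                           weight=1.0, max_sweeps=20):
    """Greedy ICM sweeps over a simple pairwise energy.

    Hypothetical stand-in for MAP inference in a DRF: the unary cost is the
    squared distance to a centroid, and each violated constraint adds `weight`.
    """
    X = np.asarray(X, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    n = len(X)

    # Unary cost: squared distance from every sample to every centroid.
    unary = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = unary.argmin(axis=1)  # start from the unconstrained assignment

    # Neighbour lists: (other sample, True for must-link / False for cannot-link).
    neighbours = [[] for _ in range(n)]
    for i, j in must_link:
        neighbours[i].append((j, True))
        neighbours[j].append((i, True))
    for i, j in cannot_link:
        neighbours[i].append((j, False))
        neighbours[j].append((i, False))

    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            cost = unary[i].copy()
            for j, is_must in neighbours[i]:
                if is_must:
                    # Penalise every label except the partner's current label.
                    cost += weight
                    cost[labels[j]] -= weight
                else:
                    # Penalise sharing the partner's current label.
                    cost[labels[j]] += weight
            best = int(cost.argmin())
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break  # a full sweep changed nothing: local optimum reached
    return labels


# Usage: four 1-D samples, two centroids, one cannot-link constraint.
X = [[0.0], [0.4], [0.6], [1.0]]
centroids = [[0.0], [1.0]]
labels = icm_constrained_labels(X, centroids, must_link=[], cannot_link=[(2, 3)])
```

Because ICM updates one label at a time with the others held fixed, it converges to a local optimum that depends on the visiting order and the initial assignment, which matches the abstract's description of the solution as suboptimal.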
 
Keyword(s)
Network influence analysis
Fisher information matrix
Semi-supervised clustering
Discriminative random fields
Active learning
Social influence analysis
Fisher information ratio
 