http://scholars.ntou.edu.tw/handle/123456789/17887
Title: | A Study on Chinese Spelling Check Using Confusion Sets and N-gram Statistics |
Authors: | Chuan-Jie Lin; Wei-Cheng Chu |
Keywords: | Chinese Spelling Check; Confusion Set Expansion; Google Ngram Scoring Function |
Issue Date: | 1-Jun-2015 |
Volume: | 20 |
Issue: | 1 |
Pages: | 23-48 |
Abstract: | This paper proposes an automatic method for building a Chinese spelling check system. Confusion sets were expanded using two language resources, Shuowen Jiezi and the Four-Corner codes, which improved the coverage of the confusion sets. Nine scoring functions that use the frequency data in the Google Ngram datasets were proposed, and the idea of smoothing was also adopted. Thresholds were also determined automatically. The final system performed far better than our baseline system in the CSC 2013 Evaluation Task. |
URI: | http://scholars.ntou.edu.tw/handle/123456789/17887 |
Appears in Collections: | Department of Computer Science and Engineering |