http://scholars.ntou.edu.tw/handle/123456789/17906
Title: | Strategies of Processing Japanese Names and Character Variants in Traditional Chinese Text |
Authors: | Chuan-Jie Lin; Jia-Cheng Zhan; Yen-Heng Chen; Chien-Wei Pao |
Keywords: | Semantic Chinese Word Segmentation; Japanese Name Identification; Character Variants |
Issue Date: | September 2012 |
Publisher: | Computational Linguistics |
Volume: | 17 |
Issue: | 3 |
Pages: | 87-108 |
Source: | Computational Linguistics and Chinese Language Processing |
Abstract: | This paper proposes an approach to identifying word candidates that are not Traditional Chinese, including Japanese names (written in Japanese Kanji or Traditional Chinese characters) and word variants, during word segmentation of Traditional Chinese text. For personal names, a probability model of name formats is introduced. We also propose a method to map Japanese Kanji to the corresponding Traditional Chinese characters. The same method can be used to detect words written in character variants. After integrating generation rules for various types of special words, along with their probability models, the F-measure of our word segmentation system rises from 94.16% to 96.06%. Another experiment shows that 83.18% of the 862 Japanese names in a set of 109 human-annotated documents can be successfully detected. |
URI: | http://scholars.ntou.edu.tw/handle/123456789/17906 |
Appears in Collections: | Department of Computer Science and Engineering |
Items in the IR system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.