http://scholars.ntou.edu.tw/handle/123456789/17892
Title: | Word segmentation refinement by Wikipedia for textual entailment | Authors: | Chuan-Jie Lin; Yu-Cheng Tu |
Keywords: | Encyclopedias;Electronic publishing;Internet;Training;Numerical models;Benchmark testing | Issue Date: | 13-Aug-2014 | Publisher: | IEEE | Abstract: | Textual entailment recognition in Chinese differs from that in English because Chinese lacks word delimiters and capitalization. Information from word segmentation and from Wikipedia often plays an important role in textual entailment recognition. However, the boundaries produced by word segmentation can be inconsistent with the boundaries of matched Wikipedia titles, and this inconsistency must be resolved first. This paper proposes four ways to combine Wikipedia title matching with word segmentation and evaluates them under several feature combinations. The best system redoes word segmentation after matching Wikipedia titles. The best feature combination for the BC task uses content words and Wikipedia titles only, and achieves a macro-average F-measure of 67.33% and an accuracy of 68.9%. The best MC RITE system achieves a macro-average F-measure of 46.11% and an accuracy of 58.34%. Both outperform all runs in the NTCIR-10 RITE-2 CT tasks. |
URI: | http://scholars.ntou.edu.tw/handle/123456789/17892 | DOI: | 10.1109/IRI.2014.7051944 |
Appears in Collections: | Department of Computer Science and Engineering (資訊工程學系) |
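
The abstract says the best system redoes word segmentation after matching Wikipedia titles, but this record gives no implementation details. The following is only a minimal sketch of one plausible reading of that idea, not the authors' method: matched titles are locked in as single tokens by forward longest matching, and only the unmatched spans between them are passed to a segmenter. The function names, the `max_len` parameter, the tiny title set, and the per-character placeholder segmenter are all assumptions made for illustration.

```python
from typing import Callable, List, Set


def match_titles_then_segment(
    text: str,
    titles: Set[str],
    segment: Callable[[str], List[str]],
    max_len: int = 10,
) -> List[str]:
    """Keep longest-matching titles as single tokens, then segment the gaps.

    Hypothetical helper: `titles` stands in for a set of Wikipedia titles and
    `segment` for any word segmenter that maps a string to a list of words.
    """
    tokens: List[str] = []
    gap_start = 0  # start of the current span not covered by any title
    i = 0
    while i < len(text):
        # Forward maximum matching: try the longest candidate first.
        # Single-character titles are ignored (n stops at 2).
        match_len = 0
        for n in range(min(max_len, len(text) - i), 1, -1):
            if text[i:i + n] in titles:
                match_len = n
                break
        if match_len:
            # Segment the unmatched span that precedes the title, if any.
            if gap_start < i:
                tokens.extend(segment(text[gap_start:i]))
            tokens.append(text[i:i + match_len])
            i += match_len
            gap_start = i
        else:
            i += 1
    # Segment whatever remains after the last matched title.
    if gap_start < len(text):
        tokens.extend(segment(text[gap_start:]))
    return tokens


def per_character(span: str) -> List[str]:
    """Placeholder segmenter (assumption): one token per character."""
    return list(span)


sentence = "我在國立臺灣海洋大學讀書"
toy_titles = {"國立臺灣海洋大學", "臺灣"}
tokens = match_titles_then_segment(sentence, toy_titles, per_character)
```

In this toy run the longer title 國立臺灣海洋大學 is kept whole rather than being split around the shorter title 臺灣, which is the kind of boundary conflict the abstract refers to. In practice `per_character` would be replaced by a real Chinese word segmenter.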