http://scholars.ntou.edu.tw/handle/123456789/17892
Title: Word segmentation refinement by Wikipedia for textual entailment
Authors: Chuan-Jie Lin; Yu-Cheng Tu
Keywords: Encyclopedias; Electronic publishing; Internet; Training; Numerical models; Benchmark testing
Issue Date: 13-Aug-2014
Publisher: IEEE
Abstract: Recognizing textual entailment in Chinese differs from English because Chinese lacks word delimiters and capitalization. Information from word segmentation and from Wikipedia often plays an important role in textual entailment recognition, but inconsistencies between word-segmentation boundaries and matched Wikipedia titles must be resolved first. This paper proposes four ways to combine Wikipedia title matching with word segmentation and evaluates them under several feature combinations. The best system redoes word segmentation after matching Wikipedia titles. The best feature combination for the BC task uses only content words and Wikipedia titles, achieving a macro-average F-measure of 67.33% and an accuracy of 68.9%; the best MC system achieves a macro-average F-measure of 46.11% and an accuracy of 58.34%. Both outperform all runs submitted to the NTCIR-10 RITE-2 CT tasks.
URI: http://scholars.ntou.edu.tw/handle/123456789/17892
DOI: 10.1109/IRI.2014.7051944
Appears in Collections: Department of Computer Science and Engineering
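
The abstract's key step, redoing word segmentation after matching Wikipedia titles, can be sketched roughly as follows. This is a minimal, hypothetical illustration rather than the authors' implementation: the title set, the greedy longest-match policy, and the fallback `segment` function are all assumptions introduced here.

```python
# Minimal sketch of "match Wikipedia titles first, then re-segment the rest".
# The title list and the fallback segmenter are placeholders, not the
# authors' actual resources or algorithm.

def refine_segmentation(sentence, wiki_titles, segment):
    """Keep matched Wikipedia titles as single tokens, then run the
    ordinary word segmenter only on the text between those matches."""
    # Try longer titles first so a longer encyclopedia entry wins over
    # a shorter one it contains.
    titles = sorted(wiki_titles, key=len, reverse=True)

    tokens, i = [], 0
    while i < len(sentence):
        match = next((t for t in titles if sentence.startswith(t, i)), None)
        if match:                       # keep the whole title as one token
            tokens.append(match)
            i += len(match)
        else:                           # collect a span with no title match
            j = i
            while j < len(sentence) and not any(
                sentence.startswith(t, j) for t in titles
            ):
                j += 1
            tokens.extend(segment(sentence[i:j]))  # re-segment the gap
            i = j
    return tokens


if __name__ == "__main__":
    # Toy example with a single-character fallback segmenter.
    demo_titles = {"資訊工程", "維基百科"}
    naive_segment = list  # placeholder: split into single characters
    print(refine_segmentation("他在維基百科查資訊工程", demo_titles, naive_segment))
    # ['他', '在', '維基百科', '查', '資訊工程']
```

In the toy run, matched titles such as 維基百科 survive as single tokens while the remaining characters fall back to the ordinary segmenter, which mirrors the boundary-consistency behaviour the abstract describes.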