Please use this identifier to cite or link to this item: http://scholars.ntou.edu.tw/handle/123456789/17906
Title: Strategies of Processing Japanese Names and Character Variants in Traditional Chinese Text
Authors: Chuan-Jie Lin 
Jia-Cheng Zhan
Yen-Heng Chen
Chien-Wei Pao
Keywords: Semantic Chinese Word Segmentation; Japanese Name Identification; Character Variants
Issue Date: Sep-2012
Publisher: Computational Linguistics
Journal Volume: 17
Journal Issue: 3
Start page/Pages: 87-108
Source: Computational Linguistics and Chinese Language Processing
Abstract:
This paper proposes an approach for identifying word candidates that are not
Traditional Chinese words, including Japanese names (written in Japanese Kanji or
Traditional Chinese characters) and word variants, during word segmentation
of Traditional Chinese text. To handle personal names, a probability model
over name formats is introduced. We also propose a method for mapping
Japanese Kanji to the corresponding Traditional Chinese characters; the same
method can be used to detect words written in character variants. After
integrating generation rules for various types of special words, together with their
probability models, the F-measure of our word segmentation system rises from
94.16% to 96.06%. Another experiment shows that 83.18% of the 862 Japanese
names in a set of 109 human-annotated documents are successfully detected.
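
The Kanji-to-Traditional-Chinese mapping mentioned in the abstract can be pictured as a character-level lookup applied before dictionary matching. The following is only a minimal sketch under stated assumptions, not the authors' method: the table `KANJI_TO_TRADITIONAL`, the functions `normalize_variants` and `find_in_lexicon`, and the tiny lexicon are all hypothetical names and data chosen for illustration.

```python
# Hypothetical, hand-picked mapping from Japanese Kanji forms to the
# corresponding Traditional Chinese characters. This is NOT the table
# used in the paper; a real system would need a far larger one.
KANJI_TO_TRADITIONAL = {
    "沢": "澤",
    "広": "廣",
    "国": "國",
    "桜": "櫻",
    "黒": "黑",
}


def normalize_variants(text: str) -> str:
    """Replace each known Kanji form with its Traditional Chinese counterpart.

    Characters that are not in the table are left unchanged.
    """
    return "".join(KANJI_TO_TRADITIONAL.get(ch, ch) for ch in text)


def find_in_lexicon(candidate: str, lexicon: set) -> "str | None":
    """Return the normalized form of `candidate` if the lexicon contains it.

    Assumption: the lexicon stores words in Traditional Chinese characters,
    so a word written with Kanji or variant characters only matches after
    normalization.
    """
    normalized = normalize_variants(candidate)
    return normalized if normalized in lexicon else None


if __name__ == "__main__":
    lexicon = {"黑澤", "櫻井"}  # hypothetical surname lexicon
    print(normalize_variants("桜井"))
    print(find_in_lexicon("黒沢", lexicon))
```

Normalizing to a single canonical form before lookup means one lexicon can serve text written in either script, which is one plausible reading of how the same method could also flag words written in character variants.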
URI: http://scholars.ntou.edu.tw/handle/123456789/17906
Appears in Collections: 資訊工程學系 (Department of Computer Science and Engineering)
