National Taiwan Ocean University Research Hub

A Study on Artistic Artifact Named Entity Recognition

Basic Information

Project title
A Study on Artistic Artifact Named Entity Recognition
Project Code
NSC102-2221-E019-057
Translated Name (Chinese)
藝術類人造物專名實體辨識研究
 
Project Coordinator
Chuan-Jie Lin
Funding Organization
National Science and Technology Council
 
Department/Unit
Department of Computer Science and Engineering
Website
https://www.grb.gov.tw/search/planDetail?id=3102978
Year
2013
 
Start date
01-08-2013
Expected Completion
31-07-2014
 
Budget
693 thousand NTD
 
Research Field
Information Science: Software
 

Description

Abstract
Named entity recognition (NER) is an indispensable technique in many natural language processing systems, because named entities carry important information and should therefore be weighted highly. Systems for Chinese word segmentation, information extraction, question answering, and information retrieval often integrate NER modules, which shows how important the technique is.

Most NER research has focused on recognizing the names of persons, locations, and organizations. When dealing with artifacts, most papers used only the words or parentheses in the surrounding context as features. Because the artifact class is very broad, covering movies, books, and any other object that can be named, a single strategy is unlikely to handle such different types well without further subcategorizing them.

This project therefore focuses first on NER for artistic artifacts, including movies, TV programs, books, and musical compositions. It plans to study and design new features for NER, including art-related named entity types, domain-related semantics, and name patterns. The experience gained from artistic artifact NER can later be applied to other types of artifact named entities.

The tasks planned in this project are as follows.

(1) Preparing experimental datasets for artistic artifact named entity recognition
  1. Collecting documents in the arts and entertainment domain
  2. Manually tagging the occurrences of artistic artifacts and their types
(2) Studying art-related semantic features
  1. Automatically labeling the semantic types of phrases that appear in the context of artistic artifacts
  2. Exploring information-theoretic formulas for estimating semantic relatedness
  3. Selecting highly correlated semantic categories as features
(3) Building an artistic artifact NER system by machine learning
  1. Experimenting with combinations of new and known features
  2. Evaluating the performance of the NER system
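Task (2) of the abstract can be illustrated with a small sketch. The abstract does not say which information-theoretic formula the project uses, so pointwise mutual information (PMI) is assumed here purely for illustration. The function names, the threshold, the semantic category labels, and the toy samples are all hypothetical and are not taken from the project.

```python
import math
from collections import Counter

def pmi_scores(samples):
    """Compute PMI between each context semantic category and the
    'artistic artifact' label.

    samples: list of (categories, is_artifact) pairs, where categories is
    the set of semantic categories seen in a candidate's context.
    PMI is an assumed measure; the project does not name its formula.
    """
    n = len(samples)
    n_pos = sum(1 for _, is_artifact in samples if is_artifact)
    cat_count = Counter()    # samples whose context contains the category
    joint_count = Counter()  # ... and whose candidate is an artifact
    for categories, is_artifact in samples:
        for cat in set(categories):
            cat_count[cat] += 1
            if is_artifact:
                joint_count[cat] += 1

    scores = {}
    for cat, n_cat in cat_count.items():
        n_joint = joint_count[cat]
        if n_joint == 0 or n_pos == 0:
            continue  # PMI is undefined (minus infinity); skip the category
        # PMI = log2( P(cat, artifact) / (P(cat) * P(artifact)) )
        scores[cat] = math.log2((n_joint / n) / ((n_cat / n) * (n_pos / n)))
    return scores

def select_features(samples, threshold=0.5):
    """Keep categories whose PMI reaches a (hypothetical) threshold."""
    scores = pmi_scores(samples)
    return sorted(cat for cat, score in scores.items() if score >= threshold)

# Toy data: hypothetical context categories for six candidate strings.
samples = [
    ({"director", "release"}, True),
    ({"director", "actor"}, True),
    ({"author", "publisher"}, True),
    ({"city", "release"}, False),
    ({"city", "company"}, False),
    ({"company", "actor"}, False),
]

selected = select_features(samples)
print(selected)
```

On this toy data, categories that co-occur only with artifact candidates score highest and are kept, while categories spread evenly across both classes score zero and are dropped. The selected categories could then serve as additional features for the machine learning system described in task (3).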
 
Keyword(s)
titles of artistic creations
named entity recognition
natural language processing
 