  1. National Taiwan Ocean University Research Hub

A Study on Analysis of Multi-Focus Questions


Details

Project title
A Study on Analysis of Multi-Focus Questions
Project Code
NSC98-2221-E019-041
Chinese Title
多焦點問句之自動分析技術研究
 
Project Coordinator
Chuan-Jie Lin
Funding Organization
National Science and Technology Council
 
Department/Unit
Department of Computer Science and Engineering
Website
https://www.grb.gov.tw/search/planDetail?id=1913851
Year
2009
 
Start Date
01-08-2009
Expected Completion
31-07-2010
 
Budget
NT$644,000
 
Research Field
Information Science: Software
 

Description

Abstract
Question answering (QA) can be regarded as a new generation of information retrieval (IR) technology. A traditional keyword-based search engine only returns a list of relevant documents, and the user still has to read through them one by one to find the answer. A QA system instead extracts the answer and presents it directly to the user, reducing the time spent looking for information.

Beyond Internet search, QA can also be applied to digital libraries and digital encyclopedias, helping users find answers quickly in large collections of high-quality data. Although some digital libraries and digital archives store structured data in databases, a natural language query interface, which lets users pose questions or describe their information needs in everyday language, is very helpful to them. The question analysis module therefore still plays an important role.

During our QA research, we identified a new question type: multi-focus questions. Previous QA research has mostly dealt with single-focus questions. Multi-focus questions have so far received little attention, and their possible sentence patterns and analysis methods remain largely unstudied.

This project plans a comprehensive two-year study of multi-focus questions. It covers collecting natural questions, observing them and proposing criteria for analyzing multi-focus questions, studying techniques to automatically detect question foci and the relationships among them, and studying techniques to automatically decompose an original question into several single-focus subquestions.

Year 1: collecting and labeling natural language questions, and automatic analysis of two-focus questions

(1) Collecting a Set of Natural Language Questions
    1. Collecting 5000 real questions posted on websites or web forums
    2. Labeling the question type of each question manually
    3. Labeling the number of foci in each question manually
    4. Collecting the questions with two or more foci for further study
(2) Preparing a Two-Focus Question Set
    1. Collecting the two-focus questions only
    2. Cleaning up the question text to remove noise irrelevant to the study
    3. Marking the positions of the question foci
(3) Analyzing and Automatically Detecting Question Foci
    1. Investigating the features of two-focus questions and the relationships between their foci
    2. Proposing criteria for analyzing two-focus questions
    3. Studying techniques to automatically detect the positions of question foci
    4. Studying techniques to automatically detect the relationships among question foci

Year 2: collecting more natural language questions, and automatic analysis of multi-focus questions with more than two foci

(1) Expanding the Set of Natural Language Questions
    1. Collecting 5000 new real questions from websites or web forums
    2. Labeling the question type of each question manually
    3. Labeling the number of foci in each question manually
    4. Collecting the multi-focus questions for further study
    5. Cleaning up the question text to remove noise irrelevant to the study
    6. Marking the positions of the question foci
(2) Analyzing and Automatically Detecting Question Foci
    1. Investigating the features of multi-focus questions and the relationships among their foci
    2. Proposing criteria for analyzing multi-focus questions
    3. Studying techniques to automatically detect the positions of question foci
    4. Studying techniques to automatically detect the relationships among question foci
(3) Studying Techniques to Decompose Questions into Single-Focus Subquestions Automatically
    1. Manually writing the single-focus subquestions into which each multi-focus question can be decomposed
    2. Studying techniques to decompose questions automatically from their sentence patterns and the relationships among their foci
    3. Evaluating the performance of the question decomposition techniques
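The labeling tasks listed in the abstract (question type, number of foci, focus positions, relationships among foci, and manually written subquestions) suggest a per-question annotation record. The following is only a minimal sketch of what such a record might look like. The class and field names, the label values, and the example question are all hypothetical, and the abstract does not define what counts as a focus; here a focus is assumed, purely for illustration, to be the span of an interrogative word.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Focus:
    start: int  # character offset where the focus span begins (inclusive)
    end: int    # character offset where the focus span ends (exclusive)


@dataclass
class AnnotatedQuestion:
    text: str
    question_type: str                     # hypothetical label, not from the project
    foci: List[Focus] = field(default_factory=list)
    relation: Optional[str] = None         # hypothetical label for the relation among foci
    subquestions: List[str] = field(default_factory=list)  # manually written decompositions

    @property
    def focus_count(self) -> int:
        return len(self.foci)

    def focus_texts(self) -> List[str]:
        # Recover the surface text of each marked focus span.
        return [self.text[f.start:f.end] for f in self.foci]


def span(text: str, phrase: str) -> Tuple[int, int]:
    """Return (start, end) offsets of the first occurrence of phrase in text."""
    start = text.index(phrase)
    return start, start + len(phrase)


def select_multi_focus(questions: List[AnnotatedQuestion]) -> List[AnnotatedQuestion]:
    # Keep only the questions annotated with two or more foci.
    return [q for q in questions if q.focus_count >= 2]


# Hypothetical example: an invented question with two assumed foci.
_text = "Who wrote Hamlet and when was it first performed?"
example = AnnotatedQuestion(
    text=_text,
    question_type="factoid",               # assumed label
    foci=[Focus(*span(_text, "Who")), Focus(*span(_text, "when"))],
    relation="coordination",               # assumed label
    subquestions=["Who wrote Hamlet?", "When was Hamlet first performed?"],
)

_single_text = "Who wrote Hamlet?"
single = AnnotatedQuestion(
    text=_single_text,
    question_type="factoid",
    foci=[Focus(*span(_single_text, "Who"))],
)
```

Storing foci as character offsets rather than as strings keeps the annotation tied to the original question text, which matters when the same word occurs more than once in a question.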
 
Keyword(s)
multi-focus question
question analysis
question decomposition
question answering
 