  1. National Taiwan Ocean University Research Hub

Ellipsis and Anaphora Resolution in the Dialogue Module of the Computerized Virtual Patient



Details

Project title
Ellipsis and Anaphora Resolution in the Dialogue Module of the Computerized Virtual Patient
Project code
NSC99-2511-S019-003
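Chinese title
系統化虛擬病人之建置以偵測醫學生之臨床診斷缺失---電腦化虛擬病人對話模組中省略與代名詞指涉現象之處理 (Building a Systematic Virtual Patient to Detect Deficiencies in Medical Students' Clinical Diagnosis: Handling Ellipsis and Pronominal Anaphora in the Dialogue Module of the Computerized Virtual Patient)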
 
Project Coordinator
Chuan-Jie Lin
Funding Organization
National Science and Technology Council
 
Department/Unit
Department of Computer Science and Engineering
Website
https://www.grb.gov.tw/search/planDetail?id=2141782
Year
2010
 
Start date
2010-08-01
Expected completion
2011-07-31
 
Budget
600,000 NTD
 
Research field
Medical Engineering
 

Description

Abstract
The aim of this integrated project is to build a computerized virtual patient (CVP) system, a problem-based learning (PBL) teaching platform for the medical domain. The CVP system holds many prepared virtual-patient teaching cases, each containing the history-taking questions and their answers together with the results of physical examinations and laboratory tests. The system plays the role of the virtual patient and the student plays the doctor. The student asks questions in everyday natural language through a dialogue platform, and the system replies with the answers prepared in the teaching case so that the student can work out which disease the virtual patient may have. The system also records the sequence of questions asked, which can be used to detect weaknesses in the student's diagnostic reasoning. Such a system is expected to lighten teachers' workload and save the cost of training standardized patients.

The main goal of this sub-project is to develop methods for the CVP dialogue module that automatically detect ellipsis and anaphora and that identify the elided information and the antecedents of the anaphors. The work is planned to take three years.

The first year focuses on setting up the experimental environment: building the experimental data, manually annotating a training data set for ellipsis and anaphora, and studying the features of both phenomena. Anaphora is addressed first: features for resolving anaphors are proposed, and an anaphora resolution module is designed using either hand-made rules or machine learning.

The second year focuses on ellipsis resolution: analyzing ellipsis in clinical diagnostic dialogues, studying features derived from an ontology, concept relations, and standard dialogue patterns, and designing an ellipsis resolution module using either hand-made rules or machine learning.

The third year addresses the exceptional cases that the strategies of the first two years cannot handle. Possible strategies will be proposed for determining the antecedents of anaphors, detecting ellipsis, and identifying the elided information. The effect of ellipsis and anaphora resolution on the overall performance of the virtual patient system will also be evaluated.

The scheduled tasks in the first year are as follows.
(1) Preparing training data for ellipsis and anaphora resolution
    1. Building annotation interfaces for ellipsis and anaphora
    2. Training annotators to ensure their understanding of ellipsis and anaphora
    3. Annotating ellipsis and anaphora
(2) Detecting occurrences of anaphors
    1. Collecting from a lexicon the Chinese pronouns that are frequently seen in diagnostic dialogues
    2. Preparing the attributes of each pronoun
    3. Writing a program to find the positions of pronouns in a sentence
(3) Building an anaphora resolution module
    1. Collecting the semantic categories of the antecedents of the anaphors
    2. Building lists of terms for the categories of the antecedents
    3. Writing a program to find antecedent candidates in previous sentences and determine the correct ones

The scheduled tasks in the second year are as follows.
(1) Building the Medical Diagnostic Dialogue Ontology (MDDO)
    1. Collecting the semantic categories of elided information as concepts and relationships in MDDO
    2. Collecting terms for the concepts and relationships from online thesauri such as MeSH or WordNet
    3. Correcting errors manually
(2) Extracting concept relations in sentences
    1. Preparing training data for concept relation extraction
    2. Writing a program to find occurrences of concepts and relationships in a sentence
(3) Learning features for ellipsis resolution
    1. Learning ellipsis detection features based on concept relations from the training data
    2. Learning ellipsis detection features based on standard dialogue patterns
    3. Learning ellipsis resolution rules based on concept relations from the training data
(4) Building an ellipsis resolution module
    1. Writing a program to detect ellipses
    2. Revising the ellipsis detection methods
    3. Writing a program to identify the elided information
    4. Revising the ellipsis resolution rules

The scheduled tasks in the third year are as follows.
(1) Building a syntactic parser for short questions
    1. Collecting parse trees of interrogative sentences from a treebank
    2. Building a syntactic parser
    3. Revising the parsing strategies
(2) Improving ellipsis resolution performance
    1. Learning ellipsis detection features based on syntactic information from the training data
    2. Learning ellipsis resolution rules based on syntactic information from the training data
    3. Revising the ellipsis resolution rules
(3) Improving anaphora resolution performance
    1. Building a shallow syntactic parser
    2. Learning anaphora resolution features based on syntactic information from the training data
    3. Revising the anaphora resolution methods
(4) Overall evaluation of the effect on the Computerized Virtual Patient system
    1. Designing a questionnaire on the performance of the virtual patient system
    2. Verifying the performance improvement brought by ellipsis and anaphora resolution
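The first-year pronoun tasks listed in the abstract (a pronoun list with attributes, a program that locates pronouns in a sentence, term lists per antecedent category, and a program that picks an antecedent from previous sentences) can be illustrated with a minimal rule-based sketch. It is not the project's implementation: the pronoun table, the category lexicon, the function names, and the "most recent matching term wins" rule are all assumptions made for illustration, and English whitespace tokenization stands in for the Chinese text the project actually targets.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical pronoun list with a single attribute per pronoun:
# the semantic category its antecedent is expected to belong to.
PRONOUN_CATEGORY = {
    "it": "symptom",
    "there": "body_part",
    "them": "medication",
}

# Hypothetical term lists for each antecedent category.
CATEGORY_TERMS = {
    "symptom": {"headache", "cough", "fever"},
    "body_part": {"chest", "stomach", "knee"},
    "medication": {"painkillers", "antibiotics"},
}


@dataclass
class Resolution:
    pronoun: str
    position: int              # token index in the current question
    antecedent: Optional[str]  # None when no candidate was found


def tokenize(sentence: str) -> List[str]:
    """Lowercase, split on whitespace, strip surrounding punctuation."""
    return [tok.strip(".,?!").lower() for tok in sentence.split()]


def find_pronouns(sentence: str) -> List[Tuple[int, str]]:
    """Return (position, pronoun) for every listed pronoun in the sentence."""
    return [(i, tok) for i, tok in enumerate(tokenize(sentence))
            if tok in PRONOUN_CATEGORY]


def resolve(history: List[str], sentence: str) -> List[Resolution]:
    """Pick, for each pronoun, the most recent earlier term of the right category."""
    results = []
    for pos, pronoun in find_pronouns(sentence):
        wanted = CATEGORY_TERMS[PRONOUN_CATEGORY[pronoun]]
        antecedent = None
        # Scan previous questions from newest to oldest.
        for previous in reversed(history):
            matches = [tok for tok in tokenize(previous) if tok in wanted]
            if matches:
                antecedent = matches[-1]  # last matching term in that question
                break
        results.append(Resolution(pronoun, pos, antecedent))
    return results


history = ["Do you have a headache?", "Does your chest hurt?"]
for item in resolve(history, "When did it start?"):
    print(item)
```

In this sketch the category attribute does the filtering: "it" skips the more recent "chest" because that term is not in the symptom list. A learned model, as the abstract also allows, would replace the fixed recency rule with features scored from annotated training data.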
 
Keyword(s)
virtual patient
clinical diagnostic dialogue
ellipsis
anaphora
natural language processing
 