  1. National Taiwan Ocean University Research Hub

Ellipsis and Co-Reference Resolution in the Dialogue Module of the Computerized Virtual Patient


Details

Project title
Ellipsis and Co-Reference Resolution in the Dialogue Module of the Computerized Virtual Patient
Project code
NSC100-2221-E019-062
Chinese title
電腦化虛擬病人對話模組中省略與同指涉現象之處理
 
Project Coordinator
Chuan-Jie Lin
Funding Organization
National Science and Technology Council
 
Department/Unit
Department of Computer Science and Engineering
Website
https://www.grb.gov.tw/search/planDetail?id=2339546
Year
2011
 
Start date
01-08-2011
Expected completion
31-07-2012
 
Budget
NT$505 thousand
 
Research Field
Information Science: Software
 

Description

Abstract
This project was originally part of an integrated project whose aim is to build a computerized virtual patient (CVP) system, a problem-based learning (PBL) teaching platform for the medical domain. A CVP system holds many prepared virtual-patient teaching cases. Each case contains the history-taking questions and their answers, together with the results of physical examinations and laboratory tests. The system plays the role of the virtual patient and the student plays the doctor: the student asks questions in everyday natural language through the dialogue platform, and the system replies with the answers prepared in the teaching case so that the student can work out what condition the virtual patient has. The system also records the sequence of questions the student asks, which can be used to identify weaknesses in the student's diagnostic reasoning. Such a system is expected to reduce the teaching burden on instructors and save the cost of training standardized patients.

This project targets the dialogue module of the CVP system. Its main goal is to develop methods that automatically detect ellipsis and co-reference, recover the elided information, and identify the antecedent that each referring expression points to. The project is being carried out together with the other sub-projects of the integrated project, and two further years of research are planned.

The first year focuses on resolving ellipsis and co-reference from surface information. The work includes analyzing the part-of-speech (POS) sequences that occur with ellipsis or co-reference, learning the sentence patterns frequently used in medical diagnostic dialogue, and choosing features based on similarity, proximity, and semantics, for example the distance between an ellipsis position or referring expression and each candidate, and the similarity between a referring expression and each candidate. The ellipsis and co-reference resolution module is then designed with hand-written rules or machine learning.

The second year focuses on resolving ellipsis and co-reference with information from an ontology. The work includes building the ontology, extracting the concept relations that appear in questions, studying resolution features derived from the ontology, concept relations, and question patterns, and designing the resolution module with hand-written rules or machine learning.

The scheduled tasks in the first year are as follows.

(1) Analyzing the POS sequences of referring expressions
    1. Performing word segmentation and POS tagging on the experimental data
    2. Calculating the frequencies of the POS sequences of referring expressions
    3. Studying how these POS sequences relate to co-reference
(2) Analyzing words and their POS tags in sentences where ellipsis occurs
    1. Performing word segmentation and POS tagging on the experimental data
    2. Calculating the frequencies and mutual exclusivity of POS tags and POS sequences where ellipsis occurs
    3. Studying how these POS tags relate to ellipsis
(3) Learning frequently used patterns in medical diagnostic dialogue
    1. Annotating the training data with diagnostic question types
    2. Learning patterns from surface and POS-sequence cues
    3. Evaluating the reliability of the patterns
(4) Detecting ellipsis and co-reference with frequently used patterns
    1. Designing the pattern-matching module
    2. Integrating patterns with surface and POS features
    3. Determining where ellipsis and co-reference occur
(5) Learning features for ellipsis and co-reference resolution
    1. Learning detection features from the training data
    2. Learning resolution rules from the training data
    3. Developing classifiers by machine learning
(6) Performance evaluation

The scheduled tasks in the second year are as follows.

(1) Building the Medical Diagnostic Dialogue Ontology (MDDO)
    1. Collecting the semantic categories of elided information as concepts and relationships in MDDO
    2. Collecting terms for concepts and relationships from online thesauri such as MeSH or WordNet
    3. Correcting errors manually
(2) Extracting concept relations in sentences
    1. Preparing training data for concept relation extraction
    2. Writing a program to find occurrences of concepts and relationships in a sentence
(3) Learning features for ellipsis resolution
    1. Learning ellipsis detection features based on concept relations from the training data
    2. Learning ellipsis resolution rules based on concept relations from the training data
(4) Learning features for co-reference resolution
    1. Learning co-reference detection features based on concept relations from the training data
    2. Learning co-reference resolution rules based on concept relations from the training data
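
As an illustration of the first-year step that calculates the frequencies of POS sequences, the following is a minimal Python sketch, not the project's implementation. It assumes the dialogue data has already been segmented and POS-tagged. The tag names (WH, AUX, PRON, V, ADV), the sample utterances, and the function names `pos_ngrams` and `pos_sequence_frequencies` are all hypothetical and chosen only for this example.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# A tagged utterance is a list of (word, POS) pairs. The tags used below are
# made up for illustration; they are not the project's tag set.
TaggedUtterance = List[Tuple[str, str]]


def pos_ngrams(utterance: TaggedUtterance, n: int) -> Iterable[Tuple[str, ...]]:
    """Yield every contiguous POS sequence of length n in one utterance."""
    tags = [pos for _, pos in utterance]
    for i in range(len(tags) - n + 1):
        yield tuple(tags[i:i + n])


def pos_sequence_frequencies(corpus: Iterable[TaggedUtterance], n: int = 2) -> Counter:
    """Count how often each POS n-gram occurs across the whole corpus."""
    counts: Counter = Counter()
    for utterance in corpus:
        counts.update(pos_ngrams(utterance, n))
    return counts


# Hypothetical hand-tagged doctor questions (English glosses for readability).
corpus = [
    [("where", "WH"), ("does", "AUX"), ("it", "PRON"), ("hurt", "V")],
    [("when", "WH"), ("did", "AUX"), ("it", "PRON"), ("start", "V")],
    [("how", "WH"), ("long", "ADV")],
]

freqs = pos_sequence_frequencies(corpus, n=2)
for seq, count in freqs.most_common(3):
    print(seq, count)
```

The resulting counts could then be compared between utterances that contain ellipsis or co-reference and those that do not, which is one plausible way to approach the later step of studying how the POS sequences relate to each phenomenon.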
 
Keyword(s)
virtual patient
clinical diagnostic dialogue
ellipsis
co-reference
natural language processing
 