  1. National Taiwan Ocean University Research Hub

Interest Detection and Wish-List Suggestion from a Blog Space


Basic Information

Project title
Interest Detection and Wish-List Suggestion from a Blog Space
Project code
NSC96-2221-E019-038-MY2
Chinese title
由部落格自動偵測興趣並建議購物願望清單
 
Project Coordinator
Chuan-Jie Lin
Funding Organization
National Science and Technology Council
 
Department/Unit
Department of Computer Science and Engineering
Website
https://www.grb.gov.tw/search/planDetail?id=1622444
Year
2008
 
Start date
01-08-2008
Expected completion
31-07-2009
 
Budget
NT$627 thousand
 
Research Field
Information Science: Software
 

Description

Abstract
Blogs (weblogs) have been one of the hottest topics on the Internet in recent years, and the number of users building and writing their own blogs has grown rapidly. Applying technology to blogs to extract useful information has also become a popular research topic, since a large user base and a large volume of data represent a significant opportunity in the Internet market.

This two-year project studies how to automatically suggest a wish list for a blogger from the information available in the blog, including the blogger's personal profile and the content of the blog articles. A wish list is a list of products that the blogger may want to buy, or be interested in buying, in the near future. Such a wish-list suggestion system is useful to both bloggers and companies, and the techniques that must be developed along the way also contribute to academic research.

In the first year, a database of products will be built, keeping only the product types that are appropriate for a wish list. A database of human interests will also be built, after considering what can be defined as a "human interest". Using these two databases, together with a large volume of documents from the Internet or the blog space, the strength of association between each product and each interest will be estimated.

In the second year, a wish-list suggestion system will be built. The information it uses includes the interests detected for a blogger and the frequencies of products mentioned in the blogger's articles. The system estimates a desire score for each product and ranks the products to produce the suggested wish list.

The first-year tasks are as follows.

(1) Build a database of products
  1. Collect product types from online auction websites.
  2. Investigate which products are appropriate to appear in a wish list.
  3. Study how to rewrite inappropriate or incomplete product names.
  4. Build a database to store the product data.
  5. Develop an automatic method to carry out the steps above.
  6. Evaluate the performance of the automatic method.

(2) Build a database of human interests
  1. Collect user profiles from a website.
  2. Extract the profile fields related to interests.
  3. Examine the interest descriptions and define "interest".
  4. Analyze the text of the interest fields automatically, extract candidate interests, and save them in a database.
  5. Evaluate the performance of the interest extraction.

(3) Build the association between products and human interests
  1. Measure the co-occurrence of each product and each interest in a large corpus.
  2. Design a function to estimate the strength of association between a product and an interest.

The second-year tasks are as follows.

(1) Detect the interests of a blogger
  1. Detect the blogger's interests from the personal profile.
  2. Collect the articles the blogger has published in the blog.
  3. Detect the blogger's interests from the article content using keyword matching.
  4. Detect the blogger's interests from the article content using IR relevance measurement.

(2) Suggest a wish list from the information extracted from a blog
  1. Detect the interests of the blogger.
  2. Measure the frequency with which each product appears in the blogger's articles.
  3. Design a function to estimate the desire score of a product for a blogger.
  4. Score every product, rank the products by score, and build the wish list.

(3) Design evaluation methods and build an evaluation environment
  1. Design the criteria for judging how good a wish list is.
  2. From a researcher's point of view, evaluate the completeness and appropriateness of the suggested wish list.
  3. From a real blogger's point of view, evaluate the completeness and appropriateness of the suggested wish list via a questionnaire.
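
The pipeline outlined in the abstract (product-interest association, interest detection by keyword matching, product frequency, desire score, ranking) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the project: pointwise mutual information (PMI) stands in for the unspecified association function, a weighted sum with hypothetical weights `alpha` and `beta` stands in for the unspecified desire-score function, and all function names, counts, and sample texts are invented.

```python
import math
from collections import Counter


def pmi(pair_count, product_count, interest_count, total_docs):
    """PMI from document co-occurrence counts (assumed association measure)."""
    if min(pair_count, product_count, interest_count) == 0:
        return 0.0  # undefined PMI is treated as "no association"
    return math.log((pair_count * total_docs) / (product_count * interest_count))


def detect_interests(articles, interest_terms):
    """Keyword matching: an interest is detected if its term occurs in any article."""
    text = " ".join(articles).lower()
    return {term for term in interest_terms if term.lower() in text}


def product_frequencies(articles, products):
    """Count substring occurrences of each product name across all articles."""
    counts = Counter()
    for article in articles:
        lowered = article.lower()
        for product in products:
            counts[product] += lowered.count(product.lower())
    return counts


def desire_scores(articles, products, interest_terms, association,
                  alpha=1.0, beta=1.0):
    """Hypothetical desire score: alpha * frequency + beta * summed association."""
    interests = detect_interests(articles, interest_terms)
    freqs = product_frequencies(articles, products)
    scores = {}
    for product in products:
        # Only positive associations with detected interests contribute.
        assoc = sum(max(association.get((product, interest), 0.0), 0.0)
                    for interest in interests)
        scores[product] = alpha * freqs[product] + beta * assoc
    return scores


def suggest_wish_list(scores, top_n=3):
    """Rank products by score (ties broken by name) and drop zero-score items."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [product for product, score in ranked[:top_n] if score > 0]


# Invented sample data, for illustration only.
articles = ["I love hiking every weekend.",
            "My old tent leaked, I need a new tent."]
products = ["tent", "guitar", "backpack"]
interest_terms = ["hiking", "music"]
association = {
    ("tent", "hiking"): pmi(40, 100, 200, 10000),
    ("backpack", "hiking"): pmi(30, 150, 200, 10000),
    ("guitar", "music"): pmi(50, 100, 300, 10000),
}

scores = desire_scores(articles, products, interest_terms, association)
print(suggest_wish_list(scores))  # prints ['tent', 'backpack']
```

In this toy run "backpack" is suggested even though it is never mentioned, purely through its association with the detected interest "hiking", which is the kind of behaviour the interest-based part of the score is meant to provide. Plain substring matching is used only for brevity; real blog text, especially Chinese text, would need proper segmentation or the IR-based relevance measurement the abstract mentions.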
 
Keyword(s)
blog
wish list
interest detection
natural language processing
 