National Taiwan Ocean University Research Hub
Please use this Handle URI to cite or link to this item: http://scholars.ntou.edu.tw/handle/123456789/25515
DC Field | Value | Language
dc.contributor.author | Wang, Jung-Hua | en_US
dc.contributor.author | Lee, Shih-Kai | en_US
dc.contributor.author | Wang, Ting-Yuan | en_US
dc.contributor.author | Chen, Ming-Jer | en_US
dc.contributor.author | Hsu, Shu-Wei | en_US
dc.date.accessioned | 2024-11-01T09:18:05Z | -
dc.date.available | 2024-11-01T09:18:05Z | -
dc.date.issued | 2024/1/1 | -
dc.identifier.issn | 2169-3536 | -
dc.identifier.uri | http://scholars.ntou.edu.tw/handle/123456789/25515 | -
dc.description.abstract | We present an ensemble learning-based data cleaning approach (termed ELDC) capable of identifying and pruning anomalous data. ELDC is distinguished by the fact that an ensemble of base models can be trained directly on the noisy in-sample data and can dynamically provide clean data during iterative training. Each base model uses a random subset of the target dataset, which may initially contain up to 40% label errors. After each training iteration, anomalous data are distinguished from clean data by a majority-voting scheme, and three types of anomaly (mislabeled samples, confusing samples, and outliers) can be identified from a statistical pattern jointly determined by the prediction outputs of the base models. By iterating this train-vote-remove cycle, noisy in-sample data are progressively removed until a prespecified condition is reached. Comprehensive experiments, including out-of-sample data tests, are conducted to verify the effectiveness of ELDC in simultaneously suppressing bias and variance of the prediction output. The ELDC framework is highly flexible, as it is not bound to a specific model and allows different transfer-learning configurations. AlexNet, ResNet50, and GoogLeNet are used as base models and trained on various benchmark datasets; the results show that ELDC outperforms state-of-the-art cleaning methods. | en_US
dc.language.iso | English | en_US
dc.publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC | en_US
dc.relation.ispartof | IEEE ACCESS | en_US
dc.subject | Training | en_US
dc.subject | Data models | en_US
dc.subject | Cleaning | en_US
dc.subject | Noise measurement | en_US
dc.subject | Image classification | en_US
dc.subject | Complexity theory | en_US
dc.subject | Training data | en_US
dc.subject | Ensemble learning | en_US
dc.subject | Data integrity | en_US
dc.subject | Transfer learning | en_US
dc.subject | Convolutional neural networks | en_US
dc.subject | Noisy data | en_US
dc.subject | ensemble learning | en_US
dc.subject | data cleanline | en_US
dc.title | Progressive Ensemble Learning for in-Sample Data Cleaning | en_US
dc.type | journal article | en_US
dc.identifier.doi | 10.1109/ACCESS.2024.3468035 | -
dc.identifier.isi | WOS:001329024200001 | -
dc.relation.journalvolume | 12 | en_US
dc.relation.pages | 140643-140659 | en_US
item.openairecristype | http://purl.org/coar/resource_type/c_6501 | -
item.cerifentitytype | Publications | -
item.languageiso639-1 | English | -
item.fulltext | no fulltext | -
item.grantfulltext | none | -
item.openairetype | journal article | -
crisitem.author.dept | College of Electrical Engineering and Computer Science | -
crisitem.author.dept | Department of Electrical Engineering | -
crisitem.author.dept | National Taiwan Ocean University,NTOU | -
crisitem.author.parentorg | National Taiwan Ocean University,NTOU | -
crisitem.author.parentorg | College of Electrical Engineering and Computer Science | -
Appears in Collections: Department of Electrical Engineering
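
The abstract above describes an iterative train-vote-remove cycle. Below is a minimal, illustrative Python sketch of that idea only: it substitutes small scikit-learn decision trees for the paper's CNN base models (AlexNet, ResNet50, GoogLeNet), and the eldc_clean name, subset fraction, voting threshold, and stopping rule are assumptions made for illustration rather than the authors' exact procedure.

    # Minimal sketch of a train-vote-remove cleaning loop (illustrative only;
    # not the authors' exact ELDC algorithm or anomaly-type analysis).
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def eldc_clean(X, y, n_models=5, subset_frac=0.7, max_iters=10,
                   vote_thresh=0.5, rng=None):
        """Iteratively prune samples whose labels most base models disagree with."""
        rng = np.random.default_rng(rng)
        keep = np.ones(len(y), dtype=bool)            # samples still considered clean
        for _ in range(max_iters):
            idx = np.flatnonzero(keep)                # indices of currently retained samples
            votes = np.zeros(len(y))                  # per-sample count of "label looks wrong"
            for _ in range(n_models):
                # Train each base model on a random subset of the retained data
                sub = rng.choice(idx, size=int(subset_frac * len(idx)), replace=False)
                model = DecisionTreeClassifier(max_depth=8).fit(X[sub], y[sub])
                # A model votes against a sample when its prediction disagrees with the label
                votes[idx] += (model.predict(X[idx]) != y[idx])
            flagged = (votes[idx] / n_models) > vote_thresh   # majority of models disagree
            if not flagged.any():                             # stopping condition: nothing left to remove
                break
            keep[idx[flagged]] = False                        # remove suspected anomalies
        return keep                                           # boolean mask of retained (clean) samples

A caller would then keep only the retained samples, e.g. X_clean, y_clean = X[mask], y[mask], and retrain a final model on that cleaned subset.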
Except where a specific copyright statement is indicated, all items in the repository are protected by copyright, with all rights reserved.
