http://scholars.ntou.edu.tw/handle/123456789/17875
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Hsu Yang | en_US |
dc.contributor.author | Chuan-Jie Lin | en_US |
dc.date.accessioned | 2021-10-21T01:00:36Z | - |
dc.date.available | 2021-10-21T01:00:36Z | - |
dc.date.issued | 2020-05 | - |
dc.identifier.uri | http://scholars.ntou.edu.tw/handle/123456789/17875 | - |
dc.description.abstract | This paper introduced TOCP, a larger dataset of Chinese profanity. This dataset contains natural sentences collected from social media sites, the profane expressions appearing in the sentences, and their rephrasing suggestions which preserve their meanings in a less offensive way. We proposed several baseline systems using neural network models to test this benchmark. We trained embedding models on a profanity-related dataset and proposed several profanity-related features. Our baseline systems achieved an F1-score of 86.37% in profanity detection and an accuracy of 77.32% in profanity rephrasing. | en_US |
dc.language.iso | en | en_US |
dc.publisher | European Language Resources Association (ELRA) | en_US |
dc.title | TOCP: A Dataset for Chinese Profanity Processing. | en_US |
dc.type | conference paper | en_US |
dc.relation.journalvolume | Proceedings of the Second Workshop on Trolling, Aggression and Cyberbullying | en_US |
dc.relation.pages | 6–12 | en_US |
item.cerifentitytype | Publications | - |
item.openairetype | conference paper | - |
item.openairecristype | http://purl.org/coar/resource_type/c_5794 | - |
item.fulltext | no fulltext | - |
item.grantfulltext | none | - |
item.languageiso639-1 | en | - |
crisitem.author.dept | College of Electrical Engineering and Computer Science | - |
crisitem.author.dept | Department of Computer Science and Engineering | - |
crisitem.author.dept | National Taiwan Ocean University, NTOU | - |
crisitem.author.parentorg | National Taiwan Ocean University, NTOU | - |
crisitem.author.parentorg | College of Electrical Engineering and Computer Science | - |
Appears in Collections: | Department of Computer Science and Engineering |
Items in the IR system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.