Please use this identifier to cite or link to this item: http://scholars.ntou.edu.tw/handle/123456789/17875
Title: TOCP: A Dataset for Chinese Profanity Processing.
Authors: Hsu Yang, Chuan-Jie Lin
Issue Date: May-2020
Publisher: European Language Resources Association (ELRA)
Journal Volume: Proceedings of the Second Workshop on Trolling, Aggression and Cyberbullying
Start page/Pages: 6–12
Abstract:
This paper introduces TOCP, a larger dataset of Chinese profanity. The dataset contains natural sentences collected from social media sites, the profane expressions that appear in them, and rephrasing suggestions that preserve their meaning in a less offensive way. We propose several baseline systems based on neural network models to test this benchmark. We train embedding models on a profanity-related dataset and propose several profanity-related features. Our baseline systems achieve an F1-score of 86.37% in profanity detection and an accuracy of 77.32% in profanity rephrasing.
URI: http://scholars.ntou.edu.tw/handle/123456789/17875
Appears in Collections: Department of Computer Science and Engineering
