Automated Human Transcription Error Detection Framework for Korean ASR Corpus
Automated Transcription Error Detection for KSponSpeech
- Developed a framework for automatically detecting and correcting human transcription errors in the KSponSpeech corpus using ASR and language model–based validation.
- Presented the work at the Korea Computer Congress (KCC) 2023.
  - A Model-based Method for Automatic Transcription Error Detection in ASR Corpora.
  - Won the Best Paper award!
- Published the extended study in the Journal of the Korean Institute of Information Scientists and Engineers (KIISE) (2024).
  - Jeongpil Lee, Jeehyun Lee, Yerin Choi, Jaehoo Jang, & Myoung-Wan Koo (2024). An Automated Error Detection Method for Speech Transcription Corpora Based on Speech Recognition and Language Models. Journal of KIISE, 51(4), 362–369. https://doi.org/10.5626/JOK.2024.51.4.362
My Contribution
- Led the design and implementation of the automatic error detection framework.
- Tuned the model-confidence and LM-scoring thresholds through extensive validation experiments.
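
As a rough illustration only (not the published implementation), a threshold rule of the kind described above might look like the sketch below. It assumes an utterance is flagged when the ASR hypothesis disagrees with the human transcript, the ASR model is confident, and the language model scores the hypothesis higher than the transcript. The names `Utterance`, `is_suspect`, and the threshold values are all hypothetical; the actual criteria and thresholds are those reported in the papers.

```python
from dataclasses import dataclass


def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate of hyp measured against ref."""
    if not ref:
        return 0.0 if not hyp else 1.0
    return edit_distance(ref, hyp) / len(ref)


@dataclass
class Utterance:
    # Hypothetical record; field names are assumptions for this sketch.
    transcript: str             # human transcription from the corpus
    asr_hypothesis: str         # ASR output for the same audio
    asr_confidence: float       # ASR confidence in [0, 1]
    lm_score_transcript: float  # LM log-probability of the transcript
    lm_score_hypothesis: float  # LM log-probability of the hypothesis


def is_suspect(u: Utterance,
               min_cer: float = 0.1,         # assumed value, to be tuned
               min_confidence: float = 0.9,  # assumed value, to be tuned
               min_lm_margin: float = 0.0) -> bool:
    """Flag a transcript as a likely human transcription error."""
    disagrees = cer(u.transcript, u.asr_hypothesis) >= min_cer
    confident = u.asr_confidence >= min_confidence
    lm_prefers_hyp = (u.lm_score_hypothesis - u.lm_score_transcript) > min_lm_margin
    return disagrees and confident and lm_prefers_hyp


example = Utterance(
    transcript="오늘 날시가 좋다",      # made-up example with a typo
    asr_hypothesis="오늘 날씨가 좋다",
    asr_confidence=0.95,
    lm_score_transcript=-20.0,
    lm_score_hypothesis=-12.0,
)
print(is_suspect(example))
```

Requiring all three signals to agree is one conservative design choice: it trades recall for fewer false alarms, and each threshold can then be tuned separately on a validation set.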