Implement of Semi-automatic Labeling Using Transcripts Text

  • Won, Dong-Jin (School of Electronic Computer, Seokyeong University) ;
  • Chang, Moon-soo (School of Computer science, Seokyeong University) ;
  • Kang, Sun-Mee (School of Electronic Engineering, Seokyeong University)
  • Received : 2015.06.15
  • Accepted : 2015.08.24
  • Published : 2015.12.25

Abstract

In transcription for spoken-language research, labeling is the task of linking each utterance in the transcript text to the corresponding span of recorded speech. Most existing labeling tools require this work to be done manually. The semi-automatic labeling we propose consists of an automation module and a manual adjustment module. The automation module extracts voiced intervals using G. Saha's algorithm, then predicts utterance boundaries from the number and length of the utterances in the previously prepared transcript text. To preserve the accuracy of existing manual tools, we provide a manual adjustment user interface for correcting the automatically labeled utterance boundaries. A tool implementing the proposed semi-automatic labeling algorithm improved working speed by 27% on average compared with existing manual labeling tools.
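
The two automation steps named in the abstract can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the authors' implementation. The first part follows the general idea of the silence-removal method of Saha et al. as we understand it: model the opening stretch of the recording as background noise, flag samples that lie more than about three standard deviations from that noise, and take a majority vote per 10 ms frame. The second part is purely hypothetical, since the abstract does not give the prediction rule: it distributes the voiced segments over the utterances in proportion to text length. All function names, parameters, and defaults (`voiced_frames`, `voiced_segments`, `predict_utterances`, `noise_ms`, `frame_ms`, `k`) are illustrative.

```python
from statistics import mean, pstdev


def voiced_frames(samples, rate, noise_ms=200, frame_ms=10, k=3.0):
    """Return one boolean per frame: True = voiced, False = silence."""
    # Assumption: the first `noise_ms` of the recording contain only noise.
    n_noise = max(2, int(rate * noise_ms / 1000))
    noise = samples[:n_noise]
    mu = mean(noise)
    sigma = pstdev(noise) or 1e-9  # guard against perfectly flat noise
    # Per-sample test: normalized distance from the noise model.
    flags = [abs(x - mu) / sigma > k for x in samples]
    size = max(1, int(rate * frame_ms / 1000))
    frames = []
    for start in range(0, len(flags), size):
        chunk = flags[start:start + size]
        frames.append(sum(chunk) * 2 > len(chunk))  # majority vote
    return frames


def voiced_segments(frames, frame_ms=10):
    """Merge runs of voiced frames into (start_ms, end_ms) segments."""
    segments, start = [], None
    for i, voiced in enumerate(frames + [False]):  # sentinel closes last run
        if voiced and start is None:
            start = i
        elif not voiced and start is not None:
            segments.append((start * frame_ms, i * frame_ms))
            start = None
    return segments


def predict_utterances(segments, utterances):
    """Hypothetical rule: give each utterance a share of the voiced time
    proportional to its character count, cutting only at segment borders."""
    if len(segments) < len(utterances):
        raise ValueError("fewer voiced segments than utterances")
    total_chars = sum(len(u) for u in utterances)  # assumed to be non-zero
    total_voiced = sum(end - start for start, end in segments)
    result, seg_i, spent, chars = [], 0, 0, 0
    for n, text in enumerate(utterances):
        chars += len(text)
        target = total_voiced * chars / total_chars  # cumulative voiced time
        later = len(utterances) - n - 1  # utterances still to be placed
        first = seg_i
        while seg_i < len(segments):
            start, end = segments[seg_i]
            if seg_i > first:  # the first segment is always taken
                if len(segments) - seg_i <= later:
                    break  # leave one segment for each later utterance
                if spent + (end - start) / 2 > target:
                    break  # taking this segment would overshoot the target
            spent += end - start
            seg_i += 1
        result.append((segments[first][0], segments[seg_i - 1][1], text))
    return result
```

In a semi-automatic workflow, the boundaries returned by `predict_utterances` would only be starting points; the manual adjustment interface described in the abstract is what corrects the cases where text length is a poor proxy for speaking time.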

References

  1. TalkBank, "TalkBank Transcript Browser," Available: http://talkbank.org, [Accessed: Feb 2, 2015].
  2. CHILDES, "CHILDES Transcript Browser," Available: http://childes.psy.cmu.edu/browser, [Accessed: Feb 2, 2015].
  3. Bigi, Brigitte, "SPPAS: a tool for the phonetic segmentations of Speech," The eighth international conference on Language Resources and Evaluation, vol. 8, pp. 1748-1755, 2012.
  4. Sharmistha S. Gray, et al., "Child Automatic Speech Recognition for US English: Child Interaction with Living-Room-Electronic-Devices," WOCCI 2014, poster session, 2014.
  5. Jiyoung Shin, et al., "Developing a Korean Standard Speech DB," Journal of the Korean society of speech sciences, vol. 7, no. 1, pp. 139-150, 2015.
  6. CHILDES, "Using CLAN," Available: http://childes.psy.cmu.edu/clan/, [Accessed: Feb 2, 2015].
  7. Claude Barras, et al., "Transcriber: a free tool for segmenting, labeling and transcribing speech," First international conference on language resources and evaluation (LREC), pp. 1373-1376, 1998.
  8. Boersma, P. and Weenink, D., "Praat: doing phonetics by computer," Available: http://www.praat.org, 2009, [Accessed: Feb 2, 2015].
  9. Jongmo Sung and Hyung Soon Kim, "Implementation of the Automatic Speech Segmentation and Labeling System," The Journal of The Acoustical Society of Korea, vol. 16, no. 5, pp. 50-59, 1997.
  10. Kang-Chun So, "A Study on the Method of Computational Processing of Dialectal Sound Data," The Society of Korean Language and Literature, vol. 142, pp. 7-30, 2006.
  11. Sun-dong Kwak and Moon-soo Chang, "CosmoScriBe 2.0 : The development of Korean transcription tools," Journal of Korean Institute of Intelligent Systems, vol. 24, no. 3, pp. 323-329, 2014. https://doi.org/10.5391/JKIIS.2014.24.3.323
  12. G. Saha, Sandipan Chakroborty, and Suman Senapati, "A new silence removal and endpoint detection algorithm for speech and speaker recognition applications," Proceedings of the 11th National Conference on Communications (NCC), pp. 291-295, 2005.
  13. Dong-jin Won and Moon-soo Chang, "An Improvement of Audio controller in Transcription Tool," Proceedings of KIIS Spring Conference, vol. 22, no. 2, pp. 121-122, 2012.
  14. Donald A. Norman, The Design of Everyday Things, Basic Books, 2002.
  15. Tekla S. Perry and John Voelcker, "Of mice and menus: designing the user-friendly interface," IEEE Spectrum, vol. 26, no. 9, pp. 46-51, 1989. https://doi.org/10.1109/6.90184