초록
This paper describes the development of the Korean corpus of spontaneous speech, also called the Seoul corpus. The corpus contains the audio recording of the interview-style spontaneous speech from the 40 native speakers of Seoul Korean. The talkers are divided into four age groups; talkers in their teens, twenties, thirties and forties. Each age group has ten talkers, five males and five females. The method used to elicit and record the speech is described. The corpus containing around 220,000 phrasal words was phonemically labeled along with information on the boundaries for Korean phrasal words and utterances, which were additionally romanized. According to the test result of labeling consistency, the inter-labeler agreement on phoneme identification was 98.1% and the mean deviation on boundary placement was 9.04 msec. The corpus will be made available for free to the research community in March, 2015.