• Title/Abstract/Keywords: Korean Standard Speech Database

15 search results (processing time: 0.018 s)

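Meta-data Standardization of Speech Database

  • 김상훈
    • Korean Society of Phonetic Sciences and Speech Technology: Conference Proceedings / Korean Society of Phonetic Sciences and Speech Technology, October 2003 Conference Proceedings / pp. 61-64 / 2003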
  • In this paper, we introduce a new method for describing the annotation information of a speech database. As one structured description method, an XML-based description, which has been standardized by the W3C, will be applied to represent the metadata of the speech database. It will be continuously revised through the speech technology standard forum during this year.

Standardization for Annotation Information Description of Speech Database

  • 김상훈;이영직;한민수
    • Journal of the Korean Society of Phonetic Sciences and Speech Technology: Malsori / No. 47 / pp. 109-120 / 2003
  • This paper presents the speech database standardization activities at ETRI. Recently, with government support, ETRI and SiTEC have been gathering a large speech corpus for domestic speech-related companies. First, because knowledge of speech database specifications has not been shared, the distributed speech databases have different formats. A common format is therefore needed as soon as possible, and ETRI and SiTEC are trying to find a better representation format for speech databases. Second, we introduce a new method for describing the annotation information of a speech database. As one structured description method, an XML-based description will be applied to represent the metadata of the speech database. It will be continuously revised through the speech technology standard forum during this year.

Developing a Korean standard speech DB (II)

  • 신지영;김경화
    • Phonetics and Speech Sciences / Vol. 9, No. 2 / pp. 9-22 / 2017
  • The purpose of this paper is to report the whole process of developing the Korean Standard Speech Database (KSS DB). This project was supported by an SPO (Supreme Prosecutors' Office) research grant for three years, from 2014 to 2016. KSS DB is designed to provide speech data for acoustic-phonetic and phonological studies and for speaker recognition systems. For the samples to represent spoken Korean, sociolinguistic factors such as region (9 regional dialects), age (5 age groups over 20) and gender (male and female) were considered. The goal of the project was to collect speech from over 3,000 male and female speakers of nine regional dialects and five age groups, employing direct and indirect methods. Speech samples of 3,191 speakers (2,829 speakers and 362 speakers using direct and indirect methods, respectively) were collected and databased. KSS DB is designed to collect read and spontaneous speech samples from each speaker through 5 speech tasks: three (pseudo-)spontaneous speech tasks (producing prolonged simple vowels, 28 blanked sentences and spontaneous talk) and two read speech tasks (reading 55 phonetically and phonologically rich sentences and reading three short passages). KSS DB includes a 16-bit, 44.1 kHz speech waveform file and an orthographic file for each speech task.

XML Based Meta-data Specification for Industrial Speech Databases

  • 주영희;홍기형
    • Journal of the Korean Society of Phonetic Sciences and Speech Technology: Malsori / Vol. 55 / pp. 77-91 / 2005
  • In this paper, we propose an XML-based meta-data specification for industrial speech databases. Building speech databases is very time-consuming and expensive. Recently, with government support, a huge amount of speech corpus data has been collected as speech databases. However, the formats and meta-data of these databases differ depending on the institution that constructed them. To improve the reusability and portability of speech databases, a standard representation scheme should be adopted by all institutions that construct them. ETRI proposed an XML-based annotation scheme [5] for speech databases, but the scheme has an overly simple and flat modeling structure and may cause duplicated information. To overcome these disadvantages of the previous scheme, we first define the speech database more formally and then identify the objects appearing in speech databases. We then design the data model for speech databases in an object-oriented way. Based on the designed data model, we develop the meta-data specification for industrial speech databases.

Developing a Korean Standard Speech DB

  • 신지영;장혜진;강연민;김경화
    • Phonetics and Speech Sciences / Vol. 7, No. 1 / pp. 139-150 / 2015
  • The data accumulated in this database will be used to develop a speaker identification system. It may also be applied to, but is not limited to, the fields of phonetic studies, sociolinguistics, and language pathology. We plan to supplement the large-scale speech corpus next year, in terms of research methodology and content, to better answer the needs of diverse fields. The purpose of this study is to develop a speech corpus for standard Korean speech. For the samples to viably represent the state of spoken Korean, demographic factors were considered to achieve a balanced spread of age, gender, and dialects. Nine separate regional dialects were categorized, and five age groups were established from individuals in their 20s to 60s. A speech-sample collection protocol was developed for the purpose of this study in which each speaker performs five tasks: two reading tasks, two semi-spontaneous speech tasks, and one spontaneous speech task. This particular configuration of sample data collection accommodates the gathering of rich and well-balanced speech samples across various speech types, and is expected to improve the utility of the speech corpus developed in this study. Samples from 639 individuals were collected using the protocol. Speech samples were also collected from other sources, for a combined total of samples from 1,012 individuals.

A sociophonetic study on high/mid back vowels in Korean

  • 이향원;신우봉;신지영
    • Phonetics and Speech Sciences / Vol. 9, No. 2 / pp. 39-51 / 2017
  • The current study aims to investigate the effect of sociolinguistic factors such as region, generation and gender on the acoustic properties of Korean high and mid back vowels. We analyzed the vowel productions of one hundred twenty-eight subjects from the Korean Standard Speech Database, chosen to represent the different possible combinations of region, generation, and gender. The results reveal a chain-like shift in the back vowels. Unlike previous studies that have reported /o/-/u/ becoming closer as a result of a decreasing F1 in /o/, we found that the distance between the two vowels is determined more by the changing F2 in /u/. Also, the F2 of /u/ and /ɯ/, and the F2 of /ʌ/ and F1 of /o/, appear to move in tandem. Lastly, this study suggests that the vowel changes may differ across gender and regional dialects because they are all converging on standard Korean.

Speech rate in Korean across region, gender and generation

  • 이나라;신지영;유도영;김경화
    • Phonetics and Speech Sciences / Vol. 9, No. 1 / pp. 27-39 / 2017
  • This paper deals with how speech rate in Korean is affected by sociolinguistic factors such as region, gender and generation. Speech rate was quantified as articulation rate (excluding physical pauses) and speaking rate (including physical pauses), both expressed as the number of syllables per second (sps). Other acoustic measures such as pause frequency and duration were also examined. Four hundred twelve subjects were chosen from the Korean Standard Speech Database considering their age, gender and region. The results show that generation has a significant effect on both speaking rate and articulation rate. Younger speakers produce their speech with significantly faster speaking and articulation rates than older speakers. The mean duration of the total pause interval and the total number of pauses of older speakers are also significantly different from those of younger speakers. Gender has a significant effect only on articulation rate, which means male speakers' speech rate is characterized by a faster articulation rate and longer, more frequent pauses. Finally, region has no effect on either speaking rate or articulation rate.

Acoustic characteristics of the sustained vowel phonation according to age groups

  • 서윤정;신지영
    • Phonetics and Speech Sciences / Vol. 10, No. 4 / pp. 67-76 / 2018
  • This study was performed to investigate the acoustic characteristics of sustained vowels produced by Seoul Korean speakers. For this study, three hundred nine healthy adults were chosen as participants from the Korean Standard Speech Database. These subjects were divided into five chronological age groups (20s, 30s, 40s, 50s, 60-70s) and two gender groups (male and female). Fundamental frequency (f0), jitter, shimmer, and NHR (noise-to-harmonics ratio) were measured for 8 Korean vowels (/ɑ/, /æ/, /ʌ/, /e/, /o/, /u/, /ɯ/, /i/) using Praat. The results showed that the vowel type significantly affected all acoustic parameters. Gender affected f0, jitter, and NHR significantly. The mean f0 of female speakers was greater than that of male speakers, and the mean jitter and NHR of male speakers were greater than those of female speakers. Moreover, age affected shimmer and NHR significantly; in particular, the shimmer and NHR of elderly speakers were greater than those of young speakers.

Establishment of the Korean Standard Vocal Sound into Character Conversion Rule

  • 이계영;임재걸
    • Journal of the Institute of Electronics Engineers of Korea CI / Vol. 41, No. 2 / pp. 51-64 / 2004
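  • The goal of this study is to devise rules that convert Korean phonetic values into Hangul, the writing system used to transcribe Korean, by applying in reverse the phonological alternation rules that convert Hangul into Korean phonetic values. The established rules play a very valuable role in Korean speech recognition. General speech recognition techniques compare the input speech to be recognized against standard speech patterns extracted through several rounds of training and find the most similar pattern. If the standard speech patterns are eojeol (space-delimited word units), millions of standard patterns must be stored, so a vast database of standard patterns has to be built and the number of comparisons against those patterns becomes too large, making practical use impossible. With syllable-level recognition, the alternative, the recognized phonetic values do not match the actual Hangul spelling, so the recognized result must be converted into the actual Hangul spelling when it is output. The standard Korean phonetic value-to-written character conversion rules proposed in this paper can be applied in solving this problem, that is, in converting a sequence of Korean phonetic values into a sequence of Hangul characters. We implemented a system that converts Korean phonetic values into Hangul spelling using the newly proposed conversion rules. To demonstrate the integrity of the devised rules, we tested the implemented system with a data set reflecting the 30 clauses of the standard pronunciation rules, and we present the experimental results.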

Acoustic Characteristics of the Voices of Korean Normal Adults by Gender on MDVP

  • 김재옥
    • Phonetics and Speech Sciences / Vol. 1, No. 4 / pp. 147-157 / 2009
  • The purpose of the study is to develop a normal voice database and to analyze the acoustic characteristics of Korean adults' voices by gender using MDVP. Eight categories in the 34 parameters of MDVP were analyzed in the voices of 170 normal Korean adults, taken from the vowel /a/. Among them, Fundamental Frequency Parameters and Frequency Perturbation Parameters were significantly different by gender. In addition, the Fundamental Frequency Parameters of our data were remarkably different from the data suggested in the MDVP program, which is currently used in clinics. Therefore, the data obtained from the current study can be effectively used for the diagnosis of voice disorders in Korean adults as the standard parameter values of MDVP.