• Title/Summary/Keyword: speech technology

Analysis of the Korean Tokenizing Library Module (한글 토크나이징 라이브러리 모듈 분석)

  • Lee, Jae-kyung;Seo, Jin-beom;Cho, Young-bok
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.05a / pp.78-80 / 2021
  • Research on natural language processing (NLP) is evolving rapidly. NLP is a technology that lets computers analyze the meaning of everyday language, and it is used in fields such as speech recognition, spell checking, and text classification. The most commonly used NLP library today is NLTK, which is built around English and is therefore poorly suited to Korean. This paper introduces the Korean tokenizing libraries KoNLPy and Soynlp, analyzes their morphological analysis and processing techniques, compares the modules of Soynlp, which compensates for KoNLPy's shortcomings, and uses them in natural language processing models.

  • PDF

Research on PEFT Feasibility for On-Device Military AI (온 디바이스 국방 AI를 위한 PEFT 효용성 연구)

  • Gi-Min Bae;Hak-Jin Lee;Sei-Ok Kim;Jang-Hyong Lee
    • Proceedings of the Korean Society of Computer Information Conference / 2024.01a / pp.51-54 / 2024
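  • This paper proposes an efficient training method for on-device military AI. Instead of retraining the entire model, the proposed method applies LoRA, a PEFT technique that fine-tunes only the necessary parts and thereby greatly reduces computational cost and time. LoRA does not modify the existing network weights directly; it trains additional low-rank matrices, so the model can adapt efficiently to new tasks without major changes to its structure. We also applied quantization, a model compression technique that represents the trainable parameters and the inputs and outputs of operations with lower-precision floating point (FP16, FP8) or integer (INT8) formats instead of 32-bit floating point (FP32). As a result, the GPU memory usage required for training fell from 32GB to 5.7GB, a reduction of 82.19%. When model performance was evaluated on the same data under the same conditions, the error of the model with LoRA and quantization was 53.34% higher than that of the base model at the same number of training iterations. When the number of training iterations was increased to limit this performance loss, the increase in error fell to 29.29%.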

  • PDF
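
The low-rank update described in the abstract above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the `scale` parameter, and the layer sizes used in the demo are hypothetical. The sketch shows a frozen weight matrix `W` combined with two small trainable matrices `A` and `B`, compares trainable parameter counts, and recomputes the memory reduction from the two figures the abstract reports (32GB and 5.7GB).

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, scale=1.0):
    """Compute W x + scale * B (A x).

    W (d_out x d_in) is the frozen pretrained weight; only the low-rank
    matrices A (rank x d_in) and B (d_out x rank) would be trained.
    """
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scale * u for b, u in zip(base, update)]


def full_param_count(d_out, d_in):
    """Trainable parameters when the whole weight matrix is fine-tuned."""
    return d_out * d_in


def lora_param_count(d_out, d_in, rank):
    """Trainable parameters when only A and B are trained."""
    return rank * (d_in + d_out)


def percent_reduction(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


if __name__ == "__main__":
    # Hypothetical layer size and rank, chosen only for illustration.
    d_out, d_in, rank = 4096, 4096, 8
    print("full fine-tuning params:", full_param_count(d_out, d_in))
    print("LoRA params:", lora_param_count(d_out, d_in, rank))
    # Memory figures taken from the abstract: 32GB before, 5.7GB after.
    print("memory reduction (%):", percent_reduction(32.0, 5.7))
```

With the abstract's figures, (32 - 5.7) / 32 comes to roughly 82.19%, which matches the reported reduction.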

English Pronunciation Education (영어 발음 교육)

  • 이영길
    • Proceedings of the KSPS conference / 1997.07a / pp.258-259 / 1997
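  • 1. In teaching English as a foreign language, pronunciation should be taught directly from the earliest grades possible, without waiting for learners to reach a certain level of English. That is, English education ideally begins with pronunciation. Even (university) students who know fairly advanced grammatical theory often need a great deal of practice where pronunciation is concerned. In other words, such students should be able to show an active command of pronunciation that matches their grammatical knowledge. If sentence structure and vocabulary are treated as important in English education, pronunciation must also be recognized as an indispensable element of the English program from the earliest stage. What, then, should be taught first, and how? One possible order is to begin with the phoneme, the smallest unit of speech, then teach sound combinations such as consonant clusters, then word stress, and finally connected speech, including sentence stress, rhythm, and intonation. Although this approach is logical in theory, it is highly doubtful how effective it is in practice for our students, who learn English as a foreign language. The most useful order is rather to teach meaningful sentence stress within a given context, together with appropriate expressions such as basic intonation, and then to point out the pronunciation of the important sounds involved. For example, when a teacher orally presents a structure such as "Give it to him," speaking too slowly in order to stress each word actually makes the whole sentence harder to pronounce. What matters is to focus on what is needed for basic communication. Problems attached to individual words can be corrected through remedial teaching. 2. Given the current state of elementary school English education in Korea, teaching pronunciation is not easy, but two things should be considered first if forward-looking results are expected. First, the English education curriculum used for teacher training at the universities of education cannot be ignored. Although the universities of education became four-year institutions in 1981, genuine English departments have been operating for only a few years. The current curriculum is not only insufficient to prepare teachers for English education in the classroom, it also has no distinct course on English pronunciation. It is easy to assume that the so-called English conversation classes, usually taught by foreign instructors, can double as pronunciation classes, but that is an entirely separate matter. The curriculum should therefore be one that allows systematic pronunciation teaching. 3. As mentioned above, current teachers who graduated before the four-year system never received instruction in English pronunciation while in school. What matters here is in-service training that gives these teachers appropriate and sufficient pronunciation instruction. In elementary English education, which depends on listening and speaking, the importance of the teacher's knowledge of pronunciation cannot be overstated. The problem is the content of the training. At least so far, the courses in the training for those in charge of elementary English education have been so varied and scattered that it is hard to find a core. As a result, a course on teaching English pronunciation, for example, is assigned as if it were squeezed in reluctantly, and the time allotted to it is about four hours at most. English pronunciation cannot be covered adequately even in a full university semester, so what can be taught, and how, in four hours to elementary teachers with no background knowledge?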

  • PDF

An Epidemiologic Study of Symptoms of Temporomandibular Disorders in Korean College Students (경기도 지역 대학생의 측두하악장애증상에 관한 역학적 연구)

  • Park, Hye-Sook
    • Journal of Oral Medicine and Pain / v.32 no.1 / pp.91-104 / 2007
  • An epidemiologic investigation was carried out to determine the prevalence of symptoms of temporomandibular disorders in college students aged 19-31 years. A total of 460 students were surveyed with a questionnaire from September to December 2006. The results were as follows: 1. The prevalence of symptoms of temporomandibular disorders was 80.6%. 2. The most frequently reported symptom was headache, followed by joint sound, with no distinct difference between men and women. 3. While the occurrence of acute malocclusion decreased with age in men, TMJ pain during chewing or speech increased with age in women. 4. Symptoms including TMJ pain during mouth opening, chewing or speech, TMJ fatigue, and acute malocclusion occurred significantly more frequently in women than in men. Contributing factors including resting cheeks on hands, stressful state, gum chewing, insomnia, and clenching also occurred significantly more frequently in women than in men. 5. There was a highly significant relationship between symptoms and contributing factors including resting cheeks on hands, stressful state, unilateral chewing, insomnia, and clenching. 6. There was a highly significant relationship between symptoms and general personality.

Development of A-ABR System Using a Microprocessor (마이크로프로세서를 이용한 자동청력검사 시스템 개발)

  • Noh, Hyung-Wook;Lee, Tak-Hyung;Kim, Nam-Hyun;Kim, Soo-Chan;Cha, Eun-Jong;Kim, Deok-Won
    • Journal of the Institute of Electronics Engineers of Korea SC / v.46 no.2 / pp.15-21 / 2009
  • Hearing loss is one of the most common birth defects among infants. Most hearing-impaired children are not diagnosed until 1 to 3 years of age, which is too late for the critical period (6 months) for normal speech and language development. If a hearing impairment is identified and treated at an early stage, a child's speech and language skills can be comparable to those of normal-hearing peers. For these reasons, hearing screening at birth and throughout childhood is extremely important. ABR (auditory brainstem response) is currently one of the most reliable diagnostic tools for the early detection of hearing impairment. In this study, we developed a system that automatically detects whether infants or children have a hearing impairment. In future work, it will be developed into a portable system so that measurements can be taken not only in a soundproof room but also in a nursery for neonates.

Limitations and Challenges of Game Regulatory Law and Policy in Korea (현행 게임규제정책의 한계와 과제 : 합리적인 규제를 위한 고려사항)

  • Kwon, Hun-Yeong
    • Journal of Information Technology Services / v.13 no.3 / pp.149-164 / 2014
  • The laws and policies governing Korea's game regulations are increasingly a topic of debate as we enter the Age of the Internet. Internet regulations and policies are not rooted in freedom of speech or the fundamental values of democracy, but are instead focused on solving real-world problems such as the protection of youth. Furthermore, regulatory devices for keeping social order, such as gambling regulation, are being applied directly to games without considering the characteristics of Internet gaming, which raises concerns that the expansion of constitutional values and the innovative empowerment inherent to the Internet are being weakened. The Geun-Hye Park Administration, which succeeded Myung-Bak Lee's Administration, even went so far as to implement the so-called "Shutdown Policy", which prohibits access to Internet games during pre-defined hours, and also introduced a rule for selecting the restricted hours. To curb the gambling nature of Internet games, government-led policies such as mandatory personal identification and the prohibition of player selection (in other words, mandatory random player selection) are being implemented. These institutions can inhibit freedom of speech, which is the basis of democracy, violate the right of equality through unreasonable discrimination between domestic and foreign service providers, and infringe upon principles of administrative law such as legality, due process in policies, and balance among policies and governmental bodies.
Going forward, if Korea's Internet game regulations and policies are to develop in a rational manner, regulatory frameworks will need to be designed to protect the nature of the Internet and the innovative values that enable the realization of constitutional values; for example, the Internet acting as the "catalytic media for freedom of expression as a fundamental human right", which has already been acknowledged by Korea's Constitutional Court. At the same time, transparent procedures should be put in place that allow diverse stakeholders, including game service providers, game users, youth, and parents, to participate in the legislation and enforcement processes of regulatory institutions; policies will also need to change so that self-regulation systems can be established alongside regulatory laws. In this process, scientific and empirical analysis of the expected effects before regulations are introduced, and of the results of enforcement afterward, will need to be strengthened.

Development of a Voice-activated Map Information Retrieval System based on MFC (MFC 기반 음성구동 수치지도정보 검색시스템의 구현)

  • Kim, Nag-Cheol;Kim, Tae-Soo;Jo, Myung-Hee;Chung, Hyun-Yeol
    • Journal of the Korean Association of Geographic Information Studies / v.3 no.1 / pp.69-77 / 2000
  • When digital map information is retrieved and analyzed with a mouse or keystrokes, designating the extent of the study area requires the same mouse operations to be repeated several times. In this study, we propose a voice-activated map information retrieval system that eliminates such repetition, and we implemented it on a personal computer. For practical use, window display control was built in two ways: the traditional OLE (Object Linking and Embedding) method and the MFC (Microsoft Foundation Classes) method. For the performance evaluation, the retrieval data consisted of 68 words uttered by 3 male speakers, including attribute words and control words for the Susung-gu area of Taegu city on a 1:5,000 map. In online tests in an office environment, we obtained an average recognition rate of 98.02% and operating times of 5.39 seconds with OLE and 10.38 seconds with MFC. These results show that an information retrieval system using speech recognition is feasible for practical use with digital maps.

  • PDF

The Speaker Recognition System using the Pitch Alteration (피치변경을 이용한 화자인식 시스템)

  • Jung JongSoon;Bae MyungJin
    • Proceedings of the Acoustical Society of Korea Conference / spring / pp.115-118 / 2002
  • Parameters used in a speaker recognition system should fully express the speaker characteristics present in speech. That is, a feature whose inter-speaker variance is larger than its intra-speaker variance is useful for distinguishing speakers. Minimizing the error between speakers also requires improved recognition technology in addition to distinguishing features. Recent simulation results show that more accurate performance is obtained by using both dynamic characteristics and the static characteristics that arise from speaking habits. We therefore propose the following approach: prosodic information is used as a feature vector of speech. The feature vectors generally used in speaker recognition systems model spectral information and perform well in noise-free conditions. However, these feature vectors are distorted in noisy conditions, which lowers the recognition rate. In this paper, we alter the pitch contour segment by segment so that dynamic characteristics can be estimated, and we use the result as a recognition feature. Simulation confirmed that this dynamic characteristic is very robust in noisy conditions. Acceptance or rejection is decided by comparing test patterns, and the recognition rate with the proposed algorithm is higher than with spectral and prosodic information. In particular, the simulation shows that a stable recognition rate can be obtained in noisy conditions.

  • PDF

Performance Comparison of Out-Of-Vocabulary Word Rejection Algorithms in Variable Vocabulary Word Recognition (가변어휘 단어 인식에서의 미등록어 거절 알고리즘 성능 비교)

  • 김기태;문광식;김회린;이영직;정재호
    • The Journal of the Acoustical Society of Korea / v.20 no.2 / pp.27-34 / 2001
  • Utterance verification is used in variable vocabulary word recognition to reject words that are not in the vocabulary or that were not recognized correctly. It is an important technology for designing a user-friendly speech recognition system. We propose a new utterance verification algorithm for a no-training utterance verification system based on the minimum verification error. First, using the PBW (Phonetically Balanced Words) DB (445 words), we create no-training anti-phoneme models that include many PLUs (Phoneme-Like Units), so that the anti-phoneme models have the minimum verification error. Then, for OOV (Out-Of-Vocabulary) rejection, the phoneme-based confidence measure, which uses the likelihood between the phoneme model (null hypothesis) and the anti-phoneme model (alternative hypothesis), is normalized by the null hypothesis, which tends to make it more robust for OOV rejection. The word-based confidence measure, which is built on the phoneme-based confidence measure, has been shown to improve the detection of near-misses in speech recognition and to discriminate better between in-vocabulary words and OOVs. Using the proposed anti-model and confidence measure, we achieve a significant performance improvement: CA (Correctly Accept for In-Vocabulary) is about 89% and CR (Correctly Reject for OOV) is about 90%, an improvement of about 15-21% in ERR (Error Reduction Rate).

  • PDF
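
The confidence measure in the abstract above can be sketched as a normalized log-likelihood ratio. The abstract does not give the exact formula, so everything here is an assumption made for illustration: dividing the ratio by the magnitude of the phoneme-model log-likelihood, averaging phoneme scores to get a word score, and the threshold value are all hypothetical choices, as are the function names.

```python
def phoneme_confidence(ll_phoneme, ll_anti):
    """Log-likelihood ratio between the phoneme model (null hypothesis)
    and the anti-phoneme model (alternative hypothesis), normalized by
    the magnitude of the null-hypothesis log-likelihood.

    The normalization form is an assumption, not the paper's formula.
    """
    if ll_phoneme == 0:
        raise ValueError("null-hypothesis log-likelihood must be non-zero")
    return (ll_phoneme - ll_anti) / abs(ll_phoneme)


def word_confidence(phoneme_scores):
    """Average the phoneme-level confidences over a word.

    `phoneme_scores` is a list of (ll_phoneme, ll_anti) pairs.
    Averaging is an assumed way of combining phoneme-level scores.
    """
    if not phoneme_scores:
        raise ValueError("a word needs at least one phoneme")
    values = [phoneme_confidence(p, a) for p, a in phoneme_scores]
    return sum(values) / len(values)


def accept_word(phoneme_scores, threshold=0.2):
    """Accept the recognized word if its confidence reaches the threshold.

    The default threshold is hypothetical; in practice it would be tuned
    on held-out in-vocabulary and OOV utterances.
    """
    return word_confidence(phoneme_scores) >= threshold


if __name__ == "__main__":
    # Hypothetical per-phoneme log-likelihoods: (phoneme model, anti-model).
    in_vocabulary = [(-10.0, -15.0), (-20.0, -26.0)]
    out_of_vocabulary = [(-10.0, -9.0), (-20.0, -21.0)]
    print(accept_word(in_vocabulary))
    print(accept_word(out_of_vocabulary))
```

A word whose phonemes fit the phoneme models much better than the anti-models gets a high score and is accepted, while an OOV utterance, whose forced alignment fits both models about equally, falls below the threshold.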

A Semi-Automatic Semantic Mark Tagging System for Building Dialogue Corpus (대화 말뭉치 구축을 위한 반자동 의미표지 태깅 시스템)

  • Park, Junhyeok;Lee, Songwook;Lim, Yoonseob;Choi, Jongsuk
    • KIPS Transactions on Software and Data Engineering / v.8 no.5 / pp.213-222 / 2019
  • Determining the meaning of a keyword in a speech dialogue system is an important technology for the future implementation of an intelligent speech dialogue interface. After keywords are extracted from the user's utterance to grasp its intention, the intention is determined from the semantic marks of the keywords. One keyword can have several semantic marks, and we treat the task of attaching the semantic mark that matches the user's intention as a word sense disambiguation problem. In this study, about 23% of all keywords in the corpus are manually tagged to build a semantic mark dictionary, a synonym dictionary, and a context vector dictionary, and the remaining 77% are then tagged automatically. The semantic mark of a keyword is determined by calculating context vector similarity against the context vector dictionary. For an unregistered keyword, the semantic mark of the most similar keyword is attached using the synonym dictionary. We compare the performance of the system with a manually constructed training set and with a semi-automatically expanded training set by selecting 3 high-frequency keywords and 3 low-frequency keywords in the corpus. In experiments, we obtained an accuracy of 54.4% with the manually constructed training set and 50.0% with the semi-automatically expanded training set.
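
The context-vector step described in the abstract above can be sketched as a nearest-match lookup. The abstract does not name the similarity function, so cosine similarity over word-count vectors is an assumption here, and the dictionary contents, semantic mark labels, and function names are hypothetical examples rather than the authors' data.

```python
import math


def cosine_similarity(vec_a, vec_b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * vec_b.get(key, 0.0) for key, value in vec_a.items())
    norm_a = math.sqrt(sum(v * v for v in vec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty context carries no evidence
    return dot / (norm_a * norm_b)


def choose_semantic_mark(context_vector, mark_vectors):
    """Return the semantic mark whose stored context vector is most
    similar to the context of the keyword being tagged.

    `mark_vectors` maps each candidate semantic mark to a context vector.
    """
    return max(
        mark_vectors,
        key=lambda mark: cosine_similarity(context_vector, mark_vectors[mark]),
    )


if __name__ == "__main__":
    # Hypothetical context vector dictionary entry for one ambiguous keyword.
    marks_for_keyword = {
        "FRUIT": {"eat": 2.0, "sweet": 1.0},
        "COMPANY": {"phone": 2.0, "buy": 1.0},
    }
    # Words observed around the keyword in a new utterance.
    utterance_context = {"eat": 1.0, "sweet": 1.0}
    print(choose_semantic_mark(utterance_context, marks_for_keyword))
```

Sparse dicts keep the sketch short; a real system would also need the synonym dictionary fallback that the abstract describes for unregistered keywords.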