• Title/Summary/Keyword: voice recording file (음성 녹음 파일)

Feature analysis of deaf students' English language by frequency (청각장애학생의 영어 발성 주파수별 특징 분석)

  • Lee, Gun-Min;Park, Hye Jung
    • Journal of the Korean Data and Information Science Society / v.25 no.4 / pp.819-828 / 2014
  • In this paper, we analyze the characteristics of deaf students' English vocalization and present basic data for developing personalized English learning aids that reflect those characteristics. We visited special schools for the hearing-impaired in Seoul and Daegu and recorded the students' English vocalization. We analyzed the recordings with Praat, a professional voice analysis program. The voice features of the deaf students' English vocalization were extracted and then compared with those of non-deaf students.

Development and validation of a Korean Affective Voice Database (한국형 감정 음성 데이터베이스 구축을 위한 타당도 연구)

  • Kim, Yeji;Song, Hyesun;Jeon, Yesol;Oh, Yoorim;Lee, Youngmee
    • Phonetics and Speech Sciences / v.14 no.3 / pp.77-86 / 2022
  • In this study, we reported the validation results of the Korean Affective Voice Database (KAV DB), an affective voice database available for scientific and clinical use, comprising a total of 113 validated affective voice stimuli. The KAV DB includes audio-recordings of two actors (one male and one female), each uttering 10 semantically neutral sentences with the intention to convey six different affective states (happiness, anger, fear, sadness, surprise, and neutral). The database was organized into three separate voice stimulus sets in order to validate the KAV DB. Participants rated the stimuli on six rating scales corresponding to the six targeted affective states by using a 100 horizontal visual analog scale. The KAV DB showed high internal consistency for voice stimuli (Cronbach's α=.847). The database had high sensitivity (mean=82.8%) and specificity (mean=83.8%). The KAV DB is expected to be useful for both academic research and clinical purposes in the field of communication disorders. The KAV DB is available for download at https://kav-db.notion.site/KAV-DB-7539a36abe2e414ebf4a50d80436b41a.
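The internal-consistency figure quoted above is Cronbach's α. As a minimal sketch of how such a coefficient is computed (not the authors' analysis code; the function name and the ratings below are hypothetical), the standard formula α = k/(k-1) · (1 - Σ item variances / variance of total scores) can be written as:

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of items.

    item_scores: list of k items, each a list of n ratings
    (one rating per rater or stimulus, in the same order for every item).
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("need at least two items")
    n = len(item_scores[0])
    if any(len(item) != n for item in item_scores):
        raise ValueError("every item needs the same number of ratings")

    # Sum of the per-item sample variances.
    item_var_sum = sum(variance(item) for item in item_scores)
    # Sample variance of the total score across items.
    totals = [sum(column) for column in zip(*item_scores)]
    total_var = variance(totals)

    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Hypothetical ratings on a 0-100 scale: 3 items x 4 raters (not KAV DB data).
ratings = [
    [80, 60, 90, 40],
    [75, 65, 85, 50],
    [70, 55, 95, 45],
]
alpha = cronbach_alpha(ratings)
```

Values closer to 1 indicate that the items vary together; the sample variance is used for both the items and the totals so the ratio stays consistent.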

Carving deleted voice data in mobile (삭제된 휴대폰 음성 데이터 복원 방법론)

  • Kim, Sang-Dae;Byun, Keun-Duck;Lee, Sang-Jin
    • Journal of the Korea Institute of Information Security & Cryptology / v.22 no.1 / pp.57-65 / 2012
  • People leave voicemails or record phone conversations in their daily cell phone use. Sometimes important voice data is deleted by the user accidentally, or purposely to cover up criminal activity. In these cases, deleted voice data must be recoverable for forensics, since it can be used as evidence in a criminal case. Because cell phones store data in flash memory, where it is easily fragmented, voice data recovery is very difficult. However, if the deleted voice data has identifiable patterns, a significant amount of it can be recovered by examining images of the memory. There are several types of voice data, such as QCP, AMR, and MP4. This study investigates data recovery solutions for the EVRC and AMR codecs in QCP files, Qualcomm's voice data format for cell phones.
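The simplest form of the pattern-based carving described above is a scan for known file signatures. The sketch below is an illustration under stated assumptions, not the method of the paper: it only locates the AMR storage header (`#!AMR\n`) and QCP containers (a `RIFF` chunk whose form type is `QLCM`) in a raw byte image, and it does not reassemble fragmented data, which is the hard part the abstract mentions. The function name and sample bytes are hypothetical.

```python
AMR_MAGIC = b"#!AMR\n"  # header of a single-channel AMR narrowband file
RIFF_MAGIC = b"RIFF"    # QCP files are RIFF containers
QCP_FORM = b"QLCM"      # RIFF form type used by QCP, at offset 8


def find_signatures(image: bytes):
    """Return sorted (offset, kind) pairs for AMR and QCP headers in image."""
    hits = []

    # Every occurrence of the AMR magic string is a candidate file start.
    pos = image.find(AMR_MAGIC)
    while pos != -1:
        hits.append((pos, "AMR"))
        pos = image.find(AMR_MAGIC, pos + 1)

    # A RIFF chunk counts as QCP only if its form type is "QLCM".
    pos = image.find(RIFF_MAGIC)
    while pos != -1:
        if image[pos + 8:pos + 12] == QCP_FORM:
            hits.append((pos, "QCP"))
        pos = image.find(RIFF_MAGIC, pos + 1)

    return sorted(hits)


# Hypothetical image: padding, an AMR header, filler bytes, then a QCP header.
sample = (
    b"\x00" * 16
    + AMR_MAGIC
    + b"\x3c" * 10
    + RIFF_MAGIC + (100).to_bytes(4, "little") + QCP_FORM + b"fmt "
    + b"\xff" * 8
)
hits = find_signatures(sample)
```

A real tool would read the image in chunks rather than holding it in memory, and would validate the codec frames that follow each header before treating a hit as recoverable audio.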

Hi, KIA! Classifying Emotional States from Wake-up Words Using Machine Learning (Hi, KIA! 기계 학습을 이용한 기동어 기반 감성 분류)

  • Kim, Taesu;Kim, Yeongwoo;Kim, Keunhyeong;Kim, Chul Min;Jun, Hyung Seok;Suk, Hyeon-Jeong
    • Science of Emotion and Sensibility / v.24 no.1 / pp.91-104 / 2021
  • This study explored users' emotional states identified from the wake-up words "Hi, KIA!" using a machine learning algorithm, with the voice user interface of passenger cars in mind. We targeted four emotional states (excited, angry, desperate, and neutral) and created a total of 12 emotional scenarios in the context of car driving. Nine college students participated and recorded sentences as guided by the visualized scenarios. The wake-up words were extracted from the whole sentences, resulting in two data sets. We used the soundgen package and the svmRadial method of the caret package in open-source R to collect acoustic features of the recorded voices, and performed machine learning-based analysis to determine the predictability of the modeled algorithm. We compared the accuracy of wake-up words (60.19%: 22%~81%) with that of whole sentences (41.51%) for all nine participants in relation to the four emotional categories. Individual differences in accuracy and sensitivity were noticeable, while the selected features were relatively constant. This study provides empirical evidence regarding the potential application of wake-up words in emotion-driven user experience for communication between users and artificial intelligence systems.

Vulnerability and Attacks of Bluetooth System (블루투스의 보안 취약성과 공격)

  • Rhee, In-Baum;Ryu, Dae-Hyun
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2011.05a / pp.650-653 / 2011
  • In this paper, we describe the Bluetooth system and Bluetooth security. We analyze its information security structure and vulnerabilities and introduce one Bluetooth hacking technique. We demonstrate an attack that injects arbitrary voice messages into a hands-free device and records a conversation through the device, saving it as a file.

Design of Real-Time Voice Phishing Detection Techniques using KoBERT (KoBERT를 활용한 실시간 보이스피싱 탐지기법 개념설계)

  • Yeong Jin Kim;Byoung-Yup Lee;Ah Reum Kang
    • Proceedings of the Korean Society of Computer Information Conference / 2024.01a / pp.95-96 / 2024
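  • This paper proposes a detection technique for preventing voice phishing, a type of financial crime, in real time. The proposed model records the voice output from the phone receiver, converts it to a text file through Naver CSR (Cloud Speech Recognition), and then learns various voice phishing patterns with the deep-learning-based KoBERT. Actual call data is processed appropriately to allow fast and accurate detection in a real-time environment, which is expected to help prevent voice phishing effectively.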

An Implementation of Travel Information Service Using VoiceXML and GPS (VoiceXML과 GPS를 이용한 여행정보 서비스의 구현)

  • Oh, Jae-Gyu;Kim, Sun-Hyung
    • Journal of the Korea Academia-Industrial cooperation Society / v.8 no.6 / pp.1443-1448 / 2007
  • In this paper, we implement a travel information service for a distributed computing environment that offers web (internet) and speech interfaces at the same time and uses location information, combining voice- and web-browser-based VoiceXML with GPS, to overcome the limitations of traditional web-based travel information services. Because the IVR (Interactive Voice Response) of a traditional call center operates on pre-installed scenarios, service takes a long time, and speech must be re-recorded to match the revised scenarios whenever the response content changes. The proposed VoiceXML- and GPS-based system, by contrast, is easy to reconfigure, because individual conversation scenarios are written as files (documents) and then updated on the server. It can also provide useful travel information under environmental restrictions such as remote regions, since our prototype finds the user's current location from GPS information and provides travel information services based on it.

Pronunciation Dictionary For Continuous Speech Recognition (한국어 연속음성인식을 위한 발음사전 구축)

  • 이경님;정민화
    • Proceedings of the Korean Information Science Society Conference / 2000.10b / pp.197-199 / 2000
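  • Continuous speech recognition requires a pronunciation dictionary and a language model. Because the decoding unit must match between the two, the headword unit is chosen as the decoding unit when building the pronunciation dictionary, and the dictionary must reflect the phonological changes that occur between headwords. We analyze the phonological changes that apply to Korean, automatically generate pronunciation sequences for training, and build the pronunciation dictionary from them. As preprocessing, symbols, units, numbers, and the like are processed and morphological analysis is performed, followed by rule-based tagging to generate pseudo-morpheme units, which serve as the decoding units. The result is fed to the pronunciation sequence generator, whose output takes the form of either pronunciation sequences for training or entries for the pronunciation dictionary. Because these headword units reflect phonological changes between headwords, their form differs from that of headwords without such changes. The changes arise in continuous pronunciation, so actual recognition requires a dictionary that reflects them. To verify the usefulness of the generated pronunciation dictionary, we evaluated its performance as follows. For acoustic training, 17,200 read-speech PBS (Phonetically Balanced Sentence) sentences were recorded and training was carried out with their transcription files; for evaluating the pronunciation dictionaries, 3,100 of these sentences were used in each experiment. We compared the optimal pronunciation dictionary and the multiple-pronunciation dictionary, both of which use morpheme tag information to reflect phonological changes between headwords, with a standard pronunciation dictionary built by hand according to linguistic criteria and with a dictionary generated from independent words without considering phonological changes between headwords. The word recognition rate was 43.21% without inter-headword phonological changes, against 48.99% for the 1-Best dictionary and 50.19% for the Multi dictionary that reflect them, an improvement of about 5~6%; this was also about 3~4% better than the 45.90% word recognition rate of the hand-built standard pronunciation dictionary.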

Application of Standardized North American Marsh Bird Monitoring Protocols to Survey Inconspicuous Marsh Birds in Korea (은둔형 습지 조류의 효과적인 조사 방법 탐색을 위한 국외 프로토콜의 시범 적용)

  • Lee, Sang-Yeon;Sung, Ha-Cheol
    • Korean Journal of Ecology and Environment / v.52 no.2 / pp.143-150 / 2019
  • Although inconspicuous marsh birds are an indicator of marsh health, their status and population trends in Korea are poorly understood because of their behavioral characteristics and the lack of reliable survey methods. We applied the Standardized North American Marsh Bird Monitoring Protocols (SNAMBMP), already validated in North America, to survey these birds effectively. We selected 29 sites with emergent marshes, rice fields, and riparian forests in Seocheon-gun, Buyeo-gun, and Gunsan-si. We surveyed each site 2~7 times from March to July 2017, combining a passive 5-minute point count with a vocal survey (30 seconds of call broadcasting + 30 seconds of silence) targeting eight species. Four species, Brown-cheeked Rail (Rallus indicus), Ruddy-breasted Crake (Porzana fusca), Watercock (Gallicrex cinerea), and Greater Painted-snipe (Rostratula benghalensis), were each detected at one site (naïve occupancy rate=0.035). The vocal survey with conspecific call broadcasting performed better for Brown-cheeked Rail and Watercock than for the other species. We suggest combining passive point counts with a vocal survey, as in SNAMBMP, to monitor inconspicuous marsh birds at a nationwide scale, and collecting sound files by recording the entire survey process.

A Study on Implementation of Sound Recording and Player of Smartphone for Mobile Learning (모바일 학습을 위한 스마트폰의 사운드 레코딩과 플레이어 구현에 관한 연구)

  • Seo, Jung-Hee;Park, Hung-Bog
    • The Journal of the Korea institute of electronic communication sciences / v.8 no.6 / pp.847-854 / 2013
  • This paper implements a smartphone application for sound recording and playback for mobile learning. Because smartphones are ubiquitous and can be used anytime and anywhere, and because they combine audio output and a microphone, the proposed recording and playback application can be developed easily and cost-effectively without additional infrastructure. This paper also explains a technique for processing music lyric data, built on database technology using SQLite, the DBMS built into the Android platform. Thus, once the application is developed and the phone holds sound source files, learners can record their own voices along with the sound. We therefore expect the application to enable mobile learning for learners without additional infrastructure.
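The abstract does not give the schema used for lyric data, so the following is only a sketch of how timestamped lyric lines could be stored and looked up in SQLite. It uses Python's standard `sqlite3` module rather than the Android API, and the table name, column names, and helper functions are all hypothetical.

```python
import sqlite3


def open_lyrics_db(path=":memory:"):
    """Open (or create) a database with a hypothetical lyrics table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS lyrics (
               track    TEXT    NOT NULL,
               start_ms INTEGER NOT NULL,
               line     TEXT    NOT NULL,
               PRIMARY KEY (track, start_ms)
           )"""
    )
    return conn


def add_line(conn, track, start_ms, line):
    """Store one lyric line with the playback time at which it starts."""
    conn.execute(
        "INSERT INTO lyrics (track, start_ms, line) VALUES (?, ?, ?)",
        (track, start_ms, line),
    )


def line_at(conn, track, position_ms):
    """Return the lyric line active at position_ms, or None if there is none."""
    row = conn.execute(
        "SELECT line FROM lyrics WHERE track = ? AND start_ms <= ? "
        "ORDER BY start_ms DESC LIMIT 1",
        (track, position_ms),
    ).fetchone()
    return row[0] if row else None


conn = open_lyrics_db()
add_line(conn, "lesson01", 0, "Good morning, everyone.")
add_line(conn, "lesson01", 4000, "How are you today?")
```

Keying each line by its start time lets a player show the current line with a single indexed query while the learner records over the track.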