• Title/Summary/Keyword: speech analysis


A Study on Area Detection Using Transfer-Learning Technique (Transfer-Learning 기법을 이용한 영역검출 기법에 관한 연구)

  • Shin, Kwang-seong;Shin, Seong-yoon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2018.10a / pp.178-179 / 2018
  • Recently, machine learning methods have been actively studied in artificial intelligence applications such as autonomous navigation and speech recognition. Classical image processing methods such as boundary detection and pattern recognition have many limitations when recognizing a specific object or area in a digital image. However, when a machine learning method such as deep learning is used, [...] can be obtained. Machine learning such as deep learning, however, fundamentally requires a large amount of training data. It is therefore difficult to apply machine learning to area classification when the amount of data is very small, as with aerial photographs for environmental analysis. In this study, we apply a transfer-learning technique that can be used when the input image dataset is small and the shape of the input image is not included in the categories of the training dataset.


A Study on the realization of the right to be forgotten on social normative context: focusing on comparison of Korea-US-EU and the legal, technical, and service market (사회규범적 맥락에서 본 잊혀질 권리의 다차원적 실현범위 연구: 한-미-EU 비교 및 법제, 기술, 서비스 시장의 비교를 중심으로)

  • Shim, Mina
    • Journal of Convergence for Information Technology / v.8 no.2 / pp.141-148 / 2018
  • The purpose of this paper is to explore, from multiple perspectives, the scope within which the right to be forgotten can be realized, so that its implementation is more realistic than the idealized concept of information deletion. We examined domestic and foreign legal systems and technology/service trends and, for a multi-level scope analysis, considered the level of service realization, the processing type and information characteristics of the personal information processor, and legislative/technical factors. As a result, we present a matrix of the range of realization of the right to be forgotten and the scope of diversified regulation by subject of protection. This study will be extended to the convergence of law and engineering, and will contribute to predicting social costs and expanding the market by identifying the scope of 'deletion rights'.

Analysis on Preschoolers' Mean Length of Utterance and Type-Token Ratio by their Sex and Play Situation Type (유아의 성별과 놀이상황 유형별 평균발화길이와 어휘다양도)

  • Sung, Mi Young;Chang, Moon Soo
    • Korean Journal of Childcare and Education / v.10 no.6 / pp.43-56 / 2014
  • The purpose of this study was to analyze differences in preschoolers' utterance features by gender and play situation type. A total of 40 five-year-old children participated. Dyads participated in each play session for 10 minutes. The play sessions were videotaped and the recordings were transcribed by CBS (2014). The collected data were analyzed using an independent t-test and a paired t-test. The main results are as follows. First, girls' MLU-e, MLU-w, and MLU-m were longer than those of boys in the familiar play situation. Second, preschoolers' MLU-w was longer and their type-token ratio was higher in the unfamiliar play situation than in the familiar one. Implications for the importance of preschoolers' spontaneous speech are discussed.
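The two measures named in this abstract can be illustrated with a minimal Python sketch. It assumes whitespace tokenization and a made-up list of English utterances; the function names `mlu_w` and `type_token_ratio` are hypothetical, and the study's own transcription and counting rules (including the morpheme-based MLU-m) are not reproduced here.

```python
def mlu_w(utterances):
    """Mean length of utterance in words: word tokens divided by utterance count."""
    lengths = [len(u.split()) for u in utterances if u.strip()]
    return sum(lengths) / len(lengths) if lengths else 0.0


def type_token_ratio(utterances):
    """Type-token ratio: distinct word forms divided by total word tokens."""
    tokens = [w.lower() for u in utterances for w in u.split()]
    return len(set(tokens)) / len(tokens) if tokens else 0.0


# Hypothetical transcript of one child's utterances in a play session.
sample = ["give me the block", "the block is red", "no"]
print(mlu_w(sample))
print(type_token_ratio(sample))
```

A higher type-token ratio means fewer repeated words for the same amount of speech, which is how the abstract uses it as an index of lexical diversity.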

Analysis and Prediction of Prosodic Phrase Boundary (운율구 경계현상 분석 및 텍스트에서의 운율구 추출)

  • Kim, Sang-Hun;Seong, Cheol-Jae;Lee, Jung-Chul
    • The Journal of the Acoustical Society of Korea / v.16 no.1 / pp.24-32 / 1997
  • This study aims, on the one hand, to describe the relationship between syntactic structure and prosodic phrasing and, on the other, to establish a suitable phrasing pattern for producing more natural synthetic speech. To get meaningful results, all the word boundaries in the prosodic database were statistically analyzed and assigned the proper boundary type. The resulting 10 types of prosodic boundaries were classified into 3 types according to the strength of the break: zero, minor, and major. We found that durational information was a main cue for determining the major prosodic boundary. Using bigrams and trigrams of syntactic information, we predicted the major and minor classification of boundary types. With the bigram model, we obtained correct major break prediction rates of 4.60% and 38.2% and insertion error rates of 22.8% and 8.4% on the Test-I and Test-II text databases, respectively. With the trigram model, we obtained correct major break prediction rates of 58.3% and 42.8% and insertion error rates of 30.8% and 11.8% on the Test-I and Test-II text databases, respectively.
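As a rough illustration of predicting break types from a bigram of syntactic information, the sketch below counts how often each break label follows a pair of adjacent syntactic tags and predicts the most frequent one. The tag names, the training triples, and the fallback to "zero" for unseen pairs are all assumptions for illustration; the abstract does not specify the authors' tag set or model details.

```python
from collections import Counter, defaultdict


def train_bigram(samples):
    """Count break labels for each (left_tag, right_tag) pair around a word boundary."""
    counts = defaultdict(Counter)
    for left, right, label in samples:
        counts[(left, right)][label] += 1
    return counts


def predict_break(counts, left, right, default="zero"):
    """Return the most frequent break label for the tag pair, or a default if unseen."""
    seen = counts.get((left, right))
    return seen.most_common(1)[0][0] if seen else default


# Hypothetical training data: (tag before boundary, tag after boundary, break type).
training = [
    ("NOUN", "VERB", "minor"),
    ("NOUN", "VERB", "minor"),
    ("NOUN", "VERB", "zero"),
    ("VERB", "NOUN", "major"),
]
model = train_bigram(training)
print(predict_break(model, "NOUN", "VERB"))
print(predict_break(model, "ADJ", "NOUN"))
```

A trigram variant would key the counts on three adjacent tags instead of two, trading more context for sparser counts.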


Research on Emotional Factors and Voice Trend by Country to be considered in Designing AI's Voice - An analysis of interview with experts in Finland and Norway (AI의 음성 디자인에서 고려해야 할 감성적 요소 및 국가별 음성 트랜드에 관한 연구 - 핀란드와 노르웨이의 전문가 인뎁스 인터뷰를 중심으로)

  • Namkung, Kiechan
    • Journal of the Korea Convergence Society / v.11 no.9 / pp.91-97 / 2020
  • Use of voice-based interfaces that can interact with users is increasing as AI technology develops. To date, however, most of the research on voice-based interfaces has been technical in nature, focused on areas such as improving the accuracy of speech recognition. Thus, the voice of most voice-based interfaces is uniform and does not provide users with differentiated sensibilities. The purpose of this study is to add an emotional factor suitable for the AI interface. To this end, we derived emotional factors that should be considered in designing a voice interface. In addition, we looked at voice trends that differ from country to country. For this study, we conducted interviews with voice industry experts from Finland and Norway, countries that use their own independent languages.

e-Learning Course Reviews Analysis based on Big Data Analytics (빅데이터 분석을 이용한 이러닝 수강 후기 분석)

  • Kim, Jang-Young;Park, Eun-Hye
    • Journal of the Korea Institute of Information and Communication Engineering / v.21 no.2 / pp.423-428 / 2017
  • These days, a large and varied body of education information is rapidly growing and spreading due to the use of the Internet and smart devices. As e-Learning usage increases, many instructors and students (learners) need to set goals that maximize learners' educational outcomes and the efficiency of the education system, based on big data analytics of education history data recorded online. In this paper, the author applied the Word2Vec algorithm (a neural network algorithm) to find similarities among education words, and classified them with a clustering algorithm, in order to objectively recognize and analyze education history data recorded online. When the Word2Vec algorithm is applied to education words, words with related meanings can be found and classified, and they obtain similar vector values through repeated learning. In addition, through experimental results, the author showed with a clustering algorithm that the parts of speech (noun, verb, adjective, and adverb) have the same shortest distance from the centroid.
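The idea of comparing word vectors and assigning them to the nearest cluster centroid can be sketched in plain Python. The two-dimensional vectors below are invented stand-ins for learned Word2Vec embeddings, and the words, cluster names, and helper functions are hypothetical; no Word2Vec training is performed here.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def nearest_centroid(vec, centroids):
    """Name of the centroid with the smallest Euclidean distance to vec."""
    return min(centroids, key=lambda name: math.dist(vec, centroids[name]))


# Invented toy embeddings standing in for trained word vectors.
vectors = {
    "lecture": [1.0, 0.1],
    "course": [0.9, 0.2],
    "exam": [0.1, 1.0],
    "quiz": [0.2, 0.9],
}
centroids = {
    "teaching": centroid([vectors["lecture"], vectors["course"]]),
    "assessment": centroid([vectors["exam"], vectors["quiz"]]),
}
print(cosine(vectors["lecture"], vectors["course"]))
print(nearest_centroid([0.8, 0.3], centroids))
```

Words with related meanings end up with a cosine similarity close to 1, which is what lets a clustering step group them around a shared centroid.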

Research Trends for the Deep Learning-based Metabolic Rate Calculation (재실자 활동량 산출을 위한 딥러닝 기반 선행연구 동향)

  • Park, Bo-Rang;Choi, Eun-Ji;Lee, Hyo Eun;Kim, Tae-Won;Moon, Jin Woo
    • KIEAE Journal / v.17 no.5 / pp.95-100 / 2017
  • Purpose: The purpose of this study is to investigate prior deep learning-based work on objectively calculating the metabolic rate, which is a subjective factor in PMV optimum control, and to plan future research based on this review. Methods: For this purpose, theoretical and technical review and applicability analysis were conducted using various domestic and foreign documents and data. Results: The review of prior work showed that machine learning models based on artificial neural networks and deep learning have been used in various fields such as speech recognition, scene recognition, and image restoration. As a representative case, OpenCV Background Subtraction is a technique for separating backgrounds from objects or people. PASCAL VOC and ILSVRC were surveyed as representative technologies that can recognize people, objects, and backgrounds. Based on the results of previous deep learning research, basic technologies applicable to the occupant metabolic rate calculation technology to be developed in future research were identified. Further study is expected on developing a highly accurate model for calculating activity level.

Effects of Injection Laryngoplasty with Hyaluronic Acid in Patients with Vocal Fold Paralysis

  • Kim, Geun-Hyo;Lee, Jae-Seok;Lee, Chang-Yoon;Lee, Yeon-Woo;Bae, In-Ho;Park, Hee-June;Lee, Byung-Joo;Kwon, Soon-Bok
    • Osong Public Health and Research Perspectives / v.9 no.6 / pp.354-361 / 2018
  • Objectives: The purpose of this study was to explore the effects of injection laryngoplasty (IL) with hyaluronic acid in patients with vocal fold paralysis (VFP). Methods: A total of 50 patients with VFP participated in this study. Pre- and post-IL assessments were performed, which included analyzing the sustained vowel /a/ phonation and the patient reading 1 Korean sentence from the "Walk" passage that comprised 25 syllables in 10 words. To investigate the effect of IL on vocal fold function, acoustic analysis (acoustic voice quality index, cepstral peak prominence, maximum phonation time, speaking fundamental frequency) was conducted, and auditory-perceptual (grade and overall severity), visual judgment (gap), and self-questionnaire (voice handicap index-10) assessments were performed. Results: The patients with VFP showed statistically significant differences between pre- and post-IL assessments for the acoustic, auditory-perceptual, visual judgment, and self-questionnaire assessments. Conclusion: The patients with VFP showed positive change in vocal fold function between pre- and post-IL measurements. The findings showed that IL with hyaluronic acid is an effective method to improve vocal fold function in patients with VFP.

Paralinguistic Behavior as a Deception Cue (거짓말의 단서로서 준언어행위)

  • Kim, Daejoong;Park, Jihye
    • The Journal of the Korea Contents Association / v.19 no.4 / pp.187-196 / 2019
  • This experimental study examines whether paralinguistic behavior is a deception cue in an interrogation. Ninety-two college students participated in an experiment and were randomly assigned to two conditions. Participants were then asked either to take the money or not to take it, according to their assigned condition, and then underwent a face-to-face interrogation. During the interrogation, participants' paralinguistic behavior was recorded and used for coding and analysis. Results reveal that participants' paralinguistic behaviors differ depending on question type: the deceptive paralinguistic cues are speech speed and fillers for the closed critical question, and response latency, response length, and fillers for the open critical question. These findings imply that some paralinguistic behaviors could be deception cues and thus might be applicable to deception detection in real-world criminal investigations.

Analysis of privacy issues and countermeasures in neural network learning (신경망 학습에서 프라이버시 이슈 및 대응방법 분석)

  • Hong, Eun-Ju;Lee, Su-Jin;Hong, Do-won;Seo, Chang-Ho
    • Journal of Digital Convergence / v.17 no.7 / pp.285-292 / 2019
  • With the popularization of PCs, SNS, and IoT, a great deal of data is being generated, and the amount is increasing exponentially. Artificial neural network learning, which uses huge amounts of data, has attracted attention in many fields in recent years. It has shown tremendous potential in speech recognition and image recognition, and is widely applied to a variety of complex areas such as medical diagnosis, artificial intelligence games, and face recognition. The results of artificial neural networks are accurate enough to surpass human performance. Despite these many advantages, privacy problems still exist in artificial neural network learning. Training data for artificial neural networks contains various kinds of information, including sensitive personal information, so privacy can be exposed by malicious attackers. Privacy risks arise when an attacker interferes with training to degrade it or attacks a model that has completed training. In this paper, we analyze recently proposed attack methods on neural network models and the corresponding privacy protection methods.