• Title/Summary/Keyword: Elderly voice

Search Result 71, Processing Time 0.026 seconds

A Multi-Modal Complex Motion Authoring Tool for Creating Robot Contents

  • Seok, Kwang-Ho;Kim, Yoon-Sang
    • Journal of Korea Multimedia Society
    • /
    • v.13 no.6
    • /
    • pp.924-932
    • /
    • 2010
  • This paper proposes a multi-modal complex motion authoring tool for creating robot contents. The proposed tool is user-friendly and allows general users without much knowledge about robots, including children, women and the elderly, to easily edit and modify robot contents. Furthermore, the tool uses multi-modal data including graphic motion, voice and music to simulate user-created robot contents in the 3D virtual environment. This allows the user to not only view the authoring process in real time but also transmit the final authored contents to control the robot. The validity of the proposed tool was examined based on simulations using the authored multi-modal complex motion robot contents as well as experiments of actual robot motions.

Reinterpretation of Stop Production in Korean Elderly Speakers (노년층 파열음 발음의 재해석)

  • Kim, Ji-Eun
    • Phonetics and Speech Sciences
    • /
    • v.7 no.2
    • /
    • pp.139-145
    • /
    • 2015
  • Researchers have claimed that Korean younger speakers tend to less clearly differentiate aspirated and lax stops with VOT values while older speakers clearly differentiate these two stops with VOT values. To explain this phenomena, the current study consider both an aging effect and a general sound shift. For this study, VOT values and F0 of Korean stops produced by eight male speakers(years of birth were 1942 ~ 1952) analyzed using Praat. Their productions were compared with the values of participants whose year of birth were 1943 ~ 1952) in Silva(2006)'s research. Silva's research was conducted in 2004 using the same methods. The result shows that 2014's VOT gap between aspirated and lax stops and less F0 gap between aspirated and lax stops than those of 2004. When the F0 values related to physical conditions of the larynx is considered, it could be analyzed as the following: to distinguish the three-way phonation type clearly, older speakers depend on the VOT value more instead of F0 which they have difficulty to control.

Development of technology to improve information accessibility of information vulnerable class using crawling & clipping

  • Jeong, Seong-Bae;Kim, Kyung-Shin
    • Journal of the Korea Society of Computer and Information
    • /
    • v.23 no.2
    • /
    • pp.99-107
    • /
    • 2018
  • This study started from the public interest purpose to help accessibility for the information acquisition of the vulnerable groups due to visual difficulties such as the elderly and the visually impaired. In this study, the server resources are minimized and implemented in most of the user smart phones. In addition, we implement a method to gather necessary information by collecting only pattern information by utilizing crawl & clipping without having to visit the site of the information of the various sites having the data necessary for the user, and to have it in the server. Especially, we applied the TTS(Text-To-Speech) service composed of smart phone apps and tried to develop a unified customized information collection service based on voice-based information collection method.

Compact Robotic Arm to Assist with Eating using a Closed Link Mechanism (크로스 링크 기구를 적용한 소형 식사지원 로봇)

  • 강철웅;임종환
    • Journal of the Korean Society for Precision Engineering
    • /
    • v.20 no.3
    • /
    • pp.202-209
    • /
    • 2003
  • We succeeded to build a cost effective assistance robotic arm with a compact and lightweight body. The robotic arm has three joints, and the tip of robotic arm to install tools consists of a closed link mechanism, which consisted of two actuators and several links. The robotic arm has been made possible by the use of actuators typically used in radio control devices. The controller of the robotic arm consists of a single chip PIC only. The robotic arm has a friendly user interface, as the operators are aged and disabled in most cases. The operator can manipulate the robotic arm by voice commands or by pressing a push button. The robotic arm has been successfully prototyped and tested on an elderly patient to assist with eating. The results of field test were satisfactory.

Acoustic characteristics of the sustained vowel phonation according to age groups (모음 연장 발성이 보이는 연령대별 음향음성학적 특성 연구)

  • Seo, Yoon-Jeong;Shin, Jiyoung
    • Phonetics and Speech Sciences
    • /
    • v.10 no.4
    • /
    • pp.67-76
    • /
    • 2018
  • This study was performed to investigate acoustic characteristics of sustained vowels produced by Seoul Korean speakers. For this study, three hundred nine healthy adults were chosen as participants from Korean Standard Speech Database. These subjects were divided into five chronological age groups (20s, 30s, 40s, 50s, 60-70s) and two gender groups (male and female). Fundamental frequency (f0), jitter, shimmer, and NHR (noise-to-harmonics ratio) was measured with 8 Korean vowels (/ɑ/, /æ/, /ʌ/, /e/, /o/, /u/, /ɯ/, /i/) by using Praat. The results showed that the vowel type significantly affected all acoustic parameters. Gender affected f0, jitter, and NHR significantly. The mean female speakers' f0 was greater than the males', and the mean jitter and NHR of male speakers was greater than the females'. Moreover, age affected shimmer and NHR significantly; in particular, the shimmer and NHR of elderly speakers was greater than the young speakers.

A Study on Preprocessing for Elderly Voice Recognition (노인음성인식을 위한 전처리에 관한 연구)

  • Park, Ji-Woong;Lee, Seoung-Jun;Kwon, Soonil
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2013.11a
    • /
    • pp.1646-1648
    • /
    • 2013
  • 고령화 되어 가는 현대 사회에서 노인들이 일반 성인과 동등한 수준에서 정보를 접근 가능하도록 스마트기기의 손쉬운 인터페이스 방법이 요구된다. 음성 인터페이스는 노인들의 스마트기기 활용도를 높여 줄 수 있지만, 성능이 평균적 성인연령 대의 발성행태에 최적화되어 있어, 노인들이 사용할 경우 음성인식률 저하를 초래한다. 그래서 노인 친화형 음성 인터페이스를 개발하기 위한 일환으로 노인음성에 대한 인식률을 향상시켜 줄 수 있는 전처리 알고리즘을 개발하고자 한다. 이를 위해 노인층과 청년층을 대상으로 음성샘플을 수집하여 분석하였고, 그 결과 노인이 청년에 비해 발성속도가 느리며 이는 스마트기기의 음성인식 기능저하로 이어진다는 것을 확인할 수 있었다.

Mature Market Sub-segmentation and Its Evaluation by the Degree of Homogeneity (동질도 평가를 통한 실버세대 세분군 분류 및 평가)

  • Bae, Jae-ho
    • Journal of Distribution Science
    • /
    • v.8 no.3
    • /
    • pp.27-35
    • /
    • 2010
  • As the population, buying power, and intensity of self-expression of the elderly generation increase, its importance as a market segment is also growing. Therefore, the mass marketing strategy for the elderly generation must be changed to a micro-marketing strategy based on the results of sub-segmentation that suitably captures the characteristics of this generation. Furthermore, as a customer access strategy is decided by sub-segmentation, proper segmentation is one of the key success factors for micro-marketing. Segments or sub-segments are different from sectors, because segmentation or sub-segmentation for micro-marketing is based on the homogeneity of customer needs. Theoretically, complete segmentation would reveal a single voice. However, it is impossible to achieve complete segmentation because of economic factors, factors that affect effectiveness, etc. To obtain a single voice from a segment, we sometimes need to divide it into many individual cases. In such a case, there would be a many segments to deal with. On the other hand, to maximize market access performance, fewer segments are preferred. In this paper, we use the term "sub-segmentation" instead of "segmentation," because we divide a specific segment into more detailed segments. To sub-segment the elderly generation, this paper takes their lifestyles and life stages into consideration. In order to reflect these aspects, various surveys and several rounds of expert interviews and focused group interviews (FGIs) were performed. Using the results of these qualitative surveys, we can define six sub-segments of the elderly generation. This paper uses five rules to divide the elderly generation. The five rules are (1) mutually exclusive and collectively exhaustive (MECE) sub-segmentation, (2) important life stages, (3) notable lifestyles, (4) minimum number of and easy classifiable sub-segments, and (5) significant difference in voices among the sub-segments. The most critical point for dividing the elderly market is whether children are married. The other points are source of income, gender, and occupation. In this paper, the elderly market is divided into six sub-segments. As mentioned, the number of sub-segments is a very key point for a successful marketing approach. Too many sub-segments would lead to narrow substantiality or lack of actionability. On the other hand, too few sub-segments would have no effects. Therefore, the creation of the optimum number of sub-segments is a critical problem faced by marketers. This paper presents a method of evaluating the fitness of sub-segments that was deduced from the preceding surveys. The presented method uses the degree of homogeneity (DoH) to measure the adequacy of sub-segments. This measure uses quantitative survey questions to calculate adequacy. The ratio of significantly homogeneous questions to the total numbers of survey questions indicates the DoH. A significantly homogeneous question is defined as a question in which one case is selected significantly more often than others. To show whether a case is selected significantly more often than others, we use a hypothesis test. In this case, the null hypothesis (H0) would be that there is no significant difference between the selection of one case and that of the others. Thus, the total number of significantly homogeneous questions is the total number of cases in which the null hypothesis is rejected. To calculate the DoH, we conducted a quantitative survey (total sample size was 400, 60 questions, 4~5 cases for each question). The sample size of the first sub-segment-has no unmarried offspring and earns a living independently-is 113. The sample size of the second sub-segment-has no unmarried offspring and is economically supported by its offspring-is 57. The sample size of the third sub-segment-has unmarried offspring and is employed and male-is 70. The sample size of the fourth sub-segment-has unmarried offspring and is not employed and male-is 45. The sample size of the fifth sub-segment-has unmarried offspring and is female and employed (either the female herself or her husband)-is 63. The sample size of the last sub-segment-has unmarried offspring and is female and not employed (not even the husband)-is 52. Statistically, the sample size of each sub-segment is sufficiently large. Therefore, we use the z-test for testing hypotheses. When the significance level is 0.05, the DoHs of the six sub-segments are 1.00, 0.95, 0.95, 0.87, 0.93, and 1.00, respectively. When the significance level is 0.01, the DoHs of the six sub-segments are 0.95, 0.87, 0.85, 0.80, 0.88, and 0.87, respectively. These results show that the first sub-segment is the most homogeneous category, while the fourth has more variety in terms of its needs. If the sample size is sufficiently large, more segmentation would be better in a given sub-segment. However, as the fourth sub-segment is smaller than the others, more detailed segmentation is not proceeded. A very critical point for a successful micro-marketing strategy is measuring the fit of a sub-segment. However, until now, there have been no robust rules for measuring fit. This paper presents a method of evaluating the fit of sub-segments. This method will be very helpful for deciding the adequacy of sub-segmentation. However, it has some limitations that prevent it from being robust. These limitations include the following: (1) the method is restricted to only quantitative questions; (2) the type of questions that must be involved in calculation pose difficulties; (3) DoH values depend on content formation. Despite these limitations, this paper has presented a useful method for conducting adequate sub-segmentation. We believe that the present method can be applied widely in many areas. Furthermore, the results of the sub-segmentation of the elderly generation can serve as a reference for mature marketing.

  • PDF

Gendered innovation for algorithm through case studies (음성·영상 신호 처리 알고리즘 사례를 통해 본 젠더혁신의 필요성)

  • Lee, JiYeoun;Lee, Heisook
    • Journal of Digital Convergence
    • /
    • v.16 no.12
    • /
    • pp.459-466
    • /
    • 2018
  • Gendered innovations is a term used by policy makers and academics to refer the process of creating better research and development (R&D) for both men and women. In this paper, we analyze the literatures in image and speech signal processing that can be used in ICT, examine the importance of gendered innovations through case study. Therefore the latest domestic and foreign literature related to image and speech signal processing based on gender research is searched and a total of 9 papers are selected. In terms of gender analysis, research subjects, research environment, and research design are examined separately. Especially, through the case analysis of algorithms of the elderly voice signal processing, machine learning, machine translation technology, and facial gender recognition technology, we found that there is gender bias in existing algorithms, and which leads to gender analysis is required. We also propose a gendered innovations method integrating sex and gender analysis in algorithm development. Gendered innovations in ICT can contribute to the creation of new markets by developing products and services that reflect the needs of both men and women.

Age classification of emergency callers based on behavioral speech utterance characteristics (발화행태 특징을 활용한 응급상황 신고자 연령분류)

  • Son, Guiyoung;Kwon, Soonil;Baik, Sungwook
    • The Journal of Korean Institute of Next Generation Computing
    • /
    • v.13 no.6
    • /
    • pp.96-105
    • /
    • 2017
  • In this paper, we investigated the age classification from the speaker by analyzing the voice calls of the emergency center. We classified the adult and elderly from the call center calls using behavioral speech utterances and SVM(Support Vector Machine) which is a machine learning classifier. We selected two behavioral speech utterances through analysis of the call data from the emergency center: Silent Pause and Turn-taking latency. First, the criteria for age classification selected through analysis based on the behavioral speech utterances of the emergency call center and then it was significant(p <0.05) through statistical analysis. We analyzed 200 datasets (adult: 100, elderly: 100) by the 5 fold cross-validation using the SVM(Support Vector Machine) classifier. As a result, we achieved 70% accuracy using two behavioral speech utterances. It is higher accuracy than one behavioral speech utterance. These results can be suggested age classification as a new method which is used behavioral speech utterances and will be classified by combining acoustic information(MFCC) with new behavioral speech utterances of the real voice data in the further work. Furthermore, it will contribute to the development of the emergency situation judgment system related to the age classification.

Research on Developing a Conversational AI Callbot Solution for Medical Counselling

  • Won Ro LEE;Jeong Hyon CHOI;Min Soo KANG
    • Korean Journal of Artificial Intelligence
    • /
    • v.11 no.4
    • /
    • pp.9-13
    • /
    • 2023
  • In this study, we explored the potential of integrating interactive AI callbot technology into the medical consultation domain as part of a broader service development initiative. Aimed at enhancing patient satisfaction, the AI callbot was designed to efficiently address queries from hospitals' primary users, especially the elderly and those using phone services. By incorporating an AI-driven callbot into the hospital's customer service center, routine tasks such as appointment modifications and cancellations were efficiently managed by the AI Callbot Agent. On the other hand, tasks requiring more detailed attention or specialization were addressed by Human Agents, ensuring a balanced and collaborative approach. The deep learning model for voice recognition for this study was based on the Transformer model and fine-tuned to fit the medical field using a pre-trained model. Existing recording files were converted into learning data to perform SSL(self-supervised learning) Model was implemented. The ANN (Artificial neural network) neural network model was used to analyze voice signals and interpret them as text, and after actual application, the intent was enriched through reinforcement learning to continuously improve accuracy. In the case of TTS(Text To Speech), the Transformer model was applied to Text Analysis, Acoustic model, and Vocoder, and Google's Natural Language API was applied to recognize intent. As the research progresses, there are challenges to solve, such as interconnection issues between various EMR providers, problems with doctor's time slots, problems with two or more hospital appointments, and problems with patient use. However, there are specialized problems that are easy to make reservations. Implementation of the callbot service in hospitals appears to be applicable immediately.