• Title/Summary/Keyword: speakers

An Influence of Artificial Intelligence Attributes on the Adoption Level of Artificial Intelligence-Enabled Products (인공지능 기반 제품 수용 정도에 인공지능 속성이 미치는 영향 연구)

  • Kwonsang Sohn;Kun Woo Yoo;Ohbyung Kwon
    • Information Systems Review / v.21 no.3 / pp.111-129 / 2019
  • Recently, artificial intelligence (AI)-enabled products and services such as smartphones, smart speakers, and chatbots have been released thanks to advances in AI technology. Researchers have accordingly tried to explain consumers' intention to adopt AI-enabled products. Yet little is known about the intended adoption of AI-enabled products, because most studies have not considered the utility value that consumers perceive for each attribute, classified according to the characteristics of AI-enabled products. Therefore, the purpose of this study is to investigate the difference in importance between the attributes that affect the intention to adopt AI-enabled products. To this end, we first identified and classified the attributes of AI-enabled products based on the IS Success Model of DeLone and McLean. Second, we measured the utility value of each attribute for the adoption of AI-enabled products through conjoint analysis, and we employed construal level theory to see whether the relative importance of the attributes differs depending on temporal distance. Third, we segmented the market based on the utility value of each respondent through cluster analysis and tried to understand the characteristics and needs of consumers in each market segment. We expect to provide theoretical implications for conceptually structuring the attributes and factors of AI-enabled products, and practical implications for how development efforts should be directed to meet consumer needs in each segment.
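
In conjoint analysis, the relative importance of an attribute is commonly derived from the range of its part-worth utilities, divided by the sum of the ranges over all attributes. The sketch below illustrates only that step. The attribute names loosely follow the IS Success Model dimensions, and every level name and utility value is hypothetical, not a figure from the study.

```python
def relative_importance(part_worths):
    """Return each attribute's share of the total part-worth range.

    part_worths maps attribute name -> {level name -> part-worth utility}.
    """
    # Range of utilities within each attribute (best level minus worst level).
    ranges = {
        attr: max(levels.values()) - min(levels.values())
        for attr, levels in part_worths.items()
    }
    total = sum(ranges.values())
    if total == 0:
        raise ValueError("all attributes have zero utility range")
    return {attr: r / total for attr, r in ranges.items()}


# Hypothetical part-worths for one respondent (illustrative values only).
example = {
    "system_quality": {"low": -0.5, "high": 0.5},
    "information_quality": {"low": -1.0, "high": 1.0},
    "service_quality": {"low": -0.25, "high": 0.25},
}

if __name__ == "__main__":
    for attr, share in relative_importance(example).items():
        print(f"{attr}: {share:.1%}")
```

Per-respondent importance vectors of this kind are one plausible input to a later clustering step, although the abstract does not say which representation the authors clustered on.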

One-shot multi-speaker text-to-speech using RawNet3 speaker representation (RawNet3를 통해 추출한 화자 특성 기반 원샷 다화자 음성합성 시스템)

  • Sohee Han;Jisub Um;Hoirin Kim
    • Phonetics and Speech Sciences / v.16 no.1 / pp.67-76 / 2024
  • Recent advances in text-to-speech (TTS) technology have significantly improved the quality of synthesized speech, reaching a level where it can closely imitate natural human speech. In particular, TTS models offering various voice characteristics and personalized speech are widely used in fields such as artificial intelligence (AI) tutors, advertising, and video dubbing. Accordingly, in this paper, we propose a one-shot multi-speaker TTS system that can ensure acoustic diversity and synthesize personalized voices by generating speech from the utterances of unseen target speakers. The proposed model integrates a speaker encoder into a TTS model consisting of the FastSpeech2 acoustic model and the HiFi-GAN vocoder. The speaker encoder, based on the pre-trained RawNet3, extracts speaker-specific voice features. Furthermore, the proposed approach includes not only an English one-shot multi-speaker TTS but also a Korean one-shot multi-speaker TTS. We evaluate the naturalness and speaker similarity of the generated speech using objective and subjective metrics. In the subjective evaluation, the proposed Korean one-shot multi-speaker TTS obtained a naturalness mean opinion score (NMOS) of 3.36 and a similarity MOS (SMOS) of 3.16. The objective evaluation of the proposed English and Korean one-shot multi-speaker TTS showed a prediction MOS (P-MOS) of 2.54 and 3.74, respectively. These results indicate that the performance of our proposed model is improved over the baseline models in terms of both naturalness and speaker similarity.
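
A mean opinion score is the arithmetic mean of listener ratings, usually collected on a 1 to 5 scale and often reported with a confidence interval. The sketch below shows that calculation under those assumptions. The ratings are made up, and the 1.96 multiplier is a normal-approximation 95% interval, not something stated in the abstract.

```python
import math
import statistics


def mean_opinion_score(ratings):
    """Return (mean, half_width) for a list of listener ratings.

    half_width is a normal-approximation 95% confidence half-width.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    mean = statistics.mean(ratings)
    # Sample standard deviation divided by sqrt(n) gives the standard error.
    std_err = statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mean, 1.96 * std_err


if __name__ == "__main__":
    # Hypothetical naturalness ratings from eight listeners.
    mos, half_width = mean_opinion_score([3, 4, 3, 4, 3, 3, 4, 4])
    print(f"MOS = {mos:.2f} +/- {half_width:.2f}")
```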

Text Mining-Based Analysis of Hyundai Automobile Consumer Satisfaction and Dissatisfaction Factors in the Chinese Market: A Comparison with Other Brands (텍스트 마이닝을 이용한 현대 자동차 중국시장 소비자의 만족 및 불만족 요인 분석 연구: 다른 브랜드와의 비교)

  • Cui Ran;Inyong Nam
    • The Journal of the Convergence on Culture Technology / v.10 no.1 / pp.539-549 / 2024
  • This study employed text mining techniques such as frequency analysis, word clouds, and LDA topic modeling to assess consumer satisfaction and dissatisfaction with Hyundai Motor Company in the Chinese market, compared to brands such as Toyota, Volkswagen, Buick, and Geely. The study analyzed customer reviews of compact vehicles from these brands posted between 2021 and 2023. The results indicated positive factors for the Hyundai Avante, including a long wheelbase. However, they also highlighted sources of dissatisfaction such as handling, engine performance, trunk space, chassis and suspension, safety features, quantity and brand of audio speakers, music membership service, separation band, screen reflection, CarLife, and map services. Addressing these issues could significantly enhance Hyundai's competitiveness in the Chinese market. Previous studies mainly relied on literature research and surveys, which revealed only consumer perceptions limited to the variables set by the researchers. Through text mining and a comparison of various car brands, this study aims to gain a deeper understanding of market trends and consumer preferences, providing useful information for the marketing strategies of Hyundai and other brands in the Chinese market.

Comparison of acoustic features due to the Lombard effect in typically developing children and adults (롬바르드 효과가 아동과 성인의 말소리 산출에 미치는 영향: 음향학적 특성과 모음공간면적을 중심으로)

  • Yelim Jang;Jaehee Hwang;Nuri Lee;Nakyung Lee;Seeun Eum;Youngmee Lee
    • Phonetics and Speech Sciences / v.16 no.2 / pp.19-27 / 2024
  • The Lombard effect is an involuntary change in speech that speakers make in the presence of noise during voice communication. This study aimed to investigate the Lombard effect by comparing the acoustic features of children and adults under different listening conditions. Twelve male children (5-9 years old) and 12 young adult men (24-35 years old) were recruited to produce speech under three different listening conditions (quiet, noise-55 dB, noise-70 dB). Acoustic analyses were then carried out to characterize their acoustic features, such as F0, intensity, duration, and vowel space area, under the three listening conditions. Under adverse listening conditions, a Lombard effect was observed in intensity and duration for both the children and the adults who participated in this study. However, we did not observe a Lombard effect in the F0 or vowel space area of either group. These findings suggest that children can adjust their speech production in challenging listening conditions as much as adults.
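
Vowel space area is typically computed as the area of the polygon whose vertices are the (F1, F2) values of the corner vowels. A minimal sketch using the shoelace formula follows. The choice of three corner vowels and the formant values are assumptions for illustration, since the abstract does not specify how the area was calculated.

```python
def vowel_space_area(corners):
    """Area of the polygon spanned by (F1, F2) points, in Hz squared.

    corners must list the vertices in order around the polygon.
    """
    n = len(corners)
    if n < 3:
        raise ValueError("need at least three corner vowels")
    twice_area = 0.0
    for i in range(n):
        f1_a, f2_a = corners[i]
        f1_b, f2_b = corners[(i + 1) % n]  # wrap around to close the polygon
        twice_area += f1_a * f2_b - f1_b * f2_a
    return abs(twice_area) / 2.0


if __name__ == "__main__":
    # Hypothetical mean formants (F1, F2) in Hz for /a/, /i/, /u/.
    triangle = [(800, 1300), (300, 2300), (350, 900)]
    print(vowel_space_area(triangle))  # 325000.0
```

Comparing this value across the quiet, 55 dB, and 70 dB conditions for each speaker would be one way to check whether the vowel space expands under noise.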

Why A Multimedia Approach to English Education?

  • Keem, Sung-uk
    • Proceedings of the KSPS conference / 1997.07a / pp.176-178 / 1997
  • To make a long story short, I made up my mind two years ago to experiment with a multimedia approach to my classroom presentations because my ways of giving instructions bored the pants off me as well as my students. My favorite ways used to be what are sometimes referred to as classical or traditional ones, heavily dependent on three elements: the teacher's mouth, books, and chalk. Some call it the 'MBC method'. To top it off, I tried audio-visuals such as tape recorders, cassette players, VTR, pictures, and you name it, that could help improve my teaching method. And yet I have been unhappy with the results of this trial-and-error approach. I was determined to look for a better way that would ensure my satisfaction in the first place. What really turned me on was a multimedia CD ROM title, ELLIS (English Language Learning Instructional Systems), developed by Dr. Frank Otto. This is an integrated system of learning English based on advanced computer technology. Inspired by the utility and potential of such a multimedia system for regular classroom or lab instructions, I designed a simple but practical multimedia language learning laboratory in 1994 for the first time in Korea (perhaps for the first time in the world). It was high time that the conventional type of language laboratory (audio-passive) at Hahnnam be replaced because of wear and tear. Prior to this development, in 1991, I set up the first CALL (Computer Assisted Language Learning) laboratory, equipped with 35 personal computers (286), where students were encouraged to practise English typing and word processing and to study English grammar, English vocabulary, and English composition.
The first multimedia language learning laboratory was composed of 1) a multimedia personal computer (486DX2 then, now 586), 2) VGA multipliers that enable simultaneous viewing of the screen under the instructor's control, 3) an amplifier, 4) loudspeakers, 5) student monitors, 6) student tables to seat three students (a monitor for two students is more realistic, though), 7) student chairs, 8) an instructor table, and 9) cables. It was augmented later with an Internet hookup. The beauty of this type of multimedia language learning laboratory is the economy of furnishing and maintaining it. There is no need to darken the facilities, which is a must when an LCD/beam projector is preferred in the laboratory. It is also headset-free; headsets proved to exasperate students when worn for more than twenty minutes. In the previous semester I taught three different subjects: Freshman English Lab, English Phonetics, and Listening Comprehension Intermediate. I used CD ROM titles like ELLIS, Master Pronunciation, English Triple Play Plus, English Arcade, Living Books, Q-Steps, English Discoveries, and Compton's Encyclopedia. In addition, I managed to put all teaching materials into PowerPoint, where text, photo, graphic, animation, audio, and video files are stored in order as slides. It takes time for me to prepare my teaching materials in PowerPoint, but it is a wonderful tool for presentations. And it is worth trying as long as I can entertain my students in such a way. Once everything is put into the computer, I feel relaxed and a bit excited watching my students enjoy my presentations. It appears to be great fun for students because they have never experienced this type of instruction. This is how I freed myself from having to manipulate a cassette tape player and VTR and to write on the board. The student monitors in front of them seem to help them concentrate on what they see, combined with what they hear. 
All I have to do is to simply click a mouse to give presentations and explanations, when necessary. I use a remote mouse, which prevents me from sitting at the instructor table. Instead, I can walk around in the room and enjoy freer interactions with students. Using this instrument, I can also have my students participate in the presentation. In particular, I invite my students to manipulate the computer using the remote mouse from the student's seat not from the instructor's seat. Every student appears to be fascinated with my multimedia approach to English teaching because of its unique nature as a new teaching tool as we face the 21st century. They all agree that the multimedia way is an interesting and fascinating way of learning to satisfy their needs. Above all, it helps lighten their drudgery in the classroom. They feel other subjects taught by other teachers should be treated in the same fashion. A multimedia approach to education is impossible without the advent of hi-tech computers, of which multi functions are integrated into a unified system, i.e., a personal computer. If you have computer-phobia, make quick friends with it; the sooner, the better. It can be a wonderful assistant to you. It is the Internet that I pay close attention to in conjunction with the multimedia approach to English education. Via e-mail system, I encourage my students to write to me in English. I encourage them to enjoy chatting with people all over the world. I also encourage them to visit the sites where they offer study courses in English conversation, vocabulary, idiomatic expressions, reading, and writing. I help them search any subject they want to via World Wide Web. Some day in the near future it will be the hub of learning for everybody. It will eventually free students from books, teachers, libraries, classrooms, and boredom. I will keep exploring better ways to give satisfying instructions to my students who deserve my entertainment.

Speech Recognition Using Linear Discriminant Analysis and Common Vector Extraction (선형 판별분석과 공통벡터 추출방법을 이용한 음성인식)

  • 남명우;노승용
    • The Journal of the Acoustical Society of Korea / v.20 no.4 / pp.35-41 / 2001
  • This paper describes Linear Discriminant Analysis and common vector extraction for speech recognition. A voice signal contains psychological and physiological properties of the speaker as well as dialect differences, acoustical environment effects, and phase differences. For these reasons, the same word spoken by different speakers can sound very different. This property of the speech signal makes it very difficult to extract common properties within the same speech class (word or phoneme). A linear algebra method such as the KLT (Karhunen-Loeve Transformation) is generally used to extract common properties from speech signals, but the common vector extraction suggested by M. Bilginer et al. is used in this paper. The method of M. Bilginer et al. extracts the optimized common vector from the speech signals used for training, and it achieves 100% recognition accuracy on the training data used for common vector extraction. In spite of these characteristics, the method has some drawbacks: a large number of speech signals cannot be used for training, and the discriminant information among common vectors is not defined. This paper suggests an advanced method that can reduce the error rate by maximizing the discriminant information among common vectors. A novel method to normalize the size of the common vector is also added. The results show improved performance of the algorithm and recognition accuracy 2% higher than that of the conventional method.

Considerations for Helping Korean Students Write Better Technical Papers in English (한국 대학생들의 영어 기술 논문 작성 능력 향상을 위한 고찰)

  • Kim, Yee-Jin;Pak, Bo-Young;Lee, Chang-Ha;Kim, Moon-Kyum
    • Journal of Engineering Education Research / v.10 no.3 / pp.64-78 / 2007
  • For Korean researchers, English is essential. In fact, this is the case for any researcher who is a non-native English speaker, as recognition and success are predicated on being published, while the publications that reach the broadest audiences are in English. Unfortunately, university science and engineering programs in Korea often do not provide formal coursework to help students attain greater competence in English composition. Aggravating this situation is the general lack of literature covering this specific pedagogical issue. While there is plenty of information to help native speakers with technical writing and much covering general English composition for EFL learners, there is very little information available to help EFL learners become better technical writers. Thus, the purpose of this report is twofold. First, as most Korean educators in science and engineering are not well acquainted with pedagogical issues of EFL writing, this report provides a general introduction to some relevant issues. It reviews the importance of contrastive rhetoric as well as some considerations for choosing the appropriate teaching approach, class arrangement, and use of computer-assisted learning tools. Second, a course proposal is discussed. Based on a review of student writing samples as well as student responses to a self-assessment questionnaire, the proposed course is intended to balance the needs of Korean EFL learners to develop the grammar, process, and genre skills involved in technical writing. Although the scope of this report is very modest, by sharing the considerations made in developing an EFL technical writing course, it seeks to provide a small example to a field that is perhaps lacking examples.

Clinical Study on Laryngo - Microscopic Surgery For Vocal Nodules and Polyps (후두결절 및 폴립의 후두미세 수술에 관한 임상연구)

  • 문영일
    • Proceedings of the KOR-BRONCHOESO Conference / 1983.05a / pp.11.2-11 / 1983
  • Vocal nodules and polyps are much more frequent in singers, public speakers, teachers and actors. Voice trauma and voice misuse, at times associated with a mild inflammatory reaction, appear to be important in their etiology. It is generally agreed that vocal cord nodules and polyps are inflammatory in nature and that they arise in the subepithelial layer of loose connective tissue of the vocal cord. Since the junction of the anterior and middle thirds of the membranous cord has the greatest amplitude of vibration, this is the site of predilection for vocal cord nodules. The author performed laryngomicrosurgery for 70 cases of vocal nodules and polyps at Ewha Womans University Hospital over a period of 5 years. The results obtained were as follows: 1) Surgical excision is not necessarily the best approach because vocal nodules in the early stages will resolve with the simplest voice therapy. 2) In children, surgery is rarely indicated because most nodules in children regress during adolescence. 3) For patients who use their voices professionally, voice therapy is indicated for three months. 4) If after three months of conservative treatment the cord lesion does not improve and the patient is still dissatisfied with his voice, laryngomicrosurgery can then be considered. 5) The small cuffed endotracheal tube in the interarytenoid space helps to keep the cords immobile and in an abducted position. 6) Removal of the nodule should be started by gentle retraction posteriorly and as soon as a tear appears anterior to the nodule. 7) On occasion it is preferable to start the dissection with a sickle knife while the nodule is held on the stretch. 8) Voice rest should be maintained for a week, after which the free edges of the cords are usually healed.

The characteristics of sentence reading intonations in North Korean defectors based on pitch range and an auditory-perceptual rating scale (북한이탈주민의 문장 읽기 억양 특성-음도범위와 청지각적 평가를 중심으로)

  • Kim, Damee;Kim, Shinhee;Kim, Jiseong;An, Eunsol;Cho, Yongyun;Yang, Yoonhee;Yim, Dongsun
    • Phonetics and Speech Sciences / v.11 no.3 / pp.9-21 / 2019
  • This study aimed to compare the prosodic characteristics of North Korean defectors and South Koreans in three types of sentences (declarative, interrogative, and negative) in two reading tasks (short and dialogue) through acoustic analysis and auditory-perceptual evaluation. In addition, this study examined the relationship between the auditory-perceptual evaluation scores and self-assessment questionnaires on intonation for North Korean defectors. The participants were 15 North Korean defectors and 15 Korean speakers with standard Seoul accents. For statistical analysis, three-way mixed ANOVA and multivariate analysis were performed within the three types of sentences in the reading tasks through acoustic analysis and the Mann-Whitney U Test for auditory-perceptual evaluation. Pearson's product-moment correlation coefficients were also used to identify the correlations between the results of the self-assessment questionnaire on intonation and the auditory-perceptual evaluation. The North Korean defectors were found to have a significantly lower pitch range and auditory-perceptual evaluation score than South Koreans in reading tasks. Moreover, there was a significant correlation between their auditory-perceptual evaluations and self-assessment questionnaires on intonation. The study findings suggest that North Korean defectors, who face many challenges with intonation, showed a tendency to think that their intonation differed from the standard Korean intonation and showed better auditory evaluation results for interrogative sentences.
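
Pitch range is often expressed in semitones so that speakers with different baseline F0 can be compared on the same scale. The sketch below shows that conversion. The abstract does not state which unit or which F0 extraction method was used, so the semitone scale, the convention that 0 Hz marks an unvoiced frame, and the sample values are all assumptions.

```python
import math


def pitch_range_semitones(f0_values_hz):
    """Pitch range of an utterance in semitones (12 * log2(max / min)).

    Frames with F0 <= 0 are treated as unvoiced and ignored.
    """
    voiced = [f for f in f0_values_hz if f > 0]
    if len(voiced) < 2:
        raise ValueError("need at least two voiced frames")
    return 12.0 * math.log2(max(voiced) / min(voiced))


if __name__ == "__main__":
    # Hypothetical F0 track in Hz; 0.0 marks unvoiced frames.
    track = [0.0, 180.0, 210.0, 240.0, 0.0, 195.0, 160.0]
    print(f"{pitch_range_semitones(track):.2f} semitones")
```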

Changes in fundamental frequency depending on language, context, and language proficiency for bilinguals (한국어-영어 이중언어 화자의 사용 언어, 문맥, 언어 능숙도에 따른 기본 주파수 변화)

  • Yoon, Somang;Mok, Sora;Youn, Jungseon;Han, Jiyun;Yim, Dongsun
    • Phonetics and Speech Sciences / v.11 no.1 / pp.9-18 / 2019
  • The purpose of this study is to determine whether the mean fundamental frequency (F0) changes depending on language, task, or language proficiency for Korean-English bilinguals. A total of forty-eight Korean-English speakers (28 balanced bilinguals and 20 Korean dominant bilinguals) participated in the study. Participants were asked to read aloud two types of tasks in English and Korean. For statistical analyses, the language ${\times}$ task two-way repeated ANOVAs were conducted within the balanced bilingual group first, and then group ${\times}$ language two-way mixed ANOVAs. The results showed that the females in both bilingual groups changed their mean F0 depending on the language they used and the tasks (p<.05), whereas no significant results were found in the males in either group under any conditions. The mean fundamental frequency in the Korean reading task was significantly higher than that in the English reading task for females in both balanced and Korean dominant bilingual groups. Thus, changes in mean F0 depending on language and context may reflect gender-specific characteristics, and females seem to be more sensitive to the socio-cultural standards that are imposed on them.