• Title/Summary/Keyword: Speech Recognition Technology

Search Result 523, Processing Time 0.027 seconds

Language Variation and World Englishes (언어변이와 세계영어들)

  • Kim, Yangsoon
    • The Journal of the Convergence on Culture Technology
    • /
    • v.7 no.1
    • /
    • pp.234-239
    • /
    • 2021
  • The purpose of this paper is to find out the nature of language variation by exploring the ways of the progress of the language variation that produces all English-lects, i.e., the World Englishes. The study of language variation in linguistics is a hybrid enterprise, so the study of World Englishes has led to the recognition of a highly diverse set of all English-lects, encompassing regional dialects, sociolects, ethnolects and (post-)colonial dialects of World Englishes. In this paper, we propose a hybrid language variation model with three interacting factors of social distancing, on/off-contact, and linguistic diversity to examine the characteristics of language variation. In the context of World Englishes, the social distance is typically low in terms of their local location (country/speech) for local purposes. The social distance also varies based on online/offline communication modes and other social factors like gender, age and ethnic groups, resulting in all English-lects. To clarify the nature of World Englishes, the core Englishes, BrE, AmE and CanE are discussed here.

A Study on the Linkage Model Between Institutions Related to Lifelong Education for People with Developmental Disabilities Based on the K-PACE Center of Daegu University: A Perspective on the Whole Life Cycle for People with Developmental Disabilities

  • Kim, Young-Jun;Kim, Wha-Soo;Rhee, Kun-Yong
    • International Journal of Advanced Culture Technology
    • /
    • v.10 no.1
    • /
    • pp.24-35
    • /
    • 2022
  • The purpose of this study was to form a linked model in which local institutions related to lifelong education for the disabled can cooperate based on the Daegu University K-PACE Center. The contents of the study started with recognizing the problem that the adult-centered lifelong education support system does not effectively cope with these factors, even though the independent life of people with developmental disabilities is a major factor determining the quality of life. Regarding this problem recognition, this study primarily emphasized the view that educational support for independent life of people with developmental disabilities should establish the context of the school foundation. The context of the school foundation is established for lifelong education centered on adulthood for people with developmental disabilities because the curriculum is embodied through the standards of subject matter education. In this regard, the Daegu University K-PACE Center, which established a curriculum that supports the independent life of people with developmental disabilities in terms of linking higher and lifelong education, actually reflects the context of the school foundation. As a result, this study prepared a strategy that could be considered as a transition to advance the curriculum organized by the Daegu University K-PACE Center, and the strategy was secondarily reflected as a procedure that could be linked to local lifelong education-related institutions for the disabled. Finally, this study presented a form of transition in which people with developmental disabilities can access the curriculum of lifelong education through the connection of local lifelong education-related institutions for the disabled, centering on the entire life of adulthood.

Contextual In-Video Advertising Using Situation Information (상황 정보를 활용한 동영상 문맥 광고)

  • Yi, Bong-Jun;Woo, Hyun-Wook;Lee, Jung-Tae;Rim, Hae-Chang
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.11 no.8
    • /
    • pp.3036-3044
    • /
    • 2010
  • With the rapid growth of video data service, demand to provide advertisements or additional information with regard to a particular video scene is increasing. However, the direct use of automated visual analysis or speech recognition on videos virtually has limitations with current level of technology; the metadata of video such as title, category information, or summary does not reflect the content of continuously changing scenes. This work presents a new video contextual advertising system that serves relevant advertisements on a given scene by leveraging the scene's situation information inferred from video scripts. Experimental results show that the use of situation information extracted from scripts leads to better performance and display of more relevant advertisements to the user.

A Study on LMS Using Effective User Interface in Mobile Environment (모바일 환경에서 효과적인 사용자 인터페이스를 이용한 LMS에 관한 연구)

  • Kim, Si-Jung;Cho, Do-Eun
    • Journal of Advanced Navigation Technology
    • /
    • v.16 no.1
    • /
    • pp.76-81
    • /
    • 2012
  • With the spread of the various mobile devices, the studies on the learning management system based on the u-learning are actively proceeding. The u-learning-based learning management system is very convenient in that there are no restrictions on the various access devices as well as the access time and place. However, the judgments on the authentication for the user and whether learning is focused on are difficult. In this paper, the voice and user face capture interface rather than the common user event oriented interface was applied to the learning management system. When a user is accessing the learning management system, user's registered password is input and login as voice, and the user's learning attitude is judged through the response utterance of simple words during the process of learning through contents. As a result of evaluating the proposed learning management system, the user's learning achievement and concentration were improved, thus enabling the manager to monitor the user's abnormal learning attitude.

Development for Estimation Model of Runway Visual Range using Deep Neural Network (심층신경망을 활용한 활주로 가시거리 예측 모델 개발)

  • Ku, SungKwan;Hong, SeokMin
    • Journal of Advanced Navigation Technology
    • /
    • v.21 no.5
    • /
    • pp.435-442
    • /
    • 2017
  • The runway visual range affected by fog and so on is one of the important indicators to determine whether aircraft can take off and land at the airport or not. In the case of airports where transportation airplanes are operated, major weather forecasts including the runway visual range for local area have been released and provided to aviation workers for recognizing that. This paper proposes a runway visual range estimation model with a deep neural network applied recently to various fields such as image processing, speech recognition, natural language processing, etc. It is developed and implemented for estimating a runway visual range of local airport with a deep neural network. It utilizes the past actual weather observation data of the applied airfield for constituting the learning of the neural network. It can show comparatively the accurate estimation result when it compares the results with the existing observation data. The proposed model can be used to generate weather information on the airfield for which no other forecasting function is available.

Research on Emotional Factors and Voice Trend by Country to be considered in Designing AI's Voice - An analysis of interview with experts in Finland and Norway (AI의 음성 디자인에서 고려해야 할 감성적 요소 및 국가별 음성 트랜드에 관한 연구 - 핀란드와 노르웨이의 전문가 인뎁스 인터뷰를 중심으로)

  • Namkung, Kiechan
    • Journal of the Korea Convergence Society
    • /
    • v.11 no.9
    • /
    • pp.91-97
    • /
    • 2020
  • Use of voice-based interfaces that can interact with users is increasing as AI technology develops. To date, however, most of the research on voice-based interfaces has been technical in nature, focused on areas such as improving the accuracy of speech recognition. Thus, the voice of most voice-based interfaces is uniform and does not provide users with differentiated sensibilities. The purpose of this study is to add a emotional factor suitable for the AI interface. To this end, we have derived emotional factors that should be considered in designing voice interface. In addition, we looked at voice trends that differed from country to country. For this study, we conducted interviews with voice industry experts from Finland and Norway, countries that use their own independent languages.

An analysis study on the quality of article to improve the performance of hate comments discrimination (악성댓글 판별의 성능 향상을 위한 품사 자질에 대한 분석 연구)

  • Kim, Hyoung Ju;Min, Moon Jong;Kim, Pan Koo
    • Smart Media Journal
    • /
    • v.10 no.4
    • /
    • pp.71-79
    • /
    • 2021
  • One of the social aspects that changes as the use of the Internet becomes widespread is communication in online space. In the past, only one-on-one conversations were possible remotely, except when they were physically in the same space, but nowadays, technology has been developed to enable communication with a large number of people remotely through bulletin boards, communities, and social network services. Due to the development of such information and communication networks, life becomes more convenient, and at the same time, the damage caused by rapid information exchange is also constantly increasing. Recently, cyber crimes such as sending sexual messages or personal attacks to certain people with recognition on the Internet, such as not only entertainers but also influencers, have occurred, and some of those exposed to these cybercrime have committed suicide. In this paper, in order to reduce the damage caused by malicious comments, research a method for improving the performance of discriminate malicious comments through feature extraction based on parts-of-speech.

Efficient Thread Allocation Method of Convolutional Neural Network based on GPGPU (GPGPU 기반 Convolutional Neural Network의 효율적인 스레드 할당 기법)

  • Kim, Mincheol;Lee, Kwangyeob
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology
    • /
    • v.7 no.10
    • /
    • pp.935-943
    • /
    • 2017
  • CNN (Convolution neural network), which is used for image classification and speech recognition among neural networks learning based on positive data, has been continuously developed to have a high performance structure to date. There are many difficulties to utilize in an embedded system with limited resources. Therefore, we use GPU (General-Purpose Computing on Graphics Processing Units), which is used for general-purpose operation of GPU to solve the problem because we use pre-learned weights but there are still limitations. Since CNN performs simple and iterative operations, the computation speed varies greatly depending on the thread allocation and utilization method in the Single Instruction Multiple Thread (SIMT) based GPGPU. To solve this problem, there is a thread that needs to be relaxed when performing Convolution and Pooling operations with threads. The remaining threads have increased the operation speed by using the method used in the following feature maps and kernel calculations.

Research on Developing a Conversational AI Callbot Solution for Medical Counselling

  • Won Ro LEE;Jeong Hyon CHOI;Min Soo KANG
    • Korean Journal of Artificial Intelligence
    • /
    • v.11 no.4
    • /
    • pp.9-13
    • /
    • 2023
  • In this study, we explored the potential of integrating interactive AI callbot technology into the medical consultation domain as part of a broader service development initiative. Aimed at enhancing patient satisfaction, the AI callbot was designed to efficiently address queries from hospitals' primary users, especially the elderly and those using phone services. By incorporating an AI-driven callbot into the hospital's customer service center, routine tasks such as appointment modifications and cancellations were efficiently managed by the AI Callbot Agent. On the other hand, tasks requiring more detailed attention or specialization were addressed by Human Agents, ensuring a balanced and collaborative approach. The deep learning model for voice recognition for this study was based on the Transformer model and fine-tuned to fit the medical field using a pre-trained model. Existing recording files were converted into learning data to perform SSL(self-supervised learning) Model was implemented. The ANN (Artificial neural network) neural network model was used to analyze voice signals and interpret them as text, and after actual application, the intent was enriched through reinforcement learning to continuously improve accuracy. In the case of TTS(Text To Speech), the Transformer model was applied to Text Analysis, Acoustic model, and Vocoder, and Google's Natural Language API was applied to recognize intent. As the research progresses, there are challenges to solve, such as interconnection issues between various EMR providers, problems with doctor's time slots, problems with two or more hospital appointments, and problems with patient use. However, there are specialized problems that are easy to make reservations. Implementation of the callbot service in hospitals appears to be applicable immediately.

Robust Real-time Pose Estimation to Dynamic Environments for Modeling Mirror Neuron System (거울 신경 체계 모델링을 위한 동적 환경에 강인한 실시간 자세추정)

  • Jun-Ho Choi;Seung-Min Park
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.19 no.3
    • /
    • pp.583-588
    • /
    • 2024
  • With the emergence of Brain-Computer Interface (BCI) technology, analyzing mirror neurons has become more feasible. However, evaluating the accuracy of BCI systems that rely on human thoughts poses challenges due to their qualitative nature. To harness the potential of BCI, we propose a new approach to measure accuracy based on the characteristics of mirror neurons in the human brain that are influenced by speech speed, depending on the ultimate goal of movement. In Chapter 2 of this paper, we introduce mirror neurons and provide an explanation of human posture estimation for mirror neurons. In Chapter 3, we present a powerful pose estimation method suitable for real-time dynamic environments using the technique of human posture estimation. Furthermore, we propose a method to analyze the accuracy of BCI using this robotic environment.