Search | Korea Science

Building robust Korean speech recognition model by fine-tuning large pretrained model (대형 사전훈련 모델의 파인튜닝을 통한 강건한 한국어 음성인식 모델 구축)

Changhan Oh;Cheongbin Kim;Kiyoung Park
- Phonetics and Speech Sciences / v.15 no.3 / pp.75-82 / 2023
Automatic speech recognition (ASR) has been revolutionized by deep learning-based approaches, among which self-supervised learning methods have proven particularly effective. In this study, we aim to enhance the performance of OpenAI's Whisper model, a multilingual ASR system, on the Korean language. Whisper was pretrained on a large corpus (around 680,000 hours) of web speech data and has demonstrated strong recognition performance for major languages. However, it faces challenges in recognizing languages such as Korean, which was not a major language in its training data. We address this issue by fine-tuning the Whisper model with an additional dataset comprising about 1,000 hours of Korean speech. We also compare its performance against a Transformer model trained from scratch on the same dataset. Our results indicate that fine-tuning the Whisper model significantly improved its Korean speech recognition capabilities in terms of character error rate (CER). Specifically, performance improved with increasing model size. However, the Whisper model's performance on English deteriorated after fine-tuning, emphasizing the need for further research to develop robust multilingual models. Our study demonstrates the potential of a fine-tuned Whisper model for Korean ASR applications. Future work will focus on multilingual recognition and optimization for real-time inference.
https://doi.org/10.13064/KSSS.2023.15.3.075
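
The abstract above reports its results as character error rate (CER). As a minimal sketch, and not the authors' evaluation code, CER is commonly computed as the Levenshtein edit distance between the reference and hypothesis character sequences divided by the reference length. The function names `edit_distance` and `cer`, and the choice to ignore spaces, are assumptions made for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert/delete/substitute cost 1)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference characters to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis, ignore_spaces=True):
    """Character error rate: edit distance divided by reference length.

    ignore_spaces is an assumption of this sketch; evaluation setups differ
    in how they treat whitespace.
    """
    if ignore_spaces:
        reference = reference.replace(" ", "")
        hypothesis = hypothesis.replace(" ", "")
    if not reference:
        raise ValueError("reference must contain at least one character")
    return edit_distance(reference, hypothesis) / len(reference)
```

Because the denominator is the reference length, CER can exceed 1.0 when the hypothesis contains many insertions.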

Dynamic Distributed Adaptation Framework for Quality Assurance of Web Service in Mobile Environment (모바일 환경에서 웹 서비스 품질보장을 위한 동적 분산적응 프레임워크)

Lee, Seung-Hwa;Cho, Jae-Woo;Lee, Eun-Seok
- The KIPS Transactions:PartD / v.13D no.6 s.109 / pp.839-846 / 2006
Context-aware adaptive services that overcome the limitations of wireless devices and maintain adequate service levels in changing environments are becoming an important issue. However, most existing studies place the adaptation module on the client, the proxy, or the server alone. These approaches therefore suffer from the workload being concentrated on a single system as the number of users increases, which in turn increases the response time to a user's request. In this paper, the adaptation module is therefore distributed across the client, proxy, and server. The module monitors the context of each system and determines which system is most adequate for carrying out the adaptation operations. With this method, faster adaptation is possible even as the number of users increases, and system operation is more stable because the workload is divided. To evaluate the proposed system, a prototype was built and distributed operation was tested using multimedia-based learning content; server overload was simulated, and response times and system stability were compared with the existing server-based adaptation method. The results confirm the effectiveness of the system.
https://doi.org/10.3745/KIPSTD.2006.13D.6.839

Development of a method for urban flooding detection using unstructured data and deep learning (비정형 데이터와 딥러닝을 활용한 내수침수 탐지기술 개발)

Lee, Haneul;Kim, Hung Soo;Kim, Soojun;Kim, Donghyun;Kim, Jongsung
- Journal of Korea Water Resources Association / v.54 no.12 / pp.1233-1242 / 2021
In this study, a model was developed to determine whether flooding has occurred using image data, a form of unstructured data. The CNN-based VGG16 and VGG19 architectures were used to develop the flood classification model. To develop the model, flooded and non-flooded images were collected by web crawling. Because data collected by web crawling contains noise, images irrelevant to this study were first deleted, and the remaining images were then resized to 224×224 for input to the model. In addition, image augmentation was performed by changing the image angle to increase image diversity. Finally, training was performed using 2,500 flooded and 2,500 non-flooded images. In the model evaluation, the average classification performance was 97%. If the model developed in this study is deployed in CCTV control center systems in the future, responses to flood damage are expected to become faster.
https://doi.org/10.3741/JKWRA.2021.54.12.1233

Development of Coupon System for Youth's Experiential Learning using QR Code (QR코드를 이용한 청소년 체험학습 쿠폰 시스템 개발)

Park, Soon-Ho;Kim, Yu-Doo;Moon, Il-Young
- The Journal of Korean Institute for Practical Engineering Education / v.5 no.1 / pp.52-57 / 2013
With the rapid spread of the PC, many users enjoy a variety of content on PCs. In recent years in particular, PC usage among young people has increased dramatically. Young people obtain information more easily using a PC; they also relieve stress through online games and enjoy the different kind of fun that virtual reality offers. Their exposure to rapidly developing IT culture is clearly beneficial. However, young people's view of the world tends to be narrow because they spend more time on indoor activities than on outdoor ones. We therefore built a spot-experience voucher system using a smartphone application, in the hope that many young people will take part in outdoor activities. The product is developed as an HTML5-based app so that it can be offered as a hybrid app across devices. We expect the app to spark young people's interest in on-site experiences, so that those who use it can gain many experiences and see diverse aspects of the world.

Light-Weight Mobile VR Platform using HMD with 6 Axis (6 축센서를 갖는 HMD 경량 모바일 VR Platform)

Kang, Yunhee;Kang, JungJu
- Journal of Platform Technology / v.6 no.2 / pp.3-9 / 2018
Recently, VR environments have been used in many areas, including mobile learning and smart factories. However, an HMD (head-mounted display) typically requires a dedicated, expensive system with high-end specifications. When designing a VR system, performance, mobility, and usability must all be considered. Many VR applications need to handle diverse sensors and user inputs continuously in a streaming manner. In this paper, we design a mobile VR platform and implement a low-cost mobile VR HMD running on the platform. The VR HMD supports mobile delivery of 3D content. It detects the motion of a VR player based on angle values from an accelerometer and a gyro sensor. The MPU-6050, a 6-axis sensor, is used to obtain the sensor values, which are fed as input to a VR rendering server on the Unity game engine that generates 3D images.
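
The abstract above derives an angle value from accelerometer and gyro readings but does not say how the two are combined. One common approach, shown here as a sketch under stated assumptions rather than as the authors' method, is a complementary filter: the gyro rate is integrated for short-term responsiveness and blended with the accelerometer-derived angle to limit drift. The function names, the axis convention, and the blend factor `alpha=0.98` are all illustrative assumptions; reading raw values from the MPU-6050 is not shown.

```python
import math


def accel_pitch_deg(ax, ay, az):
    """Pitch angle in degrees estimated from gravity as seen by the accelerometer.

    Assumes ax, ay, az are accelerations in g and that the sensor is not
    otherwise accelerating; the axis convention is an assumption of this sketch.
    """
    return math.degrees(math.atan2(-ax, math.sqrt(ay * ay + az * az)))


def complementary_filter(prev_angle_deg, gyro_rate_dps, accel_angle_deg, dt, alpha=0.98):
    """Blend the integrated gyro rate with the accelerometer angle.

    prev_angle_deg:  previous filtered angle (degrees)
    gyro_rate_dps:   angular rate from the gyro (degrees per second)
    accel_angle_deg: angle derived from the accelerometer (degrees)
    dt:              time step in seconds
    alpha:           weight on the gyro path (hypothetical default)
    """
    gyro_angle = prev_angle_deg + gyro_rate_dps * dt
    return alpha * gyro_angle + (1.0 - alpha) * accel_angle_deg
```

In a streaming loop, the filter would be called once per sensor sample, with the returned angle passed back in as `prev_angle_deg` on the next call.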

Natural Language Processing-based Personalized Twitter Recommendation System (자연어 처리 기반 맞춤형 트윗 추천 시스템)

Lee, Hyeon-Chang;Yu, Dong-Pil;Jung, Ga-Bin;Nam, Yong-Wook;Kim, Yong-Hyuk
- Journal of the Korea Convergence Society / v.9 no.12 / pp.39-45 / 2018
Twitter users use features such as 'Following' and 'Retweet' to find tweets they are interested in. However, it is difficult for users to find tweets of interest on Twitter, which has more than 300 million users. In this paper, we developed a customized tweet recommendation system to address this problem. First, we gather current trends in order to collect tweets worth recommending and popular tweets that discuss those trends. Next, to analyze users and recommend customized tweets, the users' tweets and the collected tweets are categorized. Finally, through a web service, we recommend tweets that match the user's categorization, as well as users with matching interests. As a result, 67.2% of the recommended tweets were appropriate.
https://doi.org/10.15207/JKCS.2018.9.12.039

Implementation of a Classification System for Dog Behaviors using YOLO-based Object Detection and a Node.js Server (YOLO 기반 개체 검출과 Node.js 서버를 이용한 반려견 행동 분류 시스템 구현)

Jo, Yong-Hwa;Lee, Hyuek-Jae;Kim, Young-Hun
- Journal of the Institute of Convergence Signal Processing / v.21 no.1 / pp.29-37 / 2020
This paper implements a method for detecting dogs through real-time image analysis and classifying dog behaviors from the extracted images. Darknet YOLO was used to detect dog objects, and the Teachable Machine provided by Google was used to classify behavior patterns from the extracted images. The trained Teachable Machine model is saved in Google Drive and can be used through ml5.js on a Node.js server. By implementing an interactive web server using the socket.io module on the Node.js server, the classification results are transmitted to the user's smartphone or PC in real time so that they can be checked anytime, anywhere.

GAN-based Automated Generation of Web Page Metadata for Search Engine Optimization (검색엔진 최적화를 위한 GAN 기반 웹사이트 메타데이터 자동 생성)

An, Sojung;Lee, O-jun;Lee, Jung-Hyeon;Jung, Jason J.;Yong, Hwan-Sung
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2019.05a / pp.79-82 / 2019
This study aims to design and implement automated tools that apply artificial intelligence techniques to search engine optimization (SEO). Traditional on-page SEO is limited in that it relies only on the knowledge of webpage administrators. This paper therefore proposes a metadata generation system. It introduces three approaches for recommending metadata: i) downloading the metadata at the top of the webpage; ii) generating highly relevant terms using an attention-based bi-directional Long Short-Term Memory (LSTM) network; and iii) learning through a Generative Adversarial Network (GAN) to enhance overall performance. The system is expected to be useful as an optimization tool that can be evaluated and can improve online marketing processes.

Development of Dataset Evaluation Criteria for Learning Deepfake Video (딥페이크 영상 학습을 위한 데이터셋 평가기준 개발)

Kim, Rayng-Hyung;Kim, Tae-Gu
- Journal of Korean Society of Industrial and Systems Engineering / v.44 no.4 / pp.193-207 / 2021
The Deepfake phenomenon is spreading worldwide, mainly through videos on web platforms, and it is urgent to address the issue in time. Recently, researchers have extensively discussed Deepfake video datasets. However, it has been pointed out that existing Deepfake datasets do not properly reflect the potential threat and realism because of various limitations. Although there is a need for research that establishes an agreed-upon concept of high-quality datasets or proposes evaluation criteria, only a handful of studies have examined this to date. This study therefore focused on developing evaluation criteria for Deepfake video datasets. The fitness of a Deepfake dataset was defined, and evaluation criteria were derived through a review of previous studies. AHP structuring and analysis were performed to refine the evaluation criteria. The results showed that Facial Expression, Validation, and Data Characteristics are important determinants of data quality. This is interpreted as reflecting the importance of minimizing defects and presenting results based on scientific methods when evaluating quality. This study has implications in that it proposes the fitness and evaluation criteria for Deepfake datasets. Since the evaluation criteria presented in this study were derived from items considered in previous studies, all of the criteria are expected to be effective for quality improvement. They are also expected to be used as criteria for selecting an appropriate Deepfake dataset or as a reference for designing a Deepfake data benchmark. This study could not apply the presented evaluation criteria to existing Deepfake datasets. In future research, the proposed criteria will be applied to existing datasets to evaluate the strengths and weaknesses of each dataset and to consider the implications for their use in Deepfake research.
https://doi.org/10.11627/jksie.2021.44.4.193
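
The abstract above applies AHP (Analytic Hierarchy Process) to weight evaluation criteria but does not give the computation. As a hedged sketch, not the authors' procedure, one standard way to approximate AHP priority weights is the row geometric-mean method applied to a pairwise comparison matrix. The function name `ahp_weights` and the example matrix are hypothetical; the matrix values are invented for illustration and are not taken from the paper.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method.

    pairwise[i][j] states how much more important criterion i is than
    criterion j; the matrix is expected to be reciprocal, i.e.
    pairwise[j][i] == 1 / pairwise[i][j], with ones on the diagonal.
    """
    n = len(pairwise)
    if any(len(row) != n for row in pairwise):
        raise ValueError("pairwise comparison matrix must be square")
    # Geometric mean of each row, then normalise so the weights sum to 1.
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Illustrative matrix for three unnamed criteria A, B, C (made-up values):
# A is judged 2x as important as B and 4x as important as C.
example = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(example)
```

For a perfectly consistent matrix such as this one, the geometric-mean weights coincide with the principal-eigenvector weights; for inconsistent judgments the two can differ slightly, and a consistency check would normally accompany the analysis.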

Building Hierarchical Knowledge Base of Research Interests and Learning Topics for Social Computing Support (소셜 컴퓨팅을 위한 연구·학습 주제의 계층적 지식기반 구축)

Kim, Seonho;Kim, Kang-Hoe;Yeo, Woondong
- The Journal of the Korea Contents Association / v.12 no.12 / pp.489-498 / 2012
This paper consists of two parts. In the first part, we describe our work to build a hierarchical knowledge base of digital library patrons' research interests and learning topics in various scholarly areas by analyzing well-classified Electronic Theses and Dissertations (ETDs) from the NDLTD Union Catalog. Journal articles from ACM Transactions and conference web sites in computing areas are also added to the analysis to refine the computing fields. This hierarchical knowledge base would be a useful tool for many social computing and information service applications, such as personalization, recommender systems, text mining, technology opportunity mining, and information visualization. In the second part, we compare four grouping algorithms to select the best one for our data mining research by testing each with the hierarchical knowledge base described in the first part. With these two studies, we intend to show that traditional verification methods for social community mining research, which are based on interviews and questionnaires and are expensive, slow, and privacy-threatening, can be replaced with systematic, consistent, fast, and privacy-protecting methods that use our proposed hierarchical knowledge base.
https://doi.org/10.5392/JKCA.2012.12.12.489
