Search | Korea Science

Comparison of Korean Real-time Text-to-Speech Technology Based on Deep Learning (딥러닝 기반 한국어 실시간 TTS 기술 비교)

Kwon, Chul Hong
- The Journal of the Convergence on Culture Technology, v.7 no.1, pp.640-645, 2021
A deep learning based end-to-end TTS system consists of a Text2Mel module that generates a spectrogram from text and a vocoder module that synthesizes a speech signal from the spectrogram. Applying deep learning to TTS has recently improved the intelligibility and naturalness of synthesized speech to a level comparable to human speech. However, inference for speech synthesis is very slow compared with conventional methods. Inference can be sped up with non-autoregressive methods, which generate speech samples in parallel, independently of previously generated samples. In this paper, we introduce FastSpeech, FastSpeech 2, and FastPitch as Text2Mel models, and Parallel WaveGAN, Multi-band MelGAN, and WaveGlow as vocoders that apply the non-autoregressive approach, and we implement them to verify whether they can run in real time. The measured real-time factor (RTF) shows that all of the presented methods are capable of real-time processing. The trained models are about tens to hundreds of megabytes in size, except for WaveGlow, so they can be applied in embedded environments where memory is limited.
https://doi.org/10.17703/JCCT.2021.7.1.640
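
The RTF mentioned in this abstract is commonly defined as the time spent synthesizing divided by the duration of the audio produced, so a value below 1 means faster than real time. The following is a minimal Python sketch of that measurement under this common definition. The `synthesize` callable, the 22,050 Hz sample rate, and all helper names are hypothetical stand-ins, not the paper's implementation.

```python
import time
from typing import Callable, Sequence


def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent synthesizing / duration of the audio produced."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def measure_rtf(synthesize: Callable[[str], Sequence[float]],
                text: str,
                sample_rate: int = 22050) -> float:
    """Time one call to a (hypothetical) TTS function and return its RTF."""
    start = time.perf_counter()
    samples = synthesize(text)
    elapsed = time.perf_counter() - start
    # Audio duration in seconds is the sample count over the sample rate.
    return real_time_factor(elapsed, len(samples) / sample_rate)


def dummy_synthesize(text: str) -> list:
    """Stand-in for Text2Mel + vocoder: returns one second of silence."""
    return [0.0] * 22050
```

In practice one would average the RTF over many utterances and report the hardware used, since the value depends heavily on the device.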

Time series and deep learning prediction study Using container Throughput at Busan Port (부산항 컨테이너 물동량을 이용한 시계열 및 딥러닝 예측연구)

Seung-Pil Lee;Hwan-Seong Kim
- Proceedings of the Korean Institute of Navigation and Port Research Conference, 2022.06a, pp.391-393, 2022
In recent years, demand forecasting technologies based on deep learning and big data have accelerated the adoption of smart systems in e-commerce, logistics, and distribution. In particular, ports, which are the hubs of global transportation networks and modern intelligent logistics, are responding rapidly to changes in the global economy and the port environment brought about by the Fourth Industrial Revolution. Port traffic forecasting has an important impact on areas such as new port construction, port expansion, and terminal operation. The purpose of this study is therefore to compare time series analysis and deep learning analysis, both commonly used for port traffic prediction, and to derive a prediction model suited to forecasting future container throughput at Busan Port. In addition, external variables related to changes in trade volume were selected based on correlation and applied to a multivariate deep learning prediction model. The results show that LSTM had a low error in the univariate prediction model, which uses only Busan Port container throughput, and also in the multivariate prediction model, which uses the external variables.

Source Tracking Models on Chemical Leaks for Emergency Response in Chemical Plants Based on Deep Learning of Big Data (화학공장 누출사고 대응을 위한 빅데이터-딥러닝 누출원 추적모델)

Kim, Hyunseung;Shin, Dongil
- Proceedings of the Korean Society of Disaster Information Conference, 2017.11a, pp.339-340, 2017
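If a leak at a chemical plant is not handled properly in its early stage, there is a very high risk that it will escalate into secondary and tertiary compound disasters such as fires and explosions. For this reason, it is very important to build an integrated leak response system that quickly identifies the leak location early in an accident and notifies on-site safety personnel, enabling a more systematic and efficient initial response and mitigating the damage. As a preliminary study toward such a system, this study proposes a deep learning based leak source tracking model. Computational Fluid Dynamics (CFD) simulations of leak accident scenarios were run for a real chemical plant located in Yeosu. Concentration, wind direction, and wind speed data were then extracted at the position of each sensor placed along the plant boundary, and an artificial neural network was trained on these data together with the sensor coordinates. The trained model predicted the leak location among 40 leak candidates in real time with 75.43% accuracy, even in situations not used for training. Moreover, even when the predicted location did not match, the predicted point was found to be physically very close to the actual leak point, so the benefit of applying the proposed model in the field is expected to be even greater.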

Deep Learning Based Electricity Demand Prediction and Power Grid Operation according to Urbanization Rate and Industrial Differences (도시화율 및 산업 구성 차이에 따른 딥러닝 기반 전력 수요 변동 예측 및 전력망 운영)

KIM, KAYOUNG;LEE, SANGHUN
- Journal of Hydrogen and New Energy, v.33 no.5, pp.591-597, 2022
Technologies for efficient power grid operation have recently become important because of climate change. For this reason, power demand prediction using deep learning is being considered, and it is necessary to understand the influence of regional characteristics, industrial structure, and climate. This study analyzed power demand in two US states: New Jersey, which has a high urbanization rate and a large service industry, and West Virginia, which has a low urbanization rate and large coal, energy, and chemical industries. A recurrent neural network was trained on power demand from January 2020 to August 2022 and used to predict daily and weekly power demand. Power grid operation based on the demand forecast is also discussed. Unlike previous studies, which focused on the deep learning algorithm itself, this study analyzes regional power demand characteristics, the application of the deep learning algorithm, and power grid operation strategy.
https://doi.org/10.7316/KHNES.2022.33.5.591

Development of Disabled Parking System Using Deep Learning Model (딥러닝 모델을 적용한 장애인 주차구역 단속시스템의 개발)

Lee, Jiwon;Lee, Dongjin;Jang, Jongwook;Jang, Sungjin
- Proceedings of the Korean Institute of Information and Communication Sciences Conference, 2021.05a, pp.175-177, 2021
Parking areas for the disabled are parking facilities for people with walking disabilities, intended to secure safe pedestrian passage for them. However, because of a lack of social awareness of these areas, their use is restricted, and violations such as illegal parking and obstruction of parking are increasing every year. In this study, we therefore propose a system that cracks down on illegal parking in disabled parking areas using YOLOv5, a deep learning object recognition model, in order to address parking interference within these spaces.

Implementation of Face-Touching Action Recognition System based on Deep Learning for Preventing Contagious Diseases (전염병 확산 방지를 위한 딥러닝 기반 얼굴 만지기 행동 인식 연구)

Cho, Sungman;Kim, Minjee;Choi, Joonmyeong;Kim, Taehyung;Park, Juyoung;Kim, Namkug
- Proceedings of the Korean Society of Broadcast Engineers Conference, 2020.07a, pp.630-633, 2020
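To address the problem of infection caused by unconscious hand-to-face contact, face-touching behavior needs to be recognized. This study uses deep learning, which has recently attracted much attention, to recognize face-touching behavior in video. First, spatiotemporal features of 11 actions related to face touching are extracted from video with convolutional neural networks. The extracted information is encoded into action labels to classify face-touching behavior in the video. In addition, comparative experiments are conducted on I3D and MobileNet v3, representative 3D and 2D convolutional networks, respectively. In experiments that classified human actions with the proposed system, face-touching actions were correctly distinguished 99% of the time. We hope that this system helps the general public recognize unconscious face-touching behavior quantitatively and in a timely manner, establish safe hygiene habits, and thereby help prevent the spread of infection.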

Exploration of deep learning facial motions recognition technology in college students' mental health (딥러닝의 얼굴 정서 식별 기술 활용-대학생의 심리 건강을 중심으로)

Li, Bo;Cho, Kyung-Duk
- Journal of the Korea Institute of Information and Communication Engineering, v.26 no.3, pp.333-340, 2022
COVID-19 has made people anxious and has required them to keep their distance from one another. Collective assessment and screening of college students' mental health is needed at the start of every academic year. This study trains a multi-layer perceptron neural network to identify facial emotions. After training, real pictures and videos were input for face detection. Once the positions of faces in the samples were detected, emotions were classified, and the predicted emotions were sent back and displayed on the pictures. The results show an accuracy of 93.2% on the test set and 95.57% in practice. The recognition rates are 95% for Anger, 97% for Disgust, 96% for Happiness, 96% for Fear, 97% for Sadness, 95% for Surprise, and 93% for Neutral. Such efficient emotion recognition can provide objective data support for capturing negative emotions. A deep learning emotion recognition system can complement traditional psychological activities by providing more dimensions of psychological indicators for health.
https://doi.org/10.6109/jkiice.2022.26.3.333

A Study on the Deep Learning-Based Tomato Disease Diagnosis Service (딥러닝기반 토마토 병해 진단 서비스 연구)

Jo, YuJin;Shin, ChangSun
- Smart Media Journal, v.11 no.5, pp.48-55, 2022
Tomato crops are easily exposed to disease, which spreads in a short period of time, so delayed countermeasures directly affect production and sales and can cause losses. There is therefore a need for a service that enables early prevention by diagnosing tomato diseases in the field simply and accurately. In this paper, we build a system that applies deep learning models pre-trained on ImageNet through transfer learning to classify nine classes covering tomato diseases and the normal case, and serves the result. As input to MobileNet and ResNet, which are deep learning based CNN architectures that use convolutions to build lighter neural networks, we use the leaf image set from the Plant Village dataset, labeled as diseased or normal. Training the two models shows that MobileNet, with its high accuracy and training speed, makes a fast and convenient service possible.
https://doi.org/10.30693/SMJ.2022.11.5.48

Search Re-ranking Through Weighted Deep Learning Model (검색 재순위화를 위한 가중치 반영 딥러닝 학습 모델)

Gi-Taek An;Woo-Seok Choi;Jun-Yong Park;Jung-Min Park;Kyung-Soon Lee
- The Transactions of the Korea Information Processing Society, v.13 no.5, pp.221-226, 2024
In information retrieval, queries range from abstract queries to those containing specific keywords, which makes it challenging to return accurate results that match user needs. Search systems must also handle queries containing elements such as typos, multiple languages, and codes. In the proposed method, reranking is performed by training DeBERTa, a deep learning model that has shown high performance in recent research, on documents relevant to each query. To evaluate the method, experiments were conducted on the test collection of the Product Search Track at TREC 2023, an international information retrieval evaluation competition. Measured by NDCG, the proposed method, which combines query error handling, pseudo relevance feedback based query expansion using product titles, and reranking according to query type, scored 0.7810, a 10.48% improvement over BM25, a basic information retrieval model.
https://doi.org/10.3745/TKIPS.2024.13.5.221
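
For readers unfamiliar with the NDCG metric named in this abstract, the sketch below shows one common way to compute it in Python. It assumes the linear-gain variant (gain equal to the graded relevance) and computes the ideal ordering from the same list of judged results. The cutoff and gain function used in the TREC evaluation are not stated here, and the function names are illustrative.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k graded relevances.

    The result at 0-based position `rank` is discounted by log2(rank + 2),
    so the first result has a discount of log2(2) = 1.
    """
    return sum(rel / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int) -> float:
    """DCG of the given ranking divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0
```

A ranking already sorted by relevance scores 1.0, and swapping a highly relevant result below a weaker one lowers the score.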

Spatial Entities Extraction using Bidirectional LSTM-CRF Ensemble (Bidirectional LSTM-CRF 앙상블을 이용한 공간 개체 추출)

Min, Tae Hong;Lee, Jae Sung
- Annual Conference on Human and Language Technology, 2017.10a, pp.133-136, 2017
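Spatial information extraction extracts space-related entities and relations expressed in natural language from large volumes of text documents, and it can be used in question answering systems, chatbot systems, navigation systems, and similar applications. This study introduces a Bidirectional LSTM-CRF model with an ensemble technique for effectively extracting spatial entities that appear in Korean. In experiments on a Korean spatial information corpus, the macro average improved over the existing model, so the approach is expected to be useful for spatial relation extraction in general.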
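
The abstract reports a macro average without naming the underlying score. Assuming it refers to a per-class F1, a macro average can be sketched in Python as below. The class names and counts in the usage note are hypothetical and are not taken from the paper.

```python
from typing import Dict, Tuple


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from true positives, false positives, and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(counts: Dict[str, Tuple[int, int, int]]) -> float:
    """Unweighted mean of per-class F1.

    Every class contributes equally, so rare entity types count as much
    as frequent ones.
    """
    if not counts:
        raise ValueError("counts must contain at least one class")
    return sum(f1(*c) for c in counts.values()) / len(counts)
```

For example, `macro_f1({"PLACE": (10, 0, 0), "PATH": (0, 5, 5)})` averages a perfect class with a fully missed class, which shows how one weak class pulls the macro average down regardless of how many instances it has.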

Search Result 1,319, Processing Time 0.035 seconds
