Search | Korea Science

Comparison of Korean Real-time Text-to-Speech Technology Based on Deep Learning (딥러닝 기반 한국어 실시간 TTS 기술 비교)

Kwon, Chul Hong
- The Journal of the Convergence on Culture Technology, v.7 no.1, pp.640-645, 2021
A deep learning based end-to-end TTS system consists of a Text2Mel module that generates a spectrogram from text and a vocoder module that synthesizes speech signals from the spectrogram. Applying deep learning to TTS has recently improved the intelligibility and naturalness of synthesized speech to a level comparable to human speech. However, inference for synthesizing speech is very slow compared to conventional methods. Inference speed can be improved with non-autoregressive methods, which generate speech samples in parallel, independently of previously generated samples. In this paper, we introduce FastSpeech, FastSpeech 2, and FastPitch as Text2Mel technologies, and Parallel WaveGAN, Multi-band MelGAN, and WaveGlow as vocoder technologies, all of which apply the non-autoregressive method. We implement them to verify whether they can run in real time. The measured RTF (real-time factor) values show that all the presented methods are sufficiently capable of real-time processing. The trained models are about tens to hundreds of megabytes in size, except for WaveGlow, so they can be applied to embedded environments where memory is limited.
https://doi.org/10.17703/JCCT.2021.7.1.640
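
The abstract above reports real-time capability through RTF but does not give the formula. RTF is commonly defined as synthesis time divided by the duration of the generated audio, so a value below 1 means faster than real time. The sketch below is a minimal illustration under that assumption; `dummy_synthesize` is a hypothetical stand-in for a Text2Mel plus vocoder pipeline and is not taken from the paper.

```python
import time


def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """Return synthesis time divided by audio duration (RTF < 1 is faster than real time)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def measure_rtf(synthesize, text: str, sample_rate: int) -> float:
    """Time one call to `synthesize` and relate it to the length of the returned waveform."""
    start = time.perf_counter()
    samples = synthesize(text)
    elapsed = time.perf_counter() - start
    return real_time_factor(elapsed, len(samples) / sample_rate)


def dummy_synthesize(text: str) -> list:
    # Hypothetical stand-in for a real TTS model:
    # returns 0.1 s of silence per input character at 22,050 Hz.
    return [0.0] * (2205 * len(text))


if __name__ == "__main__":
    rtf = measure_rtf(dummy_synthesize, "hello", sample_rate=22050)
    print(f"RTF = {rtf:.6f}")
```

In practice the timing would be averaged over many utterances, since a single measurement is sensitive to warm-up and caching effects.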

Design and Implementation of BNN based Human Identification and Motion Classification System Using CW Radar (연속파 레이다를 활용한 이진 신경망 기반 사람 식별 및 동작 분류 시스템 설계 및 구현)

Kim, Kyeong-min;Kim, Seong-jin;NamKoong, Ho-jung;Jung, Yun-ho
- Journal of Advanced Navigation Technology, v.26 no.4, pp.211-218, 2022
Continuous wave (CW) radar has advantages in reliability and accuracy over other sensors such as cameras and lidar. In addition, a binarized neural network (BNN) dramatically reduces memory usage and complexity compared to other deep learning networks. This paper therefore proposes a BNN-based human identification and motion classification system using CW radar. After a signal is received from the CW radar, a spectrogram is generated through a short-time Fourier transform (STFT). Based on this spectrogram, we propose an algorithm that detects whether a person is approaching the radar. We also designed an optimized BNN model that achieves an accuracy of 90.0% for human identification and 98.3% for motion classification. To accelerate BNN operation, we designed a BNN hardware accelerator on a field programmable gate array (FPGA). The accelerator was implemented with 1,030 logics, 836 registers, and 334.904 Kbit of block memory, and real-time operation was confirmed with a total calculation time of 6 ms from inference to transfer of the result.
https://doi.org/10.12673/jant.2022.26.4.211
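
The STFT step mentioned in the abstract above can be illustrated with a small, dependency-free sketch. It computes a magnitude spectrogram by windowing overlapping frames and taking a direct DFT of each. The frame length, hop size, and Hann window are assumptions chosen for illustration; the paper's actual parameters are not given in the abstract, and a real implementation would use an FFT library.

```python
import cmath
import math


def stft_magnitude(signal, frame_len=64, hop=32):
    """Magnitude spectrogram as a list of frames, each holding frame_len // 2 + 1 bins."""
    # Periodic Hann window (an illustrative choice, not the paper's).
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            # Direct DFT of one bin; O(N^2) per frame, fine for a small demo.
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


if __name__ == "__main__":
    # Test tone that falls exactly on bin 8 of a 64-point frame.
    tone = [math.cos(2 * math.pi * 8 * n / 64) for n in range(512)]
    spec = stft_magnitude(tone)
    peak_bins = {max(range(len(f)), key=f.__getitem__) for f in spec}
    print(len(spec), "frames, peak bins:", peak_bins)
```

For a radar return, the frequency axis of such a spectrogram corresponds to Doppler shift, which is what makes approach and motion patterns visible over time.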

A Study on Effective Real Estate Big Data Management Method Using Graph Database Model (그래프 데이터베이스 모델을 이용한 효율적인 부동산 빅데이터 관리 방안에 관한 연구)

Ju-Young, KIM;Hyun-Jung, KIM;Ki-Yun, YU
- Journal of the Korean Association of Geographic Information Studies, v.25 no.4, pp.163-180, 2022
Real estate data can be regarded as big data because its volume is growing rapidly and because it interacts with various fields such as the economy, law, and crowd psychology while being structured in complex data layers. Existing relational databases have difficulty handling the varied relationships involved in managing real estate big data, because they have a fixed schema and scale only vertically. To address these limitations, this study constructs real estate data in a graph database and verifies its usefulness. As the research method, we modeled various real estate data in MySQL, one of the most widely used relational databases, and in Neo4j, one of the most widely used graph databases. We then collected real estate questions asked in real life and selected 9 different questions to compare query times on each database. As a result, Neo4j showed constant performance even for queries that require multiple JOIN statements and inference over various relationships, whereas MySQL's query time increased rapidly. From this result, we conclude that a graph database such as Neo4j is more efficient for real estate big data with various relationships. We expect the real estate graph database to be used for predicting real estate price factors and for real estate inquiries through AI speakers.
https://doi.org/10.11108/kagis.2022.25.4.163
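
To illustrate why multi-relationship questions suit a graph model, the sketch below answers a multi-hop question by following stored relationships from node to node, instead of joining one table per hop. The node names, relationship types, and the toy graph are all hypothetical; this is plain Python, not Neo4j or MySQL code, and it does not reproduce the paper's data model or queries.

```python
from collections import deque

# Hypothetical toy graph: node -> list of (relationship, neighbor).
EDGES = {
    "apartment_101": [("LOCATED_IN", "district_A")],
    "district_A": [("SERVED_BY", "school_X"), ("SERVED_BY", "station_Y")],
    "school_X": [],
    "station_Y": [],
}


def reachable_within(start, max_hops):
    """Breadth-first traversal; returns {node: hop distance} for nodes within max_hops."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for _relationship, neighbor in EDGES.get(node, []):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return distance


if __name__ == "__main__":
    print(reachable_within("apartment_101", 2))
```

Each additional hop here only touches the neighbors of nodes already reached, whereas an equivalent relational query would typically add one more JOIN per hop.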

Technology Trends of Smart Abnormal Detection and Diagnosis System for Gas and Hydrogen Facilities (가스·수소 시설의 스마트 이상감지 및 진단 시스템 기술동향)

Park, Myeongnam;Kim, Byungkwon;Hong, Gi Hoon;Shin, Dongil
- Journal of the Korean Institute of Gas, v.26 no.4, pp.41-57, 2022
With the global demand for carbon neutrality in response to climate change, some countries, including Korea, which is classified as an export-led economy and a greenhouse gas exporter, need to prepare countermeasures against carbon trade barriers. Digital transformation, one of the foreseeable ways to apply a carbon-neutral transition model, should therefore be introduced early. This paper reviews technology trends in applying digital technology to industrial gas manufacturing facilities used in high-tech manufacturing, one of the major industries, and to hydrogen gas facilities, which are emerging as an eco-friendly energy source, so that abnormality detection and diagnosis services can be provided through cloud-based predictive diagnosis monitoring technology that incorporates operating knowledge. The review confirms that the field is moving toward predictive monitoring for abnormality diagnosis through optimization, augmented reality, IoT, and AI knowledge inference, rather than simple real-time monitoring of facility status. It also suggests that technologies such as consensus knowledge in the engineering domain and predictive diagnostic monitoring, matched to the economic feasibility and efficiency of the technology, can be disseminated to small and medium-sized companies that are in the blind spot of carbon-neutral implementation. It is hoped that these technologies will be used as a way to seek countermeasures against carbon emission trade barriers based on the highest level of ICT.
https://doi.org/10.7842/kigas.2022.26.4.41

Matching Analysis between Actress Son Ye-jin's Core Persona and Audience Responses to Her Starring Works (배우 손예진의 코어 페르소나와 주연 작품에 대한 수용자 반응과의 정합성 분석)

Kim, Jeong-Seob
- Journal of Korea Entertainment Industry Association, v.13 no.4, pp.93-106, 2019
Persona is an actor's external ego constructed by playing various roles, and another self-portrait of the actor in the eyes of the audience. This study was conducted to analyze the persona identity, including the core persona (CP), of the actress Son Ye-jin, who is called the "melo queen," and to gain implications for her growth strategy by verifying the consistency between her CP and audience responses to her past starring works. Following related theories and models, the persona was first defined in terms of image, visuality, personality, and consistency, and these components were used to extract and sort descriptive texts about Son from news articles published over the last 5 years in six major Korean newspapers, using the content analysis method. We then analyzed the number of viewers of her movies and the audience share of her dramas by genre. As a result, Son's persona components were found to be 54.2% image (34.0% melo and romance images, 20.2% non-melo and romance images), 25.6% visibility, 13.8% consistency, and 6.4% personality. Her CP was derived from the melo and romance image. Comparing this with the audience response, the melo romance genre dominated in dramas, showing consistency with her CP, but in films the non-melo romance genre was dominant, which did not match her CP. These results were attributed to a wide gap between dramas and movies in terms of key viewers, box office factors, and degree of genre hybridity and experimentality. Therefore, Son should actively use her CP in dramas and take on various roles in films in order to expand her persona spectrum.
https://doi.org/10.21184/jkeia.2019.6.13.4.93

Development of Elementary Maker Education Program using WeDo Robot (WeDo 로봇 활용 초등 메이커 교육 프로그램 개발)

Kweon, Soonhwan;Park, Jungho
- Proceedings of the Korean Association of Information Education Conference (한국정보교육학회:학술대회논문집), 2021.08a, pp.335-340, 2021
This study created an environment for maker education programs for robot and SW education, and developed and applied maker education programs for lower-grade elementary school students in farming and fishing villages. Based on an earlier maker education model, the OMCSI model was developed for the lower grades of elementary school, and based on it, five elementary maker education programs using WeDo were developed. The elementary maker education program using WeDo Robot 2.0 was applied to 10 second graders of 10 Elementary School in Gyeongsangnam-do from April 1, 2020 to October 30, 2020, with the following results. The average increased by 3.40 points (t=-2.378, p=0.034) and the average increased by 3.30 points (t=-2.329, p=0.040). The average also increased by 3.40 points (t=-2.458, p=0.038). Finally, it rose to 3.70 points (t=-2.449, p=0.037) for reasoning ability. That is, all four sub-elements of computational thinking had a significance probability of about 0.04, indicating statistically significant differences between pre- and post-test computational thinking scores. Therefore, the elementary maker education program using WeDo robots worked very effectively to improve students' computational thinking skills.

A Study on the Application of Task Offloading for Real-Time Object Detection in Resource-Constrained Devices (자원 제약적 기기에서 자율주행의 실시간 객체탐지를 위한 태스크 오프로딩 적용에 관한 연구)

Jang Shin Won;Yong-Geun Hong
- KIPS Transactions on Computer and Communication Systems, v.12 no.12, pp.363-370, 2023
Object detection technology that accurately recognizes the road and surrounding conditions is a key technology in autonomous driving, where it requires real-time performance as well as accurate inference. To apply object detection with both accuracy and real-time performance on resource-constrained devices rather than high-performance machines, task offloading should be utilized. In this paper, we conducted experiments comparing task offloading performance, performance by input image resolution, and performance by camera object resolution, and analyzed the results with respect to applying task offloading for real-time object detection in autonomous driving on resource-constrained devices. In these experiments, low-resolution images showed a performance improvement through the task offloading structure and met the real-time requirements of autonomous driving. High-resolution images did not meet the real-time requirements because of increased communication time, although performance improved. These experiments confirmed that object recognition in autonomous driving is affected by various conditions, such as input images and the communication environment, along with the object recognition model used.
https://doi.org/10.3745/KTCCS.2023.12.12.363
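
The trade-off described in the abstract above, where larger images raise communication time until offloading no longer meets the deadline, can be sketched with a simple latency model. The model (transfer time plus round-trip time plus server inference time) and every number in the example are illustrative assumptions, not measurements or methods from the paper.

```python
def offload_latency_ms(image_bytes, bandwidth_mbps, server_infer_ms, rtt_ms):
    """Estimated end-to-end latency when the image is sent to a remote server."""
    # bits * 1000 ms/s divided by bits per second gives transfer time in ms.
    transfer_ms = image_bytes * 8 * 1000 / (bandwidth_mbps * 1_000_000)
    return transfer_ms + rtt_ms + server_infer_ms


def should_offload(local_infer_ms, offload_ms):
    """Offload only when it is estimated to be faster than local inference."""
    return offload_ms < local_infer_ms


def meets_deadline(latency_ms, deadline_ms):
    """Check a latency estimate against a real-time budget."""
    return latency_ms <= deadline_ms


if __name__ == "__main__":
    # Hypothetical values: 100 Mbps link, 10 ms RTT, 20 ms server inference.
    low_res = offload_latency_ms(125_000, 100, 20, 10)
    high_res = offload_latency_ms(2_000_000, 100, 20, 10)
    for name, latency in (("low-res", low_res), ("high-res", high_res)):
        print(name, latency, "ms, within 100 ms budget:", meets_deadline(latency, 100))
```

Under these made-up values the low-resolution image fits the budget while the high-resolution one does not, which mirrors the qualitative pattern the abstract reports.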

AI-Based Object Recognition Research for Augmented Reality Character Implementation (증강현실 캐릭터 구현을 위한 AI기반 객체인식 연구)

Seok-Hwan Lee;Jung-Keum Lee;Hyun Sim
- The Journal of the Korea Institute of Electronic Communication Sciences, v.18 no.6, pp.1321-1330, 2023
This study addresses the problem of 3D pose estimation for multiple people from a single image, which arises in developing characters for use in augmented reality. Existing top-down methods first detect all objects in the image and then reconstruct each one independently. The problem is that inconsistent results may occur due to overlap or depth order mismatch between the reconstructed objects. The goal of this study is to solve these problems and develop a single network that provides consistent 3D reconstruction of all humans in a scene. An important design choice was to integrate a human body model based on the SMPL parametric system into a top-down framework. This made it possible to introduce two losses: a collision loss based on a distance field and a loss that considers depth order. The first loss prevents overlap between reconstructed people, and the second loss adjusts the depth ordering of people so that occlusion inference and the annotated instance segmentation are rendered consistently. This method allows depth information to be provided to the network without explicit 3D annotation of the image. Experimental results show that the proposed methodology performs better than existing methods on standard 3D pose benchmarks, and that the proposed losses enable more consistent reconstruction from natural images.
https://doi.org/10.13067/JKIECS.2023.18.6.1321

Performance Evaluation and Analysis on Single and Multi-Network Virtualization Systems with Virtio and SR-IOV (가상화 시스템에서 Virtio와 SR-IOV 적용에 대한 단일 및 다중 네트워크 성능 평가 및 분석)

Jaehak Lee;Jongbeom Lim;Heonchang Yu
- The Transactions of the Korea Information Processing Society, v.13 no.2, pp.48-59, 2024
As hardware features that natively support virtualization are developed, user applications with various workloads are operating efficiently in virtualization systems. SR-IOV is a virtualization support feature that gives direct access to PCI devices, providing high I/O performance by minimizing the need for hypervisor or operating system intervention. With SR-IOV, network I/O acceleration can be realized in virtualization systems, which have relatively long I/O paths compared to bare-metal systems and frequent context switches between the user area and the kernel area. To take advantage of SR-IOV's performance, network resource management policies that can derive optimal network performance when SR-IOV is applied to an instance such as a virtual machine (VM) or container are being actively studied. This paper evaluates and analyzes the network performance of SR-IOV, which implements I/O acceleration, in comparison with Virtio in terms of 1) network delay, 2) network throughput, 3) network fairness, 4) performance interference, and 5) multi-network. The contributions of this paper are as follows. First, the network I/O process of Virtio and SR-IOV in the virtualization system is clearly explained. Second, the evaluation results of the network performance of Virtio and SR-IOV are analyzed based on various performance metrics. Third, the system overhead and the possibility of optimization for the SR-IOV network in a virtualization system with high VM density are experimentally confirmed. The experimental results and analysis are expected to serve as a reference for network resource management policies in virtualization systems that operate network-intensive services such as smart factories, connected cars, deep learning inference models, and crowdsourcing.
https://doi.org/10.3745/TKIPS.2024.13.2.48

Fun of Animation-on the Correlation among the Perceptive fun, the Cognitive fun and the Psychological fun (애니메이션의 재미 - 감각적 재미, 인지적 재미, 심리적 재미의 상관관계)

Sung, Re-A
- Cartoon and Animation Studies, s.33, pp.99-126, 2013
This study examines how the fun of animation works by reviewing it theoretically, proposes a structure that integrates the fun of animation, and validates the proposed fun model. The theoretical review indicates that the fun of animation consists of perceptive fun, cognitive fun, and psychological fun. Perceptive fun is induced by visual, auditory, and other sensory information, and it is directly affected by image, sound, and movement. Cognitive fun is obtained when viewers mobilize their knowledge to reason about and interpret sensorily perceived stimuli, and it is directly affected by the story. Psychological fun is the emotional state that arises when watching animation relieves the audience's psychological tension. It consists of the fun of unfamiliarity or of identification. Validating the research model of how perceptive fun, cognitive fun, and psychological fun affect one another showed that perceptive fun enhances both cognitive fun and psychological fun. Cognitive fun also enhances psychological fun, about twice as much as perceptive fun does. In addition, when perceptive fun affects psychological fun, cognitive fun shows an indirect effect as a mediating variable. In conclusion, perceptive fun affects psychological fun directly, and its effect is further enhanced through cognitive fun. The fun of animation is experienced when perceptive fun, caused by instantly taking in the sensory information of the animation, cognitive fun, caused by interpreting and understanding that information, and psychological fun, caused by relief and identification through recognition, fuse and act as one. An animation that emphasizes only one element is unlikely to be loved by the audience. For this reason, a harmonious combination of story, image, sound, and movement is important for a successful animation that gives the audience fun.
https://doi.org/10.7230/KOSCAS.2013.33.099
