• Title/Summary/Keyword: Deep Learning Neural Networks

Performance Analysis of Object Detection Neural Network According to Compression Ratio of RGB and IR Images (RGB와 IR 영상의 압축률에 따른 객체 탐지 신경망 성능 분석)

  • Lee, Yegi; Kim, Shin; Lim, Hanshin; Lee, Hee Kyung; Choo, Hyon-Gon; Seo, Jeongil; Yoon, Kyoungro
    • Journal of Broadcast Engineering / v.26 no.2 / pp.155-166 / 2021
  • Most object detection algorithms are developed and evaluated on RGB images. Because RGB cameras capture visible light, however, detection performance degrades in poor lighting, e.g., at night or in fog. Infrared (IR) images, by contrast, are captured by a sensor that records heat, so high-quality images can be acquired regardless of weather and lighting. In this paper, we ran an object detection algorithm on RGB and IR images compressed at different ratios to compare detection capability. We selected RGB and IR images taken at night from the Free FLIR Thermal Dataset for ADAS (Advanced Driver Assistance Systems) research. We used a pre-trained object detection network for RGB images and a network fine-tuned on night RGB and IR images. Experimental results show that IR images yield higher object detection performance than RGB images with both networks.

Research Status of Satellite-based Evapotranspiration and Soil Moisture Estimations in South Korea (위성기반 증발산량 및 토양수분량 산정 국내 연구동향)

  • Choi, Ga-young; Cho, Younghyun
    • Korean Journal of Remote Sensing / v.38 no.6_1 / pp.1141-1180 / 2022
  • The use of satellite imagery in hydrology and water resources has increased in recent years. However, obtaining accurate evapotranspiration and soil moisture remains a challenge, and recent studies have emphasized the need for satellite-based estimates of both variables and for related development work. In this study, we present the research status in Korea by surveying current trends and methodologies for evapotranspiration and soil moisture estimation. A detailed review of the methodologies shows that evapotranspiration is generally estimated with energy balance models such as the Surface Energy Balance Algorithm for Land (SEBAL) and Mapping Evapotranspiration with Internalized Calibration (METRIC). The Penman-Monteith and Priestley-Taylor equations are also used. Soil moisture is generally estimated from passive (AMSR-E, AMSR2, MIRAS, and SMAP) and active (ASCAT and SAR) sensors. On the statistical side, deep learning is used alongside linear regression and artificial neural networks to estimate these parameters. A number of studies calculated various indices from satellite-based data and applied them to drought characterization. In some cases, the hydrological cycle components of evapotranspiration and soil moisture were calculated with a Land Surface Model (LSM). By comparing and reviewing the major methodologies in detail, we intend this survey to serve as a reference for related research and to lay a foundation for future work on calculating satellite-based hydrological cycle data.
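
The Priestley-Taylor equation named in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of any study covered by the survey: the function names are hypothetical, the coefficient 1.26 is the commonly used default, the psychrometric constant is a rough sea-level value, and the slope formula follows the FAO-56 form. Radiation terms are assumed to be in MJ m^-2 day^-1.

```python
import math


def saturation_vapor_pressure_slope(temp_c: float) -> float:
    """Slope of the saturation vapor pressure curve in kPa/degC (FAO-56 form)."""
    es = 0.6108 * math.exp(17.27 * temp_c / (temp_c + 237.3))  # kPa
    return 4098.0 * es / (temp_c + 237.3) ** 2


def priestley_taylor_et(
    net_radiation: float,      # Rn, MJ m^-2 day^-1
    soil_heat_flux: float,     # G,  MJ m^-2 day^-1
    temp_c: float,             # mean air temperature, degC
    alpha: float = 1.26,       # assumed default Priestley-Taylor coefficient
    gamma: float = 0.066,      # assumed psychrometric constant, kPa/degC
    latent_heat: float = 2.45  # latent heat of vaporization, MJ/kg
) -> float:
    """Return evapotranspiration in mm/day from available energy."""
    delta = saturation_vapor_pressure_slope(temp_c)
    # Latent heat flux: alpha * delta / (delta + gamma) * (Rn - G)
    latent_flux = alpha * delta / (delta + gamma) * (net_radiation - soil_heat_flux)
    # 1 kg of water per m^2 corresponds to a 1 mm depth
    return latent_flux / latent_heat


if __name__ == "__main__":
    print(priestley_taylor_et(net_radiation=15.0, soil_heat_flux=0.0, temp_c=20.0))
```

In a satellite workflow, net radiation and temperature would typically come from gridded products rather than scalars; the scalar form is used here only to keep the arithmetic visible.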

Artificial Intelligence for Assistance of Facial Expression Practice Using Emotion Classification (감정 분류를 이용한 표정 연습 보조 인공지능)

  • Kim, Dong-Kyu; Lee, So Hwa; Bong, Jae Hwan
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.17 no.6 / pp.1137-1144 / 2022
  • In this study, an artificial intelligence (AI) was developed to help users practice facial expressions for conveying emotions. The AI feeds multimodal inputs, consisting of sentences and facial images, to deep neural networks (DNNs). The DNNs calculate the similarity between the emotion predicted from a sentence and the emotion predicted from a facial image. The user practices facial expressions for the situation described by a sentence, and the AI returns numerical feedback based on that similarity. A ResNet34 network was trained on the public FER2013 dataset to predict emotions from facial images. To predict emotions from sentences, a KoBERT model was trained by transfer learning on the conversational speech dataset for emotion classification released publicly by AIHub. The DNN that predicts emotions from facial images reached 65% accuracy, which is comparable to human emotion classification ability. The DNN that predicts emotions from sentences reached 90% accuracy. The performance of the developed AI was evaluated in experiments in which an ordinary person participated and changed facial expressions.
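
The numerical feedback described above compares two emotion predictions. The abstract does not say which similarity measure was used, so the sketch below assumes cosine similarity between the two class-probability vectors and scales it to 0-100; the function names and the scaling are hypothetical choices for illustration only.

```python
import math
from typing import Sequence


def cosine_similarity(p: Sequence[float], q: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    if len(p) != len(q):
        raise ValueError("both vectors must cover the same emotion classes")
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm > 0 else 0.0


def feedback_score(text_probs: Sequence[float], face_probs: Sequence[float]) -> float:
    """Scale the similarity of the two predicted distributions to a 0-100 score."""
    return round(100.0 * cosine_similarity(text_probs, face_probs), 1)


if __name__ == "__main__":
    # Hypothetical softmax outputs over the same ordered set of emotion classes.
    sentence_emotions = [0.7, 0.2, 0.1]
    face_emotions = [0.6, 0.3, 0.1]
    print(feedback_score(sentence_emotions, face_emotions))
```

Both models must emit probabilities over the same ordered label set for the comparison to be meaningful; mapping between the label sets of the two training datasets is left out of this sketch.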

A Multi-speaker Speech Synthesis System Using X-vector (x-vector를 이용한 다화자 음성합성 시스템)

  • Jo, Min Su; Kwon, Chul Hong
    • The Journal of the Convergence on Culture Technology / v.7 no.4 / pp.675-681 / 2021
  • With the recent growth of the AI speaker market, demand is increasing for speech synthesis technology that enables natural conversation with users. There is therefore a need for a multi-speaker speech synthesis system that can generate voices of various tones. Synthesizing natural speech requires training on a large-capacity, high-quality speech DB. However, collecting a high-quality, large-capacity speech database uttered by many speakers is very difficult in terms of recording time and cost. It is therefore necessary to train the speech synthesis system on a speech DB of a very large number of speakers with a small amount of training data for each speaker, and a technique for naturally expressing the tone and prosody of multiple speakers is required. In this paper, we propose a technique that constructs a speaker encoder by applying the deep learning-based x-vector method used in speaker recognition, and synthesizes a new speaker's tone from a small amount of data through that speaker encoder. In the multi-speaker speech synthesis system, Tacotron2 serves as the module that synthesizes a mel-spectrogram from input text, and a WaveNet with a mixture of logistic distributions serves as the vocoder that generates the synthesized speech. The x-vector extracted from the trained speaker embedding neural network is added to Tacotron2 as an input to express the desired speaker's tone.
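
The abstract states that the x-vector is added to Tacotron2 as an input but does not say where it is injected. One common option, assumed here purely for illustration, is to append the same speaker embedding to every time step of the text encoder output before attention and decoding. The function name and the toy dimensions below are hypothetical.

```python
from typing import List, Sequence


def condition_on_speaker(
    encoder_outputs: Sequence[Sequence[float]],  # shape: [time_steps, encoder_dim]
    speaker_embedding: Sequence[float],          # shape: [embedding_dim], e.g. an x-vector
) -> List[List[float]]:
    """Append the speaker embedding to every encoder time step.

    Returns a new sequence of shape [time_steps, encoder_dim + embedding_dim].
    """
    return [list(frame) + list(speaker_embedding) for frame in encoder_outputs]


if __name__ == "__main__":
    # Toy sizes: 3 text time steps, 4-dim encoder states, 2-dim speaker embedding.
    encoder_outputs = [[0.1, 0.2, 0.3, 0.4] for _ in range(3)]
    x_vector = [0.9, -0.5]
    conditioned = condition_on_speaker(encoder_outputs, x_vector)
    print(len(conditioned), len(conditioned[0]))
```

Because the embedding is constant across time steps, the decoder sees the speaker identity at every attention step, which is what lets a single model switch voices by swapping the vector.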