• Title/Summary/Keyword: broadcast algorithm


Deep Learning-based SISR (Single Image Super Resolution) Method using RDB (Residual Dense Block) and Wavelet Prediction Network (RDB 및 웨이블릿 예측 네트워크 기반 단일 영상을 위한 심층 학습기반 초해상도 기법)

  • NGUYEN, HUU DUNG;Kim, Eung-Tae
    • Journal of Broadcast Engineering / v.24 no.5 / pp.703-712 / 2019
  • Single image super-resolution (SISR) aims to generate a visually pleasing high-resolution image from its degraded low-resolution measurement. In recent years, deep learning-based super-resolution methods have been actively researched and have shown reliable, high performance. A representative method is WaveletSRNet, which restores high-resolution images by learning wavelet coefficients from image feature maps. However, WaveletSRNet has two disadvantages: a long processing time due to the complexity of the algorithm, and inefficient use of feature maps when extracting the input image's features. To address these problems, we propose an efficient single image super-resolution method named RDB-WaveletSRNet. The proposed method uses residual dense blocks to extract low-resolution feature maps effectively, improving super-resolution performance, and adjusts the growth rate appropriately to reduce computational complexity. In addition, wavelet packet decomposition is used to obtain wavelet coefficients that support large scaling ratios. Experiments on various images show that the proposed method is faster and produces better image quality than conventional methods, improving PSNR by 0.1813 dB and running 1.17 times faster.
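The wavelet-domain idea above can be illustrated with a minimal one-level 2-D Haar decomposition; this is a sketch of the subband split (LL, LH, HL, HH) that wavelet-based SR networks predict coefficients for, not the paper's RDB-WaveletSRNet implementation.

```python
def haar2d(img):
    """One-level 2-D Haar transform of a 2-D list with even dimensions.

    Returns four half-size subbands (LL, LH, HL, HH). A wavelet-domain
    SR network learns to predict the high-frequency subbands
    (LH, HL, HH) that a low-resolution input has lost.
    """
    h, w = len(img), len(img[0])
    ll = [[0.0] * (w // 2) for _ in range(h // 2)]
    lh = [[0.0] * (w // 2) for _ in range(h // 2)]
    hl = [[0.0] * (w // 2) for _ in range(h // 2)]
    hh = [[0.0] * (w // 2) for _ in range(h // 2)]
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            ll[i // 2][j // 2] = (a + b + c + d) / 4.0  # average (low-low)
            lh[i // 2][j // 2] = (a - b + c - d) / 4.0  # horizontal detail
            hl[i // 2][j // 2] = (a + b - c - d) / 4.0  # vertical detail
            hh[i // 2][j // 2] = (a - b - c + d) / 4.0  # diagonal detail
    return ll, lh, hl, hh
```

Applying the transform recursively to the LL band gives the multi-level wavelet packet decomposition mentioned in the abstract.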

Robust Motorbike License Plate Detection and Recognition using Image Warping based on YOLOv2 (YOLOv2 기반의 영상워핑을 이용한 강인한 오토바이 번호판 검출 및 인식)

  • Dang, Xuan-Truong;Kim, Eung-Tae
    • Journal of Broadcast Engineering / v.24 no.5 / pp.713-725 / 2019
  • Automatic License Plate Recognition (ALPR) is a technology required for many applications, such as intelligent transportation systems and video surveillance. Most studies have addressed the detection and recognition of license plates on cars; very little work covers motorbike license plates. On a car, the license plate is mounted at the front or rear center of the vehicle, straight or only slightly sloped, and its background is mainly monochromatic, so detection and recognition are less complicated. A motorbike, however, is parked on its kickstand and leans at various angles, which makes recognizing the characters on its license plate more complicated. In this paper, we develop a two-stage YOLOv2 algorithm that first detects the motorbike region and then the license plate region within it, improving recognition accuracy on a dataset of motorbikes parked at various angles. To increase the detection rate, the size and number of anchor boxes were adjusted to the characteristics of motorbikes and license plates, and an image warping algorithm was applied to rectify tilted plates after detection. In simulations of the character recognition process, the proposed method achieved a license plate recognition rate of 80.23%, compared with 47.74% for the conventional method (YOLOv2 without image warping). The adjusted anchor boxes and the warping fitted to the motorbike license plate thus substantially improve recognition of tilted plates.
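The anchor-box adjustment mentioned above is commonly done, in YOLOv2-style detectors, by k-means clustering over the dataset's box widths and heights with a 1 − IoU distance. The sketch below illustrates that standard technique; the paper's exact adjustment procedure is not given in the abstract, so the data and settings here are purely illustrative.

```python
import random

def iou_wh(box, anchor):
    """IoU of two boxes assumed to share the same centre (width/height only)."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union

def kmeans_anchors(boxes, k, iters=50, seed=0):
    """Fit k anchor shapes to (w, h) boxes by IoU-based k-means."""
    random.seed(seed)
    anchors = random.sample(boxes, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:
            # assign each box to the anchor it overlaps most
            best = max(range(k), key=lambda i: iou_wh(b, anchors[i]))
            clusters[best].append(b)
        for i, cl in enumerate(clusters):
            if cl:  # recentre each anchor at its cluster's mean width/height
                anchors[i] = (sum(b[0] for b in cl) / len(cl),
                              sum(b[1] for b in cl) / len(cl))
    return sorted(anchors)
```

Run on a mixed set of plate-shaped and motorbike-shaped boxes, the clustering yields one anchor per shape family, which is what lets the detector match both object types.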

Performance Analysis of Object Detection Neural Network According to Compression Ratio of RGB and IR Images (RGB와 IR 영상의 압축률에 따른 객체 탐지 신경망 성능 분석)

  • Lee, Yegi;Kim, Shin;Lim, Hanshin;Lee, Hee Kyung;Choo, Hyon-Gon;Seo, Jeongil;Yoon, Kyoungro
    • Journal of Broadcast Engineering / v.26 no.2 / pp.155-166 / 2021
  • Most object detection algorithms are studied on RGB images. Because RGB cameras capture visible light, however, detection performance degrades when lighting is poor, e.g., at night or on foggy days. Infrared (IR) sensors, which form images from heat, can acquire high-quality images regardless of weather and lighting conditions. In this paper, we analyze object detection performance on RGB and IR images as a function of compression ratio. We selected RGB and IR images taken at night from the Free FLIR Thermal dataset for ADAS (Advanced Driver Assistance Systems) research, and used both a pre-trained object detection network for RGB images and a network fine-tuned on night-time RGB and IR images. Experimental results show that both networks achieve higher object detection performance on IR images than on RGB images.
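Comparing detectors across inputs and compression ratios, as above, rests on matching predicted boxes to ground truth at an IoU threshold. A minimal sketch of that per-image matching follows; the box format, greedy strategy, and 0.5 threshold are conventional choices, not details from the paper.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def match_detections(preds, gts, thr=0.5):
    """Greedy one-to-one matching.

    Returns (true positives, false positives, false negatives),
    from which precision/recall per compression ratio follow.
    """
    unmatched = list(gts)
    tp = 0
    for p in preds:
        best = max(unmatched, key=lambda g: iou(p, g), default=None)
        if best is not None and iou(p, best) >= thr:
            unmatched.remove(best)
            tp += 1
    return tp, len(preds) - tp, len(unmatched)
```

Repeating this count over images encoded at each compression ratio gives the performance curves the abstract compares between RGB and IR inputs.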

Dual CNN Structured Sound Event Detection Algorithm Based on Real Life Acoustic Dataset (실생활 음향 데이터 기반 이중 CNN 구조를 특징으로 하는 음향 이벤트 인식 알고리즘)

  • Suh, Sangwon;Lim, Wootaek;Jeong, Youngho;Lee, Taejin;Kim, Hui Yong
    • Journal of Broadcast Engineering / v.23 no.6 / pp.855-865 / 2018
  • Sound event detection models human auditory cognition by recognizing events in an environment containing multiple acoustic events and determining the onset and offset time of each. DCASE, a research community for acoustic scene classification and sound event detection, runs challenges to encourage participation and stimulate research in the field. However, the dataset provided by the DCASE Challenge is small compared to ImageNet, the representative dataset for visual object recognition, and few open acoustic datasets are available. In this study, we collected and annotated a larger-scale dataset of sound events that can occur indoors and outdoors. Furthermore, to improve sound event detection performance, we developed a dual CNN structured detection system that adds a supplementary neural network to a convolutional neural network to determine the presence of sound events. Finally, we compared the system against the baseline systems of the DCASE 2016 and 2017 challenges.
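The final step of a detection pipeline like this turns frame-wise scores into (onset, offset) segments. The sketch below gates the event classifier's frame probabilities with a boolean presence signal, standing in for the supplementary presence network described; the threshold and names are illustrative assumptions, not the paper's values.

```python
def detect_events(probs, presence, thr=0.5):
    """Return (onset_frame, offset_frame) pairs for frames where the
    event classifier fires (prob >= thr) AND the presence network
    agrees (the dual-CNN gating idea)."""
    active = [p >= thr and g for p, g in zip(probs, presence)]
    events, start = [], None
    for i, a in enumerate(active + [False]):  # sentinel closes a trailing run
        if a and start is None:
            start = i
        elif not a and start is not None:
            events.append((start, i))
            start = None
    return events
```

Multiplying the frame indices by the hop size (e.g. 20 ms) converts the segments to seconds for DCASE-style scoring.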

Pupil Data Measurement and Social Emotion Inference Technology by using Smart Glasses (스마트 글래스를 활용한 동공 데이터 수집과 사회 감성 추정 기술)

  • Lee, Dong Won;Mun, Sungchul;Park, Sangin;Kim, Hwan-jin;Whang, Mincheol
    • Journal of Broadcast Engineering / v.25 no.6 / pp.973-979 / 2020
  • This study aims to determine the social emotion of empathy objectively and quantitatively from pupillary responses. Fifty-two subjects (26 men, 26 women) voluntarily participated in the experiment. After a 30-second reference measurement, the experiment was divided into an imitation task and a spontaneous self-expression task. Pairs of subjects interacted through facial expressions while their pupil images were recorded. The pupil data were processed with binarization and a circular edge detection algorithm, and an outlier detection-and-removal technique was used to reject eye blinks. The effect of empathy on pupil size was tested for statistical significance with a normality test and an independent-samples t-test. Pupil size differed significantly between the empathy (M ± SD = 0.050 ± 1.817) and non-empathy (M ± SD = 1.659 ± 1.514) conditions (t(92) = -4.629, p < .001). A rule relating empathy to pupil size was defined through discriminant analysis and verified on 12 new subjects (6 men, 6 women; mean age ± SD = 22.84 ± 1.57 years) with an estimation accuracy of 75%. The proposed method is a non-contact camera technique and is expected to be useful in various virtual reality applications with smart glasses.
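Blink rejection on a pupil-size series, as described above, can be sketched with a simple median/MAD outlier rule: a blink collapses the measured pupil area towards zero, far from the median. This is one common outlier technique, assumed for illustration; the abstract does not specify the paper's rule or threshold.

```python
import statistics

def reject_blinks(sizes, k=3.0):
    """Replace samples more than k median-absolute-deviations (MADs)
    from the median with None, marking them as blink artefacts."""
    med = statistics.median(sizes)
    mad = statistics.median(abs(s - med) for s in sizes) or 1e-9
    return [s if abs(s - med) <= k * mad else None for s in sizes]
```

The cleaned series (with blink frames dropped or interpolated) is what the per-condition means and the t-test would then be computed on.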

Real-Time Joint Animation Production and Expression System using Deep Learning Model and Kinect Camera (딥러닝 모델과 Kinect 카메라를 이용한 실시간 관절 애니메이션 제작 및 표출 시스템 구축에 관한 연구)

  • Kim, Sang-Joon;Lee, Yu-Jin;Park, Goo-man
    • Journal of Broadcast Engineering / v.26 no.3 / pp.269-282 / 2021
  • As the distribution of 3D content such as augmented and virtual reality increases, real-time computer animation technology is becoming more important. However, the computer animation process still relies mostly on manual work or marker-based motion capture, which requires experienced professionals and a very long time to obtain realistic motion. To address these problems, animation production systems and algorithms based on deep learning models and sensors have recently emerged. In this paper, we therefore study four methods of implementing natural human movement in a deep learning model and Kinect camera-based animation production system, each chosen for its environmental characteristics and accuracy. The first method uses only a Kinect camera; the second adds a calibration algorithm to the Kinect; the third uses only a deep learning model; and the fourth combines the deep learning model with the Kinect. Experiments showed that the fourth method, using the deep learning model and the Kinect together, gave the best results.

Video-to-Video Generated by Collage Technique (콜라주 기법으로 해석한 비디오 생성)

  • Cho, Hyeongrae;Park, Gooman
    • Journal of Broadcast Engineering / v.26 no.1 / pp.39-60 / 2021
  • In the field of deep learning, most generation research has followed GAN, but generation in engineering and creation in art show both similarities and differences. Generation in the engineering sense is judged mainly by quantitative indicators or by correct versus incorrect answers, whereas artistic creation interprets the world and human life by cross-examining and doubting the correct and incorrect answers from various perspectives. In this paper, the video generation ability of deep learning is interpreted from the perspective of collage and compared with results made by an artist. The experiment compares and analyzes how closely a GAN reproduces a creator's collage work and where the two differ creatively, and surveys satisfaction using performance evaluation items for the GAN's reproducibility. To test how well the creator's statement and expressive purpose were reproduced, a deep learning algorithm corresponding to each statement keyword was found and its similarity compared. As a result, the GAN fell well short of expectations in expressing the collage technique. Nevertheless, its image association scored higher satisfaction than human ability, a positive indication that GANs can show ability comparable to humans in abstract creation.

Analysis of Transfer Learning Effect for Automatic Dog Breed Classification (반려견 자동 품종 분류를 위한 전이학습 효과 분석)

  • Lee, Dongsu;Park, Gooman
    • Journal of Broadcast Engineering / v.27 no.1 / pp.133-145 / 2022
  • Compared with the continuously growing dog population and pet industry in Korea, systematic analysis of related data and research on breed classification methods are very limited. In this paper, an automatic breed classification method using deep learning is proposed for 14 major dog breeds raised domestically. Dog images were collected and a dataset built for training, and a breed classification algorithm was created by transfer learning with VGG-16 and ResNet-34 as backbone networks. To examine the transfer learning effect of the two models on dog images, we compared using the pre-trained weights as-is with updating them. With fine tuning on the VGG-16 backbone, the final model achieved Top-1 accuracy of about 89% and Top-3 accuracy of about 94%. The proposed classification method and dataset construction could serve various applications, such as classifying abandoned or lost dogs in animal protection centers or supporting the pet-feed industry.
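The Top-1/Top-3 accuracies reported above are computed from per-sample class scores. A minimal sketch of that metric, with made-up scores for illustration:

```python
def topk_accuracy(scores, labels, k=1):
    """Fraction of samples whose true class index appears among the
    k highest-scoring classes.

    scores: per-sample lists of class scores (one score per breed);
    labels: true class indices.
    """
    hits = 0
    for row, y in zip(scores, labels):
        topk = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += y in topk
    return hits / len(labels)
```

Top-3 is always at least Top-1, which matches the abstract's 89% vs. 94% ordering.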

Super High-Resolution Image Style Transfer (초-고해상도 영상 스타일 전이)

  • Kim, Yong-Goo
    • Journal of Broadcast Engineering / v.27 no.1 / pp.104-123 / 2022
  • Style transfer based on neural networks reflects the high-level structural characteristics of images and provides very high-quality results, and has therefore attracted great attention recently. This paper addresses the resolution limit that GPU memory imposes on such neural style transfer. Because the network's receptive field is fixed in size, we can expect the style-transfer gradient computed on a partial image to match the gradient computed on the entire image. Based on this idea, each component of the style transfer loss function is analyzed to derive the conditions required for partitioning and padding, and to identify which of the information needed for the gradient computation depends on the entire input. By structuring that information as an auxiliary constant input to partition-based gradient calculation, this paper develops a recursive algorithm for super high-resolution image style transfer. Since the proposed method partitions the input into pieces a GPU can handle, it performs style transfer without the input resolution being limited by GPU memory, and can thus render the detailed stylistic characteristics that only become visible at super high resolution.
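The core premise above, that an operation with a fixed receptive field computed on overlapping padded tiles matches the whole-image result exactly, can be demonstrated in miniature. Here a 1-D 3-tap box filter stands in for the style-transfer gradient; the tile size and halo width are illustrative, not values from the paper.

```python
def box3(x):
    """3-tap box filter with zero padding at the borders (receptive field = 3)."""
    pad = [0.0] + list(x) + [0.0]
    return [(pad[i] + pad[i + 1] + pad[i + 2]) / 3.0 for i in range(len(x))]

def box3_tiled(x, tile=4):
    """Filter each tile with a one-sample halo of real neighbours,
    keep only the valid centre, and stitch the pieces together.

    Because the halo covers the receptive field, the output is
    identical to box3(x) while peak working-set size stays bounded --
    the same argument the paper makes for partitioned style transfer.
    """
    out, n = [], len(x)
    for s in range(0, n, tile):
        e = min(n, s + tile)
        lo, hi = max(0, s - 1), min(n, e + 1)      # pad tile with its halo
        filt = box3(x[lo:hi])                      # filter the padded tile
        out.extend(filt[s - lo:s - lo + (e - s)])  # keep the valid centre
    return out
```

For receptive field r, a halo of (r − 1)/2 samples per side suffices; the global-dependency terms of the style loss are what the paper additionally carries along as an auxiliary constant input.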

Deep learning-based Multi-view Depth Estimation Methodology of Contents' Characteristics (다 시점 영상 콘텐츠 특성에 따른 딥러닝 기반 깊이 추정 방법론)

  • Son, Hosung;Shin, Minjung;Kim, Joonsoo;Yun, Kug-jin;Cheong, Won-sik;Lee, Hyun-woo;Kang, Suk-ju
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2022.06a / pp.4-7 / 2022
  • Recently, multi-view depth estimation methods using deep learning networks for 3D scene reconstruction have attracted much attention. Multi-view video contents vary widely in camera composition, environment, and setting; understanding these characteristics and applying a suitable depth estimation method is important for high-quality 3D reconstruction. The camera setting determines the baseline, the physical distance between camera viewpoints. Our proposed methodology focuses on choosing an appropriate depth estimation method according to the characteristics of the multi-view content. Empirical results revealed limitations when existing multi-view depth estimation methods were applied to divergent or large-baseline datasets. We therefore verified the need to choose a proper number of source views and to apply a source-view selection algorithm suited to each dataset's capture environment. In conclusion, when implementing a deep learning-based depth estimation network for 3D scene reconstruction, these results can serve as a guideline for finding adaptive depth estimation methods.
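One simple form of the source-view selection discussed above is picking the k cameras with the smallest baseline to the reference view. The sketch below assumes known camera positions and a nearest-by-distance criterion; the abstract does not specify the authors' selection algorithm, so this is illustrative only.

```python
import math

def select_source_views(ref_pos, cam_positions, k=2):
    """Return indices of the k cameras nearest to ref_pos by baseline
    (Euclidean inter-camera distance), excluding the reference itself."""
    dists = [(math.dist(ref_pos, p), i)
             for i, p in enumerate(cam_positions) if p != ref_pos]
    return [i for _, i in sorted(dists)[:k]]
```

For divergent or large-baseline rigs, k and the criterion would need adapting per dataset, which is the paper's point about capture-environment-specific selection.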
