

ROUTE/DASH-SRD based Point Cloud Content Region Division Transfer and Density Scalability Supporting Method (포인트 클라우드 콘텐츠의 밀도 스케일러빌리티를 지원하는 ROUTE/DASH-SRD 기반 영역 분할 전송 방법)

  • Kim, Doohwan;Park, Seonghwan;Kim, Kyuheon
    • Journal of Broadcast Engineering / v.24 no.5 / pp.849-858 / 2019
  • Recent developments in computer graphics and image processing technology have increased interest in point cloud technology, which captures real-world space and object information as three-dimensional data. In particular, point cloud technology can provide spatial information accurately and has therefore attracted great interest in the fields of autonomous vehicles and AR (Augmented Reality)/VR (Virtual Reality). However, providing users with 3D point cloud content, which requires far more data than conventional 2D images, calls for various new technologies. To address these problems, the international standardization organization MPEG (Moving Picture Experts Group) is discussing efficient compression and transmission schemes. In this paper, we present a region division transfer method for 3D point cloud content that extends the existing MPEG-DASH (Dynamic Adaptive Streaming over HTTP) SRD (Spatial Relationship Description) technology; quality parameters are additionally defined in the signaling message so that quality can be selectively determined according to the user's request. We also design a verification platform for a ROUTE (Real Time Object Delivery Over Unidirectional Transport)/DASH based heterogeneous network environment and use it to validate the proposed technology.
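The region-division idea above can be sketched as client-side logic: each spatial region advertises several representations at different point densities, and the client picks per region. This is a minimal illustration only; the field names and the "density" quality parameter are assumptions standing in for the paper's extended SRD signaling, not the actual MPEG-DASH schema.

```python
# Hypothetical tile-selection sketch over SRD-style descriptors extended
# with a "density" quality parameter (names are illustrative).

def select_representations(adaptation_sets, viewport, max_density):
    """For each spatial region, pick the densest representation within the
    budget; regions outside the viewport get the sparsest one."""
    chosen = []
    for aset in adaptation_sets:
        x, y, w, h = aset["srd"]          # region position/size in source coords
        reps = sorted(aset["reps"], key=lambda r: r["density"])
        visible = not (x + w <= viewport[0] or x >= viewport[0] + viewport[2] or
                       y + h <= viewport[1] or y >= viewport[1] + viewport[3])
        if visible:
            # highest density not exceeding the requested maximum
            ok = [r for r in reps if r["density"] <= max_density]
            chosen.append((aset["id"], (ok[-1] if ok else reps[0])["density"]))
        else:
            chosen.append((aset["id"], reps[0]["density"]))
    return chosen

tiles = [
    {"id": "t0", "srd": (0, 0, 50, 50),
     "reps": [{"density": 25}, {"density": 100}]},
    {"id": "t1", "srd": (50, 0, 50, 50),
     "reps": [{"density": 25}, {"density": 100}]},
]
print(select_representations(tiles, viewport=(0, 0, 50, 50), max_density=100))
# [('t0', 100), ('t1', 25)] -> visible tile dense, off-screen tile sparse
```

This mirrors how SRD lets a client trade quality for bandwidth per region rather than for the whole scene.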

A Problematic Bubble Detection Algorithm for Conformal Coated PCB Using Convolutional Neural Networks (합성곱 신경망을 이용한 컨포멀 코팅 PCB에 발생한 문제성 기포 검출 알고리즘)

  • Lee, Dong Hee;Cho, SungRyung;Jung, Kyeong-Hoon;Kang, Dong Wook
    • Journal of Broadcast Engineering / v.26 no.4 / pp.409-418 / 2021
  • Conformal coating is a technology that protects PCBs (Printed Circuit Boards) and minimizes PCB failures. Since defects in the coating lead to PCB failure, the coating surface is examined for air bubbles to verify that the conformal coating was applied successfully. In this paper, we propose an algorithm that detects problematic, high-risk bubbles by applying image signal processing. The algorithm consists of finding candidates for problematic bubbles and then verifying those candidates. Bubbles do not appear in visible-light images but can be visually distinguished under a UV (Ultra Violet) light source. In particular, the center of a problematic bubble is dark while its border is bright. In this paper, these brightness characteristics are called valley and mountain features, and areas where both characteristics appear at the same time are candidates for problematic bubbles. However, candidates must be verified because some may not be bubbles. In the candidate verification phase, we used convolutional neural network models, and ResNet performed best among them. The algorithm presented in this paper achieved a precision of 0.805, a recall of 0.763, and an F1-score of 0.767; these results show sufficient potential for automating bubble inspection.
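The valley/mountain candidate test described above can be sketched as a simple patch check: a dark center (valley) surrounded by a bright ring (mountain). The thresholds and patch geometry below are illustrative assumptions, not the paper's actual parameters.

```python
import numpy as np

def is_bubble_candidate(patch, valley_thr=60, mountain_thr=180):
    """patch: square grayscale patch (uint8) centered on the pixel.
    Returns True when the center is a brightness valley and the
    surrounding border is a brightness mountain."""
    h, w = patch.shape
    cy, cx = h // 2, w // 2
    center = patch[cy - 1:cy + 2, cx - 1:cx + 2].mean()    # 3x3 center mean
    ring = np.concatenate([patch[0, :], patch[-1, :],      # outer border
                           patch[1:-1, 0], patch[1:-1, -1]]).mean()
    return center < valley_thr and ring > mountain_thr

# Synthetic patch under UV-like contrast: bright ring around a dark center.
patch = np.full((7, 7), 200, dtype=np.uint8)
patch[2:5, 2:5] = 30
print(is_bubble_candidate(patch))   # True
```

In the pipeline above, patches passing this cheap test would then go to the CNN verification stage.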

Dual CNN Structured Sound Event Detection Algorithm Based on Real Life Acoustic Dataset (실생활 음향 데이터 기반 이중 CNN 구조를 특징으로 하는 음향 이벤트 인식 알고리즘)

  • Suh, Sangwon;Lim, Wootaek;Jeong, Youngho;Lee, Taejin;Kim, Hui Yong
    • Journal of Broadcast Engineering / v.23 no.6 / pp.855-865 / 2018
  • Sound event detection is a research area that models human auditory cognition by recognizing events in an environment containing multiple acoustic events and determining the onset and offset time of each event. DCASE, a research community for acoustic scene classification and sound event detection, runs challenges to encourage researcher participation and to stimulate sound event detection research. However, the dataset provided by the DCASE Challenge is relatively small compared to ImageNet, the representative dataset for visual object recognition, and few open acoustic datasets are available. In this study, sound events that can occur indoors and outdoors were collected on a larger scale and annotated to construct a dataset. Furthermore, to improve sound event detection performance, we developed a dual CNN structured sound event detection system that adds a supplementary neural network to a convolutional neural network to determine the presence of sound events. Finally, we conducted comparative experiments against the baseline systems of both DCASE 2016 and 2017.
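The dual-network combination described above can be sketched as follows: a main network produces frame-wise class scores, a supplementary network produces a frame-wise event-presence score, and the presence score gates the class scores before thresholding. The networks are stubbed out with fixed arrays here; the shapes, scores, and threshold are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

# Frame-wise class scores from a hypothetical main CNN (3 frames, 2 classes).
class_scores = np.array([[0.9, 0.1],    # frame 0: class A likely
                         [0.8, 0.2],    # frame 1
                         [0.7, 0.6]])   # frame 2
# Frame-wise event-presence score from a hypothetical supplementary network.
presence = np.array([1.0, 0.1, 0.9])    # frame 1: no event present

gated = class_scores * presence[:, None]   # suppress frames without events
events = gated > 0.5                       # binary frame-wise detections
print(events)
# frame 1 is suppressed even though its raw class score exceeds 0.5
```

Gating by a separate presence detector is one way to reduce false positives in silent or ambient-only frames.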

Low-complexity Local Illuminance Compensation for Bi-prediction mode (양방향 예측 모드를 위한 저복잡도 LIC 방법 연구)

  • Choi, Han Sol;Byeon, Joo Hyung;Bang, Gun;Sim, Dong Gyu
    • Journal of Broadcast Engineering / v.24 no.3 / pp.463-471 / 2019
  • This paper proposes a method for reducing the complexity of LIC (Local Illuminance Compensation) in bi-directional inter prediction. LIC performs local illumination compensation using the neighboring reconstructed samples of the current block and the reference block to improve the accuracy of inter prediction. Since the weight and offset required for illumination compensation are calculated from reconstructed samples on both the encoder and decoder sides, coding efficiency improves without signaling any additional information; however, because they are derived in both the encoder's prediction step and the decoding step, encoder and decoder complexity increase. This paper proposes two methods for low-complexity LIC. The first applies illumination compensation with an offset only in bi-directional prediction; the second applies LIC after the weighted-averaging step of the reference blocks obtained by bi-directional prediction. To evaluate the proposed method, BD-rate is compared against BMS-2.0.1 using classes B, C, and D of the MPEG standard test sequences under the RA (Random Access) condition. Experimental results show average BD-rate losses of 0.29%, 0.23%, and 0.04% for Y, U, and V, respectively, compared to BMS-2.0.1, while encoding/decoding time remains almost unchanged. Although some BD-rate is lost, the computational complexity of LIC is greatly reduced: multiplication operations are removed and addition operations are halved in the LIC parameter derivation process.
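The offset-only variant described above can be illustrated with a small numeric sketch: the weight is fixed to 1 and only an offset, the mean difference of the neighboring reconstructed samples, is applied, so parameter derivation needs no multiplications. The sample values are made up; a real codec would use integer shifts rather than division.

```python
# Offset-only illumination compensation sketch (illustrative values).

def offset_only_lic(pred_block, cur_neighbors, ref_neighbors):
    """Apply pred + offset, where offset is the mean difference between
    the current block's and reference block's reconstructed neighbors."""
    offset = (sum(cur_neighbors) - sum(ref_neighbors)) // len(cur_neighbors)
    return [p + offset for p in pred_block]

cur_nb = [110, 112, 114, 116]    # reconstructed neighbors of current block
ref_nb = [100, 102, 104, 106]    # neighbors of the reference block
pred = [100, 101, 102, 103]      # motion-compensated prediction
print(offset_only_lic(pred, cur_nb, ref_nb))   # [110, 111, 112, 113]
```

Compared with the full weight-and-offset least-squares derivation, this keeps only additions and one division (shift) per block, which matches the complexity saving the abstract reports.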

FBX Format Animation Generation System Combined with Joint Estimation Network using RGB Images (RGB 이미지를 이용한 관절 추정 네트워크와 결합된 FBX 형식 애니메이션 생성 시스템)

  • Lee, Yujin;Kim, Sangjoon;Park, Gooman
    • Journal of Broadcast Engineering / v.26 no.5 / pp.519-532 / 2021
  • Recently, in fields such as games, movies, and animation, content that uses motion capture to build body models and create characters for expression in 3D space is increasing. Studies are underway to generate animations with RGB-D cameras to avoid the filming costs of marker-based joint placement, but problems of pose estimation accuracy and equipment cost remain. Therefore, in this paper, we propose a system that feeds RGB images into a joint estimation network and converts the results into 3D data to create FBX-format animations, reducing the equipment cost of animation creation and increasing joint estimation accuracy. First, 2D joints are estimated from the RGB image, and the 3D coordinates of the joints are estimated from these values. The result is converted to quaternions, rotated, and an FBX-format animation is created. To measure the accuracy of the proposed method, the system was verified by comparing the error between an animation generated from the 3D positions of markers attached to the body and the animation generated by the proposed system.
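One conversion step mentioned above, turning an estimated 3D bone direction into a rotation for an FBX-style rig, can be sketched with the standard shortest-arc quaternion construction. This is a textbook formula shown under assumed rest-pose conventions, not the paper's exact pipeline.

```python
import numpy as np

def quat_between(a, b):
    """Unit quaternion (w, x, y, z) rotating unit vector a onto b
    via the shortest arc."""
    a, b = a / np.linalg.norm(a), b / np.linalg.norm(b)
    w = 1.0 + np.dot(a, b)              # 1 + cos(theta)
    xyz = np.cross(a, b)                # axis scaled by sin(theta)
    q = np.concatenate([[w], xyz])
    return q / np.linalg.norm(q)

rest = np.array([0.0, 1.0, 0.0])        # assumed bone rest pose: pointing up
est = np.array([1.0, 0.0, 0.0])         # estimated bone direction from the net
q = quat_between(rest, est)             # 90-degree rotation about -z
print(np.round(q, 4))
```

Applying such a quaternion per bone, relative to each bone's rest direction, is how estimated joint coordinates become joint rotations in a skeletal animation format.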

Correlation between Head Movement Data and Virtual Reality Content Immersion (헤드 무브먼트 데이터와 가상현실 콘텐츠 몰입도 상관관계)

  • Kim, Jungho;Yoo, Taekyung
    • Journal of Broadcast Engineering / v.26 no.5 / pp.500-507 / 2021
  • The virtual reality industry has an opportunity to take another leap forward with the surge in demand for non-face-to-face content and interest in the metaverse after COVID-19. To popularize virtual reality content along with this trend, research on high-quality content production and on storytelling suited to the characteristics of virtual reality must continue. For content that applies these characteristics to be produced effectively through user feedback, a quantitative index for evaluating the content is needed. In this study, the process of viewing virtual reality content was analyzed and head movement was chosen as a quantitative indicator. Participants then watched five animations, and the correlation between the recorded head movement information and immersion was analyzed. The analysis showed high immersion when head movement speed was relatively slow, indicating that head movement speed can serve as a meaningful index of content immersion. This result can be used as a quantitative indicator for verifying the validity of a storytelling method after a prototype is produced. Such a method can improve content quality by quickly identifying the problems of a proposed storytelling approach and suggesting a better one. This study aims to contribute to the production of high-quality virtual reality content and to its popularization, as basic research analyzing immersion through the quantitative indicator of head movement speed.
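The correlation analysis described above amounts to computing Pearson's r between mean head movement speed and an immersion rating across viewings. The numbers below are made up solely to illustrate the computation and the negative relationship the study reports; they are not the study's data.

```python
import numpy as np

# Hypothetical per-clip measurements (illustrative only).
head_speed = np.array([12.0, 18.5, 9.3, 25.1, 11.0])   # mean speed, deg/s
immersion = np.array([4.2, 3.1, 4.6, 2.4, 4.0])        # immersion rating

r = np.corrcoef(head_speed, immersion)[0, 1]           # Pearson's r
print(round(r, 3))   # strongly negative: slower head movement, higher immersion
```

A strongly negative r in such data is what would justify using head movement speed as an inverse proxy for immersion.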

Object Detection on the Road Environment Using Attention Module-based Lightweight Mask R-CNN (주의 모듈 기반 Mask R-CNN 경량화 모델을 이용한 도로 환경 내 객체 검출 방법)

  • Song, Minsoo;Kim, Wonjun;Jang, Rae-Young;Lee, Ryong;Park, Min-Woo;Lee, Sang-Hwan;Choi, Myung-seok
    • Journal of Broadcast Engineering / v.25 no.6 / pp.944-953 / 2020
  • Object detection plays a crucial role in self-driving systems. With the advances of image recognition based on deep convolutional neural networks, research on object detection has been actively explored. In this paper, we propose a lightweight model of Mask R-CNN, one of the most widely used object detection frameworks, to efficiently predict the location and shape of various objects in the road environment. Furthermore, feature maps are adaptively re-calibrated by applying an attention module to neural network layers that play different roles within Mask R-CNN, improving detection performance. Various experimental results on real driving scenes demonstrate that the proposed method maintains high detection performance with significantly reduced network parameters.
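The feature-map re-calibration described above can be sketched as a squeeze-and-excitation style channel attention module: global-average "squeeze", a small bottleneck "excitation" MLP, then channel-wise scaling. This is a generic sketch with random weights and assumed shapes, not the paper's specific module.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(fmap, w1, w2):
    """fmap: (C, H, W) feature map; w1: (C/r, C) and w2: (C, C/r)
    are the bottleneck weights of the excitation MLP."""
    squeeze = fmap.mean(axis=(1, 2))            # (C,) global average pool
    hidden = np.maximum(0.0, w1 @ squeeze)      # ReLU bottleneck
    scale = sigmoid(w2 @ hidden)                # (C,) per-channel weights
    return fmap * scale[:, None, None]          # re-calibrated feature map

rng = np.random.default_rng(0)
fmap = rng.standard_normal((8, 4, 4))           # 8 channels, 4x4 spatial
w1 = rng.standard_normal((4, 8)) * 0.1          # reduction ratio r = 2
w2 = rng.standard_normal((8, 4)) * 0.1
out = channel_attention(fmap, w1, w2)
print(out.shape)   # (8, 4, 4) -- same shape, channel-wise rescaled
```

The appeal for a lightweight model is that such a module adds only two small matrix multiplies per layer while letting the network emphasize informative channels.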

A New Calibration of 3D Point Cloud using 3D Skeleton (3D 스켈레톤을 이용한 3D 포인트 클라우드의 캘리브레이션)

  • Park, Byung-Seo;Kang, Ji-Won;Lee, Sol;Park, Jung-Tak;Choi, Jang-Hwan;Kim, Dong-Wook;Seo, Young-Ho
    • Journal of Broadcast Engineering / v.26 no.3 / pp.247-257 / 2021
  • This paper proposes a new technique for calibrating a multi-view RGB-D camera system using a 3D (three-dimensional) skeleton. Calibrating a multi-view camera requires consistent feature points, and high-accuracy calibration requires accurate feature points. We use the human skeleton as the feature points; it can be easily obtained with state-of-the-art pose estimation algorithms. We propose an RGB-D based calibration algorithm that uses the joint coordinates of the 3D skeleton obtained through pose estimation as feature points. Since the human body information captured by each camera may be incomplete, the skeleton predicted from the acquired images may also be incomplete. After efficiently integrating many incomplete skeletons into one, the multi-view cameras can be calibrated by using the integrated skeleton to obtain camera transformation matrices. To increase calibration accuracy, multiple skeletons are used for optimization through temporal iteration. We demonstrate through experiments that a multi-view camera system can be calibrated from many incomplete skeletons.
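The transformation-matrix step described above, recovering a rigid transform between two cameras from matching 3D joint positions, can be sketched with the standard Kabsch/SVD procedure. This is a textbook construction shown on synthetic joints, not the paper's exact optimization.

```python
import numpy as np

def rigid_transform(src, dst):
    """src, dst: (N, 3) corresponding joint coordinates.
    Returns R, t such that dst ~= src @ R.T + t (Kabsch algorithm)."""
    cs, cd = src.mean(0), dst.mean(0)
    H = (src - cs).T @ (dst - cd)               # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))      # guard against reflection
    R = Vt.T @ np.diag([1, 1, d]) @ U.T
    t = cd - R @ cs
    return R, t

# Synthetic check: rotate/translate a toy "skeleton" and recover the motion.
rng = np.random.default_rng(1)
joints = rng.standard_normal((15, 3))           # 15 joints of one skeleton
theta = np.pi / 6
R_true = np.array([[np.cos(theta), -np.sin(theta), 0],
                   [np.sin(theta),  np.cos(theta), 0],
                   [0, 0, 1]])
moved = joints @ R_true.T + np.array([0.5, -0.2, 1.0])
R, t = rigid_transform(joints, moved)
print(np.allclose(R, R_true), np.allclose(t, [0.5, -0.2, 1.0]))   # True True
```

With joints as feature points, each camera pair yields such an (R, t), and temporal accumulation of skeletons refines the estimate as the abstract describes.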

Video-to-Video Generated by Collage Technique (콜라주 기법으로 해석한 비디오 생성)

  • Cho, Hyeongrae;Park, Gooman
    • Journal of Broadcast Engineering / v.26 no.1 / pp.39-60 / 2021
  • In deep learning, generation research has produced many algorithms, chiefly following GAN, but generation in engineering has both similarities to and differences from creation in art. Whereas generation in the engineering sense is mainly judged by quantitative indicators or by right and wrong answers, creation in the artistic sense interprets the world and human life by cross-examining and doubting right and wrong answers from various perspectives. In this paper, the video generation ability of deep learning is interpreted from the perspective of collage and compared with results made by an artist. The experiment compares and analyzes how faithfully a GAN reproduces results the creator made with the collage technique, examines where the creative parts differ, and surveys satisfaction using performance evaluation items for the GAN's reproducibility. To test how well the creator's statement and expressive purpose were reproduced, deep learning algorithms corresponding to the statement keywords were found and their similarity was compared. As a result of the experiment, the GAN fell short of expectations in expressing the collage technique. Nevertheless, its image association scored higher satisfaction than human ability, a positive finding that GANs can show ability comparable to humans in abstract creation.

Character Detection and Recognition of Steel Materials in Construction Drawings using YOLOv4-based Small Object Detection Techniques (YOLOv4 기반의 소형 물체탐지기법을 이용한 건설도면 내 철강 자재 문자 검출 및 인식기법)

  • Sim, Ji-Woo;Woo, Hee-Jo;Kim, Yoonhwan;Kim, Eung-Tae
    • Journal of Broadcast Engineering / v.27 no.3 / pp.391-401 / 2022
  • As deep learning based object detection and recognition research has developed recently, its scope of application to industry and daily life is expanding. However, deep learning based systems for construction are still much less studied. Material quantity calculation in construction is still manual, so erroneous volume calculations occur because of the time required and the difficulty of accurate tallying. A fast and accurate automatic drawing recognition system is required to solve this problem. Therefore, we propose an AI-based automatic drawing recognition and accumulation system that detects and recognizes steel materials in construction drawings. To accurately detect steel materials in construction drawings, we propose data augmentation techniques and a spatial attention module that improve small object detection performance based on YOLOv4. Text in the detected steel material regions is then recognized, and the quantity of steel materials is accumulated based on the predicted characters. Experimental results show that the proposed method increases accuracy and precision by 1.8% and 16%, respectively, compared with the conventional YOLOv4. The proposed method achieved a precision of 0.938, a recall of 1.0, an average precision AP0.5 of 99.4%, and AP0.5:0.95 of 67%. Character recognition accuracy reached 99.9% by constructing and training on a dataset containing the fonts used in construction drawings, compared to 75.6% with the existing dataset. The average time per image was 0.013 seconds for detection, 0.65 seconds for character recognition, and 0.16 seconds for accumulation, for a total of 0.84 seconds.
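The accumulation step described above reduces to a frequency count once the detector and character recognizer emit one text label per detected steel member. The label strings below are illustrative stand-ins, not values from real drawings.

```python
from collections import Counter

# Hypothetical labels emitted by the detection + character recognition stages,
# one per detected steel member in a drawing.
recognized = ["H-300x300", "H-300x300", "C-150x75",
              "H-200x100", "C-150x75", "H-300x300"]

totals = Counter(recognized)               # accumulate per material type
for label, count in sorted(totals.items()):
    print(f"{label}: {count}")
# C-150x75: 2
# H-200x100: 1
# H-300x300: 3
```

Automating exactly this tally is what removes the manual, error-prone volume calculation the abstract motivates.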