• Title/Summary/Keyword: encoder/decoder

Adaptive In-loop Filter Method for High-efficiency Video Coding (고효율 비디오 부호화를 위한 적응적 인-루프 필터 방법)

  • Jung, Kwang-Su;Nam, Jung-Hak;Lim, Woong;Jo, Hyun-Ho;Sim, Dong-Gyu;Choi, Byeong-Doo;Cho, Dae-Sung
    • Journal of Broadcast Engineering / v.16 no.1 / pp.1-13 / 2011
  • In this paper, we propose an adaptive in-loop filter to improve coding efficiency. Recent approaches in video coding standards include the post-filter hint SEI and block-based adaptive filter control (BAFC), both based on the Wiener filter, which minimizes the mean square error between the input image and the decoded image. However, since the post-filter hint SEI is applied only to the output image, it cannot reduce the prediction errors of subsequent frames. Because BAFC is run in addition to, and independently of, the deblocking filter, it has high computational complexity on both the encoder and decoder sides. We therefore propose a low-complexity adaptive in-loop filter (LCALF) that lowers computational complexity by adaptively using the H.264/AVC deblocking filter while also outperforming the conventional method. In the experiments, the computational complexity of the proposed method is about 22% lower than that of the conventional method, and its coding efficiency is about 1% better than that of BAFC.

Design and Implementation of a Reusable and Extensible HL7 Encoding/Decoding Framework (재사용성과 확장성 있는 HL7 인코딩/디코딩 프레임워크의 설계 및 구현)

  • Kim, Jung-Sun;Park, Seung-Hun;Nah, Yun-Mook
    • Journal of KIISE:Computing Practices and Letters / v.8 no.1 / pp.96-106 / 2002
  • In this paper, we propose a flexible, reusable, and extensible HL7 encoding and decoding framework using a Message Object Model (MOM) and a Message Definition Repository (MDR). The MOM provides an abstract HL7 message form represented by a group of objects and their relationships. It reflects the logical relationships among standard HL7 message elements such as segments, fields, and components, while enforcing the key structural constraints imposed by the standard. Since the MOM completely eliminates the dependency of the HL7 encoder and decoder on platform-specific data formats, the encoder and decoder can be built as reusable standalone software components, enabling the interconnection of arbitrary heterogeneous hospital information systems (HISs) with little effort. Moreover, the MDR, an external database of key definitions for HL7 messages, helps make the encoder and decoder as resilient as possible to future modifications of the standard HL7 message formats. The encoder and decoder also use it to perform a well-formedness check on their respective inputs (i.e., HL7 message objects expressed in the MOM and encoded HL7 message strings). Although we implemented a prototype version of the encoder and decoder in Java, they can easily be packaged and delivered as standalone components using standard component frameworks such as ActiveX, JavaBeans, or CORBA components.
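
The object-model idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' framework: it assumes the default HL7 v2 delimiters (carriage return, `|`, `^`), ignores the MSH encoding characters, repetitions, and escape sequences, and uses a hypothetical `KNOWN_SEGMENTS` set as a stand-in for the MDR.

```python
from dataclasses import dataclass, field

# Default HL7 v2 delimiters: carriage return between segments,
# "|" between fields, "^" between components.
SEGMENT_SEP, FIELD_SEP, COMPONENT_SEP = "\r", "|", "^"

# Hypothetical stand-in for the MDR: segment names treated as valid here.
KNOWN_SEGMENTS = {"PID", "PV1"}


@dataclass
class Segment:
    """One segment: a name plus fields, each field a list of components."""
    name: str
    fields: list = field(default_factory=list)


def decode(message):
    """Parse an encoded message string into a list of Segment objects."""
    segments = []
    for line in message.split(SEGMENT_SEP):
        if not line:
            continue  # skip empty lines, e.g. after a trailing separator
        name, *raw_fields = line.split(FIELD_SEP)
        segments.append(Segment(name, [f.split(COMPONENT_SEP) for f in raw_fields]))
    return segments


def encode(segments):
    """Serialize Segment objects back into an encoded message string."""
    lines = []
    for seg in segments:
        fields = [COMPONENT_SEP.join(components) for components in seg.fields]
        lines.append(FIELD_SEP.join([seg.name, *fields]))
    return SEGMENT_SEP.join(lines)


def is_well_formed(segments):
    """Toy check: at least one segment, and every name is a known one."""
    return bool(segments) and all(s.name in KNOWN_SEGMENTS for s in segments)


message = "PID|1||12345||DOE^JOHN\rPV1|1|I"
model = decode(message)
print(model[0].fields[4])        # ['DOE', 'JOHN']
print(encode(model) == message)  # True
```

Because the application only touches `Segment` objects, the string format stays confined to `decode` and `encode`, which is the kind of decoupling the abstract attributes to the MOM.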

Pixel-level Crack Detection in X-ray Computed Tomography Image of Granite using Deep Learning (딥러닝을 이용한 화강암 X-ray CT 영상에서의 균열 검출에 관한 연구)

  • Hyun, Seokhwan;Lee, Jun Sung;Jeon, Seonghwan;Kim, Yejin;Kim, Kwang Yeom;Yun, Tae Sup
    • Tunnel and Underground Space / v.29 no.3 / pp.184-196 / 2019
  • This study aims to extract a 3D image of micro-cracks generated by hydraulic fracturing tests, using deep learning and X-ray computed tomography images. Pixel-level cracks are difficult to detect with conventional image processing methods such as global thresholding, Canny edge detection, and region growing. Thus, a convolutional neural network-based encoder-decoder network is adapted to extract and quantitatively analyze the micro-cracks. The amount of training data is increased by dividing, rotating, and flipping images, and the optimum combination of image augmentation methods is verified. Applying the optimal image augmentation improves performance on not only the validation dataset but also the test dataset. In addition, the influence of the original amount of training data on the performance of the deep learning-based neural network is confirmed, leading to successful pixel-level crack detection.
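
The dividing, rotating, and flipping steps mentioned above can be sketched in plain Python as follows. This is only an illustration, not the authors' pipeline: the tile size, the use of all eight rotation/flip variants, and the function names are assumptions.

```python
def split_tiles(image, tile):
    """Divide a 2D image (list of rows) into non-overlapping tile x tile patches."""
    height, width = len(image), len(image[0])
    return [
        [row[c:c + tile] for row in image[r:r + tile]]
        for r in range(0, height - tile + 1, tile)
        for c in range(0, width - tile + 1, tile)
    ]


def rotate90(patch):
    """Rotate a patch 90 degrees clockwise."""
    return [list(row) for row in zip(*patch[::-1])]


def flip_horizontal(patch):
    """Mirror a patch left to right."""
    return [row[::-1] for row in patch]


def augment(patch):
    """Return the 8 variants of a patch: 4 rotations, each with and without a flip."""
    variants = []
    current = patch
    for _ in range(4):
        variants.append(current)
        variants.append(flip_horizontal(current))
        current = rotate90(current)
    return variants


image = [[r * 4 + c for c in range(4)] for r in range(4)]  # toy 4x4 "CT slice"
patches = split_tiles(image, 2)
augmented = [v for p in patches for v in augment(p)]
print(len(patches), len(augmented))  # 4 32
```

For segmentation training, the same transform would have to be applied to each image and its crack mask together so that the pixel labels stay aligned.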

Design and Implementation of Hybrid Network Associated 3D Video Broadcasting System (이종망 연동형 3D 비디오 방송시스템 설계 및 구현)

  • Yun, Kugjin;Cheong, Won-Sik;Lee, Jinyoung;Kim, Kyuheon
    • Journal of Broadcast Engineering / v.19 no.5 / pp.687-698 / 2014
  • ATSC is currently standardizing a hybrid 3DTV broadcasting service for heterogeneous network environments, following completion of the service-compatible 3DTV broadcasting service standard based on the broadcasting channel. This paper proposes a converged 3D video broadcasting method over broadcasting and IP networks that guarantees Full-HD 3D quality without degrading the image quality of legacy DTV. Specifically, this paper describes transmission of the additional 3D video using ISO/IEC 23009-1 DASH, a robust synchronization method for heterogeneous network environments, and a system target decoder model for a hybrid 3DTV receiver. Based on experimental results, we confirm that the proposed technologies can serve as core technology for hybrid 3DTV standardization and as a reference model for developing hybrid 3DTV encoders and receivers.

ViStoryNet: Neural Networks with Successive Event Order Embedding and BiLSTMs for Video Story Regeneration (ViStoryNet: 비디오 스토리 재현을 위한 연속 이벤트 임베딩 및 BiLSTM 기반 신경망)

  • Heo, Min-Oh;Kim, Kyung-Min;Zhang, Byoung-Tak
    • KIISE Transactions on Computing Practices / v.24 no.3 / pp.138-144 / 2018
  • A video is a vivid medium similar to human visual-linguistic experience, since it can convey a sequence of situations, actions, or dialogues that can be told as a story. In this study, we propose story learning/regeneration frameworks from videos with successive event order supervision for contextual coherence. The supervision induces each episode to form a trajectory in the latent space, which constructs a composite representation of ordering and semantics. We used kids' videos as training data. Their advantages include an omnibus style, short storylines that are simple and explicit, chronological narrative order, and a relatively limited number of characters and spatial environments. We build an encoder-decoder structure with successive event order embedding (SEOE), and train bi-directional LSTMs as sequence models for multi-step sequence prediction. Using approximately 200 episodes of the kids' video series 'Pororo the Little Penguin', we give empirical results for story regeneration tasks and SEOE. In addition, each episode shows a trajectory-like shape in the latent space of the model, which provides geometric information for the sequence models.

MSaGAN: Improved SaGAN using Guide Mask and Multitask Learning Approach for Facial Attribute Editing

  • Yang, Hyeon Seok;Han, Jeong Hoon;Moon, Young Shik
    • Journal of the Korea Society of Computer and Information / v.25 no.5 / pp.37-46 / 2020
  • Recently, studies of facial attribute editing have obtained realistic results using generative adversarial networks (GANs) and encoder-decoder structures. Spatial attention GAN (SaGAN), one of the latest approaches, changes only the desired attribute in a face image using a spatial attention mechanism. However, unnatural results are sometimes obtained due to insufficient information on face areas. In this paper, we propose an improved SaGAN (MSaGAN) that uses a guide mask for learning and applies a multitask learning approach to overcome the limitations of existing methods. Through extensive experiments, we evaluated the facial attribute editing results in terms of the mask loss function and the neural network structure. The proposed method is shown to efficiently produce more natural results than previous methods.

VLSI Architecture of High Performance Huffman Codec (고성능 허프만 코덱의 VLSI 구조)

  • Choi, Hyun-Jun;Seo, Young-Ho;Kim, Dong-Wook
    • Journal of the Korea Institute of Information and Communication Engineering / v.15 no.2 / pp.439-446 / 2011
  • In this paper, we propose and implement dedicated hardware for Huffman coding, an entropy coding method used to compress multimedia data in video coding. The proposed Huffman codec consists of a Huffman encoder and decoder. The Huffman encoder converts symbols to Huffman codes using a look-up table. The variable-length Huffman codes are packed into a 32-bit data format in the data packing block and then output sequentially in units of a frame. The Huffman decoder converts the serial bitstream back to the original symbols without buffering, using a tree-structured FSM (finite state machine). Because the encoding and decoding hardware is programmable, it can handle various Huffman codes. The hardware was implemented on an Altera Cyclone III FPGA, and it uses 3725 LUTs at an operating frequency of 365MHz.
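
A software analogue of the data path described above (table look-up, packing into 32-bit words, and bit-serial decoding by walking a code tree) might look like the following Python sketch. The code table is a hypothetical prefix code chosen for illustration, and nothing here models the actual hardware design.

```python
# Hypothetical prefix-free code table (not taken from the paper).
CODE_TABLE = {"a": "0", "b": "10", "c": "110", "d": "111"}


def encode(symbols):
    """Look up each symbol's code and pack the bits into 32-bit words."""
    bits = "".join(CODE_TABLE[s] for s in symbols)
    n_bits = len(bits)
    bits += "0" * (-n_bits % 32)  # zero-pad the last word
    words = [int(bits[i:i + 32], 2) for i in range(0, len(bits), 32)]
    return words, n_bits


def build_tree(table):
    """Build a binary tree of nested dicts; leaves hold the decoded symbols."""
    root = {}
    for symbol, code in table.items():
        node = root
        for bit in code[:-1]:
            node = node.setdefault(bit, {})
        node[code[-1]] = symbol
    return root


def decode(words, n_bits):
    """Consume the bitstream one bit at a time, like a tree-shaped state machine."""
    root = build_tree(CODE_TABLE)
    bits = "".join(format(w, "032b") for w in words)[:n_bits]
    symbols, state = [], root
    for bit in bits:
        state = state[bit]               # follow one edge per input bit
        if not isinstance(state, dict):  # reached a leaf: emit and reset
            symbols.append(state)
            state = root
    return symbols


words, n_bits = encode(list("abcd"))
print(words, n_bits)               # [1535115264] 9
print(decode(words, n_bits))       # ['a', 'b', 'c', 'd']
```

Each tree node plays the role of an FSM state, so the decoder needs only the current state and the next bit, with no look-ahead buffer.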

3D Human Shape Deformation using Deep Learning (딥러닝을 이용한 3차원 사람모델형상 변형)

  • Kim, DaeHee;Hwang, Bon-Woo;Lee, SeungWook;Kwak, Sooyeong
    • Journal of Korea Society of Industrial Information Systems / v.25 no.2 / pp.19-27 / 2020
  • Recently, rapid and accurate creation of 3D models has been required in various applications using virtual reality and augmented reality technology. In this paper, we propose an on-site learning based shape deformation method which transforms a clothed 3D human model into the shape of an input point cloud. The proposed algorithm consists of two main parts: pre-learning and on-site learning. Each learning stage consists of an encoder, a template transformation, and a decoder network. The proposed network is trained in an unsupervised manner, using the Chamfer distance between the input point cloud and the template vertices as the loss function. By performing on-site learning on the input point clouds during inference, highly accurate inference results can be obtained, as demonstrated through experiments.
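
For reference, one common form of the Chamfer distance is sketched below in plain Python. The abstract does not say which variant the authors use, so the symmetric, squared-distance, mean-reduced version here is an assumption, and a real training loop would use a differentiable tensor implementation instead.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer distance between two non-empty point sets.

    For every point, take the squared distance to its nearest neighbour in
    the other set, average per set, and add the two directional terms.
    """
    a_to_b = sum(
        min(squared_distance(p, q) for q in points_b) for p in points_a
    ) / len(points_a)
    b_to_a = sum(
        min(squared_distance(q, p) for p in points_a) for q in points_b
    ) / len(points_b)
    return a_to_b + b_to_a


cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]  # toy input point cloud
template = [(0.0, 0.0, 0.0)]                # toy template vertices
print(chamfer_distance(cloud, cloud))       # 0.0
print(chamfer_distance(cloud, template))    # 0.5
```

The loss needs no point-to-point correspondence between the two sets, which is why it suits unsupervised fitting of a template to a raw point cloud.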

2D and 3D Hand Pose Estimation Based on Skip Connection Form (스킵 연결 형태 기반의 손 관절 2D 및 3D 검출 기법)

  • Ku, Jong-Hoe;Kim, Mi-Kyung;Cha, Eui-Young
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.12 / pp.1574-1580 / 2020
  • Traditional pose estimation methods use either special devices or image processing of camera images. The disadvantage of using a device is that it is costly and the environment in which it can be used is limited. Using cameras and image processing reduces environmental constraints and costs, but performance is lower. CNNs (Convolutional Neural Networks) have been studied for pose estimation using only a camera, without these disadvantages, and various techniques have been proposed to increase recognition performance. In this paper, we examine the effect of skip connections on the network by applying various skip connections to hand joint recognition. Experiments confirm that adding skip connections beyond the basic skip connections improves performance, and that the network with downward skip connections performs best.

Assembly Performance Evaluation for Prefabricated Steel Structures Using k-nearest Neighbor and Vision Sensor (k-근접 이웃 및 비전센서를 활용한 프리팹 강구조물 조립 성능 평가 기술)

  • Bang, Hyuntae;Yu, Byeongjun;Jeon, Haemin
    • Journal of the Computational Structural Engineering Institute of Korea / v.35 no.5 / pp.259-266 / 2022
  • In this study, we developed a deep learning and vision sensor-based assembly performance evaluation method for prefabricated steel structures. The assembly parts were segmented using a modified version of the receptive field block convolution module inspired by the eccentric function of the human visual system. The quality of the assembly was evaluated by detecting the bolt holes in the segmented assembly part and calculating the bolt hole positions. To validate the performance of the evaluation, models of standard and defective assembly parts were produced using a 3D printer. The assembly part segmentation network was trained on images of the 3D models captured by a vision sensor. The bolt hole positions in the segmented assembly image were calculated using image processing techniques, and the assembly performance evaluation using the k-nearest neighbor algorithm was verified. The experimental results show that the assembly parts were segmented with high precision, and the assembly performance based on the positions of the bolt holes in the detected assembly part was evaluated with a classification error of less than 5%.
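
The final classification step can be illustrated with a minimal k-nearest neighbor sketch in Python. The feature (a bolt hole's offset from its nominal position, in millimetres), the sample values, the labels, and `k=3` are all hypothetical; the abstract does not state what features or parameters the authors used.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Label a query point by majority vote among its k nearest training points."""
    nearest = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training data: (dx, dy) offset of a bolt hole from its
# nominal position, labelled by assembly quality.
offsets = [
    (0.0, 0.1), (0.1, -0.1), (-0.1, 0.0),  # standard parts
    (2.0, 2.1), (1.8, 2.2), (2.2, 1.9),    # defective parts
]
labels = ["standard"] * 3 + ["defective"] * 3

print(knn_predict(offsets, labels, (0.1, 0.0)))  # standard
print(knn_predict(offsets, labels, (2.0, 1.9)))  # defective
```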