• Title/Summary/Keyword: 인코더/디코더

Search Result 89, Processing Time 0.03 seconds

Assessing Techniques for Advancing Land Cover Classification Accuracy through CNN and Transformer Model Integration (CNN 모델과 Transformer 조합을 통한 토지피복 분류 정확도 개선방안 검토)

  • Woo-Dam SIM;Jung-Soo LEE
    • Journal of the Korean Association of Geographic Information Studies
    • /
    • v.27 no.1
    • /
    • pp.115-127
    • /
    • 2024
  • This research aimed to construct models with various structures based on the Transformer module and to perform land cover classification, thereby examining the applicability of the Transformer module. For the classification of land cover, the Unet model, which has a CNN structure, was selected as the base model, and a total of four deep learning models were constructed by combining both the encoder and decoder parts with the Transformer module. During the training process of the deep learning models, the training was repeated 10 times under the same conditions to evaluate the generalization performance. The evaluation of the classification accuracy of the deep learning models showed that the Model D, which utilized the Transformer module in both the encoder and decoder structures, achieved the highest overall accuracy with an average of approximately 89.4% and a Kappa coefficient average of about 73.2%. In terms of training time, models based on CNN were the most efficient. however, the use of Transformer-based models resulted in an average improvement of 0.5% in classification accuracy based on the Kappa coefficient. It is considered necessary to refine the model by considering various variables such as adjusting hyperparameters and image patch sizes during the integration process with CNN models. A common issue identified in all models during the land cover classification process was the difficulty in detecting small-scale objects. To improve this misclassification phenomenon, it is deemed necessary to explore the use of high-resolution input data and integrate multidimensional data that includes terrain and texture information.

A study on skip-connection with time-frequency self-attention for improving speech enhancement based on complex-valued spectrum (복소 스펙트럼 기반 음성 향상의 성능 향상을 위한 time-frequency self-attention 기반 skip-connection 기법 연구)

  • Jaehee Jung;Wooil Kim
    • The Journal of the Acoustical Society of Korea
    • /
    • v.42 no.2
    • /
    • pp.94-101
    • /
    • 2023
  • A deep neural network composed of encoders and decoders, such as U-Net, used for speech enhancement, concatenates the encoder to the decoder through skip-connection. Skip-connection helps reconstruct the enhanced spectrum and complement the lost information. The features of the encoder and the decoder connected by the skip-connection are incompatible with each other. In this paper, for complex-valued spectrum based speech enhancement, Self-Attention (SA) method is applied to skip-connection to transform the feature of encoder to be compatible with the features of decoder. SA is a technique in which when generating an output sequence in a sequence-to-sequence tasks the weighted average of input is used to put attention on subsets of input, showing that noise can be effectively eliminated by being applied in speech enhancement. The three models using encoder and decoder features to apply SA to skip-connection are studied. As experimental results using TIMIT database, the proposed methods show improvements in all evaluation metrics compared to the Deep Complex U-Net (DCUNET) with skip-connection only.

Design of an Efficient LDPC Codec for Hardware Implementation (하드웨어 구현에 적합한 효율적인 LDPC 코덱의 설계)

  • Lee Chan-Ho;Park Jae-Geun
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.43 no.7 s.349
    • /
    • pp.50-57
    • /
    • 2006
  • Low-density parity-check (LDPC) codes are recently emerged due to its excellent performance. However, the parity check (H) matrices of the previous works are not adequate for hardware implementation of encoders or decoders. This paper proposes a hybrid parity check matrix which is efficient in hardware implementation of both decoders and encoders. The hybrid H-matrices are constructed so that both the semi-random technique and the partly parallel structure can be applied to design encoders and decoders. Using the proposed methods, the implementation of encoders can become practical while keeping the hardware complexity of the partly parallel decoder structures. An encoder and a decoder are designed using Verilog-HDL and compared with the previous results.

A Study on the Design of Content Addressable and Reentrant Memory(CARM) (Content Addressable and Reentrant Memory (CARM)의 설계에 관한 연구)

  • 이준수;백인천;박상봉;박노경;차균현
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.16 no.1
    • /
    • pp.46-56
    • /
    • 1991
  • In this paper, 16word X 8bit Content Addressable and Reentrant Memory(CARM) is described. This device has 4 operation modes(read, write, match, reentrant). The read and write operation of CARM is like that of static RAM, CARM has the reentrant mode operation where the on chip garbage collection is accomplished conditionally. Thus function can be used for high speed matching unit of dynamic data flow computer. And CARM also can encode matching address sequentially according to therir priority. CARM consists of 8 blocks(CAM cell, Sequential Address Encoder(S.A.E). Reentrant operation. Read/Write control circuit, Data/Mask Register, Sense Amplifier, Encoder. Decoder). Designed DARM can be used in data flow computer, pattern, inspection, table look-up, image processing. The simulation is performed using the QUICKSIM logic simulator and Pspice circuit simulator. Having hierarchical structure, the layout was done using the 3{\;}\mu\textrm{m} n well CMOS technology of the ETRI design rule.

  • PDF

Atrous Residual U-Net for Semantic Segmentation in Street Scenes based on Deep Learning (딥러닝 기반 거리 영상의 Semantic Segmentation을 위한 Atrous Residual U-Net)

  • Shin, SeokYong;Lee, SangHun;Han, HyunHo
    • Journal of Convergence for Information Technology
    • /
    • v.11 no.10
    • /
    • pp.45-52
    • /
    • 2021
  • In this paper, we proposed an Atrous Residual U-Net (AR-UNet) to improve the segmentation accuracy of semantic segmentation method based on U-Net. The U-Net is mainly used in fields such as medical image analysis, autonomous vehicles, and remote sensing images. The conventional U-Net lacks extracted features due to the small number of convolution layers in the encoder part. The extracted features are essential for classifying object categories, and if they are insufficient, it causes a problem of lowering the segmentation accuracy. Therefore, to improve this problem, we proposed the AR-UNet using residual learning and ASPP in the encoder. Residual learning improves feature extraction ability and is effective in preventing feature loss and vanishing gradient problems caused by continuous convolutions. In addition, ASPP enables additional feature extraction without reducing the resolution of the feature map. Experiments verified the effectiveness of the AR-UNet with Cityscapes dataset. The experimental results showed that the AR-UNet showed improved segmentation results compared to the conventional U-Net. In this way, AR-UNet can contribute to the advancement of many applications where accuracy is important.

Analysis of Power Saving Factor for a DVS Based Multimedia Processor (DVS 기반 멀티미디어 프로세서의 전력절감율 분석)

  • Kim Byoung-Il;Chang Tae-Gyu
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.42 no.1
    • /
    • pp.95-100
    • /
    • 2005
  • This paper proposes a DVS method which effectively reduces the power consumption of multimedia signal processor. Analytic derivations of effective range of its power saving factor are obtained with the assumption of a Gaussian distribution for the frame-based computational burden of the multimedia processor. A closed form equation of the power saving factor is derived in terms of the mean-standard deviation of the distribution. An MPEG-2 video decoder algorithm and AAC encoder algorithm are tested on ARM9 RISC processor for the experimental verification of the power saying of the proposed DVS approach. The experimental results with diverse MPEG-2 video and audio files show 50~30% power saving factor and show good agreement with those of the analytically derived values.

Real-Time DSP Implementation of Adaptive Multi-Rate with TMS320C542 board (TMS320C542보드를 이용한 Adaptive Multi-Rate 음성부호화기의 실시간 구현)

  • 박세익;전라온;이인성
    • Proceedings of the IEEK Conference
    • /
    • 2000.09a
    • /
    • pp.827-830
    • /
    • 2000
  • 3GPP and ETSI adopted AMR(Adaptive Multi-Rate) as a standard for next generation IMT-2000 service. In this paper, we analyzed algorithm about AMR and optimized ANSI C source on the C complier and assembly language of Texas Instrument . The implemented AMR speech codec requires 28.2MIPS of complexity for encoder and 5.5MIPS for decoder. we performed real-time implementation of AMR speech codec using 82% of TMS320C5402 with 40 MIPS specification. We give proof that the output speech of the implemented speech codec on DSP board is identical with result of C source program simulation. Also the reconstructed speech is verified in the real-time environment consisted of microphone and speaker.

  • PDF

Measuring Sentence Similarity using Morpheme Embedding Model and GRU Encoder for Question and Answering System (질의응답 시스템에서 형태소임베딩 모델과 GRU 인코더를 이용한 문장유사도 측정)

  • Lee, DongKeon;Oh, KyoJoong;Choi, Ho-Jin;Heo, Jeong
    • 한국어정보학회:학술대회논문집
    • /
    • 2016.10a
    • /
    • pp.128-133
    • /
    • 2016
  • 문장유사도 분석은 문서 평가 자동화에 활용될 수 있는 중요한 기술이다. 최근 순환신경망을 이용한 인코더-디코더 언어 모델이 기계학습 분야에서 괄목할만한 성과를 거두고 있다. 본 논문에서는 한국어 형태소임베딩 모델과 GRU(Gated Recurrent Unit)기반의 인코더를 제시하고, 이를 이용하여 언어모델을 한국어 위키피디아 말뭉치로부터 학습하고, 한국어 질의응답 시스템에서 질문에 대한 정답을 유추 할 수 있는 증거문장을 찾을 수 있도록 문장유사도를 측정하는 방법을 제시한다. 본 논문에 제시된 형태소임베딩 모델과 GRU 기반의 인코딩 모델을 이용하여 문장유사도 측정에 있어서, 기존 글자임베딩 방법에 비해 개선된 결과를 얻을 수 있었으며, 질의응답 시스템에서도 유용하게 활용될 수 있음을 알 수 있었다.

  • PDF

Development of Closed Caption Decoder System on Broadcast Monitor (방송용 모니터의 방송 자막 디코더 시스템 개발)

  • Song, Young-Kyu;Jeong, Jae-Seok
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2010.07a
    • /
    • pp.36-39
    • /
    • 2010
  • 멀티 포맷 방송용 모니터는 SDI 신호뿐만 아니라 HDMI, DVI, Component, Composite로 전송되는 영상, 음성, 부가 데이터를 보여주는 모니터로 방송용 레퍼런스 모니터로 사용되고 있다. 특히 부가 데이터 중에서 Closed Caption의 경우 북미에서는 EIA-608과 EIA-708 두 가지 표준이 있고, 세부적으로 네 가지의 방법으로 전송되는데 일반적인 방송용 모니터에는 적용되어 있는 것이 극히 드물다. 또한 SDI 신호로 전송되는 Closed Caption 데이터를 Decoding하는 상용 IC는 거의 없는 수준이다. 이에 본 논문에서는 SDI로 전송되는 다양한 방식의 Closed Caption 데이터를 모두 표시하기 위한 방법을 제안하였다. 먼저 VBI (Vertical Blanking Interval) 에 아날로그 Waveform 형태로 입력되는 경우 데이터의 신뢰도를 높이기 위해 Clock Run In을 실시간으로 검출 할 수 있는 구조를 제안하고 FPGA (Field Programmable Gata Array)로 구현하였다. 또한 VANC (Vertical Ancillary Space)로 들어오는 Caption데이터의 경우 특히 EIA-708 처럼 많은 데이터가 입력되는 경우 실시간으로 처리하기 위해서 기존의 I2C와 같은 느린 전송 방법이 아닌 FPGA와 프로세서 간에 메모리를 직접 Access 할 수 있는 방법을 제안하였다. 본 논문에서 제안 한 방법을 FPGA로 구현하였고, 실제 미국이나 캐나다 방송국에서 사용하는 Caption 인코더 장비 뿐만아니라 방송 콘텐츠를 직접 이용하여 동작 상태를 검증하였다.

  • PDF

Coreference Resolution using Hierarchical Pointer Networks (계층적 포인터 네트워크를 이용한 상호참조해결)

  • Park, Cheoneum;Lee, Changki
    • KIISE Transactions on Computing Practices
    • /
    • v.23 no.9
    • /
    • pp.542-549
    • /
    • 2017
  • Sequence-to-sequence models and similar pointer networks suffer from performance degradation when an input is composed of multiple sentences or when the length of the input sentence is long. To solve this problem, this paper proposes a hierarchical pointer network model that uses both the word level and sentence level information to encode input sequences composed of several sentences at the word level and sentence level. We propose a hierarchical pointer network based coreference resolution that performs a coreference resolution for all mentions. The experimental results show that the proposed model has a precision of 87.07%, recall of 65.39% and CoNLL F1 74.61%, which is an improvement of 21.83% compared to an existing rule-based model.