• Title/Abstract/Keywords: Encoder Model


Variational Auto Encoder Distributed Restrictions for Image Generation

  • 김용길
    • 한국인터넷방송통신학회논문지 / Vol. 23, No. 3 / pp. 91-97 / 2023
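  • Although GANs (Generative Adversarial Networks) are currently used for synthetic image generation and various other applications, generative models are difficult to control. The problem lies in the latent space of the generative model: in image generation, substantial constraints are required so that, given an input latent code, the target attributes specified by a particular text or signal are enhanced while other attributes remain largely unaffected. This study proposes a model that imposes specific constraints on the latent vectors of a variational autoencoder for image generation and manipulation. Experiments on the proposed model using a TensorFlow variational autoencoder confirm that it achieves relatively good performance in image generation and manipulation.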
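
For orientation, the standard latent regularizer of a variational autoencoder, on which additional latent-vector constraints would typically build, can be sketched as follows. This is not the constraint proposed in the paper, which the abstract does not specify. The function names are hypothetical, and NumPy is used instead of TensorFlow only to keep the sketch self-contained.

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I) (reparameterization trick)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions."""
    return -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var), axis=-1)

rng = np.random.default_rng(0)
mu = np.array([[0.0, 0.0], [1.0, -1.0]])   # two latent codes, two dimensions each
log_var = np.zeros_like(mu)                # unit variance in every dimension

z = reparameterize(mu, log_var, rng)       # sampled latent vectors, shape (2, 2)
kl = kl_to_standard_normal(mu, log_var)    # one KL value per latent code
```

The first code already matches the standard normal prior, so its KL term is zero; the second is penalized for its shifted mean. Any extra constraint on the latent vector would be added to this term in the training loss.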

A study on the auto encoder-based anomaly detection technique for pipeline inspection

  • 김관태;이준원
    • 상하수도학회지 / Vol. 38, No. 2 / pp. 83-93 / 2024
  • In this study, we present a sewer pipe inspection technique that combines active sonar technology with deep learning algorithms. Pipes containing water are difficult to inspect using conventional CCTV inspection methods, which have various limitations, so a new approach is needed. In this paper, we introduce an inspection method using active sonar and apply an autoencoder deep learning model to process sonar data and distinguish between normal and abnormal pipelines. The model was trained on sonar data from a controlled environment under the assumption of normal pipeline conditions and used anomaly detection techniques to identify deviations from established standards. This approach presents a new perspective on pipeline inspection, promising to reduce the time and resources required for sewer system management and to enhance the reliability of pipeline inspections.
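
The train-on-normal, flag-by-deviation idea described in this abstract can be sketched with reconstruction error. The sketch below is an assumption-laden stand-in: it uses a linear autoencoder obtained from PCA rather than the paper's deep model, synthetic vectors rather than sonar data, and a 99th-percentile threshold chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for "normal" feature vectors: points near a 2-D subspace of R^8.
basis = rng.standard_normal((2, 8))
normal_train = rng.standard_normal((500, 2)) @ basis + 0.05 * rng.standard_normal((500, 8))

# "Train" a linear autoencoder: the top-2 principal directions serve as
# both the encoder weights and (transposed) the decoder weights.
mean = normal_train.mean(axis=0)
_, _, vt = np.linalg.svd(normal_train - mean, full_matrices=False)
components = vt[:2]  # shape (2, 8)

def reconstruction_error(x):
    code = (x - mean) @ components.T   # encode to 2 dimensions
    recon = code @ components + mean   # decode back to 8 dimensions
    return np.sum((x - recon) ** 2, axis=1)

# Threshold (assumed): a high percentile of the error on normal training data.
threshold = np.percentile(reconstruction_error(normal_train), 99)

normal_test = rng.standard_normal((100, 2)) @ basis + 0.05 * rng.standard_normal((100, 8))
abnormal_test = 3.0 * rng.standard_normal((100, 8))  # not near the learned subspace

normal_flags = reconstruction_error(normal_test) > threshold
abnormal_flags = reconstruction_error(abnormal_test) > threshold
```

Because the model only ever sees normal samples, no labeled anomalies are needed for training; anything it reconstructs poorly is treated as a deviation.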

Accuracy Assessment of Forest Degradation Detection in Semantic Segmentation based Deep Learning Models with Time-series Satellite Imagery

  • Woo-Dam Sim;Jung-Soo Lee
    • Journal of Forest and Environmental Science / Vol. 40, No. 1 / pp. 15-23 / 2024
  • This research aimed to assess the possibility of detecting forest degradation using time-series satellite imagery and three different deep learning-based change detection techniques. The dataset used for the deep learning models was composed of two sets, one based on surface reflectance (SR) spectral information from satellite imagery, combined with texture information (GLCM; Gray-Level Co-occurrence Matrix) and terrain information. The deep learning models employed for land cover change detection included image differencing using the Unet semantic segmentation model, multi-encoder Unet model, and multi-encoder Unet++ model. The study found that there was no significant difference in accuracy between the deep learning models for forest degradation detection. Training and validation accuracies were approximately 89% and 92%, respectively. Among the three deep learning models, the multi-encoder Unet model showed the most efficient analysis time and comparable accuracy. Moreover, models that incorporated both texture and gradient information in addition to spectral information were found to have a higher classification accuracy compared to models that used only spectral information. Overall, the accuracy of forest degradation extraction was outstanding, achieving 98%.
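
As an illustration of the GLCM texture information mentioned in this abstract, the sketch below builds a gray-level co-occurrence matrix for one pixel offset and derives the contrast statistic from it. The function names, the tiny 3-level image, and the choice of offset and statistic are assumptions; the abstract does not say which GLCM features the authors used. The sketch handles non-negative offsets only and does not symmetrize the matrix.

```python
import numpy as np

def glcm(image, levels, dx=1, dy=0):
    """Count co-occurrences of gray levels (i, j) at offset (dy, dx), then normalize."""
    counts = np.zeros((levels, levels), dtype=np.float64)
    rows, cols = image.shape
    for r in range(rows - dy):
        for c in range(cols - dx):
            counts[image[r, c], image[r + dy, c + dx]] += 1
    return counts / counts.sum()

def contrast(p):
    """GLCM contrast: sum over (i, j) of p[i, j] * (i - j)^2."""
    i, j = np.indices(p.shape)
    return np.sum(p * (i - j) ** 2)

image = np.array([[0, 0, 1],
                  [0, 1, 1],
                  [2, 2, 2]])

p = glcm(image, levels=3)   # horizontal neighbor pairs (dx=1, dy=0)
texture = contrast(p)
```

In a pipeline like the one described, statistics such as this would be computed over a sliding window and stacked with the spectral bands as extra input channels.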

System-level Function and Architecture Codesign for Optimization of MPEG Encoder

  • Choi, Jin-Ku;Togawa, Nozomu;Yanagisawa, Masao;Ohtsuki, Tatsuo
    • 대한전자공학회:학술대회논문집 / 대한전자공학회 2002 ITC-CSCC -3 / pp. 1736-1739 / 2002
  • Advances in semiconductor, hardware, and software technologies enable the integration of more complex systems and increase design complexity. As system designs become more complex, system-level design based on IP blocks and processor models is needed more than design at the RTL or lower levels. In this paper, we present a novel approach for system-level design that satisfies the various required constraints, together with an optimization method for an image encoder based on codesign of function, algorithm, and architecture. In addition, we show an MPEG-4 encoder as a design case study. The best tradeoffs between algorithm and architecture are necessary to deliver a design that satisfies performance and area constraints. The evaluations show effective optimization of motion estimation, which accounts for a substantial share of the performance of the MPEG-4 encoder module.


Deep Reference-based Dynamic Scene Deblurring

  • Cunzhe Liu;Zhen Hua;Jinjiang Li
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 18, No. 3 / pp. 653-669 / 2024
  • Dynamic scene deblurring is a complex computer vision problem because it is difficult to model mathematically. In this paper, we present a novel approach to image deblurring that uses a sharp reference image to recover high-quality, high-frequency detail. To better utilize the clear reference image, we develop an encoder-decoder network and design two novel modules to guide the network toward better image restoration. The proposed Reference Extraction and Aggregation Module effectively establishes the correspondence between the blurry image and the reference image and explores the most relevant features for better blur removal, and the proposed Spatial Feature Fusion Module enables the encoder to perceive blur information at different spatial scales. Finally, the multi-scale feature maps from the encoder and the cascaded Reference Extraction and Aggregation Modules are integrated into the decoder for global fusion and representation. Extensive quantitative and qualitative experimental results on different benchmarks show the effectiveness of the proposed method.

Implementation of Encoder/Decoder to Support SNN Model in an IoT Integrated Development Environment based on Neuromorphic Architecture

  • 김회남;윤영선
    • 한국소프트웨어감정평가학회 논문지 / Vol. 17, No. 2 / pp. 47-57 / 2021
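  • Neuromorphic technology mimics the structure and computational processes of the human brain in hardware and was proposed to compensate for the shortcomings of existing artificial intelligence technology. NA-IDE was proposed for developing IoT applications based on neuromorphic hardware, and implementing an SNN model in NA-IDE requires converting commonly used input data into a form that the SNN model can use. In this paper, we implemented a neural-coding encoder component that converts image data into spike time-series patterns for use as SNN input. The decoder component was implemented to convert the output time-series data back into image data when the SNN model generates spike time-series patterns. When the decoder component was applied to the output data with the same parameters as the encoding process, it produced static data similar to the original data. The proposed encoder and decoder could be used in fields such as image-to-image or speech-to-speech, where input data is transformed and regenerated.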
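
One common way to turn image data into spike time series and back is rate coding, sketched below. The abstract does not say which neural coding scheme the authors implemented, so rate coding, the function names, and the parameter `num_steps` are all assumptions made for illustration.

```python
import numpy as np

def encode_rate(image, num_steps, rng):
    """Encode pixel intensities in [0, 1] as Bernoulli spike trains.

    Returns an array of shape (num_steps, *image.shape) with 0/1 entries;
    each pixel spikes at every time step with probability equal to its intensity.
    """
    return (rng.random((num_steps, *image.shape)) < image).astype(np.uint8)

def decode_rate(spikes):
    """Decode by averaging spikes over time (the empirical firing rate)."""
    return spikes.mean(axis=0)

rng = np.random.default_rng(0)
image = np.array([[0.0, 0.25],
                  [0.5, 1.0]])

spikes = encode_rate(image, num_steps=1000, rng=rng)
restored = decode_rate(spikes)   # approximates the original intensities
```

The decoded values only approximate the original, and the approximation tightens as `num_steps` grows, which is consistent with the abstract's remark that decoding yields data similar to the original rather than identical to it.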

Ensemble UNet 3+ for Medical Image Segmentation

  • JongJin, Park
    • International Journal of Internet, Broadcasting and Communication / Vol. 15, No. 1 / pp. 269-274 / 2023
  • In this paper, we propose a new UNet 3+ model for medical image segmentation. The proposed ensemble (E) UNet 3+ model combines UNet 3+ networks of varying depths into one unified architecture. These networks share the same encoder but have their own decoders. They can bridge the semantic gap between the encoder and decoder nodes of UNet 3+. Deep supervision was used for learning on a total of 8 nodes of the E-UNet 3+ to improve performance. The proposed E-UNet 3+ model shows better segmentation results than UNet 3+. In simulation, the E-UNet 3+ model using deep supervision performed best, with loss function values of 0.8904 and 0.8562 for training and validation data. For the test data, the UNet 3+ model using deep supervision performed best, with a value of 0.7406. A qualitative comparison of the simulation results shows that the proposed model produces better results than the existing UNet 3+.

Zero-shot voice conversion with HuBERT

  • Hyelee Chung;Hosung Nam
    • 말소리와 음성과학 / Vol. 15, No. 3 / pp. 69-74 / 2023
  • This study introduces an innovative model for zero-shot voice conversion that utilizes the capabilities of HuBERT. Zero-shot voice conversion models can transform the speech of one speaker to mimic that of another, even when the model has not been exposed to the target speaker's voice during the training phase. Comprising five main components (HuBERT, feature encoder, flow, speaker encoder, and vocoder), the model offers remarkable performance across a range of scenarios. Notably, it excels in the challenging unseen-to-unseen voice-conversion tasks. The effectiveness of the model was assessed based on the mean opinion scores and similarity scores, reflecting high voice quality and similarity to the target speakers. This model demonstrates considerable promise for a range of real-world applications demanding high-quality voice conversion. This study sets a precedent in the exploration of HuBERT-based models for voice conversion, and presents new directions for future research in this domain. Despite its complexities, the robust performance of this model underscores the viability of HuBERT in advancing voice conversion technology, making it a significant contributor to the field.

Design and Implementation of a Reusable and Extensible HL7 Encoding/Decoding Framework

  • 김정선;박승훈;나연묵
    • 한국정보과학회논문지:컴퓨팅의 실제 및 레터 / Vol. 8, No. 1 / pp. 96-106 / 2002
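  • HL7 (Health Level 7) is an international standard protocol that enables the exchange of clinical and administrative information between heterogeneous systems in healthcare environments, and it defines a variety of HL7 message formats that follow standard encoding rules. This paper presents the design and implementation of an HL7 encoding/decoding framework with excellent flexibility, reusability, and extensibility, built on a Message Object Model and a Message Definition Repository. The Message Object Model is an abstract HL7 message format that represents the objects making up an HL7 message and the various relationships among them; it reflects the logical relationships among standard HL7 message elements such as segments, fields, and components while ensuring that the structural constraints specified by the standard are satisfied. Because the Message Object Model allows HL7 encoders and decoders to be built independently of platform-dependent data formats, arbitrary heterogeneous hospital information systems can be interconnected with minimal effort. Meanwhile, the Message Definition Repository, an external database that defines HL7 messages, keeps the encoder and decoder implementations unaffected even when the standard HL7 message formats are revised. In addition, the Message Definition Repository is useful for checking the validity of the inputs to the encoder and the decoder (that is, an HL7 message object expressed in the Message Object Model and an encoded HL7 message string, respectively). Although Java was used to implement the prototype HL7 encoder and decoder, the proposed encoding/decoding framework allows the encoder and decoder to be implemented easily as independent standard components such as ActiveX, JavaBeans, or CORBA objects.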
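
The separation between a message object model and its encoded string form can be sketched as below. This is a simplified illustration, not the authors' framework: the class and function names are hypothetical, Python is used although the prototype was written in Java, and the sketch covers only the HL7 v2 field (`|`), component (`^`), and segment (carriage return) delimiters. It omits the special handling of the MSH segment, repetitions, subcomponents, escape sequences, and any validation against a message definition repository.

```python
from dataclasses import dataclass, field
from typing import List

FIELD_SEP = "|"       # HL7 v2 field separator
COMPONENT_SEP = "^"   # HL7 v2 component separator
SEGMENT_SEP = "\r"    # HL7 v2 segment terminator (carriage return)

@dataclass
class Field:
    components: List[str]

@dataclass
class Segment:
    name: str
    fields: List[Field] = field(default_factory=list)

@dataclass
class Message:
    segments: List[Segment] = field(default_factory=list)

def encode(message: Message) -> str:
    """Serialize the object model into a delimited HL7-style string."""
    lines = []
    for seg in message.segments:
        parts = [seg.name] + [COMPONENT_SEP.join(f.components) for f in seg.fields]
        lines.append(FIELD_SEP.join(parts))
    return SEGMENT_SEP.join(lines) + SEGMENT_SEP

def decode(text: str) -> Message:
    """Parse a delimited HL7-style string back into the object model."""
    segments = []
    for line in filter(None, text.split(SEGMENT_SEP)):
        name, *raw_fields = line.split(FIELD_SEP)
        segments.append(Segment(name, [Field(raw.split(COMPONENT_SEP)) for raw in raw_fields]))
    return Message(segments)

# Illustrative message with made-up values.
msg = Message([
    Segment("PID", [Field(["1"]), Field([""]), Field(["12345"]), Field([""]), Field(["Doe", "John"])]),
    Segment("PV1", [Field(["1"]), Field(["I"])]),
])
wire = encode(msg)
roundtrip = decode(wire)
```

Application code works only with `Message`, `Segment`, and `Field` objects, so the wire format can change without touching it, which is the kind of independence the abstract attributes to the Message Object Model.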

Arabic Stock News Sentiments Using the Bidirectional Encoder Representations from Transformers Model

  • Eman Alasmari;Mohamed Hamdy;Khaled H. Alyoubi;Fahd Saleh Alotaibi
    • International Journal of Computer Science & Network Security / Vol. 24, No. 2 / pp. 113-123 / 2024
  • Stock market news sentiment analysis (SA) aims to identify the attitudes of the news of the stock on the official platforms toward companies' stocks. It supports making the right decision in investing or analysts' evaluation. However, the research on Arabic SA is limited compared to that on English SA due to the complexity and limited corpora of the Arabic language. This paper develops a model of sentiment classification to predict the polarity of Arabic stock news in microblogs. Also, it aims to extract the reasons which lead to polarity categorization as the main economic causes or aspects based on semantic unity. Therefore, this paper presents an Arabic SA approach based on the logistic regression model and the Bidirectional Encoder Representations from Transformers (BERT) model. The proposed model is used to classify articles as positive, negative, or neutral. It was trained on the basis of data collected from an official Saudi stock market article platform that was later preprocessed and labeled. Moreover, the economic reasons for the articles based on semantic unit, divided into seven economic aspects to highlight the polarity of the articles, were investigated. The supervised BERT model obtained 88% article classification accuracy based on SA, and the unsupervised mean Word2Vec encoder obtained 80% economic-aspect clustering accuracy. Predicting polarity classification on the Arabic stock market news and their economic reasons would provide valuable benefits to the stock SA field.