• Title/Abstract/Keyword: Neural network Transformer

Search Result 109, Processing Time 0.023 seconds

Transformer and Spatial Pyramid Pooling based YOLO network for Object Detection

  • Kwon, Oh-Jun;Jeong, Je-Chang
    • Proceedings of the Korean Society of Broadcast Engineers Conference / fall / pp.113-116 / 2021
  • In general, deep learning-based object detection methods extract features from the input image with a Convolutional Neural Network (CNN) and perform detection on those features. The Transformer, which recently achieved breakthrough performance in natural language processing, has proven competitive on computer vision tasks such as image classification and object detection. This paper proposes a one-stage object detection network that improves the CSP block of YOLOv4-CSP. The improved CSP block builds on the Transformer's Multi-Head Attention and a CSP-style Spatial Pyramid Pooling (SPP) operation to aid feature learning in the network's Backbone and Neck. Evaluated on the MSCOCO test-dev2017 dataset, the proposed network improves detection accuracy by 2.7% in Average Precision (AP) over YOLOv4s-mish, a lightweight version of YOLOv4-CSP.


Application of LVQ3 for Dissolved Gas Analysis for Power Transformer

  • Jeon, Yeong-Jae;Kim, Jae-Cheol
    • The Transactions of the Korean Institute of Electrical Engineers A / v.49 no.1 / pp.31-36 / 2000
  • To enhance fault diagnosis in dissolved gas analysis (DGA) of power transformers, this paper proposes learning vector quantization (LVQ) for incipient fault recognition. LVQ is especially suitable for pattern recognition tasks such as DGA-based fault diagnosis of power transformers because it improves on the Kohonen neural network by emphasizing classification around the decision boundary. The capabilities of the proposed diagnosis system for transformer DGA decision support have been extensively verified with practical test data collected from Korea Electrical Power Corporation.


Implementation of Melody Generation Model Through Weight Adaptation of Music Information Based on Music Transformer

  • Seunga Cho;Jaeho Lee
    • IEMEK Journal of Embedded Systems and Applications / v.18 no.5 / pp.217-223 / 2023
  • In this paper, we propose a new model for the conditional generation of music, considering key and rhythm, fundamental elements of music. MIDI sheet music is converted into a WAV format, which is then transformed into a Mel Spectrogram using the Short-Time Fourier Transform (STFT). Using this information, key and rhythm details are classified by passing through two Convolutional Neural Networks (CNNs), and this information is again fed into the Music Transformer. The key and rhythm details are combined by differentially multiplying the weights and the embedding vectors of the MIDI events. Several experiments are conducted, including a process for determining the optimal weights. This research represents a new effort to integrate essential elements into music generation and explains the detailed structure and operating principles of the model, verifying its effects and potentials through experiments. In this study, the accuracy for rhythm classification reached 94.7%, the accuracy for key classification reached 92.1%, and the Negative Likelihood based on the weights of the embedding vector resulted in 3.01.

A Study on Utilization of Vision Transformer for CTR Prediction

  • Kim, Tae-Suk;Kim, Seokhun;Im, Kwang Hyuk
    • Knowledge Management Research / v.22 no.4 / pp.27-40 / 2021
  • Click-Through Rate (CTR) prediction is a key function of a recommendation system: it determines the ranking of candidate items and recommends the high-ranking ones, reducing customer information overload and maximizing profit through sales promotion. Natural language processing and image classification are advancing rapidly through the use of deep neural networks. Recently, a transformer model based on an attention mechanism, distinct from the mainstream models in these fields, has been proposed and has achieved state-of-the-art results. In this study, we present a method for improving the performance of a transformer model for CTR prediction. CTR data are discrete and categorical, unlike natural language and image data; to analyze how these characteristics affect performance, we run experiments on embedding regularization and transformer normalization. The experimental results confirm that the transformer's prediction performance improves significantly when L2 regularization is applied in the embedding process for CTR input data and when batch normalization replaces layer normalization, the transformer's default normalization method.
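As a rough illustration of the two normalization schemes this abstract compares, the sketch below normalizes the same toy batch per sample (layer normalization) and per feature (batch normalization), and adds an L2 penalty on an embedding table. It is not the authors' implementation: the function names, the `weight_decay` value, and the toy data are assumptions, and the learnable scale and shift parameters and running statistics used in practice are omitted.

```python
import math


def _normalize(values, eps=1e-5):
    """Shift to zero mean and scale to (approximately) unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def layer_norm(batch, eps=1e-5):
    """Normalize each sample (row) across its own features."""
    return [_normalize(row, eps) for row in batch]


def batch_norm(batch, eps=1e-5):
    """Normalize each feature (column) across the samples in the batch."""
    columns = [_normalize(list(col), eps) for col in zip(*batch)]
    return [list(row) for row in zip(*columns)]


def l2_penalty(embedding, weight_decay=1e-4):
    """L2 penalty added to the loss; weight_decay is a hypothetical value."""
    return weight_decay * sum(w * w for row in embedding for w in row)


# Toy batch: 2 samples x 3 features (made-up numbers).
batch = [[1.0, 2.0, 3.0], [4.0, 6.0, 8.0]]
ln = layer_norm(batch)   # statistics taken along each row
bn = batch_norm(batch)   # statistics taken along each column
```

The only difference between the two functions is the axis over which the mean and variance are taken, which is why batch normalization depends on the other samples in the batch and layer normalization does not.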

A Korean speech recognition based on conformer

  • Koo, Myoung-Wan
    • The Journal of the Acoustical Society of Korea / v.40 no.5 / pp.488-495 / 2021
  • We propose a speech recognition system based on the conformer. The conformer is a convolution-augmented transformer that combines a transformer model, which captures global information, with a Convolutional Neural Network (CNN), which exploits local features effectively. The baseline system is a transformer-based speech recognizer with a Long Short-Term Memory (LSTM)-based language model. The proposed system uses a conformer instead of the transformer, together with a transformer-based language model. On the Electronics and Telecommunications Research Institute (ETRI) speech corpus in AI-Hub, the proposed system yields a Character Error Rate (CER) of 5.7 %, while the baseline system yields a CER of 11.8 %. When the evaluation is extended to another AI-Hub domain, the NHNdiguest speech corpus, the proposed system performs robustly across both domains. These experiments validate the proposed system.
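For readers unfamiliar with the metric quoted above, Character Error Rate is conventionally the character-level edit distance between the reference and the recognized text, divided by the reference length. The sketch below is a minimal illustration under that standard definition. The abstract does not state the authors' scoring rules, so the function names and the choice to ignore spaces are assumptions.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str, ignore_spaces: bool = True) -> float:
    """Character Error Rate: edit distance divided by reference length.

    Ignoring spaces is an assumption, not a rule taken from the paper.
    """
    if ignore_spaces:
        ref = ref.replace(" ", "")
        hyp = hyp.replace(" ", "")
    if not ref:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref, hyp) / len(ref)


# One substituted syllable out of five reference characters.
example = cer("안녕하세요", "안녕하세오")
```

Because insertions count as errors, CER can exceed 1.0 when the hypothesis is much longer than the reference.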

Simple and effective neural coreference resolution for Korean language

  • Park, Cheoneum;Lim, Joonho;Ryu, Jihee;Kim, Hyunki;Lee, Changki
    • ETRI Journal / v.43 no.6 / pp.1038-1048 / 2021
  • We propose an end-to-end neural coreference resolution model for the Korean language that uses an attention mechanism to point to the same entity. Because Korean is a head-final language, we focused on a method that uses a pointer network based on the head. The key idea is to treat all nouns in the document as candidates, based on the head-final characteristics of Korean, and to learn distributions over the referenced entity positions for each noun. Given the recent success of Bidirectional Encoder Representations from Transformers (BERT) in natural language processing tasks, we employed BERT in the proposed model to create word representations based on contextual information. The experimental results indicated that the proposed model achieved state-of-the-art performance in Korean coreference resolution.

A Dual-Structured Self-Attention for improving the Performance of Vision Transformers

  • Kwang-Yeob Lee;Hwang-Hee Moon;Tae-Ryong Park
    • Journal of IKEEE / v.27 no.3 / pp.251-257 / 2023
  • In this paper, we propose a dual-structured self-attention method that compensates for the lack of regional features in the vision transformer's self-attention. Vision Transformers, which are more computationally efficient than convolutional neural networks in object classification, object segmentation, and video image recognition, are comparatively weak at extracting regional features. To solve this problem, many studies rely on windows or shifted windows, but these methods weaken the advantages of self-attention-based transformers by increasing computational complexity through multiple levels of encoders. This paper proposes a dual-structure self-attention that combines self-attention with a neighborhood network to improve the locality inductive bias over the existing method. The neighborhood network, which extracts local context information, is computationally much simpler than the window structure. CIFAR-10 and CIFAR-100 were used to compare the proposed dual-structure self-attention transformer with the existing transformer, and the experiments showed Top-1 accuracy improvements of 0.63% and 1.57%, respectively.

Development of Management Software for Transformers Based on Artificial Intelligent Analysis Technology of Dissolved Gases in Oil

  • Sun Jong-Ho;Han Sang-Bo;Kang Dong-Sik;Kim Kwang-Hwa
    • The Transactions of the Korean Institute of Electrical Engineers C / v.54 no.12 / pp.578-584 / 2005
  • This paper describes the development of management software for transformers based on artificial intelligence analysis of dissolved gases in oil. Fault interpretation is performed by an artificial neural network and a rule base built on dissolved gas analysis. The gases used are acetylene ($C_{2}H_{2}$), hydrogen ($H_2$), ethylene ($C_{2}H_{4}$), methane ($CH_4$), ethane ($C_{2}H_{6}$), carbon monoxide (CO) and carbon dioxide ($CO_2$). The software mainly consists of windows for gas input, fault causes, detailed expected fault phenomena, maintenance decisions, reports, and gas trends. The results indicate that the software is a powerful tool for the efficient management of oil-immersed transformers through data analysis of gas components.

Machine learning-based evaluation technology of 3D spatial distribution of residual radioactivity in large-scale radioactive structures

  • UkJae Lee;Phillip Chang;Nam-Suk Jung;Jonghun Jang;Jimin Lee;Hee-Seock Lee
    • Nuclear Engineering and Technology / v.56 no.8 / pp.3199-3209 / 2024
  • During the decommissioning of nuclear and particle accelerator facilities, a considerable amount of large-scale radioactive waste may be generated. Accurately defining the activation level of the waste is crucial for proper disposal. However, directly measuring the internal radioactivity distribution poses challenges. This study introduces a novel technology that employs machine learning to assess the internal radioactivity distribution from external measurements. Random radioactivity distributions within a structure were established, and the photon spectrum measured by detectors outside the structure was simulated using the FLUKA Monte-Carlo code. By training on spectrum data corresponding to various radioactivity distributions, an evaluation model for radioactivity was developed from the data produced by this Monte-Carlo simulation. Convolutional Neural Network and Transformer methods were used to establish the evaluation model. The model was built with 5425 simulation datasets, and 603 datasets were used to obtain the evaluation results. Preprocessing was applied to the datasets, but the evaluation model using raw spectrum data showed the best results. The intensity and shape of the radioactivity distribution inside the structure were estimated with a relative error of 10%. Additionally, an evaluation with the constructed model takes only a few seconds to complete.

Training an Artificial Neural Network (ANN) to Control the Tap Changer of Parallel Transformers for a Closed Primary Bus

  • Sedaghati, Alireza
    • Institute of Control, Robotics and Systems: Conference Proceedings / 2004.08a / pp.1042-1047 / 2004
  • Voltage control is an essential part of the electric energy transmission and distribution system, maintaining the proper voltage limit at the consumer's terminal. Besides the generating units that provide the basic voltage control, there are many additional voltage-controlling agents, e.g., shunt capacitors, shunt reactors, static VAr compensators, and regulating transformers, as mentioned in [1], [2]. The most popular of these agents for controlling voltage levels in the distribution and transmission system is the on-load tap changer transformer. It serves two functions: energy transformation between voltage levels and voltage control. The Artificial Neural Network (ANN) has been recognized as a convenient tool for controlling the on-load tap changer in distribution transformers. Using an ANN in this area requires suitable training and testing data for performance analysis before practical application. This paper briefly describes a procedure for processing the data used to train an ANN to control the tap changer operating decision of parallel transformers for a closed primary bus. The data set is used to train a two-layer ANN with three different neural net learning algorithms, namely Standard Backpropagation [3], Bayesian Regularization [4] and Scaled Conjugate Gradient [5]. The experimental results are presented together with a performance analysis.
