• Title/Summary/Keyword: Domain adaptation technique

Search Result 20, Processing Time 0.027 seconds

Korean Semantic Role Labeling Using Domain Adaptation Technique (도메인 적응 기술을 이용한 한국어 의미역 인식)

  • Lim, Soojong;Bae, Yongjin;Kim, Hyunki
    • Annual Conference on Human and Language Technology
    • /
    • 2014.10a
    • /
    • pp.56-60
    • /
    • 2014
  • 기계학습 방법에 기반한 자연어 분석은 학습 데이터가 필요하다. 학습 데이터가 구축된 소스 도메인이 아닌 다른 도메인에 적용할 경우 한국어 의미역 인식 기술은 15% 정도 성능 하락이 발생한다. 본 논문은 이러한 다른 도메인에 적용시 발생하는 성능 하락 현상을 극복하기 위해서 기존의 소스 도메인 학습 데이터를 활용하여, 소규모의 타겟 도메인 학습 데이터 구축만으로도 성능 하락을 최소화하기 위해 한국어 의미역 인식 기술에 prior 모델을 제안하며 기존의 도메인 적응 알고리즘과 비교 실험하였다. 추가적으로 학습 데이터에 사용되는 자질 중에서, 형태소 태그와 구문 태그의 자질 값을 기존보다 단순하게 적용하여 성능의 변화를 실험하였다.

  • PDF

Extending Korean PropBank for Korean Semantic Role Labeling and Applying Domain Adaptation Technique (한국어 의미역 결정을 위한 Korean PropBank 확장 및 도메인 적응 기술 적용)

  • Bae, JangSeong;Oh, JunHo;Hwang, HyunSun;Lee, Changki
    • Annual Conference on Human and Language Technology
    • /
    • 2014.10a
    • /
    • pp.44-47
    • /
    • 2014
  • 한국어 의미역 결정(Semantic Role Labeling)은 주로 기계 학습에 의해 이루어지며 많은 말뭉치 자원을 필요로 한다. 그러나 한국어 의미역 결정 시스템에서 사용되는 Korean PropBank는 의미역 부착 말뭉치와 동사 격틀이 영어 PropBank의 1/8 수준에 불과하다. 따라서 본 논문에서는 한국어 의미역 결정 시스템을 위해 의미역 부착 말뭉치와 동사 격틀을 확장하여 Korean PropBank를 확장 시키고자 한다. 의미역 부착 말뭉치를 만드는 일은 많은 자원과 시간이 소비되는 작업이다. 본 논문에서는 도메인 적응 기술을 적용해보고 기존의 학습 데이터를 활용하여, 적은 양의 새로운 학습 말뭉치만을 가지고 성능 하락을 최소화 할 수 있는지 실험을 통해 알아보고자 한다.

  • PDF

A Korean Flight Reservation System Using Continuous Speech Recognition

  • Choi, Jong-Ryong;Kim, Bum-Koog;Chung, Hyun-Yeol;Nakagawa, Seiichi
    • The Journal of the Acoustical Society of Korea
    • /
    • v.15 no.3E
    • /
    • pp.60-65
    • /
    • 1996
  • This paper describes on the Korean continuous speech recognition system for flight reservation. It adopts a frame-synchronous One-Pass DP search algorithm driven by syntactic constraints of context free grammar(CFG). For recognition, 48 phoneme-like units(PLU) were defined and used as basic units for acoustic modeling of Korean. This modeling was conducted using a HMM technique, where each model has 4-states 3-continuous output probability distributions and 3-discrete-duration distributions. Language modeling by CFG was also applied to the task domain of flight reservation, which consisted of 346 words and 422 rewriting rules. In the tests, the sentence recognition rate of 62.6% was obtained after speaker adaptation.

  • PDF

A Study of Semantic Role Labeling using Domain Adaptation Technique for Question (도메인 적응 기술 기반 질문 문장에 대한 의미역 인식 연구)

  • Lim, Soojong;Kim, Hyunki
    • Annual Conference on Human and Language Technology
    • /
    • 2015.10a
    • /
    • pp.246-249
    • /
    • 2015
  • 기계학습 방법에 기반한 자연어 분석은 학습 데이터가 필요하다. 학습 데이터가 구축된 소스 도메인이 아닌 다른 도메인에 적용할 경우 한국어 의미역 인식 기술은 10% 정도 성능 하락이 발생한다. 본 논문은 기존 도메인 적응 기술을 이용하여 도메인이 다르고, 문장의 형태도 다를 경우에 도메인 적응 알고리즘을 적용하여, 질의응답 시스템에서 필요한 질문 문장 의미역 인식을 위해, 소규모의 질문 문장에 대한 학습 데이터 구축만으로도 한국어 질문 문장에 대해 성능을 향상시키기 위한 방법을 제안한다. 한국어 의미역 인식 기술에 prior 모델을 제안한다. 제안하는 방법은 실험결과 소스 도메인 데이터만 사용한 실험보다 9.42, 소스와 타겟 도메인 데이터를 단순 합하여 학습한 경우보다 2.64의 성능향상을 보였다.

  • PDF

An Ensemble Model for Credit Default Discrimination: Incorporating BERT-based NLP and Transformer

  • Sophot Ky;Ju-Hong Lee
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2023.05a
    • /
    • pp.624-626
    • /
    • 2023
  • Credit scoring is a technique used by financial institutions to assess the creditworthiness of potential borrowers. This involves evaluating a borrower's credit history to predict the likelihood of defaulting on a loan. This paper presents an ensemble of two Transformer based models within a framework for discriminating the default risk of loan applications in the field of credit scoring. The first model is FinBERT, a pretrained NLP model to analyze sentiment of financial text. The second model is FT-Transformer, a simple adaptation of the Transformer architecture for the tabular domain. Both models are trained on the same underlying data set, with the only difference being the representation of the data. This multi-modal approach allows us to leverage the unique capabilities of each model and potentially uncover insights that may not be apparent when using a single model alone. We compare our model with two famous ensemble-based models, Random Forest and Extreme Gradient Boosting.

Domain adaptation of Korean coreference resolution using continual learning (Continual learning을 이용한 한국어 상호참조해결의 도메인 적응)

  • Yohan Choi;Kyengbin Jo;Changki Lee;Jihee Ryu;Joonho Lim
    • Annual Conference on Human and Language Technology
    • /
    • 2022.10a
    • /
    • pp.320-323
    • /
    • 2022
  • 상호참조해결은 문서에서 명사, 대명사, 명사구 등의 멘션 후보를 식별하고 동일한 개체를 의미하는 멘션들을 찾아 그룹화하는 태스크이다. 딥러닝 기반의 한국어 상호참조해결 연구들에서는 BERT를 이용하여 단어의 문맥 표현을 얻은 후 멘션 탐지와 상호참조해결을 동시에 수행하는 End-to-End 모델이 주로 연구가 되었으며, 최근에는 스팬 표현을 사용하지 않고 시작과 끝 표현식을 통해 상호참조해결을 빠르게 수행하는 Start-to-End 방식의 한국어 상호참조해결 모델이 연구되었다. 최근에 한국어 상호참조해결을 위해 구축된 ETRI 데이터셋은 WIKI, QA, CONVERSATION 등 다양한 도메인으로 이루어져 있으며, 신규 도메인의 데이터가 추가될 경우 신규 데이터가 추가된 전체 학습데이터로 모델을 다시 학습해야 하며, 이때 많은 시간이 걸리는 문제가 있다. 본 논문에서는 이러한 상호참조해결 모델의 도메인 적응에 Continual learning을 적용해 각기 다른 도메인의 데이터로 모델을 학습 시킬 때 이전에 학습했던 정보를 망각하는 Catastrophic forgetting 현상을 억제할 수 있음을 보인다. 또한, Continual learning의 성능 향상을 위해 2가지 Transfer Techniques을 함께 적용한 실험을 진행한다. 실험 결과, 본 논문에서 제안한 모델이 베이스라인 모델보다 개발 셋에서 3.6%p, 테스트 셋에서 2.1%p의 성능 향상을 보였다.

  • PDF

Design of Optimal FIR Filters for Data Transmission (데이터 전송을 위한 최적 FIR 필터 설계)

  • 이상욱;이용환
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.18 no.8
    • /
    • pp.1226-1237
    • /
    • 1993
  • For data transmission over strictly band-limited non-ideal channels, different types of filters with arbitrary responses are needed. In this paper. we proposed two efficient techniques for the design of such FIR filters whose response is specified in either the time or the frequency domain. In particular when a fractionally-spaced structure is used for the transceiver, these filters can be efficiently designed by making use of characteristics of oversampling. By using a minimum mean-squared error criterion, we design a fractionally-spaced FIR filter whose frequency response can be controlled without affecting the output error. With proper specification of the shape of the additive noise signals, for example, the design results in a receiver filter that can perform compromise equalization as well as phase splitting filtering for QAM demodulation. The second method ad-dresses the design of an FIR filter whose desired response can be arbitrarily specified in the frequency domain. For optimum design, we use an iterative optimization technique based on a weighted least mean square algorithm. A new adaptation algorithm for updating the weighting function is proposed for fast and stable convergence. It is shown that these two independent methods can be efficiently combined together for more complex applications.

  • PDF

Aerodynamic Shape Optimization of Helicopter Rotor Blades in Hover Using a Continuous Adjoint Method on Unstructured Meshes (비정렬 격자계에서 연속 Adjoint 방법을 이용한 헬리콥터 로터 블레이드의 제자리 비행 공력 형상 최적설계)

  • Lee, S.-W.;Kwon, O.-J.
    • Journal of the Korean Society for Aeronautical & Space Sciences
    • /
    • v.33 no.1
    • /
    • pp.1-10
    • /
    • 2005
  • An aerodynamic shape optimization technique has been developed for helicopter rotor blades in hover based on a continuous adjoint method on unstructured meshes. The Euler flow solver and the continuous adjoint sensitivity analysis were formulated on the rotating frame of reference for hovering rotor blades. In order to handle the repeated evaluation of the design cycle efficiently, the flow and adjoint solvers were parallelized using a domain decomposition strategy. A solution-adaptive mesh refinement technique was adopted for the accurate capturing of the tip vortex. Applications were made for the aerodynamic shape optimization of Caradonna-Tung rotor blades and UH60 rotor blades in hover. The results showed that the present method is an effective tool to determine optimum aerodynamic shapes of rotor blades requiring less torque while maintaining the desired thrust level.

Measurement of the Skin Blood Flow using Cross-Correlation (Cross-Correlation법에 의한 피부 혈류속도 측정)

  • Lee, Jeong-Taek;Im, Chun-Seong;Ryu, Jeom-Su;Lee, Jong-Su;Gong, Seong-Bae;Kim, Yeong-Gil
    • Journal of Biomedical Engineering Research
    • /
    • v.19 no.4
    • /
    • pp.379-384
    • /
    • 1998
  • To measure precisely the blood velocity in the skin microcirculation, we have used time domain correlation (called Cross-Correlation) based on the processing of the backscattered RF signal obtained with a wideband echographic imaging transducer, although it is difficulties of adaptation of the pulsed wave system, because of the data processing in real time and the hardware problem. This dedicated technology based on a 20MHz echographic imaging system has been developed. We present how the experimental data, i.e. the backscattered RF signal, have to be analyzed. After RF lines realignment, stationary echo canceling procedure and correlation level control, a velocity profile has been obtained. In-vitro result show that velocity measurements as low as 0.1mm/sec attainable with a 80${\mu}m$ in axial resolution. We have also validated with in-vivo experimentation on the external ear of a rabbit using B-mode sector scanning image and M-mode image of a custom made 20MHz skin image system. The flow of the "auriculares caudales" vein, a microvessel of 600 m diameter, has been detected and studied. This technique will allow a more precise exploration of circulatory troubles in cutaneous pathologies.

  • PDF

Novel Deep Learning-Based Profiling Side-Channel Analysis on the Different-Device (이종 디바이스 환경에 효과적인 신규 딥러닝 기반 프로파일링 부채널 분석)

  • Woo, Ji-Eun;Han, Dong-Guk
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.32 no.5
    • /
    • pp.987-995
    • /
    • 2022
  • Deep learning-based profiling side-channel analysis has been many proposed. Deep learning-based profiling analysis is a technique that trains the relationship between the side-channel information and the intermediate values to the neural network, then finds the secret key of the attack device using the trained neural network. Recently, cross-device profiling side channel analysis was proposed to consider the realistic deep learning-based profiling side channel analysis scenarios. However, it has a limitation in that attack performance is lowered if the profiling device and the attack device have not the same chips. In this paper, an environment in which the profiling device and the attack device have not the same chips is defined as the different-device, and a novel deep learning-based profiling side-channel analysis on different-device is proposed. Also, MCNN is used to well extract the characteristic of each data. We experimented with the six different boards to verify the attack performance of the proposed method; as a result, when the proposed method was used, the minimum number of attack traces was reduced by up to 25 times compared to without the proposed method.