• Title/Abstract/Keywords: sequence-to-sequence model

1,626 search results (processing time: 0.032 s)

Anomaly Detection for User Action with Generative Adversarial Networks

  • 최남웅;김우주
    • 지능정보연구 / Vol. 25, No. 3 / pp.43-62 / 2019
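  • At one time, anomaly detection was dominated by methods that judged whether data were anomalous from basic statistics derived from the data. Such methods were feasible because data used to be low-dimensional, so classical statistical methods worked effectively. However, as data characteristics have grown more complex in the big-data era, it has become difficult to accurately analyze and predict the data generated across industry with these conventional methods. Machine-learning models such as SVM and Decision Tree therefore came into use. However, supervised models predict accurately at test time only when the training data contain similar numbers of anomalous and normal examples, and because most data generated in industry have imbalanced class labels, the predictions of supervised models applied to them lack validity. To overcome this drawback, anomaly detection models are now being built on unsupervised learning, which is not affected by the class distribution, and are going through trial and error on the way to real industrial use. In line with this trend, this study proposes an anomaly detection method that uses generative adversarial networks (GANs). To learn from sequence data, the GAN is built from LSTMs: the generator uses two LSTM layers with 32 and 64 hidden units, respectively, and the discriminator uses a single LSTM layer with 64 hidden units. Whereas previous work on anomaly detection for sequence data derives the anomaly score from the entropy of the discriminator's probability that the data are real, this paper replaces it with a function based on feature matching. In addition, the process of optimizing the latent variable was implemented with an LSTM, which improved model performance. Compared with the autoencoder, the modified GAN achieved higher precision in every experiment and roughly 7% higher accuracy.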
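
The feature-matching anomaly score mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common formulation in GAN-based anomaly detection, a weighted sum of a residual term and a feature-matching term; the abstract does not give the paper's exact function. The names `anomaly_score`, `feat_x`, `feat_generated`, and the weight `lam` are hypothetical, and the feature vectors stand in for intermediate discriminator activations.

```python
from typing import Sequence


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute element-wise differences between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def anomaly_score(
    x: Sequence[float],
    generated: Sequence[float],
    feat_x: Sequence[float],
    feat_generated: Sequence[float],
    lam: float = 0.1,  # assumed weight, not taken from the paper
) -> float:
    """Weighted sum of a residual term and a feature-matching term.

    x, generated           : flattened real sequence and its reconstruction G(z)
    feat_x, feat_generated : hypothetical intermediate discriminator features
    """
    residual = l1_distance(x, generated)               # how far G(z) is from x
    feature_gap = l1_distance(feat_x, feat_generated)  # feature-matching distance
    return (1.0 - lam) * residual + lam * feature_gap
```

A sequence that the generator reconstructs well, with matching discriminator features, scores near zero; a larger score suggests an anomaly. A threshold on the score would have to be chosen on validation data.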

Measurement of missing video frames in NPP control room monitoring system using Kalman filter

  • Mrityunjay Chaubey;Lalit Kumar Singh;Manjari Gupta
    • Nuclear Engineering and Technology / Vol. 55, No. 1 / pp.37-44 / 2023
  • Using the Kalman filtering technique, we propose a novel method for estimating the missing video frames to monitor the activities inside the control room of a nuclear power plant (NPP). The purpose of this study is to reinforce the existing security and safety procedures in the control room of an NPP. The NPP control room serves as the nervous system of the plant, with instrumentation and control systems used to monitor and control critical plant parameters. Because the safety and security of the NPP control room are critical, it must be monitored closely by security cameras in order to assess and reduce the onset of any incidents and accidents that could adversely impact the safety of the NPP. However, for a variety of technical and administrative reasons, continuous monitoring may be interrupted. Because of the interruption, one or more frames of the video may be distorted or missing, making it difficult to identify the activity during this time period. This could endanger overall safety. The demonstrated Kalman filter model estimates the value of the missing frame pixel-by-pixel using information from the frame that occurred in the video sequence before it and the frame that will occur in the video sequence after it. The results of the experiment provide evidence of the effectiveness of the algorithm.
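
As a rough illustration of estimating a missing frame pixel by pixel from its neighbours, the sketch below applies one scalar Kalman predict/update step per pixel. It is not the authors' model: it assumes a random-walk state model in which the preceding frame supplies the prediction and the following frame is treated as a noisy measurement. The function names and all variance values are hypothetical.

```python
from typing import List

Frame = List[List[float]]


def kalman_fill_pixel(
    prev_px: float,
    next_px: float,
    prior_var: float = 1.0,    # assumed uncertainty of the preceding frame
    process_var: float = 1.0,  # assumed drift between consecutive frames
    meas_var: float = 2.0,     # assumed noise when using the following frame
) -> float:
    """One scalar Kalman step: predict from the previous pixel, correct with the next."""
    predicted = prev_px                     # random-walk prediction
    predicted_var = prior_var + process_var
    gain = predicted_var / (predicted_var + meas_var)
    return predicted + gain * (next_px - predicted)


def estimate_missing_frame(prev_frame: Frame, next_frame: Frame) -> Frame:
    """Apply the scalar filter independently to every pixel position."""
    return [
        [kalman_fill_pixel(p, n) for p, n in zip(prev_row, next_row)]
        for prev_row, next_row in zip(prev_frame, next_frame)
    ]
```

With the default variances the gain is 0.5, so the estimate is the midpoint of the two neighbouring frames; other variance choices weight one neighbour more heavily.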

Best Practice on Automatic Toon Image Creation from JSON File of Message Sequence Diagram via Natural Language based Requirement Specifications

  • Hyuntae Kim;Ji Hoon Kong;Hyun Seung Son;R. Young Chul Kim
    • International journal of advanced smart convergence / Vol. 13, No. 1 / pp.99-107 / 2024
  • In AI image generation tools, most general users must craft an effective prompt (a query or statement) to elicit the desired response (image, result) from the AI model. We, however, are software engineers who focus on software processes. At the early stage of a process, we use informal and formal requirement specifications, and here we adapt the natural language approach to requirement engineering and toon engineering. Most generative AI tools do not produce the same image for the same query, because the same data asset is not used for the same query. To solve this problem, we intend to use informal requirement engineering and linguistics to create a toon. We therefore propose a sequence diagram and image generation mechanism that analyzes and applies key objects and attributes as an informal natural language requirement analysis. Morphemes and semantic roles are identified by analyzing natural language through linguistic methods. Based on the analysis results, a sequence diagram is generated, and an image is then generated from the diagram. We expect consistent image generation using the same image element asset through the proposed mechanism.

Implicit Treatment of Technical Specification and Thermal Hydraulic Parameter Uncertainties in Gaussian Process Model to Estimate Safety Margin

  • Fynan, Douglas A.;Ahn, Kwang-Il
    • Nuclear Engineering and Technology / Vol. 48, No. 3 / pp.684-701 / 2016
  • The Gaussian process model (GPM) is a flexible surrogate model that can be used for nonparametric regression for multivariate problems. A unique feature of the GPM is that a prediction variance is automatically provided with the regression function. In this paper, we estimate the safety margin of a nuclear power plant by performing regression on the output of best-estimate simulations of a large-break loss-of-coolant accident with sampling of safety system configuration, sequence timing, technical specifications, and thermal hydraulic parameter uncertainties. The key aspect of our approach is that the GPM regression is only performed on the dominant input variables, the safety injection flow rate and the delay time for AC powered pumps to start representing sequence timing uncertainty, providing a predictive model for the peak clad temperature during a reflood phase. Other uncertainties are interpreted as contributors to the measurement noise of the code output and are implicitly treated in the GPM in the noise variance term, providing local uncertainty bounds for the peak clad temperature. We discuss the applicability of the foregoing method to reduce the use of conservative assumptions in best estimate plus uncertainty (BEPU) and Level 1 probabilistic safety assessment (PSA) success criteria definitions while dealing with a large number of uncertainties.
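
For reference, the prediction variance and the noise variance term that the abstract refers to appear in the standard Gaussian process regression equations below. The notation is an assumption, not taken from the paper: $\mathbf{x}_*$ is a new input (here, safety injection flow rate and pump start delay), $\mathbf{y}$ holds the code outputs (peak clad temperatures) at the training inputs, $K$ is the kernel matrix over the training inputs, $\mathbf{k}_*$ is the vector of kernel values between $\mathbf{x}_*$ and the training inputs, and $\sigma_n^2$ is the noise variance that would absorb the uncertainties not used as regression inputs.

```latex
\begin{aligned}
\bar{f}_* &= \mathbf{k}_*^{\top}\left(K + \sigma_n^2 I\right)^{-1}\mathbf{y}, \\
\operatorname{Var}[f_*] &= k(\mathbf{x}_*, \mathbf{x}_*)
  - \mathbf{k}_*^{\top}\left(K + \sigma_n^2 I\right)^{-1}\mathbf{k}_*, \\
\operatorname{Var}[y_*] &= \operatorname{Var}[f_*] + \sigma_n^2 .
\end{aligned}
```

The last line is the variance of a new noisy code output, which is the quantity that would give local uncertainty bounds of the kind described in the abstract.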

Structural Analysis of Recombinant Human Preproinsulins by Structure Prediction, Molecular Dynamics, and Protein-Protein Docking

  • Jung, Sung Hun;Kim, Chang-Kyu;Lee, Gunhee;Yoon, Jonghwan;Lee, Minho
    • Genomics & Informatics / Vol. 15, No. 4 / pp.142-146 / 2017
  • More effective production of human insulin is important, because insulin is the main medication that is used to treat multiple types of diabetes and because many people are suffering from diabetes. The current system of insulin production is based on recombinant DNA technology, and the expression vector is composed of a preproinsulin sequence that is a fused form of an artificial leader peptide and the native proinsulin. It has been reported that the sequence of the leader peptide affects the production of insulin. To analyze how the leader peptide affects the maturation of insulin structurally, we adapted several in silico simulations using 13 artificial proinsulin sequences. Three-dimensional structures of models were predicted and compared. Although their sequences had few differences, the predicted structures were somewhat different. The structures were refined by molecular dynamics simulation, and the energy of each model was estimated. Then, protein-protein docking between the models and trypsin was carried out to compare how efficiently the protease could access the cleavage sites of the proinsulin models. The results showed some concordance with experimental results that have been reported; so, we expect our analysis will be used to predict the optimized sequence of artificial proinsulin for more effective production.

Protein Ontology: Semantic Data Integration in Proteomics

  • Sidhu, Amandeep S.;Dillon, Tharam S.;Chang, Elizabeth;Sidhu, Baldev S.
    • 한국생물정보학회: Conference Proceedings / 한국생물정보시스템생물학회, BIOINFO 2005 / pp.388-391 / 2005
  • Protein structural and functional conservation needs a common language for data definition. With the common language provided by Protein Ontology, the high level of sequence and functional conservation can be extended to all organisms, with the likelihood that proteins that carry out core biological processes will again be probable orthologues. The structural and functional conservation in these proteins presents both opportunities and challenges. The main opportunity lies in the possibility of automated transfer of protein data annotations from experimentally tractable model organisms to a less tractable organism based on protein sequence similarity. Such information can be used to improve human health or agriculture. The challenge lies in using a common language to transfer protein data annotations among different species of organisms. The first step in meeting this challenge is to produce a structured, precisely defined common vocabulary using Protein Ontology. The Protein Ontology described in this paper covers the sequence, structure and biological roles of Protein Complexes in any organism.


Modification of cell wall structural carbohydrate in the hybrid poplar expressing Medicago R2R3-MYB transcription factor MtMYB70

  • Kim, Sun Hee;Choi, Young Im;Jin, Hyunjung;Shin, Soo-Jeong;Park, Jong-Sug;Kwon, Mi
    • Journal of Plant Biotechnology / Vol. 42, No. 2 / pp.93-103 / 2015
  • The isolation, cloning, and characterization of an R2R3-MYB transcription factor gene (MtMYB70) from the model legume Medicago truncatula is reported. MtMYB70 consists of a 768-bp coding sequence corresponding to 255 amino acids. Sequence alignment revealed that MtMYB70 cDNA contains conserved R2R3-type MYB domains with highly divergent C-terminal regions. MtMYB70 was found to have relatively low sequence homology with known R2R3-MYB genes. Phylogenetic analysis placed the R2R3-MYB domain of MtMYB70 closest to PtMYB1, a known activator of lignin biosynthesis. Overexpression of MtMYB70 under the control of the 35S promoter in transgenic poplar did not cause a significant difference in total lignin content relative to the control, but glucan content was significantly increased in transgenic poplar. Therefore, MtMYB70 might have a regulatory role in the biosynthesis of cell wall structural carbohydrates.
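
The relation between the 768-bp coding sequence and the 255 amino acids can be checked with a short calculation. The sketch assumes that the reported length includes the stop codon, which the abstract does not state but which is consistent with the two numbers; the function name is hypothetical.

```python
def encoded_residues(cds_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by a coding sequence of cds_bp base pairs."""
    if cds_bp <= 0 or cds_bp % 3 != 0:
        raise ValueError("a coding sequence length must be a positive multiple of 3")
    codons = cds_bp // 3
    # The stop codon terminates translation and adds no residue.
    return codons - 1 if includes_stop else codons
```

Under that assumption, 768 bp gives 256 codons, one of which is the stop codon, leaving 255 residues.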

Prediction of DO Concentration in Nakdong River Estuary through Case Study Based on Long Short Term Memory Model

  • 박성식;김경회
    • 한국해안·해양공학회논문집 / Vol. 33, No. 6 / pp.238-245 / 2021
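  • In this study, we carried out case studies with an LSTM model to find the optimal model conditions and suitable predictor variables for predicting DO concentration in the Nakdong River estuary. In the model-parameter case study, Epoch = 300 and Sequence length = 1 gave relatively high accuracy. In the predictor-variable case study, using DO and water temperature as predictors gave the highest accuracy, which we attribute to the strong correlation between DO concentration and water temperature. From these results, we identified LSTM model conditions and predictor variables suitable for predicting DO concentration in the Nakdong River estuary.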
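
To make the "Sequence length" parameter concrete, the sketch below shows one common way to turn a multivariate time series into (input window, next-step target) pairs for an LSTM. It is an illustration only: the function name, the column layout (DO first, water temperature second), and the one-step-ahead target are assumptions, not details taken from the paper.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[List[List[float]], float]


def make_windows(
    rows: Sequence[Sequence[float]],
    seq_len: int,
    target_col: int = 0,  # assumed: column 0 holds DO concentration
) -> List[Sample]:
    """Pair each window of seq_len consecutive rows with the next row's target value."""
    if seq_len < 1:
        raise ValueError("seq_len must be at least 1")
    samples: List[Sample] = []
    for end in range(seq_len, len(rows)):
        window = [list(row) for row in rows[end - seq_len:end]]
        samples.append((window, rows[end][target_col]))
    return samples
```

With `seq_len=1`, the setting the abstract reports as relatively accurate, each sample contains only the most recent observation of DO and water temperature; larger values feed the model a longer history per prediction.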

Chemical Structure Study on Copolyterephthalates Based on Ethylene Glycol and 1,4-Cyclohexane Dimethanol by High Resolution NMR Analysis

  • 유희열;김상욱
    • 공업화학 / Vol. 4, No. 4 / pp.770-775 / 1993
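  • The chemical structure of poly(ethylene terephthalate-co-1,4-cyclohexylene dimethylene terephthalate), P(ET-CT), a terephthalic acid copolymer of ethylene glycol and 1,4-cyclohexanedimethanol, was investigated by high-resolution NMR analysis. In the $^{1}$H NMR analysis, the chemical shifts of the methylene groups separated into ET, trans CT, and cis CT signals, which allowed the copolymer composition and the isomer ratio to be determined. In the $^{13}$C NMR analysis, the signals of the benzene-ring carbons attached to the carbonyl group split into diads, and the copolymer sequence distribution obtained from them showed that the P(ET-CT) copolymer is a random copolymer. In addition, random statistics were used to obtain the distribution and average length of the copolymer sequences.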
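
The step from diad intensities to a verdict of "random copolymer" can be illustrated with the standard diad statistics for a two-component copolymer. This is a sketch under assumptions: the function and key names are hypothetical, the inputs are the relative intensities of the ET-ET, ET-CT (both orders combined), and CT-CT diads, and the abstract does not state that the authors used exactly these expressions.

```python
from typing import Dict


def diad_statistics(f_ee: float, f_ec: float, f_cc: float) -> Dict[str, float]:
    """Composition, number-average sequence lengths, and degree of randomness from diads."""
    total = f_ee + f_ec + f_cc
    if total <= 0 or f_ec <= 0:
        raise ValueError("diad intensities must be positive, with some ET-CT diads")
    f_ee, f_ec, f_cc = (v / total for v in (f_ee, f_ec, f_cc))  # normalize
    p_e = f_ee + f_ec / 2          # mole fraction of ET units
    p_c = f_cc + f_ec / 2          # mole fraction of CT units
    p_ec = f_ec / (2 * p_e)        # probability that an ET unit is followed by CT
    p_ce = f_ec / (2 * p_c)        # probability that a CT unit is followed by ET
    return {
        "mole_fraction_ET": p_e,
        "mole_fraction_CT": p_c,
        "avg_len_ET": 1 / p_ec,    # number-average ET sequence length
        "avg_len_CT": 1 / p_ce,    # number-average CT sequence length
        "randomness": p_ec + p_ce, # 1 for random, below 1 blocky, 2 alternating
    }
```

A degree of randomness close to 1 corresponds to a random copolymer; values well below 1 would indicate a blocky sequence distribution.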


THE LUMINOSITY FUNCTION AND INITIAL MASS FUNCTION FOR THE PLEIADES CLUSTER

  • LEE SEE WOO;SUNG HWANKYUNG
    • 천문학회지 / Vol. 28, No. 1 / pp.45-59 / 1995
  • In the best observed Pleiades cluster, the luminosity function (LF) and mass function (MF) for main sequence (MS) stars extended to $M_V \approx 15.5$ ($V \approx 21$) are very similar to the initial luminosity function (ILF) and initial mass function (IMF) for field stars in the solar neighborhood, showing a bump at $\log m \simeq -0.05$ and a dip at $\log m \simeq -0.12$. This dip is equivalent to the Wielen dip appearing in the LF for the field stars. The occurrence of this bump and dip is independent of the adopted mass-luminosity relation (MLR), and their characteristics could be explained by a time-dependent bimodal IMF. The model with this IMF gives a total cluster mass of $\sim 700\,M_\odot$, $\sim 25$ brown dwarfs, and $\sim 3$ white dwarfs if the upper mass limit of the progenitor of a white dwarf is greater than $4.5\,M_\odot$. The cluster age on the basis of the LF for the brightest stars is $\sim 8\times10^{7}$ yr, and all stars in the cluster lie along a single age sequence in the C-M diagram without showing a large dispersion from the sequence.
