• Title/Summary/Keyword: Generate Data

De Novo Drug Design Using Self-Attention Based Variational Autoencoder (Self-Attention 기반의 변분 오토인코더를 활용한 신약 디자인)

  • Piao, Shengmin;Choi, Jonghwan;Seo, Sangmin;Kim, Kyeonghun;Park, Sanghyun
    • KIPS Transactions on Software and Data Engineering / v.11 no.1 / pp.11-18 / 2022
  • De novo drug design is the process of developing new drugs that can interact with biological targets such as protein receptors. The traditional process consists of drug candidate discovery and drug development, and it takes more than 10 years to develop a new drug. Deep learning-based methods are being studied to shorten this period and to find chemical compounds for new drug candidates efficiently. Many existing deep learning-based drug design models use recurrent neural networks to generate chemical entities represented as SMILES strings, but recurrent networks train slowly and capture complex molecular formula rules poorly, so there is room for improvement. To overcome these shortcomings, we propose a deep learning model for SMILES string generation that uses a variational autoencoder with a self-attention mechanism. Compared with the latest drug design model, our proposed model reduced the training time to 1/26 and generated valid SMILES more effectively.
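
The abstract above does not describe the model's layers, so the following is only a minimal sketch of the generic scaled dot-product self-attention operation, not the authors' architecture. It assumes queries, keys, and values are all the raw input vectors (no learned projection weights), and the function name `self_attention` is hypothetical.

```python
import math

def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x.

    x is a list of token vectors (for example, embedded SMILES characters).
    Learned projections are omitted: this is an illustrative assumption.
    """
    d = len(x[0])
    out = []
    for q in x:
        # Similarity of this token to every token, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        # Numerically stable softmax over the scores.
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Each output vector is a weighted average of all token vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, x)) for j in range(d)])
    return out

tokens = [[1.0, 0.0], [0.0, 1.0]]
attended = self_attention(tokens)
```

Because every output position attends to all input positions at once, the operation has no sequential dependency across positions, which is one common reason attention-based models train faster than recurrent ones.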

Why Should I Ban You! : X-FDS (Explainable FDS) Model Based on Online Game Payment Log (X-FDS : 게임 결제 로그 기반 XAI적용 이상 거래탐지 모델 연구)

  • Lee, Young Hun;Kim, Huy Kang
    • Journal of the Korea Institute of Information Security & Cryptology / v.32 no.1 / pp.25-38 / 2022
  • With the diversification of payment methods and games, related financial accidents are causing serious problems for users and game companies. Recently, game companies have introduced a Fraud Detection System (FDS) for game payment systems to prevent financial incidents. However, an FDS requires constant changes to its detection patterns, which makes it ineffective, and it cannot provide clear evidence for its judgment results. In this paper, we analyze abnormal transactions in the payment log data of real game companies to generate related features. An autoencoder, an unsupervised learning model, was used to build a model for detecting abnormal transactions, which achieved over 85% accuracy. Using X-FDS (Explainable FDS) with XAI-SHAP, we found that the variables with the greatest explanatory power for anomaly detection were the transaction amount, the transaction medium, and the age of users. Based on X-FDS, we finally derived an improved detection model with an accuracy of 94% by fine-tuning the importance of features that adversely affect the proposed model.

FAST Design for Large-Scale Satellite Image Processing (대용량 위성영상 처리를 위한 FAST 시스템 설계)

  • Lee, Youngrim;Park, Wanyong;Park, Hyunchun;Shin, Daesik
    • Journal of the Korea Institute of Military Science and Technology / v.25 no.4 / pp.372-380 / 2022
  • This study proposes a distributed parallel processing system, called the Fast Analysis System for remote sensing daTa (FAST), for large-scale satellite image processing and analysis. FAST is a system that designs jobs in vertices and sequences, and distributes and processes them simultaneously. FAST manages data based on the Hadoop Distributed File System, controls entire jobs based on Apache Spark, and performs tasks in parallel on multiple slave nodes based on a Docker container design. FAST enables high-performance processing of progressively accumulated large-volume satellite images. Because each unit task runs in Docker, existing source code can be reused when designing and implementing unit tasks. Additionally, the system is robust against software and hardware faults. To demonstrate the capability of the proposed system, we performed an experiment that generates ortho-images from the original satellite images, which is a pre-processing step for all image analyses. In the experiment, when FAST was configured with eight slave nodes, processing a satellite image took less than 30 seconds. These results demonstrate the suitability and practical applicability of the FAST design.

Automatic Building Extraction Using SpaceNet Building Dataset and Context-based ResU-Net (SpaceNet 건물 데이터셋과 Context-based ResU-Net을 이용한 건물 자동 추출)

  • Yoo, Suhong;Kim, Cheol Hwan;Kwon, Youngmok;Choi, Wonjun;Sohn, Hong-Gyoo
    • Korean Journal of Remote Sensing / v.38 no.5_2 / pp.685-694 / 2022
  • Building information is essential for various urban spatial analyses. For this reason, continuous building monitoring is required, but it is a subject with many practical difficulties. To this end, research is being conducted to extract buildings from satellite images, which can be acquired continuously over a wide area. Recently, deep learning-based semantic segmentation techniques have been used. In this study, part of the structure of the context-based ResU-Net was modified, and training was conducted to automatically extract buildings from 30 cm Worldview-3 RGB images using SpaceNet's building v2 free open data. In the classification accuracy evaluation, the F1-score was higher than that of the 2nd SpaceNet competition winners. Therefore, if Worldview-3 satellite imagery can be continuously provided, it will be possible to use the building extraction results of this study to automatically generate building models around the world.
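
For readers unfamiliar with the metric, the F1-score is the harmonic mean of precision and recall. The sketch below computes it pixel-wise over flattened binary building masks. This is an assumption made for illustration: the abstract does not state whether the evaluation was pixel-wise or object-wise, and the function name and sample masks are hypothetical.

```python
def f1_score(pred, truth):
    """Pixel-wise F1 for binary masks given as flat sequences of 0/1."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)        # building, predicted building
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)    # background, predicted building
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)    # building, predicted background
    if tp == 0:
        return 0.0  # no correct building pixels: define F1 as zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical six-pixel masks: 2 true positives, 1 false positive, 1 false negative.
predicted = [1, 1, 0, 0, 1, 0]
reference = [1, 0, 0, 1, 1, 0]
score = f1_score(predicted, reference)
```

With these sample masks, precision and recall are both 2/3, so the F1-score is also 2/3.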

Literature Review on Private Health Insurance for Korean older adults (한국 노인대상 민간의료보험에 관한 선행연구 고찰)

  • Choi, Hyoung-Shim
    • Journal of the Korea Academia-Industrial cooperation Society / v.22 no.4 / pp.75-82 / 2021
  • The purpose of this study was to review the literature on private health insurance (PHI) schemes for Korean older adults published in Korea and abroad from 2015 to 2019, to present the implications of the study, and to provide an overview of PHI utilization by Korean older adults. Nineteen quantitative academic research papers were selected and analyzed with respect to presentation time, research purpose, data source, and study design. Old age, low income, low education, and chronic disease were found to hinder enrollment in PHI schemes. Although the relationship between PHI and medical usage was not consistent, PHI was positively related to health outcomes such as subjective health status and health-related quality of life. Therefore, for PHI to fulfill the role of providing supplementary medical care for older adults with high medical demands, a national improvement policy should be developed to address the barriers that prevent older adults from adopting PHI schemes, and empirical studies on the medical usage and health outcomes of older adults should be conducted to generate a basis for the use of PHI.

A SE Approach for Real-Time NPP Response Prediction under CEA Withdrawal Accident Conditions

  • Felix Isuwa, Wapachi;Aya, Diab
    • Journal of the Korean Society of Systems Engineering / v.18 no.2 / pp.75-93 / 2022
  • A data-driven machine learning (ML) meta-model is proposed as a surrogate model to reduce the excessive computational cost of the physics-based model and to facilitate real-time prediction of a nuclear power plant's (NPP's) transient response. To forecast the transient response, three ML meta-models based on recurrent neural networks (RNNs) are developed: Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), and a sequential combination of a Convolutional Neural Network (CNN) and LSTM. The chosen accident scenario is a control element assembly (CEA) withdrawal at power concurrent with a Loss Of Offsite Power (LOOP). The transient response was obtained using the best-estimate thermal-hydraulics code MARS-KS and cross-validated against the Design Control Document (DCD). The DAKOTA software is loosely coupled with the MARS-KS code via a Python interface to perform the Best Estimate Plus Uncertainty Quantification (BEPU) analysis and to generate a time-series database of the system response for training, testing, and validating the ML meta-models. Key uncertain parameters identified as required by the CASU methodology were propagated using non-parametric Monte Carlo (MC) random propagation and the Latin Hypercube Sampling technique until a statistically significant database (181 samples), as required by Wilks' fifth-order formula, was achieved with 95% probability and a 95% confidence level. The three RNN-based ML models were built and optimized with the help of the Talos tool and demonstrated excellent performance in forecasting the most probable NPP transient response. This research was guided by the Systems Engineering (SE) approach for the systematic and efficient planning and execution of the research.
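
The figure of 181 samples can be reproduced from Wilks' formula. The sketch below assumes the one-sided formulation, which the abstract does not state explicitly: it searches for the smallest sample size N such that the k-th largest of N random samples bounds the 95th percentile with 95% confidence. The function name `wilks_sample_size` is hypothetical.

```python
from math import comb

def wilks_sample_size(order, coverage=0.95, confidence=0.95):
    """Smallest N for a one-sided tolerance limit of the given order.

    The `order`-th largest sample bounds the `coverage` quantile if at least
    `order` samples exceed that quantile. Each sample does so independently
    with probability (1 - coverage).
    """
    n = order
    while True:
        # Probability that fewer than `order` samples exceed the quantile.
        miss = sum(
            comb(n, k) * (1 - coverage) ** k * coverage ** (n - k)
            for k in range(order)
        )
        if 1 - miss >= confidence:
            return n
        n += 1

first_order = wilks_sample_size(1)   # 59
fifth_order = wilks_sample_size(5)   # 181
```

Under this one-sided assumption, the first-order result is the familiar 59 runs, and the fifth-order result matches the 181 samples quoted in the abstract.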

A Systems Engineering Approach for Predicting NPP Response under Steam Generator Tube Rupture Conditions using Machine Learning

  • Tran Canh Hai, Nguyen;Aya, Diab
    • Journal of the Korean Society of Systems Engineering / v.18 no.2 / pp.94-107 / 2022
  • Accident prevention and mitigation is the highest priority of nuclear power plant (NPP) operation, particularly in the aftermath of the Fukushima Daiichi accident, which has reignited public anxiety and skepticism regarding nuclear energy usage. To deal with accident scenarios more effectively, operators must have ample and precise information about key safety parameters as well as their future trajectories. This work investigates the potential of machine learning in forecasting NPP response in real time to provide an additional validation method and help reduce human error, especially in accident situations where operators are under considerable stress. First, a base-case steam generator tube rupture (SGTR) simulation is carried out with the best-estimate code RELAP5/MOD3.4 to confirm the validity of the model against results reported in the APR1400 Design Control Document (DCD). Then, uncertainty quantification is performed by coupling RELAP5/MOD3.4 with the statistical tool DAKOTA to generate a sufficiently large dataset for the construction and training of neural network-based machine learning (ML) models, namely LSTM, GRU, and hybrid CNN-LSTM. Finally, the accuracy and reliability of these models in forecasting system response are tested by their performance on fresh data. To facilitate and oversee the process of developing the ML models, a Systems Engineering (SE) methodology is used to ensure that the work is consistently in line with the originating mission statement and that the findings obtained at each subsequent phase are valid.

Analysis of Efficiency and Productivity for Major Korean Seaports using PCA-DEA model (PCA-DEA 모델을 이용한 국내 주요항만의 효율성과 생산성 분석에 관한 연구)

  • Pham, Thi Quynh Mai;Kim, Hwayoung
    • Journal of Korea Port Economic Association / v.38 no.2 / pp.123-138 / 2022
  • Korea has made huge investments in its port system, annually upgrading its infrastructure to turn its ports into Asian hub ports. However, while Busan port is ranked fifth globally for container throughput, other Korean ports are ranked much lower. This article applies Data Envelopment Analysis (DEA) and the Malmquist Productivity Index (MPI) to evaluate the operational efficiency and productivity of selected major Korean seaports from 2010 to 2018. It further integrates Principal Component Analysis (PCA) into DEA, with the combined PCA-DEA model strengthening the basic DEA results, as the discriminatory power of DEA weakens when the number of variables exceeds the number of Decision Making Units (DMUs). Meanwhile, MPI is applied to measure the seaports' productivity over the years. The analyses generate efficiency and productivity rankings for Korean seaports. The results show that, except for the Gwangyang and Ulsan ports, none of the selected seaports is currently efficient enough in its operations. The study also indicates that technological progress has led to impactful changes in the productivity of Korean seaports.

Discrete Event Simulation based Equipment Combination Optimization Method - based on construction equipment performance estimation of the Construction Standard Production Rate - (이산형 이벤트 시뮬레이션 기반 최적의 건설장비 조합 도출 방법 제시 - 표준품셈 건설기계 시공능력 산식을 기반으로 -)

  • Ko, Yongho;Ngov, Kheang;Noh, Jaeyun;Kim, Yujin;Han, Seungwoo
    • Korean Journal of Construction Engineering and Management / v.23 no.6 / pp.21-29 / 2022
  • Productivity estimation of construction operations is crucial to successful project delivery. Especially in the preconstruction phase, the adequacy and effectiveness of plans directly affect the actual performance of operations. Currently, productivity is estimated by consulting references such as the Construction Standard Production Rate. However, such references make it difficult to reflect changing operating conditions promptly. Moreover, it is difficult to deduce the optimal combination of construction machinery for a given condition. This paper presents a simple simulation model that can be used to generate productivity data that account for site conditions and the construction equipment combination. The suggested method is expected to serve as a decision-support tool for practitioners who rely on estimates from the Construction Standard Production Rate when establishing construction plans that use heavy machinery such as backhoes, loaders, and dump trucks.
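
To illustrate the general idea of comparing equipment combinations by discrete event simulation, here is a minimal sketch of one loader serving a fleet of dump trucks. It is not the authors' model: the fixed loading and travel durations, the truck capacity, and the function name are hypothetical, whereas the paper bases equipment performance on the Construction Standard Production Rate formulas.

```python
import heapq

def simulate_earthmoving(n_trucks, load_min, travel_min, truck_m3, shift_min):
    """Return hauled volume (m3 per hour) for one loader serving n_trucks.

    All durations are in minutes and deterministic (an illustrative
    simplification). A load counts only if the truck is back within the shift.
    """
    # Event queue of (time the truck arrives at the loader, truck id).
    arrivals = [(0.0, truck) for truck in range(n_trucks)]
    heapq.heapify(arrivals)
    loader_free_at = 0.0
    loads_done = 0
    while arrivals:
        arrive, truck = heapq.heappop(arrivals)
        start = max(arrive, loader_free_at)   # queue if the loader is busy
        loader_free_at = start + load_min     # loader is occupied while loading
        back = loader_free_at + travel_min    # haul, dump, and return
        if back > shift_min:
            continue                          # cycle cannot finish in the shift
        loads_done += 1
        heapq.heappush(arrivals, (back, truck))
    return loads_done * truck_m3 / (shift_min / 60.0)

# Compare fleet sizes for a hypothetical 5-minute load and 15-minute round trip.
for fleet in (1, 4, 8):
    rate = simulate_earthmoving(fleet, 5.0, 15.0, 10.0, 60.0)
    print(fleet, "trucks:", rate, "m3/h")
```

In this toy setting, output stops improving once the loader is continuously busy, which is the kind of saturation effect a planner would look for when choosing an equipment combination.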

Analysis of Material Properties According to Compounding Conditions of Polymer Composites to Reduce Thermal Deformation (열변형 저감을 위한 고분자 복합소재 배합 조건에 따른 재료특성 분석)

  • Byun, Sangwon;Kim, Youngshin;Jeon, Euy sik
    • Journal of the Semiconductor & Display Technology / v.21 no.1 / pp.148-154 / 2022
  • As the fourth industrial age approaches, demand for semiconductors is increasing to the point that they are used in all electronic devices. At the same time, semiconductor technology is advancing day by day toward ultra-precision and low power consumption. As semiconductors keep getting smaller, their energy density increases and they generate heat, and this heat deforms the semiconductor package, so thermal management is important. Temperature changes arise not only from self-heating of the semiconductor package but also from heat generated by external damage. A deformed package must be managed because it leads to functional problems and performance degradation, such as damage. The package burn-in test, performed in the post-process of semiconductor production, tests the durability and function of the package in a high-temperature environment, and heat dissipation performance can be evaluated during it. In this paper, we review a new material formulation that can improve the performance of the adapter, one of the parts of the test socket used in the burn-in test. We examined the characteristics of the base material when polyamide, a polymer material, and alumina, which has high thermal conductivity, were mixed at each mixing ratio. A functional evaluation was also carried out by injection-molding an adapter, a part of the test socket, at the same time as the specimens were manufactured. We verified stiffness properties such as tensile and flexural strength for each mixing ratio, evaluated performance measures such as thermal conductivity, and checked warpage by manufacturing a dummy device. As a result, the thermal stability was confirmed to be excellent. This study is expected to serve as basic data for the future development of materials for burn-in sockets.