• Title/Summary/Keyword: 자동화된 머신러닝 (automated machine learning)

Search Result 69, Processing Time 0.026 seconds

Design of A New CAPTCHA System using Detecting Orientation of Polygonal Image (다각형 이미지의 방향 결정을 이용한 새로운 CAPTCHA 시스템의 설계)

  • Chung, WooKeun;Kim, JongWoo;Cho, HwanGue
    • Annual Conference of KIPS / 2010.04a / pp.766-769 / 2010
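  • CAPTCHA is a security technique that blocks automated sign-ups and account creation by spam tools and bots. It relies on the human ability to read particular text or images: users are asked to re-enter symbols or characters that computer programs generally find hard to decipher, which neutralizes automated spamming tools. Existing text-based systems, however, have the weakness that web bots or machine learning can pass them easily. To address this weakness, we previously proposed a new image-based CAPTCHA system, which cut a partial image out of an ordinary photograph, applied a random rotation, and required the user to correct its orientation. In this paper, we extract the partial image from an ordinary photograph in polygonal form, look for the sub-image shape that gives users a higher recognition rate, and propose a more effective and practical CAPTCHA system. The polygons considered are the square and the regular pentagon, hexagon, heptagon, and octagon. Through experiments, we will identify which of these five polygons is most effective for users.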
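
The rotate-and-correct check described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the 15-degree tolerance, and the representation of the user's answer as a counter-rotation in degrees are all assumptions, and the image cropping itself is omitted.

```python
import random

ALLOWED_SIDES = (4, 5, 6, 7, 8)  # polygon shapes listed in the abstract


def make_challenge(sides, rng):
    """Pick a polygon crop shape and a random rotation (in degrees) to apply."""
    if sides not in ALLOWED_SIDES:
        raise ValueError("expected a regular polygon with 4 to 8 sides")
    return {"sides": sides, "applied_rotation": rng.uniform(0.0, 360.0)}


def angular_error(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    diff = (a - b) % 360.0
    return min(diff, 360.0 - diff)


def is_human(challenge, user_correction, tolerance_deg=15.0):
    """Accept if the user's counter-rotation brings the image near upright.

    The tolerance is a hypothetical value chosen for illustration.
    """
    residual = (challenge["applied_rotation"] + user_correction) % 360.0
    return angular_error(residual, 0.0) <= tolerance_deg
```

A challenge rotated by 100 degrees is accepted for a correction of -100 degrees (or anything within the tolerance of it) and rejected for no correction at all.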

Analyzing K-POP idol popularity factors using music charts and new media data using machine learning (머신러닝을 활용한 음원 차트와 뉴미디어 데이터를 활용한 K-POP 아이돌 인기 요인 분석)

  • Jiwon Choi;Dayeon Jung;Kangkyu Choi;Taein Lim;Daehoon Kim;Jongkyn Jung;Seunmin Rho
    • Journal of Platform Technology / v.12 no.1 / pp.55-66 / 2024
  • The K-POP market has become influential not only in culture but across society as a whole, including diplomacy and environmental movements. As a result, various machine learning studies have sought to identify the success factors of idols using traditional data such as music and recordings. However, previous studies have not reflected the influence of new media platforms such as Instagram Reels, YouTube Shorts, TikTok, and Twitter on idol popularity. Because existing studies do not consider daily changing media trends, it is difficult to clarify the causal relationships behind recent idol success. To solve these problems, this paper proposes a data collection system and analysis methodology for idol-related data. By developing a container-based, automated real-time data collection system that reflects the specific characteristics of idol data, we secure stable and scalable data collection, and we compare and analyze clusters of successful idols with a K-Means clustering-based outlier detection model. As a result, we identified commonalities among successful idols such as gender, time to success after album release, and association with new media. These findings are expected to support planning optimal comeback promotions for each idol, album type, and comeback period, improving the chances of idol success.


Development of a water quality prediction model for mineral springs in the metropolitan area using machine learning (머신러닝을 활용한 수도권 약수터 수질 예측 모델 개발)

  • Yeong-Woo Lim;Ji-Yeon Eom;Kee-Young Kwahk
    • Journal of Intelligence and Information Systems / v.29 no.1 / pp.307-325 / 2023
  • Due to the prolonged COVID-19 pandemic, the number of people tired of living indoors who visit nearby mountains and national parks to relieve depression and lethargy has exploded. Mineral springs are where thousands of people who have come out into nature stop walking to catch their breath and rest. There are about 600 mineral springs in the metropolitan area, found not only in mountains and national parks but also occasionally in neighborhood parks and along trails. However, because water quality tests are irregular and manual, people drink the water without knowing the test results in real time. In this study, we therefore aim to develop a model that can predict spring water quality in real time by exploring the factors that affect it and collecting data scattered across various sources. After limiting the study area to Seoul and Gyeonggi-do because of data collection constraints, we obtained water quality test data from 2015 to 2020 for about 300 mineral springs in 18 cities where data management is well performed. From the various factors considered to affect the suitability of mineral spring water quality, 10 were finally selected after two rounds of review. Using AutoML, an automated machine learning technology that has recently been attracting attention, we derived the top 5 models by prediction performance from about 20 machine learning methods. Among them, the CatBoost model achieved the highest performance, with a classification accuracy of 75.26%. In addition, examining the absolute influence of each variable on the prediction with the SHAP method showed that the most important factor was whether the spring had been judged nonconforming in the previous water quality test. The temperature on the day of the inspection and the altitude of the mineral spring were also confirmed to influence whether the water quality was unsuitable.

A Study on the Failure Diagnosis of Transfer Robot for Semiconductor Automation Based on Machine Learning Algorithm (머신러닝 알고리즘 기반 반도체 자동화를 위한 이송로봇 고장진단에 대한 연구)

  • Kim, Mi Jin;Ko, Kwang In;Ku, Kyo Mun;Shim, Jae Hong;Kim, Kihyun
    • Journal of the Semiconductor & Display Technology / v.21 no.4 / pp.65-70 / 2022
  • In manufacturing and semiconductor industries, transfer robots increase productivity through accurate and continuous work. Due to the nature of the semiconductor process, there are environments where humans cannot intervene because the internal temperature and humidity of a clean room must be maintained, so transfer robots take over the work from humans. In such an environment, where process manpower is being cut, a lack of machine maintenance and management technology may adversely affect production, which is why a machine failure diagnosis system needs to be developed. This paper therefore identifies various causes of failure in transfer robots widely used in semiconductor automation, and considers the Prognostics and Health Management (PHM) method for determining and predicting failures. The robot mainly fails in the driving unit due to long-term repetitive motion, and the core components of the driving unit are the motor and the gear reducer. A simulated drive unit was built and tested around these components, and the approach was then applied to 6-axis vertical articulated robots used at actual industrial sites. Vibration data were collected for each cause of failure and then processed through signal processing and frequency analysis. Machine learning algorithms such as SVM (Support Vector Machine) and KNN (K-Nearest Neighbor) can then use the processed data to determine the robot's fault. As a result, a PHM environment was built on machine learning algorithms using SVM and KNN, confirming that failure prediction was partially possible.
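
The KNN step mentioned in this abstract can be illustrated with a small pure-Python classifier. This is a sketch under stated assumptions, not the paper's pipeline: the two features (dominant vibration frequency and RMS amplitude), the sample values, and the labels are hypothetical, and the features are left unscaled for brevity.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical features: (dominant vibration frequency in Hz, RMS amplitude).
train = [
    ((50.0, 0.10), "normal"),
    ((52.0, 0.12), "normal"),
    ((49.0, 0.09), "normal"),
    ((120.0, 0.45), "gear_fault"),
    ((118.0, 0.50), "gear_fault"),
    ((125.0, 0.40), "gear_fault"),
]

print(knn_predict(train, (51.0, 0.11)))
print(knn_predict(train, (121.0, 0.44)))
```

In practice the features would usually be standardized first, since Euclidean distance is otherwise dominated by whichever feature has the largest numeric range.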

Construction of Medical Image-Based Learning Data Support Platform for Machine Learning and Its Application of Sarcopenia Data AI (머신러닝을 위한 의료영상기반 학습 데이터 지원 플랫폼 구축 및 근감소증 데이터 AI 응용)

  • Kim, Ji-Eon;Lim, Dong Wook;Yu, Yeong Ju;Noh, Si-Hyeong;Lee, ChungSub;Kim, Tae-Hoon;Jeong, Chang-Won
    • Annual Conference of KIPS / 2021.11a / pp.434-436 / 2021
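  • The medical industry has developed technology centered on diagnosis and treatment. Recently, the paradigm of medical services has been shifting toward services, built on medical big data, that support not only diagnosis, treatment, and rehabilitation but also prevention and prognosis management. In particular, among the various medical platform technologies, platforms focused on diagnosis and prediction are being developed by applying medical images, which carry objective diagnostic indicators, to AI training. However, AI research requires large amounts of training data, and preparing data for training takes considerable time for preprocessing and classification tailored to the data's characteristics, so methods that resolve these problems are needed. This paper therefore proposes a platform that supports the workflow through to AI training: we develop an extended model for medical image data so that the data are standardized and converted under common conditions, then classified and stored according to an automated system structure. We verified the platform's performance through the management and application of sarcopenia training data. By supporting preprocessing, classification, and management of medical data, the proposed platform showed potential for use as a CDM-extended standard medical data platform.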

Dam Inflow Prediction and Evaluation Using Hybrid Auto-sklearn Ensemble Model (하이브리드 Auto-sklearn 앙상블 모델을 이용한 댐 유입량 예측 및 평가)

  • Lee, Seoro;Bae, Joo Hyun;Lee, Gwanjae;Yang, Dongseok;Hong, Jiyeong;Kim, Jonggun;Lim, Kyoung Jae
    • Proceedings of the Korea Water Resources Association Conference / 2022.05a / pp.307-307 / 2022
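  • Recently, the variability of dam inflow has increased due to various causes such as climate change and land-use change upstream of dams, making decisions on dam management and operation difficult. A method is therefore needed that predicts dam inflow accurately and efficiently while reflecting this variability. As machine learning has advanced, Auto-ML (Automated Machine Learning) has come into use in various fields. Auto-ML automates the entire process, including data preprocessing, selection of the optimal algorithm, hyperparameter tuning, and model training and evaluation. In hydrology, however, there are still few cases of using Auto-ML to develop dam inflow prediction models, and in particular no study has developed and evaluated an Auto-ML-based ensemble model that uses a hybrid combination approach accounting for the variability of high and low inflow to secure prediction accuracy. In this study, we used Auto-sklearn, one of the Auto-ML packages, to develop a hybrid ensemble dam inflow prediction model that reflects the inflow variability of the flood and non-flood seasons. Applied to Soyanggang Dam, the hybrid Auto-sklearn ensemble model achieved R² 0.868, RMSE 66.23 m³/s, and MAE 16.45 m³/s, performing better overall than the ensemble model built with a single Auto-sklearn run. In particular, the two models' predictions differed greatly in the low-flow and drought-flow segments of the FDC (Flow Duration Curve), where the hybrid Auto-sklearn model's predictions were closer to the observations. We attribute this to the hyperparameters of each model being optimized as the flood-season and non-flood-season ensemble models were built independently. The methodology of this study is expected to be useful not only for establishing ways to generate more accurate dam inflow predictions but also for building ensemble models on imbalanced datasets in various fields.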

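
The evaluation metrics and the season-based routing described in this abstract can be sketched as follows. This is an illustration only: it assumes R² denotes the coefficient of determination, the flood-season months (June to September) are a hypothetical choice, and `flood_model` and `dry_model` stand in for the two independently built Auto-sklearn ensembles as plain callables.

```python
import math


def rmse(obs, pred):
    """Root mean squared error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def r2(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def hybrid_predict(month, features, flood_model, dry_model,
                   flood_months=(6, 7, 8, 9)):
    """Route a sample to the flood-season or non-flood-season model.

    `flood_months` is an assumed definition of the flood season.
    """
    model = flood_model if month in flood_months else dry_model
    return model(features)
```

Training each seasonal model separately lets its hyperparameters be tuned to that flow regime, which is the reason the abstract gives for the hybrid model's advantage.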

Blockchain-based Important Information Management Techniques for IoT Environment (IoT 환경을 위한 블록체인 기반의 중요 정보 관리 기법)

  • Yoon-Su Jeong
    • Advanced Industrial SCIence / v.3 no.1 / pp.30-36 / 2024
  • Recently, the Internet of Things (IoT), which has been applied to various industrial fields, has been constantly evolving through automation and digitization. However, in networks where IoT devices are deployed, the sharing of IoT critical information, personal information protection, and data integrity among intermediate nodes are still being actively studied. In this study, we propose a blockchain-based IoT critical information management technique that is easy to implement and does not burden the intermediate nodes of the network in which the IoT devices are deployed. The proposed technique assigns a random value of random size to the IoT critical information arriving at an intermediate node and manages it as part of a decentralized P2P blockchain. In addition, the proposed technique makes IoT critical data easier to manage by creating licenses, such as time limits and device limits, according to the weight condition of the IoT critical information. In the performance evaluation, the proposed technique improved delay time and processing time by 7.6% and 10.1% on average, respectively, compared with existing techniques.
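
The idea of attaching a random value of random size to each piece of critical information and chaining the results can be sketched with a simple hash chain. This is a hypothetical illustration, not the scheme from the paper: the block layout, the 8 to 32 byte range for the random value, the use of SHA-256, and all function names are assumptions, and the P2P distribution and licensing parts are omitted.

```python
import hashlib
import secrets

GENESIS = b"\x00" * 32  # placeholder previous-hash for the first block


def block_hash(prev_hash, nonce, payload):
    """Hash one block; the nonce is length-prefixed to keep fields unambiguous."""
    data = prev_hash + len(nonce).to_bytes(1, "big") + nonce + payload
    return hashlib.sha256(data).digest()


def append_block(chain, payload):
    """Attach a random value of random size (8 to 32 bytes) and link the block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    nonce = secrets.token_bytes(8 + secrets.randbelow(25))
    chain.append({
        "prev_hash": prev_hash,
        "nonce": nonce,
        "payload": payload,
        "hash": block_hash(prev_hash, nonce, payload),
    })


def verify_chain(chain):
    """Recompute every hash and check that each block links to its predecessor."""
    prev_hash = GENESIS
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(prev_hash, block["nonce"], block["payload"]):
            return False
        prev_hash = block["hash"]
    return True
```

Because each hash covers the previous hash, changing any stored payload invalidates that block and every block after it.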

A Study on Atmospheric Data Anomaly Detection Algorithm based on Unsupervised Learning Using Adversarial Generative Neural Network (적대적 생성 신경망을 활용한 비지도 학습 기반의 대기 자료 이상 탐지 알고리즘 연구)

  • Yang, Ho-Jun;Lee, Seon-Woo;Lee, Mun-Hyung;Kim, Jong-Gu;Choi, Jung-Mu;Shin, Yu-mi;Lee, Seok-Chae;Kwon, Jang-Woo;Park, Ji-Hoon;Jung, Dong-Hee;Shin, Hye-Jung
    • Journal of Convergence for Information Technology / v.12 no.4 / pp.260-269 / 2022
  • In this paper, we propose an anomaly detection model using a deep neural network to automate the identification of outliers in national air pollution measurement network data, a task previously performed by experts. We generated training data by analyzing missing values and outliers in weather data provided by the Institute of Environmental Research. Based on BeatGAN, an unsupervised learning model, we propose a new model that changes the kernel structure and adds a convolutional filter layer and a transposed convolutional filter layer to improve anomaly detection performance. In addition, we used the generative capability of the proposed model to implement a retraining algorithm that generates new data and uses it for training; with it, the proposed model achieved the highest performance compared with the original BeatGAN model and other unsupervised learning models such as iForest and One-Class SVM. This study thus suggests a method for improving the anomaly detection performance of the proposed model while avoiding overfitting, at no additional cost, in situations where training data are insufficient due to factors such as sensor abnormalities and inspections at actual industrial sites.

Prediction of Landslides and Determination of Its Variable Importance Using AutoML (AutoML을 이용한 산사태 예측 및 변수 중요도 산정)

  • Nam, KoungHoon;Kim, Man-Il;Kwon, Oil;Wang, Fawu;Jeong, Gyo-Cheol
    • The Journal of Engineering Geology / v.30 no.3 / pp.315-325 / 2020
  • This study was performed to develop a landslide prediction model and to determine the variable importance of landslide susceptibility factors, based on probabilistic prediction of landslides occurring on slopes along roads. Field survey data from 30,615 slopes in Korea, collected from 2007 to 2020, were analyzed to develop the model. A total of 131 variables, comprising 17 topographic factors and 114 geological factors (including 89 bedrock types), were used to predict landslides. Automated machine learning (AutoML) was used to classify landslides and non-landslides. Verification showed that the best model, an extremely randomized tree (XRT), achieved a prediction rate of 83.977% on test data. The variable importance analysis of the landslide susceptibility factors identified 10 topographic factors and 9 geological factors, with importance presented as a percentage for each factor. The model evaluates the likelihood of landslide occurrence probabilistically and quantitatively, deriving the ranking of variable importance from on-site survey data alone. It is expected to provide decision-makers with a reliable basis for slope safety assessment through field surveys.

Study on High-speed Cyber Penetration Attack Analysis Technology based on Static Feature Base Applicable to Endpoints (Endpoint에 적용 가능한 정적 feature 기반 고속의 사이버 침투공격 분석기술 연구)

  • Hwang, Jun-ho;Hwang, Seon-bin;Kim, Su-jeong;Lee, Tae-jin
    • Journal of Internet Computing and Services / v.19 no.5 / pp.21-31 / 2018
  • Cyber penetration attacks can damage not only cyberspace but also entire infrastructures such as electricity, gas, water, and nuclear power, causing enormous harm to people's lives. Cyberspace has also already been defined as the fifth battlefield, so strategic responses are very important. Most recent cyber attacks are carried out with malicious code, and since more than 1.6 million samples appear per day, automated analysis technology that can cope with large volumes of malicious code is very important. However, it is difficult to deal with encryption, obfuscation, and packing of malicious code, and dynamic analysis is limited not only by its performance requirements but also in coping with virtual-environment evasion techniques. In this paper, we propose a machine learning-based malicious code analysis technique that addresses the weak detection performance of existing analysis technology while maintaining the lightweight, high-speed analysis needed for commercial endpoints. Evaluated on 71,000 normal files and malicious code samples in a commercial environment, the technique achieved 99.13% accuracy, 99.26% precision, and 99.09% recall, and analyzed more than 5 files per second in a PC environment. It can operate independently in the endpoint environment and is expected to complement existing antivirus technology and static and dynamic analysis technology. It is also expected to be used as a core element of EDR technology and malware variant analysis.
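
For reference, the three reported measures follow from the confusion-matrix counts as sketched below. The function name and the example counts are hypothetical; the abstract does not give the underlying counts, and "positive" is assumed here to mean "malicious".

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, and recall from confusion-matrix counts.

    tp: malicious files flagged as malicious
    fp: normal files flagged as malicious
    tn: normal files passed as normal
    fn: malicious files passed as normal
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = classification_metrics(tp=90, fp=10, tn=80, fn=20)
print(metrics)
```

Precision penalizes false alarms on normal files, while recall penalizes missed malicious files, so reporting both alongside accuracy gives a fuller picture of detector behavior.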