• Title/Abstract/Keyword: Automated machine learning

Search results: 192 (processing time: 0.027 s)

기계학습모델을 이용한 이상기상에 따른 사일리지용 옥수수 생산량에 미치는 피해 산정 (Calculation of Damage to Whole Crop Corn Yield by Abnormal Climate Using Machine Learning)

  • 김지융;최재성;조현욱;김문주;김병완;성경일
    • 한국초지조사료학회지
    • /
    • Vol. 43, No. 1
    • /
    • pp.11-21
    • /
    • 2023
  • This study used a machine-learning-based yield prediction model to estimate damage to whole crop corn (WCC) under the RCP 4.5 scenario and to present the results as a digital map. A total of 3,232 WCC records were collected from import adaptability test reports (n=1,219), National Institute of Animal Science research reports (n=1,294), the Journal of Animal Science and Technology (n=8), the Journal of the Korean Society of Grassland and Forage Science (n=707), and theses (n=4); weather data were obtained from the Korea Meteorological Administration's open meteorological data portal. Damage to WCC from abnormal climate was estimated by converting the monthly mean temperature and precipitation under the RCP 4.5 scenario to hourly units. Under normal weather, the predicted dry matter yield (DMY) ranged from 13,845 to 19,347 kg/ha. Damage from abnormal temperature in 2050 and 2100 was -263 to 360 and -1,023 to 92 kg/ha, respectively, and damage from abnormal precipitation in 2050 and 2100 was -17 to -2 and -12 to 2 kg/ha, respectively. DMY of WCC tended to increase as the monthly mean temperature rose. The WCC damage estimated under the RCP 4.5 scenario was presented as a digital map using QGIS. This study used a scenario that assumes greenhouse gas mitigation; further work should repeat the analysis with an RCP scenario without mitigation.
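The damage figures above are the difference between the model's DMY prediction under the RCP 4.5 scenario and under normal weather. A minimal sketch of that bookkeeping step, with hypothetical regional predictions (the function and region names are illustrative, not from the paper):

```python
def yield_damage(normal_dmy: dict, scenario_dmy: dict) -> dict:
    """Damage per region: scenario-predicted dry matter yield (DMY, kg/ha)
    minus the normal-weather prediction; negative values mean yield loss."""
    return {region: scenario_dmy[region] - normal_dmy[region]
            for region in normal_dmy}

# Hypothetical regional DMY predictions (kg/ha)
normal = {"Suwon": 15000, "Cheonan": 17200}
rcp45_2100 = {"Suwon": 14100, "Cheonan": 17280}
print(yield_damage(normal, rcp45_2100))  # {'Suwon': -900, 'Cheonan': 80}
```

A per-region damage dictionary like this is what would then be rendered as a digital map layer in QGIS.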

기계학습을 이용한 벼 수발아율 예측 (Predicting the Pre-Harvest Sprouting Rate in Rice Using Machine Learning)

  • 반호영;정재혁;황운하;이현석;양서영;최명구;이충근;이지우;이채영;윤여태;한채민;신서호;이성태
    • 한국농림기상학회지
    • /
    • Vol. 22, No. 4
    • /
    • pp.239-249
    • /
    • 2020
  • This study was conducted to develop an early system that simply predicts the pre-harvest sprouting (PHS) rate of rice for flour from weather variables under natural conditions using machine learning. Three rice varieties for flour were grown from 2017 to 2019 at six sites in Gangwon-do, Chungcheongbuk-do, and Gyeongsangbuk-do; heading date and the PHS rate after harvest were recorded. Daily mean temperature, relative humidity, and precipitation were collected from the synoptic weather stations of the corresponding regions. The PHS rate was predicted with a Gradient Boosting Machine (GBM), a machine learning model known for high accuracy, using mean temperature, mean relative humidity, and cumulative precipitation as input variables. To identify the period relevant to PHS damage, windows beginning a given number of days after heading and extending for varying lengths were also examined. The data were split into a training set and a validation set for calibrating this damage-related period, and a test set for verification. Calibration on the training and validation sets showed the highest score for the 24-day window beginning 22 days after heading. On the test set, the model slightly overestimated PHS rates below 3.0% but showed high predictive power (R2=0.76). These results indicate that the PHS rate can be predicted simply from weather variables over a specific period using machine learning, and that the system could serve as an early warning system to help farmers prevent PHS damage. However, because the PHS-related period differs among varieties owing to differences in dormancy, examining additional rice varieties for flour and analyzing each variety separately should yield a more accurate prediction system.
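The GBM's three inputs are aggregates of daily weather over a window calibrated to start 22 days after heading and run for 24 days. A minimal sketch of that aggregation step, assuming daily records indexed from the heading date (the record keys are illustrative, not from the paper):

```python
from statistics import mean

def window_features(daily, start_day, length):
    """Aggregate daily weather over the window that begins `start_day`
    days after heading and lasts `length` days: mean temperature,
    mean relative humidity, and total precipitation -- the three
    inputs fed to the GBM model."""
    window = daily[start_day:start_day + length]
    return (round(mean(d["tavg"] for d in window), 2),
            round(mean(d["rh"] for d in window), 2),
            round(sum(d["rain"] for d in window), 2))

# Hypothetical daily records indexed from the heading date; the
# calibrated window starts 22 days after heading and lasts 24 days.
daily = [{"tavg": 20 + 0.1 * i, "rh": 70.0, "rain": 1.5} for i in range(60)]
print(window_features(daily, start_day=22, length=24))
```

Calibration would repeat this aggregation over candidate (start_day, length) pairs and keep the window that scores best on the validation set.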

Advanced neuroimaging techniques for evaluating pediatric epilepsy

  • Lee, Yun Jeong
    • Clinical and Experimental Pediatrics
    • /
    • Vol. 63, No. 3
    • /
    • pp.88-95
    • /
    • 2020
  • Accurate localization of the seizure onset zone is important for better seizure outcomes and preventing deficits following epilepsy surgery. Recent advances in neuroimaging techniques have increased our understanding of the underlying etiology and improved our ability to noninvasively identify the seizure onset zone. Using epilepsy-specific magnetic resonance imaging (MRI) protocols, structural MRI allows better detection of the seizure onset zone, particularly when it is interpreted by experienced neuroradiologists. Ultra-high-field imaging and postprocessing analysis with automated machine learning algorithms can detect subtle structural abnormalities in MRI-negative patients. Tractography derived from diffusion tensor imaging can delineate white matter connections associated with epilepsy or eloquent function, thus preventing deficits after epilepsy surgery. Arterial spin-labeling perfusion MRI, simultaneous electroencephalography (EEG)-functional MRI (fMRI), and magnetoencephalography (MEG) are noninvasive imaging modalities that can be used to localize the epileptogenic foci and, together with positron emission tomography, ictal single-photon emission computed tomography, and intracranial EEG monitoring, assist in planning epilepsy surgery. MEG and fMRI can localize and lateralize the area of the cortex that is essential for language, motor, and memory function and identify its relationship with planned surgical resection sites to reduce the risk of neurological impairments. These advanced structural and functional imaging modalities can be combined with postprocessing methods to better understand the epileptic network and obtain valuable clinical information for predicting long-term outcomes in pediatric epilepsy.

RandomForest와 XGBoost를 활용한 한국어 텍스트 분류: 서울특별시 응답소 민원 데이터를 중심으로 (Korean Text Classification Using Randomforest and XGBoost Focusing on Seoul Metropolitan Civil Complaint Data)

  • 하지은;신현철;이준기
    • 한국빅데이터학회지
    • /
    • Vol. 2, No. 2
    • /
    • pp.95-104
    • /
    • 2017
  • In 2014, the Seoul Metropolitan Government launched the 'Seoul Eungdapso' service with the goal of responding promptly to citizens' voices. Each received complaint is categorized from its content and routed to the responsible department; automating this step would reduce time and labor costs. This study collected 17,700 complaint cases spanning seven years, from June 1, 2010 to May 31, 2017, and compared the much-discussed XGBoost model against the established RandomForest model to assess their suitability for Korean text classification. XGBoost was generally more accurate than RandomForest. After up-sampling and down-sampling on the same samples, RandomForest's accuracy became unstable, whereas XGBoost remained consistently stable.
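The up- and down-sampling compared above can be sketched as follows; this is a generic illustration of the two balancing schemes, not the authors' code (the department names and seed are hypothetical):

```python
import random

def downsample(docs_by_class, seed=0):
    """Down-sampling: trim every class to the size of the smallest class
    so the classifier trains on a balanced set."""
    rng = random.Random(seed)
    n = min(len(docs) for docs in docs_by_class.values())
    return {c: rng.sample(docs, n) for c, docs in docs_by_class.items()}

def upsample(docs_by_class, seed=0):
    """Up-sampling: grow every class to the size of the largest class
    by drawing documents with replacement."""
    rng = random.Random(seed)
    n = max(len(docs) for docs in docs_by_class.values())
    return {c: [rng.choice(docs) for _ in range(n)]
            for c, docs in docs_by_class.items()}

# Hypothetical complaint documents grouped by department
complaints = {"transport": ["d1", "d2", "d3", "d4"], "housing": ["d5", "d6"]}
balanced = downsample(complaints)
print({c: len(v) for c, v in balanced.items()})  # {'transport': 2, 'housing': 2}
```

Retraining both classifiers on the balanced sets is what exposes the stability difference the abstract reports.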


Accurate and Efficient Log Template Discovery Technique

  • Tak, Byungchul
    • 한국컴퓨터정보학회논문지
    • /
    • Vol. 23, No. 10
    • /
    • pp.11-21
    • /
    • 2018
  • In this paper we propose a novel log template discovery algorithm that achieves high quality of discovered log templates through an iterative log filtering technique. Log templates are the static string patterns from which actual logs are produced by inserting variable values at runtime. Correctly assigning individual logs to their template category enables automated analysis using state-of-the-art machine learning techniques. Our technique examines a group of logs column-wise and keeps the logs carrying the value with the highest proportion in each column. We repeat this process for each column until we are left with a highly homogeneous set of logs that most likely belong to the same template category. Then we determine which columns are the static part and which are the variable part by vertically comparing all the logs in the group. This process repeats until all templates have been discovered from the given logs. During this process we also discover custom patterns, such as ID formats, that are unique to the application; this information helps us quickly identify such strings in the logs as variable parts, further increasing the accuracy of the discovered templates. Existing solutions produce templates that are too general or too specific because they cannot detect custom patterns. Extensive evaluations show that our proposed method achieves 2 to 20 times better accuracy.
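The column-wise filtering described above can be sketched roughly as follows; the proportion threshold is an assumed parameter, and this single-pass sketch omits the paper's iteration over the remaining logs and its custom-pattern detection:

```python
from collections import Counter

def discover_template(logs, threshold=0.5):
    """Sketch of one round of column-wise majority filtering: in each
    column, if one token dominates, keep only the logs carrying it;
    columns where no token dominates are treated as the variable part
    of the template (threshold is an assumed parameter)."""
    rows = [line.split() for line in logs]
    width = min(len(r) for r in rows)
    for col in range(width):
        token, count = Counter(r[col] for r in rows).most_common(1)[0]
        if count / len(rows) > threshold:  # static-part candidate
            rows = [r for r in rows if r[col] == token]
    # vertically compare the surviving logs: unanimous columns are
    # static text, the rest are variable slots
    template = []
    for col in range(width):
        values = {r[col] for r in rows}
        template.append(values.pop() if len(values) == 1 else "<*>")
    return " ".join(template)

logs = ["conn from 10.0.0.1 closed",
        "conn from 10.0.0.2 closed",
        "conn from 10.0.0.3 closed"]
print(discover_template(logs))  # conn from <*> closed
```

The full algorithm would remove the matched logs and repeat on the remainder until every log is assigned to a template.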

바이그램이 문서범주화 성능에 미치는 영향에 관한 연구 (A Study on the Effectiveness of Bigrams in Text Categorization)

  • 이찬도;최준영
    • Journal of Information Technology Applications and Management
    • /
    • Vol. 12, No. 2
    • /
    • pp.15-27
    • /
    • 2005
  • Text categorization systems generally use single words (unigrams) as features. A deceptively simple algorithm for improving text categorization is investigated here, an idea previously shown not to work: identifying useful word pairs (bigrams) made up of adjacent unigrams. The bigrams found, while small in number, can substantially raise the quality of feature sets. The algorithm was tested on two pre-classified datasets, Reuters-21578 for English and Korea-web for Korean. The results show that the algorithm successfully extracted high-quality bigrams and increased the quality of the overall features. To examine the role of bigrams, we trained Naïve Bayes classifiers using both unigrams and bigrams as features. Recall values were higher than with unigrams alone. Break-even points and F1 values improved for most documents, especially when documents were classified into the large classes. On Reuters-21578, break-even points increased by 2.1% (at most 18.8%), and F1 improved by 1.5% (at most 3.2%). On Korea-web, break-even points increased by 1.0% (at most 4.5%), and F1 improved by 0.4% (at most 4.2%). We conclude that text classification using unigrams and bigrams together is more effective than using unigrams alone.
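Forming candidate bigrams from adjacent unigrams can be sketched as below; the raw frequency cutoff is a stand-in for the paper's usefulness scoring of word pairs:

```python
from collections import Counter

def extract_bigrams(documents, min_count=2):
    """Form candidate bigrams from adjacent unigrams across a corpus and
    keep those occurring at least min_count times (a hypothetical cutoff;
    the paper scores pairs for usefulness rather than raw frequency)."""
    counts = Counter()
    for doc in documents:
        tokens = doc.lower().split()
        counts.update(zip(tokens, tokens[1:]))  # adjacent word pairs
    return {" ".join(pair) for pair, c in counts.items() if c >= min_count}

docs = ["interest rate rises", "the interest rate fell", "rate of change"]
print(extract_bigrams(docs))  # {'interest rate'}
```

The surviving bigrams would then be added alongside the unigram features before training the Naïve Bayes classifier.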


The study of a full cycle semi-automated business process re-engineering: A comprehensive framework

  • Lee, Sanghwa;Sutrisnowati, Riska A.;Won, Seokrae;Woo, Jong Seong;Bae, Hyerim
    • 한국컴퓨터정보학회논문지
    • /
    • Vol. 23, No. 11
    • /
    • pp.103-109
    • /
    • 2018
  • This paper presents an idea and framework for automating full-cycle business process management and re-engineering by integrating traditional business process management systems, process mining, data mining, machine learning, and simulation. We build our framework on a cloud-based platform so that various data sources can be incorporated, and we design the system to be extensible, making it beneficial not only for BPM practitioners but also for researchers, who can use it as a test bed without the complication of system integration. The automation of the redesign phase and the selection of a baseline process model for deployment are the two main contributions of this study. In the redesign phase, we combine analysis of the existing process model with what-if analysis of how to improve the process. Additionally, improving a business process must be applied case by case, which requires much trial and error and large amounts of data. In selecting the baseline process model, we compare many probable routes of business execution and determine the most efficient one with respect to production cost and execution time. We also discuss the challenges and limitations of the framework, including the system's adoptability, technical difficulties, and human factors.

TANFIS Classifier Integrated Efficacious Assistance System for Heart Disease Prediction using CNN-MDRP

  • Bhaskaru, O.;Sreedevi, M.
    • International Journal of Computer Science & Network Security
    • /
    • Vol. 22, No. 10
    • /
    • pp.171-176
    • /
    • 2022
  • A dramatic rise in the number of people dying from heart disease has prompted efforts to identify it sooner using efficient approaches. Many variables, including hereditary factors, contribute to the condition. Current estimation approaches use automated diagnostic systems that fail to attain high accuracy because they include irrelevant dataset information. This paper presents an effective neural network with convolutional layers for classifying highly class-imbalanced clinical data. Traditional approaches rely on massive amounts of data rather than precise predictions, so data must be selected carefully to achieve an earlier prediction process; partially complete data is a setback for analysis. Moreover, feature extraction is a major challenge in classification and prediction, since larger datasets increase the training time of traditional machine learning classifiers. This work integrates the CNN-MDRP classifier (convolutional neural network (CNN)-based efficient multimodal disease risk prediction) with TANFIS (tuned adaptive neuro-fuzzy inference system) for earlier, accurate prediction. Data cleaning transforms partial records in the dataset into informative data. The TANFIS tuning parameters are then improved using a Laplace Gaussian mutation-based grasshopper and moth flame optimization approach (LGM2G). The proposed approach yields a prediction accuracy of 98.40 percent when compared with current algorithms.

Diagnosing a Child with Autism using Artificial Intelligence

  • Alharbi, Abdulrahman;Alyami, Hadi;Alenzi, Saleh;Alharbi, Saud;Bassfar, Zaid
    • International Journal of Computer Science & Network Security
    • /
    • Vol. 22, No. 6
    • /
    • pp.145-156
    • /
    • 2022
  • Children are the foundation and future of society, so understanding their impressions and behaviors is very important. A child's behavioral problems are a burden on the family and society and hinder the child's development, and early diagnosis of these problems helps to solve or mitigate them. In this research project we aim to understand children's behaviors through artificial intelligence algorithms, which have helped solve many complex problems in automated systems. The technique reads and analyzes a child's behaviors and feelings from facial features, body movement, posture, and nervous emotions; by analyzing these factors we can predict feelings and behaviors such as grief, tension, happiness, and anger, and determine whether the child is on the autism spectrum. The scarcity of studies, together with the privacy and scarcity of data on these behaviors and feelings, limited the analysis and training of the presented model to a set of images, videos, and audio recordings that could be collected. The resulting model helps in understanding children's feelings and behaviors and helps doctors and specialists understand them as well.

Differentiation of Legal Rules and Individualization of Court Decisions in Criminal, Administrative and Civil Cases: Identification and Assessment Methods

  • Egor, Trofimov;Oleg, Metsker;Georgy, Kopanitsa
    • International Journal of Computer Science & Network Security
    • /
    • Vol. 22, No. 12
    • /
    • pp.125-131
    • /
    • 2022
  • The diversity and complexity of the criminal, administrative, and civil cases resolved by the courts make it difficult to develop universal automated tools for the analysis and evaluation of justice. However, the big data generated within the justice system gives hope that this problem can be resolved. Applying big data makes it possible to identify typical options for resolving cases, form detailed rules for the individualization of a court decision, and correlate these rules with the abstract provisions of law. This approach allows us to somewhat overcome the contradiction between the abstract and the concrete in law, to automate the analysis of justice, and to model e-justice for scientific and practical purposes. The article presents the results of using dimension reduction, SHAP values, and p-values to identify, analyze, and evaluate the individualization of justice and the differentiation of legal regulation. Processing and analyzing arrays of court decisions with computational methods makes it possible to identify the courts' typical views on questions of fact and questions of law. This automatically obtained knowledge is promising for the scientific study of justice, the improvement of the prescriptions of the law, and the probabilistic prediction of a court decision from a known set of facts.