• Title/Summary/Keyword: machine learning methods (기계학습법)

Classification of Transport Vehicle Noise Events in Magnetotelluric Time Series Data in an Urban area Using Random Forest Techniques (Random Forest 기법을 이용한 도심지 MT 시계열 자료의 차량 잡음 분류)

  • Kwon, Hyoung-Seok;Ryu, Kyeongho;Sim, Ickhyeon;Lee, Choon-Ki;Oh, Seokhoon
    • Geophysics and Geophysical Exploration / v.23 no.4 / pp.230-242 / 2020
  • We performed a magnetotelluric (MT) survey to delineate the geological structures below a depth of 20 km in the Gyeongju area, where a magnitude 5.8 earthquake occurred in September 2016. The measured MT data were severely distorted by electrical noise from subways, power lines, factories, houses, and farmlands, and by vehicle noise from passing trains and large trucks. Using machine-learning methods, we classified the MT time series data obtained near the railway and highway into two groups according to whether they contained traffic noise. We applied three schemes, stochastic gradient descent, support vector machine, and random forest, to the time series data containing high-speed train noise. For the large-truck noise, we formulated three datasets, Hx, Hy, and Hx & Hy, and applied the random forest method to each. To evaluate the effect of removing the traffic noise, we compared the time series data, amplitude spectra, and apparent resistivity curves before and after removal. We also examined the frequency range affected by traffic noise and whether the removal process introduced artifact noise as a result of the residual difference.

A Technique to Recommend Appropriate Developers for Reported Bugs Based on Term Similarity and Bug Resolution History (개발자 별 버그 해결 유형을 고려한 자동적 개발자 추천 접근법)

  • Park, Seong Hun;Kim, Jung Il;Lee, Eun Joo
    • KIPS Transactions on Software and Data Engineering / v.3 no.12 / pp.511-522 / 2014
  • During software development, a variety of bugs are reported. Several bug tracking systems, such as Bugzilla, MantisBT, Trac, and JIRA, are used to manage reported bug information in many open source development projects. Bug reports in a bug tracking system are triaged to manage bugs and to determine the developer responsible for resolving each report. As software grows in size and bug reports tend to be duplicated, bug triage becomes increasingly complex and difficult. In this paper, we present an approach to assigning bug reports to appropriate developers, which is a main part of the bug triage task. First, the words that appear in resolved bug reports are classified by developer. Second, words in newly reported bugs are selected. After these two steps, vectors whose items are the selected words are generated. Third, the TF-IDF (term frequency-inverse document frequency) of each selected word is computed and used as the weight of the corresponding vector item. Finally, developers are recommended based on the similarity between each developer's word vector and the vector of the new bug report. We conducted an experiment on the Eclipse JDT and CDT projects to show the applicability of the proposed approach. We also compared the proposed approach with an existing study based on machine learning. The experimental results show that the proposed approach is superior to the existing method.
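
    The steps in this abstract (per-developer word lists, TF-IDF weights, vector similarity) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the smoothed IDF formula, the use of cosine similarity, and the names `tf_idf_vectors`, `cosine_similarity`, and `recommend_developers` are all assumptions, and the sample words are invented.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Turn each token list into a {word: TF-IDF weight} dictionary."""
    n_docs = len(docs)
    # Document frequency: in how many documents each word appears.
    doc_freq = Counter(word for doc in docs for word in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid dividing by zero for an empty document
        vectors.append({
            # Smoothed IDF (an assumption) keeps every weight positive.
            word: (count / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[word])) + 1)
            for word, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse word vectors."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend_developers(resolved_words, new_report_words):
    """Rank developers by similarity between their words and the new report."""
    developers = list(resolved_words)
    docs = [resolved_words[d] for d in developers] + [new_report_words]
    vectors = tf_idf_vectors(docs)
    new_vector = vectors[-1]
    scores = {d: cosine_similarity(v, new_vector)
              for d, v in zip(developers, vectors)}
    return sorted(developers, key=scores.get, reverse=True), scores


# Hypothetical data: words from bug reports each developer has resolved.
resolved_words = {
    "alice": "null pointer exception in parser parser crash".split(),
    "bob": "editor toolbar icon rendering glitch".split(),
}
new_report_words = "parser crash on null input".split()
ranking, scores = recommend_developers(resolved_words, new_report_words)
```

    Here `ranking` puts "alice" first, because only her resolved reports share words with the new report.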

Factors influencing metabolic syndrome perception and exercising behaviors in Korean adults: Data mining approach (대사증후군의 인지와 신체활동 실천에 영향을 미치는 요인: 데이터 마이닝 접근)

  • Lee, Soo-Kyoung;Moon, Mikyung
    • Journal of the Korea Academia-Industrial cooperation Society / v.18 no.12 / pp.581-588 / 2017
  • This study, conducted from July 2014 to December 2015, sought to determine which factors predict metabolic syndrome (MetS) perception and exercise by applying a machine learning classifier, the Extreme Gradient Boosting algorithm (XGBoost). Data were obtained from the Korean Community Health Survey (KCHS), which represents community-dwelling Korean adults aged 19 years and older, for 2009 to 2013. The dataset includes 370,430 adults. Outcomes were categorized as follows based on the perception of MetS and physical activity (PA): Stage 1 (no perception, no PA), Stage 2 (perception, no PA), and Stage 3 (perception, PA). Features common to all questionnaires over the 5 years were selected for modeling. Overall, there were 161 features, all categorical except for age and the visual analogue scale (EQ-VAS). We used the Extreme Gradient Boosting algorithm in R to build a model for predicting factors and achieved a prediction accuracy of 0.735. The top 10 predictive factors in Stage 3 were: age, education level, attempt to control weight, EQ mobility, nutrition label checks, private health insurance, EQ-5D usual activities, anti-smoking advertising, EQ-VAS, education in health centers for diabetes, and dental care. In conclusion, the results showed that XGBoost can be used to identify factors influencing disease prevention and management using healthcare big data.

An Electric Load Forecasting Scheme with High Time Resolution Based on Artificial Neural Network (인공 신경망 기반의 고시간 해상도를 갖는 전력수요 예측기법)

  • Park, Jinwoong;Moon, Jihoon;Hwang, Eenjun
    • KIPS Transactions on Software and Data Engineering / v.6 no.11 / pp.527-536 / 2017
  • With the recent development of the smart grid industry, the need for an efficient EMS (Energy Management System) has increased. In particular, reducing electric load and energy cost requires sophisticated electric load forecasting and an efficient smart grid operation strategy. In this paper, for more accurate electric load forecasting, we extend the data collected at demand time to a high time resolution and construct an artificial neural network-based forecasting model suited to the high time resolution data. Furthermore, to improve forecasting accuracy, time series data in sequence form are transformed into continuous data in two-dimensional space, which solves the problem that machine learning methods cannot reflect the periodicity of time series data. In addition, to account for external factors such as temperature and humidity at the same time resolution, we estimate their values at that resolution using linear interpolation. We then apply the PCA (Principal Component Analysis) algorithm to the feature vector composed of external factors to remove data that have little correlation with the power data. Finally, we evaluate our model through 5-fold cross-validation. The results show that forecasting based on a higher time resolution improves accuracy, and the best error rate of 3.71% was achieved at the 3-min resolution.
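
    Two of the preprocessing ideas in this abstract can be illustrated with a short sketch. It assumes that the two-dimensional transformation is a sine/cosine encoding on a circle, which is one common way to make periodic time values continuous; the abstract does not state the exact mapping. The function names, the 3-minute step, and the temperature samples are hypothetical.

```python
import math


def encode_cyclic(value, period):
    """Map a periodic value (e.g. minute of day) to a point on the unit circle.

    Assumed encoding: values just before and just after the wrap-around
    (23:59 and 00:00) end up close together in the 2-D space.
    """
    angle = 2 * math.pi * value / period
    return (math.sin(angle), math.cos(angle))


def linear_interpolate(samples, step):
    """Resample (time, value) pairs onto a finer grid by linear interpolation.

    `samples` must be sorted by time; `step` is assumed to divide each
    interval between consecutive samples evenly.
    """
    out = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        t = t0
        while t < t1:
            out.append((t, v0 + (v1 - v0) * (t - t0) / (t1 - t0)))
            t += step
    out.append(samples[-1])  # keep the final observed sample
    return out


# Hypothetical hourly temperatures (minute, degrees C) resampled to 3 minutes.
hourly_temperature = [(0, 20.0), (60, 23.0)]
fine_temperature = linear_interpolate(hourly_temperature, step=3)

midnight = encode_cyclic(0, period=1440)
just_before_midnight = encode_cyclic(1439, period=1440)
noon = encode_cyclic(720, period=1440)
```

    With this encoding, `just_before_midnight` lies next to `midnight`, whereas the raw minute values 1439 and 0 are far apart.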

Integrative Review on Nursing education Adopting Virtual Reality Convergence Simulation (간호교육에 적용한 가상현실 융합시뮬레이션 연구에 대한 통합적 고찰)

  • Kang, Sujeong;Kim, Chunmi;Lee, Hung Sa;Nam, Jae-Woo;Park, Myung Sook
    • Journal of Convergence for Information Technology / v.10 no.1 / pp.60-74 / 2020
  • Nursing education using virtual reality simulation (VRS) has emerged as a new teaching method for improving nursing students' knowledge as well as their competency in clinical nursing skills. The purpose of this study was to analyze the effects of nursing education using VRS through an integrative analysis of quantitative and qualitative research. Through quality assessment of a total of 382 studies, 17 studies (12 quantitative and 5 qualitative) were finally selected. The contents of the 17 studies were reviewed, and those relating to four aspects were gathered: the conditions, knowledge, and attitude for effective education using VRS, and the effects of nursing education using VRS on practice. Readiness to use the virtual reality device, mastery of the platform, and interesting scenarios were required conditions for effective education. The effects of nursing education adopting virtual reality convergence simulation in terms of knowledge, attitude, and practice included enhancement and extension of knowledge, improvement in memorizing the process and sequence of the practice through repetitive education, and development of empathy and formation of rapport. Hence, adopting virtual reality in convergence simulation for nursing education can maximize the effect of the education.

Oil Spill Visualization and Particle Matching Algorithm (유출유 이동 가시화 및 입자 매칭 알고리즘)

  • Lee, Hyeon-Chang;Kim, Yong-Hyuk
    • Journal of the Korea Convergence Society / v.11 no.3 / pp.53-59 / 2020
  • Initial response is important in marine oil spills, such as the Hebei Spirit oil spill, but it is very difficult to predict the movement of oil on the ocean, where many variables are involved. To address this problem, we forecast oil spill movement by extending particle prediction, an existing line of work that studies the movement of floats on the sea using float data. From ocean data in HDF5 format, the current and wind velocity at a specific location were extracted using bilinear interpolation; the movement of numerous points was then predicted as particles, and the results were visualized using polygons and heat maps. In addition, we propose a spilled oil particle matching algorithm to compensate for the lack of data and for the difference between the spilled oil and its predicted movement. The algorithm tracks the movement of particles by representing the observed surface oil as particles. The problem was segmented using principal component analysis, and particles were matched using a genetic algorithm so that the variance of the travel distance of the spilled oil is minimized. Verification against the spilled oil visualization data confirmed that the particle matching algorithm using principal component analysis and a genetic algorithm showed the best performance, with a mean data error of 3.2%.
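
    The bilinear interpolation step mentioned in this abstract can be sketched as below. This is a generic illustration on a small in-memory grid with unit spacing, not the authors' code; reading the HDF5 file is omitted, and the function name `bilinear` and the sample values are hypothetical.

```python
def bilinear(grid, x, y):
    """Interpolate a gridded field at fractional coordinates (x, y).

    `grid[j][i]` holds the value at integer grid point (i, j); the grid is
    assumed to have unit spacing and at least 2 x 2 points.
    """
    # Lower-left corner of the enclosing cell, clamped to stay inside the grid.
    i0 = min(int(x), len(grid[0]) - 2)
    j0 = min(int(y), len(grid) - 2)
    tx, ty = x - i0, y - j0
    # Interpolate along x on both rows of the cell, then along y between them.
    lower = grid[j0][i0] * (1 - tx) + grid[j0][i0 + 1] * tx
    upper = grid[j0 + 1][i0] * (1 - tx) + grid[j0 + 1][i0 + 1] * tx
    return lower * (1 - ty) + upper * ty


# Hypothetical eastward current speed (m/s) on a 2 x 2 patch of the grid.
current_u = [
    [0.0, 1.0],
    [2.0, 3.0],
]
u_at_particle = bilinear(current_u, 0.5, 0.5)
```

    The same function would be applied to each velocity component (current and wind) at every particle position before moving the particle.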

A Study on Spatial Downscaling of Satellite-based Soil Moisture Data (토양수분 위성자료의 공간상세화에 관한 연구)

  • Shin, Dae Yun;Lee, Yang Won;Park, Mun Sung
    • Proceedings of the Korea Water Resources Association Conference / 2017.05a / pp.414-414 / 2017
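  • Soil moisture is an important meteorological factor for understanding the hydrological and energy cycles that occur in the land-surface environment. In situ soil moisture observations are made quite accurately by sensors buried in the ground, but the number of observation points is insufficient to secure spatial continuity. Accordingly, microwave satellite sensors, which allow wide-area and continuous observation, are gaining importance as a supplementary means of acquiring soil moisture information. Microwave satellite sensors are not constrained by weather conditions such as clouds, and since 1978 several satellites have produced global soil moisture data at 25 km and 10 km resolution. Because soil moisture data from microwave sensors are produced about twice a day for the same location, their temporal resolution is adequate, but their spatial resolution, 10 km at best, is insufficient for regional-scale hydrological analysis. To address this spatial resolution problem, statistical downscaling using various land-surface environmental factors has been proposed as an alternative. Most recent studies performed statistical downscaling through combination models based on equations; however, whether the combination is linear, such as a regression equation, or nonlinear, such as a neural network or machine learning, the unavoidable residuals cause the spatial distribution pattern to differ before and after downscaling. Regression kriging, which combines regression analysis with spatial interpolation of the residuals, resolves this problem through residual correction and thereby ensures consistency of the spatial distribution before and after downscaling. In this study, we use regression kriging to downscale daily AMSR2 (Advanced Microwave Scanning Radiometer 2) soil moisture data from 10 km to 1 km resolution and evaluate the consistency of the data patterns before and after downscaling. Databases of land surface temperature (LST), land surface temperature rise rate (RR), and temperature vegetation dryness index (TVDI) were built on a daily basis, and databases of the vegetation index (NDVI), water index (NDWI), and surface albedo (SA) were built at 8-day intervals. To convert the 8-day data to daily data, time series interpolation was performed using cubic splines. Data with different spatial resolutions were converted to the 1 km target resolution using the nearest neighbor method. To obtain estimates at the low-resolution scale, the multiple high-resolution pixels corresponding to each low-resolution pixel must first be averaged and matched to it; in this way we derived a multiple regression equation with six explanatory variables (LST, RR, TVDI, NDVI, NDWI, SA) and AMSR2 soil moisture as the response variable. Applying this equation to the explanatory variables at the high-resolution scale yields high-resolution soil moisture estimates, but the residuals, that is, the differences between the estimates and the original data, must then be corrected. The residuals, which exist at the low-resolution scale, are converted to high resolution by kriging spatial interpolation and then added to the high-resolution estimates; with this residual correction, spatially downscaled soil moisture data can be produced that maintain data pattern consistency before and after downscaling (r>0.95).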
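
    The regression-plus-residual-correction workflow in this abstract can be sketched in a heavily simplified form. The sketch is one-dimensional and uses a single explanatory variable instead of six. Most importantly, it spreads each coarse residual over its fine pixels by nearest-neighbour assignment, which is swapped in here for kriging to keep the example short. All names and numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def downscale(coarse_sm, fine_x, factor):
    """Downscale coarse soil moisture using a fine-scale explanatory variable.

    Each coarse cell covers `factor` consecutive fine pixels.
    """
    # 1. Average the fine-scale predictor within each coarse cell.
    blocks = [fine_x[k * factor:(k + 1) * factor]
              for k in range(len(coarse_sm))]
    coarse_x = [sum(block) / factor for block in blocks]
    # 2. Fit the regression at the coarse scale.
    slope, intercept = fit_line(coarse_x, coarse_sm)
    # 3. Residuals of the regression at the coarse scale.
    residuals = [sm - (slope * x + intercept)
                 for x, sm in zip(coarse_x, coarse_sm)]
    # 4. Apply the regression at the fine scale and add the residual back.
    #    Nearest-neighbour residuals stand in for the kriged residual surface.
    fine_sm = []
    for block, residual in zip(blocks, residuals):
        fine_sm.extend(slope * x + intercept + residual for x in block)
    return fine_sm


# Hypothetical data: 3 coarse cells, each covering 2 fine pixels.
coarse_sm = [0.20, 0.30, 0.25]
fine_x = [1.0, 2.0, 3.0, 4.0, 2.0, 3.0]
fine_sm = downscale(coarse_sm, fine_x, factor=2)
```

    Because the residual is added back, averaging `fine_sm` over each coarse cell reproduces `coarse_sm`, which is the consistency property the abstract attributes to residual correction.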

A Study on the Air Pollution Monitoring Network Algorithm Using Deep Learning (심층신경망 모델을 이용한 대기오염망 자료확정 알고리즘 연구)

  • Lee, Seon-Woo;Yang, Ho-Jun;Lee, Mun-Hyung;Choi, Jung-Moo;Yun, Se-Hwan;Kwon, Jang-Woo;Park, Ji-Hoon;Jung, Dong-Hee;Shin, Hye-Jung
    • Journal of Convergence for Information Technology / v.11 no.11 / pp.57-65 / 2021
  • We propose a novel method that uses deep learning to detect abnormal data showing specific symptoms in an air pollution measurement system. Existing methods generally detect abnormal data by classifying data that show unusual patterns different from the existing time series data. However, these approaches have limitations in detecting specific symptoms. In this paper, we use the DeepLab V3+ model, which is mainly used for foreground segmentation of images, with its structure changed to handle one-dimensional data. Instead of images, the model receives time-series data from multiple sensors and can detect data showing specific symptoms. In addition, we improve the model's performance by using 'piecewise aggregation approximation' to reduce the complexity of noisy time series data. The experimental results confirm that anomalous data can be detected successfully.
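
    The smoothing technique named in this abstract is more commonly called piecewise aggregate approximation (PAA): the series is cut into equal-length segments and each segment is replaced by its mean. A minimal sketch follows, assuming the series length is a multiple of the segment count and leaving out the z-normalization that often precedes PAA. The function name and sample readings are hypothetical.

```python
def paa(series, n_segments):
    """Piecewise aggregate approximation: the mean of each equal-length segment."""
    n = len(series)
    if n_segments <= 0 or n % n_segments != 0:
        raise ValueError("series length must be a positive multiple of n_segments")
    size = n // n_segments
    return [sum(series[i:i + size]) / size for i in range(0, n, size)]


# Hypothetical sensor readings reduced from 8 points to 4.
readings = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
reduced = paa(readings, n_segments=4)
```

    Averaging within segments shortens the input and damps point-to-point noise, which is the complexity reduction the abstract refers to.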

A Study on the Awareness and Preparation of the Forth Industrial Revolution of Some Health Department College Students (일부 보건계열학과 대학생의 4차 산업혁명 인식 및 준비도 연구)

  • Cho, Hye-Eun
    • Journal of the Korea Convergence Society / v.11 no.12 / pp.291-299 / 2020
  • The purpose of this study was to provide basic data for the development of a future-oriented curriculum in health. Awareness of and preparation for the fourth industrial revolution were surveyed among 280 college students in health departments that train medical technicians. A self-administered structured questionnaire was used for data collection. Awareness of the fourth industrial revolution scored 2.74; it was highest for 3D printing (3.59) and lowest for neural network machine learning (2.33). Students majoring in Physiotherapy (3.00) had the highest awareness and those majoring in Dental Engineering (2.37) had the lowest, and awareness of IoT differed by major (p=0.024). Of the students, 54.5% were preparing for the fourth industrial revolution, and lack of interest (42.9%) was the main difficulty in preparing; 50.6% had related educational experience and 60.9% had experience with VR and AR games. Regarding the era of the fourth industrial revolution, job loss (38.7%) was rated high, and the competency considered most necessary was creative capacity (50.6%). Therefore, it is necessary to develop a curriculum related to the fourth industrial revolution and to apply teaching methods that can increase the awareness and preparation of health college students in this era.

Applicability Analysis on Estimation of Spectral Induced Polarization Parameters Based on Multi-objective Optimization (다중목적함수 최적화에 기초한 광대역 유도분극 변수 예측 적용성 분석)

  • Kim, Bitnarae;Jeong, Ju Yeon;Min, Baehyun;Nam, Myung Jin
    • Geophysics and Geophysical Exploration / v.25 no.3 / pp.99-108 / 2022
  • Among induced polarization (IP) methods, spectral IP (SIP) uses alternating current as a transmission source to measure the amplitude and phase of complex electrical resistivity at each source frequency, which disperse with respect to source frequency. This frequency dependence, which can be explained by a relaxation model such as the Cole-Cole model or equivalent models, is analyzed to estimate SIP parameters from dispersion curves of complex resistivity by employing multi-objective optimization (MOO). The estimation uses a genetic algorithm to optimize two objective functions that minimize the data misfits of amplitude and phase based on the Cole-Cole model, which is the model most widely used to explain IP relaxation effects. The MOO-based estimation properly recovered Cole-Cole model parameters for synthetic examples but fitted poorly for real laboratory-measured data, which have relatively small phase values (less than about 10 mrad). There are discrepancies between the scales of the amplitude and phase data misfits used as parameters of the MOO method, so other methods such as machine learning, which can deal with these discrepancies, need to be employed to estimate SIP parameters from dispersion curves of complex resistivity.
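
    The two objective functions described in this abstract can be illustrated with a short sketch. It assumes the commonly used Pelton form of the Cole-Cole resistivity model, rho(w) = rho0 * [1 - m * (1 - 1 / (1 + (i*w*tau)^c))], and plain root-mean-square misfits. The abstract does not give the exact misfit definitions, so these, the function names, and the parameter values are assumptions.

```python
import cmath
import math


def cole_cole(freq_hz, rho0, m, tau, c):
    """Complex resistivity of the Cole-Cole model (Pelton form, assumed)."""
    omega = 2 * math.pi * freq_hz
    return rho0 * (1 - m * (1 - 1 / (1 + (1j * omega * tau) ** c)))


def misfits(params, freqs, obs_amplitude, obs_phase):
    """Return (amplitude RMS misfit, phase RMS misfit) for one parameter set.

    `params` is (rho0, m, tau, c); phases are in radians.
    """
    amp_sq = phase_sq = 0.0
    for f, amp, phase in zip(freqs, obs_amplitude, obs_phase):
        z = cole_cole(f, *params)
        amp_sq += (abs(z) - amp) ** 2
        phase_sq += (cmath.phase(z) - phase) ** 2
    n = len(freqs)
    return math.sqrt(amp_sq / n), math.sqrt(phase_sq / n)


# Hypothetical synthetic example: data generated from known parameters.
true_params = (100.0, 0.5, 0.01, 0.5)  # rho0 (ohm-m), m, tau (s), c
freqs = [0.1, 1.0, 10.0, 100.0, 1000.0]
synthetic = [cole_cole(f, *true_params) for f in freqs]
obs_amplitude = [abs(z) for z in synthetic]
obs_phase = [cmath.phase(z) for z in synthetic]

exact_fit = misfits(true_params, freqs, obs_amplitude, obs_phase)
poor_fit = misfits((120.0, 0.3, 0.01, 0.5), freqs, obs_amplitude, obs_phase)
```

    In this example the amplitude misfit is in ohm-m while the phase misfit is in radians, so the two objectives differ in scale by orders of magnitude, which illustrates the kind of scale discrepancy the abstract mentions.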