• Title/Summary/Keyword: 오분류 (misclassification)

Analysis of cycle racing ranking using statistical prediction models (통계적 예측모형을 활용한 경륜 경기 순위 분석)

  • Park, Gahee;Park, Rira;Song, Jongwoo
    • The Korean Journal of Applied Statistics
    • /
    • v.30 no.1
    • /
    • pp.25-39
    • /
    • 2017
  • Over 5 million people participate in cycle racing betting and its revenue is more than 2 trillion won. This study predicts the ranking of cycle racing using various statistical analyses and identifies important variables which have influence on ranking. We propose competitive ranking prediction models using various classification and regression methods. Our model can predict rankings with low misclassification rates most of the time. We found that the ranking increases as the grade of a racer decreases and as overall scores increase. Inversely, we can observe that the ranking decreases when the grade of a racer increases, race number four is given, and the ranking of the last race of a racer decreases. We also found that prediction accuracy can be improved when we use centered data per race instead of raw data. However, the real profit from the future data was not high when we applied our prediction model because our model can predict only low-return events well.
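
The per-race centering mentioned in the abstract can be illustrated with a minimal sketch. It assumes that "centered data per race" means subtracting the within-race mean from each rider's value; the field names (`race_id`, `score`) and the toy rows are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from statistics import mean


def center_per_race(rows, race_key="race_id", feature="score"):
    """Subtract each race's mean feature value from its riders' values."""
    # Collect the feature values of every rider, grouped by race.
    by_race = defaultdict(list)
    for row in rows:
        by_race[row[race_key]].append(row[feature])
    race_mean = {race: mean(vals) for race, vals in by_race.items()}
    # Return copies of the rows with an extra "<feature>_centered" field.
    return [
        {**row, feature + "_centered": row[feature] - race_mean[row[race_key]]}
        for row in rows
    ]


# Hypothetical toy data: two races with three riders each.
rows = [
    {"race_id": 1, "rider": "A", "score": 90.0},
    {"race_id": 1, "rider": "B", "score": 85.0},
    {"race_id": 1, "rider": "C", "score": 80.0},
    {"race_id": 2, "rider": "D", "score": 70.0},
    {"race_id": 2, "rider": "E", "score": 65.0},
    {"race_id": 2, "rider": "F", "score": 60.0},
]
centered = center_per_race(rows)
```

After centering, riders A and D both sit 5 points above their own race's mean, even though their raw scores differ by 20. This is one plausible reason a model could rank riders within a race more easily on centered data.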

Model Predictive Control for Distributed Storage Facilities and Sewer Network Systems via PSO (분산형 저류시설-하수관망 네트워크 시스템의 입자군집최적화 기반 모델 예측 제어)

  • Baek, Hyunwook;Ryu, Jaena;Kim, Tea-Hyoung;Oh, Jeill
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.22 no.6
    • /
    • pp.722-728
    • /
    • 2012
  • Urban sewer systems have limited rainwater storage capacity and can discharge untreated sewage, so storage facilities for preventing sewer flooding and reducing urban non-point pollution have attracted considerable attention. The Korea Ministry of Environment has recently introduced and promoted the concept of a "multi-functional storage facility", which is crucial not only for preventive stormwater management but also for dealing with combined sewer overflow and sanitary sewer discharge. However, reserving space for a single large-scale storage facility can be difficult, especially in urban areas. Thus, the decentralized construction and operation of small- and medium-sized storage facilities has been introduced as an alternative. In this paper, we propose a model predictive control (MPC) scheme for the optimized operation of distributed storage facilities and sewer networks. To this end, we first describe a mathematical model of each component of the network system, which enables us to analyze its detailed dynamic behavior. Second, overflow locations and volumes are predicted from the developed network model using data on the external inflow at specific locations of the network. The MPC scheme, based on particle swarm optimization, then produces the optimized gate settings for sewer network flow control, which minimize sewer flooding and maximize the potential storage capacity. Finally, the operational efficacy of the proposed control scheme is demonstrated by a simulation study with a virtual rainstorm event.

A Comparative Study on Methods for Outlier Test of Rainfall in Korea (국내 강우의 이상치검정 방법의 비교 연구)

  • Lee, Jung Sik;Shin, Chang Dong
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2018.05a
    • /
    • pp.359-359
    • /
    • 2018
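  • An outlier is a data point that deviates markedly from the sample and lies apart from the other data, and is defined as a value with a very low probability of actually occurring. The annual-maximum rainfall series used as extreme-value series for estimating design floods contain instrument malfunctions and reading errors by engineers, and outliers are also observed among the extreme values caused by super typhoons and localized heavy rainfall associated with climate change. Because outliers can distort the inherent characteristics of the data in statistical analysis and produce biased results, an outlier analysis should be carried out during frequency analysis to confirm the adequacy of the data. In current practice, the Guidelines for Design Flood Estimation and the Commentary on River Design Standards describe the relevant procedures, but because domestic rainfall records are short, outlier analysis is not performed during frequency analysis, so if outliers bias the data, the resulting probable rainfall may be distorted. This study therefore performed outlier tests on rainfall data from major Korean cities. The selected stations were metropolitan cities with relatively long observation records (Seoul, Busan, Daejeon, Daegu, Incheon, Gwangju, and Ulsan), and rainfall data for 25 durations (10 minutes and 1 to 24 hours) were used. Two methods known to have better outlier detection power than others were adopted: the modified z-score method based on the median absolute deviation (MAD) (Hoaglin, 1993), an improved variant of the z-score method, which flags values exceeding upper and lower limits using z-values standardized by the sample mean and standard deviation; and the box-plot method (Tukey, 1969). The box-plot method divides the full data set into quartiles of 25% each and separates the sorted series into the median, box, whiskers, and outliers. The sorted values between the 25th and 75th percentiles form the box, and the outer whisker values are classified as outliers; plotting the quartiles makes the distribution of the data easy to grasp, and the location of outliers and any asymmetry in the data are easy to identify. With the modified z-score method, no outliers were found at Seoul and Daegu, while 13 were found at Busan, 7 at Daejeon, 5 at Incheon, 32 at Gwangju, and 26 at Ulsan. The box-plot method identified 35 outliers at Seoul, 39 at Busan, 32 at Daejeon, 38 at Daegu, 51 at Incheon, 61 at Gwangju, and 65 at Ulsan. Overall, the box-plot method flagged more outliers than the modified z-score method, and the outliers found by each method were identified by duration and year. The occurrence of outliers by method was analyzed to obtain counts per station, and the results are expected to become more useful once additional stations and data are added.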
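
The two tests named in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the constant 0.6745 and the cutoff 3.5 for the modified z-score, and the 1.5 × IQR fences for the box plot, are commonly cited defaults that the abstract does not specify, and the rainfall values are hypothetical.

```python
from statistics import median, quantiles


def modified_z_outliers(values, threshold=3.5):
    """Flag values whose MAD-based modified z-score exceeds the threshold."""
    med = median(values)
    # Median absolute deviation (MAD) around the sample median.
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # The score is undefined when the MAD is zero.
    # 0.6745 and 3.5 are assumed defaults, not values from the abstract.
    return [v for v in values if abs(0.6745 * (v - med) / mad) > threshold]


def boxplot_outliers(values, k=1.5):
    """Flag values outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return [v for v in values if v < q1 - k * iqr or v > q3 + k * iqr]


# Hypothetical annual-maximum rainfall depths (mm) for one duration.
annual_max = [52, 48, 61, 55, 58, 50, 63, 57, 54, 140]
print(modified_z_outliers(annual_max))
print(boxplot_outliers(annual_max))
```

On this toy series both tests flag only the value 140. On real data the two can disagree, as the station counts reported in the abstract show.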

Artificial Intelligence Techniques for Predicting Online Peer-to-Peer(P2P) Loan Default (인공지능기법을 이용한 온라인 P2P 대출거래의 채무불이행 예측에 관한 실증연구)

  • Bae, Jae Kwon;Lee, Seung Yeon;Seo, Hee Jin
    • The Journal of Society for e-Business Studies
    • /
    • v.23 no.3
    • /
    • pp.207-224
    • /
    • 2018
  • In this article, an empirical study was conducted using a public dataset from Lending Club Corporation, the largest online peer-to-peer (P2P) lending platform in the world. We explore predictor variables related to P2P lending default and find that housing situation, length of employment, average current balance, debt-to-income ratio, loan amount, loan purpose, interest rate, public records, number of finance trades, total credit/credit limit, number of delinquent accounts, number of mortgage accounts, and number of bank card accounts are significant factors for loans funded on the Lending Club platform. We developed online P2P lending default prediction models using discriminant analysis, logistic regression, neural networks, and decision trees (i.e., CART and C5.0). To verify the feasibility and effectiveness of these models, borrower loan data and credit data were used. Empirical results indicated that neural networks outperform the other classifiers (discriminant analysis, logistic regression, CART, and C5.0). Neural networks always outperform the other classifiers in P2P loan default prediction.

Effects of Constitutional Food on Nurse's NK Cell Activity and Stress Reduction (체질푸드가 간호사의 스트레스 감소와 NK세포 활성도에 미치는 영향)

  • Park, Sun-Mi
    • The Journal of the Korea Contents Association
    • /
    • v.19 no.3
    • /
    • pp.500-509
    • /
    • 2019
  • This study examined the effect of constitutional food on stress reduction and NK cell activity in nurses, with the aim of improving natural healing, and assessed whether it is effective in preventing and treating illness. The study was conducted for 30 days on 22 nurses who had worked for more than a year at a general hospital in Gyeonggi Province. Stress was measured with a pulse wave measuring instrument, and NK cell activity was measured by blood tests. The collected data were analyzed with paired t-tests in SPSS 21, and participants were provided with constitutional food suited to their physical constitution after classification of the body based on the internal diameter of the emperor and scholarship. The results showed that constitutional food has a significant positive effect on stress reduction and is effective for NK cell activity. This study has the following significance: First, stress is a main health threat for modern people, and increasing natural healing power against stress through constitutional food has the potential to prevent disease. Second, immunity is very important in disease prevention and treatment, and constitutional food can increase the natural healing power of the human body by increasing NK cell activity.

Reaction Characteristics of Phytoplankton Before and After the Yellow Dust Event in Taean Peninsula and Yellow Dust Impact Assessment (태안반도주변에서 춘계 황사 전·후 식물플랑크톤 반응특성과 황사분진 영향평가)

  • Yoo, Man Ho;Youn, Seok Hyun;Oh, Hyun Ju;Choi, Joong Ki
    • Journal of the Korean Society of Marine Environment & Safety
    • /
    • v.24 no.7
    • /
    • pp.898-906
    • /
    • 2018
  • To investigate the effect of yellow dust on phytoplankton, a field survey and physiological experiments were carried out in the waters near Taean Peninsula from April 22 to 26, 2006, when yellow dust occurred. Phytoplankton populations during the yellow dust period were in the range of $26{\sim}290 \times 10^{3}\ \mathrm{cells} \cdot \mathrm{L}^{-1}$, a somewhat low standing crop. An increase in diatoms (a main taxonomic group), especially benthic diatoms such as Paralia sulcata, a typical species of actively mixed sea water areas, was also remarkable. In addition, the Chl-a concentration after yellow dust exceeded the range over which the Chl-a concentration varied with the tide before yellow dust. As the concentration of yellow sand increased in a yellow sand treatment experiment, primary productivity decreased, and the maximum assimilation number showed the same tendency. In the 48-hour culture experiment, primary productivity of the test group was lower than that of the control group at the early stage (T0) of yellow sand treatment, but after 48 hours (T48), the test group showed higher primary productivity than the control group. In particular, the primary productivity of the test group significantly increased to 321% after 48 hours. Therefore, the strong physical environment accompanying yellow dust may temporarily inhibit the growth of phytoplankton in the waters adjacent to China in the early stage of yellow dust, but the formation of a stable water mass has also been identified as a potential factor promoting the growth of phytoplankton.

Development of a Rubric for Assessing Middle School Students' Conceptual Understanding about Dew Point (이슬점에 대한 중학생들의 개념 이해 평가 루브릭 개발)

  • Lee, Kiyoung;Lee, Jaebong;Oh, Hyunseok
    • Journal of the Korean earth science society
    • /
    • v.41 no.6
    • /
    • pp.684-694
    • /
    • 2020
  • In this study, we developed a rubric for assessing middle school students' conceptual understanding of dew point. For this purpose, we analyzed 9th grade students' responses collected using a multi-tier constructed-response item from the National Assessment of Educational Achievement (NAEA) and classified the types of responses according to their characteristics. In addition, we analyzed the distribution of student response types according to mean achievement scores and developed an assessment rubric for conceptual understanding of dew point. The findings are as follows: First, the analysis of student responses on finding the dew point on the saturation curve showed that many students had little or no understanding of the scientific concept of dew point. Second, in the student responses on the water vapor condensation process at dew point, the proportion of scientific conception types was very low, while the proportion of misconception types was relatively high and the types varied as well. Third, a four-level assessment rubric was developed based on the analysis of the distribution of student response types according to the mean achievement scores. Based on the findings, we suggested the development and use of assessment rubrics in the field of Earth science education.

A Non-annotated Recurrent Neural Network Ensemble-based Model for Near-real Time Detection of Erroneous Sea Level Anomaly in Coastal Tide Gauge Observation (비주석 재귀신경망 앙상블 모델을 기반으로 한 조위관측소 해수위의 준실시간 이상값 탐지)

  • LEE, EUN-JOO;KIM, YOUNG-TAEG;KIM, SONG-HAK;JU, HO-JEONG;PARK, JAE-HUN
    • The Sea:JOURNAL OF THE KOREAN SOCIETY OF OCEANOGRAPHY
    • /
    • v.26 no.4
    • /
    • pp.307-326
    • /
    • 2021
  • Real-time sea level observations from tide gauges include missing and erroneous values. The latter can be classified as abnormal values by a quality control procedure. Although the 3σ (three standard deviations) rule has generally been applied to eliminate them, it is difficult to apply to sea-level data, where extreme values can exist due to weather events and where erroneous values can exist even within the 3σ range. The artificial intelligence model set designed in this study consists of non-annotated recurrent neural networks and ensemble techniques that do not require pre-labeling of the abnormal values. The developed model can identify an erroneous value in less than 20 minutes after the tide gauge records an abnormal sea level. The validated model separates normal and abnormal values well, both during normal times and during weather events. It was also confirmed that abnormal values can be detected even in years whose sea level data were not used for training. The artificial neural network algorithm used in this study is not limited to coastal sea level, and hence it can be extended to detecting erroneous values in various oceanic and atmospheric data.
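
For reference, the conventional 3σ rule that the abstract contrasts with its neural network model can be sketched as below. This shows only the baseline rule, not the authors' model, and the sea level series is hypothetical.

```python
from statistics import mean, pstdev


def three_sigma_flags(levels, k=3.0):
    """Return True for each value more than k standard deviations from the mean."""
    mu = mean(levels)
    sigma = pstdev(levels)  # Population standard deviation of the series.
    if sigma == 0:
        return [False] * len(levels)
    return [abs(x - mu) > k * sigma for x in levels]


# Hypothetical sea levels (cm) with one large spike inserted at index 7.
levels = [100.0, 102.0, 98.0, 101.0, 99.0] * 4
levels[7] = 400.0
flags = three_sigma_flags(levels)
```

The rule catches the spike here, but a single global threshold flags a genuine storm surge in the same way and misses small errors that stay inside the 3σ band, which is the limitation the abstract describes.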

Detecting Daily-Driven Game-Bot Based on Online Game Play Log Clustering (온라인 게임 로그 데이터 클러스터링 기반 일일 단위 게임봇 판별)

  • Kim, Joo Hwan;Choi, Jin-Young
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.31 no.6
    • /
    • pp.1097-1104
    • /
    • 2021
  • Online game-bots are already widely known. They cause problems such as declining player interest and in-game economic crises. Detecting and restricting game-bots is now essential, because both publishers and players are harmed by their long-term abnormal activity. Restriction is not easy, however, because of the risk of false restrictions. Game publishers need to distinguish game-bots using server-side game logs, and must ultimately be able to justify a game-bot restriction. In this paper, we classified game-bot users using game logs separated by day as test data. For daily-driven detection, we split the total dataset into one-day logs. The method first detects game-bots from the one-day logs and then determines the overall result from these daily detections. Daily-driven detection is advantageous for accounts with a mixed playing style, which behave sometimes like a normal user and sometimes like a game-bot. This methodology yields a better F1-score, an indicator of classification accuracy: it increases from 0.898 to 0.945 with a Random Forest classifier.
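
The F1-score reported in the abstract is the harmonic mean of precision and recall. The sketch below computes it from binary labels; the label vectors are hypothetical and do not reproduce the paper's 0.898 or 0.945.

```python
def f1_score(y_true, y_pred, positive=1):
    """Compute the F1-score of the positive class from two label sequences."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # No true positives: precision and recall are both zero.
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = game-bot, 0 = normal user.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
score = f1_score(y_true, y_pred)  # 3 TP, 1 FP, 1 FN
```

Because a false restriction corresponds to a false positive, F1 penalizes a detector that bans normal users as well as one that misses bots.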

A Study on Dry Weight-Based Nutritional Deviations in Rice Foods for Normalization of Food Data (식품 데이터 정규화를 위한 쌀 음식의 건물중 기반 영양 편차 고찰)

  • Kim, Sang Cheol;Lee, Woon Yong;Park, Woo Pung;Yun, Ki Oh;Kim, Jong Rin
    • Smart Media Journal
    • /
    • v.11 no.7
    • /
    • pp.76-84
    • /
    • 2022
  • In Korea, where rice is the staple food, there are many cases in which the nutritional composition of food is different at the same weight, even though the same ingredients are used and the food or food name is the same. The cause is closely related to the moisture content of the food according to the cooking method and cooking process. In order to design a diet tailored to individual health and supply accurate calories and nutrients, a method of expressing food data that is not affected by the cooking process or cooking method is required. Usually, the same ingredients or foods show a lot of deviation from the nutritional components presented in the standard food database due to the difference in moisture content. For this reason, there are problems that increase the complexity of the food ingredient database and the difficulty in using it. As a method to improve these problems, we would like to propose a food data expression method based on dry weight. As an example of this, the characteristics of rice as a food material and changes in major nutritional components according to the change in moisture of various rice-processed foods made from rice were considered. In addition, as an example of how to normalize food data through this, the dry weight-based nutrition label of rice was presented.
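
The dry-weight idea can be illustrated with a short calculation. It assumes the conversion is a simple rescaling by the non-water fraction of the food; the abstract does not give a formula, and the moisture and carbohydrate figures below are illustrative, not taken from the paper or from any food database.

```python
def per_100g_dry(nutrient_per_100g_wet, moisture_fraction):
    """Convert a nutrient amount per 100 g as served to per 100 g dry weight."""
    if not 0.0 <= moisture_fraction < 1.0:
        raise ValueError("moisture_fraction must be in [0, 1)")
    # Only the non-water part of the food carries the nutrient.
    return nutrient_per_100g_wet / (1.0 - moisture_fraction)


# Hypothetical rice foods: (carbohydrate g per 100 g as served, moisture fraction).
foods = {
    "cooked rice": (28.0, 0.65),
    "rice cake": (44.0, 0.45),
}
dry_basis = {name: per_100g_dry(carb, moisture)
             for name, (carb, moisture) in foods.items()}
```

In this toy case the two foods differ by 16 g of carbohydrate per 100 g as served but come out at about 80 g per 100 g on a dry-weight basis, which is the kind of normalization the abstract argues for.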