• Title/Summary/Keyword: Test Error


Exploring the Reliability of an Assessment based on Automatic Item Generation Using the Multivariate Generalizability Theory (다변량일반화가능도 이론을 적용한 자동문항생성 기반 평가에서의 신뢰도 탐색)

  • Jinmin Chung;Sungyeun Kim
    • Journal of Science Education / v.47 no.2 / pp.211-224 / 2023
  • The purpose of this study is to suggest how to investigate the reliability of the assessment, which consists of items generated by automatic item generation using empirical example data. To achieve this, we analyzed the illustrative assessment data by applying the multivariate generalizability theory, which can reflect the design of responding to different items for each student and multiple error sources in the assessment score. The result of the G-study showed that, in most designs, the student effect corresponding to the true score of the classical test theory was relatively large after residual effects. In addition, in the design where the content domain was fixed, the ranking of students did not change depending on the item types or items. Similarly, in the design where the item format was fixed, the difficulty showed little variation depending on the content domains. The result of the D-study indicated that the original assessment data achieved a sufficient level of reliability. It was also found that higher reliability than the original assessment data could be obtained by reducing the number of items in the content domains of operation, geometry, and probability and statistics, or by assigning higher weights to the domains of letters and formulas, and function. The efficient measurement conditions presented in this study are limited to the illustrative assessment data. However, the method applied in this study can be utilized to determine the reliability and to find efficient measurement conditions for the various assessment situations using automatic item generation based on measurement traits.
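The D-study idea in the abstract above (how reliability changes with the number of items) can be illustrated with a minimal sketch. It assumes a much simpler setting than the study's: a univariate person x item design, where the relative generalizability coefficient is the person variance divided by the person variance plus the residual variance over the number of items. The variance components are hypothetical and are not taken from the paper, and the function name is illustrative only.

```python
def generalizability_coefficient(var_person, var_residual, n_items):
    """Relative G coefficient for a univariate person x item design.

    E(rho^2) = var_person / (var_person + var_residual / n_items)
    """
    if n_items <= 0:
        raise ValueError("n_items must be positive")
    relative_error = var_residual / n_items
    return var_person / (var_person + relative_error)


# Hypothetical variance components (assumed for illustration only).
VAR_PERSON = 0.04
VAR_RESIDUAL = 0.16

# D-study: vary the number of items and watch the coefficient change.
for n in (5, 10, 20, 40):
    coef = generalizability_coefficient(VAR_PERSON, VAR_RESIDUAL, n)
    print(f"n_items={n:2d}  E(rho^2)={coef:.3f}")
```

In this univariate simplification, adding items always raises the coefficient. The multivariate composite analyzed in the study also depends on domain weights and covariances, which is why its finding about reducing items in some domains is not reproduced by this sketch.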

A Study the Activities of Working People in the Sports Club (직장인들의 생활체육 동호회 활동에 관한 연구)

  • Kim, Kyoung-Hyun
    • Journal of Korea Entertainment Industry Association / v.13 no.1 / pp.99-109 / 2019
  • This study investigated the sports club activities of working people. Participants were workers taking part in the physical education system, selected by convenience sampling. Of a total of 400 questionnaires, 387 were used in the analysis after invalid or erroneous responses were excluded. Factor analysis and reliability tests were performed using IBM SPSS Statistics Ver. 21.0. Frequency analysis was conducted to describe the general characteristics of the participants. Independent-samples t-tests and ANOVA were conducted to test for group differences by demographic characteristics, correlation analysis was conducted to examine the relationships between variables, and regression analysis was performed to test the effects of the variables. The results are as follows. First, wellness and job satisfaction did not differ by gender. Second, wellness and job satisfaction did not differ by sport. Third, intellectual wellness differed significantly by age: participants in their 40s and 50s scored higher than those aged 60 and over. Fourth, social wellness differed significantly by activity duration: participants with 1 to 2 years of activity scored higher than those with 3 years or more. Finally, regarding the impact of working people's wellness through lifestyle sports club activities on job satisfaction, professional wellness had a significant influence on job satisfaction.

Estimation of River Flow Data Using Machine Learning (머신러닝 기법을 이용한 유량 자료 생산 방법)

  • Kang, Noel;Lee, Ji Hun;Lee, Jung Hoon;Lee, Chungdae
    • Proceedings of the Korea Water Resources Association Conference / 2020.06a / pp.261-261 / 2020
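  • Developing an accurate stage-discharge rating curve is essential for securing the continuous discharge data that underpin water management. The rating curve is the basis for designing all hydrological facilities and is also important for responding to water-related disasters such as floods and droughts. However, discharge measurement is generally costly and time-consuming, and as control characteristics change (for example, through vegetation growth or changes in the cross-section), nonlinear behavior such as separation by stage range or by period appears, which makes the data difficult to interpret. In particular, rivers in Korea undergo diverse natural and artificial environmental changes, so detailed analysis by station and period is required. Machine learning refers to the process by which a computer learns from data on its own to build a model and improve its performance. Whereas a conventional rating curve derives regression parameters from data types and periods chosen at the developer's discretion, machine learning can learn from all valid data, find correlations among the data, build a model, and continuously improve its performance. Provided that sufficient hydrological data are available, machine learning has the advantage of reflecting a complex and variable water-resources environment and continuously improving the accuracy of discharge estimation. This study built discharge-estimation models using representative machine learning algorithms and compared their performance. The study site is the Geoungyo (거운교) station in the Han River basin, which has a stable water volume, and the data used include time, stage, discharge, and water-surface width for 2010 to 2018. The models were implemented with scikit-learn (sklearn), a Python machine learning library, and the algorithms applied were random forest regression, decision tree, KNN (K-Nearest Neighbor), and rgboost. The training data were grouped into six sets by combining input variable types to build the models, which were then evaluated on the test data using RMSE (Root Mean Square Error). The resulting RMSE values ranged rather widely, from 3.67 to 171.46, depending on the combination of model and input data. The best case was the random forest regression model with three inputs: stage, year, and water-surface width. For comparison, estimating discharge for the same test data with the stage-specific rating curves for the Geoungyo station in the Korea Annual Hydrological Report (2018) gave an RMSE of 3.76, confirming that machine learning performs at a level similar to the segmented rating curves. This is a basic study that examined the applicability of machine learning techniques to existing hydrological data for producing high-quality discharge data, and it is expected to help with efficient hydrological measurement and rating-curve development in Korea. In future work, input data such as meteorological data and water intake volumes should be applied to identify the various variables that affect the water-resources environment and control characteristics, and further research should be carried out on more sophisticated models such as deep learning, an unsupervised learning approach within machine learning.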

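The abstract above compares models by RMSE on held-out test data. As a minimal sketch of that metric only (not of the study's models), the function below computes RMSE between observed and estimated discharge. The sample values are hypothetical and carry no particular unit.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical observed and estimated discharge values (illustration only).
observed = [12.0, 15.5, 20.1, 30.4]
predicted = [11.0, 16.5, 19.1, 31.4]

print(f"RMSE = {rmse(observed, predicted):.3f}")
```

Because RMSE is expressed in the same unit as the discharge itself, values such as 3.67 and 3.76 in the abstract can be compared directly, provided they are computed on the same test data.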

Prediction of Postoperative Lung Function in Lung Cancer Patients Using Machine Learning Models

  • Oh Beom Kwon;Solji Han;Hwa Young Lee;Hye Seon Kang;Sung Kyoung Kim;Ju Sang Kim;Chan Kwon Park;Sang Haak Lee;Seung Joon Kim;Jin Woo Kim;Chang Dong Yeo
    • Tuberculosis and Respiratory Diseases / v.86 no.3 / pp.203-215 / 2023
  • Background: Surgical resection is the standard treatment for early-stage lung cancer. Since postoperative lung function is related to mortality, predicted postoperative lung function is used to determine the treatment modality. The aim of this study was to evaluate the predictive performance of linear regression and machine learning models. Methods: We extracted data from the Clinical Data Warehouse and developed three sets: set I, the linear regression model; set II, machine learning models omitting the missing data; and set III, machine learning models imputing the missing data. Six machine learning models were implemented: the least absolute shrinkage and selection operator (LASSO), Ridge regression, ElasticNet, Random Forest, eXtreme gradient boosting (XGBoost), and the light gradient boosting machine (LightGBM). The forced expiratory volume in 1 second measured 6 months after surgery was defined as the outcome. Five-fold cross-validation was performed for hyperparameter tuning of the machine learning models. The dataset was split into training and test datasets at a 70:30 ratio. Implementation was done after dataset splitting in set III. Predictive performance was evaluated by R² and mean squared error (MSE) in the three sets. Results: A total of 1,487 patients were included in sets I and III, and 896 patients were included in set II. In set I, the R² value was 0.27. In set II, LightGBM was the best model, with the highest R² value of 0.5 and the lowest MSE of 154.95. In set III, LightGBM was the best model, with the highest R² value of 0.56 and the lowest MSE of 174.07. Conclusion: The LightGBM model showed the best performance in predicting postoperative lung function.
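The evaluation protocol described above (a 70:30 split, then R² and MSE on the test portion) can be sketched with the standard library alone. This is an illustration under stated assumptions: the data are synthetic, a one-predictor least-squares line stands in for the study's models, and every function name here is hypothetical rather than taken from the paper.

```python
import random


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle the rows and split them into (train, test)."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = round(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]


def fit_line(rows):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Synthetic data: a linear trend plus Gaussian noise (illustration only).
rng = random.Random(42)
data = [(x, 2.0 * x + 1.0 + rng.gauss(0.0, 1.0)) for x in range(100)]

train, test = train_test_split(data, test_fraction=0.3)
slope, intercept = fit_line(train)

y_true = [y for _, y in test]
y_pred = [slope * x + intercept for x, _ in test]
print(f"R^2 = {r_squared(y_true, y_pred):.3f}, MSE = {mse(y_true, y_pred):.3f}")
```

The model is fitted on the training rows only and scored on the held-out rows, so the reported R² and MSE describe performance on data the model has not seen.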

An Analysis on Reasoning of 4th-Grade Elementary School Students in Comparing Unlike Fraction Magnitudes (초등학교 4학년 학생들의 이분모 분수 크기 비교에 나타나는 추론 분석)

  • Yoon, Chaerin;Chang, Hyewon
    • Education of Primary School Mathematics / v.26 no.3 / pp.181-197 / 2023
  • A number of studies on comparing the magnitudes of unlike fractions have noted the importance of reasoning processes based on fraction concepts and number sense, rather than a formalized procedure using common denominators. In this study, a test on comparing the magnitudes of unlike fractions was given to fourth-grade elementary school students who had not yet learned equivalent fractions and common denominators, and the reasoning perspectives behind the correct and incorrect answers were analyzed for each of the eight problem types. The analysis showed that even students who had not yet learned equivalent fractions and reduction to common denominators were able to compare unlike fractions through reasoning based on fraction sense. The perspective chosen by the most students was the 'part-whole perspective', which shows that reasoning when comparing the magnitudes of fractions depends heavily on the concept of fractions itself. In addition, students who lacked a conceptual understanding of fractions had difficulty forming a quantitative sense of fractions, which made it difficult for them to compare and reason about the magnitudes of unlike fractions. Based on these results, some didactical implications were derived for guiding reasoning based on the concept of fractions and number sense, without reduction to common denominators, when comparing the magnitudes of unlike fractions.

Development of 3-D Nonlinear Wave Driver Using SPH (SPH을 활용한 3차원 비선형 파랑모형 개발)

  • Cho, Yong Jun;Kim, Gweon Soo
    • KSCE Journal of Civil and Environmental Engineering Research / v.28 no.5B / pp.559-573 / 2008
  • In this study, we propose a new 3-D nonlinear wave driver based on the Navier-Stokes equations, which are integrated numerically using SPH (Smoothed Particle Hydrodynamics), together with internal wave generation using a Gaussian source function and an energy-absorbing layer. To verify the new 3-D nonlinear wave driver, we numerically simulate the sloshing problem of Thacker (1981) in a parabolic water basin, triggered by a Gaussian hump and by a uniformly inclined water surface. The qualitative behavior of the sloshing that follows the release of the external force holding the free surface convex or uniformly inclined is simulated successfully, although a phase error is visible and the inundation height shrinks as the simulation proceeds. As a more severe test, we also simulate nonlinear shoaling and refraction over a uniform wedge-shaped beach. The numerically simulated waves are refracted less than their linear counterparts from Hamiltonian ray theory, owing to nonlinearity, energy dissipation at the bottom and side walls, energy loss induced by breaking, and the hydraulic jump that occurs when breaking waves encounter the down-rush of the preceding wave.

Development of PSC I Girder Bridge Weigh-in-Motion System without Axle Detector (축감지기가 없는 PSC I 거더교의 주행중 차량하중분석시스템 개발)

  • Park, Min-Seok;Jo, Byung-Wan;Lee, Jungwhee;Kim, Sungkon
    • KSCE Journal of Civil and Environmental Engineering Research / v.28 no.5A / pp.673-683 / 2008
  • This study improves on the existing method, which uses longitudinal strain and the concept of the influence line, to develop a Bridge Weigh-in-Motion system without an axle detector, using the dynamic strain of the bridge girders and concrete slab. The paper first describes the algorithms considered for extracting passing-vehicle information from the dynamic strain signals measured at the bridge slab, girders, and cross beams. Two analysis methods are considered, 1) the influence line method and 2) the neural network method, and a parameter study of measurement locations is also performed. The procedures and results of the field tests are then described. The field tests were performed to acquire training and test sets for the neural networks and to verify and compare the performance of the algorithms considered. Finally, the results of the different algorithms are compared and discussed. For a PSC I-girder bridge, vehicle weight can be calculated within a reasonable error range using dynamic strain gauges installed on the girders. The passing lane and passing speed of the vehicle can be estimated accurately using the strain signal from the concrete slab. The passing speed and peak duration were added to the input variables to reflect, respectively, the influence of the dynamic interaction between the bridge and vehicles and the impact of the distance between axles, which improved the accuracy of the weight calculation.

Automated Satellite Image Co-Registration using Pre-Qualified Area Matching and Studentized Outlier Detection (사전검수영역기반정합법과 't-분포 과대오차검출법'을 이용한 위성영상의 '자동 영상좌표 상호등록')

  • Kim, Jong Hong;Heo, Joon;Sohn, Hong Gyoo
    • KSCE Journal of Civil and Environmental Engineering Research / v.26 no.4D / pp.687-693 / 2006
  • Image co-registration is the process of overlaying two images of the same scene, one of which serves as the reference image while the other is geometrically transformed to match it. To improve the efficiency and effectiveness of co-registration, the authors propose a pre-qualified area matching algorithm composed of feature extraction with the Canny operator and area matching with the cross-correlation coefficient. To refine the matching points, outlier detection based on studentized residuals is used to iteratively remove outliers at the level of three standard deviations. Through the pre-qualification and refining processes, the computation time was significantly reduced and the registration accuracy was enhanced. A prototype of the proposed algorithm was implemented, and a performance test on 3 Landsat images of Korea showed that: (1) the average RMSE of the approach was 0.435 pixel; (2) the average number of matching points was over 25,573; (3) the average processing time was 4.2 min per image on a regular workstation equipped with a 3 GHz Intel Pentium 4 CPU and 1 GB of RAM. The proposed approach achieved robustness, full automation, and time efficiency.
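The refinement step described above (iteratively dropping matching points whose residuals exceed three standard deviations) can be sketched as follows. This is a simplified stand-in, not the paper's procedure: it assumes a translation-only model in one dimension, so each residual is an offset minus the mean offset, and it scales residuals by the sample standard deviation without the leverage term that a true studentized residual includes. The function name and the data are hypothetical.

```python
import statistics


def reject_outliers(offsets, threshold=3.0):
    """Iteratively drop values more than `threshold` std devs from the mean.

    After each pass the mean and standard deviation are re-estimated from
    the remaining values, and the loop stops when nothing else is removed.
    """
    kept = list(offsets)
    while len(kept) > 2:
        mean = statistics.fmean(kept)
        std = statistics.stdev(kept)
        if std == 0.0:
            break
        inliers = [v for v in kept if abs(v - mean) / std <= threshold]
        if len(inliers) == len(kept):
            break  # converged: no outliers left at this threshold
        kept = inliers
    return kept


# Hypothetical matching-point offsets in pixels, with one gross mismatch.
offsets = [0.1, -0.2, 0.0, 0.3, -0.1] * 6 + [25.0]

kept = reject_outliers(offsets)
print(f"kept {len(kept)} of {len(offsets)} matching points")
```

Re-estimating the statistics after each removal matters because a gross mismatch inflates the standard deviation and can mask smaller outliers on the first pass.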

Quantity-based Early Cost Estimation Model for Road Construction Projects (대표물량 기반의 도로공사 설계단계의 개략공사비 예측모델)

  • Kim, Du Yon;Kim, Byungil;Yeo, Donghoon;Han, Seung Heon
    • KSCE Journal of Civil and Environmental Engineering Research / v.29 no.3D / pp.373-379 / 2009
  • Cost estimation in the early phase enables the government to plan public budgets more efficiently by providing information about construction cost. However, costs are difficult to estimate in the early phase because little information is available. The method currently used by the government multiplies the length of the road by a unit cost per length, and it shows a high error rate because it cannot reflect the unique characteristics of each project. As a project proceeds, the level of available information also changes, so it is important to reflect the information available at each stage. This paper divides the early phase into two parts, the planning phase and the early design phase, and develops a cost estimation model that considers the level of information available in each phase. A total of 143 cases are used to find the influencing variables and develop the cost estimation model, and the model is validated against the required accuracy level. This cost estimation model, which reflects the level of available information, can be applied to public budgeting, feasibility tests, and comparison between routes.
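The baseline method mentioned above (road length multiplied by a unit cost per length) and the notion of an error rate can be written out as a short worked example. All figures are hypothetical, the error-rate definition (absolute error relative to actual cost) is an assumption rather than the paper's stated metric, and the function names are illustrative.

```python
def baseline_cost(length_km, unit_cost_per_km):
    """Length-based estimate: road length times unit cost per length."""
    return length_km * unit_cost_per_km


def error_rate(estimated, actual):
    """Absolute estimation error as a fraction of the actual cost."""
    return abs(estimated - actual) / actual


# Hypothetical project: 12.5 km of road at 20.0 cost units per km,
# with an assumed actual cost of 200.0 units.
estimate = baseline_cost(12.5, 20.0)
print(f"estimate = {estimate:.1f}, error rate = {error_rate(estimate, 200.0):.0%}")
```

Two projects of equal length receive the same estimate under this baseline regardless of their bridges, tunnels, or earthwork volumes, which is the limitation the quantity-based model is meant to address.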

Optimization-based Deep Learning Model to Localize L3 Slice in Whole Body Computerized Tomography Images (컴퓨터 단층촬영 영상에서 3번 요추부 슬라이스 검출을 위한 최적화 기반 딥러닝 모델)

  • Seongwon Chae;Jae-Hyun Jo;Ye-Eun Park;Jin-Hyoung, Jeong;Sung Jin Kim;Ahnryul Choi
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.16 no.5 / pp.331-337 / 2023
  • In this paper, we propose a deep learning model that detects the lumbar 3 (L3) slice in CT images, for use in determining the occurrence and degree of sarcopenia. In addition, we propose an optimization technique that uses the oversampling ratio and class weight as design parameters to address the performance degradation caused by the data imbalance between L3-level and non-L3-level portions of CT data. To train and test the model, a total of 150 whole-body CT images from 104 prostate cancer patients and 46 bladder cancer patients who visited Gangneung Asan Medical Center were used. The deep learning model was ResNet50, and the design parameters of the optimization technique were six types of model hyperparameters, the data augmentation ratio, and the class weight. The proposed optimization-based L3 level extraction model reduced the median L3 error by about 1.0 slices compared to the control model (a model that optimized only 5 types of hyperparameters). The results of this study show that accurate L3 slice detection is possible and suggest that the data imbalance problem can be addressed effectively through oversampling by data augmentation and class weight adjustment.
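The two imbalance remedies named above, class weights and oversampling, can be sketched in a few lines. This is an illustration under assumptions, not the paper's method: the weights use the common inverse-frequency heuristic (total samples divided by the number of classes times the class count), the oversampling duplicates minority samples at random where the paper uses data augmentation, and the 95:5 label split and function names are hypothetical.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_total / (n_classes * n_in_class)."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: total / (len(counts) * n) for cls, n in counts.items()}


def oversample(samples, labels, minority, ratio, seed=0):
    """Add random duplicates so the minority class grows by `ratio` times."""
    rng = random.Random(seed)
    minority_samples = [s for s, lab in zip(samples, labels) if lab == minority]
    n_extra = int(round(len(minority_samples) * (ratio - 1)))
    extra = [rng.choice(minority_samples) for _ in range(n_extra)]
    return list(samples) + extra, list(labels) + [minority] * n_extra


# Hypothetical slice labels: few L3 slices among many non-L3 slices.
labels = ["non-L3"] * 95 + ["L3"] * 5
slices = list(range(100))  # placeholder slice indices

weights = balanced_class_weights(labels)
new_slices, new_labels = oversample(slices, labels, "L3", ratio=4.0)

print(weights)
print(Counter(new_labels))
```

Treating the oversampling ratio and the class weight as tunable design parameters, as the abstract describes, amounts to searching over `ratio` and the weight values rather than fixing them by a heuristic such as the one used here.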