• Title/Summary/Keyword: statistical learning

A study on average changes in college students' credits earned and grade point average according to face-to-face and non-face-to-face classes in the COVID-19 situation

  • Jeong-Man, Seo
    • Journal of the Korea Society of Computer and Information / v.28 no.3 / pp.167-175 / 2023
  • This study examined how college students' earned grades and grade point averages (GPA) changed between face-to-face and non-face-to-face classes during the COVID-19 pandemic. Grade data were extracted using an access database. For 152 students during the 3rd semester, the grade point average, average grade point average, midterm exam, final exam, assignment score, and attendance score were compared between students who took non-face-to-face classes and those who took face-to-face classes. An independent-samples t-test was used for the analysis. The face-to-face class averaged 21.22 points and the non-face-to-face class 16.83 points, a statistically significant difference of 4.39 points, and the average grade value was 0.6642 points higher for the face-to-face class. In conclusion, face-to-face students' grades were generally higher than those of non-face-to-face students, and face-to-face students showed higher participation in class.
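The independent-samples t-test named in this abstract can be sketched as follows. This is a minimal illustration that assumes equal variances (the pooled, Student form); the abstract does not say which variant was used, and the function name `independent_t` and the score lists are hypothetical, not the study's data.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(a, b):
    """Student's two-sample t statistic with pooled variance (assumes equal variances)."""
    na, nb = len(a), len(b)
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2  # statistic and degrees of freedom


# Hypothetical scores for illustration only.
face_to_face = [22.0, 20.5, 21.0, 23.5, 19.0]
non_face_to_face = [17.0, 16.5, 15.0, 18.5, 17.0]
t_stat, dof = independent_t(face_to_face, non_face_to_face)
```

The statistic would then be compared against a t distribution with `dof` degrees of freedom to obtain a p-value.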

Wafer bin map failure pattern recognition using hierarchical clustering (계층적 군집분석을 이용한 반도체 웨이퍼의 불량 및 불량 패턴 탐지)

  • Jeong, Joowon;Jung, Yoonsuh
    • The Korean Journal of Applied Statistics / v.35 no.3 / pp.407-419 / 2022
  • The semiconductor fabrication process is complex and time-consuming. Errors in the process sometimes result in defective dies on the wafer bin map (WBM). A faulty WBM can be detected by finding patterns formed by the defective dies. Searching for failures manually takes a long time because of the enormous number of WBMs. In this paper, we suggest a two-step approach to discover probable patterns on WBMs. The first step separates normal WBMs from defective ones. We adapt hierarchical clustering for de-noising, which performs this task well when the minimum number of points and the cutting height are tuned carefully. A WBM declared faulty moves on to the second step, in which we classify the patterns among the defective WBMs. For this purpose, we extract features from the WBM, and a machine learning algorithm then classifies the pattern. We use a real WBM data set (WM-811K) released by Taiwan Semiconductor Manufacturing Company.
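The de-noising step in this abstract, with its two tuning parameters (cutting height and minimum number of points), might look roughly like the sketch below. It assumes single-linkage clustering, for which cutting the dendrogram at a height is equivalent to joining every pair of points no farther apart than that height; the abstract does not state the linkage, and `denoise` and the die coordinates are hypothetical.

```python
from math import dist


def denoise(points, cut_height, min_points):
    """Return single-linkage clusters (cut at cut_height) that have at least min_points members."""
    n = len(points)
    parent = list(range(n))  # union-find forest over point indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Merge any two points whose distance does not exceed the cutting height.
    for i in range(n):
        for j in range(i + 1, n):
            if dist(points[i], points[j]) <= cut_height:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(points[i])
    # Small clusters are treated as noise and dropped.
    return [c for c in clusters.values() if len(c) >= min_points]


# Hypothetical (row, column) positions of defective dies on one wafer.
dies = [(0, 0), (0, 1), (1, 1), (5, 5), (9, 0), (9, 1), (8, 1)]
kept = denoise(dies, cut_height=1.0, min_points=3)
```

Under these assumptions a wafer with no surviving cluster would be treated as normal, and one with surviving clusters would be passed on to the pattern classification step.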

Comparative Study of Anomaly Detection Accuracy of Intrusion Detection Systems Based on Various Data Preprocessing Techniques (다양한 데이터 전처리 기법 기반 침입탐지 시스템의 이상탐지 정확도 비교 연구)

  • Park, Kyungseon;Kim, Kangseok
    • KIPS Transactions on Software and Data Engineering / v.10 no.11 / pp.449-456 / 2021
  • An intrusion detection system is a technology that detects abnormal behaviors and operations that violate security and prevents system attacks. Existing intrusion detection systems have been designed using statistical analysis or anomaly detection techniques for traffic patterns, but modern systems generate traffic that differs from that of earlier systems because of rapidly advancing technologies, so the existing methods have limitations. To overcome these limitations, intrusion detection methods that apply various machine learning techniques are being actively studied. This study compares data preprocessing techniques that can improve anomaly detection accuracy, using NGIDS-DS (Next Generation IDS Database), which was generated by simulation equipment for traffic in various network environments. Padding and a sliding window were used for data preprocessing, and oversampling with an Adversarial Auto-Encoder (AAE) was applied to address the imbalance between the proportions of normal and abnormal data. In addition, an improvement in detection accuracy was confirmed when Skip-gram, one of the Word2Vec techniques, was used to extract feature vectors from the preprocessed sequence data. PCA-SVM and GRU were used as models for the comparative experiments, and better performance was obtained when the sliding window, Skip-gram, AAE, and GRU were applied.
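The two preprocessing options compared in this abstract, padding and a sliding window, can be illustrated on a short sequence of event IDs. This is a minimal sketch; the function names, the pad value of 0, and the window sizes are assumptions, not the paper's settings.

```python
def pad(seq, length, pad_value=0):
    """Right-pad (or truncate) seq to exactly `length` items."""
    return list(seq[:length]) + [pad_value] * max(0, length - len(seq))


def sliding_windows(seq, size, step=1):
    """Return overlapping windows of `size` items, advancing by `step` each time."""
    if len(seq) < size:
        # A sequence shorter than one window is padded into a single window.
        return [pad(seq, size)]
    return [list(seq[i:i + size]) for i in range(0, len(seq) - size + 1, step)]


# Hypothetical sequence of system-call IDs.
events = [7, 3, 3, 9, 1]
fixed = pad(events, 8)
windows = sliding_windows(events, size=3)
```

Padding yields one fixed-length sample per sequence, whereas the sliding window yields several shorter samples from the same sequence.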

A study of artificial neural network for in-situ air temperature mapping using satellite data in urban area (위성 정보를 활용한 도심 지역 기온자료 지도화를 위한 인공신경망 적용 연구)

  • Jeon, Hyunho;Jeong, Jaehwan;Cho, Seongkeun;Choi, Minha
    • Journal of Korea Water Resources Association / v.55 no.11 / pp.855-863 / 2022
  • In this study, an Artificial Neural Network (ANN) was used to map air temperature in Seoul. MODerate resolution Imaging Spectroradiometer (MODIS) data were used as auxiliary data for the mapping. To optimize the ANN network topology, scatterplots and statistical analysis were examined, and the input data were classified and combined into groups: highly correlated data such as surface temperature, the Normalized Difference Vegetation Index (NDVI), the Enhanced Vegetation Index (EVI), time (satellite observation time, day of year), location (latitude, longitude), and data quality (cloudiness). When machine learning was conducted only with data highly correlated with air temperature, the average correlation coefficient (r) and Root Mean Squared Error (RMSE) were 0.967 and 2.708℃, respectively. Performance improved as other data were added, and the best performance was obtained when all data were used, with average r and RMSE of 0.9840 and 1.883℃. In the Seoul air temperature map produced by the ANN model, air temperature was calculated appropriately for the topographic characteristics of each pixel, and expanding the research area and diversifying the satellite data should make it possible to analyze air temperature distributions at the city and national levels.
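The two evaluation metrics reported in this abstract, the correlation coefficient (r) and RMSE, follow their standard definitions and can be computed as in the sketch below. The temperature values are hypothetical and only show how the functions are called.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rmse(observed, predicted):
    """Root mean squared error, in the same unit as the inputs."""
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


# Hypothetical in-situ and ANN-estimated air temperatures in degrees Celsius.
observed = [12.1, 15.4, 18.0, 21.3, 9.8]
estimated = [11.5, 16.0, 17.2, 22.0, 10.9]
r = pearson_r(observed, estimated)
error = rmse(observed, estimated)
```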

Comparative study of data augmentation methods for fake audio detection (음성위조 탐지에 있어서 데이터 증강 기법의 성능에 관한 비교 연구)

  • KwanYeol Park;Il-Youp Kwak
    • The Korean Journal of Applied Statistics / v.36 no.2 / pp.101-114 / 2023
  • Data augmentation is used effectively to address model overfitting by allowing the training dataset to be viewed from various perspectives. In addition to image augmentation techniques such as rotation, cropping, horizontal flip, and vertical flip, occlusion-based data augmentation methods such as Cutmix and Cutout have been proposed. For models based on speech data, occlusion-based augmentation can be applied after converting a 1D speech signal into a 2D spectrogram. In particular, SpecAugment is an occlusion-based augmentation technique for speech spectrograms. In this study, we compare data augmentation techniques that can be used for fake audio detection. Using data from the ASVspoof2017 and ASVspoof2019 competitions, which were held to detect fake audio, an LCNN model was trained on datasets augmented with the occlusion-based methods Cutout, Cutmix, and SpecAugment. All three augmentation techniques generally improved model performance. Cutmix showed the best performance on ASVspoof2017, Mixup on ASVspoof2019 LA, and SpecAugment on ASVspoof2019 PA. In addition, increasing the number of masks for SpecAugment helped to improve performance. In conclusion, the appropriate augmentation technique differs depending on the situation and data.
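The occlusion idea behind SpecAugment, masking bands of a 2D spectrogram, can be sketched as below. This covers only frequency and time masking (the original SpecAugment method also includes time warping), and the function name, parameters, and zero fill value are assumptions rather than the settings used in the study.

```python
import random


def spec_augment(spec, num_freq_masks=1, num_time_masks=1, max_width=2, rng=None):
    """Return a copy of spec (rows = frequency bins, columns = time frames) with random bands zeroed."""
    rng = rng or random.Random()
    n_freq, n_time = len(spec), len(spec[0])
    out = [row[:] for row in spec]  # copy so the input is left untouched

    for _ in range(num_freq_masks):
        width = rng.randint(0, min(max_width, n_freq))
        start = rng.randint(0, n_freq - width)
        for f in range(start, start + width):
            out[f] = [0.0] * n_time  # mask a whole frequency band

    for _ in range(num_time_masks):
        width = rng.randint(0, min(max_width, n_time))
        start = rng.randint(0, n_time - width)
        for row in out:
            for t in range(start, start + width):
                row[t] = 0.0  # mask the same time frames in every bin
    return out


# Hypothetical 4-bin by 6-frame spectrogram filled with ones.
spectrogram = [[1.0] * 6 for _ in range(4)]
augmented = spec_augment(spectrogram, num_freq_masks=2, rng=random.Random(0))
```

Raising `num_freq_masks` or `num_time_masks` corresponds to the abstract's observation about increasing the number of masks.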

Development of a Water Quality Indicator Prediction Model for the Korean Peninsula Seas using Artificial Intelligence (인공지능 기법을 활용한 한반도 해역의 수질평가지수 예측모델 개발)

  • Seong-Su Kim;Kyuhee Son;Doyoun Kim;Jang-Mu Heo;Seongeun Kim
    • Journal of the Korean Society of Marine Environment & Safety / v.29 no.1 / pp.24-35 / 2023
  • Rapid industrialization and urbanization have led to severe marine pollution. A Water Quality Index (WQI) has been developed to allow the effective management of marine pollution. However, the WQI suffers from loss of information due to the complex calculations involved, changes in standards, calculation errors by practitioners, and statistical errors. Consequently, research on the use of artificial intelligence techniques to predict the marine and coastal WQI is being conducted both locally and internationally. In this study, six techniques (RF, XGBoost, KNN, Ext, SVM, and LR) were evaluated using marine environmental measurement data (2000-2020) to determine the most appropriate artificial intelligence technique for estimating the WQI of five ecoregions in the Korean seas. Our results show that the random forest (RF) method offers the best performance of the methods studied. Residual analysis of the predicted and actual WQI scores using the random forest method shows that the temporal and spatial prediction performance was exceptional for all ecoregions. In conclusion, the RF model for WQI prediction developed in this study is considered applicable to Korean seas with high accuracy.

Study of Smart Integration processing Systems for Sensor Data (센서 데이터를 위한 스마트 통합 처리 시스템 연구)

  • Ji, Hyo-Sang;Kim, Jae-Sung;Kim, Ri-Won;Kim, Jeong-Joon;Han, Ik-Joo;Park, Jeong-Min
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.7 no.8 / pp.327-342 / 2017
  • In this paper, we introduce an integrated processing system for smart sensor data that collects sensor data and processes it efficiently for IoT services. As the IoT field develops on the basis of technologies for collecting sensor data and for transmitting and receiving it over a network, and as projects such as smart homes and autonomous vehicles progress, autonomous control systems that process and effectively utilize sensor data have become an issue. However, because the data types of the sensors used to monitor an autonomous control system vary by domain, a sensor data integration processing system that can apply autonomous control to different domains is necessary. Therefore, we introduce the Smart Sensor Data Integrated Processing System and apply it to a window as a reference case, processing internal and external sensor data in three stages: 1) receiveData, 2) parseData, and 3) addToDatabase. On this basis, we propose and implement "Smart Window", an automatic window opening/closing system that ventilates under autonomous control to create a comfortable indoor environment. As a result, atmospheric information is collected and monitored, and machine learning for statistical analysis and better autonomous control based on the stored data becomes possible.
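The three stages named in this abstract (receiveData, parseData, addToDatabase) suggest a simple pipeline, sketched below. Only the stage names come from the abstract; the `key=value;key=value` message format, the table schema, and the use of an in-memory SQLite database are assumptions made for illustration.

```python
import sqlite3


def receive_data(source):
    """Stage 1 (receiveData): take the next raw message from a source iterator."""
    return next(source)


def parse_data(raw):
    """Stage 2 (parseData): turn 'key=value;key=value' text into a dict of floats."""
    return {k: float(v) for k, v in (pair.split("=") for pair in raw.split(";"))}


def add_to_database(conn, reading):
    """Stage 3 (addToDatabase): store each sensor value as one row."""
    conn.executemany("INSERT INTO readings (sensor, value) VALUES (?, ?)", reading.items())
    conn.commit()


# Hypothetical in-memory store and messages from indoor sensors.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT, value REAL)")
messages = iter(["temperature=23.5;humidity=40", "temperature=24.0;humidity=42"])
for _ in range(2):
    add_to_database(conn, parse_data(receive_data(messages)))
```

Keeping parsing separate from storage is one way to handle sensor data types that vary by domain, since only the parsing stage would need to change.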

Age classification of emergency callers based on behavioral speech utterance characteristics (발화행태 특징을 활용한 응급상황 신고자 연령분류)

  • Son, Guiyoung;Kwon, Soonil;Baik, Sungwook
    • The Journal of Korean Institute of Next Generation Computing / v.13 no.6 / pp.96-105 / 2017
  • In this paper, we investigated speaker age classification by analyzing voice calls to an emergency center. We classified callers as adult or elderly using behavioral speech utterance features and a Support Vector Machine (SVM), a machine learning classifier. Through analysis of the emergency center call data, we selected two behavioral speech utterance features: silent pause and turn-taking latency. The criteria for age classification were first selected through analysis of these features and were then shown to be significant (p < 0.05) by statistical analysis. We analyzed 200 datasets (adult: 100, elderly: 100) with 5-fold cross-validation using the SVM classifier. As a result, we achieved 70% accuracy using the two features, which is higher than the accuracy obtained with a single feature. These results suggest behavioral speech utterances as a new method for age classification; in further work, classification will combine acoustic information (MFCC) with the new behavioral speech utterance features on real voice data. Furthermore, this work will contribute to the development of emergency situation judgment systems related to age classification.
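The 5-fold cross-validation mentioned in this abstract splits the 200 samples into five test folds of 40, training on the remaining 160 each time. The sketch below shows only the index splitting, with a hypothetical function name and seed; it does not stratify by class and does not include the SVM itself.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Every k-th shuffled index goes to the same fold, giving near-equal fold sizes.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test


# 200 samples, as in the abstract (100 adult, 100 elderly).
splits = list(k_fold_indices(200, k=5))
```

The reported accuracy would be the average over the five test folds.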

Effects of Personal Protective Equipment Practice Education on the Effectiveness of Repeated Learning and Satisfaction (개인보호구 실습교육의 반복학습 효과와 만족도에 미치는 영향)

  • Dae Jin Jo;Won Souk Eoh
    • Journal of Korean Society of Occupational and Environmental Hygiene / v.33 no.2 / pp.156-170 / 2023
  • Objectives: This study conducted practical training to improve the proper use of personal protective equipment (PPE), which greatly impacts workplace safety and health management. PPE education was conducted through active participation, without theoretical modules, and aimed to identify the effects of repeated practical education and determine ways to increase participant satisfaction. Methods: Study data were analyzed using IBM SPSS Statistics ver.29. First, participants' general characteristics were analyzed with frequency analysis. Second, normality and equality of variances (Levene's test) were tested for the dependent variables prior to statistical analyses to determine whether parametric tests could be used. In general, normality is assumed when the sample size is 30 or more per the central limit theorem (Park et al., 2014). As our sample size of health management workers was 43, normality can be assumed. However, to ensure the rigor of the study, we examined skewness and kurtosis. The results confirmed that the data were normally distributed. Third, the effects of repeated PPE training were analyzed using paired t-tests. Fourth, differences in satisfaction with PPE training according to safety and health job position and safety and health certification were analyzed with the t-test and Welch's t-test. For parameters that did not meet the assumption of equal variances, Welch's t-test was performed. Results: Repeated PPE training improved the educational outcomes, and the improvements were significant in the 1st and 2nd respiratory PPE and safety and hygiene PPE training evaluations (p<.001). In terms of safety and health job position, repeated training led to improvements in educational outcomes, with significant improvements observed among supervisors and specialized health management institution workers in the 1st and 2nd training evaluations (p<.005). In terms of safety certification, repeated training led to improvements in educational outcomes, with significant improvements observed among both certified and non-certified individuals (p<.005). Regarding satisfaction with PPE training according to safety and health job position, specialized health management institution workers showed greater satisfaction than supervisors, with significant differences in satisfaction for expertise of lecture, work relevance, and lecturer's attitude (p<.001). Regarding satisfaction with PPE training according to safety and health certification, satisfaction was higher among certified individuals, with significant differences in satisfaction for work relevance and lecturer's attitude (p<.05). Conclusions: PPE education should be provided as practical training. Repeated training can enhance educational outcomes for individuals with inadequate knowledge and understanding of PPE prior to education. For individuals with high levels of pre-existing knowledge and understanding of PPE, the results show that various training experiences should be provided to enhance their satisfaction. Therefore, the study suggests that the workplace should actively seek educational media and methods to acquire expertise and skills in wearing personal protective equipment and improve the ability to use

The Effect of Engineering Design Based Ocean Clean Up Lesson on STEAM Attitude and Creative Engineering Problem Solving Propensity (공학설계기반 오션클린업(Ocean Clean-up) 수업이 STEAM태도와 창의공학적 문제해결성향에 미치는 효과)

  • DongYoung Lee;Hyojin Yi;Younkyeong Nam
    • Journal of the Korean earth science society / v.44 no.1 / pp.79-89 / 2023
  • The purpose of this study was to investigate the effects of engineering design-based ocean cleanup classes on STEAM attitudes and creative engineering problem-solving dispositions. During this process, we also tried to identify the points that students found interesting in engineering design-based classes. For this study, a science class with six lessons based on engineering design was developed and reviewed by a professor who majored in engineering design, along with five engineering design experts with a master's degree or higher. The subject of the class was the design and implementation of scientific and engineering measures to reduce marine pollution, based on the method implemented in an actual Ocean Clean-up Project. The engineering design process used the engineering design model presented by NGSS (2013) and was configured so that students experienced redesign through the optimization process. To verify effectiveness, the STEAM attitude questionnaire developed by Park et al. (2019) and the creative engineering problem-solving propensity test tool developed by Kang and Nam (2016) were used. A pre-post t-test was used for the statistical analysis of effectiveness. In addition, learners' descriptive responses about the points they found interesting were transcribed, then analyzed and visualized through degree centrality analysis. The results confirmed that engineering design in science classes had a positive effect on both STEAM attitude and creative engineering problem-solving disposition (p< .05). In addition, the unstructured data analysis showed that science and engineering knowledge, engineering experience, and cooperation and collaboration emerged as the factors that interested learners, confirming that engineering experience was the main factor.