• Title/Abstract/Keywords: prediction score

516 search results (processing time: 0.025 s)

Prediction of OPS(On-base Plus Slugging) in KBO League (한국프로야구에서 장타율과 출루율(OPS) 예측 연구)

  • Dong Yun Shin;Jinho Kim
    • The Journal of Bigdata / Vol. 7, No. 1 / pp.49-61 / 2022
  • In sports, data analysis plays a growing role in team management, including strategy planning and marketing. In the KBO (Korea Baseball Organization) league in particular, teams draw up plans at the end of each season, such as recruiting and developing players through free agency (FA) and trades, to shape their strategy for the next year. For this reason, predicting players' performance in the following year is very important. This study is limited to batters and examines how to predict whether a batter's performance will improve in the next year. OPS (On-base Plus Slugging), which is easy to calculate and closely related to team scoring, was used as the reference statistic for a rise or fall. Forty years of regular-season data, from 1982 to 2021, were used, and 11 machine learning classification models were tested. In predicting the rise or fall of OPS, RBF SVM, Neural Net, Gaussian Process, and AdaBoost were more accurate than the other classification models, and age did not significantly affect accuracy.
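
The abstract describes OPS as easy to calculate and uses its year-over-year rise or fall as the classification target. A minimal sketch of that calculation follows, assuming the standard definitions of on-base percentage and slugging percentage (the abstract does not spell out the formula). The `BattingLine` class, its field names, and the sample numbers are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """One batter-season; field names are hypothetical, not from the paper."""
    ab: int       # at-bats
    singles: int
    doubles: int
    triples: int
    hr: int       # home runs
    bb: int       # walks
    hbp: int      # hit by pitch
    sf: int       # sacrifice flies

    @property
    def hits(self) -> int:
        return self.singles + self.doubles + self.triples + self.hr


def obp(line: BattingLine) -> float:
    """On-base percentage: (H + BB + HBP) / (AB + BB + HBP + SF)."""
    denom = line.ab + line.bb + line.hbp + line.sf
    return (line.hits + line.bb + line.hbp) / denom if denom else 0.0


def slg(line: BattingLine) -> float:
    """Slugging percentage: total bases / at-bats."""
    total_bases = line.singles + 2 * line.doubles + 3 * line.triples + 4 * line.hr
    return total_bases / line.ab if line.ab else 0.0


def ops(line: BattingLine) -> float:
    """OPS is the sum of OBP and SLG."""
    return obp(line) + slg(line)


def ops_rises(this_year: BattingLine, next_year: BattingLine) -> bool:
    """Binary label: True if OPS goes up in the following season."""
    return ops(next_year) > ops(this_year)


# Made-up seasons for illustration only.
season_a = BattingLine(ab=500, singles=100, doubles=30, triples=5, hr=15, bb=50, hbp=5, sf=5)
season_b = BattingLine(ab=480, singles=90, doubles=20, triples=2, hr=8, bb=40, hbp=3, sf=4)
print(round(ops(season_a), 3), ops_rises(season_b, season_a))
```

A classifier of the kind listed in the abstract would then be trained to predict the `ops_rises` label from a batter's earlier records; which features the authors used is not stated here.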

Automated Prioritization of Construction Project Requirements using Machine Learning and Fuzzy Logic System

  • Hassan, Fahad ul;Le, Tuyen;Le, Chau;Shrestha, K. Joseph
    • International conference on construction engineering and project management / The 9th International Conference on Construction Engineering and Project Management / pp.304-311 / 2022
  • Construction inspection is a crucial stage that ensures that all contractual requirements of a construction project are verified. The construction inspection capabilities of state highway agencies have been greatly affected by budget reductions. As a result, efficient inspection practices such as risk-based inspection are required to optimize the use of limited resources without compromising inspection quality. Automated prioritization of textual requirements according to their criticality would be extremely helpful, since contractual requirements are typically presented in unstructured natural language in voluminous text documents. The current study introduces a novel model for predicting the risk level of requirements using machine learning (ML) algorithms. The ML algorithms tested in this study included naïve Bayes, support vector machines, logistic regression, and random forest. The training data include sequences of requirement texts that were labeled with risk levels (very low, low, medium, high, very high) using a fuzzy logic system. The fuzzy model treats the three risk factors (severity, probability, detectability) as fuzzy input variables and applies fuzzy inference rules to determine the labels of requirements. The performance of the model was examined on a labeled dataset created with the fuzzy inference rules and three different membership functions. The developed requirement risk prediction model yielded a precision, recall, and f-score of 78.18%, 77.75%, and 75.82%, respectively. The proposed model is expected to provide construction inspectors with a means for the automated prioritization of voluminous requirements by their importance, thus helping to maximize the effectiveness of inspection activities under resource constraints.
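
The fuzzy labeling step in the abstract (three fuzzy inputs, membership functions, inference rules) can be illustrated with a very small sketch. Everything concrete here is an assumption for illustration: the 0 to 10 input scale, the triangular membership shapes and their breakpoints, and the single example rule are hypothetical and are not the paper's actual membership functions or rule base.

```python
def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership: 0 at or outside a and c, rising to 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical fuzzy sets on an assumed 0-10 scale (peaks at 0, 5, and 10).
FUZZY_SETS = {
    "low": (-5.0, 0.0, 5.0),
    "medium": (0.0, 5.0, 10.0),
    "high": (5.0, 10.0, 15.0),
}


def fuzzify(x: float) -> dict:
    """Degree of membership of a crisp score in each fuzzy set."""
    return {name: triangular(x, *abc) for name, abc in FUZZY_SETS.items()}


def rule_very_high(severity: float, probability: float, detectability: float) -> float:
    """Firing strength of one illustrative rule, using min as the fuzzy AND:
    IF severity is high AND probability is high AND detectability is low
    THEN risk is very high."""
    return min(
        fuzzify(severity)["high"],
        fuzzify(probability)["high"],
        fuzzify(detectability)["low"],
    )


print(fuzzify(8.0))
print(rule_very_high(severity=8.0, probability=9.0, detectability=2.0))
```

A full fuzzy inference system would evaluate many such rules, aggregate their outputs, and defuzzify the result into one of the five risk labels; the sketch stops at the firing strength of a single rule.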

Evaluation of U-Net Based Learning Models according to Equalization Algorithm in Thyroid Ultrasound Imaging (갑상선 초음파 영상의 평활화 알고리즘에 따른 U-Net 기반 학습 모델 평가)

  • Moo-Jin Jeong;Joo-Young Oh;Hoon-Hee Park;Joo-Young Lee
    • Journal of radiological science and technology / Vol. 47, No. 1 / pp.29-37 / 2024
  • This study aims to evaluate how the performance of U-Net based learning models varies with the histogram equalization algorithm. The subjects were 17 radiology students at the authors' college, and 1,727 data sets were used in which a region of interest (ROI) was set on the thyroid after ultrasound images were acquired. The training set consisted of 1,383 images, the validation set of 172, and the test set of 172. The equalization algorithms were Histogram Equalization (HE) and Contrast Limited Adaptive Histogram Equalization (CLAHE), the latter divided by clip limit into CLAHE8-1, CLAHE8-2, and CLAHE8-3. The deep learning models were trained after resizing, histogram equalization, Z-score normalization, and data augmentation. In the experiment, the Attention U-Net showed its highest performance, 0.8355, with CLAHE8-2, and the U-Net and BSU-Net showed their highest performance, 0.8303 and 0.8277 respectively, with CLAHE8-3. For mIoU, the Attention U-Net reached 0.7175 with CLAHE8-2, while the U-Net reached 0.7098 and the BSU-Net 0.7060 with CLAHE8-3. This study sought to confirm the effect of histogram equalization of ultrasound images on the U-Net, Attention U-Net, and BSU-Net models. Increasing the clip limit is expected to clarify boundaries and improve the contrast of the thyroid region during model training, which increases the match between the ROI and the predicted mask and consequently improves performance.
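
Two of the steps named in the abstract, Z-score normalization and mIoU, are simple enough to sketch with the Python standard library. This is an illustration under stated assumptions rather than the authors' pipeline: images and masks are flattened to plain lists, the population standard deviation is used, and mIoU is taken as the mean of the background and thyroid class IoUs (the abstract does not say how the mean is formed). The function names are hypothetical.

```python
from statistics import mean, pstdev


def z_score_normalize(pixels):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu = mean(pixels)
    sigma = pstdev(pixels)
    if sigma == 0:
        return [0.0 for _ in pixels]  # constant image: nothing to scale
    return [(p - mu) / sigma for p in pixels]


def iou(pred, target, cls):
    """Intersection over union for one class label in two flattened masks."""
    inter = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
    union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
    return inter / union if union else 1.0


def mean_iou(pred, target, classes=(0, 1)):
    """Mean IoU over the given classes (assumed: 0 = background, 1 = thyroid ROI)."""
    return mean(iou(pred, target, c) for c in classes)


# Tiny 2x2 masks, flattened row by row; values are made up.
predicted_mask = [1, 1, 0, 0]
roi_mask = [1, 0, 0, 0]
print(z_score_normalize([1, 2, 3]))
print(mean_iou(predicted_mask, roi_mask))
```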

Genetic evaluation for economic traits of commercial Hanwoo population using single-step GBLUP

  • Gwang Hyeon Lee;Khaliunaa Tseveen;Yoon Seok Lee;Hong Sik Kong
    • Journal of Animal Reproduction and Biotechnology / Vol. 38, No. 4 / pp.268-274 / 2023
  • Background: The single-step genomic best linear unbiased prediction (ssGBLUP) method, which incorporates not only genomic information but also phenotypic information from the pedigree, has recently been under study. In this study, we performed an ssGBLUP analysis of a commercial Hanwoo population using phenotypic, genotypic, and pedigree data. Methods: The test population comprised 1,740 head of Hanwoo raised in four regions of Korea, while the reference population comprised 18,499 head of Hanwoo raised across the country, together with two-generation pedigree data. Analysis was performed using genotype data generated with the Hanwoo 50K SNP BeadChip. Results: The mean genomic estimated breeding values (GEBVs) estimated by ssGBLUP for carcass weight (CWT), eye muscle area (EMA), back fat thickness (BFT), and marbling score (MS) were 7.348, 1.515, -0.355, and 0.040, respectively, and the accuracies for these traits were 0.749, 0.733, 0.769, and 0.768, respectively. The correlations between the GEBVs obtained in this study and actual slaughter performance were 0.519, 0.435, 0.444, and 0.543 for CWT, EMA, BFT, and MS, respectively. Conclusions: Our results suggest that the ssGBLUP method enables a more accurate evaluation because it evaluates an individual genetically using not only genotype information but also phenotypic information from the pedigree. Individual evaluation with ssGBLUP is considered effective for enhancing the genetic merit of farms and enabling accurate and rapid improvement. If more pedigree information on the reference population is collected for analysis, genetic merit is expected to be evaluated more accurately.

Preliminary Inspection Prediction Model to select the on-Site Inspected Foreign Food Facility using Multiple Correspondence Analysis (차원축소를 활용한 해외제조업체 대상 사전점검 예측 모형에 관한 연구)

  • Hae Jin Park;Jae Suk Choi;Sang Goo Cho
    • Journal of Intelligence and Information Systems / Vol. 29, No. 1 / pp.121-142 / 2023
  • As the number and weight of imported food items steadily increase, safety management of imported food to prevent food safety accidents is becoming more important. The Ministry of Food and Drug Safety conducts on-site inspections of foreign food facilities before customs clearance, as well as import inspections at the customs clearance stage. However, because time, cost, and resources are limited, a data-based safety management plan for imported food is needed. In this study, we sought to increase the efficiency of on-site inspection by building a machine learning prediction model that pre-selects the companies expected to fail before the on-site inspection. We collected basic information on 303,272 foreign food facilities and processing businesses from the Integrated Food Safety Information Network, along with 1,689 on-site inspection records from 2019 to April 2022. After preprocessing the foreign food facility data, only the records subject to on-site inspection were extracted using the foreign food facility_code, yielding a total of 1,689 records and 103 variables. Of the 103 variables, those with a Theil-U index of '0' were removed, and after reduction with Multiple Correspondence Analysis, 49 characteristic variables were finally derived. We built eight different models, tuned their hyperparameters through 5-fold cross validation, and evaluated the performance of the resulting models. The goal in selecting companies for on-site inspection is to maximize recall, the probability of judging nonconforming companies as nonconforming. Among the machine learning algorithms applied, the Random Forest model, which had the highest Recall_macro, AUROC, Average PR, F1-score, and Balanced Accuracy, was evaluated as the best model. Finally, we apply Kernel SHAP (SHapley Additive exPlanations) to present the reasons why individual facilities are selected as nonconforming, and discuss applicability to the on-site inspection facility selection system. The results of this study are expected to contribute to the efficient use of limited resources such as manpower and budget by establishing an imported food management system based on a data-driven scientific risk management model.
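
The variable screening step relies on the Theil-U index. A minimal sketch of Theil's U (the uncertainty coefficient) for two categorical variables is shown below, using its usual entropy-based definition U(y|x) = (H(y) - H(y|x)) / H(y). How the authors computed and thresholded it is not described in the abstract, so the function names, the toy data, and the convention for a constant target are assumptions.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (natural log) of a list of categorical values."""
    n = len(values)
    return -sum((c / n) * math.log(c / n) for c in Counter(values).values())


def conditional_entropy(y, x):
    """H(y | x): entropy of y within each group of x, weighted by group size."""
    n = len(x)
    groups = {}
    for xi, yi in zip(x, y):
        groups.setdefault(xi, []).append(yi)
    return sum(len(g) / n * entropy(g) for g in groups.values())


def theils_u(y, x):
    """Fraction of the uncertainty in y that is removed by knowing x (0 to 1)."""
    h_y = entropy(y)
    if h_y == 0:
        return 1.0  # assumed convention: a constant y has no uncertainty left
    return (h_y - conditional_entropy(y, x)) / h_y


# Toy example: 'informative' determines the result, 'uninformative' does not.
result = ["fail", "fail", "pass", "pass"]
informative = ["a", "a", "b", "b"]
uninformative = ["a", "b", "a", "b"]
print(theils_u(result, informative), theils_u(result, uninformative))
```

Under this definition, a variable with a Theil's U of 0 carries no information about the inspection result, which is consistent with removing such variables before dimension reduction.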

Feasibility of Deep Learning Algorithms for Binary Classification Problems (이진 분류문제에서의 딥러닝 알고리즘의 활용 가능성 평가)

  • Kim, Kitae;Lee, Bomi;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / Vol. 23, No. 1 / pp.95-108 / 2017
  • Recently, AlphaGo, the Baduk (Go) artificial intelligence program developed by Google DeepMind, won a decisive victory against Lee Sedol. Many people thought that machines would not be able to beat a human at Go because, unlike chess, the number of possible paths for a single move exceeds the number of atoms in the universe, but the result was the opposite of what people predicted. After the match, artificial intelligence came into focus as a core technology of the fourth industrial revolution and attracted attention from various application domains. In particular, deep learning drew attention as the core artificial intelligence technique used in the AlphaGo algorithm. Deep learning is already being applied to many problems and performs especially well in image recognition. It also performs well on high-dimensional data such as voice, image, and natural language, where it was difficult to obtain good performance with existing machine learning techniques. In contrast, it is difficult to find deep learning research on traditional business data and structured data analysis. In this study, we examined whether the deep learning techniques studied so far can be used not only for recognizing high-dimensional data but also for binary classification problems in traditional business data analysis, such as customer churn analysis, marketing response prediction, and default prediction. We also compare the performance of the deep learning techniques with that of traditional artificial neural network models. The experimental data are the telemarketing response data of a bank in Portugal. The data have input variables such as age, occupation, loan status, and the number of previous telemarketing contacts, and a binary target variable that records whether the customer intends to open an account.
To evaluate the feasibility of deep learning algorithms and techniques for binary classification, we compared models using CNN, LSTM, and dropout, which are widely used in deep learning, with MLP models, a traditional type of artificial neural network. However, because not all network design alternatives can be tested, given the nature of artificial neural networks, the experiment was conducted under restricted settings for the number of hidden layers, the number of neurons in each hidden layer, the number of output data (filters), and the conditions for applying dropout. The F1 score, rather than overall accuracy, was used to evaluate the models in order to show how well they classify the class of interest. The detailed methods for applying each deep learning technique in the experiment are as follows. The CNN algorithm reads the values adjacent to a specific value and recognizes features, but the distance between business data fields does not matter because each field is usually independent. In this experiment, we set the filter size of the CNN to the number of fields so that the overall characteristics of the data are learned at once, and added a hidden layer to make decisions based on the additional features. For the model with two LSTM layers, the input direction of the second layer is reversed relative to the first layer in order to reduce the influence of the position of each field. For dropout, we set the neurons in each hidden layer to be dropped with a probability of 0.5. The experimental results show that the model with the highest F1 score was the CNN model using dropout, and the next best model was the MLP model with two hidden layers using dropout.
Several findings emerged from the experiment. First, models using dropout make slightly more conservative predictions than those without it and generally show better classification performance. Second, CNN models show better classification performance than MLP models. This is interesting because CNN performed well on binary classification problems, to which it has rarely been applied, as well as in the fields where its effectiveness has been proven. Third, the LSTM algorithm seems unsuitable for binary classification problems because its training time is too long relative to the performance improvement. From these results, we can confirm that some deep learning algorithms can be applied to business binary classification problems.
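
The dropout setting described in this abstract (each hidden-layer neuron dropped with probability 0.5) can be sketched in a few lines of plain Python. The sketch assumes the common "inverted dropout" convention, in which surviving activations are rescaled during training so that no change is needed at inference time; the abstract does not say which variant the authors used, and the function name and sample activations are hypothetical.

```python
import random


def dropout(activations, p=0.5, training=True, rng=None):
    """Zero each activation with probability p during training.

    Survivors are scaled by 1 / (1 - p) (inverted dropout, an assumption here),
    so the expected value of each activation is unchanged.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        return list(activations)  # inference: pass activations through unchanged
    rng = rng or random.Random()
    scale = 1.0 / (1.0 - p)
    return [a * scale if rng.random() >= p else 0.0 for a in activations]


hidden = [1.0, 2.0, 3.0, 4.0]  # made-up hidden-layer outputs
print(dropout(hidden, p=0.5, rng=random.Random(0)))  # some entries zeroed, the rest doubled
print(dropout(hidden, training=False))               # unchanged at inference time
```

Because a different random subset of neurons is dropped on every training step, the network cannot rely on any single neuron, which is one common explanation for the more conservative predictions the abstract reports.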

A STUDY ON THE CORRELATIONSHIP OF SUBMENTOVERTEX VIEW AND LATERAL CEPHALOGRAM MEASUREMENTS (이하두정방사선사진과 측모두부방사선사진상에서의 계측치 상호연관성에 관한연구)

  • Cho, Jae-Hyung;Ryu, Young-Kyu
    • The korean journal of orthodontics / Vol. 26, No. 4 / pp.414-420 / 1996
  • Cephalometric measurements have the disadvantage of representing craniofacial structures in only two dimensions, which limits their ability to describe the three-dimensional structures of the craniofacial region. Growing interest has been directed at the correlation between the two planes. This study evaluated correlations between the facial type score, which bears on malocclusion, prediction of growth changes, and the choice of treatment method and prognosis, and measurements from the submentovertex view. Cephalometric and submentovertex views were taken of skeletal Class I adults with an optimal profile, and the correlations between them were examined. The following results were obtained: 1. To identify factors that influence the average condylar angulation, FACE, INT-CO-ANG, MN-CORPUS, CON-RATIO, GON-RATIO, and MN-RATIO were used as variables in a multiple regression analysis. As a result, the following equation was obtained: CON-AVE = .173(FACE) - .322(INT-CO-ANG) + 36.34(GON-RATIO) + .420(MN-CORPUS) ($R^2=.85451$). 2. The following equation was obtained for the facial type score: FACE = .050(CON-ANG) + .023(INT-CO-ANG) - .075(MN-CORPUS) ($R^2=.31547$). 3. Among the submentovertex measurements, MN-CORPUS, CON-RATIO, GON-RATIO, and MN-RATIO showed close correlations (P<0.05). 4. The average condylar angulations were $23.37^{\circ}$ on the right and $20.71^{\circ}$ on the left; there was a difference between the two. FACE: facial type score. CON-ANG: mean value of condylar angulation. CON-AVE: mean value of right and left condylar angulation. INT-CO-ANG: angle between the right and left condylar axes. MN-CORPUS: angle formed by the right and left gonion and the pogonion. CON-RATIO: intercondylar distance/mandibular body length. GON-RATIO: intergonion distance/mandibular body length. MN-RATIO: intermylohyoid distance/mandibular body length. MX-RATIO: intermaxillary tuberosity distance/ANS-PNS distance.

Risk Ranking Analysis for the City-Gas Pipelines in the Underground Laying Facilities (지하매설물 중 도시가스 지하배관에 대한 위험성 서열화 분석)

  • Ko, Jae-Sun;Kim, Hyo
    • Fire Science and Engineering / Vol. 18, No. 1 / pp.54-66 / 2004
  • In this article, we propose a hazard assessment method for underground pipelines and identify cost-efficient pipeline maintenance schemes. Three methods are applied to rank the hazards of underground pipelines. The first is RBI (Risk Based Inspection), which first assesses the effect of the neighboring population, the dimensions and thickness of the pipe, and the working time, and enables a quantitative estimate of the risk exposure. The second is a scoring system based on the environmental factors of the buried pipelines. Finally, we quantify the frequency of releases using the present THOMAS' theory. In this work, when the hazard was assessed using the SPC scheme, the hazard scores related to how the gas pipelines erode ranged from 30 to 70, which means that the assessment criteria describe the relative hazards of actual pipelines well. Therefore, even if a pipeline region has a relatively low score, it can have a high leakage frequency because of its longer length. The acceptable limit of the pipeline release frequency is 2.50E-2 to 1.00E-1/yr, so appropriate action must be taken to keep the consequence below the acceptable region. The prediction of total frequency using regression analysis shows that the limit operating time of a pipeline is in the range of 11 to 13 years, which is consistent with that of the actual pipeline. In conclusion, the hazard ranking scheme suggested in this research can be applied effectively to the maintenance of underground pipelines.

Prediction of Improvement of Hibernating Myocardium after Coronary Artery Bypass Grafting -The role of dobutamine stress echocardiography- (동면심근을 가진 관상동맥 환자의 수술 후 기능회복의 예측에 대한 임상적 고찰 - Dobutamine 심초음파의 역할 -)

  • 유경종;강면식;이교준;김대준;임세중;정남식
    • Journal of Chest Surgery / Vol. 31, No. 8 / pp.776-780 / 1998
  • Background: In patients with coronary artery disease, dysfunctional hypoperfused myocardium at rest may represent either nonviable or viable hibernating myocardium. Two-dimensional echocardiography can detect regional wall motion abnormalities resulting from myocardial ischemia during dobutamine infusion. The purpose of the present study was to evaluate the prediction of improvement of regional left ventricular (LV) function after surgical revascularization. Materials and methods: Sixteen patients with chronic regional LV dysfunction underwent dobutamine stress echocardiography (DSE) (dobutamine: baseline, 5, 10, and 20 $\mu$g/kg/min) before coronary artery bypass grafting (CABG) and underwent echocardiography at least 2 months after CABG. Results: All patients were male, with a mean age of 58 years (range, 42 to 73 years). The mean LV ejection fraction was 41.8% (range, 19% to 55%). There were no complications during DSE and no operative morbidity or mortality. Improvement of wall motion within the dysfunctional myocardium was found in 8 (50%) of 16 patients in DSE. Among them, 6 patients (75%) showed functional recovery after CABG. The other 8 patients did not show improvement of wall motion in DSE, but 3 of them (38%) showed functional recovery after CABG. Of the 256 segments in the 16 patients, 84 were dysfunctional. Improvement of wall motion was found in 34 of the 84 segments in DSE. Among them, 23 segments (74%) showed functional recovery after CABG. Another 53 segments did not show improvement of wall motion in DSE, but 12 of them (23%) showed functional recovery after CABG. The sensitivity and specificity of DSE for the prediction of postoperative improvement of segmental wall motion were 66% and 84%, respectively. The positive and negative predictive values of DSE were 74% and 77%, respectively.
Conclusion: In patients with chronic regional LV dysfunction, we consider DSE a good predictor of the improvement of dysfunctional segments after CABG.
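
The segment-level figures in this abstract can be checked with a short calculation of sensitivity, specificity, and predictive values from a 2x2 table. One assumption needs stating loudly: the abstract prints 34 DSE-improved segments, but the reported percentages (23 recovered = 74%) and the total of 84 dysfunctional segments (with 53 DSE-negative) are only consistent with 31 DSE-improved segments, so the sketch uses 31. This is an inference from the reported numbers, not a correction confirmed by the source. The function name is hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV, and NPV from 2x2 confusion counts."""
    return {
        "sensitivity": tp / (tp + fn),  # recovered segments that DSE flagged
        "specificity": tn / (tn + fp),  # non-recovered segments that DSE cleared
        "ppv": tp / (tp + fp),          # DSE-positive segments that recovered
        "npv": tn / (tn + fn),          # DSE-negative segments that did not recover
    }


# Assumed counts: 31 DSE-positive segments (23 recovered), 53 DSE-negative (12 recovered).
dse_positive, recovered_positive = 31, 23
dse_negative, recovered_negative = 53, 12

metrics = diagnostic_metrics(
    tp=recovered_positive,
    fp=dse_positive - recovered_positive,
    fn=recovered_negative,
    tn=dse_negative - recovered_negative,
)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With these counts the four values round to the 66%, 84%, 74%, and 77% reported in the abstract.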

A Study on the Evaluative Models and Indicators for Diagnosis of Urban Visual Landscape - Focusing on Seoul City - (도시경관 진단을 위한 평가모델 및 지표개발 연구 - 서울시를 중심으로 -)

  • Kim, Seung-Ju;Im, Seung-Bin
    • Journal of the Korean Institute of Landscape Architecture / Vol. 37, No. 1 / pp.78-86 / 2009
  • Recently, problems have emerged in the urban visual landscape as a result of continuous economic growth and industrial development. At the same time, the public has begun to recognize the importance of visual resources and the need for visual landscape conservation and improvement. The development of evaluative indicators for systematic visual landscape planning and design is therefore urgent. The purpose of this study is to identify evaluative models and indicators for the diagnosis of urban visual landscapes. The study selected 18 physical indicators (statistical data) through literature review and conducted field and questionnaire surveys in 12 autonomous districts of Seoul and the surrounding major mountain valleys and rivers (i.e., Mt. Nam and the Han River). The questionnaire asked about scenic beauty. A scenic beauty prediction model was then derived through linear regression analysis between the mean scenic beauty scores and the physical indicator scores. The results suggest that the most important indicators of the urban visual landscape are 'Greens', 'Park', and 'the number of apartment buildings (higher than 20 stories)'. Based on these results, greens and parks should be priority elements in urban landscape planning and design. Moreover, since the number of apartment buildings higher than 20 stories is negatively correlated with the scenic beauty score, it can be used as basic data for landscape planning. Because the scenic beauty prediction models and evaluative indicators suggest a direction for urban management, each indicator can serve as basic data for visual landscape planning and design. In future studies, adding physical indicators and case studies could make the scenic beauty prediction models and evaluative indicators more comprehensive and systematic. Moreover, developing three-dimensional (3D) physical indicators (i.e., results from visual district analysis and view surface analysis) could be expected to yield more general and varied results.