• Title/Summary/Keyword: Categorical data analysis

The use of data mining methods for dystocia detection in Polish Holstein-Friesian Black-and-White cattle

  • Zaborski, Daniel;Proskura, Witold S.;Grzesiak, Wilhelm
    • Asian-Australasian Journal of Animal Sciences / v.31 no.11 / pp.1700-1713 / 2018
  • Objective: The aim of this study was to verify the usefulness of artificial neural networks (ANN), multivariate adaptive regression splines (MARS), naïve Bayes classifier (NBC), general discriminant analysis (GDA), and logistic regression (LR) for dystocia detection in Polish Holstein-Friesian Black-and-White heifers and cows and to indicate the most influential predictors of calving difficulty. Methods: A total of 1,342 and 1,699 calving records including six categorical and four continuous predictors were used. Calving category (difficult vs easy or difficult, moderate and easy) was the dependent variable. Results: The maximum sensitivity, specificity and accuracy achieved for heifers on the independent test set were 0.855 (for ANN), 0.969 (for NBC), and 0.813 (for GDA), respectively, whereas the values for cows were 0.600 (for ANN), 1.000 and 0.965 (for NBC, GDA, and LR), respectively. With the three categories of calving difficulty, the maximum overall accuracy for heifers and cows was 0.589 (for MARS) and 0.649 (for ANN), respectively. The most influential predictors for heifers were an average calving difficulty score for the dam's sire, calving age and the mean yield of the farm, where the heifer was kept, whereas for cows, these additionally included: calf sex, the difficulty of the preceding calving, and the mean daily milk yield for the preceding lactation. Conclusion: The potential application of the investigated models in dairy cattle farming requires, however, their further improvement in order to reduce the rate of dystocia misdiagnosis and to increase detection reliability.
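
The sensitivity, specificity, and accuracy reported above are all derived from a binary confusion matrix (difficult vs easy calving). A minimal sketch of those definitions follows; the function name and the counts are hypothetical and are not taken from the study.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp: difficult calvings predicted as difficult
    fn: difficult calvings predicted as easy
    tn: easy calvings predicted as easy
    fp: easy calvings predicted as difficult
    """
    sensitivity = tp / (tp + fn)              # share of difficult calvings detected
    specificity = tn / (tn + fp)              # share of easy calvings recognized
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts for illustration only (not data from the paper).
sens, spec, acc = binary_metrics(tp=8, fn=2, tn=80, fp=10)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

With a rare positive class, accuracy can stay high while sensitivity is low, which is consistent with the cow results quoted above (accuracy 0.965 against a maximum sensitivity of 0.600).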

A Study on Forecasting Accuracy Improvement of Case Based Reasoning Approach Using Fuzzy Relation (퍼지 관계를 활용한 사례기반추론 예측 정확성 향상에 관한 연구)

  • Lee, In-Ho;Shin, Kyung-Shik
    • Journal of Intelligence and Information Systems / v.16 no.4 / pp.67-84 / 2010
  • In business, forecasting means estimating what is expected to happen in the future in order to support managerial decisions and plans. Accurate forecasting is therefore essential to major managerial decisions and underlies many business strategies. However, uncertainty and complexity in the future business environment make it very difficult to produce unbiased and consistent estimates. This is why scientific forecasting models should be used to support business decision making, and why effort should go into minimizing a model's forecasting error, that is, the difference between the observed and estimated values. Minimizing this error is nevertheless not an easy task. Case-based reasoning is a problem-solving method that uses similar past cases to solve the current problem. To build a successful case-based reasoning model, it is very important to retrieve not only the most similar case but also the most relevant one. Retrieving similar and relevant cases depends on how the similarity between cases is measured, and measuring distances is more difficult when the cases contain symbolic data. The purpose of this study is to improve the forecasting accuracy of the case-based reasoning approach using fuzzy relation and composition. Two methods are adopted to measure the similarity between cases containing symbolic data: one derives the similarity matrix from binary logic (judging whether two symbolic values are the same), and the other derives it from fuzzy relation and composition. The study proceeds in the following order: data gathering and preprocessing, model building and analysis, validation analysis, and conclusion. First, in data gathering and preprocessing, we collect a data set with a categorical dependent variable. The data set is cross-sectional, and its independent variables include several qualitative variables expressed as symbolic data. The research data consist of financial ratios and the corresponding bond ratings of Korean companies. The ratings employed in this study cover all bonds rated by one of the bond rating agencies in Korea. The total sample includes 1,816 companies whose commercial papers were rated in the period 1997~2000. Credit grades are defined as outputs and classified into 5 rating categories (A1, A2, A3, B, C) according to credit level. Second, in model building and analysis, we derive the similarity matrix from binary logic and from fuzzy composition to measure the similarity between cases containing symbolic data. The types of fuzzy composition used are max-min, max-product, and max-average. The analysis is then carried out by the case-based reasoning approach with the derived similarity matrix. Third, in validation analysis, we validate the model with the McNemar test based on the hit ratio. Finally, we draw a conclusion from the study. The similarity measure based on fuzzy relation and composition shows better forecasting performance than the measure based on binary logic for similarity between two symbolic data. However, the differences in forecasting performance among the types of fuzzy composition are not statistically significant. The contribution of this study is as follows. We propose a methodology in which fuzzy relation and fuzzy composition are applied to measuring the similarity between two symbolic data, which is the most important factor in building a case-based reasoning model.
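
The three composition types named in the abstract can be sketched with their textbook definitions: for fuzzy relations R and S, each entry of the composition is the maximum over the shared index of min(a, b), a*b, or (a+b)/2. The function name and the membership values below are hypothetical, and the paper's own construction of the similarity matrix may differ.

```python
def compose(r, s, mode="max-min"):
    """Compose two fuzzy relations given as nested lists of membership degrees.

    r is an m x n matrix and s is an n x p matrix; the result is m x p.
    """
    combine = {
        "max-min": min,
        "max-product": lambda a, b: a * b,
        "max-average": lambda a, b: (a + b) / 2,
    }[mode]
    inner = len(s)
    cols = len(s[0])
    return [
        [max(combine(row[k], s[k][j]) for k in range(inner)) for j in range(cols)]
        for row in r
    ]


# Hypothetical membership degrees between symbolic values (illustration only).
R = [[0.2, 0.9],
     [0.7, 0.5]]
S = [[0.6, 0.3],
     [0.8, 0.4]]

for mode in ("max-min", "max-product", "max-average"):
    print(mode, compose(R, S, mode))
```

Under binary logic the similarity between two symbolic values is only 0 or 1; the compositions above allow graded similarity, which is the contrast the study evaluates.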

A Meta-Analysis on Improvement in Locomotor Skills of Children with Disabilities by Physical Activity Programs (신체활동 프로그램 참여가 장애아동의 이동운동능력에 미치는 효과: 메타분석)

  • Han, Byum Suk;Lee, Tae Hee;Chun, Hea Ja
    • 재활복지 / v.20 no.3 / pp.83-104 / 2016
  • The purpose of this study was to identify improvement in locomotor skills resulting from physical activity programs. The literature from 2004 to 2015 was reviewed, and data from 24 studies with 518 children with disabilities were analyzed using the CMA3 (Comprehensive Meta-Analysis ver. 3) program. The data analyzed from the primary studies included gender, age, type of disability, duration of the physical activity program intervention (weeks, sessions per week, minutes per session), run, gallop, hop, leap, horizontal jump, and slide. For sensitivity analysis, publication bias and outliers were reviewed. The results indicate that the overall effect size of improvement in locomotor skills by physical activity programs was 1.143. Large effect sizes were found in the categorical analyses: 1.697 for autism spectrum among types of disability and 1.019 for run among the 6 locomotor skills. The effect size was 0.920 for ages 8~10, and interventions of 100~120 minutes per session (1.261), 3 sessions per week (1.078), and 16~20 weeks (1.587) showed larger effects than the other categories. In conclusion, the improvement in locomotor skills from program participation showed that the treated group was 37% more effective than the control group.
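
The abstract does not say how the 37% figure was obtained. One common reading, assumed here, converts a standardized effect size into a percentile gain through the standard normal CDF: the average treated child sits at the Φ(d) percentile of the control distribution, and the gain is Φ(d) − 0.5. The sketch below applies that assumption to the reported overall effect size of 1.143; the function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def percentile_gain(effect_size):
    """Gain over the control-group median implied by a standardized effect size.

    Assumes normally distributed outcomes with equal variance in both groups.
    """
    return normal_cdf(effect_size) - 0.5


gain = percentile_gain(1.143)  # overall effect size reported in the abstract
print(f"percentile gain: {gain:.1%}")
```

Under this assumption the gain is a little over 37%, in line with the conclusion quoted above.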

An Analysis of Image Use in Twitter Message (트위터 상의 이미지 이용에 관한 분석)

  • Chung, EunKyung;Yoon, JungWon
    • Journal of the Korean BIBLIA Society for library and Information Science / v.24 no.4 / pp.75-90 / 2013
  • Given that users actively use social media with embedded multimedia information, the purpose of this study is to demonstrate how images are used within Twitter messages, especially in influential and favorited messages. To this end, the top 200 influential and favorited messages with images were selected out of 1,589 tweets related to "Boston bombing" in April 2013. The characteristics of the message, image use, and user were analyzed and compared. Two phases of analysis were conducted on three data sets containing the top 200 influential messages, top 200 favorited messages, and general messages. In the first phase, coding schemes were developed for conducting three categorical analyses: (1) categorization of tweets, (2) categorization of image use, and (3) categorization of users. The three data sets were then coded using the coding schemes. In the second phase, comparison analyses were conducted among influential, favorited, and general tweets in terms of tweet type, image use, and user. While messages expressing opinion were found to be most favorited, the messages that shared information were recognized as most influential to users. In addition, as only four image uses (information dissemination, illustration, emotive/persuasive, and information processing) were found in this data set, the primary image use is likely to be data-driven rather than object-driven. From the perspective of users, user types such as government, celebrity, and photo-sharing sites were found to be favorited and influential. An improved understanding of users' image needs in the context of social media contributes to the body of knowledge on image needs. This study will also provide valuable insight into the practical design and implications of image retrieval systems or services.

Statistical analysis of Hazen-Williams C and influencing factors in multi-regional water supply system (광역상수도 유속계수와 영향인자에 관한 통계적 분석)

  • Kim, Bumjun;Kim, Gilho;Kim, Hung soo
    • Journal of Korea Water Resources Association / v.49 no.5 / pp.399-410 / 2016
  • When Hazen-Williams C is applied to the design, operation, or maintenance of a water supply system, field conditions should always be reflected in the factor. In this study, the relationships between C factors and influencing factors are analyzed using statistical techniques on 174 measured C factor values collected during periodic inspections for safety diagnosis in multi-regional water supply systems. To analyze these relationships, cross analysis, one-way ANOVA, and correlation analysis were conducted. The results showed that C factors were highly correlated with both elapsed years and pipe diameter and, among the categorical influencing factors, were relatively strongly affected by coating material. In addition, elapsed years, pipe diameter, and water type were meaningful influencing factors according to the results of multiple regression analysis. The cluster analysis revealed that C factors tended to be classified fundamentally around an elapsed time of about 20 years and a pipe diameter of 1500 mm. Although C factors were generally strongly affected by elapsed years, pipe diameter had a relatively large influence on their values in large-diameter pipes. Lastly, the C factor estimation formulas obtained in this study through multiple regression analysis and cluster analysis can be applied as decision standards for C factors in multi-regional water supply systems.
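
To show why the choice of C matters in design, the sketch below evaluates the standard SI form of the Hazen-Williams head-loss equation, h_f = 10.67 · L · Q^1.852 / (C^1.852 · D^4.8704). This is the general textbook formula, not one of the estimation formulas derived in the study, and the pipe dimensions, flow, and C values are hypothetical.

```python
def hazen_williams_head_loss(length_m, flow_m3s, c_factor, diameter_m):
    """Friction head loss in metres from the Hazen-Williams equation (SI units)."""
    return (10.67 * length_m * flow_m3s ** 1.852
            / (c_factor ** 1.852 * diameter_m ** 4.8704))


# Hypothetical pipe: 1 km long, 1.5 m diameter, carrying 2 m^3/s.
for c in (130, 100):  # illustrative C values for a newer and an older pipe
    loss = hazen_williams_head_loss(1000.0, 2.0, c, 1.5)
    print(f"C={c}: head loss = {loss:.2f} m")
```

Because head loss scales with C^-1.852, a drop in C from 130 to 100 raises the computed loss by roughly 60%, which is why a measured rather than assumed C is useful.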

An Empirical Study on Children's Peer Status Perception (아동의 또래지위지각 관련변인 연구)

  • Song, Soon
    • Journal of Families and Better Life / v.20 no.2 / pp.147-159 / 2002
  • The purpose of this study is to investigate children's perceptions of their own peer status and the variables that affect these perceptions. Four hundred boys and girls in grades five and six participated in this study. The participants were sampled from elementary schools located in two cities in Cheon-buk Province. Of the 400 self-report questionnaires completed by the participants, 380 were used for the data analyses. The methods of analysis included basic descriptive categorical analysis (frequencies, means, percentages) as well as t-tests, one-way ANOVA, and multiple regression. The major findings are as follows. First, significant differences were found in children's aggression by father's job and mother's age; in children's popularity by school GPA, father's education, mother's education, and father's job; and in children's isolation by father's age, father's education, mother's education, and father's job. Second, children's aggression was significantly dependent upon self-esteem, loneliness, family harmony, and family communication. Children's popularity was related to school grade, name satisfaction, body satisfaction, self-esteem, number of close friends, loneliness, family harmony, family communication, parental love and acceptance, and perceived closeness to mother. Children's isolation was significantly associated with school grade, body satisfaction, self-esteem, number of close friends, loneliness, family harmony, family communication, parental love and acceptance, and perceived closeness to mother. Third, according to the multiple regression analyses, highly aggressive children tend to report less family harmony, more loneliness, and a larger number of friends. Highly popular children tend to report less loneliness, a larger number of friends, strong family harmony, and higher academic achievement. On the other hand, highly isolated children tend to perceive weak family harmony, more loneliness, and lower body satisfaction. Lastly, the overall peer status indicator depended significantly on family harmony, loneliness, self-esteem, academic achievement, and body satisfaction.

A Study on the Decision-Making of Private Banker's in Recommending Hedge Fund among Financial Goods (은행 금융상품에서 프라이빗 뱅커의 전문투자형 사모펀드 추천 의사결정)

  • Yu, Hwan;Lee, Young-Jai
    • The Journal of Information Systems / v.28 no.4 / pp.333-358 / 2019
  • Purpose: The study aims to develop a data-based decision model for private bankers when recommending hedge funds to their customers in financial institutions. Design/methodology/approach: The independent variables are set in two groups. The independent variables of the first group are aggressive investors, active investors, and risk-neutral type investors. In the second group, variables considered by private bankers include customer propensity to invest, reliability, product subscription experience, professionalism, intimacy, and product understanding. The decision-making variable for a private banker is the recommendation of a first-rate general private fund composed of foreign and domestic FinTech products. The dependent variables include target return rate (%), fund period (months), safeguard existence, underlying asset, and hedge fund name. Findings: Based on the research results, decision-making reaches 94.4% accuracy when the independent variables (customer rating, reliability, intimacy, product subscription experience, professionalism and product understanding) are used with the dependent variables in the following order: step 1 on safeguard existence, step 2 on target return rate, step 3 on fund period, and step 4 on hedge fund name. Next, 93.7% accuracy is expected when decision-making uses the following order of dependent variables: step 1 on safeguard existence, step 2 on target return rate, step 3 on underlying asset, and step 4 on fund period. In conclusion, a private banker goes through decision-making stages when recommending hedge funds to customers. According to a categorical regression model and an artificial neural network analysis model, the independent variables that influence the dependent variables in a private banker's hedge fund recommendations are intimacy, product comprehension, and product subscription experience.

Influence of Microcurrent Therapy in Interleukin-1 Expression in Rheumatoid Arthritis Rats (미세전류치료가 류마티스 관절염 유발 흰쥐의 Interleukin-1 발현에 미치는 영향)

  • Lee, Hyun-Min;Chae, Yun-Won
    • The Journal of Korean Physical Therapy / v.21 no.2 / pp.103-108 / 2009
  • Purpose: Electrical stimulation is one of several treatments recommended for RA patients. In RA patients, electrical stimulation reduces pain or facilitates joint motion prior to exercise. However, there is still limited evidence on the efficacy of electrical stimulation, and thus any conclusions drawn about this method remain controversial. Recently, Microcurrent Electrical Neuromuscular Stimulation (MENS) has received significant attention as a potential method of electrical stimulation. In this study, we investigated the effect of microcurrent treatment in rats with rheumatoid arthritis. Methods: Subjects were allocated either to the control group or to the experimental group, which received microcurrent stimulation. Interleukin-1 expression in the metatarsophalangeal joint and the oedema index in the ankle were used for classification and subsequent evaluation of pathology. Subjects were assessed at 1, 7 and 14 days after rheumatoid arthritis was induced through adjuvant injection. Thirty-six subjects, 18 in each group, were used in this study. Statistical analysis was performed by calculating the differences between the two groups and between each interval assessment. Categorical variables were compared between the two groups with the paired t-test. The one-way ANOVA test was performed to assess changes in ordinal variables. Results: Baseline characteristics were similar in both groups. Statistically significant differences were found between the two groups. The biological marker of pro-inflammatory cytokine and the oedema index decreased in response to this treatment. Conclusion: These data show that treatment of rheumatoid arthritis with a microcurrent stimulation device reduced the oedema index and the pro-inflammatory cytokine IL-1.

Landslide Risk Assessment in Inje Using Logistic Regression Model (로지스틱 회귀분석을 이용한 인제군 산사태지역의 위험도 평가)

  • Lee, Hwan-Gil;Kim, Gi-Hong
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.30 no.3 / pp.313-321 / 2012
  • Korea has been continuously affected by landslides, as 70% of the land is covered by mountains and most of the annual rainfall is concentrated between June and September. Recently, abrupt climate change has increased the occurrence of landslides. The Gangwon region suffers particularly from landslide damage because most of it is mountainous and steep, with shallow soil. In this study, a landslide risk assessment model was developed by applying logistic regression to data from Duksan-ri, Inje-eup, Inje-gun, Gangwon-do, which suffered massive landslides triggered by heavy rain in July 2006. The information collected from field investigation and from aerial photos taken right after the landslides in the study area was stored in a GIS database for analysis. Slope gradient was entered in two ways: as a categorical variable and as a linear variable. An error matrix was made for each case, and the developed model showed classification accuracies of 81.4% and 81.9%, respectively.
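
The two ways of entering slope gradient into a logistic regression can be sketched as two encodings of the same measurement. The class boundaries, function names, and sample slopes below are hypothetical; the abstract does not give the classes used in the study.

```python
def encode_linear(slope_deg):
    """Slope as a single continuous predictor (one regression coefficient)."""
    return [float(slope_deg)]


def encode_categorical(slope_deg, edges=(15, 25, 35)):
    """Slope as dummy variables, one per class above the reference class.

    `edges` are hypothetical class boundaries in degrees; slopes below the
    first edge fall in the reference class and encode as all zeros.
    """
    cls = sum(slope_deg >= e for e in edges)  # index of the slope class
    return [1.0 if cls == k else 0.0 for k in range(1, len(edges) + 1)]


# Illustrative slopes only (not field data from the study area).
for slope in (10, 30, 40):
    print(slope, encode_linear(slope), encode_categorical(slope))
```

The categorical form lets each slope class have its own coefficient, at the cost of more parameters; the linear form assumes the log-odds of a landslide change steadily with slope. The near-identical accuracies reported above (81.4% and 81.9%) suggest the two choices performed similarly on this data set.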

The Experience of Fathers who Have Children with Internet Addiction (인터넷 중독 아동을 자녀로 둔 아버지 경험)

  • Lee, Hwa Sook;Yee, Young Hwan
    • Korean Journal of Childcare and Education / v.9 no.5 / pp.437-460 / 2013
  • The purpose of this study was to examine the father image perceived by fathers of children with Internet addiction. To this end, we conducted in-depth, face-to-face interviews with ten fathers of children who had been diagnosed as At-Risk for Internet addiction. Face-to-face interviews were chosen in order to hear the fathers' subjective stories and descriptions of the kind of fathers they are, as felt and experienced in real life. More specifically, the in-depth interviews used both structured and unstructured questions. Each of the ten fathers was interviewed individually for two or three hours, and the data were collected for a categorical analysis. The father figures identified were divided into four categories: the father who is indifferent to his child's Internet use, the father who depends solely on education through the Internet, the father who is in a weaker position than the Internet, and the father who is himself addicted to the Internet. Based on the study results, we suggested desirable behaviors that may be useful for both fathers and children with Internet addiction, as well as directions for a follow-up study.