• Title/Summary/Keyword: categorical data analysis (범주형 자료분석)

A study on preferable contents depending on regions and terminal types for high speed mobile internet (초고속 무선인터넷에서 폰형과 모뎀형 단말기의 이용장소에 따르는 선호 콘텐츠에 관한 연구)

  • Ryu, Gui-Yeol
    • Journal of the Korean Data and Information Science Society / v.22 no.4 / pp.701-715 / 2011
  • This study examines which kinds of content are preferred, depending on region and terminal type, for high-speed mobile internet. We consider 10 content types, 9 regions, and 2 terminal types. The methods are adjusted residuals, correspondence analysis, and multiple correspondence analysis. The methods give different results because they rest on different mathematical models: 50% of the results agree between correspondence analysis and multiple correspondence analysis, 26.3% agree between adjusted residuals and correspondence analysis, and 21.1% agree between adjusted residuals and multiple correspondence analysis. We recommend adjusted residuals because they allow hypothesis tests of content preference. The content type not chosen by any of the three methods is business.
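As an illustration of the adjusted residuals named in the abstract above, the following is a minimal sketch that uses the standard textbook definition for a two-way contingency table. The function name `adjusted_residuals` and the example counts are hypothetical and are not taken from the paper.

```python
import numpy as np

def adjusted_residuals(observed):
    """Adjusted (standardized) residuals for a two-way contingency table.

    Each cell is (observed - expected) / sqrt(expected * (1 - row share) * (1 - column share)),
    where the expected count assumes independence of rows and columns.
    """
    observed = np.asarray(observed, dtype=float)
    n = observed.sum()
    row = observed.sum(axis=1, keepdims=True)   # row totals, shape (I, 1)
    col = observed.sum(axis=0, keepdims=True)   # column totals, shape (1, J)
    expected = row @ col / n                    # expected counts under independence
    variance = expected * (1 - row / n) * (1 - col / n)
    return (observed - expected) / np.sqrt(variance)

# Hypothetical counts: rows are terminal types, columns are content types.
table = [[30, 10], [10, 30]]
print(adjusted_residuals(table))
```

Under independence each adjusted residual is approximately standard normal, so cells with an absolute value well above 2 are the usual candidates for a notable preference.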

Characteristics of Middle School Students' Open-Inquiry Report and Their Perceptions of Conducting Inquiry (중학생의 자유 탐구 보고서에 나타난 특징과 탐구 수행에 대한 학생들의 인식)

  • Park, Mi-Hyun;Cha, Jeong-Ho;Kim, In-Whan
    • Journal of the Korean Chemical Society / v.56 no.3 / pp.371-377 / 2012
  • In this study, open inquiry reports of 165 eighth graders in Daegu were analyzed in terms of content area, types of inquiry hypothesis, and types of inquiry variables. Before summer vacation, students learned about the inquiry process and explored their own inquiry topics for two class hours. During summer vacation, students performed open inquiry, including problem selection, designing and performing an experiment, data collection, data analysis, and report writing. After the vacation, students submitted their reports and answered an additional survey on the source of their inquiry idea, the definition of a hypothesis, and the most difficult step of the inquiry process. Chemistry was the most dominant content area of the reports, followed by biology and life science. Of the 165 reports, 130 included inquiry hypotheses, and most of these were predictive hypotheses. In many reports, dependent and independent variables could not be identified because they were stated ambiguously. However, the inquiry variables described in the experimental design, which were mostly categorical variables, were clearer than those described in the inquiry subject and inquiry hypothesis. The most difficult step of the inquiry process for students was generating an idea for open inquiry.

Effective Diagnostic Method Of Breast Cancer Data Using Decision Tree (Decision Tree를 이용한 효과적인 유방암 진단)

  • Jung, Yong-Gyu;Lee, Seung-Ho;Sung, Ho-Joong
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.10 no.5 / pp.57-62 / 2010
  • Recently, decision tree techniques have been studied for fast searching and extraction of massive data in medical fields. Many techniques have been developed, such as CART, C4.5, and CHAID, which belong to the same family of decision tree classification algorithms, but these methods can lose information in the remaining data because of the binary splitting used during the procedure. In brief, the C4.5 method builds a decision tree from entropy levels, whereas the CART method builds one from an entropy matrix for categorical or continuous data. We therefore compared the C4.5 and CART methods, which belong to the same family, on breast cancer data to evaluate their respective performance. To verify accuracy, we cross-validated the results.
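The entropy-based splitting mentioned in the abstract above can be sketched as follows. This is an illustrative example only: the Gini impurity is included because it is the criterion commonly associated with CART in general references, not because the abstract states it, and the function names and labels are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of the class labels in a node."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini impurity of the class labels in a node."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def information_gain(parent, children):
    """Reduction in entropy when `parent` is split into the lists in `children`."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted

# Hypothetical node with two classes, split perfectly into two pure children.
parent = ["benign", "benign", "malignant", "malignant"]
print(information_gain(parent, [parent[:2], parent[2:]]))
```

A split is preferred when it yields a larger information gain, or equivalently a larger drop in impurity, than the alternatives.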

Trimmed LAD Estimators for Multidimensional Contingency Tables (분할표 분석을 위한 절사 LAD 추정량과 최적 절사율 결정)

  • Choi, Hyun-Jip
    • The Korean Journal of Applied Statistics / v.23 no.6 / pp.1235-1243 / 2010
  • This study proposes trimmed LAD (least absolute deviation) estimators for multi-dimensional contingency tables and suggests an algorithm to compute them. In addition, a method to determine the trimming quantity of the estimators is suggested. A Monte Carlo study shows that the proposed method yields a better trimming rate and coverage rate than the previously suggested method based on the determinant of the covariance matrix.

Comparison of Step-Wise and Exact Maximum Likelihood Estimations on Cell Probabilities of Contingency Table (단계별로 얻어진 이차원 분할표의 모수 추정을 위한 정확최대우도추정법과 단계별추출추정법의 비교)

  • Lee, Sang-Eun;Kang, Kee-Hoon;Jeung, Seok-O;Shin, Key-Il
    • Communications for Statistical Applications and Methods / v.17 no.1 / pp.67-77 / 2010
  • In a multinomial scheme with step-wise sampling, maximum likelihood estimates of the multinomial probabilities are improved when some frequencies are merged. In this study, exact MLE and step-wise estimation methods are applied to the cell probabilities of an I by J contingency table under independence, and the results are compared using MSE and bias.

Sensitivity analysis of missing mechanisms for the 19th Korean presidential election poll survey (19대 대선 여론조사에서 무응답 메카니즘의 민감도 분석)

  • Kim, Seongyong;Kwak, Dongho
    • The Korean Journal of Applied Statistics / v.32 no.1 / pp.29-40 / 2019
  • Categorical data with non-responses are frequently observed in election poll surveys and can be represented by incomplete contingency tables. To estimate the support rates of candidates, the missing mechanism should be identified in advance, because the estimates for non-responses change depending on the assumed missing mechanism. However, it has been shown that the missing mechanism cannot be identified from observed data. To overcome this problem, sensitivity analysis has been suggested. The previously proposed sensitivity analysis is applicable only to two-way incomplete contingency tables with binary variables. It is therefore inappropriate for election poll surveys, which usually consider more than two factors, such as region, gender, and age. In this paper, a sensitivity analysis suitable for a multi-dimensional incomplete contingency table is devised and applied to the 19th Korean presidential election poll survey data. The intervals of estimates from the sensitivity analysis include the actual results as well as the estimates from various missing mechanisms. In addition, the properties of the missing mechanisms that produce estimates nearest to the actual election results are investigated.

A Statistical Study on Korean Baseball League Games (한국 프로야구 경기결과에 관한 통계적 연구)

  • Choi, Young-Gun;Kim, Hyoung-Moon
    • The Korean Journal of Applied Statistics / v.24 no.5 / pp.915-930 / 2011
  • There are a variety of methods for modeling game results, and many exist for paired comparison data. Among them, the Bradley-Terry model is the most widely used to derive a latent preference scale from paired comparison data. It has been applied in a variety of fields in psychology and related disciplines. We applied this model to data from the Korean Baseball League. The results show that the loglinear Bradley-Terry model of defensive rate and save is optimal in terms of AIC. In addition, some categorical characteristics, such as east team and west team, existence of golden glove winning players, team(s) with a seasonal pitching leader, and team(s) with home advantage, influenced the game result significantly. The suggested models can therefore be further utilized to predict future game results.
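To make the Bradley-Terry model in the abstract above concrete, here is a minimal sketch of the basic model without covariates, in which the probability that team i beats team j is p_i / (p_i + p_j). It fits the strengths with the classical iterative (minorization-maximization) update; this is not the loglinear formulation with covariates used in the paper, and the function name and win counts are hypothetical.

```python
import numpy as np

def fit_bradley_terry(wins, iters=200):
    """Estimate Bradley-Terry strengths from a win-count matrix.

    wins[i][j] is the number of times team i beat team j (zero diagonal).
    Returns strengths normalized to sum to 1.
    """
    wins = np.asarray(wins, dtype=float)
    games = wins + wins.T                 # games played between each pair
    total_wins = wins.sum(axis=1)         # total wins per team
    p = np.full(len(wins), 1.0 / len(wins))
    for _ in range(iters):
        # Update: p_i = W_i / sum_j n_ij / (p_i + p_j), using the previous strengths.
        denom = (games / (p[:, None] + p[None, :])).sum(axis=1)
        p = total_wins / denom
        p /= p.sum()
    return p

# Hypothetical season: team 0 beat team 1 three times and lost once.
strengths = fit_bradley_terry([[0, 3], [1, 0]])
print(strengths[0] / (strengths[0] + strengths[1]))  # estimated P(team 0 beats team 1)
```

The sketch assumes every team has played at least one game; a team with no wins receives a strength of zero.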

A study for improving data mining methods for continuous response variables (연속형 반응변수를 위한 데이터마이닝 방법 성능 향상 연구)

  • Choi, Jin-Soo;Lee, Seok-Hyung;Cho, Hyung-Jun
    • Journal of the Korean Data and Information Science Society / v.21 no.5 / pp.917-926 / 2010
  • Bagging and boosting techniques are known to improve performance in classification problems. A number of researchers have demonstrated the high performance of bagging and boosting through experiments for categorical responses, but not for continuous responses. We study whether bagging and boosting improve data mining methods for continuous responses, such as linear regression, decision trees, and neural networks. The analysis of eight real data sets demonstrates the high performance of bagging and boosting empirically.
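Bagging for a continuous response, as discussed in the abstract above, can be sketched as follows: fit the same base learner on bootstrap resamples and average the predictions. The example uses simple linear regression as the base learner for brevity; the function name, parameters, and data are hypothetical and do not reproduce the paper's experiments.

```python
import numpy as np

def bagged_linear_predict(x, y, x_new, n_models=50, seed=0):
    """Average the predictions of linear regressions fit on bootstrap resamples."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_new = np.atleast_1d(np.asarray(x_new, dtype=float))
    design = np.column_stack([np.ones_like(x), x])              # intercept + slope
    design_new = np.column_stack([np.ones_like(x_new), x_new])
    predictions = []
    for _ in range(n_models):
        idx = rng.integers(0, len(x), size=len(x))              # sample rows with replacement
        coef, *_ = np.linalg.lstsq(design[idx], y[idx], rcond=None)
        predictions.append(design_new @ coef)
    return np.mean(predictions, axis=0)                         # bagged prediction

# Hypothetical noisy data around the line y = 2x + 1.
rng = np.random.default_rng(1)
x = np.arange(20.0)
y = 2 * x + 1 + rng.normal(scale=0.5, size=x.size)
print(bagged_linear_predict(x, y, [3.0]))
```

Averaging helps most with unstable base learners such as decision trees; for a stable learner like linear regression the bagged prediction tends to stay close to the single-model fit.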

A Study on Elementary School Teachers' Experiences in Teaching Students with Low Achievement in Science based on Grounded Theory (초등교사의 과학학습부진학생 지도경험에 관한 근거이론적 연구)

  • Kang, Jihoon
    • Journal of Korean Elementary Science Education / v.41 no.1 / pp.44-64 / 2022
  • This study used grounded theory to explore elementary school teachers' experiences of teaching students with low achievement in science. In-depth interviews were conducted and analyzed with 13 teachers who had taught students with low achievement in science within the last three years and had more than five years of field experience; data collection continued until theoretical saturation was reached. The analysis results were as follows. First, the teaching experiences of elementary school teachers for underachievers in science were classified into 119 concepts, 41 subcategories, and 17 categories. Based on the paradigm model, the categories were structured and presented as causal conditions, contextual conditions, intervening conditions, action/interaction strategies, and consequences around the central phenomenon of 'difficulty in teaching students with low achievement in science'. Second, the core category of elementary school teachers' teaching of underachievers in science was assumed to be 'overcoming difficulties and teaching underachievers in science'. According to the properties and dimensions of the core category, teachers who teach students with low achievement in science were divided into four types: 'compromising-', 'overcoming-', 'accepting-', and 'conflicting-reality type'. Third, a conditional matrix was presented to summarize and integrate the results of this study by classifying the teaching experience of elementary school teachers for underachievers in science into educational providers and educational demanders. On the basis of these findings, educational implications for teaching students with low achievement in science were discussed.

Determinants of Satisfaction, Revisit Intention, and Recommendation Intention Using Decision Tree Analysis - Foreign Tourists Visiting Korea during the COVID-19 Pandemic - (의사결정나무분석을 활용한 방문 만족도, 재방문 의사, 타인 권유 의사 결정요인 분석 - 코로나19 상황에서의 한국 방문 외래관광객을 대상으로 -)

  • Won-Sik Kim
    • The Journal of the Convergence on Culture Technology / v.9 no.3 / pp.129-136 / 2023
  • This study examines the determinants of satisfaction, revisit intention, and recommendation intention among foreign tourists who visited Korea despite the threat of COVID-19. It uses survey data collected by the Korea Tourism Organization from 8,135 foreign tourists who visited Korea in 2020. Because the survey data contain a mixture of continuous and categorical variables, decision tree analysis can ensure analytical validity for the research. According to the results, the determinants of satisfaction are the purpose of the visit and acceptance of self-quarantine during the stay. The factors influencing revisit intention are the purpose of the visit, the frequency of visits, and acceptance of self-quarantine during the stay. The determinants of recommendation intention are the purpose of the visit, length of stay, and gender. Based on these results, this study not only explains the relationship between these determinants and tourism satisfaction, revisit intention, and recommendation intention, but also suggests implications for revitalizing tourism activities.