• Title/Abstract/Keywords: Classification Variables

Search results: 939 items (processing time: 0.023 s)

Gender discrimination and multivariate analysis using deboning data

  • Shim, Joon-Yong;Kim, Ha-Yeong;Cho, Byoung-Kwan;Lee, Wang-Hee
    • 한국농업기계학회:학술대회논문집 / 한국농업기계학회 2017년도 춘계공동학술대회 / pp.23-23 / 2017
  • The recent preference for high-quality food and concern about food safety have demonstrated the superiority of Hanwoo (Korean native cattle). In general, the price of cow beef is higher than that of steer and bull beef, which causes cheating issues in the market. Hence, this study aims to discriminate the gender of Hanwoo and to identify the factors that most strongly influence gender discrimination, based on large-scale deboning data. In total, there were 31 variables in the deboning data, which we divided into two categories: data obtained before deboning and data obtained after deboning. Discriminant function analysis was then applied to the data to determine the accuracy of gender discrimination in Hanwoo. The results showed that Hanwoo could be classified by gender with 99.2% accuracy when using all 31 variables. In detail, it was possible to identify 93 of 94 bulls (98.9%), 96 of 96 cows (100%) and 74 of 75 steers (98.7%). The most significant variables were chuck, sirloin, armbone shin, plates, retail and cuts percentage, in that order. With the variables obtainable before deboning, classification accuracies were 91.5% for bulls, 92.7% for cows, and 89.3% for steers. The most significant variables were water, cold carcass weight and back-fat thickness. The discrimination accuracy was higher with the data obtainable after deboning: bulls (98.9%), cows (99.0%) and steers (98.7%). In this case, chuck, sirloin and armbone shin were the factors that determined the classification ability. This study showed that Hanwoo can be classified by gender from deboning data with appropriate statistics, further suggesting that the weight of beef cuts might serve as the standard for gender classification.
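
The overall accuracy quoted in this abstract can be cross-checked against the per-gender counts it reports. The following is a minimal sketch that only redoes that arithmetic; the function name `accuracy_percent` is an illustrative assumption, and the code does not reproduce the discriminant function analysis itself.

```python
# Counts reported in the abstract: (correctly classified, total) per gender.
reported = {"bull": (93, 94), "cow": (96, 96), "steer": (74, 75)}


def accuracy_percent(correct, total):
    """Return classification accuracy as a percentage rounded to one decimal."""
    return round(100.0 * correct / total, 1)


# Per-gender accuracy from the reported counts.
per_class = {g: accuracy_percent(c, t) for g, (c, t) in reported.items()}

# Overall accuracy: all correct classifications over all animals.
overall = accuracy_percent(
    sum(c for c, _ in reported.values()),
    sum(t for _, t in reported.values()),
)
print(per_class, overall)
```

The pooled counts (263 of 265) agree with the 99.2% figure stated for the model using all 31 variables.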

독립변수의 차원감소에 의한 Polynomial Adaline의 성능개선 (Performance Improvement of Polynomial Adaline by Using Dimension Reduction of Independent Variables)

  • 조용현
    • 한국산업융합학회 논문집 / Vol. 5, No. 1 / pp.33-38 / 2002
  • This paper proposes an efficient method for improving the performance of the polynomial adaline by reducing the dimension of the independent variables. Adaptive principal component analysis is applied to reduce the dimension by efficiently extracting the features of the given independent variables. Because principal component analysis converts the input data into a set of statistically independent features, it can solve the problems that high-dimensional input data cause in the polynomial adaline. The proposed polynomial adaline has been applied to pattern classification. The simulation results show that the proposed polynomial adaline classifies test patterns better than the conventional polynomial adaline. It is also less affected by the range of the smoothing factor.
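
To make the term concrete, here is a minimal sketch of a polynomial adaline: a linear unit trained with the LMS (Widrow-Hoff) rule on polynomially expanded inputs. It is an illustration under stated assumptions, not the paper's method: the second-order expansion, learning rate, epoch count, and XOR toy patterns are all hypothetical choices, and the adaptive principal component analysis step and the smoothing factor are not modeled.

```python
def poly_features(x1, x2):
    """Second-order polynomial expansion of a 2-D input (with a bias term)."""
    return [1.0, x1, x2, x1 * x2, x1 * x1, x2 * x2]


def train_adaline(samples, targets, lr=0.05, epochs=200):
    """Train a linear unit on the expanded features with the LMS rule."""
    weights = [0.0] * len(poly_features(0.0, 0.0))
    for _ in range(epochs):
        for (x1, x2), t in zip(samples, targets):
            phi = poly_features(x1, x2)
            output = sum(w * f for w, f in zip(weights, phi))
            error = t - output
            # LMS update: move the weights along the input, scaled by the error.
            weights = [w + lr * error * f for w, f in zip(weights, phi)]
    return weights


def predict(weights, x1, x2):
    """Threshold the linear output to a bipolar class label."""
    output = sum(w * f for w, f in zip(weights, poly_features(x1, x2)))
    return 1 if output >= 0 else -1


# Hypothetical toy patterns: bipolar XOR, not linearly separable in (x1, x2).
samples = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]
targets = [-1, 1, 1, -1]
weights = train_adaline(samples, targets)
predictions = [predict(weights, x1, x2) for x1, x2 in samples]
print(predictions)
```

The expansion grows quickly with input dimension (six terms for two inputs here), which is the kind of cost that motivates reducing the dimension of the inputs before expansion.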

토지이용특성을 고려한 서울시 교통사고 발생 모형 개발 (Development of Traffic Accident Models in Seoul Considering Land Use Characteristics)

  • 임삼진;박준태
    • 한국재난정보학회 논문집 / Vol. 9, No. 1 / pp.30-49 / 2013
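  • In this study, we developed a new traffic accident prediction model based on land use. Building on market segmentation of variables that can reflect the characteristics of diverse areas and on the introduction of additional variables, we used the decision tree method (Classification and Regression Tree), one of the data mining techniques, to develop new traffic accident prediction models by accident type. The analysis showed that activity variables, such as registered resident population and commuting, and the targets of those activities, such as road size and trip-generating facilities, emerged as the variables that explain traffic accidents.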
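
The core step of a classification tree in the CART family is choosing the split that minimizes weighted Gini impurity. The sketch below shows that step for a single numeric variable; it is a simplified illustration, and the variable name `population`, the class labels, and the data values are hypothetical rather than taken from the study.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) of the best split 'value <= threshold'."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))  # no split unless impurity strictly improves
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical data: resident population (thousands) and an accident level.
population = [12, 15, 18, 40, 45, 50]
accident_level = ["low", "low", "low", "high", "high", "high"]
threshold, impurity = best_split(population, accident_level)
print(threshold, impurity)
```

A full tree applies this search over every candidate variable and recurses on the two resulting subsets until a stopping rule is met.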

선행연구에 나타난 의복소비자 행동변인 및 시장 변인연구 (A Study on the Variables of Clothing Consumer Behavior and Market: Literature Review)

  • 박혜선
    • 한국의류학회지 / Vol. 20, No. 6 / pp.1125-1137 / 1996
  • The author reviewed seventy papers in the fields of the social psychology of clothing and fashion marketing, published in the Journal of the Korean Society of Clothing and Textiles between 1983 and 1996. The review focused on market variables and consumer behavior variables. It showed that the market variables had been divided into three groups: 1) product variables (product image and product classification); 2) brand variables (brand image and brand positioning); and 3) store variables (store image, store type, and distribution system). Consumer behavior variables have been studied on the basis of the EBM Consumer Behavior Model: 1) purchasing motivation as need recognition; 2) information use as information search; 3) evaluation criteria and choice criteria as alternative evaluation; 4) clothing purchase, brand choice and store choice as purchase; 5) degree of wear, satisfaction and dissatisfaction as outcome; and 6) clothing discard. Variables that influence consumer behavior, including situation variables, clothing attitude variables, and personal and social variables, were added to develop a variable model of clothing consumer behavior using the EBM Consumer Behavior Model.

선형 음의 사분 종속확률변수에서 가중합에 대한 수렴성 연구 (Convergence of weighted sums of linearly negative quadrant dependent random variables)

  • 이승우;백종일
    • 한국신뢰성학회지:신뢰성응용연구 / Vol. 12, No. 4 / pp.265-274 / 2012
  • In this paper, we discuss the strong law of large numbers for weighted sums of arrays of rowwise LNQD random variables, using a new exponential inequality for LNQD random variables under suitable conditions, and we obtain a corollary.

남한의 생물기후권역 구분과 특성 규명 (Bioclimatic Classification and Characterization in South Korea)

  • 최유영;임철희;류지은;;강진영;;;이우균;전성우
    • 한국환경복원기술학회지 / Vol. 20, No. 3 / pp.1-18 / 2017
  • This study constructed a high-resolution bioclimatic classification map of South Korea, which classifies land into homogeneous zones with similar environmental properties, using statistical techniques more advanced than those of existing ecological area classification studies. The climate data provided by WorldClim (1960-1990) were used to generate 27 bioclimatic variables affecting biological habitats, and key environmental variables were derived through Correlation Analysis and Principal Component Analysis. Clustering Analysis was performed using the ISODATA method to construct a 30-arc-second (~1 km) resolution bioclimatic classification map. South Korea was divided into 21 regions, and the classification results were verified by correlation analysis with Gross Primary Production (GPP) and the Actual Vegetation map made by the Ministry of Environment. Each zone was described and named according to its environmental characteristics and major vegetation distribution. This study could provide useful spatial frameworks to support ecosystem research, monitoring and policy decisions.

Predictive Analysis of Problematic Smartphone Use by Machine Learning Technique

  • Kim, Yu Jeong;Lee, Dong Su
    • 한국컴퓨터정보학회논문지 / Vol. 25, No. 2 / pp.213-219 / 2020
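  • This study set out to identify classification methods that can be used to diagnose and predict smartphone overdependence, and the important variables that affect the classification rate of smartphone overdependence. To this end, we compared the classification rates of the decision tree, random forest, and support vector machine, which are machine learning techniques within artificial intelligence. The data came from 25,465 respondents to the '2018 Survey on Smartphone Overdependence' provided by the National Information Society Agency (한국정보화진흥원) and were analyzed with the R statistical package (ver. 3.6.2). The three classification methods showed similar correct classification rates, and no overfitting problem occurred in the models. Among the three methods, the support vector machine had the highest classification rate, followed by the decision tree and then the random forest. Among the smartphone use types, the top three variables affecting the classification rate were the life service type, the information search type, and the leisure pursuit type.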

다구찌 디자인을 이용한 앙상블 및 군집분석 분류 성능 비교 (Comparing Classification Accuracy of Ensemble and Clustering Algorithms Based on Taguchi Design)

  • 신형원;손소영
    • 대한산업공학회지 / Vol. 27, No. 1 / pp.47-53 / 2001
  • In this paper, we compare the classification performance of ensemble and clustering algorithms (Data Bagging, Variable Selection Bagging, Parameter Combining, Clustering) with that of logistic regression, taking into account various characteristics of the input data. The four factors used to simulate the logistic model are (1) correlation among input variables, (2) variance of observations, (3) training data size, and (4) the input-output function. Because the relationship between input and output is unknown in practice, we use a Taguchi design and treat the input-output function as a noise factor to improve the practicality of our results. The experimental results indicate the following. When the level of variance is medium, Bagging and Parameter Combining perform worse than Logistic Regression, Variable Selection Bagging and Clustering. However, the classification performances of Logistic Regression, Variable Selection Bagging, Bagging and Clustering are not significantly different when the variance of the input data is either small or large. When there is strong correlation among the input variables, Variable Selection Bagging outperforms both Logistic Regression and Parameter Combining. In general, and to our disappointment, the Parameter Combining algorithm appears to perform the worst.
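
For readers unfamiliar with data bagging, the sketch below shows the general idea: fit one base learner per bootstrap resample and combine predictions by majority vote. It is an illustration under assumptions, not the study's setup. The base learner here is a hypothetical nearest-class-mean rule chosen for brevity (the paper works with logistic regression), and the data, `n_models`, and `seed` values are made up.

```python
import random
from collections import Counter


def fit_nearest_mean(points):
    """Base learner: store the mean feature value of each class."""
    sums, counts = Counter(), Counter()
    for x, label in points:
        sums[label] += x
        counts[label] += 1
    return {label: sums[label] / counts[label] for label in counts}


def predict_nearest_mean(model, x):
    """Predict the class whose stored mean is closest to x."""
    return min(model, key=lambda label: abs(x - model[label]))


def bagging_fit(points, n_models=25, seed=0):
    """Fit one base learner per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    return [
        fit_nearest_mean([rng.choice(points) for _ in points])
        for _ in range(n_models)
    ]


def bagging_predict(models, x):
    """Aggregate the base learners by majority vote."""
    votes = Counter(predict_nearest_mean(m, x) for m in models)
    return votes.most_common(1)[0][0]


# Hypothetical one-dimensional training data with two well-separated classes.
train = [(float(v), 0) for v in range(10)] + [(float(v), 1) for v in range(20, 30)]
models = bagging_fit(train)
print(bagging_predict(models, 2.0), bagging_predict(models, 27.0))
```

Variable Selection Bagging, as named in the abstract, would additionally vary the input variables across resamples; that variant is not shown here.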

A Study on the Classification of Variables Affecting Smartphone Addiction in Decision Tree Environment Using Python Program

  • Kim, Seung-Jae
    • International journal of advanced smart convergence / Vol. 11, No. 4 / pp.68-80 / 2022
  • Since the launch of AI, technology development to implement complete and sophisticated AI functions has continued. In efforts to develop technologies for complete automation, machine learning and deep learning techniques are mainly used. These techniques involve supervised learning, unsupervised learning, and reinforcement learning as internal technical elements, and in turn use big data analysis to lay the cornerstone for decision-making. In addition, established decision-making is being improved through subsequent repetition and renewal of decision-making standards. In other words, big data analysis, which enables data classification and recognition, is important enough to be called a key technical element of AI functions. Therefore, big data analysis itself is important and requires sophisticated analysis. In this study, among the various tools that can analyze big data, we use a Python program to find out which variables can affect addiction arising from smartphone use in a decision tree environment. We use the Python program to check whether data classification by decision tree shows the same performance as other tools, and whether it can lend reliability to decision-making about the addictiveness of smartphone use. The results of this study show that big data analysis can be performed without problems using any of various statistical tools such as Python and R.

On EM Algorithm For Discrete Classification With Bahadur Model: Unknown Prior Case

  • Kim, Hea-Jung;Jung, Hun-Jo
    • Journal of the Korean Statistical Society / Vol. 23, No. 1 / pp.63-78 / 1994
  • For discrimination with binary variables, reformulated full and first-order Bahadur models with incomplete observations are presented. This allows the prior probabilities associated with multiple populations to be estimated for the sample-based classification rule. The EM algorithm is adopted to provide the maximum likelihood estimates of the parameters of interest. Some experiences with the models are evaluated and discussed.
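
As a rough illustration of estimating unknown prior probabilities by EM, the sketch below uses the first-order Bahadur model, under which the binary variables are treated as independent within each population. It is a simplified sketch under assumptions and not the paper's procedure: the response probabilities in `thetas` are held fixed and hypothetical, only the priors are updated, all observations are unlabeled, and the data are made up.

```python
def first_order_likelihood(x, theta):
    """P(x | population) under independence: a product of Bernoulli terms."""
    p = 1.0
    for xj, tj in zip(x, theta):
        p *= tj if xj == 1 else (1.0 - tj)
    return p


def em_priors(data, thetas, n_iter=100):
    """Estimate prior probabilities by EM, holding the thetas fixed."""
    k = len(thetas)
    priors = [1.0 / k] * k  # start from equal priors
    for _ in range(n_iter):
        totals = [0.0] * k
        for x in data:
            # E-step: posterior probability of each population given x.
            joint = [priors[g] * first_order_likelihood(x, thetas[g]) for g in range(k)]
            norm = sum(joint)
            for g in range(k):
                totals[g] += joint[g] / norm
        # M-step: each new prior is the average posterior membership.
        priors = [t / len(data) for t in totals]
    return priors


# Hypothetical per-variable response probabilities for two populations.
thetas = [(0.9, 0.9, 0.9), (0.1, 0.1, 0.1)]
# Hypothetical unlabeled binary observations.
data = [(1, 1, 1)] * 6 + [(0, 0, 0)] * 2
priors = em_priors(data, thetas)
print(priors)
```

With six of the eight observations matching the first population's profile, the estimated prior for that population settles close to 0.75.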
