• Title/Summary/Keyword: statistical learning


A Study on Analyzing Sentiments on Movie Reviews by Multi-Level Sentiment Classifier (영화 리뷰 감성분석을 위한 텍스트 마이닝 기반 감성 분류기 구축)

  • Kim, Yuyoung;Song, Min
    • Journal of Intelligence and Information Systems / v.22 no.3 / pp.71-89 / 2016
  • Sentiment analysis identifies emotions or sentiments embedded in user-generated data such as customer reviews from blogs and social network services. Research fields such as computer science and business management can use it to analyze customer-generated opinions. Previous studies treated the star rating of a review as equivalent to the sentiment embedded in its text. However, the star rating does not always correspond to the sentiment polarity, and this assumption limits the accuracy of those studies. To address this issue, the present study uses a supervised sentiment classification model to measure sentiment polarity more accurately. The study aims to propose an advanced sentiment classifier and to examine the correlation between movie reviews and box-office success. The classifier is based on two supervised machine learning techniques, Support Vector Machines (SVM) and a Feedforward Neural Network (FNN). The sentiment scores of the movie reviews are measured by the classifier, and their statistical correlation with box-office success is analyzed. Movie reviews are collected along with their star ratings. The dataset consists of 1,258,538 reviews of 175 films gathered from the Naver Movie website (movie.naver.com). The results show that the proposed sentiment classifier outperforms a Naive Bayes (NB) classifier, with accuracy about 6% higher than NB. Furthermore, the results indicate a positive correlation between the star rating and the number of audience members, which can be regarded as a measure of a movie's box-office success. The study also shows a mild positive correlation between the sentiment scores estimated by the classifier and the number of audience members. To verify the applicability of the sentiment scores, an independent-samples t-test was conducted. 
For this, the movies were divided into two groups using the average of sentiment scores. The two groups are significantly different in terms of the star-rated scores.
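
The group comparison described in this abstract can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure: the abstract does not say which variant of the independent-samples t-test was used, so the sketch assumes Welch's version (unequal variances), and the function names and the six data points are made up.

```python
from math import sqrt
from statistics import mean, variance

def split_by_mean(sentiment, star):
    """Split star ratings into two groups by whether the movie's
    sentiment score is at/above or below the overall mean score."""
    cutoff = mean(sentiment)
    high = [s for x, s in zip(sentiment, star) if x >= cutoff]
    low = [s for x, s in zip(sentiment, star) if x < cutoff]
    return high, low

def welch_t(a, b):
    """Welch's t statistic for two independent samples (assumed variant)."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    return (mean(a) - mean(b)) / sqrt(va + vb)

# Hypothetical per-movie values, for illustration only.
sentiment = [0.2, 0.4, 0.5, 0.7, 0.8, 0.9]
star = [5.1, 6.0, 6.4, 7.9, 8.3, 8.8]

high, low = split_by_mean(sentiment, star)
t_stat = welch_t(high, low)
print(high, low, round(t_stat, 2))
```

A large positive statistic would indicate that the high-sentiment group also has higher star ratings; a p-value would additionally require the t distribution, which is left out here.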

Elementary Schooler's Recognition and Understanding of the Scientific Units in Daily Life (초등학교 학생들의 생활 속 과학단위 인식과 이해)

  • Kim, Sung-Kyu
    • Journal of Science Education / v.36 no.2 / pp.235-250 / 2012
  • This paper aims to find out whether elementary school students recognize and understand the scientific units that they encounter in everyday life. To select appropriate units for the survey, scientific units in elementary textbooks for science and other science-related subjects were first analyzed. Then it was examined how these units relate to the learners' daily life. The participants in the survey were 320 elementary school 6th graders. The questionnaire covered 11 scientific units that are often seen and used in daily life: kg for mass, km for distance, L for volume, V for voltage, s for time, $^{\circ}C$ for temperature, km/h for speed, kcal for heat, % for percentage, W for electric power, and pH for acidity. The students were asked to do four tasks: (1) to look at presented pictures and select the appropriate scientific units, (2) to write their reasons for choosing the units, (3) to answer what the units are used for, and (4) to check where the units can be found. The data were analyzed in terms of the percentage of students who appeared to recognize and understand the units well, using the SPSS 17.0 statistical program. The results are as follows. Regarding the general use of the units, almost the same units were repeated in science and other subject textbooks for the same grade, and more difficult units were used as the grade level rose. As for each unit, the students seemed to understand relatively well what kg, km, L, $^{\circ}C$, kcal, km/h, and W stand for, with more than 91% answering correctly. However, V, s, and, in particular, % and pH did not seem to be understood. With respect to recognition, most students did not recognize units such as L for volume and pH for acidity, probably because these units are difficult at the elementary level in comparison with other scientific units. 
The students indicated that schools were the best place where they could learn and find scientific units related to life, followed by shops/marts, newspapers/broadcasting, streets/roads, homes, and others in that order. The results show that scientific unit learning should be conducted in a systematic way at school and that teachers can play a major role in improving students' understanding and use of the units.


Development of Market Growth Pattern Map Based on Growth Model and Self-organizing Map Algorithm: Focusing on ICT products (자기조직화 지도를 활용한 성장모형 기반의 시장 성장패턴 지도 구축: ICT제품을 중심으로)

  • Park, Do-Hyung;Chung, Jaekwon;Chung, Yeo Jin;Lee, Dongwon
    • Journal of Intelligence and Information Systems / v.20 no.4 / pp.1-23 / 2014
  • Market forecasting aims to estimate the sales volume of a product or service that is sold to consumers for a specific selling period. From the perspective of the enterprise, accurate market forecasting assists in determining the timing of new product introduction, product design, and establishing production plans and marketing strategies that enable a more efficient decision-making process. Moreover, accurate market forecasting enables governments to efficiently establish a national budget organization. This study aims to generate a market growth curve for ICT (information and communication technology) goods using past time series data; categorize products showing similar growth patterns; understand markets in the industry; and forecast the future outlook of such products. This study suggests the useful and meaningful process (or methodology) to identify the market growth pattern with quantitative growth model and data mining algorithm. The study employs the following methodology. At the first stage, past time series data are collected based on the target products or services of categorized industry. The data, such as the volume of sales and domestic consumption for a specific product or service, are collected from the relevant government ministry, the National Statistical Office, and other relevant government organizations. For collected data that may not be analyzed due to the lack of past data and the alteration of code names, data pre-processing work should be performed. At the second stage of this process, an optimal model for market forecasting should be selected. This model can be varied on the basis of the characteristics of each categorized industry. As this study is focused on the ICT industry, which has more frequent new technology appearances resulting in changes of the market structure, Logistic model, Gompertz model, and Bass model are selected. A hybrid model that combines different models can also be considered. 
The hybrid model considered for use in this study estimates the size of the market potential through the Logistic and Gompertz models, and then uses those figures in the Bass model. The third stage of this process is to evaluate which model most accurately explains the data. To do this, the parameters are estimated from the collected past time series data to generate each model's predicted values and calculate the root mean squared error (RMSE). The model that shows the lowest average RMSE across all product types is considered the best model. At the fourth stage, based on the parameter values estimated by the best model, a market growth pattern map is constructed with the self-organizing map algorithm. The self-organizing map is trained with the market pattern parameters of all products or services as input data, and the products or services are organized into an $N{\times}N$ map. The number of clusters increases from 2 to M, depending on the characteristics of the nodes on the map. The clusters are divided into zones, and the clusters that provide the most meaningful explanation are selected. Based on the final selection of clusters, the boundaries between the nodes are selected and, ultimately, the market growth pattern map is completed. The last step is to determine the final characteristics of the clusters as well as their market growth curves. The average of the market growth pattern parameters in each cluster is taken as a representative figure. Using this figure, a growth curve is drawn for each cluster, and its characteristics are analyzed. In addition, the characteristics of each cluster can be described qualitatively by considering the product types it contains. We expect that the process and system suggested in this paper can be used as a tool for forecasting demand in the ICT and other industries.
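
The model comparison in the third stage can be illustrated with a short Python sketch. It is a sketch under stated assumptions, not the authors' implementation: the abstract does not give the functional forms, so common textbook parameterizations of the Logistic, Gompertz, and Bass cumulative curves are assumed; parameter estimation is skipped (parameters are fixed by hand); and a single made-up series is scored instead of averaging RMSE over product types.

```python
from math import exp, sqrt

def logistic(t, m, a, b):
    """Cumulative logistic curve with market potential m (assumed form)."""
    return m / (1.0 + a * exp(-b * t))

def gompertz(t, m, a, b):
    """Cumulative Gompertz curve with market potential m (assumed form)."""
    return m * exp(-a * exp(-b * t))

def bass(t, m, p, q):
    """Cumulative Bass diffusion curve: p = innovation, q = imitation."""
    e = exp(-(p + q) * t)
    return m * (1.0 - e) / (1.0 + (q / p) * e)

def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    sq = sum((o - f) ** 2 for o, f in zip(observed, predicted))
    return sqrt(sq / len(observed))

def best_model(observed, candidates):
    """Score each candidate curve on periods 0..n-1 and return the
    name with the lowest RMSE together with all scores."""
    periods = range(len(observed))
    scores = {name: rmse(observed, [f(t) for t in periods])
              for name, f in candidates.items()}
    return min(scores, key=scores.get), scores

# Hypothetical series generated from a Gompertz curve, for illustration.
observed = [gompertz(t, 100.0, 5.0, 0.5) for t in range(10)]
candidates = {
    "logistic": lambda t: logistic(t, 100.0, 50.0, 0.8),
    "gompertz": lambda t: gompertz(t, 100.0, 5.0, 0.5),
    "bass": lambda t: bass(t, 100.0, 0.03, 0.4),
}
name, scores = best_model(observed, candidates)
print(name, scores)
```

In practice each model's parameters would first be fitted to the series (for example by nonlinear least squares) before the RMSE values are compared.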

A Study of Improvement of School Health in Korea (학교보건(學校保健)의 개선방안(改善方案) 연구(硏究))

  • Lee, Soo Hee
    • Journal of the Korean Society of School Health / v.1 no.2 / pp.118-135 / 1988
  • This study analyzes the problems of health education in schools and explores ways of enhancing it from a historical perspective. It also sheds light on the managerial aspect of health education (including medical check-ups for students, disease management, school feeding, and the health education law and its organization) as well as its educational aspect (including curriculum, teaching and learning, and the wishes of teachers). At the same time, it attempts to present ways of resolving the problems in health education identified here. Its major findings are as follows. I. Conclusion and Summary 1. Despite the importance of health education, the area remains relatively undeveloped. Students spend a great part of their time in schools. Hence the government should develop a keener awareness of the importance of health education and invest more in it to ensure a healthy, comfortable life for students. 2. At the moment the outcomes of medical check-ups for students, which constitute the mainstay of health education, are used only as statistical data to report to the relevant authorities. Needless to say, they should be used to help improve the wellbeing of students. Specifically, nurse-teachers and home-room teachers should share the outcomes of medical check-ups to help students with shortcomings in growth or development, or other physical handicaps, recognize their problems more clearly and correct them if possible. 3. In the area of disease management, 62.6, 30.3 and 23.0 percent of primary, middle, and high school students, respectively, were found to suffer from dental ailments. By contrast, 2.2, 7.8, and 11.5 percent of primary, middle and high school students suffered from visual disorders. The incidence of dental ailments decreases while that of visual impairments increases as students grow up. This signifies that students are under tremendous physical strain in their efforts to be admitted by schools of higher grade. 
Accordingly, the relevant authorities should revise the current admission system as well as improve the lighting in classrooms. 4. Budget restraints have often been cited as a major bottleneck to the expansion of school feeding. Nevertheless, it should be extended at least to all primary schools, even at the expense of parents, to ensure the sound growth of children by improving their diet. 5. The existing health education law should be revised in such a way as to better meet the needs of schools. Also, the manpower for health education should be strengthened. 6. A proper curriculum is essential to the effective implementation of health education. Hence it is necessary to remove those parts of the current health education curriculum that overlap with other subjects. It is also necessary to make health education a compulsory course in teachers' colleges; at the same time, the teachers in charge of health education should be given in-service training. 7. Currently health education is taught as part of physical education, science, home economics or other courses. However, these subjects tend to be overshadowed by English, mathematics, and other subjects that carry heavier weight in admission tests. It is necessary, among other things, to develop an educational plan specifying the course hours and teaching materials. 8. Health education is carried out by nurse-teachers or home-room teachers. They expressed the hope that health education will be normalized with newly developed teaching materials, expanded opportunities for in-service training, and increased budget, facilities and manpower. These are the main points that decision-makers should take into account in forming future policy for health education. II. Recommendations for the Improvement of Health Education 1. Regular medical check-ups for students, which are now the mainstay of health education, should be used as educational data in an appropriate manner. 
For instance, the records of medical check-ups could be transferred between schools. 2. School feeding should be expanded at least in primary schools, at the expense of the government or even parents. It will help improve the physical wellbeing of youths and the diet of the people. 3. At the moment the health education law is only nominal. Hence the law should be revised in such a way as to ensure the physical wellbeing of students and faculty. 4. Health education should be made a compulsory course in teachers' colleges. Also, teachers in service should be offered training in health education. 5. The curriculum of health education should be revised. Also, the course hours should be extended or readjusted to better meet the needs of students. 6. In the meantime, the course hours should be strictly observed, while educational materials should be revised without delay. 7. The government should expand its investment in facilities, budget and personnel for health education in schools at all levels.


A Basic Study for Sustainable Analysis and Evaluation of Energy Environment in Buildings : Focusing on Energy Environment Historical Data of Residential Buildings (빌딩의 지속가능 에너지환경 분석 및 평가를 위한 기초 연구 : 주거용 건물의 에너지환경 실적정보를 중심으로)

  • Lee, Goon-Jae
    • Journal of the Korea Academia-Industrial cooperation Society / v.18 no.1 / pp.262-268 / 2017
  • The energy consumption of buildings is approximately 20.5% of the total energy consumption, and the interest in energy efficiency and low consumption of the building is increasing. Several studies have performed energy analysis and evaluation. Energy analysis and evaluation are effective when applied in the initial design phase. In the initial design phase, however, the energy performance is evaluated using general level information, such as glazing area and surface area. Therefore, the evaluation results of the detailed design stage, which is based on the drawings, including detailed information of the materials and facilities, will be different. Thus far, most studies have reported the analysis and evaluation at the detailed design stage, where detailed information about the materials installed in the building becomes clear. Therefore, it is possible to improve the accuracy of the energy environment analysis if the energy environment information generated during the life cycle of the building can be established and accurate information can be provided in the analysis at the initial design stage using a probability / statistical method. On the other hand, historical data on energy use has not been established in Korea. Therefore, this study performed energy environment analysis to construct the energy environment historical data. As a result of the research, information classification system, information model, and service model for acquiring and providing energy environment information that can be used for building lifecycle information of buildings are presented and used as the basic data. The results can be utilized in the historical data management system so that the reliability of analysis can be improved by supplementing the input information at the initial design stage. 
Once the historical data has accumulated, it can serve as training data for probabilistic/statistical or artificial intelligence methods for energy environment analysis at the initial design stage.

Developing and Applying the Questionnaire to Measure Science Core Competencies Based on the 2015 Revised National Science Curriculum (2015 개정 과학과 교육과정에 기초한 과학과 핵심역량 조사 문항의 개발 및 적용)

  • Ha, Minsu;Park, HyunJu;Kim, Yong-Jin;Kang, Nam-Hwa;Oh, Phil Seok;Kim, Mi-Jum;Min, Jae-Sik;Lee, Yoonhyeong;Han, Hyo-Jeong;Kim, Moogyeong;Ko, Sung-Woo;Son, Mi-Hyun
    • Journal of The Korean Association For Science Education / v.38 no.4 / pp.495-504 / 2018
  • This study was conducted to develop items that measure scientific core competencies, based on the statements of those competencies in the 2015 revised national science curriculum, and to examine the validity and reliability of the newly developed items. Based on the curriculum's explanations of scientific reasoning, scientific inquiry ability, scientific problem-solving ability, scientific communication ability, and participation/lifelong learning in science, 25 items were developed by five science education experts. To explore the validity and reliability of the developed items, data were collected from 11,348 students in elementary, middle, and high schools nationwide. The content validity, substantive validity, internal structure validity, and generalization validity proposed by Messick (1995) were examined with various statistical tests. The MNSQ analysis showed no misfit among the 25 items. The confirmatory factor analysis using structural equation modeling revealed that the five-factor model was suitable. The differential item functioning (DIF) analyses by gender and school level found a misfitting DIF value in only two of 175 cases. The multivariate analysis of variance by gender and school level showed significant differences in test scores between school levels and between genders, and the interaction effect was also significant. The assessment items for science core competencies based on the 2015 revised national science curriculum are valid from a psychometric point of view and can be used in the science education field.

Relationship of Maternal Perception of the Infant Temperament and Confidence and Satisfaction of Maternal Role (어머니가 지각한 영아기질과 어머니 역할수행에 대한 자신감 및 만족도의 관계)

  • Lee Young-Eun;Kang Yang-Hee;Park Hae-Sun;Hwang Eun-Ju;Mun Mi-Young
    • Child Health Nursing Research / v.9 no.2 / pp.206-220 / 2003
  • Purpose: This study examined the relationship between mothers' perception of the temperament of their infants aged 1 to 12 months and their confidence and satisfaction in performing the maternal role. It also aimed to provide basic data for a nursing intervention program that helps assess infant development and promote maternal role performance, by identifying variables associated with infant temperament. Method: The subjects were 300 mothers of infants aged 1 to 12 months who visited well-baby clinics at 4 hospitals in Busan city and Kyoung-Nam province. Seven cases were excluded because of inadequate data collection, so the final analysis covered 293 cases. The data were collected from 1 July to 15 August 2002 using questionnaires completed by the mothers. Infant temperament was measured with the 'What My Baby Is Like' (WBL) tool, developed by Pridham et al. (1994) and translated by Bang (1999). Confidence and satisfaction in the maternal role were measured with the postpartum self-evaluation scale developed by Lederman et al. (1981) and translated by Lee (1992). All statistical analyses were performed using SPSS-PC for Windows, version 10.0: frequency, percentage, minimum, maximum, mean, SD, t-test, ANOVA, post-hoc test (Scheffe's test), and Pearson correlation coefficients. Result: The mean score of maternal perception of infant temperament was 6.17±1.04, indicating that mothers perceived their infants positively. The mean score of confidence in the maternal role was 2.89±.41, an average level. The mean score of satisfaction with the maternal role was 3.29±.51, a relatively high level. There was a weak but significant positive correlation between maternal perception of infant temperament and confidence in the maternal role (r=0.176, P=.003), but no significant correlation between maternal perception of infant temperament and satisfaction with the maternal role (P>.05). 
That is, the more positively a mother perceived her infant's temperament, the higher her confidence in the maternal role. There was a moderate, significant positive correlation between confidence in the maternal role and satisfaction with the maternal role (r=0.410, P=.000): the greater the confidence, the higher the satisfaction. The variables related to the score of maternal perception of infant temperament were the type of delivery (t=-2.600, P=.010), experience of learning baby care (t=2.382, P=.018), maternal perception of the baby's health status (F=3.467, P=.033), maternal perception of her own health status (F=3.467, P=.027), and the baby's age (F=3.080, P=.028). Conclusion: Our results showed that confidence in the maternal role increased as maternal perception of infant temperament became more positive, and confirmed that confidence in the maternal role was also related to satisfaction with the maternal role. Prenatal education, type of delivery, and the baby's age were also related to maternal perception of infant temperament. A nursing intervention program tailored to developmental stage may therefore be needed to help mothers perceive their infants' temperament positively, which would in turn increase confidence in and satisfaction with the maternal role, regarded as true indicators of maternal role attainment.


Relationship between Sleep Insufficiency and Excessive Daytime Sleepiness (수면 부족과 과도한 주간졸림증의 관련성)

  • Choi, Yun-Kyeung;Lee, Heon-Jeong;Suh, Kwang-Yoon;Kim, Leen
    • Sleep Medicine and Psychophysiology / v.10 no.2 / pp.93-99 / 2003
  • Objectives: Sleep loss and excessive daytime sleepiness may have serious consequences, including traffic and industrial accidents, decreased productivity, learning disabilities and interpersonal problems. Yet despite these adverse effects, there are few epidemiological studies on sleep loss and daytime sleepiness in the general population of Korea. This study investigates the number of people who suffer from sleep insufficiency, how much recovery sleep occurs on weekends, and the relationship between the amount of recovery sleep and daytime sleepiness. Methods: A total of 164 volunteers, aged 20 and over, were recruited by advertisement. The subjects were workers and college students living in Seoul, Korea. Subjects were excluded if they were aged over 60; if they had medical, neurological, psychiatric or sleep disorders that could cause insomnia or daytime sleepiness; if they were not following a regular sleep schedule; if they traveled abroad during the study; or if they did not leave home to work or were shift workers. They were interviewed and given a sleep log to complete on each of 14 consecutive mornings. They also completed the Epworth Sleepiness Scale (ESS) at noontime on the last day of the second week. All statistical data were analyzed by t-test, $\chi^2$-test or ANOVA, using SPSS/PC+. Results: The subjects woke up at 6:50 ($\pm$1:16) on weekdays, 7:09 ($\pm$1:29) on Saturdays, and 8:12 ($\pm$1:39) on Sundays and holidays. They took more frequent and longer naps on Sundays than on weekdays and Saturdays. The mean sleep duration was 6 h 35 min on week nights, with a mean increase of about 1 h on weekends. Only 9.1% of the subjects spent more than 8 h in bed on week nights, with 67% spending less than 7 h, and 49.4% reported recovery sleep of more than 1 h on Sundays. 
The subjects who reported recovery sleep of more than 2 h on Sundays showed significantly more excessive daytime sleepiness than those who reported less than 30 min (F=2.62, p<.05). Conclusions: These findings suggest that sleep insufficiency and excessive daytime sleepiness are relatively common in Korea, and that people who get insufficient sleep on weekdays try to compensate for sleep loss by oversleeping and daytime napping on Sundays and holidays. Daily sleep insufficiency appeared to have a cumulative effect and to increase daytime sleepiness.


Usefulness of Data Mining in Criminal Investigation (데이터 마이닝의 범죄수사 적용 가능성)

  • Kim, Joon-Woo;Sohn, Joong-Kweon;Lee, Sang-Han
    • Journal of forensic and investigative science / v.1 no.2 / pp.5-19 / 2006
  • Data mining is an information extraction activity that discovers hidden facts contained in databases. Using a combination of machine learning, statistical analysis, modeling techniques and database technology, data mining finds patterns and subtle relationships in data and infers rules that allow the prediction of future results. Typical applications include market segmentation, customer profiling, fraud detection, evaluation of retail promotions, and credit risk analysis. Law enforcement agencies handle massive amounts of data when investigating crimes, and the volume is growing as computer-based data processing develops. Discovering knowledge in those data is a new challenge. Data mining can be applied in criminal investigation to find offenders by analyzing complex relational data structures and free text drawn from criminal records or statements. This study aimed to evaluate the possible applications of data mining and its limitations in practical criminal investigation. Using data mining to identify crime patterns makes it possible to cluster criminal cases for habitual crimes such as fraud and burglary. Neural network modelling, one of the tools of data mining, can be applied to criminal profiling or to comparing a suspect's photograph or handwriting with that of a convict. A case study of insurance fraud in practice showed that data mining was useful for organized crimes such as gang activity, terrorism and money laundering. However, the output of data mining in criminal investigation should be evaluated with caution, because data mining offers only a clue rather than a conclusion. Legal regulation is needed to prevent abuse by law enforcement agencies and to protect personal privacy and human rights.


The Prediction of DEA based Efficiency Rating for Venture Business Using Multi-class SVM (다분류 SVM을 이용한 DEA기반 벤처기업 효율성등급 예측모형)

  • Park, Ji-Young;Hong, Tae-Ho
    • Asia pacific journal of information systems / v.19 no.2 / pp.139-155 / 2009
  • For the last few decades, many studies have tried to explore and unveil venture companies' success factors and unique features in order to identify the sources of such companies' competitive advantages over their rivals. Such venture companies have tended to give high returns to investors, generally by making the best use of information technology. For this reason, many venture companies are keen on attracting investors' attention. Investors generally make their investment decisions by carefully examining the evaluation criteria of the alternatives. For them, credit rating information provided by international rating agencies such as Standard and Poor's, Moody's and Fitch is a crucial source on pivotal concerns such as a company's stability, growth, and risk status. But this type of information is generated only for companies issuing corporate bonds, not for venture companies. Therefore, this study proposes a method for evaluating venture businesses and presents our recent empirical results using financial data of Korean venture companies listed on KOSDAQ in the Korea Exchange. In addition, this paper uses multi-class SVM to predict the DEA-based efficiency rating for venture businesses derived from our proposed method. Our approach sheds light on ways to locate efficient companies generating high levels of profit. Above all, in determining effective ways to evaluate a venture firm's efficiency, it is important to understand the major contributing factors of such efficiency. Therefore, this paper is built on the following two ideas for classifying which venture companies are more efficient: i) making a DEA-based multi-class rating for sample companies and ii) developing a multi-class SVM-based efficiency prediction model for classifying all companies. 
First, Data Envelopment Analysis (DEA) is a non-parametric multiple input-output efficiency technique that measures the relative efficiency of decision making units (DMUs) using a linear programming based model. It is non-parametric because it requires no assumption on the shape or parameters of the underlying production function. DEA has already been widely applied for evaluating the relative efficiency of DMUs. Recently, a number of DEA-based studies have evaluated the efficiency of various types of companies, such as internet companies and venture companies. It has also been applied to corporate credit ratings. In this study we utilized DEA for sorting venture companies by efficiency-based ratings. The Support Vector Machine (SVM), on the other hand, is a popular technique for solving data classification problems. In this paper, we employed SVM to classify the efficiency ratings of IT venture companies according to the results of DEA. The SVM method was first developed by Vapnik (1995). As one of many machine learning techniques, SVM is based on statistical learning theory. Thus far, the method has shown good performance, especially in generalization capacity in classification tasks, resulting in numerous applications in many areas of business. SVM is basically an algorithm that finds the maximum margin hyperplane, which gives the maximum separation between classes. The support vectors are the training points closest to the maximum margin hyperplane. If the classes cannot be separated linearly, a kernel function can be used. In the case of nonlinear class boundaries, the inputs are transformed into a high-dimensional feature space; that is, the original input space is mapped into a high-dimensional dot-product space. Many studies have applied SVM to the prediction of bankruptcy, the forecasting of financial time series, and the problem of estimating credit ratings. In this study we employed SVM for developing a data mining-based efficiency prediction model. 
We used the Gaussian radial basis function as the kernel function of SVM. For multi-class SVM, we adopted three approaches: the one-against-one approach, which is built from binary classifiers, and two all-together methods proposed by Weston and Watkins (1999) and Crammer and Singer (2000), respectively. In this research, we used corporate information on 154 companies listed on the KOSDAQ market in the Korea Exchange. We obtained the companies' financial information for 2005 from KIS (Korea Information Service, Inc.). Using these data, we made a multi-class rating with DEA efficiency and built a data mining-based multi-class prediction model. Among the three multi-classification approaches, the Weston and Watkins method had the best hit ratio on the test data set. In multi-classification problems such as efficiency ratings of venture businesses, it is very useful for investors to know the class to within one class of error when it is difficult to determine the exact class in the actual market. We therefore presented accuracy results within 1-class errors, and the Weston and Watkins method showed 85.7% accuracy on our test samples. We conclude that the DEA-based multi-class approach in venture business generates more information than the binary classification problem, notwithstanding its efficiency level. We believe this model can help investors in decision making, as it provides a reliable tool to evaluate venture companies in the financial domain. For future research, we perceive the need to improve areas such as the variable selection process, the parameter selection of the kernel function, the generalization, and the sample size of each class.
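
Two quantities mentioned in this abstract, the Gaussian radial basis function kernel and the hit ratio "within 1-class errors", can be sketched as follows. This is only an illustration under assumptions: the abstract does not give the kernel's parameterization, so the common form exp(-gamma * ||x - y||^2) is assumed; efficiency ratings are assumed to be ordered integers; and the function names and sample ratings are hypothetical.

```python
from math import exp

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian RBF kernel, exp(-gamma * ||x - y||^2) (assumed form)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return exp(-gamma * sq_dist)

def hit_ratio(actual, predicted, tolerance=0):
    """Share of predictions whose rating differs from the actual rating
    by at most `tolerance` classes (0 = exact match)."""
    hits = sum(abs(a - p) <= tolerance for a, p in zip(actual, predicted))
    return hits / len(actual)

# Hypothetical ordered efficiency ratings (1 = most efficient class).
actual = [1, 2, 3, 4, 2, 3, 1]
predicted = [1, 3, 3, 2, 2, 4, 1]

exact = hit_ratio(actual, predicted)                  # exact-class hit ratio
within_one = hit_ratio(actual, predicted, tolerance=1)  # within 1-class errors
print(exact, within_one, rbf_kernel([1.0, 0.0], [0.0, 0.0], gamma=1.0))
```

The within-one ratio is never lower than the exact ratio, which is why it is a useful secondary measure when neighbouring rating classes are hard to tell apart.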