• Title/Summary/Keyword: top-k classification

Sexual Maturation of the Top Shell, Omphalius rusticus (Gastropoda: Trochidae), on the Western Coast of Korea

  • Lee, Ju-Ha
    • Proceedings of the Korean Society of Fisheries Technology Conference / 2000.10a / pp.244-245 / 2000
  • The top shell, Omphalius rusticus (Gastropoda: Trochidae), is a marine mollusk that lives under rocks in the intertidal zone of the coasts of Korea and Japan, and it is one of the edible gastropods. This species is herbivorous. To date, reports on the Trochidae have covered classification, spawning periodicity, production, growth and size-frequency distribution of living populations, feeding, reproductive cycle, and induction of larval metamorphosis. (omitted)

Research on Text Classification of Research Reports using Korea National Science and Technology Standards Classification Codes (국가 과학기술 표준분류 체계 기반 연구보고서 문서의 자동 분류 연구)

  • Choi, Jong-Yun;Hahn, Hyuk;Jung, Yuchul
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.1 / pp.169-177 / 2020
  • In South Korea, the results of R&D in science and technology are submitted to the National Science and Technology Information Service (NTIS) in reports that carry Korea national science and technology standard classification codes (K-NSCC). However, because there are more than 2000 sub-categories, it is non-trivial to choose correct classification codes without a clear understanding of the K-NSCC. In addition, there are few cases of automatic document classification research based on the K-NSCC, and there are no training data in the public domain. To the best of our knowledge, this study is the first attempt to build a high-performing K-NSCC classification system based on NTIS report meta-information from a five-year period (2013-2017). To this end, about 210 mid-level categories were selected, and preprocessing was tailored to the characteristics of research report metadata. More specifically, we propose a convolutional neural network (CNN) technique that uses only task names and keywords, which are the most influential fields. The proposed model is compared with several machine learning methods that perform well in text classification (e.g., the linear support vector classifier, CNN, and gated recurrent unit), and it shows a performance advantage of 1% to 7% in top-three F1 score.
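The abstract does not define how the top-three F1 score is computed, so the following is only a sketch of one plausible reading. It assumes that a sample counts as correct when its true code is among the k highest-scored codes, in which case the true code is used as the effective prediction; otherwise the top-1 code is used. Macro-averaged F1 is then computed over those effective predictions. The function names and the score-dictionary input format are hypothetical.

```python
from collections import defaultdict


def top_k_labels(scores, k):
    """Return the k labels with the highest scores, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def top_k_macro_f1(y_true, score_dicts, k=3):
    """Macro F1 in which a prediction is credited if the true label is in the top k.

    Assumption (not from the paper): on a top-k hit the true label becomes the
    effective prediction; on a miss the top-1 label is the effective prediction.
    """
    tp = defaultdict(int)
    fp = defaultdict(int)
    fn = defaultdict(int)
    for truth, scores in zip(y_true, score_dicts):
        ranked = top_k_labels(scores, k)
        pred = truth if truth in ranked else ranked[0]
        if pred == truth:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1
    labels = set(tp) | set(fp) | set(fn)
    f1_values = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        f1_values.append(2 * tp[label] / denom if denom else 0.0)
    return sum(f1_values) / len(f1_values) if f1_values else 0.0
```

With k=1 this reduces to ordinary macro F1 on the top-1 prediction, which makes it easy to compare the two scores on the same model output.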

Acute Leukemia Classification Using Sequential Neural Network Classifier in Clinical Decision Support System (임상적 의사결정지원시스템에서 순차신경망 분류기를 이용한 급성백혈병 분류기법)

  • Lim, Seon-Ja;Vincent, Ivan;Kwon, Ki-Ryong;Yun, Sung-Dae
    • Journal of Korea Multimedia Society / v.23 no.2 / pp.174-185 / 2020
  • Leukemia-induced death is listed among the top ten causes of human mortality. One reason is a slow decision-making process, which prevents suitable medical treatment from being applied in time. Therefore, good clinical decision support for acute leukemia type classification has become a necessity. In this paper, the authors propose a novel approach to acute leukemia type classification using a sequential neural network classifier. Our experimental results cover only the first classification stage, which shows excellent performance in differentiating normal from abnormal cells. Further development is needed to demonstrate the effectiveness of the second neural network classifier.

Reliability Analysis of the railway signalling system which applied to the KNR ERP(Enterprise Resource Planning) Classification System (철도경영혁신 ERP 분류체계에 따른 철도신호시스템의 신뢰성 분석)

  • Cho, Rae-Hyuck;Park, Chae-Young;Min, Young-Hee;Yun, Hak-Sun
    • Proceedings of the KSR Conference / 2007.05a / pp.993-999 / 2007
  • With the introduction of RAMS (Reliability, Availability, Maintainability, Safety), interest in system assurance has increased. In particular, fast-growing electronic circuitry requires failure-rate analysis in which the signalling system is divided into more specific parts. Since 2005, the K.N.R. (Korean National Railway) has adopted ERP (Enterprise Resource Planning) in order to establish itself as a top international enterprise; while ordering the project it established a classification system, which it has applied to the ERP system since 2007. Because of the complexity of the classification system, the reliability analysis of the signalling system was restricted to IXL and ATP with on-board and wayside equipment. This paper estimates the MTBF (Mean Time Between Failures) and MTTR (Mean Time To Repair) of the total signalling system using the classification of the ERP program.
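As a rough illustration of how MTBF and MTTR figures for subsystems combine into system-level values, the sketch below uses the standard steady-state availability formula and the usual series-system approximation with constant failure rates. It is not the paper's calculation: the subsystem breakdown and every number in the usage note are hypothetical.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state availability A = MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def series_mtbf(mtbf_list):
    """MTBF of a series system with constant failure rates.

    Failure rates (1 / MTBF_i) add up, so the system MTBF is the reciprocal
    of their sum.
    """
    return 1.0 / sum(1.0 / m for m in mtbf_list)


def series_mttr(mtbf_list, mttr_list):
    """Mean repair time of the series system, weighted by failure rate."""
    rates = [1.0 / m for m in mtbf_list]
    return sum(r * t for r, t in zip(rates, mttr_list)) / sum(rates)


# Hypothetical subsystem figures in hours (not taken from the paper).
subsystem_mtbf = [50000.0, 20000.0, 25000.0]
subsystem_mttr = [2.0, 1.0, 4.0]

system_mtbf = series_mtbf(subsystem_mtbf)
system_mttr = series_mttr(subsystem_mtbf, subsystem_mttr)
system_availability = availability(system_mtbf, system_mttr)
```

The series assumption means any single subsystem failure counts as a system failure, which is why the system MTBF is always lower than the smallest subsystem MTBF.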

Correlation Analysis of Airline Customer Satisfaction using Random Forest with Deep Neural Network and Support Vector Machine Model

  • Hong, Sang Hoon;Kim, Bumsu;Jung, Yong Gyu
    • International Journal of Internet, Broadcasting and Communication / v.12 no.4 / pp.26-32 / 2020
  • Many airline customer evaluation datasets exist, but they are insufficient for predicting customer satisfaction in practice. In particular, little has been done to verify the value of such data or to develop a customer satisfaction prediction model from them. In this paper, airline customer satisfaction is analyzed through a correlation analysis of customer evaluation data provided on Google's Kaggle. Accuracy varied across three variable sets: all variables, the top 4 variables, and the top 8 variables with the highest correlation. To build an airline customer satisfaction prediction model, the data were applied to three classification algorithms (Random Forest, SVM, and DNN) in a classification experiment, with training and verification data split 7:3. As a result, the DNN model showed the lowest accuracy at 86.4%, the SVM model reached 89%, and the Random Forest model showed the highest accuracy at 95.7%.
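The variable-selection and data-splitting steps described above can be sketched as follows. This is a minimal illustration, assuming the ranking uses the absolute Pearson correlation between each variable and the satisfaction label (the abstract does not say which correlation measure or sign convention was used). The function names, the random seed, and the column names in the usage note are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def top_k_correlated(columns, target, k):
    """Return the k column names with the largest |correlation| to the target."""
    ranked = sorted(
        columns,
        key=lambda name: abs(pearson(columns[name], target)),
        reverse=True,
    )
    return ranked[:k]


def split_7_to_3(rows, seed=0):
    """Shuffle the rows and split them 7:3 into training and verification sets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = len(rows) * 7 // 10
    return rows[:cut], rows[cut:]
```

For example, `top_k_correlated({"delay": [...], "wifi": [...]}, satisfaction, k=4)` would give the "top 4" variable set, and the same call with `k=8` the "top 8" set, before any of the three classifiers is trained.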

Assessment of Priority Order Using the Chemical to Cause to Generate Occupational Diseases and Classification by GHS (직업병발생 물질과 GHS분류 자료를 이용한 화학물질 우선순위 평가)

  • Baik, Nam-Sik;Chung, Jin-Do;Park, Chan-Hee
    • Journal of Environmental Science International / v.19 no.6 / pp.715-735 / 2010
  • This study assesses the priority order of chemicals that cause occupational diseases, in order to provide the fundamental data required for preparing health protection measures for workers who handle chemicals. Out of 110,608 domestic and overseas chemicals, 41 of the 51 chemicals that cause nationally recognized occupational diseases were selected as study objects, based on whether they are used domestically and whether occupational diseases have occurred. To assess their priority order, a total score was obtained for each chemical from its actual GHS classification, against maximum scores of 90 points for physical riskiness and 92 points for health toxicity. The priority orders for the GHS riskiness assessment, the GHS toxicity assessment, and the combined GHS toxicity-riskiness assessment (riskiness plus toxicity) were then obtained by multiplying each result by a weight for occupational disease occurrence. The top five chemicals in the GHS riskiness assessment were urethane, copper, chlorine, manganese, and thiomersal, in that order. In the GHS toxicity assessment, the top five were aluminum, iron oxide, manganese, copper, and cadmium (metal), in that order. In the combined GHS toxicity-riskiness assessment, the top five were copper, urethane, iron oxide, chlorine, and phenanthrene, in that order. Because GHS classification data on physical riskiness or health toxicity are missing or uncertain for some of these chemicals even though they have caused occupational diseases, countermeasures based on the above are urgently needed to protect the health of workers who handle or are exposed to chemicals.

New Feature Selection Method for Text Categorization

  • Wang, Xingfeng;Kim, Hee-Cheol
    • Journal of information and communication convergence engineering / v.15 no.1 / pp.53-61 / 2017
  • The preferred feature selection methods for text classification are filter-based. In a common filter-based feature selection scheme, unique scores are assigned to features; then, these features are sorted according to their scores. The last step is to add the top-N features to the feature set. In this paper, we propose an improved global feature selection scheme wherein its last step is modified to obtain a more representative feature set. The proposed method aims to improve the classification performance of global feature selection methods by creating a feature set representing all classes almost equally. For this purpose, a local feature selection method is used in the proposed method to label features according to their discriminative power on classes; these labels are used while producing the feature sets. Experimental results obtained using the well-known 20 Newsgroups and Reuters-21578 datasets with the k-nearest neighbor algorithm and a support vector machine indicate that the proposed method improves the classification performance in terms of a widely known metric ($F_1$).
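The difference between the common top-N step and a class-balanced final step can be sketched as below. The abstract does not give the paper's actual labeling or selection rule, so the round-robin pick across classes is only a stand-in chosen for illustration. The function names, feature names, scores, and class labels in the usage note are all hypothetical.

```python
def global_top_n(scores, n):
    """Common filter scheme: sort features by score and keep the best n."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


def class_balanced_top_n(scores, feature_class, n):
    """Pick features round-robin across classes, best-scored first within a class.

    feature_class maps each feature to the class it discriminates best,
    as a local feature selection method might label it (assumption).
    """
    by_class = {}
    for feature in sorted(scores, key=scores.get, reverse=True):
        by_class.setdefault(feature_class[feature], []).append(feature)
    queues = [by_class[c] for c in sorted(by_class)]
    selected = []
    while len(selected) < n and any(queues):
        for queue in queues:
            if queue and len(selected) < n:
                selected.append(queue.pop(0))
    return selected


# Hypothetical scores and class labels for six terms.
scores = {"goal": 0.9, "match": 0.8, "team": 0.7, "stock": 0.4, "bank": 0.3, "vote": 0.2}
feature_class = {
    "goal": "sports", "match": "sports", "team": "sports",
    "stock": "finance", "bank": "finance", "vote": "politics",
}
```

On this toy input, `global_top_n(scores, 3)` keeps only sports terms, while `class_balanced_top_n(scores, feature_class, 3)` keeps one term per class, which is the kind of imbalance the proposed method is meant to address.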

Comparative Evaluation of Machine Learning Models for Predicting Soccer Injury Types

  • Davronbek Malikov;Jaeho Kim;Jung Kyu Park
    • Journal of the Korean Society of Industry Convergence / v.27 no.2_1 / pp.257-268 / 2024
  • Soccer is a sport that carries a high risk of injury. An injury not only harms the unlucky player's career; because soccer is a team-based game, it can also hurt team performance and finances. The duration of recovery from a soccer injury typically depends on its type and severity. Therefore, in this paper we use machine learning to predict the probability of a player's injury type. Furthermore, we compare different machine learning models to find the best-fitting one. This paper uses several supervised classification models, including Decision Tree, Random Forest, K-Nearest Neighbors (KNN), and Naive Bayes. Based on our findings, the KNN and Decision Tree models achieved the highest accuracy at 70%, surpassing the other models. The Random Forest model followed with an accuracy of 62%. Among the evaluated models, the Naive Bayes model showed the lowest accuracy at 56%. We gathered information about 54 professional soccer players who are playing in the top five European leagues, based on their career history.

Generating Rank-Comparison Decision Rules with Variable Number of Genes for Cancer Classification (순위 비교를 기반으로 하는 다양한 유전자 개수로 이루어진 암 분류 결정 규칙의 생성)

  • Yoon, Young-Mi;Bien, Sang-Jay;Park, Sang-Hyun
    • The KIPS Transactions:PartD / v.15D no.6 / pp.767-776 / 2008
  • Microarray technology is used extensively in the field of experimental molecular biology. Microarray experiments generate quantitative expression measurements for thousands of genes simultaneously, which is useful for the phenotype classification of many diseases. One of the two major problems in microarray data classification is that the number of genes exceeds the number of tissue samples. The other problem is that current methods generate classifiers that are accurate but difficult to interpret. Our paper addresses these two problems. We performed a direct integration of individual microarrays with the same biological objectives by transforming each expression value into a rank value within a sample, and generated rank-comparison decision rules with a variable number of genes for cancer classification. Our classifier is an ensemble method consisting of the k top-scoring decision rules. Each rule contains a number of genes, a relationship among the involved genes, and a class label. Current classifiers that are also ensemble methods consist of k top-scoring decision rules as well. However, these classifiers fix the number of genes in each rule as a pair or a triple. In this paper we generalize the number of genes involved in each rule, which ranges from 2 to N. Generalizing the number of genes increases the robustness and reliability of the classifier for the class prediction of an independent sample. In addition, our classifier is readily interpretable, is accurate with a small number of genes, and shows potential for use in a clinical setting.
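To make the idea of a rank-comparison rule concrete, the sketch below implements the fixed-size baseline the abstract refers to: an ensemble of the k top-scoring gene pairs, where each rule compares two expression values within one sample. It does not implement the paper's generalization to rules with 2 to N genes. The scoring formula (the difference between the two classes in how often one gene ranks below the other) is the usual top-scoring-pair criterion and is an assumption here, as are the function names and the two-class restriction.

```python
from itertools import combinations


def pair_score(samples, labels, i, j, class_a, class_b):
    """Score the rule "gene i < gene j" and return (score, class favored when true)."""
    def prob(cls):
        rows = [s for s, y in zip(samples, labels) if y == cls]
        return sum(1 for s in rows if s[i] < s[j]) / len(rows)

    p_a, p_b = prob(class_a), prob(class_b)
    return abs(p_a - p_b), (class_a if p_a > p_b else class_b)


def fit_k_top_scoring_pairs(samples, labels, k):
    """Return the k best rules as (score, i, j, class_if_less, class_otherwise)."""
    class_a, class_b = sorted(set(labels))  # assumes exactly two classes
    rules = []
    for i, j in combinations(range(len(samples[0])), 2):
        score, cls_if_less = pair_score(samples, labels, i, j, class_a, class_b)
        other = class_b if cls_if_less == class_a else class_a
        rules.append((score, i, j, cls_if_less, other))
    rules.sort(key=lambda rule: rule[0], reverse=True)
    return rules[:k]


def predict(rules, sample):
    """Majority vote over the rules; ties go to the alphabetically first class."""
    votes = {}
    for _, i, j, cls_if_less, other in rules:
        vote = cls_if_less if sample[i] < sample[j] else other
        votes[vote] = votes.get(vote, 0) + 1
    return max(sorted(votes), key=votes.get)
```

Because each rule depends only on the ordering of values inside a single sample, it is unaffected by any monotone rescaling of that sample, which is what allows microarrays from different experiments to be combined directly.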

The Effects of Industry Classification on a Successful ERP Implementation Model

  • Lee, Sangmin;Kim, Dongho
    • Journal of Information Processing Systems / v.12 no.1 / pp.169-181 / 2016
  • Organizations in some industries are still hesitant to adopt Enterprise Resource Planning (ERP) systems because of their high risk of failure. This study examined how industry classification affects the successful implementation of an ERP system. To achieve this goal, we reinvestigated the existing ERP Success Model developed by Chung using data from various industry sectors, since Chung validated the model only in the engineering and construction industries. To test whether the Chung model is applicable outside the engineering and construction industries, the relationships between the ERP success indicators and the critical success factors in the Chung model were compared with those in sample data collected from ten different industry sectors. The ten industry sectors were selected based on the Global Industry Classification Standard (GICS). We found that the impact of success factors on the success of implementing an ERP system varied across industry sectors. This means that the success of ERP system implementation can be industry-specific. Thus, industry classification should be considered as another factor to help IT decision makers or top management avoid ERP system failures when they plan to implement a new ERP system.