• Title/Summary/Keyword: information classification


Comparison Between Optimal Features of Korean and Chinese for Text Classification (한중 자동 문서분류를 위한 최적 자질어 비교)

  • Ren, Mei-Ying;Kang, Sinjae
    • Journal of the Korean Institute of Intelligent Systems / v.25 no.4 / pp.386-391 / 2015
  • This paper proposes optimal features for text classification based on the linguistic characteristics of Korean and Chinese. The experiments sought to determine which features perform best among n-grams, which are known to be language independent, morphemes, which are language dependent, and feature sets that combine n-grams and morphemes. An SVM classifier and Internet news articles were used for text classification. As a result, the bi-gram was the best feature for Korean text categorization, with the highest F1-measure of 87.07%, while for Chinese document classification the combined feature set 'uni-gram+noun+verb+adjective+idiom' showed the best performance, with the highest F1-measure of 82.79%.
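As a rough illustration of the n-gram features mentioned in the abstract above, the following minimal sketch extracts character uni-grams and bi-grams from Korean and Chinese strings. The function names `char_ngrams` and `ngram_features` are hypothetical, and dropping whitespace before forming n-grams is an assumption; the paper's exact tokenization is not described in the abstract.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> list[str]:
    """Return overlapping character n-grams, ignoring whitespace (an assumption)."""
    chars = [c for c in text if not c.isspace()]
    return ["".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]


def ngram_features(text: str, n: int) -> Counter:
    """Count n-gram occurrences; such counts could feed a classifier such as an SVM."""
    return Counter(char_ngrams(text, n))


# Bi-grams for a Korean phrase, uni-grams for a Chinese phrase.
korean_bigrams = ngram_features("문서 분류", 2)
chinese_unigrams = ngram_features("文本分类", 1)
```

Because the extraction works on raw characters, the same code applies to both languages, which is the sense in which n-grams are language independent.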

A Study on the Development of a Classification Model for Terminological Relationships (용어관계의 분류 모형 개발에 관한 연구)

  • Baek, Ji-Won;Chung, Yeon-Kyoung
    • Journal of the Korean Society for Information Management / v.23 no.1 s.59 / pp.63-81 / 2006
  • The purpose of this study is to identify the limitations of terminological relationships in the current information environment and to propose a solution that leads to richer and more refined terminological resources. To this end, various kinds of terminological relationships in knowledge organization systems and in theoretical research were collected and analyzed. Based on the analysis, a methodology for classifying terminological relationships was suggested and classification models were presented. Additionally, four suggestions were made for the practical use of the classification models.

Improving the Performance of a Fast Text Classifier with Document-side Feature Selection (문서측 자질선정을 이용한 고속 문서분류기의 성능향상에 관한 연구)

  • Lee, Jae-Yun
    • Journal of Information Management / v.36 no.4 / pp.51-69 / 2005
  • High-speed classification has become an important research issue in text categorization systems. A fast text categorization technique named feature value voting was recently introduced for text categorization problems, but its classification accuracy is not as good as its classification speed. We present a novel approach to feature selection, named document-side feature selection, and apply it to the feature value voting method. In this approach, no feature selection is performed in the learning phase; instead, real-time feature selection is executed in the classification phase. Our results show that feature value voting with document-side feature selection enables a fast and accurate text classification system whose classification performance appears competitive with Support Vector Machines, the state-of-the-art text categorization algorithm.

Cloud-Type Classification by Two-Layered Fuzzy Logic

  • Kim, Kwang Baek
    • International Journal of Fuzzy Logic and Intelligent Systems / v.13 no.1 / pp.67-72 / 2013
  • Cloud detection and analysis from satellite images has been a topic of research in many atmospheric and environmental studies; however, it is still a challenging task for many reasons. In this paper, we propose a new method for cloud-type classification using fuzzy logic. Knowing that visible-light images of clouds contain thickness-related information, while infrared images have height-related information, we propose a two-layered fuzzy logic based on the input source to provide a relatively clear-cut threshold in classification. Traditional noise-removal methods that use reflection/release characteristics of infrared images often produce false positive cloud areas, such as fog, thereby negatively affecting the classification accuracy. In this study, we used the color information from source images to extract the region of interest while avoiding false positives. The structure of fuzzy inference was also changed, because we utilized three types of source images: visible-light, infrared, and near-infrared images. When a cloud appears in both the visible-light image and the infrared image, the fuzzy membership function has a different form. Therefore, we designed two sets of fuzzy inference rules and related classification rules. In our experiment, the proposed method was verified to be efficient and more accurate than the previous fuzzy logic attempt that used infrared image features.

A Study on A Computerized Input Data Model for A General-Purpose Project Management (교량공사를 중심으로 한 범용 프로젝트 관리를 위한 전산 입력 자료 모형 구축)

  • Park, Hongtae
    • Journal of the Society of Disaster Information / v.12 no.1 / pp.19-31 / 2016
  • The purpose of this study was to establish an initial computerized management database that can be applied to a general-purpose computer system for project management and operation. The database construction model presented in this paper covers the organization, activities, and operations of bridge construction (two abutments, three spans), based on an organization information classification system consisting of facility classification, functional component classification, work classification, and resource classification. The database model established in this study is expected to support systematic and scientific management in future general-purpose project management and operations.

A Study on the Classification Scheme of the Internet Search Engine (인터넷 탐색엔진에 관한 연구)

  • 김영보
    • Journal of the Korean BIBLIA Society for library and Information Science / v.8 no.1 / pp.197-227 / 1997
  • The main purpose of this study is ① to organize and comparatively analyze the classification schemes of Internet search engines, and ② to build a compatible model of Internet search engine classification for seeking information on Internet resources, especially in the Computers and Internet subject areas. For this study, four Internet search engines (Excite, 1-Detect, Simmany, Yahoo Korea!), the Inspec Classification, and two dictionaries were used. The major findings and results of the analysis are summarized as follows: 1. The criteria for evaluating the classification are the scope of topics, the system logic, the clearness, and the efficiency. 2. The scope of topics was compared by the number of items in each search engine. Excite was the best of the four. 3. The system logic was compared by the causality balance and consistency of the items in each search engine. Excite was the best of the four. 4. The clearness was compared by the clearness and accuracy of the items and by how well searchers recognize them. Excite was the best of the four. 5. The efficiency was compared by the exactness of indexing and the reduction in searchers' effort. Yahoo Korea! was the best of the four. 6. The compatible model of Internet search engine classification was established to improve the scope of topics, the system logic, the clearness, and the efficiency. The model divides the area mainly by topics and resources, using the 'bookmark' and 'shadow' concepts.


Object Classification based on Weakly Supervised E2LSH and Saliency map Weighting

  • Zhao, Yongwei;Li, Bicheng;Liu, Xin;Ke, Shengcai
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.1 / pp.364-380 / 2016
  • The most popular approach in object classification is based on the bag of visual-words model, which has several fundamental problems that restrict its performance, such as low time efficiency, the synonymy and polysemy of visual words, and the lack of spatial information between visual words. In view of this, an object classification method based on weakly supervised E2LSH and saliency map weighting is proposed. First, E2LSH (Exact Euclidean Locality Sensitive Hashing) is employed to generate a group of weakly randomized visual dictionaries by clustering SIFT features of the training dataset, and the selection of hash functions is supervised, inspired by the random forest idea, to reduce the randomness of E2LSH. Second, the graph-based visual saliency (GBVS) algorithm is applied to detect the saliency maps of different images and to weight the visual words according to the saliency prior. Finally, a saliency-map-weighted visual language model is used to accomplish object classification. Experimental results on the Pascal 2007 and Caltech-256 datasets indicate that the distinguishability of objects is effectively improved and that the method is superior to state-of-the-art object classification methods.
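The hashing step named in the abstract above can be sketched with the standard p-stable LSH function for Euclidean distance, h(v) = floor((a·v + b) / w), where a is drawn from a Gaussian and b uniformly from [0, w). This is only a generic E2LSH sketch under stated assumptions: the helper names `make_hash` and `make_bucket_key`, the parameter values, and the 4-dimensional toy descriptor are hypothetical, and the paper's weakly supervised selection of hash functions is not reproduced here.

```python
import math
import random


def make_hash(dim: int, w: float, rng: random.Random):
    """Build one p-stable LSH function h(v) = floor((a . v + b) / w)."""
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # Gaussian projection vector
    b = rng.uniform(0.0, w)                        # random offset within one bucket width

    def h(v):
        return math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / w)

    return h


def make_bucket_key(dim: int, k: int, w: float, seed: int = 0):
    """Concatenate k hash values into one bucket key (one hash table)."""
    rng = random.Random(seed)
    hashes = [make_hash(dim, w, rng) for _ in range(k)]
    return lambda v: tuple(h(v) for h in hashes)


# Toy 4-dimensional descriptor; real SIFT descriptors have 128 dimensions.
key = make_bucket_key(dim=4, k=3, w=4.0, seed=42)
descriptor = [0.1, 0.5, 0.2, 0.9]
bucket = key(descriptor)
```

Descriptors that fall into the same bucket can be treated as the same visual word, which avoids a costly clustering pass over all features.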

A Study on the Classification of Agriculture (농학분야의 문헌분류 체계에 관한 연구)

  • 김정현;이명규
    • Journal of Korean Library and Information Science Society / v.34 no.1 / pp.239-260 / 2003
  • The purpose of this study is to devise a classification scheme for arranging agricultural information efficiently. First, it defines agricultural science and examines its content and structure. It then compares the agriculture sections of the current KDC with those of DDC, UDC, and NDC, and examines the AGRICOLA SCC. On this basis, the study presents a new classification scheme for agricultural science. The new scheme is organized, in order, into the basic theories related to agriculture, plant agriculture, animal agriculture, food products, and auxiliary disciplines. The main divisions comprise 23 items.


EVALUATION OF SPEED AND ACCURACY FOR COMPARISON OF TEXTURE CLASSIFICATION IMPLEMENTATION ON EMBEDDED PLATFORM

  • Tou, Jing Yi;Khoo, Kenny Kuan Yew;Tay, Yong Haur;Lau, Phooi Yee
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2009.01a / pp.89-93 / 2009
  • Embedded systems are becoming more popular as many embedded platforms have become more affordable. They offer a compact solution for many different problems, including computer vision applications. Texture classification can be used to solve various problems, and implementing it on embedded platforms will help in bringing these applications to market. This paper proposes deploying texture classification algorithms on an embedded computer vision (ECV) platform. Two algorithms are compared: grey level co-occurrence matrices (GLCM) and Gabor filters. Experimental results show that raw GLCM in MATLAB achieves 50 ms, making it the fastest algorithm on the PC platform. The classification speed achieved in C on the PC and ECV platforms is 43 ms and 3708 ms, respectively. Raw GLCM achieves only 90.86% accuracy, compared with 91.06% for the combined feature (GLCM and Gabor filters). Overall, considering both classification speed and accuracy, raw GLCM is the more suitable choice for implementation on the ECV platform.
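To make the GLCM feature mentioned in the abstract above concrete, here is a minimal pure-Python sketch that counts grey-level co-occurrences for a single pixel offset and derives the common contrast statistic. The function names, the choice of a horizontal offset, and the small 4-level test image are illustrative assumptions, not the paper's implementation or parameters.

```python
def glcm(image, levels, dr=0, dc=1):
    """Count how often grey level i has grey level j at offset (dr, dc)."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:  # skip pairs that leave the image
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def contrast(matrix):
    """Contrast: sum of (i - j)^2 weighted by the normalized co-occurrence counts."""
    total = sum(sum(row) for row in matrix)
    weighted = sum(
        (i - j) ** 2 * count
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )
    return weighted / total


# A 4x4 image with 4 grey levels, using each pixel's right-hand neighbour.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
counts = glcm(image, levels=4)
texture_contrast = contrast(counts)
```

The matrix is built in a single pass over the pixels, which is consistent with raw GLCM being attractive on constrained hardware.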


The SWG Component Technology Classification Scheme Research through the Technology Trend Analysis

  • Son, Hong Min;Hu, Jong Wan
    • Journal of Korea Water Resources Association / v.48 no.11 / pp.945-955 / 2015
  • The technology of the SWG (Smart Water Grid), one of the most important national projects, poses a significant task that is closely associated with systematic management and effective operation. A directory and classification of the individual component techniques are required to manage their research and development (R&D) information effectively. The national science and technology (S&T) standard classification tree, a representative example, was established to manage R&D information, human resources, and budgets. It has been revised every five years and used in various fields related to the evaluation, administration, and forecasting of national R&D projects. In addition, standard classification systems for R&D projects have been widely used by UNESCO (United Nations Educational, Scientific and Cultural Organization) and the EU (European Union) since the Frascati Manual was established by the Organization for Economic Cooperation and Development (OECD). Therefore, it is necessary to develop a standard S&T classification tree for SWG techniques for research management and evaluation. For this, it is essential to identify the core techniques of the SWG, which incorporate IT (Information Technology), NT (Nanotechnology), and BT (Biotechnology).