• Title/Summary/Keyword: information classification

Improvement of location positioning using KNN, Local Map Classification and Bayes Filter for indoor location recognition system

  • Oh, Seung-Hoon;Maeng, Ju-Hyun
    • Journal of the Korea Society of Computer and Information / v.26 no.6 / pp.29-35 / 2021
  • In this paper, we propose a method that combines KNN (K-Nearest Neighbor), Local Map Classification, and a Bayes Filter to increase the accuracy of location positioning. In this technique, Local Map Classification first divides the actual map into several clusters, and KNN then classifies the clusters. The posterior probability is then calculated by the Bayes Filter from the probability of each cluster, and this posterior probability is used to find the cluster in which the robot is located. For performance evaluation, the positioning results obtained by applying KNN, Local Map Classification, and the Bayes Filter were analyzed. The analysis confirmed that, even if the RSSI signal changes, the location information stays fixed to one cluster and the accuracy of location positioning increases.
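
The cluster-level update described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the RSSI fingerprints, the cluster names `A` and `B`, the vote smoothing, and the stay/move transition model are all assumptions made for the example. KNN votes are treated as a likelihood-like score per cluster, and a discrete Bayes filter combines them with the previous belief.

```python
from collections import Counter
import math

# Hypothetical RSSI fingerprints: (signal strengths from two access points, cluster label).
FINGERPRINTS = [
    ((-40.0, -70.0), "A"), ((-42.0, -68.0), "A"), ((-45.0, -72.0), "A"),
    ((-70.0, -40.0), "B"), ((-68.0, -43.0), "B"), ((-72.0, -45.0), "B"),
]
CLUSTERS = ["A", "B"]


def knn_likelihood(rssi, k=3, smoothing=0.1):
    """Turn the labels of the k nearest fingerprints into a smoothed per-cluster score."""
    nearest = sorted(FINGERPRINTS, key=lambda fp: math.dist(fp[0], rssi))[:k]
    votes = Counter(label for _, label in nearest)
    total = k + smoothing * len(CLUSTERS)
    return {c: (votes[c] + smoothing) / total for c in CLUSTERS}


def bayes_update(prior, likelihood, stay=0.9):
    """One discrete Bayes filter step: predict with a stay/move model, then correct."""
    move = (1.0 - stay) / (len(CLUSTERS) - 1)
    predicted = {
        c: sum(prior[p] * (stay if p == c else move) for p in CLUSTERS)
        for c in CLUSTERS
    }
    unnormalized = {c: predicted[c] * likelihood[c] for c in CLUSTERS}
    z = sum(unnormalized.values())
    return {c: v / z for c, v in unnormalized.items()}


def track(readings):
    """Return the most probable cluster after each RSSI reading."""
    belief = {c: 1.0 / len(CLUSTERS) for c in CLUSTERS}
    path = []
    for rssi in readings:
        belief = bayes_update(belief, knn_likelihood(rssi))
        path.append(max(belief, key=belief.get))
    return path
```

In this toy setup, a single noisy reading such as `(-57.0, -56.0)` gets a KNN majority for cluster `B`, but the filtered estimate from `track` stays in `A` when the surrounding readings support `A`, which mirrors the "fixed to one cluster" behavior the abstract reports.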

Effects of Preprocessing on Text Classification in Balanced and Imbalanced Datasets

  • Mehmet F. Karaca
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.3 / pp.591-609 / 2024
  • In this study, all combinations of preprocessing methods were examined for their effects on reducing the number of words, shortening processing time, and classification success, using balanced datasets and datasets imbalanced at different ratios. The reductions in word count and in processing time achieved by preprocessing were interrelated. Classification was more successful on the Turkish datasets, and the English datasets were affected more by whether the dataset was balanced. In highly imbalanced datasets, documents from classes with few documents were misclassified into the topically closest class in the Turkish datasets and into the class with many documents in the English datasets. In terms of average scores, the highest classification success was obtained in the Turkish datasets by not applying lowercasing, applying stemming, and removing stop words, and in the English datasets by applying lowercasing and stemming and removing stop words. Stemming was the preprocessing method that most increased success in the Turkish datasets, whereas stop-word removal played that role in the English datasets. The maximum scores showed that feature selection, feature size, and the classifier affect classification success more than preprocessing does. It was concluded that preprocessing is necessary for text classification because it shortens processing time and can yield high classification success, that a given preprocessing method does not have the same effect in all languages, and that different preprocessing methods are more successful for different languages.
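
As a rough illustration of examining "all combinations" of preprocessing steps, the sketch below enumerates every on/off combination of lowercasing, stemming, and stop-word removal and reports the resulting vocabulary size. The stop-word list and the `toy_stem` suffix stripper are hypothetical stand-ins, not the tools used in the study.

```python
from itertools import product

STOP_WORDS = {"the", "a", "is", "are", "of", "and", "in"}  # tiny illustrative list


def toy_stem(word):
    """Crude suffix stripping, a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text, lowercase, stem, remove_stop):
    """Apply the selected preprocessing steps to whitespace-separated tokens."""
    tokens = text.split()
    if lowercase:
        tokens = [t.lower() for t in tokens]
    if remove_stop:
        tokens = [t for t in tokens if t.lower() not in STOP_WORDS]
    if stem:
        tokens = [toy_stem(t) for t in tokens]
    return tokens


def vocabulary_sizes(docs):
    """Map each (lowercase, stem, remove_stop) combination to its vocabulary size."""
    sizes = {}
    for flags in product([False, True], repeat=3):
        vocab = set()
        for doc in docs:
            vocab.update(preprocess(doc, *flags))
        sizes[flags] = len(vocab)
    return sizes
```

Calling `vocabulary_sizes(["The classifier is training", "the Classifiers trained in batches"])` shows how each combination shrinks the vocabulary, which is the word-count effect the abstract ties to processing time.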

A Study of the Application of Relative Location System and Minute Classification System in the DDC (DDC의 상관식 배가법 적용과 분류체계 세분화에 대한 연구)

  • Kwak, Chul-Wan
    • Journal of Korean Library and Information Science Society / v.48 no.3 / pp.45-61 / 2017
  • The objective of this study is to understand the application of the relative location system and the minute classification system in the DDC and to identify their effects in the late 19th century. To achieve this objective, four main investigation areas were chosen: the relative location system, the minute classification system, and the DDC's influence on other libraries and on classification systems. First, the DDC replaced the fixed location system with a relative location system for arranging books on the shelves, a revolutionary change that opened the period of modern library classification systems. Second, it used a minute classification system and could therefore classify books on narrow subjects. Third, it applied form as a criterion for dividing divisions and sections, which helped in classifying books. Fourth, it used a decimal numerical notation as its classification system, so people could use it economically and practically. Last, the DDC influenced modern classification systems such as the Expansive Classification and the Subject Classification. The DDC is a library classification system suited to the needs of the times, and it is a practical classification system for each library.

Traffic Classification Using Machine Learning Algorithms in Practical Network Monitoring Environments (실제 네트워크 모니터링 환경에서의 ML 알고리즘을 이용한 트래픽 분류)

  • Jung, Kwang-Bon;Choi, Mi-Jung;Kim, Myung-Sup;Won, Young-J.;Hong, James W.
    • The Journal of Korean Institute of Communications and Information Sciences / v.33 no.8B / pp.707-718 / 2008
  • Traffic classification methodology is shifting from payload-based or port-based approaches to machine learning-based approaches in order to cope with dynamic changes in application characteristics. However, current work on traffic classification using machine learning (ML) algorithms is carried out in offline environments. Specifically, most current studies report traffic classification results using cross validation as the test method, and they report results on a per-flow basis. Such results are not useful in practical network traffic monitoring environments. This paper compares classification results obtained with cross validation against those obtained with split validation as the test method. It also compares flow-based classification results with byte-based ones. We classify network traffic using various feature sets and machine learning algorithms such as J48, REPTree, RBFNetwork, Multilayer Perceptron, BayesNet, and NaiveBayes. We identify the best feature sets and the best ML algorithm for classifying traffic under split validation.
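
The difference between flow-based and byte-based results can be sketched with a few hypothetical flow records. The record fields, application labels, and byte counts below are invented for illustration, and `split_validation` reads "split validation" as a single hold-out split of the trace, which is an assumption rather than the paper's stated procedure.

```python
# Hypothetical flow records: byte count, true application, predicted application.
FLOWS = [
    {"bytes": 100, "true": "dns", "pred": "dns"},
    {"bytes": 100, "true": "dns", "pred": "dns"},
    {"bytes": 100, "true": "http", "pred": "http"},
    {"bytes": 9700, "true": "p2p", "pred": "http"},  # one large misclassified flow
]


def flow_accuracy(flows):
    """Fraction of flows whose predicted application matches the true one."""
    correct = sum(1 for f in flows if f["true"] == f["pred"])
    return correct / len(flows)


def byte_accuracy(flows):
    """Fraction of bytes carried by correctly classified flows."""
    total = sum(f["bytes"] for f in flows)
    correct = sum(f["bytes"] for f in flows if f["true"] == f["pred"])
    return correct / total


def split_validation(flows, train_fraction=0.5):
    """Single hold-out split: train on the head of the trace, test on the tail."""
    cut = int(len(flows) * train_fraction)
    return flows[:cut], flows[cut:]
```

With these records, three of four flows are classified correctly, yet the single misclassified flow carries 97% of the bytes, so the two metrics diverge sharply. This is one way to see why per-flow results alone may mislead in a monitoring setting.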

A Formal Presentation of the Extensional Object Model (외연적 객체모델의 정형화)

  • Jeong, Cheol-Yong
    • Asia pacific journal of information systems / v.5 no.2 / pp.143-176 / 1995
  • We present an overview of the Extensional Object Model (ExOM) and describe in detail the learning and classification components which integrate concepts from machine learning and object-oriented databases. The ExOM emphasizes flexibility in information acquisition, learning, and classification which are useful to support tasks such as diagnosis, planning, design, and database mining. As a vehicle to integrate machine learning and databases, the ExOM supports a broad range of learning and classification methods and integrates the learning and classification components with traditional database functions. To ensure the integrity of ExOM databases, a subsumption testing rule is developed that encompasses categories defined by type expressions as well as concept definitions generated by machine learning algorithms. A prototype of the learning and classification components of the ExOM is implemented in Smalltalk/V Windows.

The Audio Signal Classification System Using Contents Based Analysis

  • Lee, Kwang-Seok;Kim, Young-Sub;Han, Hag-Yong;Hur, Kang-In
    • Journal of information and communication convergence engineering / v.5 no.3 / pp.245-248 / 2007
  • In this paper, we study content-based analysis and classification of audio data, based on the composition of a feature parameter database, in order to implement an audio data indexing and search system. Audio data is classified into various primitive auditory types. We describe the analysis and feature extraction methods for the feature parameters that can be used for audio data classification. We then build the feature parameter database in units of index groups, and compare and analyze the audio data with respect to the audio categories, focusing on the inclusion level and the index criterion. Based on this result, we compose feature vectors of the audio data according to the classification categories and simulate classification using a discriminant function.

An Improved PSO Algorithm for the Classification of Multiple Power Quality Disturbances

  • Zhao, Liquan;Long, Yan
    • Journal of Information Processing Systems / v.15 no.1 / pp.116-126 / 2019
  • In this paper, an improved one-against-one support vector machine algorithm is used to classify multiple power quality disturbances. To solve the problem of parameter selection, an improved particle swarm optimization algorithm is proposed to optimize the parameters of the support vector machine. With a new inertia weight expression, the particle swarm optimization algorithm can conduct an effective global search at the outset and an effective local search later in the run, which improves the overall classification accuracy. The experimental results show that the improved particle swarm optimization method classifies multiple power quality disturbances more accurately than grid search optimization and other improved particle swarm optimization methods. Furthermore, the number of support vectors is reduced.
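
The abstract does not give the paper's inertia weight expression, so the sketch below substitutes the common linearly decreasing schedule to show the general idea: a large weight early favors global exploration, and a small weight later favors local refinement. The `sphere` objective stands in for the SVM parameter-selection objective, and all constants (`w_max`, `w_min`, `c1`, `c2`, swarm size) are illustrative assumptions.

```python
import random


def inertia(t, total, w_max=0.9, w_min=0.4):
    """Linearly decreasing inertia weight (a textbook schedule, not the paper's)."""
    return w_max - (w_max - w_min) * t / max(total - 1, 1)


def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


def pso(objective, dim=2, particles=20, iterations=100,
        bound=5.0, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective`; returns (best position, best value, best-value history)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]
    for t in range(iterations):
        w = inertia(t, iterations)
        for i in range(particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history
```

To tune a classifier in this style, `objective` would map a parameter vector to a validation error; swapping in a different `inertia` function is the only change needed to try another schedule.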

Identifying Core Robot Technologies by Analyzing Patent Co-classification Information

  • Jeon, Jeonghwan;Suh, Yongyoon;Koh, Jinhwan;Kim, Chulhyun;Lee, Sanghoon
    • Asian Journal of Innovation and Policy / v.8 no.1 / pp.73-96 / 2019
  • This study suggests a new approach for identifying core robot technologies based on technological cross-impact. Specifically, the approach applies data mining techniques and multi-criteria decision-making methods to the co-classification information of registered robot patents. First, a cross-impact matrix of confidence values is constructed by applying association rule mining (ARM) to the co-classification information of the patents. Next, the analytic network process (ANP) is applied to the co-classification frequency matrix to derive the weight of each robot technology. Then, the technique for order preference by similarity to ideal solution (TOPSIS) is applied to the derived cross-impact matrix and weights to identify core robot technologies from the overall cross-impact perspective. The proposed approach is expected to help robot technology managers formulate strategy and policy for technology planning in the robotics field.
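
The first step, building a cross-impact matrix from association-rule confidence, can be sketched as below. The patent records and their class codes are invented for illustration, and the ANP and TOPSIS stages are not covered. The confidence of the rule a → b is computed in the usual way, as the share of patents classified in `a` that are also classified in `b`.

```python
# Hypothetical patents, each represented by its set of classification codes.
PATENTS = [
    {"B25J", "G05B"},
    {"B25J", "G05B", "G06N"},
    {"B25J"},
    {"G05B", "G06N"},
]
CLASSES = ["B25J", "G05B", "G06N"]


def cross_impact(patents, classes):
    """Return {(a, b): confidence(a -> b)} for every ordered pair of distinct classes."""
    support = {c: sum(1 for p in patents if c in p) for c in classes}
    matrix = {}
    for a in classes:
        for b in classes:
            if a == b:
                continue
            both = sum(1 for p in patents if a in p and b in p)
            matrix[(a, b)] = both / support[a] if support[a] else 0.0
    return matrix
```

The matrix is asymmetric by construction: in this toy data every `G06N` patent is also in `G05B`, so the impact of `G06N` on `G05B` is 1.0, while the reverse direction is only 2/3.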

Texture Classification Using Local Neighbor Differences (지역 근처 차이를 이용한 텍스쳐 분류에 관한 연구)

  • Saipullah, Khairul Muzzammil;Peng, Shao-Hu;Park, Min-Wook;Kim, Deok-Hwan
    • Proceedings of the Korea Information Processing Society Conference / 2010.04a / pp.377-380 / 2010
  • This paper proposes a texture descriptor for texture classification called Local Neighbor Differences (LND). LND is a highly discriminative texture descriptor that is also robust to illumination changes. The proposed descriptor uses the sign of the differences between surrounding pixels in a local neighborhood. The differences of those pixels are thresholded to form an 8-bit binary codeword. The decimal values of these 8-bit codewords are computed and are called LND values. A histogram of the resulting LND values is created and used as the feature that describes the texture information of an image. Experiments on texture classification accuracy were performed using the OUTEX_TC_00001 test suite. The results show that LND outperforms the local binary patterns (LBP) method, with an average classification accuracy of 92.3% versus 90.7% for LBP.
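
A descriptor of this kind might be sketched as follows. The abstract does not say which neighbor pairs are differenced, so this example assumes each of the 8 neighbors in a 3x3 window is compared with the next neighbor clockwise. That pairing, the bit order, and the `>= 0` threshold are assumptions, not the published definition.

```python
# Offsets of the 8 neighbours in a 3x3 window, clockwise from the top-left.
RING = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lnd_code(img, r, c):
    """8-bit code from the signs of differences between adjacent ring neighbours."""
    ring = [img[r + dr][c + dc] for dr, dc in RING]
    code = 0
    for i in range(8):
        bit = 1 if ring[i] - ring[(i + 1) % 8] >= 0 else 0
        code |= bit << i
    return code


def lnd_histogram(img):
    """256-bin histogram of codes over all interior pixels of a 2-D list image."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lnd_code(img, r, c)] += 1
    return hist
```

Because only differences between neighbors enter the code, adding a constant brightness offset to every pixel leaves the histogram unchanged, which is consistent with the illumination robustness the abstract claims.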

A Study on the Establishment of the Construction Failure Information Classification (건설실패정보 분류체계 구축에 관한 연구)

  • Park, Chan-Sik;Jeon, Yong-Seok;Shin, Young-Hwan;Jang, Nae-Chun
    • Korean Journal of Construction Engineering and Management / v.4 no.1 s.13 / pp.97-105 / 2003
  • Although construction failure information has been reported in the literature, research reports, and other sources, it is difficult to utilize because no classification system for such information exists. Therefore, this study investigated and analyzed the literature of domestic and foreign research institutions and proposed the Construction Failure Information Classification (CFIC). The CFIC is composed of four classified items: facility general information, failure situation information, failure cause information, and failure countermeasure information. Each item is divided into sub-items. Through the CFIC, construction failure information can be standardized and used as data to prevent the recurrence of construction failures.