A New Method for Hyperspectral Data Classification

  • Dehghani, Hamid; Ghassemian, Hassan
    • Proceedings of the KSRS Conference / 2003.11a / pp.637-639 / 2003
  • As the number of spectral bands of high spectral resolution data increases, the capability to detect more detailed classes should also increase, and the classification accuracy should increase as well. Often, however, it is impossible to obtain enough training pixels for supervised classification, so traditional classification methods perform poorly. In this paper, we propose a new classification model based on decision fusion. In this classifier, learning is performed in two steps. In the first step, only training samples are used; in the second step, the classifier uses semi-labeled samples in addition to the original training samples. At the start of the method, the spectral bands are divided into several small groups. The information in each group is treated as a new source and classified. Each of these primary classifiers has its own characteristics and partitions the spectral space in its own way. By combining the strengths of all primary classifiers, the fused local decisions are made sufficiently accurate. In the decision fusion center, a set of rules determines the final class of each pixel. The method is applied to real remote sensing data. The results show that classification performance improves, and that the method may alleviate the shortage of training samples in high-dimensional data and mitigate the Hughes phenomenon.
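
The band-grouping and decision-fusion idea in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration, not the authors' method: the primary classifier is a nearest-centroid model, the fusion rule is a plain majority vote, and the names `split_bands`, `fit_centroids`, `predict_one`, and `fuse_predict` are hypothetical. The second learning step with semi-labeled samples is not shown.

```python
from collections import Counter

def split_bands(n_bands, group_size):
    """Partition band indices 0..n_bands-1 into consecutive small groups."""
    return [list(range(i, min(i + group_size, n_bands)))
            for i in range(0, n_bands, group_size)]

def fit_centroids(samples, labels, bands):
    """Per-class mean spectrum restricted to `bands` (assumed primary classifier)."""
    counts = Counter(labels)
    sums = {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(bands))
        for j, b in enumerate(bands):
            acc[j] += x[b]
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict_one(centroids, x, bands):
    """Local decision: class whose centroid is closest on this band group."""
    def dist(c):
        return sum((x[b] - c[j]) ** 2 for j, b in enumerate(bands))
    return min(sorted(centroids), key=lambda y: dist(centroids[y]))

def fuse_predict(models, x):
    """Fusion center: majority vote over the local decisions (assumed rule)."""
    votes = Counter(predict_one(c, x, bands) for bands, c in models)
    return votes.most_common(1)[0][0]

# Toy 6-band training pixels for two hypothetical classes "a" and "b".
train = [
    [0.1, 0.2, 0.1, 0.8, 0.9, 0.8],
    [0.2, 0.1, 0.2, 0.9, 0.8, 0.9],
    [0.8, 0.9, 0.8, 0.1, 0.2, 0.1],
    [0.9, 0.8, 0.9, 0.2, 0.1, 0.2],
]
labels = ["a", "a", "b", "b"]

groups = split_bands(6, 2)  # three groups of two bands each
models = [(g, fit_centroids(train, labels, g)) for g in groups]

# A pixel whose first two bands look like class "b" but whose other bands look like "a".
noisy_pixel = [0.9, 0.9, 0.2, 0.8, 0.9, 0.8]
fused = fuse_predict(models, noisy_pixel)
```

In this toy case one band group votes for "b" and the other two vote for "a", so the fused decision is "a"; a single misleading band group does not decide the result.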

Classification of Objects using CNN-Based Vision and Lidar Fusion in Autonomous Vehicle Environment

  • G. Komali; A. Sri Nagesh
    • International Journal of Computer Science & Network Security / v.23 no.11 / pp.67-72 / 2023
  • In the past decade, Autonomous Vehicle Systems (AVS) have advanced at an exponential rate, particularly due to improvements in artificial intelligence, which have had a significant impact on societal and road safety and on the future of transportation systems. The real-time fusion of light detection and ranging (LiDAR) and camera data is known to be a crucial process in many applications, such as autonomous driving, industrial automation, and robotics. Especially in the case of autonomous vehicles, the efficient fusion of data from these two types of sensors is important for estimating the depth of objects as well as classifying objects at short and long distances. This paper presents the classification of objects using CNN-based vision and LiDAR fusion in an autonomous vehicle environment. The method is based on a convolutional neural network (CNN) and image upsampling theory. The LiDAR point cloud is upsampled and converted into pixel-level depth information, which is then combined with the red-green-blue (RGB) data and fed into a deep CNN. The proposed method can obtain an informative feature representation for object classification in an autonomous vehicle environment using the integrated vision and LiDAR data. The method is adopted to guarantee both object classification accuracy and minimal loss. Experimental results show the effectiveness and efficiency of the presented approach for object classification.
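
The step of turning sparse LiDAR returns into pixel-level depth and combining it with RGB can be sketched as follows. This is a rough illustration under stated assumptions: the abstract does not say which upsampling scheme is used, so nearest-neighbour filling stands in for it, the LiDAR points are assumed to be already projected into image coordinates as `(u, v, depth)`, and `densify_depth` and `stack_rgbd` are hypothetical names. The CNN itself is not shown.

```python
def densify_depth(points, height, width):
    """Nearest-neighbour upsampling (an assumed stand-in): every pixel takes
    the depth of the closest projected LiDAR point (u, v, depth)."""
    dense = [[0.0] * width for _ in range(height)]
    for v in range(height):
        for u in range(width):
            nearest = min(points, key=lambda p: (p[0] - u) ** 2 + (p[1] - v) ** 2)
            dense[v][u] = nearest[2]
    return dense

def stack_rgbd(rgb, depth):
    """Append the depth value to each pixel's (R, G, B), giving 4 channels."""
    return [[list(rgb[v][u]) + [depth[v][u]] for u in range(len(rgb[0]))]
            for v in range(len(rgb))]

# Toy 2x4 image with two projected LiDAR returns (hypothetical values).
height, width = 2, 4
rgb = [[(10 * u, 20 * v, 30) for u in range(width)] for v in range(height)]
lidar_points = [(0, 0, 5.0), (3, 1, 20.0)]

depth_map = densify_depth(lidar_points, height, width)
rgbd = stack_rgbd(rgb, depth_map)  # height x width x 4, ready as CNN input
```

Each pixel of `rgbd` then carries four values, which is the shape a CNN with a four-channel input layer would consume.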

An Analysis on Classification Retrieval Operation in University Libraries (대학도서관의 분류검색 운영 분석)

  • Lee Jong-Moon
    • Journal of Korean Library and Information Science Society / v.36 no.2 / pp.165-178 / 2005
  • This study aims to identify the status of classification retrieval operation by investigating and analyzing classification retrieval for books in university libraries. The investigation concentrated on whether a classification retrieval service is provided, the access method, and the classification retrieval level. Data were collected from the 97 libraries whose URLs were accessible during the survey period, out of 100 libraries selected by systematic sampling. As a result, while 92.8% of the 97 libraries provided a classification retrieval service, 52.2% of these enabled access to the service only by classification number, and 47.8% by classification number and classification directory. Consequently, it was found that the retrieval environment in libraries where access was enabled only by classification number should be urgently improved to promote the use of classification retrieval.

Research on Comparing the Size of the Data Workforce Across Countries (국가간 데이터직무 인력 규모 비교 연구)

  • Hyemi Um
    • Journal of Information Technology Applications and Management / v.31 no.1 / pp.79-95 / 2024
  • In modern society, as data plays a crucial role at the levels of businesses, industries, and nations, the utilization of data becomes increasingly important. Consequently, governments are prioritizing the development and implementation of plans to cultivate a data workforce, viewing the data industry as a cornerstone of national strategy. To enhance domestic capabilities and nurture the workforce in the data industry, it is deemed necessary to conduct an objective comparative analysis with major foreign countries. Therefore, this study aims to analyze cases of domestic and international data industries and explore methods for quantitatively comparing the data industry workforce across nations. Initially, the study distinguishes between the "data industry workforce" and the "data job-related workforce," focusing particularly on professionals handling data-related tasks. Subsequently, it compares the size of the data job-related workforce across nations, using standardized occupational classification codes based on the International Standard Classification of Occupations (ISCO). However, it should be noted that countries employing their own occupational classification systems often require matching job titles with similar meanings for accurate comparison. Through this study, it is anticipated that policymakers will be able to establish future directions for cultivating the data workforce on a comparable basis.

A Case Study of Land-cover Classification Based on Multi-resolution Data Fusion of MODIS and Landsat Satellite Images (MODIS 및 Landsat 위성영상의 다중 해상도 자료 융합 기반 토지 피복 분류의 사례 연구)

  • Kim, Yeseul
    • Korean Journal of Remote Sensing / v.38 no.6_1 / pp.1035-1046 / 2022
  • This study evaluated the applicability of multi-resolution data fusion for land-cover classification. In the evaluation, a spatial time-series geostatistical deconvolution/fusion model (STGDFM) was applied as the multi-resolution data fusion model. The study area consisted of agricultural lands in the state of Iowa, United States. Moderate Resolution Imaging Spectroradiometer (MODIS) and Landsat satellite images were used as input data for the fusion, considering the landscape of the study area. Using STGDFM, synthetic Landsat images were generated for the dates on which Landsat images were missing. Land-cover classification was then performed using both the acquired Landsat images and the STGDFM fusion results as input data. To evaluate the applicability of multi-resolution data fusion, two classification results were compared: one using only Landsat images and one using both Landsat images and fusion results. In the result using only Landsat images, mixed patterns were prominent in the corn and soybean cultivation areas, which are the main land-cover types in the study area. Mixed patterns among vegetation land-cover types, such as hay and grain areas and grass areas, were also pronounced. In the result using both Landsat images and fusion results, by contrast, these mixed patterns among vegetation land-cover types, as well as between corn and soybean, were greatly alleviated. As a result, the classification accuracy improved by about 20 percentage points when both Landsat images and fusion results were used. The missing Landsat images appear to have been compensated for by the time-series spectral information of the MODIS images, which STGDFM carries into the fusion results. This study confirmed that multi-resolution data fusion can be effectively applied to land-cover classification.

Using Genetic Rule-Based Classifier System for Data Mining (유전자 알고리즘을 이용한 데이터 마이닝의 분류 시스템에 관한 연구)

  • Han, Myung-Mook
    • Journal of Internet Computing and Services / v.1 no.1 / pp.63-72 / 2000
  • Data mining is the process of nontrivial extraction of hidden knowledge or potentially useful information from data in large databases. Data mining is a multidisciplinary field of research; machine learning, statistics, and computer science all make a contribution. Different classification schemes can be used to categorize data mining methods based on the kinds of tasks to be implemented and the kinds of application classes to be utilized, and classification has been identified as an important task in the emerging field of data mining. Since classification is a basic element of human thinking, it is a well-studied problem in a wide variety of applications. In this paper, we propose a classifier system based on a genetic algorithm with robust properties, and the proposed system is evaluated by applying it to the nDmC problem related to the classification task in data mining.

A study on data standardization and utilization for disaster and safety management in educational facilities (교육시설 재난안전관리를 위한 데이터 표준화 및 활용방안 연구)

  • Kang, Seong-Kyung; Lee, Young-Jai
    • The Journal of Information Systems / v.27 no.2 / pp.175-196 / 2018
  • Purpose: The purpose of this study is to identify problems in current educational facility data management and to recommend a standardized terminology classification system as a solution. In addition, the research aims to present a preemptive and integrated disaster and safety management framework for educational facilities by seeking efficient business processes through secured data quality, systematic data management, and external data linkage and analysis. Design/methodology/approach: A terminology classification system was established through various processes, including the filtering and analysis of related data such as laws, manuals, educational facility accidents, and historical records. The terminology classification system was then further reviewed through several consultations with experts and practitioners. In addition, the accumulated data was refined according to the established standard terminology and an Excel database was developed. Based on the data, accident patterns that occurred in educational facilities over the past 10 years were analyzed. Findings: In the study, a template was developed to collect consistent data for the standardized disaster and safety management terminology classification system in educational facilities. In addition, standardized data utilization methods are presented from the viewpoints of 'education facility disaster safety data management', 'data analysis and insight', 'business management through data', and 'leaping into big data management'.

Supervised Learning-Based Collaborative Filtering Using Market Basket Data for the Cold-Start Problem

  • Hwang, Wook-Yeon; Jun, Chi-Hyuck
    • Industrial Engineering and Management Systems / v.13 no.4 / pp.421-431 / 2014
  • The market basket data in the form of a binary user-item matrix or a binary item-user matrix can be modelled as a binary classification problem. The binary logistic regression approach tackles the binary classification problem, where principal components are predictor variables. If users or items are sparse in the training data, the binary classification problem can be considered as a cold-start problem. The binary logistic regression approach may not function appropriately if the principal components are inefficient for the cold-start problem. Assuming that the market basket data can also be considered as a special regression problem whose response is either 0 or 1, we propose three supervised learning approaches: random forest regression, random forest classification, and elastic net to tackle the cold-start problem, comparing the performance in a variety of experimental settings. The experimental results show that the proposed supervised learning approaches outperform the conventional approaches.

Implementation of a Particle Swarm Optimization-based Classification Algorithm for Analyzing DNA Chip Data

  • Han, Xiaoyue; Lee, Min-Soo
    • Genomics & Informatics / v.9 no.3 / pp.134-135 / 2011
  • DNA chips are used for experiments on genes and provide useful information that can be further analyzed. Using the data extracted from DNA chips to find useful patterns or information has become a very important issue. In this paper, we describe an application developed for classifying DNA chip data using a classification method based on the Particle Swarm Optimization (PSO) algorithm. Because DNA chip data is extremely large and has a fuzzy characteristic, an algorithm that imitates an ecosystem, such as the PSO algorithm, is well suited to analyzing such data. The application enables researchers to customize the PSO algorithm parameters and see detailed results of the classification rules.

Data Reduction for Classification using Entropy-based Partitioning and Center Instances (엔트로피 기반 분할과 중심 인스턴스를 이용한 분류기법의 데이터 감소)

  • Son, Seung-Hyun; Kim, Jae-Yearn
    • Journal of Korean Society of Industrial and Systems Engineering / v.29 no.2 / pp.13-19 / 2006
  • Instance-based learning is a machine learning technique that has proven successful over a wide range of classification problems. Despite its high classification accuracy, however, it has a relatively high storage requirement, and because it must search through all instances to classify unseen cases, it is slow to perform classification. In this paper, we present a new data reduction method for instance-based learning that integrates the strengths of instance partitioning and attribute selection. Experimental results show that reducing the amount of data for instance-based learning reduces data storage requirements, lowers computational costs, minimizes noise, and can facilitate a more rapid search.
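
As a rough illustration of how entropy-based partitioning and center instances might reduce a training set, the sketch below recursively splits the instances on the attribute threshold that most lowers the weighted class entropy, then keeps one mean instance per final partition. The splitting rule, the use of the mean as the "center", and the names `entropy`, `best_split`, and `reduce_instances` are assumptions made for illustration; the paper's actual procedure, including its attribute selection step, is not described in the abstract and is not reproduced here.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split(rows):
    """Return the (attribute, threshold) that most reduces weighted entropy,
    or None if no split improves on the current partition."""
    best, best_score = None, entropy([y for _, y in rows])
    for a in range(len(rows[0][0])):
        values = sorted({x[a] for x, _ in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between adjacent distinct values
            left = [y for x, y in rows if x[a] <= t]
            right = [y for x, y in rows if x[a] > t]
            score = (len(left) * entropy(left) + len(right) * entropy(right)) / len(rows)
            if score < best_score:
                best, best_score = (a, t), score
    return best

def reduce_instances(rows):
    """Replace each final partition by a single center instance (the mean)."""
    split = best_split(rows)
    if split is None:
        dim = len(rows[0][0])
        center = [sum(x[i] for x, _ in rows) / len(rows) for i in range(dim)]
        label = Counter(y for _, y in rows).most_common(1)[0][0]
        return [(center, label)]
    a, t = split
    left = [r for r in rows if r[0][a] <= t]
    right = [r for r in rows if r[0][a] > t]
    return reduce_instances(left) + reduce_instances(right)

# Four toy instances (features, label) collapse to two center instances.
data = [([1.0, 1.0], "a"), ([2.0, 1.0], "a"), ([8.0, 1.0], "b"), ([9.0, 1.0], "b")]
reduced = reduce_instances(data)
```

A nearest-neighbour classifier would then search only `reduced` instead of the full training set, which is where the storage and search-time savings mentioned in the abstract would come from.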