• Title/Abstract/Keywords: Data Classification Systems

Search results: 1,440 items (processing time: 0.028 s)

Construction of an Internet of Things Industry Chain Classification Model Based on IRFA and Text Analysis

  • Zhimin Wang
    • Journal of Information Processing Systems / Vol. 20, No. 2 / pp. 215-225 / 2024
  • With the rapid development of the Internet of Things (IoT) and big data technology, related industries generate large amounts of data during operation. Classifying this data accurately has become a core problem in data mining and processing research for the IoT industry chain. This study constructs a classification model for the IoT industry chain based on an improved random forest algorithm and text analysis, aiming to classify IoT industry chain big data efficiently and accurately by improving on traditional algorithms. The accuracy, precision, recall, and AUC values of the traditional random forest algorithm and the proposed algorithm are compared on different datasets. The experimental results show that the proposed model performs better across datasets: its accuracy and recall on four datasets exceed those of the traditional algorithm, and its accuracy on two datasets, P-I Diabetes and Loan Default, exceeds that of the random forest model, yielding better final classification results. With this model, the massive data generated in the IoT industry chain can be classified accurately, which provides further research value for data mining and processing technology in the IoT industry chain.
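
The four measures compared in this abstract can be computed as in the minimal sketch below. It is not the paper's evaluation code: the function names `binary_metrics` and `roc_auc`, the binary-label setting, and the sample labels and scores are all illustrative assumptions.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, and recall for binary labels (1 = positive)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. Assumes both classes are present.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels, predictions, and classifier scores.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1]
scores = [0.9, 0.2, 0.4, 0.8, 0.5]
print(binary_metrics(y_true, y_pred))
print(roc_auc(y_true, scores))
```

Comparing two classifiers then amounts to calling both functions on each model's predictions for the same test set.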

Generation of Efficient Fuzzy Classification Rules Using Evolutionary Algorithm with Data Partition Evaluation (데이터 분할 평가 진화알고리즘을 이용한 효율적인 퍼지 분류규칙의 생성)

  • Ryu, Joung-Woo;Kim, Sung-Eun;Kim, Myung-Won
    • Journal of the Korean Institute of Intelligent Systems / Vol. 18, No. 1 / pp. 32-40 / 2008
  • Fuzzy rules are a useful and efficient way to describe classification rules, especially when the attribute values are continuous and fuzzy in nature. However, it is generally difficult to determine membership functions that yield efficient fuzzy classification rules. In this paper, we propose a method for automatically generating efficient fuzzy classification rules using an evolutionary algorithm. Our method generates a set of initial membership functions for the evolutionary algorithm by supervised clustering of the training data set, and then evolves this set to generate fuzzy classification rules, taking into account both classification accuracy and rule comprehensibility. To reduce the time needed to evaluate an individual, we also propose an evolutionary algorithm with data partition evaluation, in which the training data set is partitioned into a number of subsets and individuals are evaluated on one randomly selected subset at a time instead of the whole training data set. We tested our algorithm on the UCI learning data sets, and the results showed that our method was more efficient on average than existing algorithms. For the evolutionary algorithm with data partition evaluation, we tested our method on the KDD'99 Cup intrusion detection data and confirmed that evaluation time was reduced by about 70%. Compared with the KDD'99 Cup winner, accuracy increased by 1.54% while cost was reduced by 20.8%.
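
The data partition evaluation idea described above can be illustrated with a toy sketch. This is not the authors' algorithm: the individual here is a single threshold rather than a set of fuzzy membership functions, and the names `partition`, `accuracy`, and `evolve`, the mutation scale, and the synthetic data are all assumptions made for illustration. The point it shows is that each generation scores individuals on one randomly chosen subset instead of the whole training set.

```python
import random


def partition(data, k):
    """Split data into k interleaved subsets of roughly equal size."""
    return [data[i::k] for i in range(k)]


def accuracy(threshold, samples):
    """Fraction of (x, label) samples matched by the rule `x >= threshold`."""
    return sum((x >= threshold) == label for x, label in samples) / len(samples)


def evolve(data, k=4, population_size=10, generations=30, seed=0):
    """Toy evolutionary search for a threshold using data partition evaluation."""
    rng = random.Random(seed)
    subsets = partition(data, k)
    population = [rng.uniform(0.0, 1.0) for _ in range(population_size)]
    for _ in range(generations):
        # Score on one randomly selected subset, not the whole training set.
        subset = rng.choice(subsets)
        ranked = sorted(population, key=lambda t: accuracy(t, subset), reverse=True)
        survivors = ranked[: population_size // 2]
        # Each survivor produces one child by Gaussian mutation.
        children = [t + rng.gauss(0.0, 0.05) for t in survivors]
        population = survivors + children
    # Only the final selection uses the full data set.
    return max(population, key=lambda t: accuracy(t, data))


# Synthetic training data: label is True exactly when x >= 0.5.
data = [(i / 100, i / 100 >= 0.5) for i in range(100)]
best = evolve(data)
print(best, accuracy(best, data))
```

With `k` subsets, each fitness evaluation touches roughly `1/k` of the training data, which is where the evaluation-time saving comes from in this simplified setting.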

The New Criterion of Classification System for Data Linkage (자료 연계성을 고려한 차종 분류 기준의 제시)

  • Kim, Yun-Seob;Oh, Ju-Sam;Kim, Hyun-Seok
    • International Journal of Highway Engineering / Vol. 7, No. 4 / pp. 57-68 / 2005
  • The vehicle classification system in Korea is operated in two different forms depending on purpose and location. An 8-category classification system is used on expressways and provincial roads, and an 11-category classification system is used on national highways. These differences reduce the practical usefulness of the collected data. This study therefore proposes a new, modified vehicle classification system to solve this problem. For classification, the study focuses not only on the mechanical survey system, which is based on vehicle specifications, but also on applicability to roadside surveys. The proposed classification system takes into account trends in vehicle types and compatibility with the other classification systems. It may be the most suitable system for the present situation in Korea.


Contribution to Improve Database Classification Algorithms for Multi-Database Mining

  • Miloudi, Salim;Rahal, Sid Ahmed;Khiat, Salim
    • Journal of Information Processing Systems / Vol. 14, No. 3 / pp. 709-726 / 2018
  • Database classification is an important preprocessing step for multi-database mining (MDM). When a multi-branch company needs to explore its distributed data for decision making, it is imperative to classify these multiple databases into clusters of similar databases before analyzing the data. To search for the best classification of a set of $n$ databases, existing algorithms generate from 1 to $(n^2-n)/2$ candidate classifications. Although each candidate classification is included in the next one (i.e., clusters in the current classification are subsets of clusters in the next classification), existing algorithms generate each classification independently, that is, without reusing clusters from the previous classification. Consequently, existing algorithms are time consuming, especially as the number of candidate classifications increases. To overcome this problem, we propose an efficient approach that recasts the problem of classifying multiple databases as the problem of identifying the connected components of an undirected weighted graph. Theoretical analysis and experiments on public databases confirm the efficiency of our algorithm compared with existing work and show that it avoids the increase in execution time.
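
The graph formulation in this abstract can be sketched as follows: databases are nodes, pairwise similarities are edge weights, and for a given threshold the clusters are the connected components of the graph restricted to edges at or above that threshold. The sketch uses a union-find structure. The function name `classify_databases`, the similarity values, and the thresholds are illustrative assumptions, not the authors' algorithm or data.

```python
from collections import defaultdict


def classify_databases(n, similarities, threshold):
    """Cluster n databases as connected components of a thresholded graph.

    `similarities` maps a pair (i, j) to its similarity. An edge exists
    only when the similarity is at least `threshold`.
    """
    parent = list(range(n))

    def find(x):
        # Follow parent links to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (i, j), sim in similarities.items():
        if sim >= threshold:
            root_i, root_j = find(i), find(j)
            if root_i != root_j:
                parent[root_j] = root_i

    clusters = defaultdict(list)
    for db in range(n):
        clusters[find(db)].append(db)
    return sorted(clusters.values())


# Hypothetical pairwise similarities between four databases.
sims = {
    (0, 1): 0.9, (0, 2): 0.2, (0, 3): 0.1,
    (1, 2): 0.3, (1, 3): 0.1, (2, 3): 0.8,
}
print(classify_databases(4, sims, 0.5))
print(classify_databases(4, sims, 0.25))
```

Lowering the threshold can only merge components, which mirrors the abstract's observation that each candidate classification is contained in the next.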

A neural network approach to defect classification on printed circuit boards (인쇄 회로 기판의 결함 검출 및 인식 알고리즘)

  • An, Sang-Seop;No, Byeong-Ok;Yu, Yeong-Gi;Jo, Hyeong-Seok
    • Journal of Institute of Control, Robotics and Systems / Vol. 2, No. 4 / pp. 337-343 / 1996
  • In this paper, we investigate defect detection using pre-made reference image data and classify the defects using an artificial neural network. The approach consists of three main steps. The first step generates two reference images using a low-level morphological technique. The second step performs three logical bit operations between the two ready-made reference images and the newly captured image under test, which yields an image containing only the defects. In the third step, four features are extracted from each detected defect and fed to the input nodes of an already trained artificial neural network, which outputs the defect class corresponding to those features. All image data are stored at the bit level to reduce data size and save time. Experimental results show that the proposed algorithms are effective for flexible defect detection, robust classification, and high-speed processing through the use of simple logic operations.


A patent analysis method for identifying core technologies: Data mining and multi-criteria decision making approach (핵심 기술 파악을 위한 특허 분석 방법: 데이터 마이닝 및 다기준 의사결정 접근법)

  • Kim, Chul-Hyun
    • Journal of the Korea Safety Management & Science / Vol. 16, No. 1 / pp. 213-220 / 2014
  • This study suggests a new approach to identifying core technologies through patent analysis. Specifically, the approach applies a data mining technique and a multi-criteria decision making method to the co-classification information of registered patents. First, technological interrelationship matrices for the intensity, relatedness, and cross-impact perspectives are constructed from the support, lift, and confidence values calculated by association rule mining on the co-classification information of patent data. Second, the analytic network process is applied to the constructed matrices to produce the importance values of technologies from each perspective. Finally, data envelopment analysis is applied to the derived importance values to prioritize technologies, combining the three perspectives. The suggested approach is expected to help technology planners formulate strategy and policy for technological innovation.
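
The support, confidence, and lift values mentioned in the first step can be computed from co-classification data as in the sketch below. It covers only the standard definitions of the three measures for pairs of classes; the function name `association_measures` and the sample patents (sets of classification codes) are illustrative assumptions, and the analytic network process and data envelopment analysis steps are not shown.

```python
from collections import Counter
from itertools import combinations


def association_measures(patents):
    """Return {(a, b): (support, confidence, lift)} for each rule a -> b.

    support    = P(a and b)
    confidence = P(b | a)
    lift       = confidence / P(b)
    """
    n = len(patents)
    single = Counter()
    pair = Counter()
    for classes in patents:
        unique = set(classes)
        single.update(unique)
        pair.update(combinations(sorted(unique), 2))

    measures = {}
    for (a, b), count in pair.items():
        support = count / n
        # Each co-occurring pair yields two directed rules.
        for x, y in ((a, b), (b, a)):
            confidence = count / single[x]
            lift = confidence / (single[y] / n)
            measures[(x, y)] = (support, confidence, lift)
    return measures


# Hypothetical patents, each tagged with the classes it is co-classified in.
patents = [
    {"G06F", "H04L"},
    {"G06F", "H04L", "G06Q"},
    {"G06F"},
    {"G06Q"},
]
for rule, values in sorted(association_measures(patents).items()):
    print(rule, values)
```

Arranging one of the three measures over all class pairs gives a square matrix of the kind the abstract calls a technological interrelationship matrix. Note that support and lift are symmetric in the pair, while confidence is directional.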

A New Lane Departure Warning System using a Support Vector Machine Classifier and a Fuzzy System

  • Kim, Sam-Yong;Oh, Se-Young
    • Institute of Control, Robotics and Systems: Conference Proceedings / Institute of Control, Robotics and Systems, ICCAS 2002 / pp.110.3-110 / 2002
  • Lane detection by TFALDA; SVM for large-scale data and multiclass classification problems; TLC classification; lateral offset estimation by IPT; lane departure warning by a fuzzy system; experimental results by HiLS; conclusion.


Application of CCTV Image and Semantic Segmentation Model for Water Level Estimation of Irrigation Channel (관개용수로 CCTV 이미지를 이용한 CNN 딥러닝 이미지 모델 적용)

  • Kim, Kwi-Hoon;Kim, Ma-Ga;Yoon, Pu-Reun;Bang, Je-Hong;Myoung, Woo-Ho;Choi, Jin-Yong;Choi, Gyu-Hoon
    • Journal of The Korean Society of Agricultural Engineers / Vol. 64, No. 3 / pp. 63-73 / 2022
  • A more accurate understanding of the irrigation water supply is necessary for efficient agricultural water management. Although water levels in irrigation canals are measured with ultrasonic water level gauges, errors occur due to malfunctions or the surrounding environment. This study applies CNN (Convolutional Neural Network) deep-learning-based image classification and segmentation models to CCTV (Closed-Circuit Television) images of an irrigation canal. The CCTV images were acquired from the irrigation canal of an agricultural reservoir in Cheorwon-gun, Gangwon-do. We used the ResNet-50 model for image classification and the U-Net model for image segmentation. Using the Natural Breaks algorithm, we divided the water level data into 2, 4, and 8 groups for the image classification models. The classification models with 2, 4, and 8 groups achieved accuracies of 1.000, 0.987, and 0.634, respectively. The image segmentation model achieved a Dice score of 0.998, and the predicted water levels showed an $R^2$ of 0.97 and an MAE (Mean Absolute Error) of 0.02 m. The image classification models can be applied to an automatic gate controller operating on four water level divisions, and the image segmentation results can serve as an alternative measurement to ultrasonic water gauges. We expect that the results of this study can provide a more scientific and efficient approach to agricultural water management.
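
The two segmentation measures reported above, the Dice score and the MAE, follow standard definitions that can be sketched as below. The flat 0/1 mask representation, the function names, and the sample values are illustrative assumptions and are unrelated to the study's data.

```python
def dice_score(pred_mask, true_mask):
    """Dice = 2 * |A and B| / (|A| + |B|) for flat binary (0/1) masks."""
    intersection = sum(p and t for p, t in zip(pred_mask, true_mask))
    total = sum(pred_mask) + sum(true_mask)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


def mean_absolute_error(predicted, observed):
    """Average absolute difference between predicted and observed values."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


# Hypothetical water-surface masks (flattened) and water levels in metres.
pred_mask = [1, 1, 0, 0]
true_mask = [1, 0, 0, 0]
print(dice_score(pred_mask, true_mask))
print(mean_absolute_error([0.50, 0.62], [0.52, 0.60]))
```

A Dice score close to 1 means the predicted and reference water regions overlap almost completely, and the MAE is expressed in the same unit as the water level itself.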

Development of Intelligent Job Classification System based on Job Posting on Job Sites (구인구직사이트의 구인정보 기반 지능형 직무분류체계의 구축)

  • Lee, Jung Seung
    • Journal of Intelligence and Information Systems / Vol. 25, No. 4 / pp. 123-139 / 2019
  • The job classification systems of major job sites differ from site to site and also differ from the job classification system of the 'SQF (Sectoral Qualifications Framework)' proposed in the SW field. A new job classification system that SW companies, SW job seekers, and job sites can all understand is therefore needed. The purpose of this study is to establish a standard job classification system that reflects market demand by analyzing the SQF based on the job posting information of major job sites and the NCS (National Competency Standards). To this end, association analysis is conducted between the occupations of major job sites, and between the SQF and those occupations, to derive association rules between occupations. Using these association rules, we propose an intelligent job classification system based on data that maps the job classification systems of major job sites to the SQF. First, major job sites are selected to obtain information on the job classification systems of the SW market. Then we identify ways to collect job information from each site and collect the data through open APIs. Focusing on the relationships in the data, we keep only the job postings that appear on multiple job sites at the same time and delete the rest. Next, we map the job classification systems between job sites using the association rules derived from the association analysis. We complete the mapping between these market segments, discuss it with experts, further map it to the SQF, and finally propose a new job classification system. As a result, more than 30,000 job listings were collected in XML format using the open APIs of 'WORKNET', 'JOBKOREA', and 'saramin', the main job sites in Korea. After filtering down to about 900 job postings posted simultaneously on multiple job sites, 800 association rules were derived by applying the Apriori algorithm, a frequent pattern mining method. 
Based on the 800 association rules, the job classification systems of WORKNET, JOBKOREA, and saramin and the SQF job classification system were mapped and classified into first through fourth levels. In the new job taxonomy, the first primary class, covering IT consulting, computer systems, networks, and security-related jobs, consists of three secondary, five tertiary, and five fourth-level classifications. The second primary class, covering databases and jobs related to system operation, consists of three secondary, three tertiary, and four fourth-level classifications. The third primary class, covering Web Planning, Web Programming, Web Design, and Game, consists of four secondary, nine tertiary, and two fourth-level classifications. The last primary class, covering jobs related to ICT management and computer and communication engineering technology, consists of three secondary and six tertiary classifications. In particular, the new job classification system has a relatively flexible depth of classification, unlike existing classification systems. WORKNET divides jobs into three levels; JOBKOREA divides jobs into two levels and subdivides them further by keyword; saramin likewise divides jobs into two levels and subdivides them further by keyword. The newly proposed standard job classification system accepts some keyword-based jobs and treats some product names as jobs. In this system, some jobs stop at the second level while others are subdivided down to the fourth level, reflecting the idea that not all jobs can be broken down to the same depth. We also combined the rules derived from association analysis of the collected market data with experts' opinions. 
Therefore, the newly proposed job classification system can be regarded as a data-based intelligent job classification system that reflects market demand, unlike existing job classification systems. This study is meaningful in that it suggests a new job classification system that reflects market demand by mapping occupations on the basis of data, through association analysis between occupations, rather than on the intuition of a few experts. However, the study is limited in that it cannot fully reflect market demand as it changes over time, because the data were collected at a single point in time. As market demand changes over time, including seasonal factors and the timing of major corporate public recruitment, continuous data monitoring and repeated experiments are needed to achieve more accurate matching. The results of this study can be used to suggest directions for improving the SQF in the SW industry, and the approach is expected to transfer to other industries, building on its success in the SW industry.

A framework for selecting information systems planning (ISP) approach (ISP 방법론 비교 선정을 위한 프레임워크)

  • Sung Kun Kim;Soon Sam Hwang
    • Journal of Information Technology Applications and Management / Vol. 9, No. 3 / pp. 129-139 / 2002
  • There are a number of information systems planning (ISP) methodologies. Historically, these methodologies have evolved to reflect new technologies and business requirements. In practice, selecting a methodology that fits a business need is not an easy task. Although a number of studies have proposed new ISP approaches, we were unable to find much research that compares existing ISP methodologies. Our study therefore presents a classification scheme for ISP approaches and provides a guideline framework for selecting the approach most suitable for a particular firm's needs. Our classification is based on the types of components covered in ISP deliverables and the distinctive characteristics of these components. Such a classification scheme and selection framework would help derive a new IT-driven enterprise model more effectively.
