• Title/Summary/Keyword: data classification

Web-based synthetic-aperture radar data management system and land cover classification

  • Dalwon Jang;Jaewon Lee;Jong-Seol Lee
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.7 / pp.1858-1872 / 2023
  • With advances in radar technology, synthetic aperture radar (SAR) images are becoming increasingly available. To broaden the application of SAR images, this paper proposes a management system for them. The system provides a trainable land cover classification module and displays SAR images on a map. Users can create their own classifier from their own data and obtain classification results for newly captured SAR images by applying that classifier to them. The classifier is based on a convolutional neural network structure. Because SAR images differ depending on the capture method and device, a fixed classifier cannot cover all types of SAR land cover classification problems, so the system lets each user create a classifier. Experiments show that the module works well with two different SAR datasets. With this system, SAR data and land cover classification results are managed and easily displayed.

Decision Tree Classifier for Multiple Abstraction Levels of Data (다중 추상화 수준의 데이터를 위한 결정 트리 분류기)

  • Jeong, Min-A;Lee, Do-Heon
    • The KIPS Transactions:PartD / v.10D no.1 / pp.23-32 / 2003
  • Because data are collected from disparate sources in many real data mining environments, it is common to have data values at different abstraction levels. This paper shows that such multiple abstraction levels can cause undesirable effects in decision tree classification. After explaining that forcibly equalizing abstraction levels does not provide a satisfactory solution to this problem, it presents a method that uses the data as they are. The proposed method accommodates the generalization/specialization relationship between data values in both the construction phase and the class assignment phase of decision tree classification. The experimental results show that the proposed method significantly reduces classification error rates when multiple abstraction levels of data are involved.

Classification of Genes Based on Age-Related Differential Expression in Breast Cancer

  • Lee, Gunhee;Lee, Minho
    • Genomics & Informatics / v.15 no.4 / pp.156-161 / 2017
  • Transcriptome analysis has been widely used to make biomarker panels to diagnose cancers. In breast cancer, the age of the patient has been known to be associated with clinical features. As clinical transcriptome data have accumulated significantly, we classified all human genes based on age-specific differential expression between normal and breast cancer cells using public data. We retrieved the values for gene expression levels in breast cancer and matched normal cells from The Cancer Genome Atlas. We divided genes into two classes by paired t test without considering age in the first classification. We carried out a secondary classification of genes for each class into eight groups, based on the patterns of the p-values, which were calculated for each of the three age groups we defined. Through this two-step classification, gene expression was eventually grouped into 16 classes. We showed that this classification method could be applied to establish a more accurate prediction model to diagnose breast cancer by comparing the performance of prediction models with different combinations of genes. We expect that our scheme of classification could be used for other types of cancer data.
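
The arithmetic behind the 16 classes in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes that each step is a significance call at a hypothetical threshold `ALPHA = 0.05`, so that the three age-group p-values give 2^3 = 8 patterns and the two first-step classes give 2 × 8 = 16 classes. The function name and the class numbering are invented for the example.

```python
from itertools import product

ALPHA = 0.05  # hypothetical significance threshold, not taken from the paper


def gene_class(overall_p, age_group_ps, alpha=ALPHA):
    """Map one gene to a class index in 0..15.

    overall_p: p-value of the paired t test that ignores age (step 1).
    age_group_ps: three p-values, one per age group (step 2).
    """
    if len(age_group_ps) != 3:
        raise ValueError("expected one p-value per age group (3 in total)")
    first = int(overall_p < alpha)           # 2 first-step classes
    pattern = 0
    for p in age_group_ps:                   # 2**3 = 8 significance patterns
        pattern = (pattern << 1) | int(p < alpha)
    return first * 8 + pattern


# Enumerate every possible outcome to confirm that there are 16 classes.
all_classes = {
    gene_class(overall, ps)
    for overall in (0.01, 0.5)
    for ps in product((0.01, 0.5), repeat=3)
}
print(len(all_classes))
```

The paper may define the eight second-step groups differently (for example, by the direction of change), in which case only the pattern encoding would need to change.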

An Intelligent System of Marker Gene Selection for Classification of Cancers using Microarray Data (마이크로어레이 데이터를 이용한 암 분류 표지 유전자 선별 시스템)

  • Park, Su-Young;Jung, Chai-Yeoung
    • Journal of the Korea Institute of Information and Communication Engineering / v.14 no.10 / pp.2365-2370 / 2010
  • Microarray-based cancer classification can make cancer classification more accurate by statistically finding gene expression patterns that differ by cancer type. To classify cancers effectively with current microarray technology, it is therefore essential to select informative genes that are closely related to a particular cancer class. In this paper, the proposed system detects marker genes that are likely to be the most differentially expressed, and thus to explain the effects of cancer, using ovarian cancer microarray data. It then compares the classification performance of the proposed system with that of an established microarray system, using a multilayer perceptron neural network. The microarray data set containing marker genes selected by the ANOVA method yielded the highest classification accuracy, 98.61%, which shows that the proposed system improves classification performance over the established microarray system.
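
As a rough illustration of ANOVA-based marker gene selection, the sketch below ranks genes by their one-way ANOVA F statistic and keeps the top ones. It is an assumption-laden toy, not the system described in the abstract: the gene names, expression values, and the choice to keep two genes are all hypothetical, and the neural network stage is omitted.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of expression values."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the samples around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical expression values: [cancer samples, normal samples] per gene.
expression = {
    "gene_A": [[5.0, 6.0, 7.0], [1.0, 2.0, 3.0]],
    "gene_B": [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]],
    "gene_C": [[2.0, 3.0, 4.0], [1.0, 2.0, 3.0]],
}

# Rank genes from most to least differentially expressed and keep the top two.
ranked = sorted(expression, key=lambda g: anova_f(expression[g]), reverse=True)
markers = ranked[:2]
print(markers)
```

A larger F value means the between-class difference is large relative to the within-class spread, which is the sense in which a gene is "informative" here.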

Development of Accident Classification Model and Ontology for Effective Industrial Accident Analysis based on Textmining (효과적인 산업재해 분석을 위한 텍스트마이닝 기반의 사고 분류 모형과 온톨로지 개발)

  • Ahn, Gilseung;Seo, Minji;Hur, Sun
    • Journal of the Korean Society of Safety / v.32 no.5 / pp.179-185 / 2017
  • Accident analysis is an essential process for producing the basic data needed for accident prevention. Most studies rely on survey data and accident statistics to analyze accidents, but these kinds of data are not sufficient for systematic and detailed analysis. In this paper, we propose an accident classification model that extracts the task type, original cause materials, accident type, and number of deaths from accident reports. The classification model is a support vector machine (SVM) with word occurrence features, and these features are selected based on mutual information. Experiments show that the proposed model can extract the task type, original cause materials, accident type, and number of deaths with almost 100% accuracy. We also develop an accident ontology to express the information extracted by the classification model. Finally, we illustrate how the proposed classification model and ontology work effectively in accident analysis. The classification model and ontology are expected to support effective analysis of various accidents.
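
The feature selection step mentioned above, choosing word occurrence features by mutual information, can be sketched as follows. This is only an illustration under stated assumptions: the reports, tokens, and labels are invented (the real reports are presumably in Korean), the number of features kept is arbitrary, and the SVM that would consume the selected features is left out.

```python
import math
from collections import Counter


def mutual_information(docs, labels, word):
    """Mutual information (in bits) between occurrence of `word` and the label."""
    n = len(docs)
    joint = Counter((word in doc, label) for doc, label in zip(docs, labels))
    occurrence = Counter(word in doc for doc in docs)
    classes = Counter(labels)
    mi = 0.0
    for (present, label), count in joint.items():
        p_xy = count / n
        p_x = occurrence[present] / n
        p_y = classes[label] / n
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi


# Hypothetical tokenized accident reports and their accident types.
reports = [
    {"worker", "fell", "scaffold"},
    {"fell", "ladder"},
    {"hand", "caught", "press"},
    {"caught", "conveyor", "worker"},
]
labels = ["fall", "fall", "caught", "caught"]

vocabulary = sorted(set().union(*reports))
# Keep the two words whose occurrence says the most about the accident type.
selected = sorted(
    vocabulary,
    key=lambda w: (-mutual_information(reports, labels, w), w),
)[:2]
print(selected)
```

In this toy data "worker" appears once in each class, so it carries no information about the label and is not selected.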

Land Use Classification of TM Imagery in Hilly Areas: Integration of Image Processing and Expert Knowledge

  • Ding, Feng;Chen, Wenhui;Zheng, Daxian
    • Proceedings of the KSRS Conference / 2003.11a / pp.1329-1331 / 2003
  • Improving classification accuracy has been one of the major concerns in remote sensing application research in recent years. Previous research shows that the accuracy of conventional classification methods based only on the original spectral information was usually unsatisfactory and needed to be refined by manual editing. This paper describes a method that combines image processing, ancillary data (such as a digital elevation model), and expert knowledge (especially the knowledge of local professionals) to improve the efficiency and accuracy of satellite image classification in hilly land. First, the Landsat TM data were geo-referenced. Second, the individual bands of the image were intensity-normalized, and a normalized difference vegetation index (NDVI) image was generated. Third, a set of sample pixels (collected from a field survey) was used to discover their corresponding DN (digital number) ranges in the NDVI image and to explore the relationships between land use types and their corresponding spectral features. Then, using the knowledge discovered in the previous steps as well as knowledge from local professionals, and with the support of GIS technology and the ancillary data, a set of conditional statements was applied to classify the TM imagery. The results showed that integrating image processing with the spatial analysis functions of GIS improved the overall classification result compared with conventional methods.
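
The combination of NDVI and conditional statements described above can be sketched per pixel as follows. NDVI is the standard ratio (NIR - Red) / (NIR + Red), which for Landsat TM uses band 4 (near infrared) and band 3 (red). Everything else is an assumption made for illustration: the thresholds, class names, and the use of elevation and slope are hypothetical, and the sketch works on raw NDVI in [-1, 1] rather than on the DN ranges used in the paper.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    total = nir + red
    if total == 0:
        return 0.0  # avoid division by zero on empty pixels
    return (nir - red) / total


def classify_pixel(nir, red, elevation_m, slope_deg):
    """Hypothetical rule set; thresholds and classes are illustrative only."""
    value = ndvi(nir, red)
    if value < 0.0:
        return "water"
    if value < 0.2:
        return "built-up or bare land"
    # Ancillary terrain data separate vegetated classes with similar NDVI.
    if slope_deg > 25 or elevation_m > 500:
        return "forest"
    return "cropland"


print(classify_pixel(nir=80, red=20, elevation_m=650, slope_deg=10))
print(classify_pixel(nir=80, red=20, elevation_m=100, slope_deg=5))
```

The two calls use the same spectral values and differ only in terrain, which is the kind of case where ancillary data and local knowledge can change the result of a purely spectral classifier.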

Development of Intelligent Job Classification System based on Job Posting on Job Sites (구인구직사이트의 구인정보 기반 지능형 직무분류체계의 구축)

  • Lee, Jung Seung
    • Journal of Intelligence and Information Systems / v.25 no.4 / pp.123-139 / 2019
  • The job classification systems of major job sites differ from site to site and also differ from the job classification system of the 'SQF (Sectoral Qualifications Framework)' proposed by the SW field. A new job classification system is therefore needed that SW companies, SW job seekers, and job sites can all understand. The purpose of this study is to establish a standard job classification system that reflects market demand by analyzing SQF on the basis of job offer information from major job sites and the NCS (National Competency Standards). For this purpose, association analysis is conducted between the occupations of major job sites, and between SQF and those occupations, to derive association rules between occupations. Using these association rules, we propose an intelligent, data-based job classification system that maps the job classification systems of major job sites to the SQF job classification system. First, major job sites are selected to obtain information on the job classification systems of the SW market. Then we identify ways to collect job information from each site and collect data through open APIs. To focus on the relationships in the data, we keep only the job information posted on multiple job sites at the same time and delete the rest. Next, we map the job classification systems between job sites using the association rules derived from the association analysis. We complete the mapping between these market classification systems, discuss it with experts, further map it to SQF, and finally propose a new job classification system. As a result, more than 30,000 job listings were collected in XML format using the open APIs of 'WORKNET', 'JOBKOREA', and 'saramin', the main job sites in Korea. After filtering to retain about 900 job postings posted simultaneously on multiple job sites, 800 association rules were derived by applying the Apriori algorithm, a frequent pattern mining method.
Based on the 800 association rules, the job classification systems of WORKNET, JOBKOREA, and saramin and the SQF job classification system were mapped and classified into first through fourth levels. In the new job taxonomy, the first primary classification, covering jobs related to IT consulting, computer systems, networks, and security, consists of three secondary classifications, five tertiary classifications, and five fourth-level classifications. The second primary classification, covering jobs related to databases and system operation, consists of three secondary, three tertiary, and four fourth-level classifications. The third primary classification, covering web planning, web programming, web design, and games, consists of four secondary, nine tertiary, and two fourth-level classifications. The last primary classification, covering jobs related to ICT management and computer and communication engineering technology, consists of three secondary and six tertiary classifications. In particular, the new job classification system has a relatively flexible depth of classification, unlike existing classification systems. WORKNET divides jobs down to the third level. JOBKOREA divides jobs down to the second level and then subdivides them by keyword, and saramin likewise divides jobs down to the second level and subdivides them in keyword form. The newly proposed standard job classification system accepts some keyword-based jobs and treats some product names as jobs. In this system, some jobs stop at the second level, while others are subdivided down to the fourth level, reflecting the idea that not all jobs can be broken down into the same number of levels. We also combined the rules obtained by association analysis of the collected market data with experts' opinions.
Therefore, the newly proposed job classification system can be regarded as a data-based intelligent job classification system that reflects market demand, unlike existing job classification systems. This study is meaningful in that it suggests a new job classification system that reflects market demand by mapping occupations to one another on the basis of data, through association analysis between occupations, rather than on the intuition of a few experts. However, the study is limited in that it cannot fully reflect market demand that changes over time, because the data were collected at a single point in time. As market demand changes over time, including seasonal factors and the timing of major corporate public recruitment, continuous data monitoring and repeated experiments are needed to achieve more accurate matching. The results of this study can be used to suggest directions for improving SQF in the SW industry, and, building on success in the SW industry, the approach is expected to transfer to other industries.
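
To make the association-rule step concrete, the sketch below derives rules between job categories of two sites from postings that appear on both. It is a simplified, pair-only illustration of the Apriori idea (prune infrequent single items before counting pairs), not the study's pipeline. The category labels, the postings, and the support and confidence thresholds are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support, min_confidence):
    """Return (lhs, rhs, support, confidence) for rules between single items."""
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in set(t))
    # Apriori pruning: only frequent single items can form frequent pairs.
    frequent = {i for i, c in item_counts.items() if c / n >= min_support}
    pair_counts = Counter(
        pair
        for t in transactions
        for pair in combinations(sorted(set(t) & frequent), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Hypothetical postings that appear on two sites under each site's category.
postings = [
    {"WORKNET:web developer", "JOBKOREA:web programming"},
    {"WORKNET:web developer", "JOBKOREA:web programming"},
    {"WORKNET:web developer", "JOBKOREA:web design"},
    {"WORKNET:DB administrator", "JOBKOREA:DBA"},
]

rules = pair_rules(postings, min_support=0.5, min_confidence=0.6)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs} (support={support:.2f}, confidence={confidence:.2f})")
```

A high-confidence rule between categories of two sites suggests that the two categories describe the same job, which is what makes such rules usable for mapping one classification system onto another.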

The Precise Positioning with the 3D Coordinate Transformation of GPS Surveying (GPS 측량의 3차원 좌표변환에 의한 정밀위치결정)

  • Park, Woon-Yong;Yeu, Bock-Mo;Lee, Kee-Boo
    • Journal of Korean Society for Geospatial Information Science / v.8 no.2 s.16 / pp.47-60 / 2000
  • In this study, among land cover classification methods using satellite imagery, we compared the classification accuracy of a neural network classifier, a non-parametric method, with that of a maximum likelihood classifier, a parametric method. In assessing classification accuracy, we analyzed the testing area as well as the training area, which many analysts generally use when assessing classification accuracy. As a result, the neural network classifier was superior to the maximum likelihood classifier by as much as 3% on the training data. When ground reference data were used, both classification methods produced poor results, but we concluded that the neural network classifier was superior to the maximum likelihood classifier by as much as 10%.

Study on Classification Function into Sasang Constitution Using Data Mining Techniques (데이터마이닝 기법을 이용한 사상체질 판별함수에 관한 연구)

  • Kim Kyu Kon;Kim Jong Won;Lee Eui Ju;Kim Jong Yeol;Choi Sun-Mi
    • Journal of Physiology & Pathology in Korean Medicine / v.18 no.6 / pp.1938-1944 / 2004
  • In this study, data mining techniques are applied to find a classification function that improves the accuracy of constitution diagnosis using QSCC Ⅱ (Questionnaire of Sasang Constitution Classification). The data analyzed are the questionnaires of 1051 patients who had been treated at Dong Eui Oriental Medical Hospital and Kyung Hee Oriental Medical Hospital. The criteria for data cleansing are the response pattern on opposing questionnaire items and the proportion of positive responses to specific items in each constitution. The criteria for variable selection are the test of homogeneity in frequency analysis and the coefficients of the linear discriminant function. A discriminant analysis model and a decision tree model are applied to find the classification function for Sasang constitution. Accuracy on the learning sample is similar for the two models, but the discriminant analysis model achieves higher accuracy on the test sample.

A Classification Analysis using Bayesian Neural Network (베이지안 신경망을 이용한 분류분석)

  • Hwang, Jin-Soo;Choi, Seong-Yong;Jun, Hong-Suk
    • Journal of the Korean Data and Information Science Society / v.12 no.2 / pp.11-25 / 2001
  • There are several classification algorithms for modeling the relations, patterns, and rules that exist in data. We learn to classify objects on the basis of instances presented to us, not by being given a set of classification rules. Bayesian learning uses probability distributions to express our knowledge about unknown parameters and updates that knowledge by the laws of probability as evidence is gathered from data. Neural network models, in turn, are designed to predict an unknown category or quantity on the basis of known attributes through training. In this paper, we compare the misclassification error rates of the Bayesian neural network method with those of other classification algorithms, CHAID, CART, and QUEST, using several data sets.
