• Title/Summary/Keyword: automatic classification


Development of Automatic Rule Extraction Method in Data Mining : An Approach based on Hierarchical Clustering Algorithm and Rough Set Theory (데이터마이닝의 자동 데이터 규칙 추출 방법론 개발 : 계층적 클러스터링 알고리듬과 러프 셋 이론을 중심으로)

  • Oh, Seung-Joon;Park, Chan-Woong
    • Journal of the Korea Society of Computer and Information
    • /
    • v.14 no.6
    • /
    • pp.135-142
    • /
    • 2009
  • Data mining is an emerging area of computational intelligence that offers new theories, techniques, and tools for the analysis of large data sets. The major techniques used in data mining are association rule mining, classification, and clustering. Since these techniques are typically applied individually, a methodology is needed that integrates them into a single rule extraction process. Rule extraction techniques assist humans in analyzing large data sets and in turning the meaningful information they contain into successful decision making. This paper proposes an autonomous rule extraction method that combines clustering with rough set theory. Experiments are carried out on data sets from the UCI KDD archive, and the decision rules produced by the proposed method are presented. These rules can be used successfully for decision making.
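
The integration the abstract describes, clustering first and rule extraction second, can be illustrated in a few lines. The sketch below is ours, not the authors': it uses scikit-learn's agglomerative (hierarchical) clustering and, as a crude stand-in for the paper's rough-set step, reads each cluster's attribute ranges off as IF-THEN rules.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.cluster import AgglomerativeClustering

# Stage 1: hierarchical clustering groups similar records.
X = load_iris().data
labels = AgglomerativeClustering(n_clusters=3).fit_predict(X)

# Stage 2 (stand-in for the rough-set step): describe each cluster by the
# min/max range of every attribute, yielding crude IF-THEN decision rules.
for c in np.unique(labels):
    members = X[labels == c]
    conditions = [
        f"{lo:.1f} <= attr{i} <= {hi:.1f}"
        for i, (lo, hi) in enumerate(zip(members.min(axis=0), members.max(axis=0)))
    ]
    print(f"IF {' AND '.join(conditions)} THEN cluster {c}")
```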

Classification of Torso Shapes of Men Aged 40-64 - Based on Measurements Extracted from the 8th Size Korea Scans - (40-64세 남성의 토르소 형태 분류에 관한 연구 - 제8차 Size Korea 인체형상으로부터 추출한 측정값을 이용하여 -)

  • Guo Tingyu;Eun Joo Ryu;Hwa Kyung Song
    • Fashion & Textile Research Journal
    • /
    • v.25 no.1
    • /
    • pp.92-103
    • /
    • 2023
  • As the body shape changes that occur after middle age are a major factor affecting the fit of ready-to-wear clothes, this study was designed to classify and analyze the torso shapes of middle-aged men. The study selected 3D body scans of 200 men aged 40-64 from the 8th Size Korea (2021) database and extracted 47 measurement values using a Grasshopper algorithm for automatic extraction of landmarks and measurements developed in previous research (Ryu & Song, 2022). Eight principal components (torso length, shoulder size, overall body size, abdomen prominence, back protrusion, neck inclination, upper body slope, and hip prominence) were identified, and four torso shapes were classified. Shape 1 (28.5%) exhibited the shortest torso length, the narrowest shoulders, and the most protruding back. Shape 2 (21.0%) exhibited the skinniest body and the largest backward inclination of the upper body; hence, the back appeared protruding and the abdomen looked prominent. Shape 3 (25.5%) had the largest overall body size; thus, the abdomen looked the least protruding, and it exhibited the flattest back. Shape 4 (25.0%) had the longest torso, the widest shoulders, the straightest neck, and the least protruding hips. The study also derived three discriminant functions to identify a new person's torso type.
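
As an illustration of the pipeline the abstract describes (47 measurements reduced to 8 principal components, 4 shape clusters, 3 discriminant functions), here is a minimal sketch on synthetic data; KMeans stands in for the study's unspecified grouping method, and all data is fabricated.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 47))   # 200 subjects x 47 body measurements (synthetic)

# Reduce the 47 measurements to 8 principal components, as in the study.
pca = PCA(n_components=8).fit(X)
scores = pca.transform(X)

# Group subjects into 4 torso-shape types (KMeans is a stand-in here).
shapes = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(scores)

# With 4 groups, LDA yields at most 3 discriminant functions, which can
# then assign a new person's measurements to a torso type.
lda = LinearDiscriminantAnalysis(n_components=3).fit(scores, shapes)
new_person = rng.normal(size=(1, 47))
print(lda.predict(pca.transform(new_person)))
```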

An Attention-based Temporal Network for Parkinson's Disease Severity Rating using Gait Signals

  • Huimin Wu;Yongcan Liu;Haozhe Yang;Zhongxiang Xie;Xianchao Chen;Mingzhi Wen;Aite Zhao
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.17 no.10
    • /
    • pp.2627-2642
    • /
    • 2023
  • Parkinson's disease (PD) is a typical, chronic neurodegenerative disease involving a decline in dopamine concentration, which can disrupt motor activity and cause different degrees of gait disturbance depending on PD severity. As current clinical PD diagnosis is a complex, time-consuming, and challenging task that relies on physicians' subjective evaluation of visual observations, gait disturbance has been extensively explored to automatically detect PD, rate its severity, and provide auxiliary information for physicians' decisions, using gait data from various acquisition devices. Among these, wearable sensors have the advantage of flexibility, since they do not limit the wearers' sphere of activity in this application scenario. In this paper, an attention-based temporal network (ATN) is designed for the time-series structure of gait data (vertical ground reaction force signals) from foot sensor systems, to learn the discriminative differences related to PD severity levels hidden in the sequential data. The structure of the proposed method is inspired by the Transformer network, given its success in extracting temporal information, and contains three modules: a preprocessing module that maps intra-moment features, a feature extractor that computes complicated gait characteristics of the whole signal sequence in the temporal dimension, and a classifier for the final decision about PD severity. Experiments conducted on the public dataset PDgait of VGRF signals verify the proposed model's validity and show promising classification performance compared with several existing methods.
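
A minimal PyTorch sketch of the three-module design described above (per-timestep embedding, Transformer-style temporal encoder, classification head) might look as follows; the layer sizes and sensor count are illustrative assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn

class ATNSketch(nn.Module):
    """Minimal attention-based temporal classifier for gait sequences.
    Sizes are illustrative, not the paper's configuration."""
    def __init__(self, n_sensors=16, d_model=64, n_classes=4):
        super().__init__()
        # Preprocessing module: map per-timestep sensor readings to d_model.
        self.embed = nn.Linear(n_sensors, d_model)
        # Feature extractor: Transformer encoder layers over the time axis.
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # Classifier: pool over time, then predict a severity level.
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x):                 # x: (batch, time, n_sensors)
        h = self.encoder(self.embed(x))   # (batch, time, d_model)
        return self.head(h.mean(dim=1))   # average-pool the time axis

vgrf = torch.randn(8, 100, 16)            # batch of 8 synthetic VGRF windows
print(ATNSketch()(vgrf).shape)            # torch.Size([8, 4])
```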

A Deep Learning Approach for Covid-19 Detection in Chest X-Rays

  • Sk. Shalauddin Kabir;Syed Galib;Hazrat Ali;Fee Faysal Ahmed;Mohammad Farhad Bulbul
    • International Journal of Computer Science & Network Security
    • /
    • v.24 no.3
    • /
    • pp.125-134
    • /
    • 2024
  • The novel coronavirus disease 2019, known as COVID-19, has spread swiftly worldwide. Early diagnosis is important to control its rapid spread. Medical imaging techniques, chest computed tomography and chest X-ray, are playing a vital role in the identification and testing of COVID-19 in the present epidemic. Chest X-ray is a cost-effective method for COVID-19 detection; however, the manual process of X-ray analysis is time-consuming, given that the number of infected individuals keeps growing rapidly. For this reason, it is very important to develop an automated COVID-19 detection process to help control this pandemic. In this study, we address the task of automatic detection of COVID-19 using a popular deep learning model, namely VGG19. We used 1300 healthy and 1300 confirmed COVID-19 chest X-ray images in this experiment. We performed three experiments by freezing different blocks and layers of VGG19 and finally used a machine learning classifier, an SVM, for detecting COVID-19. In every experiment we used five-fold cross-validation to train and validate the model, finally achieving 98.1% overall classification accuracy. Experimental results show that the proposed method using the deep learning-based VGG19 model can be used as a tool to aid radiologists and play a crucial role in the timely diagnosis of COVID-19.
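
The pipeline described, frozen VGG19 features fed to an SVM under five-fold cross-validation, can be sketched with Keras and scikit-learn; the data below is a random placeholder, and the pooling, kernel, and frozen-block choices are assumptions rather than the paper's exact settings.

```python
import numpy as np
from tensorflow.keras.applications import VGG19
from tensorflow.keras.applications.vgg19 import preprocess_input
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Frozen VGG19 as a fixed feature extractor (ImageNet weights, no top).
backbone = VGG19(weights="imagenet", include_top=False, pooling="avg")
backbone.trainable = False

X = preprocess_input(np.random.rand(40, 224, 224, 3) * 255)  # placeholder X-rays
y = np.array([0, 1] * 20)                                    # healthy vs. COVID-19

features = backbone.predict(X, verbose=0)        # (40, 512) deep features
# Five-fold cross-validated SVM on the extracted features.
print(cross_val_score(SVC(kernel="linear"), features, y, cv=5).mean())
```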

Development of a Prototype System for Aquaculture Facility Auto Detection Using KOMPSAT-3 Satellite Imagery (KOMPSAT-3 위성영상 기반 양식시설물 자동 검출 프로토타입 시스템 개발)

  • KIM, Do-Ryeong;KIM, Hyeong-Hun;KIM, Woo-Hyeon;RYU, Dong-Ha;GANG, Su-Myung;CHOUNG, Yun-Jae
    • Journal of the Korean Association of Geographic Information Studies
    • /
    • v.19 no.4
    • /
    • pp.63-75
    • /
    • 2016
  • Korea, surrounded by ocean on three sides, has historically relied on aquaculture for marine products. Surveys of production have recently been conducted to manage aquaculture facilities systematically; based on the survey results, pricing controls on marine products have been implemented to stabilize local fishery resources and to ensure a minimum income for fishermen. Such surveys of aquaculture facilities depend on manual digitization of aerial photographs each year. Surveys that incorporate manual digitization of high-resolution aerial photographs can evaluate aquaculture accurately, drawing on experts who know each facility's characteristics and deployment. However, using aerial photographs has monetary and time limitations for monitoring aquaculture resources with different life cycles, and it also requires a number of experts. Therefore, in this study, we developed an automatic prototype system for detecting boundary information and monitoring aquaculture facilities based on satellite images. KOMPSAT-3 (13 scenes), a domestic high-resolution satellite, provided the imagery, collected between October and April, a period in which many aquaculture facilities are operating. An ANN classification method was used to automatically detect facility types such as cage, longline, and buoy. Furthermore, shapefiles were generated using a digitizing image-processing method that incorporates polygon generation techniques. Our newly developed prototype method detected aquaculture facilities at a rate of 93%. The suggested method not only overcomes the limits of the existing monitoring method based on aerial photographs but also assists experts in detecting aquaculture facilities. Aquaculture facility detection systems should be developed further through the application of image processing techniques and the classification of aquaculture facilities; such systems will assist related decision-making through aquaculture facility monitoring.
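
As a rough illustration of the per-pixel ANN classification the abstract names, the sketch below trains scikit-learn's MLPClassifier on synthetic four-band pixels (KOMPSAT-3 carries four multispectral bands) labeled with the facility types from the abstract; all data and hyperparameters are fabricated for illustration.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)
# Synthetic training pixels: 4 spectral bands, labeled as background,
# cage, longline, or buoy (classes 0-3).
X_train = rng.random((500, 4))
y_train = rng.integers(0, 4, size=500)

ann = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=1)
ann.fit(X_train, y_train)

# Classify every pixel of a (rows, cols, bands) scene; the resulting class
# map could then be polygonized into shapefiles as the prototype does.
scene = rng.random((100, 100, 4))
class_map = ann.predict(scene.reshape(-1, 4)).reshape(100, 100)
print(np.bincount(class_map.ravel()))
```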

An Implementation of OTB Extension to Produce TOA and TOC Reflectance of LANDSAT-8 OLI Images and Its Product Verification Using RadCalNet RVUS Data (Landsat-8 OLI 영상정보의 대기 및 지표반사도 산출을 위한 OTB Extension 구현과 RadCalNet RVUS 자료를 이용한 성과검증)

  • Kim, Kwangseob;Lee, Kiwon
    • Korean Journal of Remote Sensing
    • /
    • v.37 no.3
    • /
    • pp.449-461
    • /
    • 2021
  • Analysis Ready Data (ARD) for optical satellite images is a pre-processed product generated by applying the spectral characteristics and viewing parameters of each sensor. Atmospheric correction is one of the fundamental and complicated topics involved; it produces Top-of-Atmosphere (TOA) and Top-of-Canopy (TOC) reflectance from multi-spectral image sets. Most remote sensing software provides algorithms or processing schemes dedicated to these corrections for the Landsat-8 OLI sensor. Furthermore, Google Earth Engine (GEE) provides direct access to Landsat reflectance products, USGS-based ARD (USGS-ARD), in the cloud environment. We implemented an atmospheric correction extension for Orfeo ToolBox (OTB), an open-source remote sensing package for manipulating and analyzing high-resolution satellite images. This is the first such tool, as OTB has not provided calibration modules for any Landsat sensor. Using this extension, we conducted absolute atmospheric correction on Landsat-8 OLI images of Railroad Valley, United States (RVUS) and validated the reflectance products against the RVUS reflectance data sets in the RadCalNet portal. The results showed that the reflectance products from the OTB extension for Landsat differed by less than 5% from the RadCalNet RVUS data. In addition, we performed a comparative analysis with reflectance products obtained from other open-source tools, namely the QGIS semi-automatic classification plugin and SAGA, besides the USGS-ARD products. The reflectance products from the OTB extension showed high consistency with those of USGS-ARD, within an acceptable level of the RadCalNet RVUS measurement range, compared with those of the other two open-source tools. In this study, the atmospheric calibration processor in the OTB extension was verified, demonstrating its applicability to other satellite sensors such as the Compact Advanced Satellite (CAS)-500 or new optical satellites.
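
The abstract does not show the extension's internals, but the TOA reflectance step for Landsat-8 OLI follows the published USGS formula: scale the quantized digital numbers with the MTL metadata gain and offset, then correct for sun elevation. A minimal NumPy sketch with typical metadata values:

```python
import numpy as np

def landsat8_toa_reflectance(dn, mult, add, sun_elev_deg):
    """USGS Landsat-8 OLI TOA reflectance:
    rho' = M_rho * Q_cal + A_rho, then divide by sin(sun elevation)."""
    rho = mult * dn.astype(np.float64) + add
    return rho / np.sin(np.radians(sun_elev_deg))

# Typical MTL metadata values: REFLECTANCE_MULT_BAND_x = 2.0e-5 and
# REFLECTANCE_ADD_BAND_x = -0.1; the sun elevation is scene-specific.
dn = np.random.randint(0, 65535, size=(4, 4), dtype=np.uint16)  # fake DNs
print(landsat8_toa_reflectance(dn, 2.0e-5, -0.1, 45.0))
```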

A New Approach to Automatic Keyword Generation Using Inverse Vector Space Model (키워드 자동 생성에 대한 새로운 접근법: 역 벡터공간모델을 이용한 키워드 할당 방법)

  • Cho, Won-Chin;Rho, Sang-Kyu;Yun, Ji-Young Agnes;Park, Jin-Soo
    • Asia pacific journal of information systems
    • /
    • v.21 no.1
    • /
    • pp.103-122
    • /
    • 2011
  • Recently, numerous documents have been made available electronically. Internet search engines and digital libraries commonly return query results containing hundreds or even thousands of documents. In this situation, it is virtually impossible for users to examine complete documents to determine whether they might be useful. For this reason, some online documents are accompanied by a list of keywords specified by the authors in an effort to guide users by facilitating the filtering process. In this way, a set of keywords is often considered a condensed version of the whole document and therefore plays an important role in document retrieval, Web page retrieval, document clustering, summarization, text mining, and so on. Since many academic journals ask authors to provide a list of five or six keywords on the first page of an article, keywords are most familiar in the context of journal articles. However, many other types of documents could also benefit from keywords, including Web pages, email messages, news reports, magazine articles, and business papers. Although the potential benefit is large, the implementation itself is the obstacle: manually assigning keywords to all documents is a daunting, even impractical task, extremely tedious and time-consuming and requiring a certain level of domain knowledge. Therefore, it is highly desirable to automate the keyword generation process. There are mainly two approaches to achieving this aim: the keyword assignment approach and the keyword extraction approach. Both use machine learning methods and require, for training purposes, a set of documents with keywords already attached. In the former, there is a given vocabulary, and the aim is to match its terms to the texts; in other words, the keyword assignment approach seeks to select the words from a controlled vocabulary that best describe a document. Although this approach is domain dependent and is not easy to transfer and expand, it can generate implicit keywords that do not appear in a document. In the latter approach, the aim is to extract keywords with respect to their relevance in the text, without a prior vocabulary. Here, automatic keyword generation is treated as a classification task, and keywords are commonly extracted based on supervised learning techniques: keyword extraction algorithms classify candidate keywords in a document as positive or negative examples. Several systems, such as Extractor and Kea, were developed using the keyword extraction approach. The most indicative words in a document are selected as its keywords, so keyword extraction is limited to terms that appear in the document and cannot generate implicit keywords. According to Turney's experimental results, about 64% to 90% of author-assigned keywords can be found in the full text of an article; inversely, this means that 10% to 36% of author-assigned keywords do not appear in the article and cannot be generated by keyword extraction algorithms. Our preliminary experiment likewise shows that 37% of author-assigned keywords are not included in the full text. This is why we adopted the keyword assignment approach. In this paper, we propose a new approach for automatic keyword assignment, namely IVSM (Inverse Vector Space Model). The model is based on the vector space model, a conventional information retrieval model that represents documents and queries as vectors in a multidimensional space. IVSM generates an appropriate keyword set for a specific document by measuring the distance between the document and the keyword sets. The keyword assignment process of IVSM is as follows: (1) calculate the vector length of each keyword set based on each keyword's weight; (2) preprocess and parse a target document that does not have keywords; (3) calculate the vector length of the target document based on term frequency; (4) measure the cosine similarity between each keyword set and the target document; and (5) generate the keywords with high similarity scores. Two keyword generation systems were implemented using IVSM: an IVSM system for a Web-based community service and a stand-alone IVSM system. The first was implemented in a community service for sharing knowledge and opinions on current trends such as fashion, movies, social problems, and health information. The stand-alone IVSM system is dedicated to generating keywords for academic papers and has been tested on a number of academic papers, including those published by the Korean Association of Shipping and Logistics, the Korea Research Academy of Distribution Information, the Korea Logistics Society, the Korea Logistics Research Association, and the Korea Port Economic Association. We measured the performance of IVSM by the number of matches between the IVSM-generated keywords and the author-assigned keywords. In our experiments, the precision of IVSM applied to the Web-based community service and to academic journals was 0.75 and 0.71, respectively. The performance of both systems is much better than that of baseline systems that generate keywords based on simple probability, and IVSM shows performance comparable to Extractor, a representative keyword extraction system developed by Turney. As electronic documents increase, we expect that the IVSM proposed in this paper can be applied to many electronic documents in Web-based communities and digital libraries.
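
The five-step assignment process above maps almost directly onto vectorized code. Below is a minimal sketch with toy keyword-set profiles and a toy target document; the profiles, TF-IDF weighting, and all names are our illustrative assumptions, not the authors' implementation.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy "keyword sets": each known keyword is represented by a bag of words
# from the documents it was previously assigned to.
keyword_profiles = {
    "data mining":    "rules clustering patterns large data sets",
    "remote sensing": "satellite imagery reflectance sensors scenes",
    "classification": "classifier labels supervised accuracy training",
}
target_doc = "we train a supervised classifier and report its accuracy"

vec = TfidfVectorizer()
matrix = vec.fit_transform(list(keyword_profiles.values()) + [target_doc])
keyword_vecs, doc_vec = matrix[:-1], matrix[-1]

# Steps (4)-(5): cosine similarity between each keyword set and the target
# document; emit the highest-scoring keywords.
scores = cosine_similarity(keyword_vecs, doc_vec).ravel()
for kw, s in sorted(zip(keyword_profiles, scores), key=lambda t: -t[1]):
    print(f"{kw}: {s:.2f}")
```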

Development of BIM Templates for Vest-Pocket Park Landscape Design (소공원의 조경설계를 위한 BIM 템플릿 개발)

  • Seo, Young-hoon;Kim, Dong-pil;Moon, Ho-Gyeong
    • Journal of the Korean Institute of Landscape Architecture
    • /
    • v.44 no.1
    • /
    • pp.40-50
    • /
    • 2016
  • BIM, which is being actively applied in the building and civil construction industries, is a technology that can maximize efficiency across every stage from initial planning and design through construction and maintenance to demolition; in the field of domestic landscaping, however, it is still in the introductory phase. In order to introduce and promote BIM in landscape design, this study developed a prototype library and template and analyzed the performance of a trial application. For the development of the prototype, annotations and types were analyzed from floor plans of existing small parks, and the components of a landscape template were deduced. Based on this, play facilities, pergolas, and benches were made into families and templates, making automatic design possible. In addition, annotations and tags often used in landscape design were created, and a 3D view was materialized through visibility/graphic reassignment. As for tables and quantities, the boundary stone table, mounding table, summary sheet of quantities, table of contents, and summary sheet of paving quantities were grouped and connected with the floor plans; regarding landscaping trees, classification criteria and tree names suitable for domestic conditions were applied. The landscape template was produced in the library file format (rfa) so that it can be loaded into BIM programs. As for the problems that arose in the trial application of the template, some CAD files could not be imported, and when compiling tables, the basis of calculation could not be generated automatically; in these respects, the functions of the BIM program and the template need improvement.

Automatic Clustering of Same-Name Authors Using Full-text of Articles (논문 원문을 이용한 동명 저자 자동 군집화)

  • Kang, In-Su;Jung, Han-Min;Lee, Seung-Woo;Kim, Pyung;Goo, Hee-Kwan;Lee, Mi-Kyung;Goo, Nam-Ang;Sung, Won-Kyung
    • Proceedings of the Korea Contents Association Conference
    • /
    • 2006.11a
    • /
    • pp.652-656
    • /
    • 2006
  • Bibliographic information retrieval systems require bibliographic data such as authors, organizations, and sources of publication to be uniquely identified using keys. In particular, when authors are represented simply by their names, users bear the burden of manually discriminating between different authors with the same name. Previous approaches to the same-name author problem rely on bibliographic data such as co-author information, article titles, etc. However, these methods cannot handle single-author articles, or cases in which article titles share no common terms. To complement the previous methods, this study introduces a classification-based approach using similarity between the full texts of articles. Experiments on recent domestic proceedings showed that the proposed method has the potential to supplement previous metadata-based approaches.
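
A minimal sketch of the idea, representing each article's full text as a TF-IDF vector and grouping similar texts so each group is treated as one real author, might look as follows; the toy texts and cluster count are our assumptions, not the paper's data or settings.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import AgglomerativeClustering

# Toy full texts of four articles by two different authors sharing a name.
docs = [
    "gait signals wearable sensors severity rating",         # author A
    "parkinson disease gait analysis deep learning",         # author A
    "satellite imagery reflectance atmospheric correction",  # author B
    "landsat sensors top of atmosphere reflectance",         # author B
]

# Vectorize the full texts, then cluster articles whose texts are similar,
# treating each cluster as one real-world author.
X = TfidfVectorizer().fit_transform(docs).toarray()
clusters = AgglomerativeClustering(n_clusters=2, metric="cosine",
                                   linkage="average").fit_predict(X)
print(clusters)   # e.g. [0 0 1 1]: two distinct authors behind one name
```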


Analysis on the Sedimentary Environment Change Induced by Typhoon in the Sacheoncheon, Gangneung using Multi-temporal Remote Sensing Data (태풍 루사에 의한 강릉 사천천 주변 퇴적 환경 변화: 다중 시기 원격탐사 자료를 이용한 정보 분석)

  • Park, No-Wook;Jang, Dong-Ho;Chi, Kwang-Hoon
    • Journal of the Korean earth science society
    • /
    • v.27 no.1
    • /
    • pp.83-94
    • /
    • 2006
  • The objective of this paper is to extract and analyze sedimentary environment change information for the Sacheoncheon, Gangneung, Korea, which was seriously damaged in the aftermath of Typhoon Rusa in early September 2002, using multi-temporal remote sensing data. For the extraction of change information, an unsupervised approach based on the automatic determination of threshold values was applied. The change detection results showed turbidity changes right after Typhoon Rusa, a decrease in wetlands, an increase in dry sand and channel width, and changes in the relative water level of the stream due to seasonal variation. Sedimentation in the cultivated areas and restoration works also affected the changes near the Sacheoncheon. In addition to the change detection analysis, several environmental thematic maps were generated from remote sensing and field survey data, including a microtopographic map, the distribution of the estimated amount of flood deposits, and a flood hazard landform classification map. In conclusion, multi-temporal remote sensing data can be used effectively for natural hazard analysis and damage information extraction, and specific data processing techniques for high-resolution remote sensing data should also be developed.
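
The abstract's unsupervised change detection rests on automatically chosen thresholds over a multi-temporal difference image; the paper's exact criterion is not stated, so Otsu's method, one standard automatic threshold choice, stands in below, applied to synthetic images.

```python
import numpy as np

def otsu_threshold(values, bins=256):
    """Automatically pick the threshold maximizing between-class variance."""
    hist, edges = np.histogram(values, bins=bins)
    p = hist / hist.sum()
    w = np.cumsum(p)                    # cumulative class-0 weight
    mu = np.cumsum(p * edges[:-1])      # cumulative class-0 mean
    mu_t = mu[-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        var_between = (mu_t * w - mu) ** 2 / (w * (1 - w))
    return edges[np.nanargmax(var_between)]

# Difference of two co-registered images; pixels above the automatically
# determined threshold are flagged as "changed".
before = np.random.rand(100, 100)
after = before.copy(); after[20:40, 30:60] += 0.8   # injected change
diff = np.abs(after - before)
change_mask = diff > otsu_threshold(diff.ravel())
print(change_mask.sum(), "changed pixels")
```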