• Title/Summary/Keyword: Machine classification

Automatic Extraction of Hangul Stroke Element Using Faster R-CNN for Font Similarity (글꼴 유사도 판단을 위한 Faster R-CNN 기반 한글 글꼴 획 요소 자동 추출)

  • Jeon, Ja-Yeon;Park, Dong-Yeon;Lim, Seo-Young;Ji, Yeong-Seo;Lim, Soon-Bum
    • Journal of Korea Multimedia Society / v.23 no.8 / pp.953-964 / 2020
  • Ever since media content took over the world, the importance of typography has increased, and the influence of fonts has been recognized. Nevertheless, the current Hangul font system is very poor and is provided passively, so it is practically impossible to understand and utilize all the shape characteristics of more than six thousand Hangul fonts. In this paper, the characteristics of Hangul font shapes were selected based on the Hangul structure of similar fonts. Stroke element detection was trained by fine-tuning Faster R-CNN Inception v2, one of the deep learning object detection models. We also propose a system that automatically extracts stroke element characteristics from characters by introducing an automatic extraction algorithm. In comparison to previous research, which showed poor accuracy using an SVM (Support Vector Machine) and a sliding window algorithm, the proposed system has shown the result of 10 % accuracy in properly detecting and extracting stroke elements from various fonts. In conclusion, if the stroke element characteristics based on the Hangul structural information extracted through the system are used for similarity classification, problems such as copyright disputes can be resolved in an era when typography is becoming more competitive, and an automated process can be provided to users for greater convenience.

Two-Phase Shallow Semantic Parsing based on Partial Syntactic Parsing (부분 구문 분석 결과에 기반한 두 단계 부분 의미 분석 시스템)

  • Park, Kyung-Mi;Mun, Young-Song
    • The KIPS Transactions:PartB / v.17B no.1 / pp.85-92 / 2010
  • A shallow semantic parsing system analyzes the relationship that a syntactic constituent of the sentence has with a predicate. It identifies semantic arguments representing agent, patient, instrument, etc. of the predicate. In this study, we propose a two-phase shallow semantic parsing model which consists of the identification phase and the classification phase. We first find the boundary of semantic arguments from partial syntactic parsing results, and then assign appropriate semantic roles to the identified semantic arguments. By taking the sequential two-phase approach, we can alleviate the unbalanced class distribution problem, and select the features appropriate for each task. Experiments show the relative contribution of each phase on the test data.

A Design of Fuzzy Classifier with Hierarchical Structure (계층적 구조를 가진 퍼지 패턴 분류기 설계)

  • Ahn, Tae-Chon;Roh, Seok-Beom;Kim, Yong Soo
    • Journal of the Korean Institute of Intelligent Systems / v.24 no.4 / pp.355-359 / 2014
  • In this paper, we propose a new fuzzy pattern classifier that hierarchically combines several fuzzy models with simple consequent parts. Because the basic component of the proposed hierarchical classifier is a fuzzy model with a simple consequent part, the complexity of the classifier stays low. To analyze and divide the input space, we use the Fuzzy C-Means clustering algorithm. In addition, we use the Conditional Fuzzy C-Means clustering algorithm to analyze each subspace produced by Fuzzy C-Means clustering. In each clustered region, we apply a fuzzy model with a simple consequent part and thereby build the fuzzy pattern classifier with hierarchical structure. Because of this hierarchical structure, the data distribution of the input space can be analyzed from both a macroscopic and a microscopic point of view. Finally, machine learning data sets are used to evaluate the classification ability of the proposed pattern classifier.
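
The input-space partitioning step mentioned in this abstract can be illustrated with a minimal sketch of plain Fuzzy C-Means. This is not the authors' implementation: the function names, the fuzzifier `m = 2.0`, the iteration count, and the random initialization are all assumptions made for illustration, and the conditional variant used for the subspaces is not shown.

```python
import math
import random


def memberships(points, centers, m):
    """Return one row per point; row[i] is its membership in cluster i."""
    exponent = 2.0 / (m - 1.0)
    rows = []
    for p in points:
        # Clamp distances so a point sitting on a center does not divide by zero.
        dists = [max(math.dist(p, c), 1e-12) for c in centers]
        rows.append([
            1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
            for d_i in dists
        ])
    return rows


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=50, seed=0):
    """Plain Fuzzy C-Means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, n_clusters)  # assumed init: random data points
    dim = len(points[0])
    for _ in range(n_iter):
        u = memberships(points, centers, m)
        # Each center becomes the mean of all points weighted by membership**m.
        centers = []
        for i in range(n_clusters):
            weights = [row[i] ** m for row in u]
            total = sum(weights)
            centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ))
    return centers, memberships(points, centers, m)
```

Each membership row sums to 1, so a point near a cluster boundary contributes partially to several regions instead of being forced into one.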

Decoding Brain Patterns for Colored and Grayscale Images using Multivariate Pattern Analysis

  • Zafar, Raheel;Malik, Muhammad Noman;Hayat, Huma;Malik, Aamir Saeed
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.4 / pp.1543-1561 / 2020
  • Creating a taxonomy of human brain activity is a complicated and rather challenging procedure. Its multifaceted aspects, including experiment design, stimuli selection, and presentation of images, in addition to feature extraction and selection techniques, add to its challenging nature. Although researchers have applied various methods to create a taxonomy of human brain activity, the use of multivariate pattern analysis (MVPA) for image recognition to catalog human brain activity is scarce. Moreover, experiment design is a complex procedure, and the selection of image type, color, and order is challenging too. This research therefore bridges the gap by using MVPA to create a taxonomy of human brain activity for different categories of images, both colored and grayscale. An experiment was conducted using EEG, with feature extraction, selection, and classification approaches, to collect data from 25 graduates of University Technology PETRONAS (UTP) who met prequalification criteria. The participants were shown both colored and grayscale images, and accuracy and reaction time were recorded. The results showed that colored images produce better results in terms of accuracy and response time using the wavelet transform, t-test, and support vector machine. The results indicate that MVPA is a better approach for the analysis of EEG data, as more useful information can be extracted from the brain using colored images. This research discusses in detail the behavior of the human brain in response to colored and grayscale images for a specific and unique task, and contributes to further improving the decoding of the human brain with increased accuracy. In addition, such experiment settings can be implemented in, and contribute to, other areas such as medicine, the military, business, lie detection, and many others.

Data-driven Analysis for Future Land-use Change Prediction : Case Study on Seoul (서울 데이터 기반 필지별 용도전환 발생 예측)

  • Yun, Sung Bum;Mun, Sungchul;Park, Soon Yong;Kim, Taehyun
    • Journal of Broadcast Engineering / v.25 no.2 / pp.176-184 / 2020
  • Due to the constant development and decline of areas in Seoul, the Seoul government is pushing various policies to regenerate declined areas. These policies lead to land-use changes in numerous Seoul districts. This study aims to create a prediction model that can foresee future land-use changes and, in doing so, to derive the influential factors that lead to land-use changes. To this end, various open data from national departments and the Seoul government were collected and fed into a random forest algorithm. The results showed promising accuracy and identified multiple influential factors that cause land-use changes in Seoul districts. The results of this study could be applied to policy making in the public sector, or could serve as a basis for studying gentrification problems in the Seoul area.

A Study on the Abnormal Behavior Detection Model through Data Transfer Data Analysis (자료 전송 데이터 분석을 통한 이상 행위 탐지 모델의 관한 연구)

  • Son, In Jae;Kim, Huy Kang
    • Journal of the Korea Institute of Information Security & Cryptology / v.30 no.4 / pp.647-656 / 2020
  • Recently, there has been an increasing number of cases in which important data (personal information, technology, etc.) of national and public institutions are leaked to the outside world. Surveys show that the largest cause of such leakage incidents is "insiders." Insiders with the most authority in an organization can cause more damage than technology leaks caused by external attacks, because insiders have relatively easy access to the organization's major assets. This study aims to present an optimized property selection model for detecting such abnormal behavior using supervised learning algorithms, one family of machine learning techniques. The model is built on actual data such as personnel information, e-mail transmission logs, and transmission logs from the CrossNet data transfer system, which safely transmits data between separate areas (security area and non-security area) of the business network and the Internet network.

Developing an Ensemble Classifier for Bankruptcy Prediction (부도 예측을 위한 앙상블 분류기 개발)

  • Min, Sung-Hwan
    • Journal of Korea Society of Industrial Information Systems / v.17 no.7 / pp.139-148 / 2012
  • An ensemble of classifiers employs a set of individually trained classifiers and combines their predictions. It has been found that in most cases ensembles produce more accurate predictions than the base classifiers. Combining outputs from multiple classifiers, known as ensemble learning, is one of the standard and most important techniques for improving classification accuracy in machine learning. An ensemble of classifiers is effective only if the individual classifiers make decisions that are as diverse as possible. Bagging is the most popular ensemble learning method for generating a diverse set of classifiers. Diversity in bagging is obtained by using different training sets: the training data subsets are randomly drawn with replacement from the entire training dataset. The random subspace method is an ensemble construction technique that uses different attribute subsets. In the random subspace method, the training dataset is also modified as in bagging, but the modification is performed in the feature space. Bagging and random subspace are well-known and popular ensemble algorithms. However, few studies have dealt with the integration of bagging and random subspace using SVM classifiers, though there is great potential for useful applications in this area. The focus of this paper is to propose methods for improving SVM performance using a hybrid ensemble strategy for bankruptcy prediction. This paper applies the proposed ensemble model to the bankruptcy prediction problem using a real data set from Korean companies.
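
The two sources of diversity described in this abstract can be sketched together as follows. The abstract does not say how the paper integrates them, so this sketch assumes the simplest combination: each ensemble member gets a bootstrap sample of rows (bagging) and a random subset of columns (random subspace). All function names are hypothetical, and the SVM base learner is left out.

```python
import random
from collections import Counter


def draw_member_views(n_samples, n_features, n_members, subspace_size, seed=0):
    """Return one (row_indices, feature_indices) pair per ensemble member."""
    rng = random.Random(seed)
    views = []
    for _ in range(n_members):
        # Bagging: draw as many rows as the dataset has, with replacement.
        rows = [rng.randrange(n_samples) for _ in range(n_samples)]
        # Random subspace: draw a feature subset without replacement.
        cols = sorted(rng.sample(range(n_features), subspace_size))
        views.append((rows, cols))
    return views


def project(X, rows, cols):
    """Build the training matrix one member sees from its row/column view."""
    return [[X[r][c] for c in cols] for r in rows]


def majority_vote(predictions):
    """Combine the members' predicted labels for a single instance."""
    return Counter(predictions).most_common(1)[0][0]
```

A base classifier would be trained on `project(X, rows, cols)` for each view, and at prediction time each member would see only its own `cols` before the votes are combined.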

An Analysis of Convergence Phenomenon Using Industrial Convergence Coefficient (산업융합계수를 활용한 융합현상에 관한 연구)

  • Hwang, Sung-Hyun
    • The Journal of the Korea Contents Association / v.17 no.3 / pp.666-674 / 2017
  • Today, the concept of fusion, as in technology convergence and industrial convergence, is emerging as one of the most important trends in our society. The purpose of this study is to analyze the convergence coefficient of each industry using patent data and to analyze the convergence phenomenon in industry based on that coefficient. To do this, Korean patent data from 2011 to 2015 were used. The findings revealed that the industrial convergence coefficient (ICC) was highest, in order, for man-made fibres, paints/varnishes, petroleum products/nuclear fuel, and other chemicals. According to the inter-industry convergence matrix, the number of convergence patents was greatest, in order, for office machinery and computers, special purpose machinery, and measuring instruments. The same analysis was additionally conducted for industries with a high number of patents. As a result, convergence was found to be active in the fields of optical instruments, basic chemicals, fabricated metal products, measuring instruments, and special purpose machine manufacturing.

Feature Extraction to Detect Hoax Articles (낚시성 인터넷 신문기사 검출을 위한 특징 추출)

  • Heo, Seong-Wan;Sohn, Kyung-Ah
    • Journal of KIISE / v.43 no.11 / pp.1210-1215 / 2016
  • Readership of online newspapers has grown with the proliferation of smart devices. However, fierce competition between Internet newspaper companies has resulted in a large increase in the number of hoax articles. Hoax articles are those whose title does not convey the content of the main story, which gives readers the wrong information about the contents. We note that hoax articles have certain characteristics, such as unnecessary celebrity quotations, a mismatch between the title and content, or incomplete sentences. Based on these, we extract and validate features to identify hoax articles. We build a large-scale training dataset by analyzing text keywords in replies to articles and extract five effective features. We evaluate the performance of a support vector machine classifier on the extracted features and observe 92% accuracy on our validation set. In addition, we present a selective bigram model to measure the consistency between the title and content, which can be effectively used to analyze short texts in general.
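
The idea of measuring title/content consistency with bigrams can be illustrated with a minimal sketch. The abstract does not describe how the selective bigram model chooses bigrams, so this example substitutes a plain overlap score over whitespace-separated word bigrams; the function names and the scoring rule are assumptions, not the paper's method.

```python
def bigrams(text):
    """Set of adjacent lowercase word pairs in the text."""
    tokens = text.lower().split()
    return set(zip(tokens, tokens[1:]))


def title_content_consistency(title, content):
    """Fraction of the title's bigrams that also occur in the content."""
    title_bigrams = bigrams(title)
    if not title_bigrams:
        return 0.0  # a one-word or empty title has no bigrams to compare
    return len(title_bigrams & bigrams(content)) / len(title_bigrams)
```

A score near 0 suggests the title shares little phrasing with the body, which is the kind of mismatch the abstract lists as a hoax characteristic.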

HMM-based Upper-body Gesture Recognition for Virtual Playing Ground Interface (가상 놀이 공간 인터페이스를 위한 HMM 기반 상반신 제스처 인식)

  • Park, Jae-Wan;Oh, Chi-Min;Lee, Chil-Woo
    • The Journal of the Korea Contents Association / v.10 no.8 / pp.11-17 / 2010
  • In this paper, we propose HMM-based upper-body gesture recognition. To recognize gestures in space, the poses that compose a gesture must first be classified. To classify the poses used by the interface, we use two IR cameras, one installed at the front and one at the side, so that each pose is captured as both a front-view and a side-view image. We classify the acquired IR pose images using an SVM with a non-linear RBF kernel function. With the RBF kernel, poses that are not linearly separable can be distinguished, which reduces misclassification. The resulting sequences of classified poses are then recognized as gestures using the HMM's state transition matrix. The recognized gestures can be applied to existing applications by mapping them to OS values.
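
The last stage, turning a sequence of classified poses into a gesture, can be sketched with the standard forward algorithm for a discrete HMM. This is an illustrative assumption about how such a recognizer might score sequences, not the authors' implementation: the two gestures, the three pose labels (0 = arm down, 1 = arm mid, 2 = arm up), and all probabilities below are hypothetical.

```python
def sequence_likelihood(obs, start, trans, emit):
    """Forward algorithm: probability of the pose sequence under one HMM."""
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][o]
            for s in range(n_states)
        ]
    return sum(alpha)


def recognize(obs, models):
    """Return the name of the gesture model that best explains the sequence."""
    return max(models, key=lambda name: sequence_likelihood(obs, *models[name]))


# Hypothetical left-to-right models, each given as (start, transition, emission).
START = [1.0, 0.0, 0.0]
TRANS = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.0, 0.0, 1.0]]
MODELS = {
    "raise": (START, TRANS, [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]),
    "lower": (START, TRANS, [[0.1, 0.1, 0.8], [0.1, 0.8, 0.1], [0.8, 0.1, 0.1]]),
}
```

Here `recognize([0, 1, 2], MODELS)` selects `"raise"`, because that model's states emit the down, mid, up pose labels in order with high probability.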