• Title/Summary/Keyword: Machine classification


Experimental Verification of the Versatility of SPAM-based Image Steganalysis (SPAM 기반 영상 스테그아날리시스의 범용성에 대한 실험적 검증)

  • Kim, Jaeyoung;Park, Hanhoon;Park, Jong-Il
    • Journal of Broadcast Engineering
    • /
    • v.23 no.4
    • /
    • pp.526-535
    • /
    • 2018
  • Many steganography algorithms have been studied, and steganalysis techniques for detecting stego images (images to which steganography has been applied) have been studied in parallel. In image steganalysis in particular, features such as ALE, SPAM, and SRMQ are extracted from the statistical characteristics of an image, and stego images are classified by training a classifier with various machine learning algorithms. However, previous studies did not consider the effect of image size, aspect ratio, or message-embedding rate, so the features might not work properly for images whose conditions differ from those used in those studies. In this paper, we analyze the classification rate of SPAM-based image steganalysis over a variety of image sizes, aspect ratios, and message-embedding rates, and verify its versatility.
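
A minimal sketch of the kind of versatility check described above, assuming pre-extracted SPAM feature vectors and labels grouped by image size and embedding rate (the feature extraction itself and the grouping layout are assumptions for illustration, not the authors' code):

```python
# Hedged sketch: evaluate a SPAM-feature classifier across image conditions.
# Assumes features have already been extracted into arrays; the grouping by
# (image_size, embedding_rate) is hypothetical and not from the paper.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def evaluate_condition(features: np.ndarray, labels: np.ndarray) -> float:
    """Train on one condition and report cover/stego classification accuracy."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        features, labels, test_size=0.3, stratify=labels, random_state=0)
    clf = SVC(kernel="rbf", C=10.0, gamma="scale")
    clf.fit(X_tr, y_tr)
    return accuracy_score(y_te, clf.predict(X_te))

# conditions maps (image_size, embedding_rate) -> (SPAM feature matrix, labels)
def versatility_report(conditions: dict) -> None:
    for (size, rate), (X, y) in conditions.items():
        acc = evaluate_condition(X, y)
        print(f"size={size}, embedding_rate={rate}: accuracy={acc:.3f}")
```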

Breaking character and natural image based CAPTCHA using feature classification (특징 분리를 통한 자연 배경을 지닌 글자 기반 CAPTCHA 공격)

  • Kim, Jaehwan;Kim, Suah;Kim, Hyoung Joong
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.25 no.5
    • /
    • pp.1011-1019
    • /
    • 2015
  • CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a test used in computing to distinguish whether the user is a computer or a human. Most websites use character-based CAPTCHAs consisting of digits and letters. Recently, with the development of OCR technology, simple character-based CAPTCHAs are broken quite easily. As an alternative, many websites add noise to make recognition harder. In this paper, we analyzed a recent CAPTCHA scheme that overlays characters on natural images to obfuscate them. We propose an efficient method that uses a support vector machine to separate the characters from the background image and a convolutional neural network to recognize each character. As a result, 368 out of 1000 CAPTCHAs were correctly identified, demonstrating that the current CAPTCHA is not safe.
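
A rough sketch of the two-stage idea (an SVM separating character pixels from the natural background, then a small CNN recognizing each segmented character); the pixel features and the network shape below are assumptions for illustration, not the authors' exact pipeline:

```python
# Hedged sketch: pixel-level SVM separation followed by a per-character CNN.
import numpy as np
from sklearn.svm import SVC
import torch.nn as nn

def separate_characters(img: np.ndarray, pixel_clf: SVC) -> np.ndarray:
    """Classify each pixel as character/background from simple RGB + position features."""
    h, w, _ = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    feats = np.column_stack([img.reshape(-1, 3), ys.ravel() / h, xs.ravel() / w])
    return pixel_clf.predict(feats).reshape(h, w)   # 1 = character pixel

class CharCNN(nn.Module):
    """Small CNN recognizing one segmented 32x32 character patch."""
    def __init__(self, n_classes: int = 62):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Flatten(), nn.Linear(32 * 8 * 8, n_classes))

    def forward(self, x):
        return self.net(x)
```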

Automatic Extraction of Hangul Stroke Element Using Faster R-CNN for Font Similarity (글꼴 유사도 판단을 위한 Faster R-CNN 기반 한글 글꼴 획 요소 자동 추출)

  • Jeon, Ja-Yeon;Park, Dong-Yeon;Lim, Seo-Young;Ji, Yeong-Seo;Lim, Soon-Bum
    • Journal of Korea Multimedia Society
    • /
    • v.23 no.8
    • /
    • pp.953-964
    • /
    • 2020
  • Ever since media content took over the world, the importance of typography has increased and the influence of fonts has been recognized. Nevertheless, the current Hangul font system is very poor and is provided passively, so it is practically impossible to understand and utilize all the shape characteristics of the more than six thousand Hangul fonts. In this paper, the shape characteristics of Hangul fonts were selected based on the Hangul structure of similar fonts. Stroke element detection was trained by fine-tuning Faster R-CNN Inception v2, one of the deep learning object detection models. We also propose a system that automatically extracts stroke element characteristics from characters by introducing an automatic extraction algorithm. In comparison to previous research, which showed poor accuracy using an SVM (Support Vector Machine) and a sliding window algorithm, the proposed system showed about 10% higher accuracy in properly detecting and extracting stroke elements from various fonts. In conclusion, if the stroke element characteristics based on the Hangul structural information extracted through the system are used for similarity classification, problems such as copyright will be solved in an era when typography's competitiveness becomes stronger, and an automated process will be provided to users for more convenience.
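
The paper fine-tunes Faster R-CNN Inception v2; as an illustrative stand-in (not the authors' setup), a pretrained torchvision Faster R-CNN can be adapted to custom stroke-element classes along these lines:

```python
# Hedged sketch: adapt a pretrained Faster R-CNN head to stroke-element classes.
# The torchvision ResNet-50 backbone is a substitution for illustration only.
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

def build_stroke_detector(num_stroke_classes: int):
    # +1 for the background class required by torchvision detection models
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_stroke_classes + 1)
    return model

# Training then follows the standard detection loop: the model takes a list of
# images and a list of target dicts with "boxes" (x1, y1, x2, y2) and "labels".
```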

Two-Phase Shallow Semantic Parsing based on Partial Syntactic Parsing (부분 구문 분석 결과에 기반한 두 단계 부분 의미 분석 시스템)

  • Park, Kyung-Mi;Mun, Young-Song
    • The KIPS Transactions:PartB
    • /
    • v.17B no.1
    • /
    • pp.85-92
    • /
    • 2010
  • A shallow semantic parsing system analyzes the relationship that a syntactic constituent of a sentence has with a predicate. It identifies semantic arguments representing the agent, patient, instrument, etc., of the predicate. In this study, we propose a two-phase shallow semantic parsing model consisting of an identification phase and a classification phase. We first find the boundaries of semantic arguments from partial syntactic parsing results, and then assign appropriate semantic roles to the identified arguments. By taking this sequential two-phase approach, we can alleviate the unbalanced class distribution problem and select features appropriate for each task. Experiments show the relative contribution of each phase on the test data.
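
A schematic sketch of the two-phase idea (a binary argument-identification step followed by multi-class role classification over constituents); the linear classifiers and constituent feature vectors are placeholders, not the paper's feature set:

```python
# Hedged sketch: sequential identification -> classification over candidate constituents.
from sklearn.linear_model import LogisticRegression

class TwoPhaseSRL:
    def __init__(self):
        self.identifier = LogisticRegression(max_iter=1000)   # argument vs. non-argument
        self.classifier = LogisticRegression(max_iter=1000)   # semantic role labels

    def fit(self, X, is_arg, roles):
        """X: feature vectors per constituent; is_arg: 0/1; roles: labels for true arguments."""
        self.identifier.fit(X, is_arg)
        arg_idx = [i for i, a in enumerate(is_arg) if a == 1]
        self.classifier.fit(X[arg_idx], [roles[i] for i in arg_idx])

    def predict(self, X):
        pred_arg = self.identifier.predict(X)
        out = ["-"] * len(X)                        # "-" marks non-arguments
        arg_idx = [i for i, a in enumerate(pred_arg) if a == 1]
        if arg_idx:
            for i, role in zip(arg_idx, self.classifier.predict(X[arg_idx])):
                out[i] = role
        return out
```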

A Design of Fuzzy Classifier with Hierarchical Structure (계층적 구조를 가진 퍼지 패턴 분류기 설계)

  • Ahn, Tae-Chon;Roh, Seok-Beom;Kim, Yong Soo
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.24 no.4
    • /
    • pp.355-359
    • /
    • 2014
  • In this paper, we propose a new fuzzy pattern classifier that hierarchically combines several fuzzy models with simple consequent parts. Because the basic component of the proposed classifier is a fuzzy model with a simple consequent part, the complexity of the overall classifier is not high. To analyze and partition the input space, we use the Fuzzy C-Means clustering algorithm, and we exploit the Conditional Fuzzy C-Means clustering algorithm to analyze the subspaces produced by Fuzzy C-Means. In each clustered region we apply a fuzzy model with a simple consequent part, thereby building the fuzzy pattern classifier with a hierarchical structure. Thanks to this hierarchical structure, the data distribution of the input space can be analyzed from both a macroscopic and a microscopic point of view. Finally, standard machine learning data sets are used to evaluate the classification ability of the proposed classifier.
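
As a rough illustration of the clustering step only (plain Fuzzy C-Means; the conditional variant and the simple-consequent fuzzy models are not reproduced here), a minimal NumPy implementation might look like this:

```python
# Hedged sketch: minimal Fuzzy C-Means returning centers and soft memberships.
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, seed=0):
    """X: (n_samples, n_features); c: number of clusters; m: fuzzifier (>1)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)                 # memberships sum to 1 per sample
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]          # weighted means
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = dist ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)                 # membership update
    return centers, U
```

In the hierarchical scheme, FCM would first partition the whole input space, and a second (conditional) clustering would refine each high-membership region before a simple local model is fitted there.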

Decoding Brain Patterns for Colored and Grayscale Images using Multivariate Pattern Analysis

  • Zafar, Raheel;Malik, Muhammad Noman;Hayat, Huma;Malik, Aamir Saeed
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.14 no.4
    • /
    • pp.1543-1561
    • /
    • 2020
  • Creating a taxonomy of human brain activity is a complicated and challenging procedure; its multifaceted aspects, including experiment design, stimulus selection, and image presentation, as well as feature extraction and selection techniques, foster its challenging nature. Although researchers have explored various methods to create a taxonomy of human brain activity, the use of multivariate pattern analysis (MVPA) with image recognition to catalog brain activity is scarce. Moreover, experiment design is a complex procedure, and the selection of image type, color, and order is challenging as well. This research bridges the gap by using MVPA to create a taxonomy of human brain activity for different categories of images, both colored and grayscale. An EEG experiment was conducted, with feature extraction, selection, and classification approaches, to collect data from 25 prequalified graduates of University Technology PETRONAS (UTP). These participants were shown both colored and grayscale images, and accuracy and reaction time were recorded. The results showed that colored images produce better results in terms of accuracy and response time using the wavelet transform, t-test, and support vector machine. This research indicates that MVPA is a better approach for the analysis of EEG data, since more useful information can be extracted from the brain using colored images. It discusses in detail the behavior of the human brain for colored and grayscale images on a specific, unique task, and contributes to further improving the decoding of human brain activity with increased accuracy. Such experiment settings can also be implemented in other areas such as medicine, the military, business, and lie detection.
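
A hedged sketch of the reported processing chain (wavelet features, t-test based selection, SVM classification); the wavelet choice, sub-band energy features, and significance threshold are assumptions for illustration, not the study's exact parameters:

```python
# Hedged sketch: wavelet features -> t-test feature selection -> SVM.
import numpy as np
import pywt
from scipy.stats import ttest_ind
from sklearn.svm import SVC

def wavelet_features(trial: np.ndarray, wavelet: str = "db4", level: int = 4) -> np.ndarray:
    """trial: (channels, samples). Energy of each wavelet sub-band per channel."""
    feats = []
    for ch in trial:
        coeffs = pywt.wavedec(ch, wavelet, level=level)
        feats.extend(float(np.sum(c ** 2)) for c in coeffs)
    return np.array(feats)

def select_by_ttest(X: np.ndarray, y: np.ndarray, alpha: float = 0.05) -> np.ndarray:
    """Keep features whose class means differ significantly (two-sample t-test)."""
    _, p = ttest_ind(X[y == 0], X[y == 1], axis=0)
    return p < alpha

# X = np.stack([wavelet_features(t) for t in trials]); y = condition labels
# mask = select_by_ttest(X, y); clf = SVC(kernel="linear").fit(X[:, mask], y)
```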

Data-driven Analysis for Future Land-use Change Prediction : Case Study on Seoul (서울 데이터 기반 필지별 용도전환 발생 예측)

  • Yun, Sung Bum;Mun, Sungchul;Park, Soon Yong;Kim, Taehyun
    • Journal of Broadcast Engineering
    • /
    • v.25 no.2
    • /
    • pp.176-184
    • /
    • 2020
  • Due to constant development and decline across Seoul, the Seoul government is pursuing various policies to regenerate declining areas. These policies lead to land-use changes in numerous Seoul districts. This study aims to create a prediction model that can foresee future land-use changes and, in doing so, to derive the influential factors that lead to land-use changes. To this end, various open data sets from national departments and the Seoul government were collected and fed into a random forest algorithm. The results showed promising accuracy and identified multiple influential factors that cause land-use changes in Seoul districts. The results of this study could be used in policy making for the public sector, or as a basis for studying gentrification problems in the Seoul area.
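
A minimal sketch of the random-forest setup described (parcel-level features predicting whether a land-use change occurs, with importance scores as the influential-factor ranking); the column names below are hypothetical, not the study's actual variables:

```python
# Hedged sketch: random forest on parcel features, with importance-based factor ranking.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# df: one row per parcel; "changed" = 1 if a land-use conversion occurred.
# The feature columns are placeholders for the open-data attributes used.
def train_landuse_model(df: pd.DataFrame):
    features = ["land_price", "building_age", "floor_area_ratio", "dist_to_subway"]
    X_tr, X_te, y_tr, y_te = train_test_split(
        df[features], df["changed"], test_size=0.2, random_state=0)
    model = RandomForestClassifier(n_estimators=300, random_state=0)
    model.fit(X_tr, y_tr)
    importance = pd.Series(
        model.feature_importances_, index=features).sort_values(ascending=False)
    return model, model.score(X_te, y_te), importance
```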

A Study on the Abnormal Behavior Detection Model through Data Transfer Data Analysis (자료 전송 데이터 분석을 통한 이상 행위 탐지 모델의 관한 연구)

  • Son, In Jae;Kim, Huy Kang
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.30 no.4
    • /
    • pp.647-656
    • /
    • 2020
  • Recently, there has been an increasing number of cases in which important data (personal information, technology, etc.) of national and public institutions are leaked to the outside. Surveys show that the largest cause of such leakage accidents is insiders. Insiders with the most authority in an organization can cause more damage than technology leaks caused by external attacks, because insiders have relatively easy access to the organization's major assets. This study presents an optimized feature selection model for detecting such abnormal behavior with supervised machine learning algorithms, using actual data such as transmission logs from the CrossNet data transfer system (which safely transmits data between the separated security and non-security areas of the business network and the Internet network), e-mail transmission logs, and personnel information.
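
A hedged sketch of a supervised feature-selection step on log-derived features; the feature names and the mutual-information criterion are illustrative assumptions, not the study's final selection:

```python
# Hedged sketch: score log-derived features and train a supervised detector.
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.pipeline import Pipeline

# X: per-user-per-day features aggregated from transfer/e-mail logs
# (e.g. transfer count, total bytes, after-hours ratio); y: 1 = known abnormal case.
def build_detector(k_best: int = 10) -> Pipeline:
    return Pipeline([
        ("select", SelectKBest(score_func=mutual_info_classif, k=k_best)),
        ("clf", GradientBoostingClassifier(random_state=0)),
    ])

# detector = build_detector().fit(X_train, y_train)
# ranked = detector.named_steps["select"].scores_   # per-feature relevance scores
```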

Developing an Ensemble Classifier for Bankruptcy Prediction (부도 예측을 위한 앙상블 분류기 개발)

  • Min, Sung-Hwan
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.17 no.7
    • /
    • pp.139-148
    • /
    • 2012
  • An ensemble of classifiers employs a set of individually trained classifiers and combines their predictions. It has been found that in most cases ensembles produce more accurate predictions than the base classifiers. Combining the outputs of multiple classifiers, known as ensemble learning, is one of the standard and most important techniques for improving classification accuracy in machine learning. An ensemble of classifiers is effective only if the individual classifiers make decisions that are as diverse as possible. Bagging is the most popular ensemble learning method for generating a diverse set of classifiers; diversity in bagging is obtained by using different training sets, with the training data subsets randomly drawn with replacement from the entire training data set. The random subspace method is an ensemble construction technique using different attribute subsets; the training data set is also modified as in bagging, but the modification is performed in the feature space. Bagging and random subspace are well-known and popular ensemble algorithms, yet few studies have dealt with integrating bagging and random subspace using SVM classifiers, even though there is great potential for useful applications in this area. The focus of this paper is to propose methods for improving SVM performance with a hybrid ensemble strategy for bankruptcy prediction. The proposed ensemble model is applied to the bankruptcy prediction problem using a real data set from Korean companies.
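
One way to sketch the hybrid bagging + random-subspace idea with SVM base learners is scikit-learn's BaggingClassifier, which can subsample both instances and features; this is an illustrative stand-in, not the paper's implementation:

```python
# Hedged sketch: bagging over samples AND feature subsets with SVM base classifiers.
from sklearn.ensemble import BaggingClassifier
from sklearn.svm import SVC

def hybrid_svm_ensemble(n_estimators: int = 30) -> BaggingClassifier:
    return BaggingClassifier(
        estimator=SVC(kernel="rbf", probability=True),  # 'base_estimator' in older sklearn
        n_estimators=n_estimators,
        max_samples=0.8,        # bootstrap sampling of training instances (bagging)
        bootstrap=True,
        max_features=0.5,       # random subsets of attributes (random subspace)
        bootstrap_features=False,
        random_state=0,
    )

# ensemble = hybrid_svm_ensemble().fit(X_train, y_train)
# proba = ensemble.predict_proba(X_test)[:, 1]   # bankruptcy probability estimate
```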

An Analysis of Convergence Phenomenon Using Industrial Convergence Coefficient (산업융합계수를 활용한 융합현상에 관한 연구)

  • Hwang, Sung-Hyun
    • The Journal of the Korea Contents Association
    • /
    • v.17 no.3
    • /
    • pp.666-674
    • /
    • 2017
  • Today, terms such as technology convergence and industrial convergence are emerging as one of the most important trends in our society. The purpose of this study is to compute an industrial convergence coefficient (ICC) for each industry using patent data and to analyze the convergence phenomenon in industry based on this coefficient. To do this, Korean patent data from 2011-2015 were used. The findings revealed that the ICC by industry was highest, in order, for man-made fibres, paints/varnishes, petroleum products/nuclear fuel, and other chemicals. According to the inter-industry convergence matrix, the number of convergence patents was greatest, in order, for office machinery and computers, special purpose machinery, and measuring instruments. The same analysis was also conducted for industries with a high number of patents; as a result, convergence was found to be actively carried out in the fields of optical instruments, basic chemicals, fabricated metal products, measuring instruments, and special purpose machinery manufacturing.