• Title/Abstract/Keywords: software classification

Feature-Based Relation Classification Using Quantified Relatedness Information

  • Huang, Jin-Xia;Choi, Key-Sun;Kim, Chang-Hyun;Kim, Young-Kil
    • ETRI Journal / v.32 no.3 / pp.482-485 / 2010
  • Feature selection is very important for feature-based relation classification tasks. While most existing work on feature selection relies on linguistic information acquired using parsers, this letter proposes new features, including probabilistic and semantic relatedness features, that explicitly express the relatedness between patterns and certain relation types. The impact of each feature set is evaluated using both a chi-square estimator and a performance evaluation. The experiments show that relatedness features have a greater impact than existing well-known linguistic features, and that their contribution cannot be replaced by other commonly used linguistic feature sets.
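
The chi-square evaluation mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a binary feature and a single relation type summarized in a 2x2 contingency table; the function name and the example counts are hypothetical and are not taken from the paper.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square statistic for a 2x2 contingency table.

    a: feature present, relation type present
    b: feature present, relation type absent
    c: feature absent,  relation type present
    d: feature absent,  relation type absent
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A row or column is empty, so no association can be measured.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


# Hypothetical counts: a larger value suggests a stronger association
# between the feature and the relation type.
independent = chi_square_2x2(10, 10, 10, 10)
perfectly_associated = chi_square_2x2(20, 0, 0, 20)
```

Ranking features by this statistic is one common way to compare feature sets; the letter may use a different formulation.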

Object-oriented Information Extraction and Application in High-resolution Remote Sensing Image

  • WEI Wenxia;Ma Ainai;Chen Xunwan
    • Proceedings of the KSRS Conference / 2004.10a / pp.125-127 / 2004
  • High-resolution satellite images offer abundant information about the earth's surface for remote sensing applications. The information includes geometry, texture, and attribute characteristics. Pixel-based image classification cannot meet the classification precision required for high-resolution satellite images and produces large data redundancy. Object-oriented information extraction depends not only on spectral characteristics but also uses geometric and structural information. It can provide an accessible and truly revolutionary approach. Using a Beijing SPOT 5 high-resolution image and object-oriented classification with the eCognition software, we achieve a precise classification of culture types. The test areas contain five culture types: water, vegetation, road, building, and bare land. We use nearest neighbor classification and assess the overall classification accuracy. The average over the five types reaches 0.90, the maximum in every case is 1, and the standard deviation is less than 0.11. The overall accuracy reaches 95.47%. This method offers a new technique for applying high-resolution satellite images to culture classification in remote sensing.

Classification of HTTP Automated Software Communication Behavior Using a NoSQL Database

  • Tran, Manh Cong;Nakamura, Yasuhiro
    • IEIE Transactions on Smart Processing and Computing / v.5 no.2 / pp.94-99 / 2016
  • Application layer attacks have for years posed an ever-serious threat to network security, since they always come after a technically legitimate connection has been established. In recent years, cyber criminals have turned to fully exploiting the web as a medium of communication to launch a variety of forbidden or illicit activities by spreading malicious automated software (auto-ware) such as adware, spyware, or bots. When this malicious auto-ware infects a network, it will act like a robot, mimic normal behavior of web access, and bypass the network firewall or intrusion detection system. Besides that, in a private and large network, with huge Hypertext Transfer Protocol (HTTP) traffic generated each day, communication behavior identification and classification of auto-ware is a challenge. In this paper, based on a previous study, analysis of auto-ware communication behavior, and with the addition of new features, a method for classification of HTTP auto-ware communication is proposed. For that, a Not Only Structured Query Language (NoSQL) database is applied to handle large volumes of unstructured HTTP requests captured every day. The method is tested with real HTTP traffic data collected through a proxy server of a private network, providing good results in the classification and detection of suspicious auto-ware web access.

Automatic Classification of Blog Posts using Various Term Weighting (다양한 어휘 가중치를 이용한 블로그 포스트의 자동 분류)

  • Kim, Su-Ah;Jho, Hee-Sun;Lee, Hyun Ah
    • Journal of Advanced Marine Engineering and Technology / v.39 no.1 / pp.58-62 / 2015
  • Most blog sites provide predefined classes based on contents or topics, but few bloggers choose classes for their posts because the manual process is cumbersome. This paper proposes an automatic blog post classification method that combines term frequency, document frequency, and class frequency from each class in various ways to find an appropriate weighting scheme. In experiments, the combination of term frequency, category term frequency, and inverse document frequency (computed excluding the target category) achieves a classification precision of 77.02%.
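
One way to read the best-performing combination in the abstract above is sketched below. The abstract does not give the exact formulas, so the smoothed logarithmic inverse document frequency, the scoring by summed products, and all function names and sample posts are assumptions made for illustration only.

```python
import math
from collections import Counter


def train(docs):
    """docs: list of (category, tokens). Collect per-category statistics."""
    ctf = {}            # category -> term frequency inside that category
    df = {}             # category -> document frequency inside that category
    n_docs = Counter()  # category -> number of documents
    for cat, tokens in docs:
        ctf.setdefault(cat, Counter()).update(tokens)
        df.setdefault(cat, Counter()).update(set(tokens))
        n_docs[cat] += 1
    return ctf, df, n_docs


def score(tokens, cat, ctf, df, n_docs):
    """Sum of tf * category tf * idf, with idf computed over other categories."""
    tf = Counter(tokens)
    other_docs = sum(n for c, n in n_docs.items() if c != cat)
    total = 0.0
    for term, freq in tf.items():
        other_df = sum(df[c][term] for c in df if c != cat)
        # Assumed smoothing: add 1 to avoid division by zero.
        idf = math.log((1 + other_docs) / (1 + other_df))
        total += freq * ctf[cat][term] * idf
    return total


def classify(tokens, ctf, df, n_docs):
    """Return the category with the highest score for the post."""
    return max(ctf, key=lambda c: score(tokens, c, ctf, df, n_docs))


# Hypothetical training posts.
posts = [
    ("travel", ["beach", "hotel", "flight"]),
    ("travel", ["hotel", "tour"]),
    ("food", ["recipe", "pasta"]),
    ("food", ["pasta", "restaurant", "hotel"]),
]
model = train(posts)
label = classify(["hotel", "beach"], *model)
```

Terms that also appear often in other categories receive a low weight, so a word such as "hotel" in this toy data contributes little to the category where it is less distinctive.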

Hardware Accelerated Design on Bag of Words Classification Algorithm

  • Lee, Chang-yong;Lee, Ji-yong;Lee, Yong-hwan
    • Journal of Platform Technology / v.6 no.4 / pp.26-33 / 2018
  • In this paper, we propose an image retrieval algorithm for real-time processing and design it as hardware. The proposed method is based on the Bag of Words (BoWs) classification algorithm and performs image search using bit streams. K-fold cross validation is used to verify the algorithm. The data is divided into seven classes of seven images each, so a total of 49 images are tested. The test takes two kinds of measurement: accuracy and speed. The image classification accuracy was 86.2% for the BoWs algorithm and 83.7% for the proposed hardware-accelerated software implementation, so the BoWs algorithm was 2.5 percentage points higher. The image retrieval processing time is 7.89 s for BoWs and 1.55 s for our algorithm, which makes our algorithm 5.09 times faster than the BoWs algorithm. The algorithm is largely divided into software and hardware parts. The software part is written in C. The Scale Invariant Feature Transform algorithm is used to extract feature points that are invariant to size and rotation from the image, and bit streams are generated from the extracted feature points. In the hardware architecture, the proposed image retrieval algorithm is written in Verilog HDL and designed and verified with an FPGA and Design Compiler. The generated bit streams are stored, the clustering step is performed, and a search image database or an input image database is generated and matched. With the proposed algorithm, database matching that represents each object can be searched faster, which improves user convenience and satisfaction.
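
The speed and accuracy comparison reported in the abstract above follows directly from the four figures it gives. The short check below only reproduces that arithmetic; the variable names are chosen for illustration.

```python
# Figures quoted in the abstract.
bow_time_s = 7.89        # BoWs retrieval time in seconds
proposed_time_s = 1.55   # proposed algorithm retrieval time in seconds
bow_accuracy = 86.2      # percent
proposed_accuracy = 83.7 # percent

# Speed-up is the ratio of the two processing times.
speedup = bow_time_s / proposed_time_s

# The accuracy difference is in percentage points, not a relative percentage.
accuracy_gap = bow_accuracy - proposed_accuracy

print(f"speed-up: {speedup:.2f}x")
print(f"accuracy gap: {accuracy_gap:.1f} percentage points")
```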

MalDC: Malicious Software Detection and Classification using Machine Learning

  • Moon, Jaewoong;Kim, Subin;Park, Jangyong;Lee, Jieun;Kim, Kyungshin;Song, Jaeseung
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.5 / pp.1466-1488 / 2022
  • Recently, the importance and necessity of artificial intelligence (AI), especially machine learning, have been emphasized. Studies are actively underway to solve complex and challenging problems through the use of AI systems, such as intelligent CCTVs, intelligent AI security systems, and AI surgical robots. Information security, which involves analyzing and responding to software security vulnerabilities, is no exception and is recognized as one of the fields where significant results are expected when AI is applied. This is because the frequency of malware incidents is gradually increasing, while the available security technologies are limited by their reliance on software security experts or source code analysis tools. We conducted a study on MalDC, a technique that converts malware into images using machine learning. MalDC showed good performance and was able to analyze and classify different types of malware. MalDC applies a preprocessing step to minimize the noise generated in the image conversion process and employs an image augmentation technique to reinforce the insufficient dataset, thus improving the accuracy of malware classification. To verify the feasibility of our method, we tested the malware classification technique used by MalDC on a dataset provided by Microsoft and on malware data collected by the Korea Internet & Security Agency (KISA). An accuracy of 97% was achieved.
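
The abstract above does not say how MalDC maps a binary to an image. A widely used convention, assumed here purely for illustration, treats each byte as one grayscale pixel (0 to 255) and wraps the byte sequence at a fixed row width; the function name, the default width, and the zero padding are all hypothetical choices.

```python
def bytes_to_image(data: bytes, width: int = 16):
    """Arrange raw bytes as rows of grayscale pixel values (0-255).

    The last row is padded with zeros so that every row has `width` pixels.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    rows = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])
        row += [0] * (width - len(row))  # assumed zero padding
        rows.append(row)
    return rows


# Hypothetical five-byte sample wrapped at two pixels per row.
image = bytes_to_image(bytes([1, 2, 3, 4, 5]), width=2)
```

The resulting grid can be handed to any image classifier; the preprocessing and augmentation steps that the paper describes are not reproduced here.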

Software Quality Classification using Bayesian Classifier (베이지안 분류기를 이용한 소프트웨어 품질 분류)

  • Hong, Euy-Seok
    • Journal of Information Technology Services / v.11 no.1 / pp.211-221 / 2012
  • Many metric-based classification models have been proposed to predict the fault-proneness of software modules. This paper presents two prediction models using the Bayesian classifier, one of the most popular modern classification algorithms. A Bayesian model based on Bayesian probability theory can be a promising technique for software quality prediction, because it can represent uncertainty using probabilities and can partly incorporate expert knowledge into the training data. The two models, Naïve Bayes (NB) and Bayesian Belief Network (BBN), are constructed, and dimensionality reduction of the training and test data is performed before model evaluation. The prediction accuracy of the models is evaluated using two prediction error measures, Type I error and Type II error, and compared with two well-known prediction models, a backpropagation neural network model and a support vector machine model. The results show that the prediction performance of the BBN model is slightly better than that of NB. For the data set with ambiguity, the BBN model's prediction accuracy is not as good as that of the compared models, but for the data set without ambiguity it outperforms them.
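
The two error measures named in the abstract above can be sketched as follows. The sketch assumes the common convention in fault-proneness prediction that a Type I error is a non-fault-prone module predicted as fault-prone and a Type II error is a fault-prone module predicted as non-fault-prone, with each rate taken over the modules of the actual class; the paper may normalize differently. The labels are hypothetical.

```python
def error_rates(actual, predicted):
    """Return (type_i_rate, type_ii_rate) for boolean fault-proneness labels.

    actual / predicted: sequences where a truthy value means "fault-prone".
    """
    pairs = list(zip(actual, predicted))
    false_pos = sum(1 for a, p in pairs if not a and p)   # Type I errors
    false_neg = sum(1 for a, p in pairs if a and not p)   # Type II errors
    negatives = sum(1 for a in actual if not a)
    positives = sum(1 for a in actual if a)
    type_i = false_pos / negatives if negatives else 0.0
    type_ii = false_neg / positives if positives else 0.0
    return type_i, type_ii


# Hypothetical labels for six modules (1 = fault-prone).
actual = [1, 1, 0, 0, 0, 1]
predicted = [1, 0, 0, 1, 0, 1]
type_i, type_ii = error_rates(actual, predicted)
```

In this setting a Type II error is usually the costlier one, because a faulty module goes untested.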

Development of Plantar Pressure Measurement System and Personal Classification Study based on Plantar Pressure Image

  • Ho, Jong Gab;Kim, Dae Gyeom;Kim, Young;Jang, Seung-wan;Min, Se Dong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.11 / pp.3875-3891 / 2021
  • In this study, a Velostat pressure sensor was manufactured to develop a plantar pressure measurement system, and a C#-based application was developed to monitor and collect plantar pressure data in real time. To evaluate the characteristics of the proposed plantar pressure measurement system, the accuracy of the plantar pressure indices and of personal classification was verified by comparison with MatScan, a commercial plantar pressure measurement system. The output characteristics of the Velostat pressure sensor as a function of weight were evaluated, and a trend line with a reliability of $r^2 = 0.98$ was obtained. The Root Mean Square Error (RMSE) of the weighted area was 11.315 cm$^2$, the RMSE of the x coordinate of the Center of Pressure (CoPx) was 1.036 cm, and the RMSE of the y coordinate of the Center of Pressure (CoPy) was 0.936 cm. Finally, the personal classification accuracy was 99.47% for the proposed system and 96.86% for MatScan. Because the system is simple to implement and can be manufactured at low cost, it is considered applicable to various fields beyond plantar pressure measurement, such as monitoring sitting posture and vital signs like breathing.
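
The RMSE and Center of Pressure quantities reported in the abstract above can be sketched as follows. The pressure-weighted centroid is the usual definition of the Center of Pressure, but the grid layout, the use of cell indices instead of centimeters, and the function names are assumptions for illustration.

```python
import math


def rmse(measured, reference):
    """Root mean square error between two equally long sequences."""
    if len(measured) != len(reference) or not measured:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((m - r) ** 2 for m, r in zip(measured, reference))
    return math.sqrt(squared / len(measured))


def center_of_pressure(grid):
    """Pressure-weighted centroid (x, y) of a 2D grid, in cell indices.

    grid[y][x] holds the pressure reading of one sensor cell.
    Returns None when no pressure is applied.
    """
    total = sum(sum(row) for row in grid)
    if total == 0:
        return None
    cop_x = sum(x * p for row in grid for x, p in enumerate(row)) / total
    cop_y = sum(y * p for y, row in enumerate(grid) for p in row) / total
    return cop_x, cop_y


# Hypothetical 2x2 frame with uniform pressure: the centroid is the middle.
cop = center_of_pressure([[1, 1], [1, 1]])
```

Multiplying the index-based result by the physical cell pitch would convert it to centimeters for comparison against a reference system.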

Discriminative Feature Vector Selection for Emotion Classification Based on Speech (음성신호기반의 감정분석을 위한 특징벡터 선택)

  • Choi, Ha-Na;Byun, Sung-Woo;Lee, Seok-Pil
    • The Transactions of The Korean Institute of Electrical Engineers / v.64 no.9 / pp.1363-1368 / 2015
  • Recently, computers have become smaller thanks to advances in computing technology, and many wearable devices have appeared. As a result, machine recognition of human emotion has become important, and research on analyzing emotional states is increasing. The human voice carries much information about human emotion. This paper proposes a discriminative feature vector selection for emotion classification based on speech. For this, we extract feature vectors such as pitch, MFCC, LPC, and LPCC from voice signals divided into four emotion classes (happy, normal, sad, and angry) and compare the separability of the extracted feature vectors using the Bhattacharyya distance. Based on this comparison, more effective feature vectors are recommended for emotion classification.
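
The separability comparison in the abstract above relies on the Bhattacharyya distance. The sketch below uses its closed form for two univariate Gaussian distributions; modeling each feature per emotion class as a single Gaussian is an assumption made here, and the function names and sample values are hypothetical.

```python
import math
from statistics import fmean, pvariance


def bhattacharyya_gaussian(mean1, var1, mean2, var2):
    """Bhattacharyya distance between two univariate Gaussians."""
    if var1 <= 0 or var2 <= 0:
        raise ValueError("variances must be positive")
    var_sum = var1 + var2
    mean_term = 0.25 * (mean1 - mean2) ** 2 / var_sum
    var_term = 0.5 * math.log(var_sum / (2.0 * math.sqrt(var1 * var2)))
    return mean_term + var_term


def feature_separability(samples_a, samples_b):
    """Distance between two classes for one scalar feature (Gaussian fit)."""
    return bhattacharyya_gaussian(
        fmean(samples_a), pvariance(samples_a),
        fmean(samples_b), pvariance(samples_b),
    )


# Hypothetical pitch values (Hz) for two emotion classes.
happy_pitch = [220.0, 235.0, 250.0, 240.0]
sad_pitch = [150.0, 160.0, 155.0, 170.0]
distance = feature_separability(happy_pitch, sad_pitch)
```

A larger distance means the two classes overlap less on that feature, so features can be ranked by their distances across class pairs.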

A Dynamic Approach to Estimate Change Impact using Type of Change Propagation

  • Gupta, Chetna;Singh, Yogesh;Chauhan, Durg Singh
    • Journal of Information Processing Systems / v.6 no.4 / pp.597-608 / 2010
  • Software evolution is an ongoing process carried out with the aim of extending base applications, either to add new functionality or to adapt software to changing environments. This creates the need to estimate and determine the overall impact of changes to a software system. In the last few decades, many change/impact analysis techniques have been developed to identify the consequences of making changes to software systems. In this paper we propose a new approach to change/impact analysis that classifies a change by its type, e.g., (a) the nature and (b) the extent of change propagation. The impact set produced carries two dimensions of information: (a) the statements affected by change propagation and (b) the percentage, i.e., the statements affected in each category and in the overall system. We also propose an algorithm for classifying the type of change. To establish confidence in its effectiveness and efficiency, we illustrate the technique with an example. The results of our analysis are promising with respect to the goal of enhancing change classification. The proposed dynamic technique for estimating impact sets and their percentage of impact will help software maintainers perform selective regression testing by analyzing impact sets with regard to the nature of the change and change dependency.
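
The second dimension of the impact set described in the abstract above, the percentage of affected statements, can be illustrated with a minimal sketch. The abstract does not name the change categories or give the algorithm, so the category labels, the statement numbers, and the function name below are hypothetical.

```python
from collections import Counter


def impact_percentages(affected, total_statements):
    """Percentage of statements affected, per category and overall.

    affected: dict mapping a statement number to its (assumed) change category.
    total_statements: number of statements in the whole system.
    """
    if total_statements <= 0:
        raise ValueError("total_statements must be positive")
    counts = Counter(affected.values())
    per_category = {
        category: 100.0 * n / total_statements
        for category, n in counts.items()
    }
    overall = 100.0 * len(affected) / total_statements
    return per_category, overall


# Hypothetical impact set: three affected statements in a 60-statement system.
impact_set = {12: "data", 15: "data", 40: "control"}
per_category, overall = impact_percentages(impact_set, 60)
```

A maintainer could use such figures to prioritize regression tests for the categories with the largest share of affected statements.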