Study of Machine-Learning Classifier and Feature Set Selection for Intent Classification of Korean Tweets about Food Safety

  • Yeom, Ha-Neul;Hwang, Myunggwon;Hwang, Mi-Nyeong;Jung, Hanmin
    • Journal of Information Science Theory and Practice / v.2 no.3 / pp.29-39 / 2014
  • In recent years, several studies have proposed using the Twitter micro-blogging service to track trends in online media and discussion. In this study, we specifically examine the use of Twitter to track discussions of food safety in the Korean language. Given the irregularity of keyword use in most tweets, we focus on selecting the optimal machine-learning classifier and feature set for classifying collected tweets. We build the classifier model using the Naive Bayes, Naive Bayes Multinomial, Support Vector Machine, and Decision Tree algorithms, all of which show good performance. To select an optimum feature set, we construct a basic feature set as a standard for performance comparison, so that further test feature sets can be evaluated against it. Experiments show that precision and F-measure are best when using a Naive Bayes Multinomial classifier model with a test feature set defined by extracting the Substantive, Predicate, Modifier, and Interjection parts of speech.
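The combination described in this abstract (a part-of-speech feature set feeding a multinomial Naive Bayes classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: the tag names, the function names `select_features`, `train_multinomial_nb`, and `predict`, the Laplace smoothing constant, and the toy English tokens are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical tag names for the four parts of speech named in the abstract.
KEEP_POS = {"substantive", "predicate", "modifier", "interjection"}

def select_features(tagged_tokens, keep=KEEP_POS):
    """Keep only the words whose part-of-speech tag is in the feature set."""
    return [word for word, pos in tagged_tokens if pos in keep]

def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model on tokenized documents.

    alpha is the additive (Laplace) smoothing constant, an assumed default.
    """
    vocab = {tok for doc in docs for tok in doc}
    class_docs = Counter(labels)
    token_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        token_counts[label].update(doc)
    model = {"vocab": vocab, "log_prior": {}, "log_likelihood": {}}
    for label, n_docs in class_docs.items():
        model["log_prior"][label] = math.log(n_docs / len(docs))
        total = sum(token_counts[label].values()) + alpha * len(vocab)
        model["log_likelihood"][label] = {
            tok: math.log((token_counts[label][tok] + alpha) / total)
            for tok in vocab
        }
    return model

def predict(model, doc):
    """Return the class with the highest log posterior; unseen tokens are ignored."""
    scores = {}
    for label, prior in model["log_prior"].items():
        ll = model["log_likelihood"][label]
        scores[label] = prior + sum(ll[t] for t in doc if t in model["vocab"])
    return max(scores, key=scores.get)

# Toy training data (made up for illustration only).
docs = [
    ["recall", "contaminated", "milk"],
    ["tasty", "milk", "recipe"],
    ["contaminated", "water", "warning"],
    ["recipe", "dinner", "tasty"],
]
labels = ["safety", "other", "safety", "other"]
model = train_multinomial_nb(docs, labels)
```

In this sketch, a tagged tweet would first pass through `select_features` and the resulting word list would then go to `predict`; swapping `KEEP_POS` for another tag set is how alternative test feature sets could be compared against a baseline.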

Classification of Korean Ancient Glass Pieces by Pattern Recognition Method (패턴인지법에 의한 한국산 고대 유리제품의 분류)

  • Lee Chul;Czae Myung-Zoon;Kim Seungwon;Kang Hyung Tae;Lee Jong Du
    • Journal of the Korean Chemical Society / v.36 no.1 / pp.113-124 / 1992
  • The pattern recognition methods of chemometrics were applied to multivariate data in which 12 elements were determined in ninety-four Korean ancient glass pieces by neutron activation analysis. Principal component analysis and non-linear mapping were used as the unsupervised learning methods. As a result, the glass samples were classified into 6 classes. SIMCA (statistical isolinear multiple component analysis), adopted as a supervised learning method, was applied to the 6 training sets and the test set. The results for the 6 training sets agreed with those from principal component analysis and non-linear mapping. For the test set, 17 of 33 samples were each allocated to one of the 6 training sets.


CNN-based Android Malware Detection Using Reduced Feature Set

  • Kim, Dong-Min;Lee, Soo-jin
    • Journal of the Korea Society of Computer and Information / v.26 no.10 / pp.19-26 / 2021
  • The performance of deep learning-based malware detection and classification models depends largely on how the feature set used for training is constructed. In this paper, we propose an approach for selecting the optimal feature set to maximize detection performance in CNN-based Android malware detection. The features to be included in the feature set were selected with the Chi-Square test, which is widely used for feature selection in machine learning and deep learning. To validate the proposed approach, the CNN model was trained using 36 features selected from the CICANDMAL2017 dataset, and the malware detection performance was then measured. As a result, an accuracy of 99.99% was achieved in binary classification and 98.55% in multiclass classification.
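As a rough illustration of Chi-Square feature selection, the sketch below scores each feature with the Pearson chi-square statistic of independence between the feature and the class label, then keeps the top-scoring features. The abstract does not give the exact procedure, so the function names `chi_square_score` and `select_top_k`, the categorical-feature assumption, and the toy data are all hypothetical.

```python
def chi_square_score(feature_column, labels):
    """Pearson chi-square statistic between one categorical feature and the labels."""
    n = len(labels)
    score = 0.0
    for v in set(feature_column):
        row_total = sum(1 for x in feature_column if x == v)
        for c in set(labels):
            col_total = sum(1 for y in labels if y == c)
            observed = sum(
                1 for x, y in zip(feature_column, labels) if x == v and y == c
            )
            # Expected count if the feature and the label were independent.
            expected = row_total * col_total / n
            score += (observed - expected) ** 2 / expected
    return score

def select_top_k(rows, labels, k):
    """Return the indices of the k highest-scoring features and all scores."""
    n_features = len(rows[0])
    scores = [
        chi_square_score([row[j] for row in rows], labels)
        for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k]), scores

# Toy data (made up): feature 0 tracks the label exactly, features 1 and 2 do not.
rows = [[1, 0, 1], [1, 1, 0], [0, 0, 1], [0, 1, 0]]
labels = [1, 1, 0, 0]
selected, scores = select_top_k(rows, labels, k=1)
```

Here feature 0 receives the highest score because it is the only one that varies with the label. In a setting like the one described, `k` would be 36 and the rows would come from the malware dataset rather than this toy table.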

Pre-Evaluation for Prediction Accuracy by Using the Customer's Ratings in Collaborative Filtering (협업필터링에서 고객의 평가치를 이용한 선호도 예측의 사전평가에 관한 연구)

  • Lee, Seok-Jun;Kim, Sun-Ok
    • Asia Pacific Journal of Information Systems / v.17 no.4 / pp.187-206 / 2007
  • Advances in computer and information technology, combined with internet infrastructure, have spread information widely, not only in specialized fields but also in people's daily lives. This ubiquity of information has changed traditional transactions and led to a new form of E-commerce distinct from the existing one, in which not only physical goods but also non-physical services are traded. As the scale of E-commerce grows, however, people find it harder to locate the information they want. Recommender systems are becoming the main tools for mitigating this information overload in E-commerce. A recommender system can be defined as a system that suggests items (goods or services) based on customers' interests or tastes. E-commerce web sites use them to suggest products to customers and to provide information that helps them decide what to purchase. There are several approaches to recommending goods to customers, but this study focuses on the collaborative filtering technique. It presents the possibility of pre-evaluating the prediction performance for a customer's preferences before the prediction process is run. Customers expected to have low prediction performance are identified in advance using the statistical features of the ratings each customer has given. The MovieLens 100K dataset is used to analyze the accuracy of the classification. The classification criteria are set using the training set, which comprises 80% of the 100K dataset. In the classification process, the customers are divided into two groups, a classified group and a non-classified group.
To compare the prediction performance of the classified and non-classified groups, the 20% test set is run through the Neighborhood Based Collaborative Filtering Algorithm and the Correspondence Mean Algorithm. The prediction errors from these algorithms are allocated to each customer and compared with each user's error. Two research hypotheses are formulated to test the accuracy of the classification criteria. Hypothesis 1: The estimation accuracy of groups classified according to the standard deviation of each user's ratings differs significantly. To test Hypothesis 1, the standard deviation of ratings is calculated for each user in the training set (80% of the MovieLens 100K dataset). Users are classified into four groups according to the quartiles of these standard deviations, and the estimation errors of each group on the test set are tested for significant differences. Hypothesis 2: The estimation accuracy of groups classified according to the distribution of each user's ratings differs significantly. To test Hypothesis 2, the distribution of each user's ratings is compared with the distribution of the ratings of all customers in the training set. It is assumed that customers whose rating distributions differ from that of all customers will have low prediction performance, so six types of differing distributions are set for comparison. The test groups are classified into a fit group or a non-fit group for each assumed type of distribution. The agreement between each type of distribution and each customer's distribution is tested with the $\chi^2$ goodness-of-fit test, and the resulting two groups are compared for a difference in mean error.
Also, the degree of goodness-of-fit between the distribution of each user's ratings and the average distribution of ratings in the training set is closely related to the prediction errors from those prediction algorithms. Through this study, customers whose prediction performance is lower than that of the rest of the system are identified, before the prediction process, by these two criteria, which are set from the statistical features of customers' ratings in the training set.
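The two classification criteria in this abstract can be sketched roughly as follows: grouping users by the quartile of their rating standard deviation (Hypothesis 1), and computing a chi-square goodness-of-fit statistic of one user's ratings against a reference distribution (Hypothesis 2). The function names, the use of the population standard deviation, the default quartile method of Python's `statistics.quantiles`, and the toy ratings are assumptions, not details taken from the paper.

```python
import statistics
from collections import Counter

def quartile_groups(ratings_by_user):
    """Assign each user to group 1-4 by the quartile of their rating std dev."""
    spreads = {
        user: statistics.pstdev(ratings)  # population std dev (an assumption)
        for user, ratings in ratings_by_user.items()
    }
    q1, q2, q3 = statistics.quantiles(list(spreads.values()), n=4)

    def group(spread):
        if spread <= q1:
            return 1
        if spread <= q2:
            return 2
        if spread <= q3:
            return 3
        return 4

    return {user: group(s) for user, s in spreads.items()}

def chi_square_gof(user_ratings, reference_distribution):
    """Chi-square goodness-of-fit statistic of a user's ratings.

    reference_distribution maps each rating value to its (nonzero) proportion,
    e.g. the distribution of all ratings in the training set.
    """
    n = len(user_ratings)
    observed = Counter(user_ratings)
    return sum(
        (observed[r] - n * p) ** 2 / (n * p)
        for r, p in reference_distribution.items()
    )

# Toy ratings on a 1-5 scale (made up for illustration).
ratings_by_user = {
    "a": [3, 3, 3, 3],
    "b": [3, 3, 3, 4],
    "c": [2, 4, 2, 4],
    "d": [1, 5, 1, 5],
}
groups = quartile_groups(ratings_by_user)
uniform = {r: 0.2 for r in range(1, 6)}
stat = chi_square_gof([1, 1, 5, 5], uniform)
```

A larger statistic means the user's rating distribution departs more from the reference. Turning it into a fit or non-fit decision would additionally require a significance threshold, which the sketch leaves out.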

Setting Time and Strength Characteristics of Cement Mixtures with Set Accelerating Agent for Shotcrete (숏크리트용 급결제를 첨가한 시멘트 모르타르의 응결 및 강도특성)

  • Kim Jin-Cheol;Ryu Jong-Hyun
    • Journal of the Korea Concrete Institute / v.16 no.1 s.79 / pp.70-78 / 2004
  • Although set accelerating agents are generally used in the New Austrian Tunneling Method, standards for their test methods and quality are not prescribed domestically. In this study, the proprieties of the various standards and the characteristics of set accelerating agents for shotcrete were evaluated. The alkali contents of set accelerating agents based on silicate, aluminate, and cement were higher than those of alkali-free ones. From this result, it is thought that the quality control of aggregate should be enhanced and that the number of test cycles for the alkali-aggregate reaction should be increased. The setting times of cement paste with silicate-based and alkali-free set accelerating agents differed considerably depending on the mixing method. The one-day compressive strength of mortar with set accelerating agents based on silicate, aluminate, and cement satisfied the specifications of the Korea Concrete Institute. However, the 28-day strength ratio relative to the control mix was 50 to 65%, except for the alkali-free set accelerating agents. Based on the setting time and strength tests, domestic standards that reflect the characteristics of tunnel materials and construction methods and that can raise the quality of set accelerating agents need to be established immediately.

A Visual Image Perception of Clothing Colors, Color Combinations of Korean Traditional Dress for Women (Part I) (복식색과 색조합의 이미지 지각(제1보) -여자 저고리, 치마를 중심으로 한 준실험 연구 -)

  • 이혜숙;김재숙
    • Journal of the Korean Society of Clothing and Textiles / v.22 no.5 / pp.597-606 / 1998
  • The purposes of the study were 1) to evaluate the visual image of colored Korean traditional dress for women and 2) to analyze the effects of colors and color combinations on image perception using gestalt theory. The research method was quasi-experimental with a between-subjects design. The experimental materials developed for the study were a set of stimuli and a response scale. The stimuli consisted of 17 drawings of females wearing Korean traditional dress, produced by CAD simulation. The response scale consisted of semantic differential scales. The subjects were 1138 undergraduate students from Taejon city, Chungnam province, and Chungbuk province. Their responses to the semantic differential scales were analyzed using factor analysis, one-way ANOVA, Duncan's multiple range test, and the t-test. The results were as follows: 1) The image of the stimuli consisted of 4 different dimensions (sociability, evaluation, visibility, attractiveness). 2) In the mono-color set, clothing colors had significant effects on image perception in the evaluation, visibility, and attractiveness dimensions. Blue showed the most positive image on the evaluation dimension, while yellow and gray showed negative images on the same dimension. Yellow showed the most salient image and gray the least salient image on the visibility dimension. Red showed the most attractive image and green the least attractive image on the attractiveness dimension. 3) In the bi-color set, the perceived image was influenced by color combinations. The yellow blouse and red skirt set showed the most sociable image on the sociability dimension. The blue blouse and green skirt set showed the most positive image on the evaluation dimension. The yellow blouse and red skirt set showed the most salient image, and the blue blouse and green skirt set the least salient image, on the visibility dimension.
The red blouse and yellow skirt set showed the most attractive image on the attractiveness dimension. In conclusion, the visual image of the wearer of Korean traditional dress was affected by dress colors and color combinations.


GSnet: An Integrated Tool for Gene Set Analysis and Visualization

  • Choi, Yoon-Jeong;Woo, Hyun-Goo;Yu, Ung-Sik
    • Genomics & Informatics / v.5 no.3 / pp.133-136 / 2007
  • The Gene Set network viewer (GSnet) visualizes the functional enrichment of a given gene set with a protein interaction network and is implemented as a plug-in for the Cytoscape platform. The functional enrichment of a given gene set is calculated using a hypergeometric test based on the Gene Ontology annotation. The protein interaction network is estimated using public data. Set operations allow a complex protein interaction network to be decomposed into a functionally-enriched module of interest. GSnet provides a new framework for gene set analysis by integrating a priori knowledge of a biological network with functional enrichment analysis.
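The hypergeometric enrichment test mentioned in this abstract can be sketched as a one-sided tail probability: given a population of annotated genes, how likely is it to see at least the observed overlap between a gene set and a Gene Ontology term by chance? The function name and parameter names below are hypothetical, and the sketch says nothing about how GSnet itself implements or corrects the test.

```python
from math import comb

def hypergeom_enrichment_pvalue(population, annotated, sample, overlap):
    """P(X >= overlap) for a hypergeometric draw without replacement.

    population: total number of genes considered
    annotated:  genes in the population carrying the annotation term
    sample:     size of the gene set being tested
    overlap:    genes in the gene set carrying the term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Toy numbers (made up): 10 genes, 4 annotated, a gene set of 3 with 2 annotated.
p_value = hypergeom_enrichment_pvalue(population=10, annotated=4, sample=3, overlap=2)
```

A small p-value suggests the term is over-represented in the gene set. In practice many terms are tested at once, so a multiple-testing correction would normally follow, which this sketch omits.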

Design of Radio Frequency Test Set for TC&R RF Subsystem Verification of LEO and GEO Satellites (저궤도 및 정지궤도위성의 TC&R RF 서브시스템 검증을 위한 RF 시험 장비 설계)

  • Cho, Seung-Won;Lee, Sang-Jeong
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.42 no.8 / pp.674-682 / 2014
  • A Radio Frequency Test Set (RFTS) is essential for verifying the Telemetry, Command & Ranging (TC&R) RF subsystem of both Low Earth Orbit (LEO) and Geostationary Earth Orbit (GEO) satellites during Assembly Integration & Test (AI&T). The existing RFTS was specialized for each project and had to be modified for each new satellite. The new design enables the RFTS to be used in various projects. The hardware and software were designed with this in mind, so the RFTS can be used directly in other projects within a similar test period without modification or inconvenience. It can also be easily controlled, modified, and managed through modularization by function and the use of COTS (commercial off-the-shelf) components, which will improve system reliability. The new RFTS also provides more reliable RF test measurements by using an accurate reference clock signal.

Automatic Test Data Generation for Mutation Testing Using Genetic Algorithms (유전자 알고리즘을 이용한 뮤테이션 테스팅의 테스트 데이터 자동 생성)

  • 정인상;창병모
    • The KIPS Transactions: Part D / v.8D no.1 / pp.81-86 / 2001
  • One key goal of software testing is to generate a 'good' test data set, which is considered the most difficult and time-consuming task. This paper discusses how genetic algorithms can be used to automatically generate test data sets for software testing. We employ mutation testing to show the effectiveness of genetic algorithms (GAs) in automatic test data generation. The approach presented in this paper differs from others in that the test generation process requires no knowledge of the implementation details of the program under test. In addition, we conducted experiments comparing our approach with random testing, which is also regarded as a black-box test generation technique, to show its effectiveness.
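The idea of evolving test data with a genetic algorithm and judging it by mutation testing can be sketched as below. Everything concrete here is an assumption for illustration: the tiny program under test, its four hand-written mutants, the input range, and the GA settings (truncation selection, one-point crossover, a 0.3 mutation rate) are not taken from the paper. The fitness function only compares outputs, so it stays black-box in the sense the abstract describes.

```python
import random

def program(x, y):
    """Hypothetical program under test."""
    if x > 10 and y < 5:
        return x - y
    return x + y

# Hand-written mutants, each with one small change to the program above.
MUTANTS = [
    lambda x, y: x - y if x >= 10 and y < 5 else x + y,  # > replaced by >=
    lambda x, y: x - y if x > 10 and y <= 5 else x + y,  # < replaced by <=
    lambda x, y: x - y if x > 10 or y < 5 else x + y,    # and replaced by or
    lambda x, y: x + y if x > 10 and y < 5 else x + y,   # - replaced by +
]

def mutation_score(suite):
    """Count mutants whose output differs from the program on some input."""
    return sum(
        any(mutant(x, y) != program(x, y) for x, y in suite)
        for mutant in MUTANTS
    )

def evolve_suite(suite_size=3, pop_size=20, generations=30, seed=0):
    """Evolve a list of (x, y) inputs that kills as many mutants as possible."""
    rng = random.Random(seed)

    def rand_input():
        return (rng.randint(0, 20), rng.randint(0, 20))

    population = [
        [rand_input() for _ in range(suite_size)] for _ in range(pop_size)
    ]
    for _ in range(generations):
        population.sort(key=mutation_score, reverse=True)
        if mutation_score(population[0]) == len(MUTANTS):
            break  # every mutant is already killed
        parents = population[: pop_size // 2]  # keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, suite_size - 1)  # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.3:  # replace one input at random
                child[rng.randrange(suite_size)] = rand_input()
            children.append(child)
        population = parents + children
    return max(population, key=mutation_score)

best_suite = evolve_suite()
```

Calling `mutation_score(best_suite)` reports how many of the four mutants the evolved suite kills. A random-testing baseline for comparison would simply draw suites with `rand_input` and skip the selection and crossover steps.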


Modal test/analysis result correlation of folding fin (접는 날개에 대한 모드시험/해석결과 보정)

  • 양해석
    • Journal of KSNVE / v.6 no.3 / pp.305-315 / 1996
  • This paper aims to correlate the modal characteristics of a folding fin between test and analysis using optimization theory. The folding fin is composed of a movable fin, a base fin, and many functional components related to the folding mechanism. The joint parts of the folding fin are initially modeled in the FEM as rigid elements, resulting in some difference between test and analysis in modal characteristics. Therefore, equivalent springs representing the joint parts are introduced to improve the FEM model. The springs were set as design variables, while the frequency difference between test and analysis was set as the objective function. A Bayesian procedure was used for the minimization.
