• Title/Summary/Keyword: Feature selection

Analyzing Machine Learning Techniques for Fault Prediction Using Web Applications

  • Malhotra, Ruchika;Sharma, Anjali
    • Journal of Information Processing Systems / v.14 no.3 / pp.751-770 / 2018
  • Web applications are indispensable in the software industry and evolve continuously, either to meet new criteria or to include new functionality. However, even when quality is assured through testing, defects hinder straightforward development. Several factors contribute to defects, and minimizing them is often costly in man-hours. Detecting fault proneness in the early phases of software development is therefore important, and a fault prediction model that identifies fault-prone classes in a web application is highly desirable. In this work, we compare 14 machine learning techniques to analyse the relationship between object-oriented metrics and fault prediction in web applications. The study uses various releases of the Apache Click and Apache Rave datasets. Before the predictive analysis, the input feature set for each release is first optimized using the filter-based correlation feature selection (CFS) method. The LCOM3, WMC, NPM and DAM metrics are found to be the most significant predictors. Statistical analysis of these metrics agrees well with the CFS evaluation and affirms their role in defect prediction for web applications. The overall predictive ability of the fault prediction models is first ranked using the Friedman test and then compared statistically using Nemenyi post-hoc analysis. The results not only uphold the predictive capability of machine learning models for faulty classes in web applications but also show that ensemble algorithms are the most appropriate for defect prediction on the Apache datasets. Finally, we derive a consensus between the metrics selected by the CFS technique and the statistical analysis of the datasets.
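
The subset score that correlation-based feature selection relies on can be sketched as follows. This is a minimal illustration, assuming absolute Pearson correlation as the association measure (the abstract does not say which measure the authors used); the names `pearson` and `cfs_merit` and the toy data are hypothetical and not taken from the paper.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def cfs_merit(features, target):
    """CFS merit of a feature subset: k * r_cf / sqrt(k + k * (k - 1) * r_ff).

    r_cf is the mean |feature-target correlation| and r_ff is the mean
    |feature-feature correlation|, so relevant but mutually redundant
    features gain nothing over a single copy.
    """
    k = len(features)
    r_cf = sum(abs(pearson(f, target)) for f in features) / k
    pairs = list(combinations(features, 2))
    r_ff = sum(abs(pearson(a, b)) for a, b in pairs) / len(pairs) if pairs else 0.0
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


# Toy data (assumed): one feature that tracks the fault label and one that does not.
target = [0, 1, 0, 1]
relevant = [0, 1, 0, 1]
noise = [0, 0, 1, 1]
print(cfs_merit([relevant], target), cfs_merit([relevant, noise], target))
```

A filter-style search would evaluate this merit over candidate subsets (for example by greedy forward selection) and keep the subset with the highest score.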

Face Feature Extraction for Child Ocular Inspection and Diagnosis of Colics by Crying Analysis (소아 망진을 위한 얼굴 특징 추출 및 영아 산통 진단을 위한 울음소리 분석)

  • Cho Dong-Uk;Kim Bong-Hyun
    • The KIPS Transactions:PartB / v.13B no.2 s.105 / pp.97-104 / 2006
  • There is no efficient way to examine a sick child who cannot describe his or her own symptoms. A doctor's diagnosis therefore depends on inquiries made on the child's behalf, which can lead to misdiagnosis. In this paper, we aim to develop instruments for child ocular inspection and auscultation diagnosis based on the Oriental medicine principle that the vital signals of the five organs and six hollow organs are reflected in the patient's face and voice. By visualizing, objectifying, and quantifying the diagnostic observations that currently rest on the doctor's intuition, we aim to obtain more accurate diagnoses of children's symptoms. As the first stage of the overall system development, this paper develops methods for child ocular inspection, including color correction, YCbCr conversion, face color selection, and extraction of the five sensory organs and of the nose or apex. For child auscultation, the crying characteristics of colic are quantified through pitch, intensity, and formant analysis, which objectifies the doctor's intuition. Finally, experiments are performed to verify the effectiveness of the proposed methods.
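
The YCbCr step mentioned above can be illustrated with a short sketch. It assumes the full-range ITU-R BT.601 coefficients used by JPEG/JFIF; the abstract does not say which YCbCr variant the authors applied, and the function name `rgb_to_ycbcr` is hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr (BT.601 coefficients, assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b               # luma
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b  # blue-difference chroma
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b  # red-difference chroma
    return y, cb, cr


# A neutral grey pixel carries no chroma, so Cb and Cr sit at the midpoint.
print(rgb_to_ycbcr(128, 128, 128))
```

Separating luma from chroma in this way is a common reason to work in YCbCr when selecting face color, since thresholds on Cb and Cr are less sensitive to brightness than thresholds on RGB.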

Simultaneous Optimization Model of Case-Based Reasoning for Effective Customer Relationship Management (효과적인 고객관계관리를 위한 사례기반추론 동시 최적화 모형)

  • Ahn, Hyun-Chul;Kim, Kyoung-Jae;Han, In-Goo
    • Journal of Intelligence and Information Systems / v.11 no.2 / pp.175-195 / 2005
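  • Case-based reasoning (CBR) is a traditional artificial intelligence technique that evaluates the similarity between cases to find similar neighboring cases and uses their outcomes to generate a prediction for a new case. Because it is simple and easy to apply and its model is updated in real time, CBR has recently attracted attention from both academia and practitioners as a tool for customer relationship management in online environments. However, conventional CBR has often been criticized for accuracy that is considerably lower than that of other artificial intelligence techniques. This study therefore proposes a simultaneous optimization model of CBR that uses a genetic algorithm to improve its performance substantially. The proposed model uses a genetic algorithm to optimize both the relative weighting of case features (feature weighting) and the selection of reference cases (instance selection), two factors that previous studies have identified as having a critical effect on CBR performance. The aim is to derive the similarity between cases more precisely while minimizing the influence of erroneous cases that can distort the reasoning results. To validate the usefulness of the proposed model, we applied it to the construction of a purchase prediction model for a specialized Korean online shopping mall and examined its performance. The results confirm that the proposed model outperforms not only other improved CBR models proposed in previous studies but also other artificial intelligence techniques such as logistic regression (LOGIT), multiple discriminant analysis (MDA), artificial neural networks (ANN), and SVM.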
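
The two quantities the genetic algorithm tunes, feature weights and an instance-selection mask, enter CBR retrieval roughly as sketched here. This is a minimal illustration under assumptions: a weighted Euclidean distance and majority voting are assumed, the genetic search itself is omitted, and the names `weighted_distance` and `cbr_predict` and the toy cases are hypothetical.

```python
import math
from collections import Counter


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance; a weight of 0 ignores that feature."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def cbr_predict(query, cases, labels, weights, keep, k=1):
    """Predict a label from the k nearest retained cases.

    `weights` is the feature-weight vector and `keep` is a boolean mask
    over the case base; together they form one candidate solution that
    an optimizer such as a genetic algorithm could evaluate.
    """
    retained = [
        (weighted_distance(query, case, weights), label)
        for case, label, use in zip(cases, labels, keep)
        if use
    ]
    retained.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in retained[:k])
    return votes.most_common(1)[0][0]


# Toy case base (assumed): two stored customers with two features each.
cases = [(0.0, 0.0), (1.0, 9.0)]
labels = ["buy", "skip"]
query = (0.0, 9.0)
print(cbr_predict(query, cases, labels, [1.0, 1.0], [True, True]))
print(cbr_predict(query, cases, labels, [1.0, 0.0], [True, True]))
```

In a simultaneous-optimization setting, the fitness of each (`weights`, `keep`) pair would typically be the prediction accuracy it achieves on held-out cases.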

Development of a software framework for sequential data assimilation and its applications in Japan

  • Noh, Seong-Jin;Tachikawa, Yasuto;Shiiba, Michiharu;Kim, Sun-Min;Yorozu, Kazuaki
    • Proceedings of the Korea Water Resources Association Conference / 2012.05a / pp.39-39 / 2012
  • Data assimilation techniques have received growing attention because of their ability to improve prediction in various areas. Despite their potential, software frameworks applicable to probabilistic approaches and data assimilation are still limited because most hydrologic modelling software is based on a deterministic approach. In this study, we developed a hydrological modelling framework for sequential data assimilation, named MPI-OHyMoS. MPI-OHyMoS allows users to develop their own element models and to easily build a total simulation system model for hydrological simulations. Unlike process-based modelling frameworks, this software framework benefits from its object-oriented design to represent hydrological processes flexibly without any change to the main library. In this framework, sequential data assimilation based on particle filters is available for any hydrologic model, considering various sources of uncertainty originating from input forcing, parameters, and observations. Particle filtering is a Bayesian learning process in which the propagation of all uncertainties is carried out by a suitable selection of randomly generated particles, without any assumptions about the nature of the distributions. In MPI-OHyMoS, ensemble simulations are parallelized and can take advantage of high performance computing (HPC) systems. We applied this software framework to several catchments in Japan using a distributed hydrologic model. The uncertainty of model parameters and radar rainfall estimates is assessed simultaneously in sequential data assimilation.
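
One assimilation cycle of a particle filter can be sketched as below. This is a generic bootstrap (sequential importance resampling) step with multinomial resampling, not the MPI-OHyMoS implementation; the function name `particle_filter_step` and the `transition` and `likelihood` callables are hypothetical placeholders for a hydrologic model and an observation error model.

```python
import random


def particle_filter_step(particles, observation, transition, likelihood, rng):
    """Run one predict-weight-resample cycle and return (particles, weights).

    transition(p, rng) propagates one particle through the (possibly
    stochastic) model; likelihood(observation, p) scores how well the
    propagated particle explains the observation. No distributional form
    is assumed for either.
    """
    predicted = [transition(p, rng) for p in particles]
    weights = [likelihood(observation, p) for p in predicted]
    total = sum(weights)
    if total == 0:
        # Every particle is incompatible with the observation: fall back to uniform.
        weights = [1.0 / len(predicted)] * len(predicted)
    else:
        weights = [w / total for w in weights]
    # Multinomial resampling: draw particles in proportion to their weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, weights


# Toy example (assumed): random-walk state with a simple inverse-distance likelihood.
rng = random.Random(0)
states = [rng.uniform(0.0, 10.0) for _ in range(100)]
states, _ = particle_filter_step(
    states,
    4.0,
    lambda p, r: p + r.gauss(0.0, 0.5),
    lambda obs, p: 1.0 / (1.0 + (obs - p) ** 2),
    rng,
)
print(sum(states) / len(states))
```

Because each particle is propagated independently, the prediction loop is the natural place to parallelize, which matches the ensemble parallelization the abstract describes.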

An Adaptive Method For Face Recognition Based Filters and Selection of Features (필터 및 특징 선택 기반의 적응형 얼굴 인식 방법)

  • Cho, Byoung-Mo;Kim, Gi-Han;Rhee, Phill-Kyu
    • The Journal of the Korea Contents Association / v.9 no.6 / pp.1-8 / 2009
  • Many factors, such as camera location, luminosity, brightness, and the direction of light, affect the performance of 2-dimensional image recognition. This paper proposes an adaptive method for face-image recognition in noisy environments that uses evolvable filtering and feature extraction and requires only one sample image from the camera. The proposed method consists of two main parts. One is the environmental-adjustment module, which determines the optimum sets of filters, filter parameters, and feature dimensions using a steady-state genetic algorithm. The other is the face recognition module, which recognizes face images using the results of the first module. We used Gabor wavelets to extract features from the images and the k-Nearest Neighbor method for classification. We tested the adaptive method under brightness noise, impulse noise, and composite noise and verified that it prevents the rapid decrease in face recognition rate that generally occurs in noisy environments.

An Evaluation and Combination of Noise Reduction Filtering and Edge Detection Filtering for the Feature Element Selection in Stereo Matching (스테레오 정합 특징 요소 선택을 위한 잡음 감소 필터링과 에지 검출 필터링의 성능 평가와 결합)

  • Moon, Chang-Gi;Ye, Chul-Soo
    • Korean Journal of Remote Sensing / v.23 no.4 / pp.273-285 / 2007
  • Most stereo matching methods use intensity values in small image patches to measure the correspondence between two points. If noisy pixels are used in computing the corresponding point, matching performance degrades. Noise therefore plays a critical role in determining matching performance. In this paper, we propose a method that combines noise-robust intensity and edge filters to improve the performance of stereo matching on high-resolution satellite imagery. We used intensity filters such as the Mean, Median, Midpoint, and Gaussian filters and edge filters such as the Gradient, Roberts, Prewitt, Sobel, and Laplacian filters. To evaluate the performance of the intensity and edge filters, experiments were carried out on both synthetic images and satellite images with uniform or Gaussian noise, and each filter was ranked by its performance. Among the intensity and edge filters, the Median and Sobel filters performed best, while the Midpoint and Laplacian filters performed worst. We used Ikonos satellite stereo imagery in the experiments, and the matching method using the Median and Sobel filters showed better matching results than other filter combinations.
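
The best-ranked pair of filters can be illustrated with a small sketch that applies a median filter followed by a Sobel edge filter. It assumes 3x3 windows and leaves border pixels unprocessed; the abstract does not give the window sizes or the exact way the two filter outputs are combined for matching, and the names `median3x3` and `sobel_magnitude` are hypothetical.

```python
import math
from statistics import median


def median3x3(img):
    """3x3 median filter on a list-of-rows image; border pixels are copied unchanged."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = median(window)
    return out


def sobel_magnitude(img):
    """Sobel gradient magnitude; border pixels are left at 0.0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]) - (
                img[y - 1][x - 1] + 2 * img[y][x - 1] + img[y + 1][x - 1]
            )
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]) - (
                img[y - 1][x - 1] + 2 * img[y - 1][x] + img[y - 1][x + 1]
            )
            out[y][x] = math.hypot(gx, gy)
    return out


# Toy image (assumed): a flat patch with one impulse-noise pixel in the centre.
noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 255
denoised = median3x3(noisy)
edges = sobel_magnitude(denoised)
print(denoised[2][2], edges[2][2])
```

Running the edge filter on the denoised image rather than the raw one keeps an isolated noisy pixel from producing a spurious edge response.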

Transformation of 'Ilmibyeo' using pCAMBIA 1300 and Microstructural Investigation of Leaves (pCAMBIA 1300 벡터를 이용한 일미벼의 형질전환 및 잎의 전자현미경적 관찰)

  • Guo, Jia;Seong, Eun-Soo;Kim, Young-Hwa;Jo, Hye-Jeong;Cho, Joon-Hyeong;Wang, Myeong-Hyeon
    • Korean Journal of Plant Resources / v.20 no.5 / pp.437-441 / 2007
  • The argE gene of E. coli was introduced into the 'Ilmibyeo' cultivar of rice by Agrobacterium tumefaciens, and a large number of transgenic plants were produced. Embryogenic calli were co-cultivated with A. tumefaciens strain AGL1 carrying the plasmid pCAMBIA1300, which contains hygromycin resistance (HygR). Transgenic plants showing in vitro resistance to 50 mg/L hygromycin were obtained using a selection procedure. Stable integration of the argE and HPT genes into chromosomal DNA was confirmed by Southern blot analysis and PCR analysis of genomic DNA isolated from $T_0$ progenies. Fragments of 650 bp (HPT) were detected in the transgenic rice lines. The 230 bp (argE) fragments were visible on agarose gel, and the detected fragments matched the size expected for the argE-specific primers. Scanning electron microscopy (SEM) of the leaves revealed differences between clear and chalky types in the shape and arrangement of stomata, but these differences did not discriminate between them.

Deep Learning-based Approach for Classification of Tribological Time Series Data for Hand Creams (딥러닝을 이용한 핸드크림의 마찰 시계열 데이터 분류)

  • Kim, Ji Won;Lee, You Min;Han, Shawn;Kim, Kyeongtaek
    • Journal of Korean Society of Industrial and Systems Engineering / v.44 no.3 / pp.98-105 / 2021
  • Until a decade ago, the sensory stimulation of a cosmetic product was deemed an ancillary aspect. That view has changed drastically on several levels within a single decade. Cosmetic formulators must now meet the needs of consumers who want sensory satisfaction, even though they have little time for new product development. The selection of new products from candidate products largely depends on a panel of human sensory experts. As the new product development cycle shortens, formulators need systematic tools to filter candidate products into a short list. Traditional statistical analysis of most physical property tests for these products, including tribology and rheology tests, does not provide a sound foundation for filtering candidates. In this paper, we propose a deep learning-based analysis method that identifies hand cream products from the raw electric signals of a tribological sliding test. We compare the results of the deep learning-based method, which uses raw data as input, with the results of several machine learning-based analysis methods that use manually extracted features as input. Among them, ResNet, a deep learning model, proved to be the best method for identifying the hand cream used in the test. According to our search of the published literature, this is the first attempt to identify a tested cosmetic product from raw time-series friction data alone, without any manual feature extraction. The ability to identify products automatically without manually extracted features can be used to narrow down the list of newly developed candidate products.

A Selection Method of Backbone Network through Multi-Classification Deep Neural Network Evaluation of Road Surface Damage Images (도로 노면 파손 영상의 다중 분류 심층 신경망 평가를 통한 Backbone Network 선정 기법)

  • Shim, Seungbo;Song, Young Eun
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.18 no.3 / pp.106-118 / 2019
  • In recent years, research and development on image object recognition using artificial intelligence has been actively carried out, and the technology is expected to be used for road maintenance. In particular, artificial intelligence models for detecting objects on road surfaces are continuously being introduced. Developing such object recognition algorithms requires a backbone network that extracts feature maps. In this paper, we discuss how to select an appropriate neural network. To this end, we compared four different deep neural networks using 6,000 road surface damage images. Based on three evaluation methods for analyzing the characteristics of neural networks, we propose a method for determining the optimal network. In addition, we improved performance through tuning of hyper-parameters and finally developed a light backbone network that achieves 85.9% accuracy in road surface damage classification.

Hyperspectral imaging technique to evaluate the firmness and the sweetness index of tomatoes

  • Rahman, Anisur;Park, Eunsoo;Bae, Hyungjin;Cho, Byoung-Kwan
    • Korean Journal of Agricultural Science / v.45 no.4 / pp.823-837 / 2018
  • The objective of this study was to evaluate the firmness and the sweetness index (SI) of tomatoes with a hyperspectral imaging (HSI) technique within the wavelength range of 1000 - 1550 nm. Hyperspectral images of 95 tomatoes were acquired with a push-broom hyperspectral reflectance imaging system, and the mean spectrum of each tomato was extracted from its regions of interest. The reference firmness and sweetness index of the same samples were measured and calibrated against the corresponding spectral data by partial least squares (PLS) regression with different preprocessing methods. The calibration model developed by PLS regression on the Savitzky-Golay second-derivative preprocessed spectra performed better for both the firmness and the SI of the tomatoes than models developed with other preprocessing methods. The correlation coefficients ($R_{pred}$) were 0.82 and 0.74, with standard errors of prediction of 0.86 N and 0.63, respectively. The feature wavelengths were then identified from the PLS regression analyses using a model-based variable selection method, namely variable importance in projection. Finally, chemical images were derived by applying the respective regression coefficients to the spectral image in a pixel-wise manner. The resulting chemical images provided detailed information on the firmness and the SI of the tomatoes. The results show that the proposed HSI technique has potential for rapid and non-destructive evaluation of the firmness and the sweetness index of tomatoes.
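
The Savitzky-Golay second-derivative preprocessing named above can be sketched with fixed convolution coefficients. This example assumes a 5-point window with a quadratic fit, for which the standard coefficients are (2, -1, -2, -1, 2) / 7; the abstract does not state the window length or polynomial order the authors used, and the name `sg_second_derivative` is hypothetical.

```python
# 5-point quadratic Savitzky-Golay second-derivative coefficients (to be divided by 7).
SG_D2_5PT = (2, -1, -2, -1, 2)


def sg_second_derivative(spectrum, spacing=1.0):
    """Second derivative of an evenly sampled spectrum.

    Returns values for the interior points only, so the result is four
    samples shorter than the input. `spacing` is the wavelength step.
    """
    out = []
    for i in range(2, len(spectrum) - 2):
        acc = sum(c * spectrum[i + k] for c, k in zip(SG_D2_5PT, range(-2, 3)))
        out.append(acc / (7.0 * spacing ** 2))
    return out


# Toy spectrum (assumed): a sloped linear baseline under a quadratic band.
baseline_only = [3 * x + 5 for x in range(8)]
with_band = [3 * x + 5 + x * x for x in range(8)]
print(sg_second_derivative(baseline_only))
print(sg_second_derivative(with_band))
```

The second derivative removes constant and linear baseline terms from each spectrum, which is a common motivation for applying it before PLS regression.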