• Title/Summary/Keyword: outliers

Robust Speaker Recognition Against Utterance Variations (발성변화에 강인한 화자 인식에 관한 연구)

  • Lee Ki-Yong
    • Journal of Internet Computing and Services / v.5 no.2 / pp.69-73 / 2004
  • A speaker model in a speaker recognition system should be trained on a large data set gathered over multiple sessions. A large data set requires a large amount of memory and computation, and moreover it is practically hard to make users record utterances in several sessions. Incremental adaptation methods have recently been proposed to address these problems. However, a data set gathered over multiple sessions is vulnerable to outliers caused by irregular utterance variations and the presence of noise, which result in an inaccurate speaker model. In this paper, we propose an incremental robust adaptation method that minimizes the influence of outliers on a Gaussian Mixture Model (GMM) based speaker model. The robust adaptation is obtained from an incremental version of M-estimation. The speaker model is initially trained on a small amount of data and is then adapted recursively with the data available in each session. Experimental results on a data set gathered over seven months show that the proposed method is robust against outliers.
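
The idea of recursive, outlier-resistant adaptation can be illustrated with a minimal sketch. It is not the paper's GMM update: it assumes a one-dimensional mean, a fixed scale, and Huber weights as the M-estimator, and all names (`huber_weight`, `IncrementalRobustMean`, `adapt`) are hypothetical.

```python
def huber_weight(residual, scale, k=1.345):
    """Huber weight: 1 inside the band |r| <= k*scale, k*scale/|r| outside."""
    r = abs(residual)
    limit = k * scale
    return 1.0 if r <= limit else limit / r


class IncrementalRobustMean:
    """Recursively updated, Huber-weighted mean (illustrative only)."""

    def __init__(self, initial_mean, scale):
        self.mean = initial_mean
        self.scale = scale          # assumed known and fixed in this sketch
        self.weight_sum = 1.0       # initial estimate counts as one observation

    def update(self, x):
        # Down-weight observations that lie far from the current estimate.
        w = huber_weight(x - self.mean, self.scale)
        self.weight_sum += w
        self.mean += (w / self.weight_sum) * (x - self.mean)
        return self.mean


def adapt(session_batches, initial_mean, scale):
    """Adapt the estimate session by session without storing old data."""
    estimator = IncrementalRobustMean(initial_mean, scale)
    for batch in session_batches:
        for x in batch:
            estimator.update(x)
    return estimator.mean


# Hypothetical data: the second session contains one gross outlier (50.0).
sessions = [[1.0, 1.2, 0.8], [1.1, 50.0, 0.9]]
robust = adapt(sessions, initial_mean=1.0, scale=0.5)
plain = sum(x for batch in sessions for x in batch) / 6
```

Here the plain average is pulled far from 1 by the single outlier, while the weighted recursive estimate stays close to it, because each update moves the estimate by at most `k * scale / weight_sum`.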

Application of Discrete Wavelet Transforms to Identify Unknown Attacks in Anomaly Detection Analysis (이상 탐지 분석에서 알려지지 않는 공격을 식별하기 위한 이산 웨이블릿 변환 적용 연구)

  • Kim, Dong-Wook;Shin, Gun-Yoon;Yun, Ji-Young;Kim, Sang-Soo;Han, Myung-Mook
    • Journal of Internet Computing and Services / v.22 no.3 / pp.45-52 / 2021
  • Many studies have sought to identify unknown attacks in cyber security intrusion detection systems, and approaches based on outliers are attracting attention. Accordingly, we identify outliers by defining categories of unknown attacks. Unknown attacks were investigated in two categories: first, studies of the factors that generate variant attacks, and second, studies that classify attacks into new types. Within the first category, we conducted outlier studies that can identify similar data, such as variants. A major problem in identifying anomalies in an intrusion detection system is that normal and attack behavior share the same space. To address this, we applied the discrete wavelet transform to separate normal and attack data into clearly distinct types, and then detected anomalies. As a result, we confirmed that outliers can be identified with a One-Class SVM in the data reconstructed by the discrete wavelet transform.
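
As a rough illustration of why a wavelet transform helps separate abrupt deviations from smooth behavior, the sketch below computes a single-level Haar transform in plain Python. It does not reproduce the paper's pipeline: the One-Class SVM step is replaced by a simple median-absolute-deviation threshold on the detail coefficients, and the function names, threshold, and data are assumptions.

```python
import math
import statistics


def haar_dwt(signal):
    """Single-level Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def flag_anomalous_pairs(signal, threshold=5.0):
    """Indices of sample pairs whose detail coefficient is unusually large.

    Stand-in for a learned detector: distance from the median detail
    coefficient, measured in units of the median absolute deviation (MAD).
    """
    _, detail = haar_dwt(signal)
    med = statistics.median(detail)
    mad = statistics.median(abs(d - med) for d in detail) or 1e-12
    return [i for i, d in enumerate(detail) if abs(d - med) / mad > threshold]


# Hypothetical traffic feature with one abrupt spike at sample 7.
signal = [1.0, 1.1, 0.9, 1.0, 1.2, 1.1, 1.0, 9.0, 1.0, 0.9, 1.1, 1.0]
flagged = flag_anomalous_pairs(signal)
```

The spike falls in the fourth pair (samples 6 and 7), so that pair's detail coefficient stands out while the approximation coefficients carry the smooth trend.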

A Study on the Optimization of a Contracted Power Prediction Model for Convenience Store using XGBoost Regression (XGBoost 회귀를 활용한 편의점 계약전력 예측 모델의 최적화에 대한 연구)

  • Kim, Sang Min;Park, Chankwon;Lee, Ji-Eun
    • Journal of Information Technology Services / v.21 no.4 / pp.91-103 / 2022
  • This study proposes a model for predicting contracted power using electric power data collected in real time from convenience stores nationwide. Optimizing the prediction model with machine learning makes it possible to predict the contracted power required when an existing convenience store renews its contract. Contracted power is predicted with an XGBoost regression model. To train the XGBoost model, we used electric power data collected over 16 months through a real-time monitoring system for convenience stores nationwide. The hyperparameters of the XGBoost model were tuned using GridSearchCV, and the main features of the prediction model were identified using the xgb.importance function. We also examined whether the preprocessing method for missing values and outliers affects the prediction of contracted power. Hyperparameter tuning yielded an optimal model with improved predictive performance. The features power.2020.09, power.2021.02, area, and operating time were found to affect the prediction of contracted power. The analysis showed that the preprocessing policy for missing values and outliers did not affect the prediction result. The proposed XGBoost regression model showed high predictive performance for contracted power. Even when the preprocessing method for missing values and outliers was changed, there was no significant difference in the prediction results after hyperparameter tuning.

Cultural Tunneling Effect: Conceptual adoption & Application in movie industry

  • Roh, Seungkook
    • Asia Marketing Journal / v.16 no.3 / pp.77-100 / 2014
  • Many researchers have analyzed the relationship between the financial success patterns of a motion picture and many other factors, such as the production cost, marketing, stars, awards, reviews, genre, and rating. Through these studies, many researchers and investors concluded that a big budget for a blockbuster movie can serve as an insurance policy for meeting their ROI; thus the box office is dominated by blockbuster movies. High-budget blockbuster movies are more likely to receive attention because they are more recognizable given their high spending on production and casting. Therefore, audiences choose blockbusters in an effort to reduce search costs and to mitigate the possibility of a regrettable choice. This consumer behavior, in turn, causes distributors to allocate screens to blockbusters, resulting in a "concentration of blockbuster consumption." As such, low-budget films cannot easily become popular due to the lack of distribution. Indeed, low-budget films released on a small number of screens often end up as dismal failures. However, there are exceptional examples that run contrary to the general idea in the movie industry that a big budget and showings on a large number of screens can guarantee the success of a movie. Although researchers have attempted to analyze the performance of movies with small budgets, such movies are likely to be regarded as outliers and then entirely discarded, as they lie far outside the 'three-sigma' range, especially given that previous research methodologies could not explain the financial success of such unique examples. This study attempts to explain the box-office success of low-budget movies by applying the concept of the tunnel effect in quantum mechanics, as the phenomenon found in the movie industry is similar to a particle's movement in quantum physics. The tunneling effect is a phenomenon by which a particle without enough energy to pass over a potential barrier tunnels through it. Adopting the analogy, this study derives a tunneling probability function and a cultural constant from the Schrödinger equation to forecast other outliers. Moreover, the study finds that word-of-mouth drives this outlier phenomenon in the movie industry.
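
For reference, the physical result behind the analogy can be stated briefly. The abstract does not give the paper's own tunneling probability function or its cultural constant, so only the standard textbook approximation is shown: for a particle of mass m and energy E meeting a rectangular barrier of height V_0 > E and width L, the transmission probability for a wide or high barrier (κL ≫ 1) is approximately

```latex
T \approx e^{-2\kappa L}, \qquad \kappa = \frac{\sqrt{2m\,(V_0 - E)}}{\hbar}
```

The probability is small but nonzero, and it decays exponentially with the barrier width and with the square root of the energy deficit V_0 - E.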

A Study on the Applicability of Machine Learning Algorithms for Detecting Hydraulic Outliers in a Borehole (시추공 수리 이상점 탐지를 위한 기계학습 알고리즘의 적용성 연구)

  • Seungbeom Choi;Kyung-Woo Park;Changsoo Lee
    • Tunnel and Underground Space / v.33 no.6 / pp.561-573 / 2023
  • The Korea Atomic Energy Research Institute (KAERI) constructed the KURT (KAERI Underground Research Tunnel) to analyze the hydrogeological and geochemical characteristics of deep rock mass. Numerous boreholes have been drilled to conduct various field tests. Selecting suitable investigation intervals within a borehole is of great importance. When the objectives center on hydraulic flow and groundwater sampling, intervals with sufficient groundwater flow are the most suitable. This study defines such points as hydraulic outliers and aims to detect them using borehole geophysical logging data (temperature and EC) from a 1 km deep borehole. For systematic and efficient outlier detection, machine learning algorithms (DBSCAN, OCSVM, kNN, and isolation forest) were applied and their applicability was assessed. Following data preprocessing and algorithm optimization, the four algorithms detected 55, 12, 52, and 68 outliers, respectively. Although this study confirms the applicability of the machine learning algorithms, further verification and supplementary work are desirable because the input data were relatively limited.
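
Of the four algorithms named above, the kNN approach is the simplest to sketch. The version below scores each point by the distance to its k-th nearest neighbor and flags scores well above the median; the cutoff rule, the function names, and the (temperature, EC) readings are all assumptions, not values from the study.

```python
import math
import statistics


def knn_outlier_scores(points, k=2):
    """Score each point by the distance to its k-th nearest neighbor."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


def flag_outliers(points, k=2, factor=3.0):
    """Flag points whose kNN score exceeds `factor` times the median score."""
    scores = knn_outlier_scores(points, k)
    cutoff = factor * statistics.median(scores)
    return [i for i, s in enumerate(scores) if s > cutoff]


# Hypothetical (temperature, EC) logging readings; the last one is anomalous.
readings = [
    (10.0, 200.0), (10.1, 201.0), (10.2, 199.0),
    (10.1, 200.5), (10.3, 200.0), (12.5, 260.0),
]
outlier_indices = flag_outliers(readings)
```

In practice the two variables would usually be standardized first, since EC dominates the Euclidean distance at these scales.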

Development of Statistical System for Checking Multivariate Normality and Outliers (다변량 정규성과 이상치 검정을 위한 통계 시스템 개발)

  • 최용석;김종건;강명래
    • The Korean Journal of Applied Statistics / v.14 no.2 / pp.223-231 / 2001
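  • Multivariate analysis techniques require the data to satisfy the normality assumption. In this study, we introduce a system, built in Visual Basic, that performs normality tests, outlier removal, and variable transformation for univariate and multivariate data in a GUI environment, so that users can carry out these tasks more conveniently.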

INFLUENCE ANALYSIS OF CHOLESKY DECOMPOSITION

  • Kim, Myung-Geun
    • Journal of applied mathematics & informatics / v.28 no.3_4 / pp.913-921 / 2010
  • The derivative influence measure is adapted to the Cholesky decomposition of a covariance matrix. Formulas for the derivative influence of observations on the Cholesky root and the inverse Cholesky root of a sample covariance matrix are derived. It is easy to implement this influence diagnostic method for practical use. A numerical example is given for illustration.

Robust Estimation and Outlier Detection

  • Myung Geun Kim
    • Communications for Statistical Applications and Methods / v.1 no.1 / pp.33-40 / 1994
  • The conditional expectation of a random variable in a multivariate normal random vector is a multiple linear regression on its predecessors. Using this fact, the least median of squares estimation method developed for multiple linear regression is adapted to multivariate data to identify influential observations. The resulting method clearly detects outliers and avoids the masking effect.
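
The least median of squares (LMS) idea can be sketched for the simplest case of a straight-line fit. The paper's multivariate adaptation is not reproduced here: the sketch approximates the LMS fit by trying every line through two data points, and the 2.5-sigma flagging rule, the scale estimate, and the data are assumptions.

```python
import math
import statistics
from itertools import combinations


def lms_line(xs, ys):
    """Approximate LMS fit: the two-point line with the smallest median squared residual."""
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # vertical line, no slope
        slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
        intercept = ys[i] - slope * xs[i]
        med = statistics.median(
            (y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)
        )
        if best is None or med < best[0]:
            best = (med, slope, intercept)
    med, slope, intercept = best
    return slope, intercept, med


def lms_outliers(xs, ys, cutoff=2.5):
    """Flag points whose residual exceeds `cutoff` robust standard deviations."""
    slope, intercept, med = lms_line(xs, ys)
    scale = 1.4826 * math.sqrt(med)  # robust scale from the median squared residual
    return [
        i for i, (x, y) in enumerate(zip(xs, ys))
        if abs(y - (slope * x + intercept)) > cutoff * scale
    ]


# Hypothetical data: roughly y = 2x + 1, with two gross outliers at the end.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [1.0, 3.1, 4.9, 7.0, 9.1, 10.9, 30.0, 35.0]
slope, intercept, _ = lms_line(xs, ys)
flagged = lms_outliers(xs, ys)
```

Because the criterion is a median rather than a sum, the two outliers cannot pull the fitted line toward themselves, which is what lets the fit expose them together instead of one masking the other.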

Stereo Vision-based Visual Odometry Using Robust Visual Feature in Dynamic Environment (동적 환경에서 강인한 영상특징을 이용한 스테레오 비전 기반의 비주얼 오도메트리)

  • Jung, Sang-Jun;Song, Jae-Bok;Kang, Sin-Cheon
    • The Journal of Korea Robotics Society / v.3 no.4 / pp.263-269 / 2008
  • Visual odometry is a popular approach to estimating robot motion using a monocular or stereo camera. This paper proposes a novel visual odometry scheme that uses a stereo camera for robust estimation of 6-DOF motion in a dynamic environment. False feature matches and the uncertainty of the depth information provided by the camera can generate outliers that degrade the estimation. The outliers are removed by analyzing the magnitude histogram of the motion vectors of the corresponding features and by applying the RANSAC algorithm. Features extracted from a dynamic object such as a human also make the motion estimation inaccurate. To eliminate the effect of dynamic objects, several candidate dynamic objects are generated by clustering the 3D positions of the features, and each candidate is checked, based on the standard deviation of its features, to determine whether it is a real dynamic object. The accuracy and practicality of the proposed scheme are verified by several experiments and by comparisons with both IMU and wheel-based odometry. The proposed scheme is shown to work well when wheel slip occurs or dynamic objects exist.
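
The RANSAC step mentioned above can be illustrated on a much smaller problem than 6-DOF stereo motion. The sketch below assumes the motion is a pure 2-D translation between matched feature positions, so a single match is a minimal sample; the function name, threshold, and feature coordinates are hypothetical.

```python
import math
import random


def ransac_translation(src, dst, threshold=0.5, iterations=50, seed=0):
    """Estimate a 2-D translation from matched points while rejecting bad matches."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        # Minimal sample: one match fully determines a translation.
        i = rng.randrange(len(src))
        tx = dst[i][0] - src[i][0]
        ty = dst[i][1] - src[i][1]
        # Consensus set: matches whose motion agrees with the hypothesis.
        inliers = [
            k for k, (s, d) in enumerate(zip(src, dst))
            if math.hypot(d[0] - s[0] - tx, d[1] - s[1] - ty) <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit on the largest consensus set by averaging its motion vectors.
    n = len(best_inliers)
    tx = sum(dst[k][0] - src[k][0] for k in best_inliers) / n
    ty = sum(dst[k][1] - src[k][1] for k in best_inliers) / n
    return (tx, ty), best_inliers


# Hypothetical matches: true motion is about (2, 1); the last match is false.
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 2.0), (3.0, 1.0), (5.0, 5.0)]
dst = [(2.0, 1.0), (3.1, 1.0), (2.0, 1.9), (4.0, 3.1), (4.9, 2.0), (0.0, 9.0)]
(tx, ty), inliers = ransac_translation(src, dst)
```

The false match never gathers more than itself as a consensus set, so it is excluded from the final average.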
