• Title/Summary/Keyword: Entropy threshold

Analysis of the Number of Ratings and the Performance of Collaborative Filtering (사용자의 평가 횟수와 협동적 필터링 성과간의 관계 분석)

  • Lee, Hong-Ju;Kim, Jong-U;Park, Seong-Ju
    • Proceedings of the Korean Operations and Management Science Society Conference / 2005.05a / pp.629-638 / 2005
  • In this paper, we consider two issues in collaborative filtering that are closely related to the number of ratings a user has provided. The first issue is the relationship between a user's number of ratings and the performance of collaborative filtering. We investigate this relationship with two datasets, EachMovie and MovieLens. The number of ratings is critical when it is small, but once it exceeds a certain threshold, its influence on recommendation performance diminishes. We also explain this relationship in terms of neighborhood formation in collaborative filtering. The second issue is how to select an initial product list to present to new users in order to elicit their responses. We propose and analyze 14 selection strategies, which include popularity, favorite, clustering, genre, and entropy methods. Popularity methods are suited to obtaining a larger number of ratings from users, and favorite methods are suited to obtaining higher average preference ratings.
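
The abstract does not say how its entropy methods are defined. As a minimal sketch, assuming an entropy strategy ranks items by the Shannon entropy of their rating distribution (so that items with divided opinions are shown to new users first), the selection could look like this. The function names and the toy ratings are hypothetical.

```python
import math
from collections import Counter


def rating_entropy(ratings):
    """Shannon entropy (in bits) of the empirical rating distribution of one item."""
    counts = Counter(ratings)
    total = len(ratings)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def select_by_entropy(item_ratings, k):
    """Return the k item ids whose ratings are most dispersed (highest entropy)."""
    ranked = sorted(
        item_ratings,
        key=lambda item: rating_entropy(item_ratings[item]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy data: item id -> list of 1-5 star ratings.
item_ratings = {
    "A": [5, 5, 5, 5],  # everyone agrees: entropy 0 bits
    "B": [1, 5, 1, 5],  # two equally likely ratings: entropy 1 bit
    "C": [1, 2, 4, 5],  # four equally likely ratings: entropy 2 bits
}
print(select_by_entropy(item_ratings, 2))  # ['C', 'B']
```

A popularity strategy would instead sort by `len(item_ratings[item])`, which is consistent with the abstract's finding that popularity methods collect more ratings.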

Split Password-Based Authenticated Key Exchange (분할된 패스워드 기반 인증된 키교환 프로토콜)

  • 류종호;염흥열
    • Journal of the Korea Institute of Information Security & Cryptology / v.14 no.5 / pp.23-36 / 2004
  • This paper presents a password-based authentication and key exchange protocol that can be used both to authenticate users and to exchange session keys for subsequent secure communication over an untrusted network. Our idea is to increase the randomness of the password verification data: we split the password and then amplify the split passwords into password verification data with a high-entropy structure. To prevent the verifier-compromise attack, we construct the system so that the password verification data is encrypted with the verifier's key, and the verifier's private key used for this encryption is stored in a secure place such as a smart card. We also propose a distributed password authentication scheme that uses multiple authentication servers to prevent the server-compromise attack that can occur when only one server is used. Finally, a security analysis of the proposed protocol is presented.

Fuzzy discretization with spatial distribution of data and Its application to feature selection (데이터의 공간적 분포를 고려한 퍼지 이산화와 특징선택에의 응용)

  • Son, Chang-Sik;Shin, A-Mi;Lee, In-Hee;Park, Hee-Joon;Park, Hyoung-Seob;Kim, Yoon-Nyun
    • Journal of the Korean Institute of Intelligent Systems / v.20 no.2 / pp.165-172 / 2010
  • In clinical data mining, choosing the optimal subset of features is important, not only to reduce computational complexity but also to improve the usefulness of the model constructed from the given data. Moreover, the threshold values (i.e., cut-off points) of the selected features are used by experts as clinical decision criteria for the differential diagnosis of diseases. In this paper, we propose a fuzzy discretization approach based on the spatial distribution of data with continuous attributes; it is evaluated by measuring the degree of separation of redundant attribute values in the overlapping region. The weighted average of the redundant attribute values is then used to determine the threshold value for each feature, and rough set theory is used to select a subset of relevant features from the full feature set. To verify the validity of the proposed method, we compared experimental results on a classification problem involving 668 patients with a chief complaint of dyspnea, using three existing discretization methods (equal-width, equal-frequency, and entropy-based) and the proposed method. The experimental results confirm that discretization methods with fuzzy partitions give better results than those with hard partitions on two evaluation measures, average classification accuracy and G-mean.
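
For orientation, the sketch below shows one of the hard-partition baselines named in the abstract, an entropy-based cut point, together with the G-mean measure. It is not the proposed fuzzy method. It assumes a single binary cut per feature and the usual binary definition of G-mean as the geometric mean of sensitivity and specificity; the abstract does not confirm either choice, and the data are hypothetical.

```python
import math


def class_entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    ent = 0.0
    for c in set(labels):
        p = labels.count(c) / n
        ent -= p * math.log2(p)
    return ent


def best_entropy_cut(values, labels):
    """Return the cut point that minimises the weighted class entropy of the two sides."""
    pairs = sorted(zip(values, labels))
    best_cut, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary between identical values
        cut = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * class_entropy(left)
                 + len(right) * class_entropy(right)) / len(pairs)
        if score < best_score:
            best_cut, best_score = cut, score
    return best_cut


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity (both classes must be present)."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    return math.sqrt((tp / (tp + fn)) * (tn / (tn + fp)))


# Hypothetical feature values and diagnoses (1 = disease present).
values = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
labels = [0, 0, 0, 1, 1, 1]
cut = best_entropy_cut(values, labels)
predictions = [int(v > cut) for v in values]
print(cut, g_mean(labels, predictions))
```

G-mean penalises a classifier that does well on only one class, which is why it is often reported alongside accuracy on imbalanced clinical data.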

Motion Vector Coding Using Adaptive Motion Resolution (적응적인 움직임 벡터 해상도를 이용한 움직임 벡터 부호화 방법)

  • Jang, Myung-Hun;Seo, Chan-Won;Han, Jong-Ki
    • Journal of Broadcast Engineering / v.17 no.1 / pp.165-178 / 2012
  • In most conventional video codecs, such as MPEG-2 and MPEG-4, inter coding is performed with a fixed motion vector (MV) resolution. The KTA software made it possible to select the MV resolution for each slice. Although the KTA codec supports a variety of resolutions for motion estimation (ME), the selected resolution is applied to all pixels in the slice, and the statistical properties of local areas are not considered. In this paper, we propose an adaptive decision scheme for MV resolution that depends on the region: the MV search area is divided into multiple regions according to the distance from the predicted motion vector (PMV), and each region is assigned its own resolution, which is used to estimate the MV in that region. The efficiency of the proposed scheme depends on the threshold values used to divide the search area and on the entropy coding method used to encode the estimated MV. Simulation results with HM3.0, the reference software of HEVC, show that the proposed scheme provides bit rate gains of 0.9%, 0.6%, and 2.9% in the Random Access, Low Delay with B picture, and Low Delay with P picture structures, respectively.
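
The region rule can be illustrated with a short sketch. The thresholds, the resolutions, the Chebyshev distance, and the choice of finer resolution near the PMV are all assumptions made for illustration; the abstract does not give the values or the distance measure used in the paper.

```python
def mv_resolution(mv, pmv, thresholds=(2.0, 8.0), resolutions=(0.25, 0.5, 1.0)):
    """Pick the MV resolution (in pixels) from the distance between mv and the PMV.

    thresholds and resolutions are hypothetical: quarter-pel within 2 pixels
    of the PMV, half-pel within 8 pixels, integer-pel beyond that.
    """
    # Chebyshev distance is an assumption; the paper may use another measure.
    dist = max(abs(mv[0] - pmv[0]), abs(mv[1] - pmv[1]))
    for limit, resolution in zip(thresholds, resolutions):
        if dist <= limit:
            return resolution
    return resolutions[-1]


def quantize_mv(mv, pmv):
    """Snap mv to the grid of its region, measured as an offset from the PMV."""
    step = mv_resolution(mv, pmv)
    return tuple(p + round((c - p) / step) * step for c, p in zip(mv, pmv))


pmv = (0.0, 0.0)
print(mv_resolution((1.3, 0.2), pmv))   # 0.25
print(mv_resolution((5.0, 0.0), pmv))   # 0.5
print(mv_resolution((20.0, 3.0), pmv))  # 1.0
print(quantize_mv((1.3, 0.2), pmv))     # (1.25, 0.25)
```

Coarser grids far from the PMV mean fewer candidate positions and shorter codewords for large MV differences, which is where the choice of thresholds interacts with the entropy coder.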

A study on end-to-end speaker diarization system using single-label classification (단일 레이블 분류를 이용한 종단 간 화자 분할 시스템 성능 향상에 관한 연구)

  • Jaehee Jung;Wooil Kim
    • The Journal of the Acoustical Society of Korea / v.42 no.6 / pp.536-543 / 2023
  • Speaker diarization, which labels "who spoke when?" in speech with multiple speakers, has been studied using deep neural network-based end-to-end methods to handle labeling of overlapped speech and to optimize diarization models. Most deep neural network-based end-to-end speaker diarization systems solve a multi-label classification problem that predicts the labels of all speakers active in each frame of speech. However, the performance of a multi-label model varies greatly depending on the threshold setting. In this paper, we study a speaker diarization system that uses single-label classification so that diarization can be performed without thresholds. The proposed model converts the speaker labels into a single label and estimates that label from the model output. To account for speaker label permutations during training, the proposed model uses a combination of Permutation Invariant Training (PIT) loss and cross-entropy loss. In addition, we study how to add residual connection structures to the model so that deep diarization models can be trained effectively. The experiments used simulated noisy two-speaker data generated from the LibriSpeech database. Compared with the baseline model in terms of Diarization Error Rate (DER), the proposed method can label without a threshold and improves performance by about 20.7%.
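
One way to turn per-speaker activity into a single label is a power-set encoding, in which every combination of active speakers becomes its own class and decoding is a plain argmax. The abstract does not specify the paper's mapping, so the encoding and the function names below are assumptions.

```python
from itertools import product


def build_label_map(num_speakers):
    """Give every speaker-activity pattern (e.g. (1, 0)) its own class index."""
    patterns = product((0, 1), repeat=num_speakers)
    return {pattern: index for index, pattern in enumerate(patterns)}


def to_single_label(frame_activity, label_map):
    """Convert one frame's multi-label target into a single class index."""
    return label_map[tuple(frame_activity)]


def decode(class_scores, label_map):
    """Pick the highest-scoring class and map it back to per-speaker activity."""
    inverse = {index: pattern for pattern, index in label_map.items()}
    best = max(range(len(class_scores)), key=class_scores.__getitem__)
    return inverse[best]


label_map = build_label_map(2)
# Classes for two speakers: (0,0)->0 silence, (0,1)->1, (1,0)->2, (1,1)->3 overlap.
print(to_single_label([1, 1], label_map))      # 3
print(decode([0.1, 0.2, 0.6, 0.1], label_map))  # (1, 0)
```

Because the decision is an argmax over mutually exclusive classes, no per-speaker threshold has to be tuned, at the cost of a class count that grows as 2^N with the number of speakers.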

Atrial Fibrillation Detection Algorithm through Non-Linear Analysis of Irregular RR Interval Rhythm (불규칙 RR 간격 리듬의 비선형적 특성 분석을 통한 심방세동 검출 알고리즘)

  • Cho, Ik-Sung;Kwon, Hyeog-Soong
    • Journal of the Korea Institute of Information and Communication Engineering / v.15 no.12 / pp.2655-2663 / 2011
  • Several algorithms have been developed to detect atrial fibrillation (AF); they rely either on the form of P waves or on time-frequency domain analysis of RR variability. However, locating the P wave fiducial point is very difficult because of the low amplitude of the P wave and its corruption by noise. In addition, time-frequency domain analysis of RR variability has the disadvantage that it does not capture the details of irregular RR interval rhythm. In this study, we describe an AF detection algorithm based on non-linear analysis of irregular RR interval rhythm in terms of its variability, randomness, and complexity. We employ new statistical techniques, namely root mean square of successive differences (RMSSD), turning points ratio (TPR), and sample entropy (SpEn). The detection algorithm was tested using the optimal threshold on two databases, the MIT-BIH Atrial Fibrillation Database and the Arrhythmia Database. It achieved a sensitivity (Se) of 94.5% and a specificity (Sp) of 96.2% on the former, and an Se of 89.8% and an Sp of 89.62% on the latter.
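
The three statistics can be sketched with their textbook definitions. The embedding dimension `m=2`, the absolute tolerance `r=0.2` (in the same units as the RR intervals), and the toy RR series are assumptions; the abstract does not give the paper's parameters or its decision thresholds.

```python
import math


def rmssd(rr):
    """Root mean square of successive RR-interval differences (variability)."""
    diffs = [b - a for a, b in zip(rr, rr[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def turning_point_ratio(rr):
    """Fraction of interior points that are local maxima or minima (randomness)."""
    turns = sum(
        1
        for a, b, c in zip(rr, rr[1:], rr[2:])
        if (b > a and b > c) or (b < a and b < c)
    )
    return turns / (len(rr) - 2)


def sample_entropy(rr, m=2, r=0.2):
    """SampEn = -ln(A / B): B counts template matches of length m, A of length m + 1."""
    n = len(rr)

    def count_matches(length):
        # Use n - m templates for both lengths, as in the standard definition.
        templates = [rr[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                if max(abs(x - y) for x, y in zip(templates[i], templates[j])) <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined when no matches exist
    return -math.log(a / b)


# Hypothetical RR intervals in seconds.
regular = [0.8] * 10
irregular = [0.8, 0.6, 1.1, 0.5, 0.9, 0.7, 1.2, 0.4]
print(rmssd(regular), turning_point_ratio(regular), sample_entropy(regular))
print(rmssd(irregular), turning_point_ratio(irregular))
```

A detector in this style would flag a segment as AF when the statistics exceed their thresholds; how the paper combines them is not stated in the abstract.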

Night Time Leading Vehicle Detection Using Statistical Feature Based SVM (통계적 특징 기반 SVM을 이용한 야간 전방 차량 검출 기법)

  • Joung, Jung-Eun;Kim, Hyun-Koo;Park, Ju-Hyun;Jung, Ho-Youl
    • IEMEK Journal of Embedded Systems and Applications / v.7 no.4 / pp.163-172 / 2012
  • Driver assistance systems are critical for improving the convenience and stability of driving. Several such systems, including adaptive cruise control and forward collision warning, have already been commercialized. Efficient vehicle detection is very important for improving these systems. Most existing vehicle detection systems are based on radar, which measures the distance between the host vehicle and leading (or oncoming) vehicles under various weather conditions. However, radar incurs a high deployment cost and complexity overload when there are many vehicles. Camera-based vehicle detection is a good alternative because of its low cost and simple implementation. In general, nighttime vehicle detection is more complicated than daytime detection because it is much more difficult to distinguish vehicle features such as outline and color in a dim environment. This paper proposes a method for detecting vehicles at night by analyzing the captured color space while reducing reflections and other light sources in the images. Four color spaces, namely RGB, YCbCr, normalized RGB, and Ruta-RGB, are compared and evaluated. A suboptimal threshold value is determined by the Otsu algorithm and applied to extract candidate taillights of leading vehicles. Statistical features such as mean, variance, skewness, kurtosis, and entropy are extracted from the candidate regions and used as the feature vector for a support vector machine (SVM) classifier. According to our simulation results, the proposed statistical feature-based SVM provides relatively high detection performance for leading vehicles at various distances in varying nighttime environments.
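
The thresholding and feature steps can be sketched in plain Python. This assumes a single 8-bit channel given as a flat list of pixel values and population (not sample) moments; the colour-space conversion and the SVM classifier are omitted, and the pixel data are hypothetical.

```python
import math
from collections import Counter


def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximises between-class variance (classes: <= t, > t)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    weight0, sum0 = 0, 0
    best_t, best_var = 0, -1.0
    for t in range(levels):
        weight0 += hist[t]
        sum0 += t * hist[t]
        if weight0 == 0:
            continue
        weight1 = total - weight0
        if weight1 == 0:
            break
        mean0 = sum0 / weight0
        mean1 = (sum_all - sum0) / weight1
        between = weight0 * weight1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


def region_features(pixels):
    """Mean, variance, skewness, kurtosis and entropy (bits) of a candidate region."""
    n = len(pixels)
    mean = sum(pixels) / n
    var = sum((p - mean) ** 2 for p in pixels) / n
    std = math.sqrt(var)
    skew = sum((p - mean) ** 3 for p in pixels) / n / std ** 3 if std else 0.0
    kurt = sum((p - mean) ** 4 for p in pixels) / n / std ** 4 if std else 0.0
    entropy = -sum((c / n) * math.log2(c / n) for c in Counter(pixels).values())
    return [mean, var, skew, kurt, entropy]


# Hypothetical image: dark road pixels and bright taillight pixels.
pixels = [10] * 50 + [200] * 50
t = otsu_threshold(pixels)
candidates = [p for p in pixels if p > t]  # bright pixels kept as taillight candidates
print(t, region_features([200, 200, 210, 210]))
```

The five-element list is the kind of feature vector that would then be passed to an SVM classifier to separate real taillights from reflections and other light sources.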

Voice Recognition Performance Improvement using the Convergence of Voice signal Feature and Silence Feature Normalization in Cepstrum Feature Distribution (음성 신호 특징과 셉스트럽 특징 분포에서 묵음 특징 정규화를 융합한 음성 인식 성능 향상)

  • Hwang, Jae-Cheon
    • Journal of the Korea Convergence Society / v.8 no.5 / pp.13-17 / 2017
  • Existing methods for extracting speech features from a speech signal suffer from recognition errors when speech is misjudged because the threshold value is unclear. In this article, we propose a modeling method for improving speech recognition performance that combines speech feature extraction with silence feature normalization for non-speech. The proposed method minimizes the effect of noise by building the speech recognition model from the convergence of speech signal feature extraction for each speech frame and silence feature normalization. The method also reconstructs the original speech signal using an energy spectrum similar to entropy, so the speech is less affected by noise, and silence feature normalization improves performance in terms of signal-to-noise ratio. We fixed the reference value for speech/non-speech classification in the cepstrum. In a performance analysis comparing the proposed method with CHMM HMM, the recognition rate improved by 2.7%p in the speech-dependent case and by 0.7%p in the speech-independent case.

Effective Nonlinear Filters with Visual Perception Characteristics for Extracting Sketch Features (인간시각 인식특성을 지닌 효율적 비선형 스케치 특징추출 필터)

  • Cho, Sung-Mok;Cho, Ok-Lae
    • Journal of the Korea Society of Computer and Information / v.11 no.1 s.39 / pp.139-145 / 2006
  • Feature extraction in digital images has many applications, such as robot vision, medical diagnostic systems, and motion video transmission. Several methods exist for extracting features in digital images, for example the nonlinear gradient, the nonlinear Laplacian, and the entropy convolutional filter. However, conventional convolutional filters are usually not efficient at extracting image features because feature formation in the human eye is more sensitive to dark regions than to bright regions. This paper describes several nonlinear filters for extracting sketch features that use the difference between the arithmetic mean and the harmonic mean in a window. Their advantages include simple computation, dependence on local intensities, and lower sensitivity to small intensity changes in very dark regions. Experimental results demonstrate more successful feature extraction than other conventional filters over a wide variety of intensity variations.
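
A minimal sketch of the core quantity, the arithmetic mean minus the harmonic mean over a window, is given below. The 3x3 window, the +1 offset that avoids division by zero at black pixels, and the toy images are assumptions; the paper's actual filters may normalise or combine the two means differently.

```python
def am_hm_difference(image, size=3):
    """Arithmetic mean minus harmonic mean of each interior pixel's size x size window."""
    height, width = len(image), len(image[0])
    radius = size // 2
    out = [[0.0] * width for _ in range(height)]
    for y in range(radius, height - radius):
        for x in range(radius, width - radius):
            # +1 keeps every value positive so the harmonic mean is defined (assumption).
            window = [
                image[j][i] + 1
                for j in range(y - radius, y + radius + 1)
                for i in range(x - radius, x + radius + 1)
            ]
            n = len(window)
            arithmetic = sum(window) / n
            harmonic = n / sum(1.0 / v for v in window)
            out[y][x] = arithmetic - harmonic
    return out


# Hypothetical 3x3 patches: the same 20-level step in a dark and in a bright region.
flat = [[50, 50, 50] for _ in range(3)]
dark_edge = [[0, 0, 20] for _ in range(3)]
bright_edge = [[200, 200, 220] for _ in range(3)]
for patch in (flat, dark_edge, bright_edge):
    print(round(am_hm_difference(patch)[1][1], 3))
```

The arithmetic mean never falls below the harmonic mean, so the response is zero on flat patches and positive at edges, and for the same intensity step it is larger in the dark patch than in the bright one, which matches the abstract's point about sensitivity to dark regions.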

Probabilistic Models for Local Patterns Analysis

  • Salim, Khiat;Hafida, Belbachir;Ahmed, Rahal Sid
    • Journal of Information Processing Systems / v.10 no.1 / pp.145-161 / 2014
  • Recently, many large organizations have multiple data sources (MDS) distributed over the different branches of an interstate company. Local pattern analysis has become an effective strategy for MDS mining in national and international organizations. It consists of mining the individual datasets to obtain frequent patterns, which are forwarded to a central site for global pattern analysis. Various synthesizing models [2,3,4,5,6,7,8,26] have been proposed to build global patterns from the forwarded patterns. The rules synthesized from the forwarded patterns should closely match the mono-mining results (i.e., the results that would be obtained if all of the databases were put together and mined). When a pattern is present at a site but fails to satisfy the minimum support threshold, it is not allowed to take part in the pattern synthesizing process. This process can therefore lose some interesting patterns that could help the decision maker make the right decision. For such situations, we propose applying a probabilistic model in the synthesizing process. An adequate choice of probabilistic model can improve the quality of the discovered patterns. In this paper, we perform a comprehensive study of various probabilistic models that can be applied in the synthesizing process, and we choose and improve one of them to ameliorate the synthesizing results. Finally, experiments on a public database are presented to demonstrate the efficiency of the proposed synthesizing method.
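
The loss caused by the local threshold can be made concrete with a small sketch. It compares a size-weighted synthesis that treats unreported supports as zero with one that substitutes an estimate for them. The substitute used here, half the minimum support, is only an illustrative placeholder and is not the probabilistic model studied in the paper; the site sizes and supports are hypothetical.

```python
def forward_patterns(local_supports, min_support):
    """Each site forwards a pattern's support only if it meets the local threshold."""
    return [s if s >= min_support else None for s in local_supports]


def synthesize(forwarded, site_sizes, missing_estimate=0.0):
    """Size-weighted global support; unreported sites contribute missing_estimate."""
    total = sum(site_sizes)
    weighted = 0.0
    for support, size in zip(forwarded, site_sizes):
        if support is None:
            support = missing_estimate
        weighted += support * size
    return weighted / total


# Hypothetical supports of one pattern at three equally sized sites.
true_supports = [0.30, 0.18, 0.19]
site_sizes = [1000, 1000, 1000]
min_support = 0.2

forwarded = forward_patterns(true_supports, min_support)
mono = synthesize(true_supports, site_sizes)                 # what mono-mining would find
naive = synthesize(forwarded, site_sizes)                    # missing sites count as 0
filled = synthesize(forwarded, site_sizes, min_support / 2)  # placeholder estimate
print(forwarded, round(mono, 4), round(naive, 4), round(filled, 4))
```

In this toy case the pattern is globally frequent under mono-mining (about 0.223) but looks infrequent after naive synthesis (0.1), which is the gap a better estimate of the unreported supports is meant to narrow.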