• Title/Summary/Keyword: Sparse Data Set

A Method for Microarray Data Analysis based on Bayesian Networks using an Efficient Structural learning Algorithm and Data Dimensionality Reduction (효율적 구조 학습 알고리즘과 데이타 차원축소를 통한 베이지안망 기반의 마이크로어레이 데이타 분석법)

  • 황규백;장정호;장병탁
    • Journal of KIISE:Software and Applications / v.29 no.11 / pp.775-784 / 2002
  • Microarray data, obtained from DNA chip technologies, measure the expression levels of thousands of genes in cells or tissues. They are used for gene function prediction and for cancer diagnosis based on gene expression patterns. Among the diverse methods for data analysis, the Bayesian network represents the relationships among data attributes as a graph structure. This property makes it possible to discover various relations among genes and tissue characteristics (e.g., the cancer type) through microarray data analysis. However, most present microarray data sets are so sparse that it is difficult to apply general analysis methods, including Bayesian networks, directly. In this paper, we use an efficient structural learning algorithm and data dimensionality reduction to analyze microarray data with Bayesian networks. The proposed method was applied to real microarray data, namely the NCI60 data set, and its usefulness was evaluated by how accurately the learned Bayesian networks represent known biological facts.

A Space Model to Annual Rainfall in South Korea

  • Lee, Eui-Kyoo
    • Communications for Statistical Applications and Methods / v.10 no.2 / pp.445-456 / 2003
  • Spatial data are usually obtained at selected locations even though they are potentially available at all locations in a continuous region. Moreover, the monitoring locations are clustered in some regions and sparse in others. One important goal of spatial data analysis is to predict unknown response values at any location throughout a region of interest. Thus, an appropriate space model should be set up, and its estimates and predictions must be accompanied by measures of uncertainty. In this study, we show that the proposed space model gives the best interpolation of annual rainfall data in South Korea.

Improved Collaborative Filtering Using Entropy Weighting

  • Kwon, Hyeong-Joon
    • International Journal of Advanced Culture Technology / v.1 no.2 / pp.1-6 / 2013
  • In this paper, we evaluate the performance of existing similarity metrics and propose a novel method that uses the information entropy of users' preferences to reduce the mean absolute error (MAE) in memory-based collaborative recommender systems. The proposed method applies a similarity of individual inclination to traditional similarity measurement methods. To verify the proposed method, we experiment with various similarity metrics under different conditions, including the amount of data and significance weighting from n/10 to n/60. The results confirm that the proposed method is robust and efficient on a sparse data set when applied to various existing similarity measurement methods and to significance weighting.
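
The two ingredients named in the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the entropy is the ordinary Shannon entropy of a user's rating distribution, and the significance weighting is the common form that scales a similarity by min(n, threshold)/threshold, where n is the number of co-rated items. The abstract does not say how the paper combines entropy with the similarity, so that step is left out, and all function names are hypothetical.

```python
import math
from collections import Counter


def rating_entropy(ratings):
    """Shannon entropy (in bits) of a user's rating distribution."""
    counts = Counter(ratings)
    total = len(ratings)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def pearson(u, v):
    """Pearson correlation over co-rated items; u and v map item -> rating.

    Returns (similarity, number of co-rated items).
    """
    common = sorted(set(u) & set(v))
    n = len(common)
    if n < 2:
        return 0.0, n
    mean_u = sum(u[i] for i in common) / n
    mean_v = sum(v[i] for i in common) / n
    num = sum((u[i] - mean_u) * (v[i] - mean_v) for i in common)
    den = math.sqrt(sum((u[i] - mean_u) ** 2 for i in common)) * math.sqrt(
        sum((v[i] - mean_v) ** 2 for i in common)
    )
    return (num / den if den else 0.0), n


def significance_weighted(sim, n_common, threshold):
    """Shrink a similarity that rests on fewer than `threshold` co-rated items."""
    return sim * min(n_common, threshold) / threshold


# Toy data (assumed, not from the paper)
alice = {"a": 5, "b": 3, "c": 4}
bob = {"a": 4, "b": 2, "c": 3, "d": 5}
sim, n_common = pearson(alice, bob)
weighted = significance_weighted(sim, n_common, threshold=10)
```

With only three co-rated items and a threshold of 10, a near-perfect correlation is scaled down to about 0.3, which is the intended effect on sparse data: similarities backed by little evidence count for less.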

On the Fitting ANOVA Models to Unbalanced Data

  • Jong-Tae Park;Jae-Heon Lee;Byung-Chun Kim
    • Communications for Statistical Applications and Methods / v.2 no.1 / pp.48-54 / 1995
  • A direct method for fitting analysis-of-variance models to unbalanced data is presented. This method exploits the sparsity and rank deficiency of the matrix and is based on Gram-Schmidt orthogonalization of a set of sparse columns of the model matrix. A computational algorithm for the sums of squares used to test estimable hypotheses is given.

High Resolution ISAR Imaging Based on Improved Smoothed L0 Norm Recovery Algorithm

  • Feng, Junjie;Zhang, Gong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.9 no.12 / pp.5103-5115 / 2015
  • In radar imaging, a target usually consists of a few strong scatterers that are sparsely distributed. In this paper, an improved sparse signal recovery algorithm based on the smoothed l0 (SL0) norm method is proposed to achieve high-resolution ISAR imaging with a limited number of pulses. First, a new smoothed function is proposed to approximate the l0 norm as a measure of sparsity. Then a single loop is used instead of the two nested loops of the SL0 method, which increases the search density of the variable parameter and so preserves recovery accuracy without increasing the amount of computation. The cost function is updated in every iteration for the next one until the termination condition is satisfied. Finally, the new solution is projected onto the feasible set. Simulation results show that the proposed algorithm is superior to several popular methods in terms of both reconstruction performance and computation time. Real-data ISAR images obtained by the proposed algorithm are competitive with those of several other methods.
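
The idea of approximating the l0 norm with a smooth function can be illustrated with a short sketch. It uses the Gaussian kernel commonly associated with the original SL0 method, not the new smoothed function proposed in the paper (which the abstract does not specify); the function name and the test vector are assumptions made for illustration.

```python
import math


def smoothed_l0(x, sigma):
    """Gaussian-smoothed count of the nonzero entries of x.

    Each entry contributes exp(-x_i^2 / (2 sigma^2)), which is close to 1
    when x_i is near zero and close to 0 otherwise, so
    len(x) - sum(...) approaches the true l0 norm as sigma shrinks.
    """
    return len(x) - sum(math.exp(-(xi * xi) / (2.0 * sigma * sigma)) for xi in x)


x = [0.0, 3.0, 0.0, -2.0]  # two nonzero entries
for sigma in (10.0, 1.0, 0.01):
    print(sigma, smoothed_l0(x, sigma))
```

For a large sigma the function is smooth but a poor estimate of sparsity; as sigma decreases the value approaches 2, the true number of nonzero entries. SL0-type algorithms exploit this by minimizing the smooth surrogate over a decreasing sequence of sigma values.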

Introduction to variational Bayes for high-dimensional linear and logistic regression models (고차원 선형 및 로지스틱 회귀모형에 대한 변분 베이즈 방법 소개)

  • Jang, Insong;Lee, Kyoungjae
    • The Korean Journal of Applied Statistics / v.35 no.3 / pp.445-455 / 2022
  • In this paper, we introduce existing Bayesian methods for high-dimensional sparse regression models and compare their performance in various simulation scenarios. In particular, we focus on the variational Bayes approach proposed by Ray and Szabó (2021), which enables scalable and accurate Bayesian inference. Using simulated data sets from sparse high-dimensional linear regression models, we compare the variational Bayes approach with other Bayesian and frequentist methods. To check the practical performance of variational Bayes in logistic regression models, a real data analysis is conducted using a leukemia data set.

Clustering of Stereo Matching Data for Vehicle Segmentation (차량분리를 위한 스테레오매칭 데이터의 클러스터링)

  • Lee, Ki-Yong;Lee, Joon-Woong
    • Journal of Institute of Control, Robotics and Systems / v.16 no.8 / pp.744-750 / 2010
  • To segment instances of vehicle classes in a sparse stereo-matching data set, this paper presents a clustering algorithm based on dynamic programming (DP). The algorithm is agglomerative: it begins with each element in the set as a separate cluster and merges clusters into successively larger ones according to the similarity between two clusters. Here, similarity is formulated as a DP cost function. Experiments on various images acquired from a moving vehicle show that the proposed algorithm is effective.
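
The agglomerative scheme described above (start with singletons, repeatedly merge the most similar pair) can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the DP cost function is replaced by a plain single-linkage distance on one-dimensional values, and the stopping threshold `max_gap` and the sample values are assumptions.

```python
def agglomerative(points, max_gap):
    """Merge the two closest clusters until the closest pair exceeds max_gap."""
    clusters = [[p] for p in points]  # every element starts as its own cluster

    def gap(a, b):
        # Single-linkage distance; an assumed stand-in for the paper's DP cost.
        return min(abs(x - y) for x in a for y in b)

    while len(clusters) > 1:
        # Find the most similar (closest) pair of clusters.
        d, i, j = min(
            (gap(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > max_gap:
            break  # remaining clusters are too dissimilar to merge
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical disparity values from a sparse stereo-matching result
disparities = [1.0, 1.2, 1.1, 5.0, 5.3, 9.8]
clusters = agglomerative(disparities, max_gap=0.5)
print(clusters)  # [[1.0, 1.1, 1.2], [5.0, 5.3], [9.8]]
```

The exhaustive pair search costs quadratic time per merge, which is acceptable for a sketch on sparse data but would need a smarter structure for dense inputs.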

Multimodal Biometrics Recognition from Facial Video with Missing Modalities Using Deep Learning

  • Maity, Sayan;Abdel-Mottaleb, Mohamed;Asfour, Shihab S.
    • Journal of Information Processing Systems / v.16 no.1 / pp.6-29 / 2020
  • Biometric identification using multiple modalities has attracted the attention of many researchers because it produces more robust and trustworthy results than single-modality biometrics. In this paper, we present a novel multimodal recognition system that trains a deep learning network to automatically learn features after extracting multiple biometric modalities from a single data source, i.e., facial video clips. Using the different modalities present in the facial video clips, i.e., left ear, left profile face, frontal face, right profile face, and right ear, we train supervised denoising auto-encoders to automatically extract robust and non-redundant features. The automatically learned features are then used to train modality-specific sparse classifiers that perform the multimodal recognition. Moreover, the proposed technique has proven robust when some of these modalities were missing during testing. The proposed system has three main components: detection, which consists of modality-specific detectors that automatically detect images of the different modalities present in facial video clips; feature selection, which uses a network of supervised denoising sparse auto-encoders to capture discriminative representations that are robust to illumination and pose variations; and classification, which consists of a set of modality-specific sparse representation classifiers for unimodal recognition, followed by score-level fusion of the recognition results of the available modalities. Experiments conducted on the constrained facial video dataset (WVU) and the unconstrained facial video dataset (HONDA/UCSD) resulted in Rank-1 recognition rates of 99.17% and 97.14%, respectively. The multimodal recognition accuracy demonstrates the superiority and robustness of the proposed approach irrespective of the illumination, non-planar movement, and pose variations present in the video clips, even when modalities are missing.

Sparse Channel Estimation Based on Combined Measurements in OFDM Systems (OFDM 시스템에서 측정 벡터 결합을 이용한 채널 추정 방법)

  • Min, Byeongcheon;Park, Daeyoung
    • The Journal of Korean Institute of Communications and Information Sciences / v.41 no.1 / pp.1-11 / 2016
  • We investigate compressive sensing techniques for estimating sparse channels in Orthogonal Frequency Division Multiplexing (OFDM) systems. When the channel delay spread is large, compressive sensing may not be applicable because its performance depends on the length of the measurement vectors. In this paper, we increase the length of the measurement vector by adding pilot information to the OFDM data block. The longer measurement vector improves the probability of finding the path delay set and the Mean Squared Error (MSE) performance. Simulation results show that the signal recovery performance of the proposed scheme is better than that of conventional schemes.

Coding-based Storage Design for Continuous Data Collection in Wireless Sensor Networks

  • Zhan, Cheng;Xiao, Fuyuan
    • Journal of Communications and Networks / v.18 no.3 / pp.493-501 / 2016
  • In-network storage is an effective technique for avoiding network congestion and reducing power consumption in continuous data collection in wireless sensor networks. In recent years, network-coding-based storage design has been proposed as a means of achieving ubiquitous access, which permits any query to be satisfied by a few random (nearby) storage nodes. To maintain data consistency in continuous data collection applications, the readings of a sensor over time must be sent to the same set of storage nodes. In this paper, we present an efficient approach to updating data at storage nodes that maintains data consistency without decoding the old data and re-encoding with new data. We study a transmission strategy that identifies, for each source sensor, a set of storage nodes that minimizes the transmission cost, and that achieves ubiquitous access by transmitting sparsely using sparse matrix theory. We demonstrate that the problem of minimizing the cost of transmission with coding is NP-hard. We present an approximation algorithm that treats every storage node with memory size B as B tiny nodes, each able to store only one packet. We analyze the approximation ratio of the proposed solution and compare the performance of the proposed coding approach with other coding schemes presented in the literature. The simulation results confirm that the proposed transmission strategy achieves a significant performance improvement.