• Title/Summary/Keyword: Sum of squares

Search results: 192

Rank-Based Nonlinear Normalization of Oligonucleotide Arrays

  • Park, Peter J.;Kohane, Isaac S.;Kim, Ju Han
    • Genomics & Informatics, v.1 no.2, pp.94-100, 2003
  • Motivation: Many have observed a nonlinear relationship between the signal intensity and the transcript abundance in microarray data. The first step in analyzing the data is to normalize it properly, and this should include a correction for the nonlinearity. The commonly used linear normalization schemes do not address this problem. Results: Nonlinearity is present in both cDNA and oligonucleotide arrays, but we concentrate on the latter in this paper. Across a set of chips, we identify those genes whose within-chip ranks are relatively constant compared to other genes of similar intensity. For each gene, we compute the sum of the squares of the differences in its within-chip ranks between every pair of chips as our statistic, and we select a small fraction of the genes with the minimal changes in ranks at each intensity level. These genes are most likely to be non-differentially expressed and are subsequently used in the normalization procedure. This method is a generalization of the rank-invariant normalization (Li and Wong, 2001), using all available chips rather than two at a time to gather more information, while using the chip that is least likely to be affected by nonlinear effects as the reference chip. The assumption in our method is that there are at least a small number of non-differentially expressed genes across the intensity range. The normalized expression values can be substantially different from the unnormalized values and may alter downstream analyses.
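The pairwise rank statistic described above can be sketched in a few lines of numpy (an illustrative sketch only: the paper additionally restricts comparisons to genes of similar intensity, which is omitted here, and all data below are synthetic):

```python
import numpy as np

def rank_change_statistic(expr):
    """expr: (genes x chips) intensity matrix.
    Returns, per gene, the sum over all chip pairs of the squared
    difference in that gene's within-chip ranks."""
    # within-chip ranks: rank each gene inside its chip (column)
    ranks = expr.argsort(axis=0).argsort(axis=0)
    n_chips = expr.shape[1]
    stat = np.zeros(expr.shape[0])
    for i in range(n_chips):
        for j in range(i + 1, n_chips):
            stat += (ranks[:, i] - ranks[:, j]) ** 2
    return stat

rng = np.random.default_rng(0)
expr = rng.lognormal(size=(100, 4))          # 100 genes, 4 chips
stat = rank_change_statistic(expr)
# genes with the smallest statistic are candidate rank-invariant genes
invariant = np.argsort(stat)[: int(0.05 * len(stat))]
```

A gene whose rank is identical on every chip contributes zero to the statistic, which is why the minimal-statistic genes are the natural candidates for the normalization set.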

Predicting the Soluble Solids of Apples by Near Infrared Spectroscopy (II) - PLS and ANN Models - (근적외선을 이용한 사과의 당도예측 (II) - 부분최소제곱 및 인공신경회로망 모델 -)

  • ;W. R. Hruschka;J. A. Abbott;;B. S. Park
    • Journal of Biosystems Engineering, v.23 no.6, pp.571-582, 1998
  • PLS (Partial Least Squares) and ANN (Artificial Neural Network) models were developed to predict the soluble solids content of apples, to be followed by the selection of a suitable photosensor. For the optimal PLS model, the number of factors used in the spectrum analysis was increased until the prediction residual error sum of squares (PRESS) converged. The analysis showed that even part of the overall wavelength range, with no pretreatment, can perform better. The best PLS model was found in the 800 to 1,100 nm wavelength region without second-derivative pretreatment, giving $R^2$=0.9236, bias=-0.0198 °Brix, and SEP=0.2527 °Brix for unknown samples. For the ANN model, on the other hand, the second derivative led to higher performance: on the partial 800 to 1,100 nm wavelength region, the prediction model with second-derivative pretreatment reached $R^2$=0.9177 and SEP=0.2903 °Brix for unknown samples, compared with $R^2$=0.7507 and SEP=0.4622 °Brix without pretreatment.

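The PRESS convergence criterion used above can be illustrated generically (a sketch only: it uses the leave-one-out hat-matrix shortcut on an ordinary least-squares model, with an increasing number of predictors standing in for PLS factors; data are synthetic):

```python
import numpy as np

def press(X, y):
    """Leave-one-out PRESS for a least-squares fit, via the hat-matrix
    shortcut: deleted residual e_(-i) = e_i / (1 - h_ii)."""
    H = X @ np.linalg.pinv(X)          # hat matrix
    resid = y - H @ y
    loo = resid / (1.0 - np.diag(H))   # leave-one-out residuals
    return float(loo @ loo)

rng = np.random.default_rng(1)
n, p = 60, 8
X = rng.normal(size=(n, p))
y = X[:, :3] @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=n)

# add predictors (stand-in for PLS factors) until PRESS stops improving
prev = np.inf
for k in range(1, p + 1):
    cur = press(X[:, :k], y)
    if prev - cur < 1e-3 * prev:   # converged: negligible relative gain
        break
    prev = cur
```

In this synthetic example only the first three predictors carry signal, so PRESS drops sharply up to three factors and then plateaus, which is exactly the stopping behavior the factor-selection rule exploits.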

Segmentation of Measured Point Data for Reverse Engineering (역공학을 위한 측정점의 영역화)

  • 양민양;이응기
    • Korean Journal of Computational Design and Engineering, v.4 no.3, pp.173-179, 1999
  • In reverse engineering, when a shape containing multiple surface patches is digitized, the boundaries of these surfaces should be detected. The objective of this paper is to introduce a computationally efficient segmentation technique for extracting edges and partitioning the 3D measured point data based on the location of the boundaries. The procedure begins with the identification of the edge points. An automatic edge-based approach is developed on the basis of local geometry. A parametric quadric surface approximation method is used to estimate the local surface curvature properties: the least-squares approximation scheme minimizes the sum of the squares of the Euclidean distances between the neighborhood data points and the parametric quadric surface. The surface curvatures and the principal directions are computed from the locally approximated surfaces. Edge points are identified as the curvature extremes and zero-crossings found from the estimated surface curvatures. After the edge points are identified, an edge-neighborhood chain-coding algorithm is used to form the boundary curves. The original point set is then broken down into subsets, which meet along the boundaries, by a scan-line algorithm. All point data are tested against the boundary loops to partition the points into different regions. Experimental results are presented to verify the developed method.

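The local quadric approximation step can be sketched as follows (a simplification: vertical residuals are minimized here, whereas the paper minimizes true Euclidean point-to-surface distances; the neighborhood below is a synthetic, noise-free patch):

```python
import numpy as np

def fit_quadric(pts):
    """Least-squares fit of z = c0 + c1*x + c2*y + c3*x^2 + c4*x*y + c5*y^2
    to a neighborhood of measured points; returns the coefficients and
    the residual sum of squares."""
    x, y, z = pts[:, 0], pts[:, 1], pts[:, 2]
    A = np.column_stack([np.ones_like(x), x, y, x * x, x * y, y * y])
    c, *_ = np.linalg.lstsq(A, z, rcond=None)
    rss = float(np.sum((A @ c - z) ** 2))
    return c, rss

# synthetic neighborhood sampled from a known quadric patch
rng = np.random.default_rng(2)
xy = rng.uniform(-1, 1, size=(50, 2))
z = 0.5 + 0.2 * xy[:, 0] - 0.3 * xy[:, 1] + 0.8 * xy[:, 0] ** 2
pts = np.column_stack([xy, z])
c, rss = fit_quadric(pts)   # rss is ~0 for noise-free quadric data
```

From the fitted coefficients, first and second fundamental forms (and hence the curvatures and principal directions the paper uses for edge detection) can be evaluated analytically at the neighborhood center.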

Geometrical description based on forward selection & backward elimination methods for regression models (다중회귀모형에서 전진선택과 후진제거의 기하학적 표현)

  • Hong, Chong-Sun;Kim, Moung-Jin
    • Journal of the Korean Data and Information Science Society, v.21 no.5, pp.901-908, 2010
  • A geometrical description method is proposed to represent the processes of the forward selection and backward elimination methods, among the many variable selection methods for multiple regression models. This graphical method shows the forward selection and backward elimination processes on the first and second quadrants, respectively, of a half circle with unit radius. At each step, the SSR is represented by the norm of a vector, and the extra SSR or partial coefficient of determination is represented by the angle between two vectors. Lines are dotted when the partial F test results are statistically significant, so that the statistical analysis can be explored. From this geometrical description, the final regression models based on the forward selection and backward elimination methods can be obtained, and the goodness of fit of the model can be explored.
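The quantities being displayed, SSR as a vector norm and extra SSR as an angle, can be computed as in this sketch (the angle encoding shown is one plausible choice for illustration, not necessarily the paper's exact construction; the data are synthetic):

```python
import numpy as np

def ssr(X, y):
    """Regression sum of squares: squared norm of the centered fitted vector."""
    Xc = np.column_stack([np.ones(len(y)), X])
    b, *_ = np.linalg.lstsq(Xc, y, rcond=None)
    yhat = Xc @ b
    return float(np.sum((yhat - y.mean()) ** 2))

rng = np.random.default_rng(3)
n = 40
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 0.7 * x2 + 0.3 * rng.normal(size=n)

ssr1 = ssr(x1[:, None], y)                    # forward step 1: x1 alone
ssr12 = ssr(np.column_stack([x1, x2]), y)     # forward step 2: add x2
extra = ssr12 - ssr1                          # extra SSR for x2 given x1
sse1 = float(np.sum((y - y.mean()) ** 2)) - ssr1
partial_r2 = extra / sse1                     # partial determination coefficient
# one way to encode the partial R^2 as an angle between two unit-scaled vectors
angle = np.degrees(np.arccos(np.sqrt(1.0 - partial_r2)))
```

Because extra SSR is never negative, the angles accumulate monotonically as variables enter, which is what makes the half-circle display well defined.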

Interblock Information from BIBD Mixed Effects (균형불완비블록설계의 혼합효과에서 블록간 정보)

  • Choi, Jaesung
    • The Korean Journal of Applied Statistics, v.28 no.2, pp.151-158, 2015
  • This paper discusses how to use projections for the analysis of data from balanced incomplete block designs. A model is suggested in matrix form for the interblock analysis. A second set of treatment effects can be found by projections from the suggested interblock model. The variance-covariance matrix of the two estimated vectors of treatment effects is derived, and that the two estimated vectors are uncorrelated can be verified from their covariance structure. The fitting-constants method is employed for the calculation of the block sum of squares adjusted for treatment effects.
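A minimal sketch of the projection and fitting-constants computations on a BIBD(v=4, b=6, r=3, k=2, λ=1) layout (the incidence pattern and response values below are illustrative, not from the paper):

```python
import numpy as np

# each pair is (block, treatment); every treatment pair meets in one block
plots = [(0, 0), (0, 1), (1, 2), (1, 3), (2, 0), (2, 2),
         (3, 1), (3, 3), (4, 0), (4, 3), (5, 1), (5, 2)]

def indicator(levels, n_levels):
    Z = np.zeros((len(levels), n_levels))
    Z[np.arange(len(levels)), levels] = 1.0
    return Z

blk = indicator([b for b, _ in plots], 6)   # block design matrix
trt = indicator([t for _, t in plots], 4)   # treatment design matrix

rng = np.random.default_rng(4)
y = rng.normal(size=12)                     # illustrative responses

def sse(X, y):
    """Residual sum of squares after projecting y onto C(X)."""
    P = X @ np.linalg.pinv(X)   # orthogonal projection onto the column space
    r = y - P @ y
    return float(r @ r)

# fitting-constants method: the block SS adjusted for treatments is the
# drop in residual SS when blocks are added to the treatment-only model
ss_block_adj = sse(trt, y) - sse(np.hstack([trt, blk]), y)
```

The same projection machinery yields the intrablock and interblock treatment estimates; their uncorrelatedness follows from the orthogonality of the corresponding projection subspaces.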

Suppression for Logistic Regression Model (로지스틱 회귀모형에서의 SUPPRESSION)

  • Hong C. S.;Kim H. I.;Ham J. H.
    • The Korean Journal of Applied Statistics, v.18 no.3, pp.701-712, 2005
  • Suppression in logistic regression models has not been discussed as long as that in linear regression models because, among many other reasons, the sum of squares for regression (SSR) or the coefficient of determination ($R^2$) can be defined in various ways. Based on four kinds of $R^2$ (two that are most preferred and two proposed by Liao & McGee (2003)), four kinds of SSR are derived so that suppression in logistic models can be explained. Many data sets fitted to logistic models are generated by the Monte Carlo method. We explore when suppression happens and compare with suppression in linear regression models.
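One of the likelihood-based $R^2$ definitions in this area, McFadden's $R^2$, can be sketched as follows (an assumption: the paper's four specific $R^2$ variants are not reproduced here; the Newton-Raphson fit and the data are illustrative):

```python
import numpy as np

def fit_logistic(X, y, iters=25):
    """Logistic regression by Newton-Raphson; returns coefficients and
    the maximized log-likelihood."""
    Xc = np.column_stack([np.ones(len(y)), X])
    b = np.zeros(Xc.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-Xc @ b))
        W = p * (1.0 - p)
        H = Xc.T @ (Xc * W[:, None])          # observed information
        b += np.linalg.solve(H, Xc.T @ (y - p))
    p = 1.0 / (1.0 + np.exp(-Xc @ b))
    ll = float(np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)))
    return b, ll

def mcfadden_r2(X, y):
    _, ll = fit_logistic(X, y)
    _, ll0 = fit_logistic(np.empty((len(y), 0)), y)   # intercept-only model
    return 1.0 - ll / ll0

rng = np.random.default_rng(5)
n = 400
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
p = 1.0 / (1.0 + np.exp(-(0.5 + 1.5 * x1 + 1.0 * x2)))
y = (rng.uniform(size=n) < p).astype(float)

r2_1 = mcfadden_r2(x1[:, None], y)
r2_12 = mcfadden_r2(np.column_stack([x1, x2]), y)
# suppression would show up as the joint r2_12 exceeding what the
# individual-variable r2 values suggest
```

Since each $R^2$ definition induces its own SSR, the same simulated data can flag suppression under one definition and not another, which is the comparison the paper carries out.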

The Study for NHPP Software Reliability Growth Model Based on Hyper-exponential Distribution (초지수분포(Hyper-exponential)를 이용한 소프트웨어 신뢰성장 모형에 관한 연구)

  • Kim, Hee-Cheul;Shin, Hyun-Cheul
    • Convergence Security Journal, v.7 no.1, pp.45-53, 2007
  • Finite-failure NHPP models presented in the literature exhibit either constant, monotonically increasing, or monotonically decreasing failure occurrence rates per fault. In this paper, the Goel-Okumoto and Yamada-Ohba-Osaki models are reviewed, and a reliability model based on the hyper-exponential distribution is proposed as an efficient application for software reliability. The parameters are estimated by the maximum likelihood method using the bisection algorithm. For model determination and selection, goodness of fit is explored through the error sum of squares. The methodology developed in this paper is exemplified with a random software reliability data set generated from a Weibull distribution (shape 0.1, scale 1) with the Minitab (version 14) statistical package.

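The maximum-likelihood-plus-bisection step can be sketched for the Goel-Okumoto model, whose one-dimensional MLE equation in b is standard (the hyper-exponential likelihood itself is not reproduced here; the failure times below are hypothetical):

```python
import math

def go_mle(times, T, lo=1e-6, hi=5.0, tol=1e-10):
    """Bisection for the Goel-Okumoto MLE equation in b:
        n/b - n*T/(exp(b*T) - 1) - sum(t_i) = 0,
    after which a = n / (1 - exp(-b*T))."""
    n, s = len(times), sum(times)

    def f(b):
        return n / b - n * T / math.expm1(b * T) - s

    assert f(lo) * f(hi) < 0, "no sign change: MLE may not exist"
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    b = 0.5 * (lo + hi)
    a = n / (1.0 - math.exp(-b * T))
    return a, b

# hypothetical failure times, with observation ending at T = 25
times = [1, 2, 3, 4, 5, 6, 8, 10, 12, 15]
a, b = go_mle(times, T=25.0)
```

Bisection is attractive here because the score equation in b has a single sign change on a bracketing interval, so convergence is guaranteed without derivatives.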

GPS-Based Orbit Determination for KOMPSAT-5 Satellite

  • Hwang, Yoo-La;Lee, Byoung-Sun;Kim, Young-Rok;Roh, Kyoung-Min;Jung, Ok-Chul;Kim, Hae-Dong
    • ETRI Journal, v.33 no.4, pp.487-496, 2011
  • Korea Multi-Purpose Satellite-5 (KOMPSAT-5) is the first satellite in Korea that provides 1 m resolution synthetic aperture radar (SAR) images. Precise orbit determination (POD) using dual-frequency IGOR receiver data is performed to support high-resolution SAR imaging. We suggest orbit determination strategies based on a differential GPS technique. Double-differenced phase observations are sampled every 30 seconds. A dynamic model approach is applied, estimating general empirical accelerations every 6 minutes through a batch least-squares estimator. The orbit accuracy is validated using real data from GRACE and KOMPSAT-2 as well as simulated KOMPSAT-5 data. The POD results for the GRACE satellite are adjusted with satellite laser ranging data and compared with publicly available reference orbit data. Operational orbit determination satisfies 5 m root sum square (RSS) at one sigma, and POD meets the orbit accuracy requirements of less than 20 cm RSS in position and 0.003 cm/s RSS in velocity.
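The RSS accuracy measure can be sketched as follows (an assumption: "RSS" is read here as the per-epoch root sum of squares of the three position-error components, with the one-sigma statistic taken as the RMS over epochs; the values are illustrative):

```python
import numpy as np

def position_rss(est, ref):
    """Per-epoch 3D position error and its one-sigma summary.
    est, ref: (epochs x 3) arrays of positions in metres."""
    d = est - ref
    per_epoch = np.sqrt(np.sum(d * d, axis=1))    # 3D RSS error per epoch
    one_sigma = np.sqrt(np.mean(per_epoch ** 2))  # RMS over all epochs
    return per_epoch, one_sigma

# two illustrative epochs: errors of (3, 4, 0) m and (0, 0, 5) m
est = np.array([[3.0, 4.0, 0.0], [0.0, 0.0, 5.0]])
ref = np.zeros((2, 3))
errs, sigma = position_rss(est, ref)   # both epochs give a 5 m RSS error
```

The same computation on velocity differences gives the cm/s-level velocity RSS quoted in the requirements.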

Improved time delay estimation by adaptive eigenvector decomposition for two noisy acoustic sensors (잡음이 있는 두 음향 센서를 이용한 시간 지연 추정을 위한 향상된 적응 고유벡터 추정 기반 알고리즘)

  • Lim, Jun-Seok
    • The Journal of the Acoustical Society of Korea, v.37 no.6, pp.499-505, 2018
  • Time delay estimation between two acoustic sensors is widely used in room acoustics and sonar for target position estimation, tracking, and synchronization. A cross-correlation based method is representative for this task. However, such methods do not sufficiently account for the noise added at the receiving acoustic sensors. This paper proposes a new time delay estimation method that considers the noise added at the receiving acoustic sensors. Comparisons with the existing GCC (Generalized Cross Correlation) method and the adaptive eigenvector decomposition method show that the proposed method outperforms them for a colored signal source in white Gaussian noise.
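The GCC baseline referred to above can be sketched with the common PHAT weighting (this is the standard GCC-PHAT baseline, not the proposed adaptive eigenvector method; the signals are synthetic):

```python
import numpy as np

def gcc_phat(x, y):
    """Estimate the delay of y relative to x, in samples, with GCC-PHAT:
    whiten the cross-spectrum, then peak-pick the inverse transform."""
    nfft = len(x) + len(y) - 1
    X = np.fft.rfft(x, nfft)
    Y = np.fft.rfft(y, nfft)
    R = Y * np.conj(X)
    R /= np.abs(R) + 1e-12        # PHAT weighting: keep phase only
    cc = np.fft.irfft(R, nfft)
    lag = int(np.argmax(cc))
    if lag > nfft // 2:           # map circular lag to a signed lag
        lag -= nfft
    return lag

rng = np.random.default_rng(6)
x = rng.normal(size=1024)
d = 7
y = np.concatenate([np.zeros(d), x])[: len(x)]   # y[n] = x[n - d]
est = gcc_phat(x, y)   # → 7
```

Because PHAT discards the magnitude spectrum entirely, it is exactly the kind of weighting whose performance degrades when sensor noise dominates some frequency bands, which motivates methods that model the receiver noise explicitly.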

The Comparative Study for NHPP Software Reliability Growth Model Based on Non-linear Intensity Function (비선형 강도함수를 가진 NHPP 소프트웨어 신뢰성장 모형에 관한 비교 연구)

  • Kim, Hee-Cheul
    • Convergence Security Journal, v.7 no.2, pp.1-8, 2007
  • Finite-failure NHPP models presented in the literature exhibit either constant, monotonically increasing, or monotonically decreasing failure occurrence rates per fault (intensity function). In this paper, the intensity function of the Goel-Okumoto model is reviewed, and models based on the Kappa(2) and Burr distributions are proposed as efficient applications for software reliability. The parameters are estimated by the maximum likelihood method using the bisection algorithm. For model determination and selection, goodness of fit is explored through the error sum of squares. The methodology developed in this paper is exemplified with a real software reliability data set from NTDS (Naval Tactical Data System).

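The error-sum-of-squares model-selection criterion can be sketched generically (the Kappa(2) and Burr forms are not reproduced here; a Goel-Okumoto and a Duane-type power-law mean value function with illustrative, unfitted parameters stand in, and the counts are hypothetical):

```python
import numpy as np

def sse(m, t, k):
    """Error sum of squares between observed cumulative failure counts k
    and a candidate mean value function m evaluated at the times t."""
    return float(np.sum((k - m(t)) ** 2))

# hypothetical cumulative failure counts at the end of each test interval
t = np.arange(1, 11, dtype=float)
k = np.array([4, 7, 10, 12, 13, 15, 16, 16, 17, 17], dtype=float)

# Goel-Okumoto mean value function m(t) = a(1 - e^{-bt}),
# with illustrative (not fitted) parameters a = 18, b = 0.3
go = lambda tt: 18.0 * (1.0 - np.exp(-0.3 * tt))
# a competing Duane-type power-law mean value function
pl = lambda tt: 5.5 * tt ** 0.5

sse_go = sse(go, t, k)
sse_pl = sse(pl, t, k)
# the candidate with the smaller SSE is preferred under this criterion
```

In practice the parameters of each candidate are first fitted (here, by maximum likelihood with bisection) and the SSE comparison is made between the fitted mean value functions.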