• Title/Summary/Keyword: pairs set

Statistical Method of Ranking Candidate Genes for the Biomarker

  • Kim, Byung-Soo;Kim, In-Young;Lee, Sun-Ho;Rha, Sun-Young
    • Communications for Statistical Applications and Methods / v.14 no.1 / pp.169-182 / 2007
  • The receiver operating characteristic (ROC) approach can be used to rank candidate genes from a microarray experiment, in particular for developing biomarkers for population screening of a cancer. In a cancer microarray experiment based on n patients, the researcher often wants to compare tumor tissue with normal tissue from the same individual using a common reference RNA. Ideally, this experiment produces n pairs of microarray data. In practice, however, values are often missing from either the normal or the tumor tissue data, so we have $n_1$ pairs of complete observations, $n_2$ "normal only" observations, and $n_3$ "tumor only" observations. We refer to this data set as a mixed data set. We develop an ROC approach on the mixed data set to rank candidate genes for biomarker development for colorectal cancer screening. The correlation between the ROC-based and t-statistic-based ranks, computed on the top 50 genes by ROC rank, turns out to be less than 0.6. This result indicates that choosing the right approach to ranking candidate genes is important for the allocation of resources in biomarker development.
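
As a rough illustration of ROC-based gene ranking, the sketch below scores each gene by the empirical area under the ROC curve (AUC) and orders genes by how far that AUC is from 0.5. It is not the authors' mixed-data method: it treats tumor and normal values as two unpaired samples and ignores the pairing entirely. The function names and the toy expression values are assumptions made for this example.

```python
from itertools import product

def auc(tumor, normal):
    """Empirical AUC: fraction of (tumor, normal) value pairs with tumor > normal,
    counting ties as one half."""
    wins = sum(1.0 if t > n else 0.5 if t == n else 0.0
               for t, n in product(tumor, normal))
    return wins / (len(tumor) * len(normal))

def rank_genes(expr):
    """expr maps gene name -> (tumor values, normal values).
    Genes are ordered by |AUC - 0.5|, most discriminating first."""
    scores = {gene: auc(t, n) for gene, (t, n) in expr.items()}
    return sorted(scores, key=lambda g: abs(scores[g] - 0.5), reverse=True)

# Toy data (hypothetical values, unpaired for simplicity).
toy = {
    "geneB": ([0.5, 0.2, 0.9], [0.4, 0.6, 0.1]),  # overlapping distributions
    "geneA": ([2.1, 1.8, 2.5], [0.3, 0.1, 0.4]),  # perfectly separated
}
ranking = rank_genes(toy)
```

Ranking by distance from 0.5 treats over-expressed and under-expressed genes symmetrically, which is a design choice of this sketch rather than something the abstract specifies.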

Plagiarism Detection among Source Codes using Adaptive Methods

  • Lee, Yun-Jung;Lim, Jin-Su;Ji, Jeong-Hoon;Cho, Hwaun-Gue;Woo, Gyun
    • KSII Transactions on Internet and Information Systems (TIIS) / v.6 no.6 / pp.1627-1648 / 2012
  • We propose an adaptive method for detecting plagiarized pairs in a large set of source code. The method is adaptive in two respects: it uses an adaptive algorithm, and it provides an adaptive threshold for determining plagiarism. Conventional algorithms are based on greedy string tiling or on local alignment of two code strings. However, most of them are not adaptive: they do not consider the characteristics of the program set, which causes problems for a program set in which all the programs are inherently similar. We propose adaptive local alignment, a variant of local alignment that uses an adaptive similarity matrix. Each entry of this matrix is the logarithm of the probabilities of the keywords based on their frequency in a given program set. We also propose an adaptive threshold based on the local outlier factor (LOF), which represents the likelihood of an entity being an outlier. Experimental results indicate that our method is more sensitive than JPlag, which uses greedy string tiling, in detecting plagiarism-suspected code pairs. Further, the adaptive threshold based on the LOF is shown to be effective: compared with a fixed threshold, it yields high sensitivity with negligible loss of specificity.
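
The idea of a frequency-dependent similarity matrix can be sketched with a small Smith-Waterman style local alignment over keyword tokens. The abstract does not give the exact matrix, so this sketch assumes a match on token t scores -log p(t), where p(t) is the token's relative frequency in the program set, and it assumes fixed mismatch and gap penalties. All names and penalty values here are hypothetical, and the LOF-based threshold is not shown.

```python
import math
from collections import Counter

def token_weights(programs):
    """Weight each token by -log of its relative frequency in the program set,
    so tokens that are rare across the set count for more when matched."""
    counts = Counter(tok for prog in programs for tok in prog)
    total = sum(counts.values())
    return {tok: -math.log(c / total) for tok, c in counts.items()}

def local_alignment_score(a, b, weights, mismatch=-1.0, gap=-1.0):
    """Best local alignment score of token lists a and b (Smith-Waterman recurrence).
    Tokens are expected to come from the set used to build `weights`."""
    prev = [0.0] * (len(b) + 1)
    best = 0.0
    for x in a:
        cur = [0.0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (weights[x] if x == y else mismatch)
            cur.append(max(0.0, diag, prev[j] + gap, cur[j - 1] + gap))
            best = max(best, cur[j])
        prev = cur
    return best

# Toy program set: "for", "if" and "=" appear everywhere, "goto" appears once.
programs = [
    ["for", "if", "x", "=", "goto"],
    ["for", "if", "x", "="],
    ["for", "if", "y", "="],
]
w = token_weights(programs)
rare = local_alignment_score(["goto"], ["goto"], w)
common = local_alignment_score(["for"], ["for"], w)
```

With these weights, sharing the rare token `goto` contributes more to the score than sharing `for`, which is the intended effect when every program in the set is built from the same common keywords.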

A Sampling-based Algorithm for Top-${\kappa}$ Similarity Joins

  • Park, Jong Soo
    • Journal of KIISE:Databases / v.41 no.4 / pp.256-261 / 2014
  • The top-${\kappa}$ set similarity join problem is to find, between two sets of input records, the ${\kappa}$ pairs of records with the highest similarity. We propose an efficient algorithm that returns the top-${\kappa}$ similarity join pairs using a sampling technique. From a sample of the input records, we construct a histogram of set similarity joins and then use statistical inference to compute from the histogram an estimated similarity threshold for the top-${\kappa}$ join pairs, within an error bound at the 95% confidence level. Finally, the estimated threshold is applied to the traditional similarity join algorithm, which uses a min-heap structure to obtain the top-${\kappa}$ similarity joins. Experimental results show that the proposed algorithm performs well on large real datasets.
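
The min-heap step can be sketched as follows. This is a brute-force illustration, not the proposed algorithm: it assumes Jaccard similarity as the set similarity measure, and it takes the similarity threshold as a plain parameter instead of estimating it from a sample. The function and parameter names are hypothetical.

```python
import heapq

def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 if both are empty)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def top_k_join(R, S, k, threshold=0.0):
    """Return the k most similar (similarity, i, j) triples between record lists R and S.
    Pairs below `threshold` are skipped; a size-k min-heap keeps the current best k."""
    heap = []
    for i, r in enumerate(R):
        for j, s in enumerate(S):
            sim = jaccard(r, s)
            if sim < threshold:
                continue
            if len(heap) < k:
                heapq.heappush(heap, (sim, i, j))
            elif sim > heap[0][0]:
                # The heap root is the weakest kept pair; replace it.
                heapq.heapreplace(heap, (sim, i, j))
    return sorted(heap, reverse=True)

R = [{1, 2, 3}, {4, 5}]
S = [{1, 2, 3}, {2, 3, 4}, {5}]
best_two = top_k_join(R, S, k=2)
```

A threshold close to the true ${\kappa}$-th similarity lets most pairs be discarded before they touch the heap, which is the role the estimated threshold plays in the abstract.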

GRÖBNER-SHIRSHOV BASES FOR REPRESENTATION THEORY

  • Kang, Seok-Jin;Lee, Kyu-Hwan
    • Journal of the Korean Mathematical Society / v.37 no.1 / pp.55-72 / 2000
  • In this paper, we develop the Gröbner-Shirshov basis theory for the representations of associative algebras by introducing the notion of Gröbner-Shirshov pairs. Our result can be applied to solve the reduction problem in representation theory and to construct monomial bases of representations of associative algebras. As an illustration, we give an explicit construction of Gröbner-Shirshov pairs and monomial bases for finite-dimensional irreducible representations of the simple Lie algebra $sl_3$. Each of these monomial bases is in 1-1 correspondence with the set of semistandard Young tableaux of a given shape.

THE EXTENDIBILITY OF DIOPHANTINE PAIRS WITH FIBONACCI NUMBERS AND SOME CONDITIONS

  • Park, Jinseo
    • Journal of the Chungcheong Mathematical Society / v.34 no.3 / pp.209-219 / 2021
  • A set $\{a_1, a_2, \dots, a_m\}$ of positive integers is called a Diophantine m-tuple if $a_i a_j + 1$ is a perfect square for all $1 \le i < j \le m$. Let $F_n$ be the nth Fibonacci number, defined by $F_0 = 0$, $F_1 = 1$ and $F_{n+2} = F_{n+1} + F_n$. In this paper, we study the extendibility of Diophantine pairs $\{F_{2k}, b\}$ under some conditions.
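
The definition is easy to check numerically. The sketch below tests the Diophantine property on a list of integers and confirms, for small k, the classical fact that $\{F_{2k}, F_{2k+2}, F_{2k+4}\}$ is a Diophantine triple. The helper names are chosen for this example, and the paper's own extendibility conditions are not reproduced here.

```python
from itertools import combinations
from math import isqrt

def is_square(n):
    """True if n is a perfect square."""
    return n >= 0 and isqrt(n) ** 2 == n

def is_diophantine_tuple(nums):
    """True if a*b + 1 is a perfect square for every pair of distinct entries."""
    return all(is_square(a * b + 1) for a, b in combinations(nums, 2))

def fib(n):
    """nth Fibonacci number with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

# Triples {F_2k, F_2k+2, F_2k+4} for k = 1..9, e.g. {1, 3, 8} and {3, 8, 21}.
fib_triples = [[fib(2 * k), fib(2 * k + 2), fib(2 * k + 4)] for k in range(1, 10)]
all_triples_ok = all(is_diophantine_tuple(t) for t in fib_triples)
```

For instance, `is_diophantine_tuple([1, 3, 8, 120])` checks the well-known quadruple $\{1, 3, 8, 120\}$, while `[1, 2, 3]` fails because $1 \cdot 2 + 1 = 3$ is not a square.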

Development and Application of Protein-Protein interaction Prediction System, PreDIN (Prediction-oriented Database of Interaction Network)

  • 서정근
    • Proceedings of the Korean Society for Bioinformatics Conference / 2002.06a / pp.5-23 / 2002
  • Motivation: Protein-protein interactions play a critical role in biological processes. Identifying interacting proteins by bioinformatics methods can provide new leads for functional studies of uncharacterized proteins without extensive experiments. Results: Protein-protein interactions are predicted by a computational algorithm based on a weighted scoring system for domain interactions between interacting protein pairs. We propose that potential interaction domain (PID) pairs can be extracted from a data set of experimentally identified interacting protein pairs, where one protein contains one domain and its interacting partner contains the other. Every combination of PIDs is summarized in a matrix table termed the PID matrix, which is proposed for use in predicting interactions. The Database of Interacting Proteins (DIP) was used as the source of interacting protein pairs, and InterPro, an integrated database of protein families, domains, and functional sites, was used to define domains in the interacting pairs. A statistical scoring system named the "PID matrix score" was designed and applied as a measure of interaction probability between domains. Cross-validation was performed with subsets of the DIP data to evaluate the prediction accuracy of the PID matrix. The prediction system gives about 50% sensitivity and 98% specificity. Based on the PID matrix, we developed a system that provides several interaction information-finding services on the Internet. The system, named PreDIN (Prediction-oriented Database of Interaction Network), provides interacting-domain finding services and interacting-protein finding services. We demonstrate that the genome-wide interaction network can be mapped using the PreDIN system. The system can also be used as a new tool for functional prediction of unknown proteins.
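
To make the PID matrix idea concrete, the sketch below counts how often each domain pair occurs across known interacting protein pairs and scores a candidate protein pair by summing those counts. The abstract does not specify the weighting used in the actual "PID matrix score", so the plain counts here are an assumption, as are the function names and the toy proteins and domains.

```python
from collections import Counter
from itertools import product

def pid_matrix(interactions, domains):
    """Count occurrences of each unordered domain pair over interacting protein pairs.
    `domains` maps a protein name to the set of domains it contains."""
    counts = Counter()
    for p, q in interactions:
        for d, e in product(domains[p], domains[q]):
            counts[tuple(sorted((d, e)))] += 1
    return counts

def interaction_score(p, q, counts, domains):
    """Unweighted score for a candidate pair: sum of counts over its domain pairs."""
    return sum(counts[tuple(sorted((d, e)))]
               for d, e in product(domains[p], domains[q]))

# Toy data: proteins P1..P5 and domains D1..D4 are hypothetical.
domains = {"P1": {"D1"}, "P2": {"D2"}, "P3": {"D1", "D3"}, "P4": {"D2"}, "P5": {"D4"}}
interactions = [("P1", "P2"), ("P3", "P4")]
counts = pid_matrix(interactions, domains)
score_p1_p4 = interaction_score("P1", "P4", counts, domains)
```

Here the untested pair P1 and P4 scores above zero because the domain pair (D1, D2) was seen in both known interactions, whereas P1 and P5 share no previously observed domain pair.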

SOUND SIMILARITY JUDGMENTS AND PHONOLOGICAL UNITS

  • Yoon, Yeo-Bom
    • Proceedings of the KSPS conference / 1997.07a / pp.142-143 / 1997
  • The purpose of this paper is to assess the psychological status of the phoneme, syllable, and various postulated subsyllabic units in Korean by applying the Sound Similarity Judgment (SSJ) task, to compare the results with those in English, and to discuss the advantages and disadvantages of the SSJ task as a tool for linguistic research. In Experiment 1, 30 subjects listened to pairs of 56 CVC words that were systematically varied from 'totally different' (e.g., pan-met) to 'identical' (e.g., pan-pan). Subjects were then asked to rate the sound similarity of each pair on a 10-point scale. Not surprisingly, there was a strong correlation between the number of phonemic segments matched and the similarity score provided by the subjects. This result was in accord with previous results from English (e.g., Vitz & Winkler, 1973; Derwing & Nearey, 1986) and supported the assumption that the phoneme is the basic phonological unit in Korean and English. However, there were sharply contrasting results between the two languages. When the pairs shared two phonemes (e.g., pan-pat; pan-pen; pan-man), the pairs sharing the first two phonemes were judged significantly more similar than the other two types of pairs. In the comparable English experiments, by contrast, the pairs sharing the last two phonemes were judged significantly more similar than the other two types of pairs. Experiment 2 was designed to confirm the results of Experiment 1 by controlling the 'degree' of similarity between phonemes. For example, the pair pan-pam can be judged more similar than the pair pan-nan, although both pairs share the same number of phonemes. This could be interpreted either as confirming the result of Experiment 1 or as reflecting the fact that /n/ is more similar to /m/ than /p/ is to /n/ in terms of the number of shared distinctive features. The results of Experiment 2 supported the former interpretation.
Thus, the results of both experiments clearly showed that, although the 'number' of matched phonemes is an important predictor of judged sound similarity for monosyllabic pairs in both languages, the 'position' of the matched phonemes influences those judgments differently in the two languages. This contrasting set of results may have interesting implications for the internal structure of the syllable in the two languages.

An Efficient Algorithm for Finding the k-edge Survivability in Ring Networks

  • Myung, Young-Soo
    • Management Science and Financial Engineering / v.16 no.3 / pp.85-93 / 2010
  • Given an undirected network with a set of source-sink pairs, we assume that a benefit is obtained whenever a source node and its sink node are connected. The k-edge survivability of a network is defined as the total benefit secured after k arbitrarily selected edges are destroyed. The problem of computing k-edge survivability is known to be NP-hard and has applications in evaluating the survivability or vulnerability of a network. In this paper, we consider the k-edge survivability problem restricted to ring networks and develop an algorithm that solves it in $O(n^3|K|)$ time, where n is the number of nodes and K is the set of source-sink pairs.
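
For small rings the quantity can be computed by brute force, which helps make the definition concrete. The sketch below reads "arbitrarily selected" as the worst case, so it takes the minimum surviving benefit over all choices of k edges; that reading is an assumption. It enumerates every edge subset, so it is exponential in k and is not the $O(n^3|K|)$ algorithm of the paper. Node and edge numbering and all names are hypothetical.

```python
from itertools import combinations

def surviving_benefit(n, pairs, removed):
    """Total benefit of source-sink pairs still connected on a ring of n nodes.
    Nodes are 0..n-1 and edge i joins node i and node (i + 1) % n.
    `pairs` is a list of (source, sink, benefit); `removed` is a set of edge indices."""
    comp = [0] * n
    label = 0
    for i in range(n - 1):
        if i in removed:
            label += 1          # a destroyed edge starts a new segment
        comp[i + 1] = label
    if (n - 1) not in removed:
        # The closing edge is intact, so the last segment joins the first one.
        last = comp[n - 1]
        comp = [0 if c == last else c for c in comp]
    return sum(w for s, t, w in pairs if comp[s] == comp[t])

def k_edge_survivability(n, pairs, k):
    """Worst-case surviving benefit over all ways of destroying k ring edges."""
    return min(surviving_benefit(n, pairs, set(rem))
               for rem in combinations(range(n), k))

pairs = [(0, 2, 5), (1, 3, 3), (0, 1, 2)]
after_one = k_edge_survivability(4, pairs, 1)
after_two = k_edge_survivability(4, pairs, 2)
```

Destroying a single edge never disconnects a ring, so the full benefit survives for k = 1; with k = 2 an adversary can cut the ring into two segments that separate every pair in this toy instance.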

Hard calibration of a structured light for the Euclidian reconstruction (Structured-light calibration method for 3D reconstruction)

  • 신동조;양성우;김재희
    • Proceedings of the IEEK Conference / 2003.11a / pp.183-186 / 2003
  • A vision sensor should be calibrated before it is used for Euclidean shape reconstruction. A point-to-point calibration, also referred to as a hard calibration, estimates calibration parameters by means of a set of 3D-to-2D point pairs. We propose a new method for determining a set of 3D-to-2D pairs for structured-light hard calibration. The set is determined simply from the epipolar geometry between the camera image plane and the projector plane, together with a projector calibrating grid pattern. The projector calibration is divided into two stages: a world 3D data acquisition stage and a corresponding 2D data acquisition stage. After the 3D data points are derived using the cross ratio, the corresponding 2D points in the projector plane can be determined from the fundamental matrix and the horizontal grid ID of the projector calibrating pattern. Euclidean reconstruction can then be achieved by linear triangulation, and experimental results from simulation are presented.

LINEAR MAPS PRESERVING PAIRS OF HERMITIAN MATRICES ON WHICH THE RANK IS ADDITIVE AND APPLICATIONS

  • TANG XIAO-MIN;CAO CHONG-GUANG
    • Journal of applied mathematics & informatics / v.19 no.1_2 / pp.253-260 / 2005
  • Denote the set of $n \times n$ complex Hermitian matrices by $H_n$. A pair of $n \times n$ Hermitian matrices $(A, B)$ is said to be rank-additive if $\operatorname{rank}(A+B) = \operatorname{rank} A + \operatorname{rank} B$. We characterize the linear maps from $H_n$ into itself that preserve the set of rank-additive pairs. As applications, the linear preservers of the adjoint matrix on $H_n$ and the Jordan homomorphisms of $H_n$ are also given. The analogous problems on the skew-Hermitian matrix space are considered.
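
A small worked example may help fix the definition of rank-additivity. The $2 \times 2$ real diagonal (hence Hermitian) matrices below are chosen for illustration and do not come from the paper: the first pair is rank-additive, the second is not.

```latex
\[
A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix},\quad
B = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}:\qquad
\operatorname{rank}(A+B) = \operatorname{rank} I_2 = 2 = 1 + 1
  = \operatorname{rank} A + \operatorname{rank} B .
\]
\[
A = B = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}:\qquad
\operatorname{rank}(A+B) = 1 \neq 2 = \operatorname{rank} A + \operatorname{rank} B .
\]
```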