• Title/Summary/Keyword: sequence of sets

INSTABILITY OF THE BETTI SEQUENCE FOR PERSISTENT HOMOLOGY AND A STABILIZED VERSION OF THE BETTI SEQUENCE

  • JOHNSON, MEGAN;JUNG, JAE-HUN
    • Journal of the Korean Society for Industrial and Applied Mathematics / v.25 no.4 / pp.296-311 / 2021
  • Topological Data Analysis (TDA), a relatively new field of data analysis, has proved very useful in a variety of applications. The main persistence tool from TDA is persistent homology, in which data structure is examined at many scales. Representations of persistent homology include persistence barcodes and persistence diagrams, neither of which is straightforward to reconcile with traditional machine learning algorithms, as they are sets of intervals or multisets. The problem of faithfully representing barcodes and persistence diagrams has been pursued along two main avenues: kernel methods and vectorizations. One vectorization is the Betti sequence, or Betti curve, derived from the persistence barcode. While the Betti sequence has been used in classification problems in various applications, to our knowledge, the stability of the sequence has never before been discussed. In this paper we show that the Betti sequence is unstable under the 1-Wasserstein metric with respect to small perturbations in the barcode from which it is calculated. In addition, we propose a novel stabilized version of the Betti sequence based on the Gaussian smoothing seen in the Stable Persistence Bag of Words for persistent homology. We then introduce the normalized cumulative Betti sequence and provide numerical examples that support the main statement of the paper.
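
As a rough illustration of the instability described in this abstract, the sketch below computes a Betti sequence by counting the barcode intervals alive at each point of a sampling grid. The half-open interval convention, the grid, the example barcode, and the name `betti_sequence` are all assumptions made for the example; the paper's exact definitions and its stabilized variant are not reproduced here.

```python
def betti_sequence(barcode, grid):
    """Count the intervals [birth, death) that contain each grid value."""
    return [sum(1 for birth, death in barcode if birth <= t < death) for t in grid]


grid = [0.0, 0.5, 1.0, 1.5, 2.0]

# Hypothetical barcode with two intervals.
barcode = [(0.0, 1.0), (0.5, 2.0)]

# Move one death time by a tiny amount so it crosses the grid point t = 1.0.
eps = 1e-6
perturbed = [(0.0, 1.0 + eps), (0.5, 2.0)]

original = betti_sequence(barcode, grid)
shifted = betti_sequence(perturbed, grid)

print(original)  # [1, 2, 1, 1, 0]
print(shifted)   # [1, 2, 2, 1, 0]
```

However small `eps` is chosen, the entry at `t = 1.0` changes by a full unit, which is the kind of discontinuity that a smoothed version of the sequence is meant to avoid.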

Acoustic Analysis of Koreans' Production Errors in English - with reference to nasalization and lateralization (한국인 화자의 영어 발음 오류에 관한 음향적 분석 - 비음화와 설측음화를 중심으로 -)

  • Kim, Mi-Hye;Kang, Sun-Mi;Kim, Kee-Ho
    • Speech Sciences / v.15 no.3 / pp.53-63 / 2008
  • This paper examines the acoustic differences in English speech production between native English speakers and Korean learners. Korean speakers appear to produce errors by over-applying Korean phonological rules (nasalization and lateralization) to English speech in environments comparable to those of Korean, namely nasal+lateral or lateral+nasal sequences. Based on this prediction, the experimental data are grouped into three sets: [n]+[l] sequences, [l]+[n] sequences, and [m]+[l] sequences. The results show that Korean speakers usually nasalize or lateralize the target words or phrases in all three categories, while native English speakers do not. In set A ([n]+[l] sequence), both nasalization and lateralization were found, since this is the same environment in which both processes can apply in Korean. In set B ([l]+[n] sequence), only lateralization was observed, because nasalization never occurs in l-n sequences in Korean. There was no lateralization in set C ([m]+[l] sequence), because only nasalization occurs in m-l sequences in Korean. These results reconfirm that the Korean nasalization and lateralization rules strongly influence English production. Korean speakers need to be taught not to over-apply Korean phonological rules to English for accurate pronunciation.

A Report on the Inter-Gene Correlations in cDNA Microarray Data Sets (cDNA 마이크로어레이에서 유전자간 상관 관계에 대한 보고)

  • Kim, Byung-Soo;Jang, Jee-Sun;Kim, Sang-Cheol;Lim, Jo-Han
    • The Korean Journal of Applied Statistics / v.22 no.3 / pp.617-626 / 2009
  • A series of recent papers reported that the inter-gene correlations in Affymetrix microarray data sets were strong and long-ranged, and that the assumption of independence or weak dependence among gene expression signals, often employed without justification, was in conflict with actual data. Qiu et al. (2005) indicated that applying the nonparametric empirical Bayes method, in which test statistics are pooled across genes for statistical inference, resulted in a large variance in the number of differentially expressed genes, and attributed this effect to strong and long-ranged inter-gene correlations. Klebanov and Yakovlev (2007) demonstrated that the inter-gene correlations provide a rich source of information rather than being a nuisance in the statistical analysis. By transforming the original gene expression sequence, they developed a sequence of independent random variables, which they referred to as a $\delta$-sequence. Using two cDNA microarray data sets generated in this country, we note in this report that the strong and long-ranged inter-gene correlations are also present in cDNA microarray data and that a $\delta$-sequence of independent variables can be derived from such data. This note suggests that inter-gene correlations be considered in future analyses of cDNA microarray data sets.

Operations of fuzzy bags

  • Kim, Kyung-Soo;Miyamoto, Sadaaki
    • 제어로봇시스템학회:학술대회논문집 (Institute of Control, Robotics and Systems: Conference Proceedings) / 1996.10a / pp.28-31 / 1996
  • A bag is a set-like entity which can contain repeated elements. Fuzzy bags have been studied by Yager, who defined their basic relations and operations. However, his definitions of the basic relations and operations are inconsistent with the corresponding relations and operations for ordinary fuzzy sets. The present paper presents new basic relations and operations of fuzzy bags using a grade sequence for each element of the universal set. Moreover, the $\alpha$-cut, t-norms, the extension principle, and the composition of fuzzy bag relations are described.
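
The grade-sequence idea can be sketched as follows. This example assumes one common formulation, in which each element carries its membership grades sorted in non-increasing order and union and intersection are taken position by position with max and min after padding with zeros. The abstract does not spell out its definitions, so the helper names (`grade_sequence`, `combine`, `union`, `intersection`) and the padding convention are illustrative assumptions rather than the paper's own.

```python
from itertools import zip_longest


def grade_sequence(grades):
    """Return the membership grades of one element in non-increasing order."""
    return sorted(grades, reverse=True)


def combine(a, b, op):
    """Apply `op` position by position to the grade sequences of two fuzzy bags."""
    result = {}
    for x in set(a) | set(b):
        pairs = zip_longest(
            grade_sequence(a.get(x, [])),
            grade_sequence(b.get(x, [])),
            fillvalue=0.0,  # a missing occurrence is treated as grade 0
        )
        seq = [op(p, q) for p, q in pairs]
        seq = [g for g in seq if g > 0.0]  # drop occurrences with grade 0
        if seq:
            result[x] = seq
    return result


def union(a, b):
    return combine(a, b, max)


def intersection(a, b):
    return combine(a, b, min)


# Hypothetical fuzzy bags: element -> list of membership grades.
A = {"x": [0.3, 0.9], "y": [0.5]}
B = {"x": [0.6], "z": [0.2]}

print(union(A, B))         # x -> [0.9, 0.3], y -> [0.5], z -> [0.2]
print(intersection(A, B))  # x -> [0.6]
```

When every element occurs at most once, these operations reduce to the usual max and min of ordinary fuzzy sets, which is the kind of consistency the abstract asks for.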

A study on the parameter estimate using selective recursive least square (SRLS을 이용한 파라미터 추정에 관한 연구)

  • 유치형;이재하;정찬수
    • 제어로봇시스템학회:학술대회논문집 (Institute of Control, Robotics and Systems: Conference Proceedings) / 1989.10a / pp.441-444 / 1989
  • This correspondence presents a recursive estimation algorithm which, unlike conventional ones, updates the estimates only when a sufficient improvement can be obtained. With a bounded noise assumption, the resulting sequence of estimates is a sequence of convex sets (ellipsoids) in the parameter space. For the cases studied, the algorithm used less than 20 percent of the data to update the estimate and still achieved good accuracy for spectral estimation.

Establishment of Quantitative Analysis Method for Genetically Modified Maize Using a Reference Plasmid and Novel Primers

  • Moon, Gi-Seong;Shin, Weon-Sun
    • Preventive Nutrition and Food Science / v.17 no.4 / pp.274-279 / 2012
  • For the quantitative analysis of genetically modified (GM) maize in processed foods, primer sets and probes were designed based on the 35S promoter (p35S), the nopaline synthase terminator (tNOS), the p35S-hsp70 intron, and the zSSIIb gene encoding starch synthase II as an intrinsic control. Polymerase chain reaction (PCR) products (80 to 101 bp) were specifically amplified, and the primer sets targeting the smaller regions (80 or 81 bp) were more sensitive than those targeting the larger regions (94 or 101 bp). In particular, the primer set 35F1-R1 for p35S, targeting 81 bp of sequence, was more sensitive than the set targeting 101 bp of sequence by a 3-log scale. When the 35F1-R1 primer set was applied, the target DNA fragments were also specifically amplified from all but one of the GM-labeled food samples we tested. A reference plasmid, pGMmaize (3 kb), including the smaller PCR products for p35S, tNOS, the p35S-hsp70 intron, and the zSSIIb gene, was constructed for real-time PCR (RT-PCR). The linearity of the standard curves was confirmed using diluents ranging from $2\times10^1$ to $10^5$ copies of pGMmaize, and the $R^2$ values ranged from 0.999 to 1.000. In the RT-PCR, the detection limit using the novel primer/probe sets was 5 pg of genomic DNA from the MON810 line, indicating that the primer sets targeting the smaller regions (80 or 81 bp) could be used for highly sensitive detection of foreign DNA fragments from GM maize in processed foods.

Multi-variable Sequence Stratigraphic Model and its Application to Shelf-Slope System of the Southwestern Ulleung Basin Margin (다중변수 순차층서 모델 개발을 통한 울릉분지 남서부 대륙주변부의 층서연구)

  • Yoon Seok Hoon;Park Se Jin;Chough Sung Kwun
    • The Korean Journal of Petroleum Geology / v.5 no.1_2 s.6 / pp.36-47 / 1997
  • This study presents a multi-variable sequence model for a broader application of the sequence concept proposed by the Exxon group. The multi-variable model is based on the fact that the internal organization and boundary type of sequences are determined by three varying factors: 3rd-order cycles of eustasy, and tectonic movement and sediment influx with 2nd-order changes. Instead of the Exxon group's systems tracts, this model adopts parasequence sets as the fundamental building blocks of the sequence, because they are descriptive stratigraphic units defined simply by their internal stacking pattern, reflecting interactions of accommodation and sediment influx. Seven sequence types, which vary in the number and type of internal parasequence sets, are formulated as associations of four types of accommodation development and three grades of sediment influx. In the southwestern margin of the Ulleung Basin, the multi-variable sequence analysis of the shelf-slope sequence shows systematic changes in stratal patterns and in the number of constituent parasequence sets (i.e., sequence type). These changes are interpreted to reflect temporal and spatial changes in the type and rate of tectonic movement and sediment influx, as a result of back-arc opening and closing. During the back-arc opening, rapid subsidence, a continuous rise of relative sea level, and high sediment influx gave rise to sequences consisting dominantly of a single progradational parasequence set. In the early stage of back-arc closing, accompanied by local contractional deformation, different types of sequences formed contemporaneously depending on spatial changes in tectonically controlled accommodation and influx rates. During the subsequent slow back-arc subsidence, a rise-dominated relative sea-level cycle was coupled with a moderate to high sedimentation rate, resulting in sequences consisting of 2 to 3 parasequence sets.

Centralized Kalman Filter with Adaptive Measurement Fusion: its Application to a GPS/SDINS Integration System with an Additional Sensor

  • Lee, Tae-Gyoo
    • International Journal of Control, Automation, and Systems / v.1 no.4 / pp.444-452 / 2003
  • An integration system with multi-measurement sets can be realized via combined application of a centralized and a federated Kalman filter. It is more difficult for the centralized Kalman filter to remove a failed sensor than for the federated Kalman filter. All varieties of Kalman filters monitor the innovation sequence (residual) for detection and isolation of a failed sensor. The innovation sequence, which is selected as an indicator of real-time estimation error, plays an important role in adaptive mechanism design. In this study, the centralized Kalman filter with adaptive measurement fusion is introduced by means of the innovation sequence. The objectives of adaptive measurement fusion are automatic isolation of and recovery from some sensor failures, as well as inherent monitoring capability. The proposed adaptive filter is applied to the GPS/SDINS integration system with an additional sensor. Simulation studies attest that the proposed adaptive scheme is effective for isolation of and recovery from immediate sensor failures.
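
To make the role of the innovation sequence concrete, here is a minimal scalar sketch, not the paper's adaptive fusion scheme. It assumes a random-walk state model and rejects any measurement whose normalized innovation squared exceeds a gate. The function name `kalman_step`, the noise values `q` and `r`, the gate of 9.0 (roughly a 3-sigma test), and the measurement list are all hypothetical.

```python
def kalman_step(x, p, z, q=1e-4, r=0.01, gate=9.0):
    """One scalar Kalman step for a random-walk state.

    Returns (x, p, accepted). A measurement is rejected when its
    normalized innovation squared exceeds `gate`.
    """
    p_pred = p + q              # predict: x_k = x_{k-1} + process noise
    innovation = z - x          # residual between measurement and prediction
    s = p_pred + r              # innovation variance
    if innovation * innovation / s > gate:
        return x, p_pred, False  # isolate the suspect measurement
    k = p_pred / s              # Kalman gain
    return x + k * innovation, (1.0 - k) * p_pred, True


x, p = 0.0, 1.0
flags = []
for z in [1.0, 1.1, 0.9, 5.0, 1.0]:  # 5.0 plays the role of a sensor failure
    x, p, accepted = kalman_step(x, p, z)
    flags.append(accepted)

print(flags)  # [True, True, True, False, True]
```

The outlier is skipped without disturbing the estimate, and the filter resumes normal updates on the next consistent measurement, which loosely mirrors the isolation and recovery behaviour the abstract describes.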

An Efficient Video Retrieval Algorithm Using Key Frame Matching for Video Content Management

  • Kim, Sang Hyun
    • International Journal of Contents / v.12 no.1 / pp.1-5 / 2016
  • To manipulate large volumes of video content, effective video indexing and retrieval are required. A large number of video indexing and retrieval algorithms have been presented for frame-wise user queries or video content queries, whereas relatively few video sequence matching algorithms have been proposed for video sequence queries. In this paper, we propose an efficient algorithm that extracts key frames using color histograms and matches video sequences using edge features. To match video sequences effectively with a low computational load, we make use of the key frames extracted by the cumulative measure and the distance between key frames, and compare two sets of key frames using the modified Hausdorff distance. Experimental results with real sequences show that the proposed video sequence matching algorithm using edge features yields higher accuracy and performance than conventional methods such as the histogram difference, Euclidean metric, Bhattacharyya distance, and directed divergence methods.
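
The set comparison mentioned above can be sketched with the commonly cited averaged form of the modified Hausdorff distance: the larger of the two mean nearest-neighbour distances between the sets. Whether the paper uses exactly this variant is not stated in the abstract, and the two-dimensional feature vectors standing in for key frames are hypothetical.

```python
import math


def directed_distance(a_set, b_set):
    """Mean distance from each point in a_set to its nearest point in b_set."""
    return sum(min(math.dist(a, b) for b in b_set) for a in a_set) / len(a_set)


def modified_hausdorff(a_set, b_set):
    """Symmetric distance: the larger of the two directed mean distances."""
    return max(directed_distance(a_set, b_set), directed_distance(b_set, a_set))


# Hypothetical key-frame feature vectors for a query and a candidate sequence.
query = [(0.0, 0.0), (1.0, 0.0)]
candidate = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0)]

print(modified_hausdorff(query, candidate))  # 2/3, about 0.667
```

Averaging rather than taking the maximum over points makes the measure less sensitive to a single unmatched key frame, which is one reason this form is often preferred to the classical Hausdorff distance for matching.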

Correlation Analysis between Regulatory Sequence Motifs and Expression Profiles by Kernel CCA

  • Rhee, Je-Keun;Joung, Je-Gun;Chang, Jeong-Ho;Zhang, Byoung-Tak
    • Proceedings of the Korean Society for Bioinformatics Conference / 2005.09a / pp.63-68 / 2005
  • Transcription factors regulate gene expression by binding to the upstream region of a gene. Each transcription factor has a specific binding site in the promoter region. Analysis of gene upstream sequences is therefore necessary for understanding the regulatory mechanisms of genes, under the plausible assumption that DNA sequence motif profiles are closely related to the expression behaviors of the corresponding genes. Here, we present an effective approach to analyzing the relation between gene expression profiles and gene upstream sequences on the basis of kernel canonical correlation analysis (kernel CCA). Kernel CCA is a useful method for finding underlying relationships between two different data sets. In an application to a yeast cell cycle data set, it is shown that the gene upstream sequence profile is closely related to gene expression patterns in terms of canonical correlation scores. By further analysis of the contributing values, or weights, of sequence motifs in the construction of a pair of sequence motif profiles and expression profiles, we show that the proposed method can identify significant DNA sequence motifs involved in specific gene expression patterns during the yeast cell cycle, including both well-known and putative motifs.