• Title/Summary/Keyword: Data Sequence

A study on the speech recognition by HMM based on multi-observation sequence (다중 관측열을 토대로한 HMM에 의한 음성 인식에 관한 연구)

  • 정의봉
    • Journal of the Korean Institute of Telematics and Electronics S / v.34S no.4 / pp.57-65 / 1997
  • This paper proposes a hidden Markov model (HMM) based on multi-observation sequences for isolated word recognition. The proposed model generates an MSVQ codebook by dividing each word, and correspondingly the training data, into several sections. A multi-observation sequence is then obtained for each section by weighting the distance vectors in order from lower to higher values. Thereafter, the sequence with a high probability value is used during recognition. A total of 146 DDD area names were selected as the recognition vocabulary, and 10 LPC cepstrum coefficients were used as feature parameters. For comparison, experiments with DP, MSVQ, and a general HMM were carried out on the same data under the same conditions. The results show that the proposed HMM based on multi-observation sequences outperforms the DP, MSVQ, and general HMM methods in both recognition rate and recognition time.

A QUADRATIC APPROXIMATION FOR PROTEIN SEQUENCE TO STRUCTURE MAPPING

  • Oh, Se-Young;Yun, Jae-Heon;Chung, Sei-Young
    • Journal of applied mathematics & informatics / v.12 no.1_2 / pp.155-164 / 2003
  • A method is proposed to predict the distances between given residue pairs (between $C_{\alpha}$ atoms) of a protein using a sequence-to-structure mapping by indefinite quadratic approximation. The prediction technique requires data fitting in three-dimensional space with the coordinates of the residues of proteins with known structures, and leads to a numerical representation of the 20 amino acids by iteratively minimizing a large least norm. These approximations are used in distance prediction for given residue pairs. Computational experience on a test set of small proteins from the Brookhaven Protein Data Bank is reported.

A Multicarrier CDMA System Using Divided Spreading Sequence for Time and Frequency Diversity (시간 주파수 다이버시티를 위한 분할된 확산코드를 이용한 멀티캐리어 CDMA 시스템)

  • 박형근;주양익;김용석;차균현
    • The Journal of Korean Institute of Communications and Information Sciences / v.27 no.6B / pp.569-578 / 2002
  • This paper proposes a new multicarrier code division multiple access (CDMA) system. The proposed system has three advantages: the transmission bandwidth is used more efficiently through a divided spreading sequence, time and frequency diversity is achieved in a frequency-selective multipath fading channel, and inter-carrier interference (ICI) can be minimized by using a specific data and code pattern. In this system, the transmitted data bits are serial-to-parallel converted onto several parallel branches. On each branch, each bit is direct-sequence spread-spectrum modulated by the divided spreading sequences and transmitted using orthogonal carriers. The receiver provides a Rake for each carrier, and the Rake outputs are combined to obtain time and frequency diversity. This multicarrier CDMA system allows additional flexibility in the choice of system parameters. The bit error rate (BER) performance of the proposed system is examined while varying these parameters. Simulation results show that the proposed multicarrier CDMA scheme can achieve better performance than other conventional multicarrier CDMA systems.

The Design of High Speed Bit and Word Processor (비트 및 워드 연산용 초고속 프로세서 설계)

  • Her, Jae-Dong;Yang, Oh
    • Proceedings of the KIEE Conference / 2002.07d / pp.2534-2536 / 2002
  • This paper presents the design of a high-speed bit and word processor for sequence logic control using an FPGA. The FPGA can execute a sequence instruction during the program fetch cycle because the program memory is separated from the data memory, allowing high-speed execution at a 40 MHz clock. The processor also has a set of 274 instructions with a fixed 32-bit width, which reduces instruction decoding time and data memory interface time. The design was synthesized for the V600EHQ240 using the Xilinx Foundation tool, and the final simulation was performed successfully in the Foundation simulation environment. The FPGA was programmed in VHDL for a 240-pin HQFP package. Finally, a benchmark showed that the designed bit and word processor performs better than the Mitsubishi Q4A for sequence logic control.

Uniqueness Criteria for Signal Reconstruction from Phase-Only Data (위상만을 이용한 신호복원의 유일성 판단법)

  • 이동욱;김영태
    • The Transactions of the Korean Institute of Electrical Engineers D / v.50 no.2 / pp.59-64 / 2001
  • In this paper, we propose an alternative method for determining the uniqueness of the reconstruction of a complex sequence from its phase. Uniqueness constraints can be derived in terms of the zeros of a complex polynomial defined by the DFT of the sequence. However, finding the roots of high-order complex polynomials is a very difficult problem. Instead of finding the zeros of a complex polynomial, the proposed uniqueness criteria show that the non-singularity of a matrix can guarantee the uniqueness of the reconstruction of a complex sequence from its phase-only data. This approach has a clear advantage over the root-finding method in numerical stability and computational time.

A Simple GUI-based Sequencing Format Conversion Tool for the Three NGS Platforms

  • Rhie, A-Rang;Yang, San-Duk;Lee, Kyung-Eun;Thong, Chin Ting;Park, Hyun-Seok
    • Genomics & Informatics / v.8 no.2 / pp.97-99 / 2010
  • To allow quick conversion of proprietary sequence data from various sequencing platforms, sequence format conversion toolkits are required that can be easily integrated into workflow systems. In this respect, a format conversion tool and a quality conversion tool are the minimum requirements for integrating reads from different platforms. We have developed the Pyrus NGS Sequencing Format Converter, a simple software toolkit that converts three kinds of Next Generation Sequencing reads into the commonly used fasta or fastq formats. The converter modules are all implemented uniformly as Java GUI modules that can be integrated into software applications for displaying the data content in the same format.
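
As a rough illustration of the two conversions the abstract names (format conversion and quality conversion), the following Python sketch turns four-line FASTQ records into FASTA and re-encodes quality strings. It is not the Pyrus toolkit, which is a Java GUI; the function names are hypothetical, and the default quality offsets (Phred+64 to Phred+33) are an assumption about which encodings need converting.

```python
from typing import Iterable, Iterator, Tuple


def parse_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from four-line FASTQ records."""
    it = (line.rstrip("\n") for line in lines)
    for header in it:
        if not header:
            continue  # tolerate blank lines between records
        if not header.startswith("@"):
            raise ValueError(f"expected '@' header, got {header!r}")
        seq, plus, qual = next(it, None), next(it, None), next(it, None)
        if qual is None or not plus.startswith("+"):
            raise ValueError(f"truncated or malformed record: {header!r}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch: {header!r}")
        yield header[1:], seq, qual


def fastq_to_fasta(lines: Iterable[str]) -> str:
    """Drop the quality lines and emit one '>' header per read."""
    return "".join(f">{rid}\n{seq}\n" for rid, seq, _ in parse_fastq(lines))


def convert_quality(qual: str, src_offset: int = 64, dst_offset: int = 33) -> str:
    """Re-encode a quality string from one ASCII offset to another.

    The offsets are assumptions for illustration (Phred+64 to Phred+33).
    """
    return "".join(chr(ord(c) - src_offset + dst_offset) for c in qual)
```

For example, `fastq_to_fasta(open("reads.fastq"))` would return the FASTA text for a hypothetical file `reads.fastq`; multi-line FASTQ records are not handled by this sketch.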

Comparison Architecture for Large Number of Genomic Sequences

  • Choi, Hae-won;Ryoo, Myung-Chun;Park, Joon-Ho
    • Journal of Information Technology and Architecture / v.9 no.1 / pp.11-19 / 2012
  • Generally, a suffix tree is an efficient data structure since it reveals the detailed internal structure of given sequences in linear time. However, it is difficult to implement a suffix tree for a large number of sequences because of memory size constraints. Therefore, in order to compare multi-megabase genomic sequence sets using suffix trees, the suffix tree algorithms need to be restructured. We introduce a new method for constructing a suffix tree on secondary storage for a large number of sequences. Our algorithm divides three files, in a designated sequence, into parts, storing references to the locations of edges in hash tables. For the experiments, we used 1,300,000 EST sequences (about 300 MB) to generate a suffix tree on disk.

Direct Sequence Spread Spectrum Transmitter using FPGAs

  • Pandya, Abhijit S.;D'Souza, Ralph;Chae, Gyoo-Yong
    • Journal of information and communication convergence engineering / v.2 no.2 / pp.76-79 / 2004
  • The DS-SS (Direct Sequence Spread Spectrum) transmitter is part of a low data rate (about 150 kbps burst rate and 64 bps average data rate) wireless communication system. It is traditionally implemented using a digital signal processing (DSP) chip. However, with the rapid increase in the variety of services offered through cell phones, such as web access, video transfer, and online games, demand for higher data rates is increasing steadily. Since the chip rate, and thereby the sampling rate requirement of the system, is fairly high, the transmitter should be implemented using field-programmable gate arrays (FPGAs) instead of a DSP. This paper shows the steps taken to get a working prototype of the transmitter unit on an FPGA-based platform.
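
The core operation of a direct-sequence transmitter, multiplying each data bit by a faster chip sequence, can be sketched in a few lines of Python. This is only a software illustration of the principle, not the paper's FPGA design; the 7-chip Barker code and the function names `spread` and `despread` are assumptions chosen for the example.

```python
from typing import List, Sequence

# 7-chip Barker code, used here only as an illustrative spreading sequence.
BARKER_7: List[int] = [1, 1, 1, -1, -1, 1, -1]


def spread(bits: Sequence[int], chips: Sequence[int] = BARKER_7) -> List[int]:
    """Map each bit to +1/-1 and multiply it by the whole chip sequence."""
    signal: List[int] = []
    for bit in bits:
        symbol = 1 if bit else -1
        signal.extend(symbol * chip for chip in chips)
    return signal


def despread(signal: Sequence[int], chips: Sequence[int] = BARKER_7) -> List[int]:
    """Correlate each chip-length block with the chip sequence and decide the bit."""
    n = len(chips)
    bits: List[int] = []
    for start in range(0, len(signal), n):
        block = signal[start:start + n]
        correlation = sum(s * c for s, c in zip(block, chips))
        bits.append(1 if correlation > 0 else 0)
    return bits
```

Each bit becomes `len(chips)` chips, so the chip rate is that many times the bit rate, which is why the sampling-rate requirement grows quickly. The correlation in `despread` also tolerates a few chip errors per bit.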

NBLAST: a graphical user interface-based two-way BLAST software with a dot plot viewer

  • Choi, Beom-Soon;Choi, Seon Kang;Kim, Nam-Soo;Choi, Ik-Young
    • Genomics & Informatics / v.20 no.3 / pp.36.1-36.6 / 2022
  • BLAST, a basic bioinformatics tool for searching local sequence similarity, has been one of the most widely used bioinformatics programs since its introduction in 1990. Users generally use the web-based NCBI-BLAST program for BLAST analysis. However, users with large sequence data often run into the upload size limit of the web-based BLAST program. This is inconvenient because scientists often want to run BLAST on their own data, such as transcriptome or whole-genome sequences. To overcome this issue, we developed NBLAST, a graphical user interface-based BLAST program that employs a two-way system, allowing input sequences to be used as either "query" or "target" in the BLAST analysis. NBLAST is also equipped with a dot plot viewer, allowing researchers to create a custom database for BLAST and run a dot plot similarity analysis within a single program. NBLAST is available at http://nbitglobal.com/nblast.
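
To make the dot plot idea concrete, here is a minimal Python sketch that lists the coordinates where two sequences share an identical word. It is a generic exact-match dot plot, not NBLAST's viewer or its BLAST-based analysis; the function name `dot_plot` and the default word size of 3 are assumptions for illustration.

```python
from typing import Dict, List, Tuple


def dot_plot(query: str, target: str, word: int = 3) -> List[Tuple[int, int]]:
    """Return (i, j) pairs where query[i:i+word] equals target[j:j+word]."""
    # Index every word of the target by its start position.
    index: Dict[str, List[int]] = {}
    for j in range(len(target) - word + 1):
        index.setdefault(target[j:j + word], []).append(j)

    # Look up each query word; every hit is one dot in the plot.
    hits: List[Tuple[int, int]] = []
    for i in range(len(query) - word + 1):
        for j in index.get(query[i:i + word], []):
            hits.append((i, j))
    return hits
```

Runs of hits along a diagonal, such as `(0, 1)` followed by `(1, 2)`, indicate a shared region longer than one word; a larger `word` value suppresses chance matches at the cost of sensitivity.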

Parallelization of Genome Sequence Data Pre-Processing on Big Data and HPC Framework (빅데이터 및 고성능컴퓨팅 프레임워크를 활용한 유전체 데이터 전처리 과정의 병렬화)

  • Byun, Eun-Kyu;Kwak, Jae-Hyuck;Mun, Jihyeob
    • KIPS Transactions on Computer and Communication Systems / v.8 no.10 / pp.231-238 / 2019
  • Analyzing next-generation genome sequencing data in the conventional way on a single server may take several tens of hours, depending on the data size. However, to cope with emergency situations where the results must be known within a few hours, the performance of a single genome analysis has to be improved. In this paper, we propose a parallelized method for pre-processing genome sequence data that reduces the analysis time by using big data technology and a high-performance computing cluster that is connected to a high-speed network and shares a parallel file system. To keep the analytical data reliable, we chose a strategy of parallelizing the existing analytical tools and algorithms for the new environment. Parallelized processing, data distribution, and parallel merging techniques have been developed, and the performance improvements have been confirmed through experiments.