• Title/Summary/Keyword: Sequence Search


A New Adaptive Window Size-based Three Step Search Scheme (적응형 윈도우 크기 기반 NTSS (New Three-Step Search Algorithm) 알고리즘)

  • Yu Jonghoon;Oh Seoung-Jun;Ahn Chang-bum;Park Ho-Chong
    • Journal of the Institute of Electronics Engineers of Korea SP / v.43 no.1 s.307 / pp.75-84 / 2006
  • By exploiting the center-biased characteristic of motion vectors, NTSS (New Three-Step Search) improves on TSS (Three-Step Search), one of the most popular fast block matching algorithms (BMA) for finding motion vectors in a video sequence. Although NTSS generally gives better quality than TSS for small-motion sequences, it cannot be said to do so for large-motion sequences; increasing the search window size under NTSS can even degrade quality. To address this drawback, this paper develops a new adaptive window size-based three-step search scheme, called AWTSS, which improves quality at various window sizes for both small-motion and large-motion video sequences. In this scheme, the search window size is changed dynamically, according to the characteristics of the motion vectors, to improve coding efficiency. AWTSS improves video quality by more than 0.5 dB for large motion while maintaining the same quality for small motion.
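
As a rough illustration of the baseline that NTSS and AWTSS build on, the following is a minimal sketch of the classic three-step search, assuming grayscale frames stored as lists of rows and a sum-of-absolute-differences (SAD) cost. The function names, block size and synthetic frames are illustrative assumptions; the sketch does not implement the center-biased checks of NTSS or the adaptive window of AWTSS.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block at (bx, by) in cur and the block displaced
    by (dx, dy) in ref; returns None if the displaced block leaves the frame."""
    h, w = len(ref), len(ref[0])
    x0, y0 = bx + dx, by + dy
    if x0 < 0 or y0 < 0 or x0 + n > w or y0 + n > h:
        return None
    return sum(abs(cur[by + j][bx + i] - ref[y0 + j][x0 + i])
               for j in range(n) for i in range(n))


def three_step_search(cur, ref, bx, by, n=4, step=4):
    """Classic TSS: test the centre and its 8 neighbours at the current step,
    move to the best candidate, halve the step, stop after step 1."""
    best = (0, 0)
    best_cost = sad(cur, ref, bx, by, 0, 0, n)  # assumes the block is in-frame
    while step >= 1:
        cx, cy = best
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                cost = sad(cur, ref, bx, by, cx + dx, cy + dy, n)
                if cost is not None and cost < best_cost:
                    best, best_cost = (cx + dx, cy + dy), cost
        step //= 2
    return best, best_cost


# Synthetic gradient frames (illustrative): cur is ref displaced by (6, -4).
ref = [[x + 1000 * y for x in range(16)] for y in range(16)]
cur = [[(x + 6) + 1000 * (y - 4) for x in range(16)] for y in range(16)]
mv, cost = three_step_search(cur, ref, bx=6, by=6)
print(mv, cost)  # (6, -4) 0
```

With an initial step of 4 the search covers displacements of up to 7 pixels in each direction. Because each stage keeps only the best of nine candidates, the search can settle on a local minimum for other frames or displacements.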

An Efficient Algorithm for Similarity Search in Large Biosequence Database (대용량 유전체를 위한 효율적인 유사성 검색 알고리즘)

  • Jeong, In-Seon;Park, Kyoung-Wook;Lim, Hyeong-Seok
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / v.9 no.2 / pp.1073-1076 / 2005
  • Since the size of biosequence databases grows exponentially every year, it has become impractical to use the Smith-Waterman algorithm for exact sequence similarity search. For fast sequence similarity search, researchers have proposed heuristic methods that use the frequency of characters in subsequences. These methods have the defect that different sequences can be treated as the same sequence. Because they use only character frequencies, their accuracy is lower than that of the Smith-Waterman algorithm. In this paper, we propose an algorithm that processes queries efficiently by indexing character frequencies together with the positional information of the characters in subsequences. Experiments show that our algorithm improves the accuracy of sequence similarity search by approximately 5% to 20% over heuristic algorithms that use only character frequencies.
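
The defect described above can be made concrete with a small sketch. It assumes a DNA alphabet and uses a hypothetical way of keeping positional information (one frequency vector per segment of the subsequence); the abstract does not specify the paper's actual index, so the names `freq_vector` and `positional_freq_vector` are illustrative only.

```python
from collections import Counter

ALPHABET = "ACGT"


def freq_vector(s):
    """Character-frequency vector of s over ALPHABET."""
    counts = Counter(s)
    return tuple(counts[c] for c in ALPHABET)


def positional_freq_vector(s, parts=2):
    """Frequency vectors of `parts` consecutive segments of s, concatenated.
    A hypothetical way to retain coarse positional information."""
    size = -(-len(s) // parts)  # ceiling division
    vec = ()
    for k in range(parts):
        vec += freq_vector(s[k * size:(k + 1) * size])
    return vec


a, b = "AACCGGTT", "TTGGCCAA"
print(freq_vector(a) == freq_vector(b))                        # True
print(positional_freq_vector(a) == positional_freq_vector(b))  # False
```

The two sequences differ but share the frequency vector (2, 2, 2, 2), so a frequency-only filter cannot tell them apart, while the segment-wise vectors differ.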

Performance Improvement of BLAST using Grid Computing and Implementation of Genome Sequence Analysis System (그리드 컴퓨팅을 이용한 BLAST 성능개선 및 유전체 서열분석 시스템 구현)

  • Kim, Dong-Wook;Choi, Han-Suk
    • The Journal of the Korea Contents Association / v.10 no.7 / pp.81-87 / 2010
  • This paper proposes G-BLAST (BLAST using Grid Computing), an integrated software package for running BLAST searches in a heterogeneous distributed environment. G-BLAST employs a 'database splicing' method to improve BLAST search performance using existing computing resources. G-BLAST is a basic local alignment search tool for DNA sequences that uses grid computing in a heterogeneous distributed environment, and it improves on existing BLAST search performance in gene sequence analysis. G-BLAST also implements a pipeline and a data management method so that users can easily manage and analyze BLAST search results. The speed and efficiency of BLAST searches with the proposed G-BLAST system were confirmed in a heterogeneous distributed computing environment.

The Best Sequence of Moves and the Size of Komi on a Very Small Go Board, using Monte-Carlo Tree Search (몬테카를로 트리탐색을 활용한 초소형 바둑에서의 최상의 수순과 덤의 크기)

  • Lee, Byung-Doo
    • Journal of Korea Game Society / v.18 no.5 / pp.77-82 / 2018
  • Go is the most complex board game, and a computer cannot find the best move by exhaustively searching all possible moves. Prior to AlphaGo, all strong computer Go programs used Monte-Carlo Tree Search (MCTS) to overcome the difficulty of positional evaluation and the very large branching factor of the game tree. In this paper, we tried to find the best sequence of moves using MCTS on a very small Go board. We found that a $2 \times 2$ Go game ends in a tie, so the komi should be 0 points; in $3 \times 3$ Go, Black can always win, and the komi should be 9 points.

Optimal stacking sequence design of laminate composite structures using tabu embedded simulated annealing

  • Rama Mohan Rao, A.;Arvind, N.
    • Structural Engineering and Mechanics / v.25 no.2 / pp.239-268 / 2007
  • This paper deals with the optimal stacking sequence design of laminate composite structures. The stacking sequence optimisation of laminate composites is formulated as a combinatorial problem and solved using Simulated Annealing (SA), an algorithm inspired by the physical process of annealing of solids. The combinatorial constraints are handled using a correction strategy. The SA algorithm is strengthened by embedding Tabu search to prevent recycling of recently visited solutions, and the resulting algorithm is referred to as the tabu embedded simulated annealing (TSA) algorithm. The computational performance of the proposed TSA algorithm is enhanced through a cache-fetch implementation. Numerical experiments have been conducted on rectangular composite panels and a composite cylindrical shell with different ply numbers and orientations. The numerical studies indicate that the TSA algorithm is quite effective in providing practical designs for lay-up sequence optimisation of laminate composites. The effect of various neighbourhood search algorithms on the convergence characteristics of the TSA algorithm is investigated. The sensitivity of the proposed optimisation algorithm to various parameter settings in simulated annealing is explored through parametric studies. The TSA algorithm is then employed for multi-criteria optimisation of hybrid composite cylinders, simultaneously optimising cost and weight with a constraint on buckling load. The two objectives are first considered individually and then collectively as a multi-criteria optimisation problem. Finally, the computational efficiency of the TSA-based stacking sequence optimisation algorithm is compared with that of a genetic algorithm and found to be superior.
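
The idea of embedding a tabu list in simulated annealing can be sketched as follows. This is a toy under stated assumptions, not the paper's TSA: the objective `cost` (a count of adjacent plies sharing an angle), the angle set, the single-ply neighbourhood and all parameter values are made up for illustration, and the correction strategy and cache-fetch implementation are not modelled.

```python
import math
import random

ANGLES = (0, 45, -45, 90)  # candidate ply orientations (illustrative)


def cost(seq):
    """Toy objective (an assumption): count adjacent plies sharing an angle."""
    return sum(1 for x, y in zip(seq, seq[1:]) if x == y)


def tabu_sa(start, iters=2000, t0=2.0, alpha=0.995, tabu_len=20, seed=0):
    rng = random.Random(seed)
    cur = best = tuple(start)
    tabu = [cur]          # short memory of recently visited sequences
    t = t0
    for _ in range(iters):
        i = rng.randrange(len(cur))
        cand = cur[:i] + (rng.choice(ANGLES),) + cur[i + 1:]
        if cand in tabu:  # tabu check: skip recently visited solutions
            continue
        delta = cost(cand) - cost(cur)
        # Metropolis rule: always accept improvements, sometimes accept worse
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            cur = cand
            tabu.append(cur)
            if len(tabu) > tabu_len:
                tabu.pop(0)
            if cost(cur) < cost(best):
                best = cur
        t *= alpha        # geometric cooling
    return best


start = (0, 0, 0, 0, 45, 45, 90, 90)
best = tabu_sa(start)
print(best, cost(best))
```

The tabu list is the only addition to plain simulated annealing here: a candidate that matches one of the last `tabu_len` accepted sequences is rejected before the Metropolis test.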

Reinterpretation of the protein identification process for proteomics data

  • Kwon, Kyung-Hoon;Lee, Sang-Kwang;Cho, Kun;Park, Gun-Wook;Kang, Byeong-Soo;Park, Young-Mok
    • Interdisciplinary Bio Central / v.1 no.3 / pp.9.1-9.6 / 2009
  • Introduction: In mass spectrometry-based proteomics, biological samples are analyzed to identify proteins by mass spectrometry and database search. Database search is the process of selecting the best matches to the experimental mass spectra from an amino acid sequence database; the protein is identified as the matched sequence. A match score is defined to find matches in the database, and the highest-scoring hit is declared the most probable protein. The search result therefore varies with the definition of the score. In this study, the differences among the search results of different search engines and different databases were investigated in order to suggest a better way to identify more proteins with higher reliability. Materials and Methods: The protein extract of human mesenchymal stem cells was separated into several bands by one-dimensional electrophoresis. The one-dimensional gel was excised band by band, digested with trypsin and analyzed by a mass spectrometer, FT LTQ. The tandem mass (MS/MS) spectra of the peptide ions were submitted to database searches with the X!Tandem, Mascot and Sequest search engines against the IPI human database and the SwissProt database. The search results were filtered at several threshold probability values using the Trans-Proteomic Pipeline (TPP) of the Institute for Systems Biology, and the output generated by TPP was analyzed. Results and Discussion: For each MS/MS spectrum, the peptide sequences identified under different conditions, such as search engine, threshold probability and sequence database, were compared. At high threshold probability, the main difference in peptide identification was caused not by the difference in sequence database but by the difference in score. As the threshold probability decreased, peptides that had been missed appeared. Conversely, at extremely high threshold levels, many true assignments were missed. Conclusion and Prospects: The different identification results of the search engines were caused mainly by their different scoring algorithms. In proteomics, high-scored peptides are usually selected and low-scored peptides are discarded. Many of them are true negatives. By integrating the search results from different parameters and different search engines, the protein identification process can be improved.

Sequence Data Indexing Method based on Minimum DTW Distance (최소 DTW 거리 기반의 데이터 시퀀스 색인 기법)

  • Khil, Ki-Jeong;Song, Seok-Il;Song, Chai-Jong;Lee, Seok-Pil;Jang, Sei-Jin;Lee, Jong-Seol
    • The Journal of the Korea Contents Association / v.11 no.12 / pp.52-59 / 2011
  • In this paper, we propose an indexing method that supports efficient similarity search over sequence databases. We present a new distance measure, called the minimum DTW distance, to strengthen the filtering effect. The minimum DTW distance measures the minimum distance between a data sequence and a group of similar sequences. It enables similarity search through a hierarchical index structure by filtering the sequence database. Finally, we show the superiority of our method through experiments.
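
For readers unfamiliar with DTW, here is a minimal sketch of the classic dynamic time warping distance, followed by a brute-force minimum over a group of sequences. The brute-force helper `min_dtw_to_group` is only a literal reading of "minimum distance between a data sequence and a group of similar sequences"; the abstract does not say how the paper computes or bounds this quantity inside its index, so treat it as an assumption.

```python
def dtw(a, b):
    """Classic dynamic time warping distance with |x - y| as the local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # d[i][j] = cost of the best warping path aligning a[:i] with b[:j]
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            d[i][j] = local + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def min_dtw_to_group(query, group):
    """Smallest DTW distance from query to any member of group (brute force)."""
    return min(dtw(query, s) for s in group)


query = [1, 2, 3]
group = [[2, 3, 4], [1, 2, 2, 3]]
print(dtw(query, group[0]))            # 2.0
print(min_dtw_to_group(query, group))  # 0.0
```

The second group member only repeats one value of the query, which DTW absorbs by warping, so its distance is 0.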

Implementation and Application of Multiple Local Alignment (다중 지역 정렬 알고리즘 구현 및 응용)

  • Lee, Gye Sung
    • The Journal of the Convergence on Culture Technology / v.5 no.3 / pp.339-344 / 2019
  • Global sequence alignment, when searching for similarity or homology, favors longer sequences because it keeps looking for further similar sections between the two sequences in the hope of adding to the score from matches in the rest of the sequence. If a substantial mismatched section exists in the middle of the sequence, however, it greatly reduces the total alignment score. In that case it is better to divide the whole sequence into multiple sections, and the overall alignment score over those sections increases compared with global alignment. This method is called multiple local alignment. In this paper, we implement a multiple local alignment algorithm, an extension of the Smith-Waterman algorithm, and present experimental results showing that the algorithm can find sub-optimal sequences.
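
The base algorithm that the paper extends can be sketched as below. This is the standard Smith-Waterman scoring recurrence with a linear gap penalty; the scoring values are illustrative assumptions, and the multiple-section extension described in the abstract is not implemented.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return (best local alignment score, (end index in a, end index in b)).
    Indices are 1-based positions in the scoring matrix."""
    rows, cols = len(a) + 1, len(b) + 1
    h = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = h[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamping at 0 lets a new local alignment start anywhere
            h[i][j] = max(0, diag, h[i - 1][j] + gap, h[i][j - 1] + gap)
            if h[i][j] > best:
                best, best_pos = h[i][j], (i, j)
    return best, best_pos


print(smith_waterman("XXACGTXX", "YYACGTYY"))  # (8, (6, 6))
```

The mismatched flanks do not lower the score, because the clamp at zero lets the alignment cover only the shared `ACGT` region. This is the property that makes local alignment robust to a mismatched middle section.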

Proteomics Data Analysis using Representative Database

  • Kwon, Kyung-Hoon;Park, Gun-Wook;Kim, Jin-Young;Park, Young-Mok;Yoo, Jong-Shin
    • Bioinformatics and Biosystems / v.2 no.2 / pp.46-51 / 2007
  • In proteomics research using mass spectrometry, a protein database search yields protein information from the peptide sequences that best match the tandem mass spectra. The protein sequence database has been a powerful knowledge base for this protein identification. However, as protein sequence information accumulates, the database becomes very large, and it becomes hard to consider all the protein sequences in a database search because doing so consumes a great deal of computing time. For high-throughput analysis of the proteome, a non-redundant refined database, such as the IPI human database of the European Bioinformatics Institute, has usually been used. While a non-redundant database returns search results quickly, it misses the variation in protein sequences. In this study, we examined proteomics data from the viewpoint of protein similarity and used a network analysis tool to build a new analysis method. This method should save computing time in the database search while keeping the sequence variation needed to catch modified peptides.

Acquisition of Direct-Sequence Cellular Communication System for Code Division Multiple Access (부호 분할 다원 접속을 위한 직접 확산 셀룰라 통신 시스팀의 동기)

  • 전정식;한영열
    • The Journal of Korean Institute of Communications and Information Sciences / v.18 no.2 / pp.207-217 / 1993
  • In this paper, given the a priori probability of the phase offset between the transmitted codes and the reference codes in the receiver, we construct the state diagram of the acquisition process of a direct-sequence spread-spectrum communication system that uses expanding window search. Scanning proceeds from the cells with a higher probability of code epoch synchronization to those with a lower probability. In each search we expand the window to r times its previous size, starting from an initial value, construct the corresponding state diagrams, and derive the average synchronization time using the Markov process and Mason's gain formula. Average synchronization times are calculated as functions of the number of searches n and the detection probability $P_d$.
