• Title/Summary/Keyword: 서열정렬 (sequence alignment)

Search Result 105, Processing Time 0.031 seconds

Rough Computational Annotation and Hierarchical Conserved Area Viewing Tool for Genomes Using Multiple Relation Graph. (다중 관계 그래프를 이용한 유전체 보존영역의 계층적 시각화와 개략적 전사 annotation 도구)

  • Lee, Do-Hoon
    • Journal of Life Science / v.18 no.4 / pp.565-571 / 2008
  • Due to the rapid development of bioinformatics technologies, a wide variety of biological data has been produced in silico. Nowadays, complicated, large-scale biological data are used to meet researchers' requirements. Developing visualization and annotation tools for such data is still an active topic, although these tools have been studied for a decade. However, the diversity of users and their requirements makes it hard to develop a general-purpose tool. In this paper, I propose a novel system, Genome Viewer and Annotation tool (GenoVA), which annotates and visualizes relationships among genomes using known information and a multiple relation graph. Several multiple alignment tools exist, but they lose conserved areas because of the complexity of their constraints. GenoVA extracts all associated information between every pair of genomes by extending pairwise alignment. Areas that are conserved with high frequency and have high BLAST scores form the block nodes of the relation graph. To represent the multiple relation graph, the system connects associated block nodes. The system also shows known information such as the COG, gene, and hierarchical path of each block node. In this way, the system can annotate missed areas and unknown genes by navigating the clustering of a particular block node. I experimented with ten bacterial genomes, extracting features to visualize and annotate them. GenoVA also supports simple, rough computational annotation of a new genome.

Analysis of Spatial Trip Regularity using Trajectory Data in Urban Areas (도시부 경로자료를 이용한 통행의 공간적 규칙성 분석)

  • Lee, Su jin;Jang, Ki tae
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.17 no.6 / pp.96-110 / 2018
  • As the development of ICT has made it easier to collect various traffic information, research on creating new traffic attributes is drawing attention. Estimates and forecasts of demand and traffic volume are among the main indicators essential to traffic operation, and they assume that the traffic pattern at a particular node or link is repeated. Traditionally, surveys were used to demonstrate this similarity in trip behavior. However, surveys were costly and relied on respondents' memory, which limited their accuracy. Recently, as traffic data have become easier to gather through ETC systems and smart cards, studies have sought to identify trip regularity in various ways. In this study, route-level trip data collected in Daegu Metropolitan City were analyzed to confirm that an individual traveler forms spatially similar trip chains over several days. For this purpose, we newly define the concept of spatial trip regularity and assess the spatial difference between daily trip chains using a sequence alignment algorithm, Dynamic Time Warping. In addition, we discuss applications as indicators of fixed traffic demand and for transportation services.
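
The abstract names Dynamic Time Warping as the way daily trip chains are compared. The following is a minimal sketch of the textbook DTW recurrence, not the authors' implementation. It assumes each trip chain is a list of (x, y) coordinates and uses Euclidean distance as the local cost; the function name `dtw_distance` and the data layout are hypothetical.

```python
from math import hypot, inf


def dtw_distance(chain_a, chain_b):
    """Dynamic Time Warping distance between two sequences of (x, y) points.

    Assumption: Euclidean distance is the local cost; the paper may use
    a different spatial measure.
    """
    n, m = len(chain_a), len(chain_b)
    # cost[i][j] is the minimal cumulative cost of aligning
    # chain_a[:i] with chain_b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        ax, ay = chain_a[i - 1]
        for j in range(1, m + 1):
            bx, by = chain_b[j - 1]
            step = hypot(ax - bx, ay - by)
            # Extend the cheapest of: advance in a only, in b only, or both.
            cost[i][j] = step + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


# Two days on the same route; the second day records one stop twice.
day_1 = [(0, 0), (1, 0), (2, 0)]
day_2 = [(0, 0), (1, 0), (1, 0), (2, 0)]
print(dtw_distance(day_1, day_2))
```

Because DTW lets one point align with several points in the other sequence, chains of different lengths that follow the same route can still score as close, which is the property a spatial regularity measure needs.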

cSNP Identification and Genotyping from C4B and BAT2 Assigned to the SLA Class III Region (돼지 SLA class III 영역 내 C4B 및 BAT2의 cSNP 동정 및 이를 이용한 유전자형 분석)

  • Kim, J.H.;Lim, H.T.;Seo, B.Y.;Lee, S.H.;Lee, J.B.;Yoo, C.K.;Jung, E.J.;Jeon, J.T.
    • Journal of Animal Science and Technology / v.49 no.5 / pp.549-558 / 2007
  • C4B and BAT2, assigned to the SLA class III region, were recently reported to be associated with human diseases. Primers for RT-PCR and RACE-PCR for CDS analysis of these pig genes were designed by aligning the human and mouse CDSs from GenBank. After amplification and sequencing with these primers and cDNAs, the full-length pig CDSs were determined. The CDS lengths of C4B and BAT2 were 5226 bp and 6501 bp, respectively. In addition, nucleotide sequence identities with human and mouse were 76% to 87%, and amino acid identities were 72% to 90%. After aligning the CDSs determined in this study with pig genomic sequences from GenBank, primers for cSNP detection in the genome were designed in intron regions flanking one or more exons. We then amplified and directly sequenced genomic DNA from six pig breeds. Four cSNPs from C4B and three cSNPs from BAT2 were identified. Amino acid substitutions occurred at six cSNP positions, the exception being C4248T of C4B. Using the Multiplex-ARMS method, we genotyped the seven cSNPs in the DNA samples used for direct sequencing and verified that the results matched those from direct sequencing. To demonstrate reproducibility, we performed both direct sequencing and Multiplex-ARMS on two randomly selected DNA samples. The genotype of each sample was the same with both methods. Therefore, the seven cSNPs identified from C4B and BAT2 could be used as basic data for haplotype analysis of the SLA class III region. Moreover, the Multiplex-ARMS method should be powerful for genotyping genes assigned to the whole SLA region for xenograft studies.
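
The identity percentages reported above come from comparing aligned sequences position by position. Here is a small sketch of one common way to compute such a figure. It assumes the two sequences are already aligned to equal length with `-` marking gaps, counts a base opposite a gap as a mismatch, and skips columns where both sequences have a gap. The abstract does not state which convention the authors used, and the name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: a base opposite a gap counts as a mismatch, and columns
    that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # nothing to compare in this column
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Toy sequences, not data from the study.
print(percent_identity("ACGTAC-T", "ACGAACGT"))
```

Other conventions divide by the length of the shorter sequence or exclude all gapped columns, so identity values are only comparable when the same denominator is used.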

A Taxonomic Reconsideration of the Genus Lemna L. (Lemnaceae) in Korea (한국산 좀개구리밥속(개구리밥과)의 분류학적 실체에 대한 재고)

  • Kim, Yong-In;Shim, Sang In;Park, Jin Hee
    • Korean Journal of Environment and Ecology / v.31 no.4 / pp.349-364 / 2017
  • The duckweed family (Lemnaceae Martinov), which includes the genus Lemna L., consists of typical floating aquatic perennial plants, and about five genera and 40 species in the family are widely distributed around the world except in the polar regions. The genus Lemna is the smallest and simplest plant among the angiosperms. It doubles every three days through fast vegetative propagation, which allows populations to grow rapidly. As such, the plant is ideal for environmental pollution assessment and toxicity testing. Although taxonomists and scholars have used different scientific names for the species, many of them have agreed that only one species of the genus Lemna occurs in Korea. Noting the external morphological variation observed in Korean Lemna, we conducted a molecular phylogenetic analysis to identify the Korean Lemna species and to investigate whether two or more species exist in Korea. We determined and aligned the DNA sequences of the atpF-H region of the chloroplast DNA in 37 populations of Lemna distributed across the country. The results showed that the sequence length of the cpDNA atpF-H region was 463-483 bp, the length of the aligned sequences was 488 bp, and the number of variable sites in the nucleotide sequences was 47. There were two types of aligned sequences of the cpDNA atpF-H region among the 37 populations of Lemna in Korea. The maximum parsimony analysis revealed that Korean Lemna consists of two clades, one of which had two subclades. The results suggest that, contrary to the general understanding, at least two taxa (L. aequinoctialis, L. minor) exist in Korea.
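
The abstract reports the number of variable sites in the aligned atpF-H sequences. As an illustration of what that count means, the sketch below scans the columns of an alignment and counts those containing more than one character state. The function name `count_variable_sites` and the `ignore_gaps` switch are hypothetical, and whether the authors treated gaps as a state is not stated, so both options are shown.

```python
def count_variable_sites(aligned, ignore_gaps=False):
    """Count alignment columns that contain more than one character state.

    Assumption: sequences are pre-aligned to equal length with '-' for
    gaps. With ignore_gaps=True a gap is not treated as a state.
    """
    if not aligned:
        return 0
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    variable = 0
    for column in zip(*aligned):
        states = set(column)
        if ignore_gaps:
            states.discard("-")
        if len(states) > 1:
            variable += 1
    return variable


# Toy alignment, not data from the study.
alignment = ["ACGT-A", "ACGTTA", "ACCTTA"]
print(count_variable_sites(alignment))
print(count_variable_sites(alignment, ignore_gaps=True))
```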

Correlation Analysis of the Arirangs Based on the Informatics Algorithms (정보 알고리즘 기반 아리랑의 계통도 및 상관관계 분석)

  • Kim, Hak Yong
    • The Journal of the Korea Contents Association / v.14 no.4 / pp.407-417 / 2014
  • The arirang is the most famous Korean folk song and was registered with UNESCO (United Nations Educational, Scientific and Cultural Organization) as an intangible cultural heritage in 2012. Most arirangs are composed of text and refrain parts. The genealogy of the arirangs was classified by refrain pattern using a multiple sequence alignment algorithm. There are two different refrain patterns, slow and fast melodies. Of 106 arirangs, 38 and 68 arirangs contain fast and slow melodies, respectively. 73 arirangs and 104 of their key words were extracted from a bipartite arirang network composed of arirangs, text words, and their relationships. The correlation among the arirangs was analyzed from the selected arirangs and key words using a pairwise comparison matrix. Correlation among the arirangs was also analyzed by stepwise removal of single-degree nodes from the bipartite arirang network. In this study, arirangs were analyzed in terms of genealogy and correlation using informatics algorithms and network technology; this work is intended as a stepping stone for arirang research toward the popularization and globalization of the arirangs.
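
The stepwise removal of single-degree nodes mentioned above can be illustrated with a short pruning loop. This is a sketch under assumptions, not the author's procedure: the network is given as a list of (arirang, key word) edges with distinct node labels, removal repeats until no node has exactly one neighbor, and nodes left without neighbors are dropped. The name `prune_single_degree` is hypothetical.

```python
def prune_single_degree(edges):
    """Repeatedly remove nodes that have exactly one neighbor.

    Returns the adjacency map (node -> set of neighbors) of what remains.
    Assumption: node labels for songs and key words do not collide.
    """
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    while True:
        leaves = [node for node, nbrs in adjacency.items() if len(nbrs) == 1]
        if not leaves:
            break
        for leaf in leaves:
            # Detach the leaf from whatever neighbors it still has.
            for nbr in adjacency.pop(leaf, set()):
                if nbr in adjacency:
                    adjacency[nbr].discard(leaf)
        # Drop nodes that lost all their neighbors in this round.
        for node in [n for n, nbrs in adjacency.items() if not nbrs]:
            del adjacency[node]
    return adjacency


# Toy network: songs A, B, C and key words w1, w2, w3.
edges = [("A", "w1"), ("A", "w2"), ("B", "w1"),
         ("B", "w2"), ("B", "w3"), ("C", "w3")]
core = prune_single_degree(edges)
print(sorted(core))
```

In the toy network, song C is removed first because it has a single key word; that leaves w3 with one neighbor, so it is removed in the next round. What remains is the set of songs and key words that are linked to each other through at least two connections.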

A Novel Sub-image Retrieval Approach using Dot-Matrix (점 행렬을 이용한 새로운 부분 영상 검색 기법)

  • Kim, Jun-Ho;Kang, Kyoung-Min;Lee, Do-Hoon
    • Journal of the Korea Academia-Industrial cooperation Society / v.13 no.3 / pp.1330-1336 / 2012
  • Image retrieval has been studied through different approaches: text-based, content-based, and area-based methods and sub-image finding. Sub-image retrieval finds a query image within a target image. In this paper, we propose a novel sub-image retrieval algorithm based on the Dot-Matrix method used in bioinformatics. Dot-Matrix is a method for evaluating the similarity between two sequences, and we redefine the sub-image retrieval problem as finding the similarity between two images. For this approach, the two-dimensional array of an image is converted into a vector of gray-scale values. The two converted images are aligned by dot-matrix, and the result shows candidate sub-images. For the experiment, we used 10 target images and 5 queries: duplicated, small-scaled, and large-scaled images, including ones scaled along the x-axis and y-axis.
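
The dot-matrix idea described above can be sketched as follows: flatten each gray-scale image into a vector, mark every pair of positions whose values match, and look for diagonal runs of matches, which indicate a stretch of the query appearing in the target. This is a simplified illustration rather than the paper's algorithm. The helper names (`flatten`, `dot_matrix`, `diagonal_runs`), the `tolerance` parameter, and the row-major flattening are assumptions.

```python
def flatten(image):
    """Convert a 2-D list of gray-scale values into a 1-D list (row-major)."""
    return [pixel for row in image for pixel in row]


def dot_matrix(seq_a, seq_b, tolerance=0):
    """matrix[i][j] is True when seq_a[i] and seq_b[j] differ by <= tolerance."""
    return [[abs(a - b) <= tolerance for b in seq_b] for a in seq_a]


def diagonal_runs(matrix, min_length):
    """Return (row, col, length) for each maximal diagonal run of matches."""
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    runs = []
    for r in range(rows):
        for c in range(cols):
            if not matrix[r][c]:
                continue
            # Skip cells that continue a run started further up the diagonal.
            if r > 0 and c > 0 and matrix[r - 1][c - 1]:
                continue
            length = 0
            while (r + length < rows and c + length < cols
                   and matrix[r + length][c + length]):
                length += 1
            if length >= min_length:
                runs.append((r, c, length))
    return runs


# Toy gray-scale data: the query row appears inside the target.
target = flatten([[10, 20, 30], [40, 50, 60]])
query = flatten([[30, 40, 50]])
print(diagonal_runs(dot_matrix(query, target), min_length=3))
```

A run starting at row 0 means the match begins at the first query pixel, and its column gives the offset in the flattened target. Handling scaled queries, as in the paper's experiments, would need more than this exact-offset search.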

Image Analysis Algorithms for Comparative Genomic Hybridization (분자 세포 유전학 기법에 응용되는 영상 처리 기술)

  • Kim, De-Sok;Yoo, Jin-Sung;Lee, Jin-Woo;Kim, Jong-Won;Moon, Shin-Yong;Choi, Young-Min
    • Proceedings of the KOSOMBE Conference / v.1998 no.11 / pp.66-69 / 1998
  • Comparative genomic hybridization (CGH) is an important molecular cytogenetics technique that maps abnormal copy numbers of specific DNA sequences on chromosomes. CGH is based on quantitative digital image analysis of ratio images from fluorescently labeled chromosomes. In this paper, we introduce how recently developed image analysis algorithms are used for CGH techniques. To average the ratio profile of each chromosome, binarization, skeletonization, and stretching of chromosome images have been studied. The developed algorithms have been implemented in the karyotyping system ChIPS, commercially developed at Biomedlab Co. Ltd.

Research on Malware Classification with Network Activity for Classification and Attack Prediction of Attack Groups (공격그룹 분류 및 예측을 위한 네트워크 행위기반 악성코드 분류에 관한 연구)

  • Lim, Hyo-young;Kim, Wan-ju;Noh, Hong-jun;Lim, Jae-sung
    • The Journal of Korean Institute of Communications and Information Sciences / v.42 no.1 / pp.193-204 / 2017
  • The security of Internet systems critically depends on the capability to keep anti-virus (AV) software up-to-date and maintain high detection accuracy against new malware. However, malware variants evolve so quickly that they cannot be detected by conventional signature-based detection. In this paper, we propose a malware classification method based on sequence patterns generated from the network flow of malware samples. We evaluated our method with 766 malware samples and obtained a classification accuracy of approximately 40.4%. In this study, malware was classified using only its network behavior, excluding code and other characteristics. Therefore, this study is expected to be further developed in the future. The method can also help predict attack groups and prevent additional attacks.

Construction of Genetic Linkage Map and Identification of Quantitative Trait Loci in Populus davidiana using Genotyping-by-sequencing (Genotyping-by-sequencing 기법을 이용한 사시나무(Populus davidiana) 유전연관지도 작성 및 양적형질 유전자좌 탐색)

  • Suvi Kim;Yang-gil Kim;Dayoung Lee;Hye-jin Lee;Kyu-Suk Kang
    • Journal of Korean Society of Forest Science / v.112 no.1 / pp.40-56 / 2023
  • Tree species within the Populus genus grow rapidly and have an excellent capacity to absorb carbon, conferring a substantial ability to effectively purify the environment. Poplar breeding can be achieved rapidly and efficiently if a genetic linkage map is constructed and quantitative trait loci (QTLs) are identified. Here, a high-density genetic linkage map was constructed for control-pollinated progeny using the genotyping-by-sequencing (GBS) technique, which is a next-generation sequencing method. A search was also performed for genes associated with quantitative traits located in the genetic linkage map by examining the variables of height, diameter at root collar, and resilience to insect damage. The height and diameter at root collar were measured directly, while the ability to recover from insect damage was scored in a 4-year-old breeding population of aspen hybrids (Odae19 × Bonghyeon4 F1) established in the research forest of Seoul National University. After DNA extraction, paternity was confirmed using five microsatellite markers, and only the individuals for which paternity was confirmed were used for the analysis. The DNA was cut using restriction enzymes, and the obtained DNA fragments were prepared as a GBS library and sequenced. The analyzed results were sorted using Populus trichocarpa as a reference genome. Overall, 58,040 aligned single-nucleotide polymorphism (SNP) markers were identified, 17,755 of which were used for mapping genetic linkages. The genetic linkage map was divided into 19 linkage groups, with a total length of 2,129.54 cM. The analysis failed to identify any growth-related QTLs, but a gene assumed to be related to recovery from insect damage was identified on linkage group (chromosome) 4 through a genome-wide association study.

Detection of Novel Genetic Variations of the MC1R*3 Allele in Pig (Sus scrofa) (돼지 Melanocortin Receptor 1(MC1R) 대립유전자 3의 신규 유전변이 탐색)

  • Cho, I.C.;Jeong, Y.H.;Jung, J.K.;Seong, P.N.;Oh, W.Y.;Ko, M.S.;Kim, B.W.;Lee, J.G.;Jeon, J.T.
    • Journal of Animal Science and Technology / v.46 no.1 / pp.1-6 / 2004
  • This study was conducted to investigate novel genetic variations of the MC1R*3 allele. In general, white spotting or a white belt on a black background in pigs is determined by the E$^p$ allele at the MC1R/Extension locus. E$^p$ shares a frameshift mutation with the E$^{D2}$ allele for dominant black color. An oligonucleotide primer set was designed to amplify the complete coding sequence of the porcine MC1R gene. The MC1R coding sequences obtained from five breeds, namely Landrace (white), Yorkshire (white), Hampshire (belt), Berkshire (spot), and Jeju native black pigs (black), were used for this study. A multiple sequence alignment of the MC1R coding region was performed using Clustal W. The total length of the MC1R coding sequence ranged from 963 to 966 base pairs (bp) among the selected breeds. Sequence analysis of the complete MC1R coding region revealed that Hampshire and Jeju native black pigs have a deletion of three cytosines and Berkshire has a deletion of two cytosines at codon 23 (nt 68) of the Extension locus. In addition to this finding, there were three different missense mutations and a frameshift mutation in the MC1R coding region.