• Title/Summary/Keyword: gene database (유전자 데이터베이스)

Gene expression profile of the early embryonic gene of the silkworm, Bombyx mori (누에 수정란 초기발현유전자 데이터베이스 구축)

  • Choi, Kwang-Ho;Goo, Tae-Won;Kim, Seong-Ryul;Kim, Sung-Wan;Chun, Jae-Buhm;Park, Seoung-Won;Kang, Seok-Woo
    • Journal of Sericultural and Entomological Science / v.51 no.2 / pp.191-196 / 2013
  • This study aimed to identify useful genes whose expression is specific to the early embryonic stage of the silkworm, Bombyx mori. We constructed and analyzed a full-length cDNA library from silkworm eggs collected 2 to 6 hours after oviposition. A total of 960 clones were randomly selected, and the 5' ends of the inserts were sequenced to generate 652 expressed sequence tags (ESTs). Assembly of the 652 ESTs yielded 334 unique ESTs. Annotation of the 334 unique ESTs by BLAST search revealed that 156 (47%) of the sequences represented known genes, whereas 178 (53%) had no matches in the database. Of the 156 known genes, the most abundant were the heat shock protein hsp20.8 gene (12 occurrences) and a ubiquitin-like protein gene (11 occurrences). The ESTs with database matches were grouped according to their putative molecular functions. Among thirteen functional categories, the largest groups were protein synthesis (9.6%) and cellular organization (8.1%). Further studies on the molecular functions of these genes and the biological roles of their promoters should provide well-defined information for future applications.

A Study on the Semiautomatic Construction of Domain-Specific Relation Extraction Datasets from Biomedical Abstracts - Mainly Focusing on a Genic Interaction Dataset in Alzheimer's Disease Domain - (바이오 분야 학술 문헌에서의 분야별 관계 추출 데이터셋 반자동 구축에 관한 연구 - 알츠하이머병 유관 유전자 간 상호 작용 중심으로 -)

  • Choi, Sung-Pil;Yoo, Suk-Jong;Cho, Hyun-Yang
    • Journal of Korean Library and Information Science Society / v.47 no.4 / pp.289-307 / 2016
  • This paper introduces a software system and process model for constructing domain-specific relation extraction datasets semi-automatically. The system takes a set of terms such as genes, proteins, and diseases as input and, by exploiting a massive biological interaction database, generates a set of term pairs that serve as queries for retrieving sentences containing those pairs from scientific databases. To assess the usefulness of the proposed system, the paper applies it to constructing a genic interaction dataset for the Alzheimer's disease domain, extracting 3,510 interaction-related sentences using 140 gene names in the area. The results of this case study indicate that the system and process could substantially boost the efficiency of dataset construction in various subfields of biomedical research.
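
The pair-generation and sentence-retrieval steps in the abstract above can be sketched roughly as follows. This is only a minimal illustration, not the paper's system: the set `known_interactions` is a hypothetical stand-in for the interaction database, a plain list of sentences stands in for the scientific databases, and the helper names and sample inputs are invented for the example.

```python
from itertools import combinations


def candidate_pairs(terms, known_interactions):
    """Return term pairs that appear in the (hypothetical) interaction set."""
    pairs = []
    for a, b in combinations(sorted(set(terms)), 2):
        # Interactions are stored as unordered pairs, hence frozenset.
        if frozenset((a, b)) in known_interactions:
            pairs.append((a, b))
    return pairs


def sentences_for_pair(pair, sentences):
    """Return sentences mentioning both terms (case-insensitive substring match)."""
    a, b = (t.lower() for t in pair)
    return [s for s in sentences if a in s.lower() and b in s.lower()]


# Illustrative inputs only; not data from the paper.
terms = ["APP", "PSEN1", "APOE"]
known_interactions = {frozenset(("APP", "PSEN1"))}
sentences = [
    "PSEN1 mutations alter the processing of APP.",
    "APOE is a risk factor for late-onset disease.",
]

pairs = candidate_pairs(terms, known_interactions)
hits = {pair: sentences_for_pair(pair, sentences) for pair in pairs}
```

Substring matching is used here only to keep the sketch short; a real pipeline would likely need tokenization and gene-name normalization to avoid false matches on short symbols.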

Tag-SNP selection and online database construction for haplotype-based marker development in tomato (유전자 단위 haplotype을 대변하는 토마토 Tag-SNP 선발 및 웹 데이터베이스 구축)

  • Jeong, Hye-ri;Lee, Bo-Mi;Lee, Bong-Woo;Oh, Jae-Eun;Lee, Jeong-Hee;Kim, Ji-Eun;Jo, Sung-Hwan
    • Journal of Plant Biotechnology / v.47 no.3 / pp.218-226 / 2020
  • This report describes methods for selecting informative single nucleotide polymorphisms (SNPs) and the development of an online Solanaceae genome database, using 234 tomato resequencing data entries deposited in the NCBI SRA database. The analysis included 126 accessions of Solanum lycopersicum, 68 accessions of Solanum lycopersicum var. cerasiforme, and 33 accessions of Solanum pimpinellifolium, which are frequently used for breeding, along with some wild-species tomato accessions. To select tag-SNPs, we identified 29,504,960 SNPs in the 234 tomatoes and then separated them into genic and intergenic regions according to gene annotation. Tag-SNPs were selected from the non-synonymous SNPs present in each gene region, which yielded tag-SNPs from 13,845 genes. When a gene contained no non-synonymous SNPs, tag-SNPs were selected from its synonymous SNPs. The total number of tag-SNPs selected was 27,539. To increase the usefulness of the information, a Solanaceae genome database website, TGsol (http://tgsol.seeders.co.kr/), was constructed to allow users to search for detailed information on resources, SNPs, haplotypes, and tag-SNPs. Users can look up the tag-SNPs and flanking sequences for each gene by searching for a gene name or gene position through the genome browser. The website can be used to search efficiently for trait-related genes or to develop molecular markers.
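
The fallback rule described in this abstract (prefer non-synonymous SNPs within a gene, and use synonymous SNPs only when none exist) can be illustrated with a short sketch. The record layout (`gene`, `pos`, `effect`) and the function name are assumptions made for this example, and the abstract does not say how the final tag-SNP is chosen among the candidates, so the sketch stops at building the candidate pool per gene.

```python
from collections import defaultdict


def candidate_snps_by_gene(snps):
    """Group genic SNPs by gene, preferring non-synonymous SNPs.

    Each SNP is a dict with hypothetical keys 'gene', 'pos', and 'effect',
    where 'effect' is 'nonsynonymous' or 'synonymous'.
    """
    by_gene = defaultdict(lambda: {"nonsynonymous": [], "synonymous": []})
    for snp in snps:
        if snp["effect"] in ("nonsynonymous", "synonymous"):
            by_gene[snp["gene"]][snp["effect"]].append(snp["pos"])

    candidates = {}
    for gene, groups in by_gene.items():
        # Fall back to synonymous SNPs only when no non-synonymous SNP exists.
        candidates[gene] = groups["nonsynonymous"] or groups["synonymous"]
    return candidates


# Illustrative records only; not data from the paper.
snps = [
    {"gene": "geneA", "pos": 101, "effect": "nonsynonymous"},
    {"gene": "geneA", "pos": 250, "effect": "synonymous"},
    {"gene": "geneB", "pos": 77, "effect": "synonymous"},
]
candidates = candidate_snps_by_gene(snps)
```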

Design and Implementation of SOAP Servers Object Model for Gene Interaction Databases (유전자 상호작용 데이터베이스 SOAP서버 객체 모델의 설계 및 구현)

  • Lee Ho Il;Yoo Seongjoon;Kim Minkyung
    • Journal of KIISE:Databases / v.32 no.2 / pp.120-128 / 2005
  • Major bioinformatics databases (DDBJ, ENSEMBL, KEGG, etc.) have recently begun providing analysis tools and data through web services for the convenience of bioinformaticians. Defining SOAP server objects and their methods is therefore important for offering such services. We define SOAP server objects for interaction databases such as BIND, MINT, and DIP.

Two-level Classification for Large-scale Fingerprint Identification System (대규모 지문식별시스템을 위한 2단계 분류)

  • 민준기;윤은경;조성배
    • Proceedings of the Korean Information Science Society Conference / 2004.04b / pp.730-732 / 2004
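  • A fingerprint recognition system can be broadly divided into three parts: a feature extraction stage, a search stage that finds candidate fingerprints similar to the input fingerprint, and a verification stage that determines whether the input fingerprint and the candidate fingerprints are identical. When a recognition system is built on a large-scale fingerprint database, speed must be considered along with accuracy. To improve the overall performance of fingerprint recognition systems, this paper proposes, as an improvement at the classification stage, a two-level classification method that performs genetic-algorithm-based feature selection and combines the selected features into a multiple classifier. Experiments on NIST Database 4 yielded a classification rate comparable to that of previous studies, and the genetic algorithm was able to suggest a suitable combination of directional features.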

Integrated Genetic Algorithm with Direct Search for Optimum Design of RC Frames (직접탐색을 이용한 유전자 알고리즘에 의한 RC 프레임의 최적설계)

  • Kwak, Hyo-Gyoung;Kim, Ji-Eun
    • Journal of the Computational Structural Engineering Institute of Korea / v.21 no.1 / pp.21-34 / 2008
  • An improved optimum design method for reinforced concrete frames is presented that integrates a genetic algorithm (GA) with a direct search method. First, various sets of initially assumed sections are generated using the GA; then, for each resulting set of member design forces, optimum solutions are selected by regression analysis and direct search within a predetermined database of design sections. Global optimum solutions are then selected from the results accumulated over several generations. The proposed algorithm compensates for a weakness of the standard GA, namely slow convergence that degrades the quality of the final solutions, and shows fast convergence together with improved results. Moreover, to improve economic efficiency, the optimum design is based on nonlinear structural analysis, so that each member resists the given loading with a capacity as close as possible to the demand. The effectiveness of the proposed design procedure is investigated through a correlation study on example structures.

Development and Performance Evaluation of Parallel Sequence Analysis System on PC-Cluster (PC-Cluster 기반 병렬형 유전자 서열 검색 시스템의 개발 및 성능 평가)

  • Shin Yong-Won;Park Jeong-Seon
    • Journal of Biomedical Engineering Research / v.25 no.6 / pp.617-621 / 2004
  • Researchers in bioinformatics increasingly need to analyze thousands of genome sequences efficiently, following the introduction of new analysis methods and technologies such as genome expression microchips. This rapid growth in bioengineering calls for computing resources that can analyze genome sequences quickly, but such resources are often not introduced because of the enormous investment required. The core components of this study are an integrated environment based on a PC-Cluster system, a high-speed network connection of up to 155 Mbps, and a system for continuously collecting bio-information from domestic and international sources. The results of the study are the establishment and stabilization of an information and communication infrastructure, the establishment and stabilization of a high-performance computer network of up to 155 Mbps, the development of a PC-Cluster system with 32 nodes, a parallel BLAST on the cluster system that provides scalable speedup in response time, and the development of a collection and search system for bio-information.

A Study on XML Compress method for efficient integration and storing of XML-based Clinical Information (XML 기반의 통합 임상정보를 효율적으로 저장하기 위한 XML 압축 기법에 대한 연구)

  • Yu, Wee-Hyuk;Jeong, Jong-Il;Lee, Tae-Heon;Shin, Dong-Kyoo;Shin, Dong-Il
    • Proceedings of the Korea Information Processing Society Conference / 2005.05a / pp.71-74 / 2005
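  • Clinical information documents contain not only patient medical records but also prescriptions and personal genetic information. Exchanging and sharing these documents among hospital systems makes it possible to provide high-quality medical services. Previous studies on integrating clinical information have converted HL7 messages into XML documents and stored XML-based CDA documents in relational databases. A relational database, however, creates and stores data in tables defined per data item of a document. Because HL7 and CDA are document-centric XML documents, storing them in a relational database leads to a growing number of tables owing to the many variations between documents. To select a database suited to such irregular structures, this paper compares a native XML database with a relational database and presents a compression technique for efficient storage. A clinical information database that applies the compression technique reduces the size of large clinical information documents and thereby improves storage efficiency.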

Capacity Analysis of Civil Defense Shelter and Optimal Positioning Using Spatial-Database and Genetic Algorithm (공간데이터베이스와 유전자 알고리즘을 활용한 민방위대피소 수용 능력 분석 및 최적 위치 선정)

  • Yoo, Su Hong;Bae, Jun Su;Lee, Ji Sang;Sohn, Hong Gyoo
    • KSCE Journal of Civil and Environmental Engineering Research / v.39 no.6 / pp.955-963 / 2019
  • Currently, civil defense shelters are established and managed under the initiative of the central and local governments to protect the lives of citizens. In the future, shelters will need to be operated more efficiently by expanding general shelters, including designated dedicated shelters. It is therefore more efficient to consider the distribution of residents and the accessibility of shelters than to plan quantitatively on the number of residents alone. This study uses genetic algorithms and the Huff gravity model, based on census output data, building data, and road network information, to capture the distribution of inhabitants more precisely than existing administrative district data allow. In addition, a spatial database was used for efficient data management and fast processing; with further improvement, this study could serve as a basis for selecting and improving general shelter locations over a wider area.
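
For readers unfamiliar with the Huff gravity model named in this abstract: in its standard form, the probability that residents at origin i choose destination j is P_ij = (S_j / d_ij^β) / Σ_k (S_k / d_ik^β), where S_j is the attractiveness of destination j, d_ij is the distance from i to j, and β is a distance-decay exponent. The sketch below computes these probabilities for a single origin. Using shelter capacity as the attractiveness term and β = 2 are assumptions made for illustration; the abstract does not state the paper's parameter choices.

```python
def huff_probabilities(attractiveness, distances, beta=2.0):
    """Probability that residents at one origin choose each shelter.

    attractiveness: list of S_j (assumed here to be shelter capacity)
    distances: list of d_ij from the origin to each shelter (must be > 0)
    beta: distance-decay exponent (assumed value, not from the paper)
    """
    utilities = [s / (d ** beta) for s, d in zip(attractiveness, distances)]
    total = sum(utilities)
    return [u / total for u in utilities]


# Illustrative numbers only: two shelters of equal capacity,
# the second twice as far from the origin as the first.
probs = huff_probabilities([100.0, 100.0], [1.0, 2.0], beta=2.0)
```

With equal attractiveness, the nearer shelter dominates; raising β sharpens that preference, while β = 0 would ignore distance entirely.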