• Title/Summary/Keyword: biological information processing

PubMine: An Ontology-Based Text Mining System for Deducing Relationships among Biological Entities

  • Kim, Tae-Kyung;Oh, Jeong-Su;Ko, Gun-Hwan;Cho, Wan-Sup;Hou, Bo-Kyeng;Lee, Sang-Hyuk
    • Interdisciplinary Bio Central / v.3 no.2 / pp.7.1-7.6 / 2011
  • Background: Published manuscripts are the main source of biological knowledge. Because manual examination is practically impossible given the huge volume of literature (approximately 19 million abstracts in PubMed), intelligent text mining systems are of great utility for knowledge discovery. However, most current text mining tools have limited applicability because they i) provide abstract-based rather than sentence-based search, ii) use ontology terms improperly or not at all, iii) are designed for specific subjects, or iv) respond too slowly for web services and real-time applications. Results: We introduce an advanced text mining system called PubMine that supports intelligent knowledge discovery based on diverse bio-ontologies. PubMine improves query accuracy and flexibility with advanced search capabilities: fuzzy search, wildcard search, proximity search, range search, and Boolean combinations of these. Furthermore, PubMine allows users to extract multi-dimensional relationships between genes, diseases, and chemical compounds by using OLAP (On-Line Analytical Processing) techniques. The HUGO gene symbols and the MeSH ontology for diseases, chemical compounds, and anatomy are included in the current version of PubMine, which is freely available at http://pubmine.kobic.re.kr. Conclusions: PubMine is a unique bio-text mining system that provides flexible searches and analysis of biological entity relationships. We believe that PubMine will serve as a key bioinformatics utility because its rapid response enables web services for the community and its flexibility accommodates general ontologies.

Speech Signal Processing using Pitch Synchronous Multi-Spectra and DSP System Design in Cochlear Implant (피치동기 다중 스펙트럼을 이용한 청각보철장치의 음성신호처리 및 DSP 시스템 설계)

  • Shin, J. I.;Park, S. J.;Shin, D. K.;Lee, J. H.;Park, S. H.
    • Journal of Biomedical Engineering Research / v.20 no.4 / pp.495-502 / 1999
  • In this paper, we propose efficient speech signal processing algorithms and a system for cochlear implants. The outer and middle ear, which perform amplification, lowpass filtering, and automatic gain control (AGC), are modeled by an analog system, while the inner ear, which acts as a time-delayed multi-filter and transducer, is implemented by a DSP circuit that enables real-time processing. In particular, the basilar membrane characteristic of the inner ear is modeled by a nonlinear filter bank, and the tonotopy and periodicity of the auditory system are satisfied by using a pitch-synchronous multi-spectra (PSMS) method. Moreover, most of the speech processing is performed in software, so the system can be easily modified. Because our program is written in C, it can be easily ported to systems using other processors.

Mining Maximal Frequent Contiguous Sequences in Biological Data Sequences (생물학적 데이터 서열들에서 빈번한 최대길이 연속 서열 마이닝)

  • Kang, Tae-Ho;Yoo, Jae-Soo
    • The KIPS Transactions:PartD / v.15D no.2 / pp.155-162 / 2008
  • Biological sequences such as DNA sequences and amino acid sequences typically contain a large number of items. They contain contiguous sequences that ordinarily consist of hundreds of frequent items. In biological sequence analysis (BSA), searching for frequent contiguous sequences is one of the most important operations. Many studies have addressed efficient sequential pattern mining, and most of the existing methods are based on the Apriori algorithm. In particular, the prefixSpan algorithm is one of the most efficient sequential pattern mining schemes based on the Apriori algorithm. However, since the algorithm expands sequential patterns starting from frequent patterns of length 1, it is not suitable for biological datasets with long frequent contiguous sequences. More recently, the MacosVSpan algorithm was proposed, based on the idea of the prefixSpan algorithm, to significantly reduce its recursive process. However, that algorithm is still inefficient for mining frequent contiguous sequences from long biological data sequences. In this paper, we propose an efficient method for mining maximal frequent contiguous sequences in large biological data sequences by constructing a spanning tree with a fixed length. To verify the superiority of the proposed method, we perform experiments in various environments. The experiments show that the proposed method is much more efficient than MacosVSpan in terms of retrieval performance.
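
To make the term "maximal frequent contiguous sequence" concrete, the following is a minimal brute-force sketch in Python. It is not the fixed-length spanning tree method of the paper, nor prefixSpan or MacosVSpan; it simply enumerates substrings level by level. The function names and the sequence-level definition of support (a pattern counts once per sequence) are assumptions made for illustration.

```python
from collections import defaultdict


def frequent_contiguous(sequences, min_support):
    """Return {substring: support} for every contiguous substring that
    occurs in at least `min_support` of the input sequences.

    Assumption: support is counted once per sequence, not per occurrence.
    """
    frequent = {}
    length = 1
    while True:
        counts = defaultdict(int)
        for seq in sequences:
            # Distinct substrings of the current length in this sequence.
            seen = {seq[i:i + length] for i in range(len(seq) - length + 1)}
            for sub in seen:
                counts[sub] += 1
        level = {s: c for s, c in counts.items() if c >= min_support}
        if not level:
            # No frequent pattern of this length, so none can exist at
            # any greater length (every substring of a frequent pattern
            # is itself frequent).
            break
        frequent.update(level)
        length += 1
    return frequent


def maximal_only(frequent):
    """Keep only patterns not contained in a longer frequent pattern."""
    return {
        p: c
        for p, c in frequent.items()
        if not any(p != q and p in q for q in frequent)
    }


if __name__ == "__main__":
    dna = ["ACGTAC", "TACGTA", "GGACGT"]
    print(maximal_only(frequent_contiguous(dna, min_support=3)))
```

This enumeration grows one item at a time from length-1 patterns, which is exactly the behavior the abstract describes as unsuitable for long biological sequences; it is meant only to pin down the definitions.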

Gender Classification System Based on Deep Learning in Low Power Embedded Board (저전력 임베디드 보드 환경에서의 딥 러닝 기반 성별인식 시스템 구현)

  • Jeong, Hyunwook;Kim, Dae Hoe;Baddar, Wisam J.;Ro, Yong Man
    • KIPS Transactions on Software and Data Engineering / v.6 no.1 / pp.37-44 / 2017
  • As the IoT (Internet of Things) industry spreads, it is becoming very important for devices to recognize user information by themselves, without any explicit control. Above all, gender (male or female) is a dominant factor in analyzing user information because of the social and biological differences between males and females. However, since each gender exhibits diverse facial features, face-based gender classification remains a challenging research field. In addition, to apply a gender classification system to IoT, the device must be small and must operate at low power. To make gender classification usable in the real world, this paper makes two contributions. The first is a new gender classification algorithm based on deep learning, and the second is an implementation of a real-time gender classification system on a low-power embedded board. In our experiments, we measured the frames per second of gender classification processing and the power consumption in a PC environment and in a mobile GPU environment. The results verify that the deep-learning-based gender classification system works well in the mobile GPU environment, with lower power consumption than in the PC environment.

Mathematical modeling for flocking flight of autonomous multi-UAV system, including environmental factors

  • Kwon, Youngho;Hwang, Jun
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.2 / pp.595-609 / 2020
  • In this study, we propose a decentralized mathematical model for predictive control of a system of multiple autonomous unmanned aerial vehicles (UAVs), also known as drones. Being decentralized and autonomous implies that all members make their own decisions and fly depending on the dynamic information received from other unmanned aircraft in the area. We consider a variety of realistic characteristics, including time delay and communication locality. In this flocking flight, there is no central data processing and no central control over each UAV, as each UAV runs its collision avoidance algorithm by itself. The main contribution of this work is a mathematical model for stable group flight even in adverse weather conditions (e.g., heavy wind or rain), which are modeled by adding Gaussian noise. Two proposed variance control algorithms are presented in this work. One is based on a simple biological imitation from statistical physical modeling, which mimics animal group behavior; the other is an algorithm for cooperatively tracking an object, which aligns the velocities of neighboring agents with each other. We demonstrate the stability of the control algorithm and its applicability in autonomous multi-drone systems using numerical simulations.
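
The idea of aligning the velocities of neighboring agents under Gaussian disturbances can be sketched as follows. This is a generic, Vicsek-style alignment rule written in Python for illustration, not the model proposed in the paper; the function names, the 2-D setting, and the parameters `radius`, `gain`, `noise_std`, and `dt` are all assumptions, and time delay and collision avoidance are omitted.

```python
import math
import random


def step(positions, velocities, radius=5.0, gain=0.5,
         noise_std=0.05, dt=0.1, rng=random):
    """Advance all agents by one time step.

    Each agent moves its velocity toward the mean velocity of the agents
    within `radius` (itself included), then Gaussian noise is added to
    stand in for disturbances such as wind.
    """
    new_velocities = []
    for p_i, v_i in zip(positions, velocities):
        # Communication locality: only agents within `radius` are heard.
        neighbors = [v_j for p_j, v_j in zip(positions, velocities)
                     if math.dist(p_i, p_j) <= radius]
        mean = tuple(sum(v[k] for v in neighbors) / len(neighbors)
                     for k in range(2))
        new_velocities.append(tuple(
            v_i[k] + gain * (mean[k] - v_i[k]) + rng.gauss(0.0, noise_std)
            for k in range(2)))
    new_positions = [tuple(p[k] + dt * v[k] for k in range(2))
                     for p, v in zip(positions, new_velocities)]
    return new_positions, new_velocities


def velocity_spread(velocities):
    """Largest distance between any agent's velocity and the mean velocity."""
    n = len(velocities)
    mean = tuple(sum(v[k] for v in velocities) / n for k in range(2))
    return max(math.dist(v, mean) for v in velocities)


if __name__ == "__main__":
    rng = random.Random(0)
    pos = [(rng.uniform(0, 3), rng.uniform(0, 3)) for _ in range(6)]
    vel = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(6)]
    print("initial spread:", velocity_spread(vel))
    for _ in range(50):
        pos, vel = step(pos, vel, rng=rng)
    print("spread after 50 noisy steps:", velocity_spread(vel))
```

With `noise_std=0` and every agent inside every other agent's radius, each step halves the deviation from the common mean velocity (for `gain=0.5`), which makes the consensus behavior easy to check by hand.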

Adaptive Selective Compressive Sensing based Signal Acquisition Oriented toward Strong Signal Noise Scene

  • Wen, Fangqing;Zhang, Gong;Ben, De
    • KSII Transactions on Internet and Information Systems (TIIS) / v.9 no.9 / pp.3559-3571 / 2015
  • This paper addresses the problem of acquiring a signal that has a sparse representation in a given orthonormal basis from a small number of noisy measurements. The authors formulate the problem of random measurement under strong signal noise. The impact of white Gaussian signal noise on recovery performance is analyzed to provide a theoretical basis for a reasonable design of the measurement matrix. Based on the idea that the measurement matrix can be adapted for noise suppression in an adaptive CS system, an adaptive selective compressive sensing (ASCS) scheme is proposed whose measurement matrix is updated according to the noise information fed back by the processing center. The scheme is compared with several nonadaptive methods and existing CS measurement approaches in terms of objective recovery quality, failure rate, and mean-square error (MSE). Extensive numerical experiments show that the proposed scheme has better noise suppression performance and improves the support recovery of sparse signals. The proposed scheme has considerable potential for broadband signal applications such as biological signal measurement and radar signal detection.

Suffix Tree Constructing Algorithm for Large DNA Sequences Analysis (대용량 DNA서열 처리를 위한 서픽스 트리 생성 알고리즘의 개발)

  • Choi, Hae-Won
    • Journal of Korea Society of Industrial Information Systems / v.15 no.1 / pp.37-46 / 2010
  • A suffix tree is an efficient data structure that exposes the internal structure of a string and allows efficient solutions to a wide range of complex string problems, particularly in the area of computational biology. However, as the volume of biological information explodes, it becomes impossible to construct suffix trees in main memory, so an efficient technique is needed to construct the trees in secondary storage. In this paper, we present a method for constructing a suffix tree on disk for a large set of DNA strings using a new index scheme. We also show a typical application example that uses the suffix tree on disk.
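
As a rough illustration of how a suffix structure answers substring queries, here is a minimal Python sketch of an uncompressed suffix trie held in main memory. It is not the disk-based construction or the index scheme of the paper, and it uses quadratic space, so it only suits short strings; the names `build_suffix_trie`, `find_occurrences`, and the `LEAF` marker are hypothetical.

```python
LEAF = "start"  # multi-character key, so it cannot clash with a 1-char edge


def build_suffix_trie(text):
    """Insert every suffix of `text` (plus a '$' terminator) into a trie.

    Each node is a dict mapping one character to a child node; the node
    reached at the end of a suffix stores that suffix's start index.
    """
    text += "$"  # assumes '$' does not occur in the input text
    root = {}
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            node = node.setdefault(ch, {})
        node[LEAF] = start
    return root


def find_occurrences(root, pattern):
    """Return the sorted start positions of `pattern` in the indexed text."""
    node = root
    for ch in pattern:
        if ch not in node:
            return []
        node = node[ch]
    # Every leaf below this node is a suffix that begins with `pattern`.
    starts, stack = [], [node]
    while stack:
        current = stack.pop()
        for key, child in current.items():
            if key == LEAF:
                starts.append(child)
            else:
                stack.append(child)
    return sorted(starts)


if __name__ == "__main__":
    trie = build_suffix_trie("GATTACA")
    print(find_occurrences(trie, "TA"))
    print(find_occurrences(trie, "A"))
```

A query walks at most `len(pattern)` edges before collecting leaves, independent of the text length, which is the property that makes suffix trees attractive for DNA search. A real suffix tree compresses single-child chains to reach linear space.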

Gene Expression Profiling of Eukaryotic Microalga, Haematococcus pluvialis

  • Eom, Hyunsuk;Park, Seunghye;Lee, Choul-Gyun;Jin, Eonseon
    • Journal of Microbiology and Biotechnology / v.15 no.5 / pp.1060-1066 / 2005
  • Under environmental stress, such as strong irradiance or nitrogen deficiency, unicellular green algae of the genus Haematococcus accumulate secondary carotenoids, i.e. astaxanthin, in the cytosol. The induction and regulation of astaxanthin biosynthesis in microalgae has recently received considerable attention owing to the increasing use of secondary carotenoids as a source of pigmentation in fish aquaculture and as a potential drug in cancer prevention, acting as a free-radical quencher. Accordingly, this study generated expressed sequence tags (ESTs) from a library constructed from astaxanthin-induced Haematococcus pluvialis. Partial sequences were obtained from the 5' ends of 1,858 individual cDNAs and then grouped into 1,025 non-overlapping sequences, among which 708 sequences were singletons, while the remainder fell into 317 clusters. Approximately 63% of the EST sequences showed similarity to previously described sequences in public databases. H. pluvialis was found to have a relatively high percentage of genes involved in genetic information processing (15%) and metabolism (11%), whereas a relatively low percentage of sequences was involved in signal transduction (3%), structure (2%), and environmental information processing (3%). In addition, a relatively large fraction of H. pluvialis sequences was classified as genes involved in photosynthesis (9%) and cellular processes (9%). Based on this EST analysis, the full-length cDNA sequence for superoxide dismutase (SOD) of H. pluvialis was cloned, and the expression of this gene was investigated. The abundance of SOD changed substantially in response to different culture conditions, indicating the possible regulation of this gene in H. pluvialis.

Construction of PANM Database (Protostome DB) for rapid annotation of NGS data in Mollusks

  • Kang, Se Won;Park, So Young;Patnaik, Bharat Bhusan;Hwang, Hee Ju;Kim, Changmu;Kim, Soonok;Lee, Jun Sang;Han, Yeon Soo;Lee, Yong Seok
    • The Korean Journal of Malacology / v.31 no.3 / pp.243-247 / 2015
  • A stand-alone BLAST server is available that provides a convenient platform for the analysis of molluscan sequence information, especially the EST sequences generated by traditional sequencing methods. However, the server has limitations in annotating molluscan sequences generated using next-generation sequencing (NGS) platforms, owing to inconsistencies in the molluscan sequences available at NCBI. We constructed a web-based interface for a new stand-alone BLAST database, called PANM-DB (Protostome DB), for the analysis of molluscan NGS data. PANM-DB includes the amino acid sequences from the protostome groups Arthropoda, Nematoda, and Mollusca, downloaded from GenBank with the NCBI Taxonomy Browser. The sequences were converted into multi-FASTA format and stored in the database by using the formatdb program from NCBI. PANM-DB contains 6% of the NCBInr database sequences (as of 24-06-2015), and for an input of 10,000 RNA-seq sequences, processing was 15 times faster with PANM-DB than with the NCBInr DB. It was also noted that PANM-DB shows two times more significant hits, with diverse annotation profiles, than the Mollusks DB. Hence, the construction of PANM-DB is a significant step in the annotation of molluscan sequence information obtained from NGS platforms. PANM-DB is freely downloadable from the web-based interface (Malacological Society of Korea, http://malacol.or/kr/blast) as a compressed file system and can run on any compatible operating system.

Application of Geographical Information System on Golf Course Design for Reduction of Environmental Impacts (지형정보시스템기법을 이용한 친환경적 골프코스 설계)

  • Joo, Young-Kyoo;Lee, Whal-Hee;Lee, Mu-Chun
    • Asian Journal of Turfgrass Science / v.20 no.1 / pp.93-105 / 2006
  • The construction of golf courses has had adverse effects on the natural landscape and delicate ecosystem of Korea. Efficient planning and design are necessary to minimize the environmental impact of construction. However, with ordinary design methods, data processing is limited by the massive scale of golf course development projects. Conventional design methods lack a proper tool for preparing alternative plans that estimate landscape destruction in advance or minimize the environmental impact. Therefore, advanced computerized techniques need to be adopted for golf course design to solve the problems concerning environmental impacts. A geographic information system (GIS) was applied throughout the process, from geographical data input and analysis to the final outputs. Simulation based on integrated database management enables the design to be examined in advance from the viewpoint of environmental impact assessment. It also makes it possible to evaluate plans easily and to propose suitable alternatives. Precise computer-based quantity calculation for engineering works should help ensure that the construction is scientifically, economically, and environmentally sound.