• Title/Summary/Keyword: BioPython

Benchmarking of BioPerl, Perl, BioJava, Java, BioPython, and Python for Primitive Bioinformatics Tasks and Choosing a Suitable Language

  • Ryu, Tae-Wan
    • International Journal of Contents / v.5 no.2 / pp.6-15 / 2009
  • Recently many different programming languages have emerged for the development of bioinformatics applications. In addition to the traditional languages, languages from open source projects such as BioPerl, BioPython, and BioJava have become popular because they provide special tools for biological data processing and are easy to use. However, it is not well-studied which of these programming languages will be most suitable for a given bioinformatics task and which factors should be considered in choosing a language for a project. Like many other application projects, bioinformatics projects also require various types of tasks. Accordingly, it will be a challenge to characterize all the aspects of a project in order to choose a language. However, most projects require some common and primitive tasks such as file I/O, text processing, and basic computation for counting, translation, statistics, etc. This paper presents the benchmarking results of six popular languages, Perl, BioPerl, Python, BioPython, Java, and BioJava, for several common and simple bioinformatics tasks. The experimental results of each language are compared through quantitative evaluation metrics such as execution time, memory usage, and size of the source code. Other qualitative factors, including writeability, readability, portability, scalability, and maintainability, that affect the success of a project are also discussed. The results of this research can be useful for developers in choosing an appropriate language for the development of bioinformatics applications.
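The quantitative metrics named in the abstract (execution time and memory usage) can be illustrated for one primitive task, base counting. This is a minimal sketch, not the paper's benchmark harness: the helper names `count_bases` and `benchmark` and the synthetic sequence are assumptions, and `tracemalloc` reports only memory allocated by Python objects, not total process memory.

```python
import time
import tracemalloc
from collections import Counter


def count_bases(seq):
    """Count occurrences of each character in a sequence string."""
    return Counter(seq)


def benchmark(func, *args):
    """Run func once; return (result, elapsed seconds, peak traced bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, elapsed, peak


# Hypothetical input: a synthetic 400,000-base sequence, not data from the paper.
sequence = "ACGT" * 100_000
counts, seconds, peak_bytes = benchmark(count_bases, sequence)
print(counts["A"], f"{seconds:.4f} s", f"{peak_bytes} bytes peak")
```

A single run is noisy; repeating the measurement and reporting a median would give a fairer comparison between implementations.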

Differences between Species Based on Multiple Sequence Alignment Analysis (다중서열정렬에 기반한 종의 차이)

  • Hyeok-Zu Kwon;Sang-Jin Kim;Geun-Mu Kim
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.19 no.2 / pp.467-472 / 2024
  • Multiple sequence alignment (MSA) is a method of collecting and aligning, in a single step, multiple protein or nucleic acid sequences that perform the same function in various organisms. ClustalW, a representative multiple sequence alignment algorithm, is run through BioPython to compare the degree of alignment at each column position. In addition, a web logo and a phylogenetic tree are created to visualize conserved sequences and aid understanding. An example is given to confirm the differences between humans and other species, and applications of BioPython are presented.
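Comparing an alignment by column position can be sketched in plain Python once the aligned sequences are available as equal-length strings. The function name `column_conservation` and the toy alignment are hypothetical, and this is not the paper's ClustalW or BioPython workflow; gap characters are counted like any other symbol here.

```python
from collections import Counter


def column_conservation(aligned):
    """For each alignment column, return (most common symbol, its fraction)."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    profile = []
    # zip(*aligned) walks the alignment column by column.
    for column in zip(*aligned):
        symbol, count = Counter(column).most_common(1)[0]
        profile.append((symbol, count / len(column)))
    return profile


# Hypothetical toy alignment; "-" marks a gap.
alignment = ["ACG-T", "ACGAT", "ATGAT"]
profile = column_conservation(alignment)
for position, (symbol, fraction) in enumerate(profile, start=1):
    print(position, symbol, round(fraction, 2))
```

Columns with a fraction of 1.0 are fully conserved, which is the kind of information a sequence logo displays graphically.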

Python Package Prototype for Adaptive Optics Modeling and Simulation

  • Choi, Seonghwan;Bang, Byungchae;Kim, Jihun;Jung, Gwanghee;Baek, Ji-Hye;Park, Jongyeob;Han, Jungyul;Kim, Yunjong
    • The Bulletin of The Korean Astronomical Society / v.46 no.2 / pp.53.3-53.3 / 2021
  • Adaptive Optics (AO) was first studied in the field of astronomy, and its applications have been extended to lasers, microscopy, biology, medicine, and free-space laser communication. AO modelling and simulation are required throughout the system development process. They are necessary not only for proper design but also for performance verification after the final system is built. At KASI, we are developing an AO Python package for AO modelling and simulation. It includes modelling classes for the atmosphere, telescope, Shack-Hartmann wavefront sensor, and deformable mirror, which are the components of an AO system. It can also simulate the entire AO system over time. It is being developed in the Super Eye Bridge project, which aims to develop a segmented mirror, an adaptive optics system, and an immersion grating spectrograph as future telescope technologies. It is also planned to be used as a performance analysis system for several telescope projects in Korea.

Estimation and Validation of the Leaf Areas of Five June-bearing Strawberry (Fragaria × ananassa) Cultivars using Non-destructive Methods (일계성 딸기 5품종의 비파괴적 방법을 사용한 엽면적 추정 및 검증)

  • Jo, Jung Su;Sim, Ha Seon;Jung, Soo Bin;Moon, Yu Hyun;Jo, Won Jun;Woo, Ui Jeong;Kim, Sung Kyeom
    • Journal of Bio-Environment Control / v.31 no.2 / pp.98-103 / 2022
  • Non-destructive estimation of leaf area is a more efficient and convenient method than leaf excision. Thus, several models predicting leaf area have been developed for various horticultural crops. However, there are limited studies on estimating the leaf area of strawberry plants. In this study, we predicted the leaf areas via nonlinear regression analysis using the leaf lengths and widths of three-compound leaves in five domestic strawberry cultivars ('Arihyang', 'Jukhyang', 'Keumsil', 'Maehyang', and 'Seolhyang'). The coefficient of determination (R²) between the actual and estimated leaf areas varied from 0.923 to 0.973. The R² value varied for each cultivar; thus, leaf area estimation models must be developed for each cultivar. The leaf areas of the three cultivars 'Jukhyang', 'Seolhyang', and 'Maehyang' could be non-destructively predicted using the model developed in this study, as they had R² values over 0.96. The cultivars 'Arihyang' and 'Keumsil' had slightly lower R² values, 0.938 and 0.923, respectively. The leaf area estimation model for each cultivar was coded in Python and is provided in this manuscript. The estimation models developed in this study could be used extensively in other strawberry-related studies.
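The abstract does not state the model form, so the following is only a sketch under an assumed power model, area = a × (length × width)^b, fitted by least squares on log-transformed values as a stand-in for the paper's nonlinear regression. The function names and the measurements are hypothetical; the data are generated from a known rule purely to show the fit and the R² calculation.

```python
import math


def fit_power_model(lengths, widths, areas):
    """Fit area = a * (length * width) ** b by least squares in log-log space."""
    xs = [math.log(l * w) for l, w in zip(lengths, widths)]
    ys = [math.log(area) for area in areas]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def r_squared(actual, predicted):
    """Coefficient of determination between actual and predicted values."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    ss_tot = sum((y - mean_actual) ** 2 for y in actual)
    return 1 - ss_res / ss_tot


# Hypothetical measurements (cm, cm, cm^2) generated from area = 0.7 * L * W;
# these are not values from the study.
lengths = [5.0, 6.0, 7.5, 8.0, 9.0]
widths = [4.0, 5.0, 6.0, 6.5, 7.0]
areas = [0.7 * l * w for l, w in zip(lengths, widths)]

a, b = fit_power_model(lengths, widths, areas)
predicted = [a * (l * w) ** b for l, w in zip(lengths, widths)]
r2 = r_squared(areas, predicted)
print(round(a, 3), round(b, 3), round(r2, 3))
```

Fitting one such model per cultivar would mirror the abstract's point that R² differs between cultivars.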

Development of Deep Learning AI Model and RGB Imagery Analysis Using Pre-sieved Soil (입경 분류된 토양의 RGB 영상 분석 및 딥러닝 기법을 활용한 AI 모델 개발)

  • Kim, Dongseok;Song, Jisu;Jeong, Eunji;Hwang, Hyunjung;Park, Jaesung
    • Journal of The Korean Society of Agricultural Engineers / v.66 no.4 / pp.27-39 / 2024
  • Soil texture is determined by the proportions of sand, silt, and clay within the soil, which influence characteristics such as porosity, water retention capacity, electrical conductivity (EC), and pH. Traditional classification of soil texture requires significant sample preparation including oven drying to remove organic matter and moisture, a process that is both time-consuming and costly. This study aims to explore an alternative method by developing an AI model capable of predicting soil texture from images of pre-sorted soil samples using computer vision and deep learning technologies. Soil samples collected from agricultural fields were pre-processed using sieve analysis and the images of each sample were acquired in a controlled studio environment using a smartphone camera. Color distribution ratios based on RGB values of the images were analyzed using the OpenCV library in Python. A convolutional neural network (CNN) model, built on PyTorch, was enhanced using Digital Image Processing (DIP) techniques and then trained across nine distinct conditions to evaluate its robustness and accuracy. The model has achieved an accuracy of over 80% in classifying the images of pre-sorted soil samples, as validated by the components of the confusion matrix and measurements of the F1 score, demonstrating its potential to replace traditional experimental methods for soil texture classification. By utilizing an easily accessible tool, significant time and cost savings can be expected compared to traditional methods.
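One plausible reading of a "color distribution ratio based on RGB values" is each channel's share of the total intensity over all pixels. The sketch below assumes that definition and uses a hypothetical `rgb_ratios` helper on a hand-written pixel list rather than OpenCV; with OpenCV, note that `cv2.imread` returns channels in BGR order, so they would need reordering first.

```python
def rgb_ratios(pixels):
    """Return each channel's share of the summed R, G and B intensities."""
    totals = [0, 0, 0]
    for r, g, b in pixels:
        totals[0] += r
        totals[1] += g
        totals[2] += b
    grand_total = sum(totals)
    if grand_total == 0:
        # An all-black image has no meaningful ratio.
        return (0.0, 0.0, 0.0)
    return tuple(total / grand_total for total in totals)


# Hypothetical pixels in (R, G, B) order, not data from the study.
pixels = [(120, 90, 60), (100, 80, 50), (140, 100, 70)]
ratios = rgb_ratios(pixels)
print([round(value, 3) for value in ratios])
```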

An Integrated Flood Simulation System for Upstream and Downstream of the Agricultural Reservoir Watershed (농촌 유역 저수지 상·하류 통합 홍수 모의 시스템 구축 및 적용)

  • Kwak, Jihye;Kim, Jihye;Lee, Hyunji;Lee, Junhyuk;Cho, Jaepil;Kang, Moon Seong
    • Journal of The Korean Society of Agricultural Engineers / v.65 no.1 / pp.41-49 / 2023
  • To use hydraulic and hydrological models when simulating floods in agricultural watersheds, it is necessary to consider agricultural reservoirs, farmland, and farmland drainage systems, which are characteristic of such watersheds. However, most of these models were developed individually by different researchers, and each has a different simulation scope, so they are hard to use together. As a result, the individual hydraulic and hydrological models need to be linked. Therefore, this study established an integrated flood simulation system for the comprehensive flood simulation of agricultural reservoir watersheds. The system can be applied easily to various watersheds because historical weather data and the SSP (Shared Socio-economic Pathways) climate change scenario database of ninety weather stations are built in. Individual hydraulic and hydrological models were coded and coupled through Python. The system consists of a multiplicative random cascade model, a Clark unit hydrograph model, a frequency analysis model, HEC-5 (Hydrologic Engineering Center-5), HEC-RAS (Hydrologic Engineering Center-River Analysis System), and a farmland drainage simulation model. For external models with limitations in conceptualization, such as HEC-5 and HEC-RAS, the Python interpreter calls the operating system and issues commands to run the models. All models except these two are built based on the logical concept.
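Having the Python interpreter ask the operating system to run an external model can be sketched with the standard `subprocess` module. The function name `run_external_model` and its parameters are hypothetical, and the command below is a stand-in that simply launches another Python process; the actual HEC-5 or HEC-RAS command lines are not given in the abstract and are not reproduced here.

```python
import subprocess
import sys


def run_external_model(command, workdir=None, timeout=600):
    """Run an external program and return its standard output.

    Raises subprocess.CalledProcessError if the program exits with an error,
    and subprocess.TimeoutExpired if it runs longer than `timeout` seconds.
    """
    completed = subprocess.run(
        command,
        cwd=workdir,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return completed.stdout


# Stand-in command (an assumption): a real system would pass the path of the
# model executable and its input file instead.
output = run_external_model([sys.executable, "-c", "print('model finished')"])
print(output.strip())
```

Passing the command as a list avoids shell quoting problems, and `check=True` turns a failed model run into an exception that the coupling code can handle.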

Web-Based Data Processing and Model Linkage Techniques for Agricultural Water-Resource Analysis (농촌유역 물순환 해석을 위한 웹기반 자료 전처리 및 모형 연계 기법 개발)

  • Park, Jihoon;Kang, Moon Seong;Song, Jung-Hun;Jun, Sang Min;Kim, Kyeung;Ryu, Jeong Hoon
    • Journal of The Korean Society of Agricultural Engineers / v.57 no.5 / pp.101-111 / 2015
  • Establishing appropriate data in specific formats is essential for agricultural water cycle analysis, which involves complex interactions and uncertainties such as climate change, socioeconomic change, and watershed environmental change. The main objective of this study was to develop web-based Data processing and Model linkage Techniques for Agricultural Water-Resource analysis (AWR-DMT). The developed techniques consisted of database development, a data processing technique, and a model linkage technique. The study watersheds were the upper Cheongmi stream and Geunsam-Ri. The database was constructed using MS SQL with data code, watershed characteristics, reservoir information, weather station information, meteorological data, processed data, hydrological data, and paddy field information. The AWR-DMT was developed using Python. The processing technique generated probable rainfall data using non-stationary frequency analysis, as well as evapotranspiration data. The model linkage technique built input data for agricultural watershed models, such as the TANK and Agricultural Watershed Supply (AWS) models. This study may contribute to the development of intelligent water cycle analysis by providing data processing and model linkage techniques for agricultural water-resource analysis.

Tea Leaf Disease Classification Using Artificial Intelligence (AI) Models (인공지능(AI) 모델을 사용한 차나무 잎의 병해 분류)

  • K.P.S. Kumaratenna;Young-Yeol Cho
    • Journal of Bio-Environment Control / v.33 no.1 / pp.1-11 / 2024
  • In this study, five artificial intelligence (AI) models, Inception v3, SqueezeNet (local), VGG-16, Painters, and DeepLoc, were used to classify tea leaf diseases. Eight image categories were used: healthy, algal leaf spot, anthracnose, bird's eye spot, brown blight, gray blight, red leaf spot, and white spot. The software used in this study was Orange 3, which functions as a Python library for visual programming and operates through an interface that generates workflows to visually manipulate and analyze the data. The precision of each AI model was recorded to select the ideal AI model. All models were trained using the Adam solver, the rectified linear unit activation function, 100 neurons in the hidden layers, a maximum of 200 iterations in the neural network, and a regularization of 0.0001. The functionality of Orange 3 can be extended by installing add-ons; in this study, the image analytics add-on, which is required for image analysis, was newly added. For training, the import images, image embedding, neural network, test and score, and confusion matrix widgets were used, whereas the import images, image embedding, predictions, and image viewer widgets were used for prediction. The precisions of the neural networks of the five AI models (Inception v3, SqueezeNet (local), VGG-16, Painters, and DeepLoc) were 0.807, 0.901, 0.780, 0.800, and 0.771, respectively. Finally, the SqueezeNet (local) model was selected as the optimal AI model for the detection of tea diseases using tea leaf images owing to its high precision and good performance throughout the confusion matrix.
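Precision can be read off a confusion matrix as the diagonal count divided by the column total for each predicted class. The sketch below uses a hypothetical three-class matrix and a hypothetical `per_class_precision` helper; it is not the Orange 3 widget, and the abstract does not say how the reported precision was averaged across classes, so the macro average shown is an assumption.

```python
def per_class_precision(matrix):
    """Compute precision per class.

    matrix[i][j] is the number of samples whose true class is i and whose
    predicted class is j, so column j holds everything predicted as class j.
    """
    n = len(matrix)
    precisions = []
    for j in range(n):
        predicted_as_j = sum(matrix[i][j] for i in range(n))
        precisions.append(matrix[j][j] / predicted_as_j if predicted_as_j else 0.0)
    return precisions


# Hypothetical confusion matrix for three classes, not results from the study.
matrix = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]
precisions = per_class_precision(matrix)
macro_precision = sum(precisions) / len(precisions)
print([round(p, 3) for p in precisions], round(macro_precision, 3))
```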