• Title/Summary/Keyword: Data Sequence


A DNA Sequence Alignment Algorithm Using Quality Information and a Fuzzy Inference Method (품질 정보와 퍼지 추론 기법을 이용한 DNA 염기 서열 배치 알고리즘)

  • Kim, Kwang-Baek
    • Journal of Intelligence and Information Systems / v.13 no.2 / pp.55-68 / 2007
  • DNA sequence alignment algorithms in computational molecular biology have been improved by diverse methods. In this paper, we propose a DNA sequence alignment algorithm that combines quality information with a fuzzy inference method, using the characteristics of DNA sequence fragments and a fuzzy logic system to improve on conventional alignment methods that use DNA sequence quality information. In conventional algorithms, alignment scores are calculated with the Needleman-Wunsch global sequence alignment algorithm, applying the quality information of each DNA fragment. However, because the quality information of the whole sequence is used, errors may arise in the score calculation when the ends of a DNA fragment are of low quality. The proposed method improves on these quality-based algorithms so that accurate alignment can be achieved even when the fragment ends are of low quality. In addition, the mapping score parameters used to calculate alignment scores are dynamically adjusted by the fuzzy logic system according to the lengths of the DNA fragments and the frequencies of low-quality bases in them. In experiments on real genome data from NCBI (National Center for Biotechnology Information), the proposed method was more efficient than conventional quality-based algorithms for DNA sequence alignment.
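
The idea of applying per-base quality information to Needleman-Wunsch scoring can be illustrated with a minimal sketch. The weighting rule below (scaling each match or mismatch score by the lower of the two base qualities) and all names and default parameters are assumptions chosen for illustration; the abstract does not specify the paper's scoring formula or its fuzzy parameter adjustment.

```python
def quality_weighted_nw(a, b, qa, qb, match=1.0, mismatch=-1.0, gap=-1.0, qmax=40):
    """Global alignment score (Needleman-Wunsch) with quality-weighted substitutions.

    a, b   : sequences (strings)
    qa, qb : per-base quality values (e.g. Phred-like, 0..qmax), same lengths as a, b
    The weight min(qa, qb) / qmax is a hypothetical choice, not the paper's formula.
    """
    n, m = len(a), len(b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    # Aligning a prefix against the empty sequence costs one gap per base.
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Low-quality bases contribute less to the match/mismatch term.
            weight = min(qa[i - 1], qb[j - 1]) / qmax
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + weight * sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,               # gap in b
                score[i][j - 1] + gap,               # gap in a
            )
    return score[n][m]
```

With this weighting, a mismatch at a low-quality fragment end is penalized less than the same mismatch at a high-quality position, which is the kind of effect the abstract attributes to quality-aware scoring.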


Identification of three independent fern gametophytes and Hymenophyllum wrightii f. serratum from Korea based on molecular data

  • LEE, Chang Shook;LEE, Kanghyup;HWANG, Youngsim
    • Korean Journal of Plant Taxonomy / v.50 no.4 / pp.403-412 / 2020
  • Colonies of three independent gametophytes (one filamentous and two ribbon-like) without sporophytes occur in Gyeonggi-do, Gangwon-do, Gyeongsang-do, and Jeju-do, Korea. At first sight they have a moss-like appearance, with tiny plantlets and gemmae, and they grow in cool, shaded, relatively deep hollows in large rocks, such as small caves in high mountains close to valleys. The gametophytes were identified from morphological data and chloroplast DNA (cpDNA) sequence data (the rbcL and rps4 genes and the rps4-trnS intergenic spacer). One of the independent gametophytes found in Korea has the same morphology and DNA sequences as Crepidomanes intricatum from the eastern United States and forms a monophyletic group with it. It also shares the same cpDNA data with Crepidomanes schmidtianum, recently reported from Korea. The second independent gametophyte should be Hymenophyllum wrightii based on cpDNA data. The third was presumed to be Pleurosoriopsis makinoi based on molecular data. A revision of Hymenophyllum wrightii f. serratum based on molecular data confirmed its taxonomic status as a forma of Hymenophyllum wrightii.

Development and Application of Pre/Post-processor to EMTP for Sequence Impedance Analysis of Underground Transmission Cables (지중 송전선로 대칭분 임피던스 해석을 위한 EMTP 전후처리기 개발과 활용)

  • Choi, Jong-Kee;Jang, Byung-Tae;An, Yong-Ho;Choi, Sang-Kyu;Lee, Myoung-Hee
    • The Transactions of The Korean Institute of Electrical Engineers / v.63 no.10 / pp.1364-1370 / 2014
  • Power system fault analysis is based on the symmetrical component method, which describes power system elements by their positive-, negative-, and zero-sequence impedances. Obtaining line impedances that are as accurate as possible is very important for estimating fault current magnitude and setting distance relays accurately. Accurate calculation of the zero-sequence impedance is especially important because most transmission line faults are line-to-ground faults rather than balanced three-phase faults. Since KEPCO began measuring transmission line impedance in 2005, the measured and calculated line impedances have been found to agree within reasonable accuracy. For underground transmission lines, however, large discrepancies in zero-sequence impedance have occasionally been observed. Because zero-sequence impedance is important input data for a distance relay to locate the fault point correctly, it is urgent to analyze and detect the source of these discrepancies and to consider countermeasures. This paper describes the development of a pre/post-processor for the ATP (Alternative Transients Program) version of EMTP (Electromagnetic Transients Program) for sequence impedance calculation. With the developed processor, ATP-cable, the effects of ground resistance and the ECC (Earth Continuity Conductor) on sequence impedance were analyzed.
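
For readers unfamiliar with the symmetrical component method mentioned above, the following sketch shows how sequence impedances can be obtained from a 3x3 phase impedance matrix via the standard transformation $Z_{012} = A^{-1} Z_{abc} A$. The function name is hypothetical and the sketch is generic textbook material; it is not the ATP-cable processor and does not model cable sheaths, ground resistance, or the ECC.

```python
import cmath


def sequence_impedances(z_abc):
    """Return (Z0, Z1, Z2) from a 3x3 phase impedance matrix (list of lists).

    Uses the symmetrical component transformation Z012 = A^-1 * Zabc * A,
    where a = exp(j*2*pi/3). Only the diagonal terms are returned, which is
    adequate for a balanced (fully transposed) line.
    """
    a = cmath.exp(2j * cmath.pi / 3)
    A = [[1, 1, 1],
         [1, a * a, a],
         [1, a, a * a]]
    A_inv = [[1 / 3, 1 / 3, 1 / 3],
             [1 / 3, a / 3, a * a / 3],
             [1 / 3, a * a / 3, a / 3]]

    def matmul(x, y):
        # Plain 3x3 complex matrix product.
        return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)]
                for i in range(3)]

    z012 = matmul(matmul(A_inv, z_abc), A)
    return z012[0][0], z012[1][1], z012[2][2]
```

For a balanced line with self impedance $Z_s$ on the diagonal and mutual impedance $Z_m$ elsewhere, this yields $Z_0 = Z_s + 2Z_m$ and $Z_1 = Z_2 = Z_s - Z_m$, which shows why the zero-sequence value is the one most sensitive to the ground return path.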

Analysis of shopping website visit types and shopping pattern (쇼핑 웹사이트 탐색 유형과 방문 패턴 분석)

  • Choi, Kyungbin;Nam, Kihwan
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.85-107 / 2019
  • Online consumers either browse products belonging to a particular product line or brand with the intent to purchase, or simply navigate widely and leave without making a purchase. Research on the behavior and purchases of online consumers has progressed steadily, and related services and applications based on consumer behavior data have been developed in practice. In recent years, with the development of big data technology, customization strategies and recommendation systems have been used in attempts to optimize users' shopping experience. Even so, it is still very unlikely that an online consumer who visits a website will actually move on to the purchase stage. This is because online consumers do not visit a website only to purchase products; they use and browse websites differently according to their shopping motives and purposes. Therefore, to understand the behavior of online consumers, it is important to analyze the various types of visits, not only visits that lead to a purchase. In this study, we performed a session-level clustering analysis on the clickstream data of an e-commerce company to explain the diversity and complexity of online consumers' search behavior and to classify that behavior into types. For the analysis, we converted more than 8 million page-level data points into visit-level sessions, yielding more than 500,000 website visit sessions. For each visit session, 12 characteristics such as page views, duration, search diversity, and page type concentration were extracted for the clustering analysis. Considering the size of the data set, we used the Mini-Batch K-means algorithm, which has advantages in learning speed and efficiency while maintaining clustering performance similar to that of the standard K-means algorithm. 
The optimal number of clusters was found to be four, and differences in session-level characteristics and purchase rates were identified for each cluster. An online consumer visits a website several times, learns about a product, and then decides whether to purchase. To analyze this purchasing process across several visits, we constructed visit sequence data for each consumer based on the navigation patterns derived from the clustering analysis. Each visit sequence comprises the series of visits leading up to one purchase, and the items constituting a sequence are the cluster labels derived above. We built separate sequence data sets for consumers who made purchases and for consumers who only explored products without purchasing during the same period. Sequential pattern mining was then applied to extract frequent patterns from each data set. The minimum support was set to 10%, and each frequent pattern consists of a sequence of cluster labels. While some patterns were derived from both data sets, other frequent patterns were derived from only one of them. A comparative analysis of the extracted frequent patterns showed that consumers who made purchases had a visiting pattern of repeatedly searching for a specific product before deciding to purchase it. The implication of this study is that it analyzes the search types of online consumers using large-scale clickstream data and analyzes their patterns to explain the purchasing process from a data-driven perspective. Most studies on the typology of online consumers have focused on the characteristics of each type and on the key factors that distinguish the types. 
In this study, we classified the behavior of online consumers into types and further analyzed the order in which these types combine to form a series of search patterns. In addition, online retailers will be able to try to improve their purchase conversion through marketing strategies and recommendations tailored to the various types of visit, and to evaluate the effect of those strategies through changes in consumers' visit patterns.
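
The sequential pattern mining step described above (frequent sequences of cluster labels with a minimum support of 10%) can be sketched as follows. This is a simple level-wise search written for illustration; the function names and the `max_len` cap are assumptions, and the abstract does not say which mining algorithm the authors used.

```python
def is_subsequence(pattern, sequence):
    """True if pattern occurs in sequence in order (not necessarily contiguously)."""
    it = iter(sequence)
    return all(label in it for label in pattern)


def frequent_patterns(sequences, min_support=0.10, max_len=3):
    """Return {pattern: support} for label patterns with support >= min_support.

    sequences : list of visit sequences, each a tuple of cluster labels
    support   : fraction of sequences that contain the pattern as a subsequence
    Patterns are grown one label at a time from frequent prefixes only, which is
    safe because a pattern cannot be more frequent than its own prefix.
    """
    labels = sorted({label for seq in sequences for label in seq})
    result = {}
    frontier = [()]
    for _ in range(max_len):
        next_frontier = []
        for prefix in frontier:
            for label in labels:
                candidate = prefix + (label,)
                hits = sum(is_subsequence(candidate, seq) for seq in sequences)
                support = hits / len(sequences)
                if support >= min_support:
                    result[candidate] = support
                    next_frontier.append(candidate)
        frontier = next_frontier
    return result
```

Running the same function separately on the purchaser and non-purchaser sequence sets and comparing the resulting dictionaries mirrors the comparative analysis the abstract describes.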

Seismic Stratigraphy and Sedimentary Environment of the Dukjuk-Do Sand Ridge in Western Gyeonggi Bay, Korea (경기만 서부 덕적도 사퇴의 탄성파층서 및 퇴적환경 연구)

  • Lee, Yoon-Oh;Choi, Sang-Il;Jeong, Gyo-Cheol
    • The Journal of Engineering Geology / v.24 no.1 / pp.9-21 / 2014
  • We examined high-resolution seismic data, side scan sonar data, surface sediments, and vibrocore samples from a sand ridge off the western part of Dukjuk-Do in Gyeonggi Bay to interpret its seismic stratigraphy and sedimentary environment. Based on the seismic data, the deposited sands are divided into three sedimentary units (sequences I to III). $^{14}C$ age data indicate that the top sequence (sequence I) formed at 5000-6000 yr BP, when a transgression resulted in strong shifting tides. Analyses of the vibrocore samples indicate that sequence II is a paleo-mudflat layer of intertidal sediments dominated by mud. Sequence III consists of terrestrial sediments that are presumed to have been deposited at the end of the Pleistocene, unconformably overlying the acoustic bedrock and Mesozoic granite. The side scan sonar data indicate that sand waves formed on the seabed on top of the sand ridge. Their general direction is $N20^{\circ}E$, which coincides with the direction of tidal flow. Sand ripples occur away from the top of the sand ridge and are distributed homogeneously across a sandy slope. Vibrocore analyses indicate that the surface sediments and core sediments (samples VC-1, -2, and -3) are homogeneous, without any internal structures, and are characterized by a mixture of medium and fine sand ($1$-$2{\phi}$).

Genome wide association study on feed conversion ratio using imputed sequence data in chickens

  • Wang, Jiaying;Yuan, Xiaolong;Ye, Shaopan;Huang, Shuwen;He, Yingting;Zhang, Hao;Li, Jiaqi;Zhang, Xiquan;Zhang, Zhe
    • Asian-Australasian Journal of Animal Sciences / v.32 no.4 / pp.494-500 / 2019
  • Objective: Feed consumption accounts for a large percentage of total production costs in the poultry industry. Detecting genes associated with feeding traits will improve our understanding of the molecular determinants of feed efficiency. The objective of this study was to identify candidate genes associated with feed conversion ratio (FCR) via a genome-wide association study (GWAS) using sequence data imputed from a single nucleotide polymorphism (SNP) panel in a Chinese indigenous chicken population. Methods: A total of 435 Chinese indigenous chickens were phenotyped for FCR and genotyped using a 600K SNP genotyping array. Twenty-four birds were selected for sequencing, and the 600K SNP panel data were imputed to whole sequence data with the 24 birds as the reference. The GWAS was performed with GEMMA software. Results: After quality control, 8,626,020 SNPs were used for the sequence-based GWAS, in which ten significant genomic regions were detected to be associated with FCR. Ten candidate genes were annotated: ubiquitin specific peptidase 44, leukotriene A4 hydrolase, ETS transcription factor, R-spondin 2, inhibitor of apoptosis protein 3, sosondowah ankyrin repeat domain family member D, calmodulin regulated spectrin associated protein family member 2, zinc finger and BTB domain containing 41, potassium sodium-activated channel subfamily T member 2, and member of RAS oncogene family. Several of them were within or near reported FCR quantitative trait loci, and others were newly reported. Conclusion: The results of this study provide valuable prior information for chicken genomic breeding programs and may improve our understanding of the molecular mechanism of feeding traits.

Estimation Method of Cable Fault Location in Rocket Motors Using M-sequence Signals (M시퀀스 신호를 이용한 로켓 추진기관 케이블 결함 위치 추정 기법)

  • Son, Ji-Hong
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.5 / pp.84-92 / 2020
  • This paper describes a method for estimating the location of cable faults in rocket motors using an M-sequence (Maximal Length Sequence). Three methods are usually used to estimate the location of a cable fault: TDR (Time Domain Reflectometry), FDR (Frequency Domain Reflectometry), and TFDR (Time-Frequency Domain Reflectometry). However, these methods have the disadvantage of requiring users to be close to the test field, which is dangerous. An estimation method using an M-sequence is proposed to solve this problem. The proposed method can make use of a DAS (Data Acquisition System). Experiments were conducted for three fault cases (damaged, open, and short) using an RG-58 coaxial cable. The results show that the proposed method performs better than conventional methods such as TDR and TFDR.
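
As a rough illustration of why an M-sequence suits this task, the sketch below generates a 15-chip maximal length sequence with a 4-bit linear feedback shift register and locates a delayed echo by circular cross-correlation. The register length, tap positions, function names, and the distance conversion (including the 0.66 velocity factor often quoted for RG-58) are assumptions for illustration and are not taken from the paper.

```python
def m_sequence(taps=(4, 3), nbits=4):
    """One period (2**nbits - 1 chips) of an M-sequence from a Fibonacci LFSR.

    taps are 1-based register positions XORed to form the feedback bit;
    (4, 3) corresponds to the primitive polynomial x^4 + x^3 + 1.
    """
    state = [1] + [0] * (nbits - 1)
    chips = []
    for _ in range(2 ** nbits - 1):
        chips.append(state[-1])            # output the last register stage
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]    # shift right, insert feedback
    return chips


def estimate_delay(reference_bits, received):
    """Lag (in samples) that maximizes circular cross-correlation with the reference."""
    n = len(reference_bits)
    ref = [2 * b - 1 for b in reference_bits]   # map {0, 1} to {-1, +1}
    best_lag, best_val = 0, float("-inf")
    for lag in range(n):
        val = sum(ref[i] * received[(i + lag) % n] for i in range(n))
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag


def fault_distance_m(lag, sample_rate_hz, velocity_factor=0.66):
    """Convert a round-trip delay in samples to a one-way distance in meters."""
    c = 299_792_458.0                            # speed of light in vacuum, m/s
    return (lag / sample_rate_hz) * velocity_factor * c / 2
```

Because the circular autocorrelation of an M-sequence has a single sharp peak, the reflected copy of the signal stands out clearly at the lag corresponding to the fault, even when the echo is attenuated.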

Molecular Characterization of Epoxide Hydrolase from Aspergillus niger LK using Phylogenetic Analysis (진화적 유연관계 분석을 통한 Aspergillus niger LK의 Epoxide Hydrolase의 특성분석)

  • 김희숙;이은열;이수정;이지원
    • KSBB Journal / v.19 no.1 / pp.42-49 / 2004
  • A gene coding for epoxide hydrolase (EH) of Aspergillus niger LK, a fungus possessing enantioselective hydrolysis activity for racemic epoxides, was characterized by phylogenetic analysis. The deduced protein of A. niger LK epoxide hydrolase shares significant sequence similarity with several bacterial EHs and mammalian microsomal EHs (mEH) and belongs to the $\alpha/\beta$ hydrolase fold family. EH from A. niger LK had 90.6% identity with the 3D crystal structure 1qo7 in the Protein Data Bank. Sequence comparison with EHs from other sources suggested that Asp$^{192}$, Asp$^{374}$ and His$^{374}$ constituted the catalytic triad. Based on a multiple sequence comparison of the functional and structural domain sequences, the phylogenetic tree of relevant epoxide hydrolases from various species was reconstructed using the Neighbor-Joining method. Genetic distances were as large as 1.841-2.682, but the characteristic oxyanion hole and catalytic triad were highly conserved, which suggests that these enzymes have diverged from a common ancestor.

A retroviral insertion in the tyrosinase (TYR) gene is associated with the recessive white plumage color in the Yeonsan Ogye chicken

  • Cho, Eunjin;Kim, Minjun;Manjula, Prabuddha;Cho, Sung Hyun;Seo, Dongwon;Lee, Seung-Sook;Lee, Jun Heon
    • Journal of Animal Science and Technology / v.63 no.4 / pp.751-758 / 2021
  • The recessive white (locus c) phenotype observed in chickens is associated with three alleles (recessive white c, albino ca, and red-eyed white cre) and causative mutations in the tyrosinase (TYR) gene. The recessive white mutation (c) inhibits the transcription of TYR exon 5 due to a retroviral sequence insertion in intron 4. In this study, we genotyped and sequenced the insertion in TYR intron 4 to identify the mutation causing the unusual white plumage of Yeonsan Ogye chickens, which normally have black plumage. The white chickens had a homozygous recessive white genotype that matched the sequence of the recessive white type, and the inserted sequence exhibited 98% identity with the avian leukosis virus ev-1 sequence. In comparison, brindle and normal chickens had the homozygous color genotype, and their sequences were the same as the wild-type sequence, indicating that the brindle phenotype is derived from other mutation(s). In conclusion, the white chickens carry the recessive white mutation allele. Because the sample size in this study was limited, further research with additional samples is necessary to validate these results. After such validation, and if additional studies elucidate specific phenotype-related genes in Yeonsan Ogye, a selection system for conserving the phenotypic characteristics and genetic diversity of the population could be established.

Design of Traffic Sequence for the Maritime Data Communications in HF band (HF대 해상 데이터통신을 위한 통신시퀜스 설계)

  • Go, Yun-Gyu;Lee, Yeung-Su;Choi, Jo-Cheon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2008.05a / pp.81-85 / 2008
  • INMARSAT enables long-range maritime communications, but small ships cannot use it because of its high cost. In addition, given the view that NBDP is no longer useful, replacement methods for effective data communications in the HF band are under international discussion. A feature of HF-band communication is ionospheric propagation, which removes the distance limitation across the A2, A3 and A4 sea areas. All navigating ships should therefore be provided with services such as MSI, VMS and e-mail in addition to distress and public communications, which calls for the design of a communication sequence for SSB transceivers. This paper designs a new packet and a communication sequence that establish a reliable, automatic radio link for maritime data communications by SSB in the HF band.
