• Title/Summary/Keyword: Sequence information

Cloning and Characterization of a Novel Laccase Gene, fvlac7, Based on the Genomic Sequence of Flammulina velutipes

  • Kim, Jong-Kun;Lim, Seon-Hwa;Kang, Hee-Wan
    • Mycobiology / v.41 no.1 / pp.37-41 / 2013
  • Laccases (EC 1.10.3.2) are copper-containing polyphenol oxidases found in white-rot fungi. Here, we report the cloning and analysis of the nucleotide sequence of a new laccase gene, fvlac7, based on the genomic sequence of Flammulina velutipes. A primer set was designed from the putative mRNA that was aligned to the genomic DNA of F. velutipes. A cDNA fragment approximately 1.6 kb long was then amplified by reverse transcriptase-PCR using total RNA, and was subsequently cloned and sequenced. The cDNA sequence of fvlac7 was compared to the genomic DNA, and 16 introns were found in the genomic DNA sequence. The Fvlac7 protein, which consists of 538 amino acids, showed only 42-51% identity with laccases from 12 different mushroom species, including two laccases of F. velutipes, suggesting that fvlac7 is a novel laccase gene. The first 25 amino acids of Fvlac7 correspond to a predicted signal sequence, and the protein also contains four copper-binding sites and four N-glycosylation sites. Fvlac7 cDNA was heterologously overexpressed in an Escherichia coli system, yielding a protein of the expected molecular weight of approximately 60 kDa.

Korean phrase structure parsing using sequence-to-sequence learning (Sequence-to-sequence 모델을 이용한 한국어 구구조 구문 분석)

  • Hwang, Hyunsun;Lee, Changki
    • 한국어정보학회:학술대회논문집 / 2016.10a / pp.20-24 / 2016
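  • A sequence-to-sequence model converts an input sequence into an output sequence of a different length, and is an end-to-end model that uses only a single neural network architecture. In this paper, we apply a sequence-to-sequence model to Korean phrase structure parsing. To this end, we represent each phrase structure parse tree as an output sequence of brackets, syntactic tags, and eojeols (space-delimited word units), and replace the eojeols with the single symbol 'XX' to reduce the size of the output vocabulary. We also apply the attention mechanism and input-feeding, which have recently been studied as ways to improve machine translation performance. In experiments on the phrase structure parsing data of the Sejong corpus, the model achieved an F1 of 89.03%, higher than that of previous studies.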
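The tree linearization described in this abstract can be illustrated with a small sketch. It assumes a tree is given as nested tuples of the form `(tag, child, ...)` with eojeols as string leaves; the function name `linearize`, the toy tree, and its tags are hypothetical and are not taken from the paper.

```python
from typing import List, Union

# A leaf is an eojeol string; an internal node is a tuple (tag, child, ...).
Tree = Union[str, tuple]


def linearize(tree: Tree, placeholder: str = "XX") -> List[str]:
    """Flatten a phrase structure tree into bracket, tag, and placeholder tokens."""
    if isinstance(tree, str):
        # Every eojeol maps to the same symbol, so the output vocabulary
        # holds only brackets, syntactic tags, and the placeholder.
        return [placeholder]
    tag, *children = tree
    tokens = ["(" + tag]
    for child in children:
        tokens.extend(linearize(child, placeholder))
    tokens.append(")")
    return tokens


# Hypothetical toy tree; tags and eojeols are illustrative only.
toy = ("S", ("NP", "나는"), ("VP", ("NP", "책을"), ("VP", "읽었다")))
print(" ".join(linearize(toy)))  # (S (NP XX ) (VP (NP XX ) (VP XX ) ) )
```

Because the i-th `XX` corresponds to the i-th eojeol of the input sentence, the original words can be restored by position after decoding.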

Korean morphological analysis and phrase structure parsing using multi-task sequence-to-sequence learning (Multi-task sequence-to-sequence learning을 이용한 한국어 형태소 분석과 구구조 구문 분석)

  • Hwang, Hyunsun;Lee, Changki
    • 한국어정보학회:학술대회논문집 / 2017.10a / pp.103-107 / 2017
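  • Korean morphological analysis and phrase structure parsing are difficult tasks in Korean natural language processing, and end-to-end approaches that recast them as output sequence generation problems solved with sequence-to-sequence models have recently been studied. When morphological analysis and phrase structure parsing are recast as output sequence generation, their outputs can be merged into a single sequence. In this paper, we propose a model that performs Korean morphological analysis and phrase structure parsing simultaneously using a sequence-to-sequence model. Experiments confirmed that, when the two tasks are processed simultaneously, morphological analysis affects phrase structure parsing and phrase structure parsing in turn affects morphological analysis, so the two tasks can influence each other.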

Interpretation of Noun Sequence using Semantic Information Extracted from Machine Readable Dictionary and Corpus (기계가독형사전과 코퍼스에서 추출한 의미정보를 이용한 명사열의 의미해석)

  • 이경순;김도완;김길창;최기선
    • Korean Journal of Cognitive Science / v.12 no.1_2 / pp.11-24 / 2001
  • Interpreting a noun sequence means finding the semantic relations between the nouns in the sequence. This requires semantic knowledge about words and the relations between them. In this paper, we propose a method for interpreting the semantic relations between nouns in a noun sequence. We extract semantic information from a machine-readable dictionary (MRD) and a corpus using regular expressions, and interpret the semantic relations of a noun sequence based on the extracted information. We also use verb subcategorization information together with the semantic information from the MRD and corpus. Previous studies used semantic knowledge extracted only from an MRD, whereas our method uses an MRD, a corpus, and subcategorization information to interpret noun sequences. Experimental results show that our method improves accuracy by 40.30% and coverage by 12.73% over previous studies.

DNA Sequence Visualization with k-convex Hull (k-convex hull을 이용한 DNA 염기 배열의 가시화)

  • Kim, Min Ah;Lee, Eun Jeong;Cho, Hwan Gyu
    • Journal of the Korea Computer Graphics Society / v.2 no.2 / pp.61-68 / 1996
  • In this paper, we propose a new visualization technique for characterizing qualitative information in a large DNA sequence. Although a long DNA sequence carries a huge amount of information, it is not easy to obtain genetic information from it. We transform DNA sequences into polygons so that their homology can be computed in the image domain rather than the text domain. Our program visualizes DNA sequences as colored random walk plots and simplifies them with k-convex hulls. A random walk plot represents a DNA sequence as a curve in a plane. A k-convex hull simplifies a random walk plot by removing some of its insignificant information. This technique gives biologists insight for detecting and classifying DNA sequences with ease. Experiments with real genome data show that our approach produces good visual forms of long DNA sequences for homology analysis.
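A random walk plot of the kind described here can be sketched as follows. The base-to-direction mapping in `STEPS` is an assumption, since the abstract does not say which mapping the authors use. The abstract also does not define the k-convex hull, so the sketch substitutes an ordinary convex hull (Andrew's monotone chain) as a much coarser stand-in for the simplification step.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, int]

# Assumed mapping from bases to unit steps; not specified in the abstract.
STEPS: Dict[str, Point] = {"A": (-1, 0), "T": (1, 0), "G": (0, 1), "C": (0, -1)}


def random_walk(seq: str) -> List[Point]:
    """Turn a DNA string into the list of points visited by a 2D walk."""
    x, y = 0, 0
    points = [(x, y)]
    for base in seq.upper():
        dx, dy = STEPS.get(base, (0, 0))  # unknown symbols (e.g. N) do not move
        x, y = x + dx, y + dy
        points.append((x, y))
    return points


def convex_hull(points: List[Point]) -> List[Point]:
    """Ordinary convex hull (monotone chain), counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o: Point, a: Point, b: Point) -> int:
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


walk = random_walk("GGTT")
print(walk)
print(convex_hull(walk))
```

Two sequences can then be compared through the shapes of their walks or hulls instead of through character-by-character text comparison.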

Improvement of Performance of Malware Similarity Analysis by the Sequence Alignment Technique (서열 정렬 기법을 이용한 악성코드 유사도 분석의 성능 개선)

  • Cho, In Kyeom;Im, Eul Gyu
    • KIISE Transactions on Computing Practices / v.21 no.3 / pp.263-268 / 2015
  • Malware variants can be defined as malicious executable files that have similar functions but different structures. To classify such variants, this paper analyzes sequence alignment, a method used in bioinformatics. The method finds the common parts of the API call information of malware samples. Its performance depends on the length of the API call information: if the sequence is too long, performance becomes very poor. Therefore, before applying the method, we removed repeated patterns from the API call information to improve the performance of sequence alignment analysis. Finally, the similarity between malware samples was analyzed using sequence alignment. Experimental results with real malware samples are presented.
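The two steps in this abstract, shortening the API call trace and then aligning traces, can be sketched under simplifying assumptions. Here "repeated patterns" is reduced to immediately repeated calls, and the alignment is a plain longest common subsequence rather than whatever scoring scheme the paper uses. The function names and the call traces are hypothetical.

```python
from typing import List, Sequence


def collapse_repeats(calls: Sequence[str]) -> List[str]:
    """Drop immediately repeated calls, e.g. A A A B -> A B."""
    out: List[str] = []
    for call in calls:
        if not out or out[-1] != call:
            out.append(call)
    return out


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence (row-by-row dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Score in [0, 1]: shared calls relative to the two collapsed trace lengths."""
    a, b = collapse_repeats(a), collapse_repeats(b)
    if not a and not b:
        return 1.0
    return 2 * lcs_length(a, b) / (len(a) + len(b))


# Made-up traces for illustration only.
trace_a = ["CreateFile", "WriteFile", "WriteFile", "WriteFile", "CloseHandle"]
trace_b = ["CreateFile", "WriteFile", "RegSetValue", "CloseHandle"]
print(similarity(trace_a, trace_b))
```

The dynamic programming table costs time proportional to the product of the two trace lengths, which is why shortening the traces first matters.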

Study on the Generation of Inaudible Binary Random Number Using Canonical Signed Digit Coding (표준 부호 디지트 코딩을 이용한 비가청 이진 랜덤 신호 발생에 관한 연구)

  • Nam, MyungWoo;Lee, Young-Seok
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.8 no.4 / pp.263-269 / 2015
  • Digital watermarking embeds imperceptible and statistically undetectable information into digital data. Most digital audio watermarking schemes use binary random sequences as the embedded information. The embedded binary random sequence distorts and modifies the original data, while playing a vital role in security. In this paper, we propose a binary random sequence that improves imperceptibility in the perceptual region of the human auditory system. The basic idea is to modify a binary random sequence according to a frequency analysis of adjacent binary digits that have different signs in the sequence. The canonical signed digit code (CSDC) is also applied to modify a general binary random sequence, together with a pair-matching function between the original sequence and its modified version. In our experiment, the frequency characteristics of the proposed binary random sequence were evaluated and analyzed using the Bark scale representation of frequency and frequency gains.
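For readers unfamiliar with canonical signed digit coding, the sketch below converts a non-negative integer to its canonical signed digit form, using digits -1, 0, and +1 with no two adjacent nonzero digits. This is the generic textbook conversion only; how the paper applies it to a binary random sequence is not stated in the abstract, and the function names are hypothetical.

```python
from typing import List


def to_csd(n: int) -> List[int]:
    """Return the canonical signed digit form of n, most significant digit first."""
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    digits: List[int] = []  # collected least significant digit first
    while n > 0:
        if n % 2:
            d = 2 - (n % 4)  # +1 when n = 1 (mod 4), -1 when n = 3 (mod 4)
            n -= d           # n is now divisible by 4, so the next digit is 0
        else:
            d = 0
        digits.append(d)
        n //= 2
    return digits[::-1] or [0]


def from_csd(digits: List[int]) -> int:
    """Evaluate a signed digit list (most significant first) back to an integer."""
    value = 0
    for d in digits:
        value = 2 * value + d
    return value


print(to_csd(7))  # [1, 0, 0, -1], i.e. 8 - 1
```

The example shows the typical effect: the run of ones in binary 111 is replaced by a form with fewer nonzero digits.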

Non-linear Extended Binary Sequence with Low Cross-Correlation (낮은 상호 상관관계를 갖는 비선형 확장 이진 수열)

  • Choi, Un-Sook;Cho, Sung-Jin;Kwon, Sook-Hee
    • Journal of the Korea Institute of Information and Communication Engineering / v.16 no.4 / pp.730-736 / 2012
  • PN (pseudo-noise) sequences play an important role in wireless communications, such as in CDMA (code division multiple access) communication systems. When multiple users are connected to a system simultaneously and their signals collide, PN sequences with low correlation help to minimize multiple access interference. In this paper, we propose a family of non-linear extended binary sequences with low cross-correlation. The family includes the $m$-sequence, GMW sequence, Kasami sequence, and No sequence with optimal cross-correlation in terms of the Welch bound. We also analyze the cross-correlation of these sequences.
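As background for the correlation properties mentioned here, the following sketch generates a short $m$-sequence with a linear feedback shift register and computes periodic correlation. The degree-3 feedback (recurrence s[n+3] = s[n] XOR s[n+1]) is chosen purely for illustration, and the function names are hypothetical. The sketch does not construct the non-linear extended family proposed in the paper.

```python
from typing import List, Sequence


def lfsr_sequence(taps: Sequence[int], state: Sequence[int], length: int) -> List[int]:
    """Fibonacci LFSR: output state[0], feed back the XOR of the tapped positions."""
    reg = list(state)
    out: List[int] = []
    for _ in range(length):
        out.append(reg[0])
        feedback = 0
        for t in taps:
            feedback ^= reg[t]
        reg = reg[1:] + [feedback]
    return out


def periodic_correlation(a: Sequence[int], b: Sequence[int], shift: int) -> int:
    """Periodic correlation of two equal-length binary sequences, bits mapped to +1/-1."""
    n = len(a)
    return sum((1 - 2 * a[i]) * (1 - 2 * b[(i + shift) % n]) for i in range(n))


# One full period (2**3 - 1 = 7 chips) of the illustrative m-sequence.
m_seq = lfsr_sequence(taps=(0, 1), state=(1, 0, 0), length=7)
print(m_seq)
print([periodic_correlation(m_seq, m_seq, s) for s in range(7)])
```

The autocorrelation takes only two values, the sequence length at zero shift and -1 at every other shift. Passing two different sequences to `periodic_correlation` gives the cross-correlation that a family like the one in this abstract is designed to keep small.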

The preverified test sequence generation method satisfying the completeness criteria (완전표준성을 만족하는 선행검증 시험열 생성방법에 관한 연구)

  • 박진호;양대헌;송주석;임상용
    • The Journal of Korean Institute of Communications and Information Sciences / v.23 no.9A / pp.2383-2390 / 1998
  • As networks have recently come to provide diverse functionalities, many protocol standards have become complex and many implementations have appeared. These trends require us to test the conformance of implementations, which is called conformance testing. Much research has been performed on generating test sequences and on fault masking based on the T, U, D, and W methods. At this point, a new problem has been raised, called the completeness criteria. Test sequences for conformance testing face this problem as well as fault masking. In this paper, we suggest a method of generating a preverified test sequence that avoids the completeness criteria problem. The preverified test sequence is much more reliable than others because it uses preverified edges. For the reliability of conformance testing, we define the immunity of a test sequence and use it to provide clues for analyzing test results. This analysis makes it possible to test the implementation again with more reliability. The preverified test sequence is also flexible, so it can be combined with a fault-tolerant sequence for fault masking.

Analysis of Kronecker Sequence with Partial-Period Correlation in a Multiple-dwell Serial Search System (복수적분 시구간 직렬탐색 시스템에서 부분 상관기를 이용한 Kronecker 부호의 특성 분석)

  • 임연주;박상규
    • The Journal of Korean Institute of Communications and Information Sciences / v.25 no.8B / pp.1333-1340 / 2000
  • This paper shows that the Kronecker sequence, a rapid-acquisition spreading code, can be used in packet wireless data communication systems. The general properties of the Kronecker sequence, such as its construction and correlation characteristics, are described, and it is shown that the Kronecker sequence can use partial-period correlation for faster acquisition. Based on these properties, the Kronecker sequence is expected to be usable in packet communication systems, because its probability of false alarm is lower and flatter (that is, less sensitive to Ec/No variations) than that of the PN sequence, assuming that both sequences have the same acquisition time.
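The construction and partial-period correlation mentioned in this abstract can be sketched generically. The sketch assumes a Kronecker sequence is the Kronecker product of two shorter +1/-1 sequences, and that partial-period correlation means correlating over a window shorter than the full period. The component sequences below are arbitrary examples rather than those analyzed in the paper, and the function names are hypothetical.

```python
from typing import List, Sequence


def kronecker(outer: Sequence[int], inner: Sequence[int]) -> List[int]:
    """Kronecker product: one copy of `inner` per outer chip, scaled by that chip."""
    return [o * i for o in outer for i in inner]


def partial_correlation(received: Sequence[int], local: Sequence[int],
                        offset: int, window: int) -> int:
    """Correlate the first `window` chips of `local` against `received` at `offset`."""
    n = len(received)
    return sum(received[(offset + k) % n] * local[k] for k in range(window))


# Arbitrary illustrative components, chips in {+1, -1}.
seq = kronecker([1, -1], [1, 1, -1])
print(seq)  # [1, 1, -1, -1, -1, 1]
print(partial_correlation(seq, seq, offset=0, window=3))  # 3, aligned
print(partial_correlation(seq, seq, offset=3, window=3))  # -3, second outer chip
```

Because every block of the sequence is a signed copy of the inner sequence, a correlator spanning only one inner period already yields a full-magnitude peak at block boundaries, which gives an intuition for why a partial-period correlator can speed up acquisition.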
