• Title/Summary/Keyword: Vector Tag

Automatic Word Spacing of the Korean Sentences by Using End-to-End Deep Neural Network (종단 간 심층 신경망을 이용한 한국어 문장 자동 띄어쓰기)

  • Lee, Hyun Young;Kang, Seung Shik
    • KIPS Transactions on Software and Data Engineering / v.8 no.11 / pp.441-448 / 2019
  • Previous research on automatic spacing of Korean sentences has corrected spacing errors by using n-gram based statistical techniques or a morphological analyzer to insert blanks at word boundaries. In this paper, we propose end-to-end automatic word spacing using a deep neural network. Automatic word spacing can be defined as a tag classification problem at the syllable level rather than the word level. For contextual representation between syllables, a Bi-LSTM encodes the dependency relationships between syllables into a fixed-length vector in a continuous vector space using forward and backward LSTM cells. To perform automatic word spacing of Korean sentences, the fixed-length contextual vector produced by the Bi-LSTM is classified into an auto-spacing tag (B or I), and a blank is inserted in front of each B tag. For tag classification, we build three types of classification neural networks: a feedforward neural network, a neural network language model, and a linear-chain CRF. To compare our models, we measure automatic word spacing performance for each of the three classification networks. The linear-chain CRF performs better than the other models. We used the KCC150 corpus as training and test data.
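
The B/I tagging scheme described in this abstract can be illustrated with a minimal sketch. It covers only label construction and decoding (a blank goes in front of each B tag), not the Bi-LSTM or CRF models. The function names `tags_from_spaced` and `apply_spacing_tags` are hypothetical, and the sketch assumes each syllable is a single character, which holds for precomposed Hangul.

```python
def tags_from_spaced(sentence):
    """Derive per-syllable B/I labels from a correctly spaced sentence.

    B marks the first syllable of a word, I marks every other syllable.
    Assumption: one character equals one syllable.
    """
    syllables, tags = [], []
    for word in sentence.split():
        for position, syllable in enumerate(word):
            syllables.append(syllable)
            tags.append("B" if position == 0 else "I")
    return syllables, tags


def apply_spacing_tags(syllables, tags):
    """Rebuild a spaced sentence by inserting a blank in front of each B tag."""
    if len(syllables) != len(tags):
        raise ValueError("syllables and tags must have the same length")
    pieces = []
    for index, (syllable, tag) in enumerate(zip(syllables, tags)):
        # No blank before the very first syllable, even though it is tagged B.
        if tag == "B" and index > 0:
            pieces.append(" ")
        pieces.append(syllable)
    return "".join(pieces)


# Usage: round-trip a spaced sentence through its tag sequence.
syllables, tags = tags_from_spaced("나는 학교에 간다")
restored = apply_spacing_tags(syllables, tags)
```

In a full system a classifier would predict `tags` from the unspaced syllable sequence; here the tags come from a reference sentence only to show the encoding.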

Construction of Shuttle Promoter-probe and Expression Vectors for Escherichia coli and Bacillus subtilis, and Expression of B. thuringiensis subsp. kurstaki HD-73 Crystal Protein Gene in the Two Species

  • Park, Seung-Hwan;Koo, Bon-Tag;Shin, Byung-Sik;Kim, Jeong-Il
    • Journal of Microbiology and Biotechnology / v.1 no.1 / pp.37-44 / 1991
  • A shuttle promoter-probe vector, pEB203, was derived from pBR322, pPL703 and pUB110. Using this vector, a 319 bp EcoRI fragment with strong promoter activity was cloned from Bacillus subtilis chromosomal DNA. Selection was based on chloramphenicol resistance, which depends on the introduction of DNA fragments that allow expression of a chloramphenicol acetyltransferase gene. The nucleotide sequence of the 319 bp fragment was determined, and the putative -35 and -10 regions, ribosome binding site, and ATG initiation codon were identified. This promoter was named the EB promoter, and the resulting plasmid, which can be used as an expression vector, was named pEBP313. The crystal protein (cp) gene from B. thuringiensis subsp. kurstaki HD-73, without its own promoter, was cloned downstream of the EB promoter. When the resulting plasmid, pBT313, was introduced into Escherichia coli and B. subtilis, efficient synthesis of crystal protein was observed in both hosts, and cp gene expression in B. subtilis began early in the vegetative phase. Cell extracts from both clones were toxic to Hyphantria cunea larvae.

Intein-mediated expression of Trichoderma reesei Cellobiohydrolase I Cellulose Binding Domain in E. coli (Intein을 이용한 대장균에서의 Trichoderma reesei 유래의 Cellobiohydrolase I 섬유소 결합 도메인의 발현)

  • Choi, Shin-Geon
    • Journal of Industrial Technology / v.36 / pp.33-37 / 2016
  • Cellulose binding domains (CBDs) of cellulases are thought to assist in the hydrolysis of insoluble crystalline cellulose. To obtain sufficient amounts of CBDs, a self-cleavable intein tag was used for expression and purification of the Trichoderma reesei cellobiohydrolase I CBD in E. coli. Synthetic CBD genes, encoding CBD or linker-CBD, were cloned into the expression vector pTYB11. Recombinant CBDs were successfully purified by intein-mediated purification with an affinity chitin-binding domain. The final yields of recombinant CBD and linker-CBD were 3.2 mg/L and 1.4 mg/L, respectively. Functional binding of the recombinant CBDs was confirmed by Avicel binding experiments. This simple purification method using a self-cleavable intein tag can be further used in the pretreatment of crystalline cellulose or the characterization of engineered CBDs.

Semantic-Based K-Means Clustering for Microblogs Exploiting Folksonomy

  • Heu, Jee-Uk
    • Journal of Information Processing Systems / v.14 no.6 / pp.1438-1444 / 2018
  • Recently, with the development of Internet technologies and the spread of smart devices, use of microblogs such as Facebook, Twitter, and Instagram has been rapidly increasing. Many users check for new information on microblogs because the content on their timelines is continually updated. Clustering algorithms are therefore necessary to group microblog content for users who want the newest information. However, microblogs have word limits, so there is not enough information to analyze for content clustering. In this paper, we propose a semantic-based K-means clustering algorithm that measures not only the similarity between data represented in a vector space model, but also the semantic similarity between data by exploiting the TagCluster. Through experimental results on the RepLab2013 Twitter dataset, we show the effectiveness of the semantic-based K-means clustering algorithm.
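
The idea of combining vector-space similarity with a tag-based semantic similarity can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm: the abstract does not define how the TagCluster similarity is computed, so Jaccard overlap between tag sets stands in for it, and the weight `alpha`, the document layout (`"tf"` and `"tags"` keys), and all function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def jaccard(a, b):
    """Tag-set overlap; a stand-in (assumption) for a TagCluster similarity."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def combined_similarity(doc, centroid, alpha=0.5):
    """Weighted mix of vector-space and tag-based similarity."""
    return (alpha * cosine(doc["tf"], centroid["tf"])
            + (1.0 - alpha) * jaccard(doc["tags"], centroid["tags"]))


def assign_clusters(docs, centroids, alpha=0.5):
    """K-means assignment step: pick the most similar centroid per document."""
    return [
        max(range(len(centroids)),
            key=lambda k: combined_similarity(doc, centroids[k], alpha))
        for doc in docs
    ]
```

Only the assignment step is shown; a complete K-means loop would also recompute each centroid's term weights and tag set from its members until assignments stop changing.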

A Broad-Host-Range Promoter-Probe Vector, pKU20, and Its Use in Promoter Cloning and Expression of Bacillus thuringiensis Crystal Protein Gene in Pseudomonas putida

  • Shin, Byung Sik;Koo, Bon Tag;Park, Seung Hwan;Park, Ho Yong;Kim, Jeong Il
    • Journal of Microbiology and Biotechnology / v.1 no.4 / pp.240-245 / 1991
  • We have constructed a promoter-probe vector, pKU20, using pKT230, a derivative of the broad-host-range plasmid RSF1010, as a base. pKU20 contains the structural gene for aminoglycoside phosphotransferase (aph) without its promoter, and a multiple cloning site upstream of aph. Using this vector, a 412 base pair (bp) PstI fragment showing strong promoter activity in both Escherichia coli LE392 and Pseudomonas putida KCTC1644 was cloned from Pseudomonas fluorescens chromosomal DNA on the basis of streptomycin resistance. The nucleotide sequence of the 412 bp fragment was determined, and the putative -35 and -10 regions were identified. The insecticidal protein gene of Bacillus thuringiensis subsp. kurstaki HD-73, inserted downstream of the promoter-like DNA fragment, was efficiently expressed in E. coli and P. putida. The toxin protein was efficiently synthesized in an insoluble form in both strains.

Instruction Queue Architecture for Low Power Microprocessors (마이크로프로세서 전력소모 절감을 위한 명령어 큐 구조)

  • Choi, Min;Maeng, Seung-Ryoul
    • Journal of the Institute of Electronics Engineers of Korea SD / v.45 no.11 / pp.56-62 / 2008
  • Modern microprocessors must deliver high application performance, while the design process must not neglect power. In terms of the performance and power tradeoff, the instruction window is particularly important, because a large instruction window enables high performance. However, naively scaling a conventional instruction window can severely increase complexity and power consumption. This paper explores an architecture-level approach to reducing power dissipation. We propose low-power issue logic with an efficient tag translation. The direct lookup table (DLT) issue logic eliminates the associative wake-up of a conventional instruction window. The tag translation scheme handles data dependencies and resource conflicts by using a bit-vector based structure. Experimental results show that, for SPEC2000 benchmarks, the proposed design reduces power consumption by 24.45% on average compared with the conventional approach.
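
To make the notion of bit-vector based dependency tracking concrete, the following is a generic sketch of wake-up without associative tag comparison. It is not the DLT design from the paper, whose details the abstract does not give. It assumes each queue entry keeps a bit vector with one bit per producer slot, and the class name `BitVectorIssueQueue` and its methods are hypothetical.

```python
class BitVectorIssueQueue:
    """Toy issue queue: each entry holds a bit vector of pending producers."""

    def __init__(self):
        # slot number -> bit vector; bit p set means "still waiting on slot p"
        self.waiting = {}

    def insert(self, slot, producer_slots):
        """Add an instruction that depends on the given producer slots."""
        mask = 0
        for producer in producer_slots:
            mask |= 1 << producer
        self.waiting[slot] = mask

    def ready(self):
        """Slots whose dependency vector is all zeros can issue."""
        return sorted(s for s, mask in self.waiting.items() if mask == 0)

    def issue(self, slot):
        """Issue a ready instruction and clear its bit in every other entry."""
        if self.waiting.get(slot) != 0:
            raise ValueError("slot is not ready to issue")
        del self.waiting[slot]
        clear = ~(1 << slot)
        for other in self.waiting:
            # A single AND per entry replaces a broadcast tag comparison.
            self.waiting[other] &= clear


# Usage: slot 1 depends on slot 0; slot 2 depends on slots 0 and 1.
queue = BitVectorIssueQueue()
queue.insert(0, [])
queue.insert(1, [0])
queue.insert(2, [0, 1])
```

In hardware the per-entry AND corresponds to clearing one column of a dependency matrix, which is why no content-addressable tag match is needed in this style of design.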

Cloning and Characterization of a Bile Salt Hydrolase from Enterococcus faecalis Strain Isolated from Healthy Elderly Volunteers (사람 분변에서 분리한 Enterococcus faecalis가 생성하는 Bile Salt Hydrolase의 특징)

  • Eom, Seok-Jin;Kim, Geun-Bae
    • Journal of Dairy Science and Biotechnology / v.29 no.1 / pp.49-54 / 2011
  • Bile salt hydrolase (BSH, EC 3.5.1.24) activity, which cleaves the amide bond between the carboxyl group of a bile acid and the amino group of glycine or taurine, is commonly detected in gut-associated species of humans and animals. During screening for BSH-active strains from fecal samples of elderly human volunteers, strain CU30-2 was isolated on the basis of its high BSH activity. A bsh gene of the isolate was cloned into the pET22b expression vector and overexpressed in Escherichia coli BL21 (DE3) Gold by induction with 1 mM IPTG. The overexpressed BSH enzyme with a 6x His-tag was purified to apparent homogeneity using a Ni$^{2+}$-NTA agarose column and characterized. The BSH enzyme of E. faecalis CU30-2 exhibited approximately 50 times higher activity against glyco-conjugated bile salts than against tauro-conjugated bile salts, with the highest activity against glycocholic acid. Considering the prevalence of E. faecalis strains in the human GI tract and the glyco-conjugate-dominated bile acid composition of human bile, further study is needed to investigate the impact of the BSH activity exerted by E. faecalis strains on the host as well as on the BSH-producing strains.

Expression and Purification of a Recombinant scFv towards the Exotoxin of the Pathogen, Burkholderia pseudomallei

  • Lim, Kue-Peng;Li, Hong-Bin;Sheila Nathan
    • Journal of Microbiology / v.42 no.2 / pp.126-132 / 2004
  • A single chain variable fragment (scFv) specific for B. pseudomallei exotoxin had previously been generated from an existing hybridoma cell line (6E6AF83B) and cloned into the phage display vector pComb3H. In this study, the scFv was subcloned into the pComb3X vector to facilitate the detection and purification of expressed antibodies. Detection was facilitated by the presence of a hemagglutinin (HA) tag, and purification by the presence of a histidine tag. The culture was grown at 30$^{\circ}$C until log phase was reached and then induced with 1 mM IPTG in the absence of any additional carbon source. Induction was continued at 30$^{\circ}$C for 5 h. The scFv was detected by two methods: direct enzyme-linked immunosorbent assay (ELISA) and Western blotting. Compared with E. coli strains ER2537 and HB2151, scFv expression was highest in the E. coli strain Top10F'. The expressed scFv protein was purified by nickel-mediated affinity chromatography, and the results indicated that two proteins, a 52 kDa protein and a 30 kDa protein, were co-purified. These antibodies, when blotted against immobilized exotoxin, exhibited significant specificity for the exotoxin compared with other B. pseudomallei antigens. Thus, these antibodies should serve as suitable reagents for future affinity purification of the exotoxin.

Terms Based Sentiment Classification for Online Review Using Support Vector Machine (Support Vector Machine을 이용한 온라인 리뷰의 용어기반 감성분류모형)

  • Lee, Taewon;Hong, Taeho
    • Information Systems Review / v.17 no.1 / pp.49-64 / 2015
  • Customer reviews, which include subjective opinions about products or services in online stores, have been generated rapidly, and their influence on customers has become immense due to the widespread use of SNS. In addition, a number of studies have focused on opinion mining to analyze positive and negative opinions and to find better solutions for customer support and sales. For opinion mining, it is very important to select the key terms that reflect the customers' sentiment in the reviews. We propose a document-level, terms-based sentiment classification model that selects the optimal terms using part-of-speech (POS) tags. Support vector machines (SVMs) are used to build a predictor for opinion mining, and we used combinations of POS tags and four term extraction methods for SVM feature selection. To validate the proposed opinion mining model, we applied it to customer reviews on Amazon. After crawling 80,000 reviews, we eliminated meaningless terms, known as stopwords, and extracted useful terms using a part-of-speech tagging approach. The extracted terms were ranked by document frequency, TF-IDF, information gain, and the chi-squared statistic, and 20 ranked terms were used as features of the SVM model. Our experimental results show that the SVM model with four POS tags outperforms the benchmark model, which is built by extracting only adjective terms. In addition, the SVM model based on the chi-squared statistic shows the best performance among the SVM models with the four different term extraction methods. Our proposed opinion mining model is expected to improve customer service and provide a competitive advantage in online stores.
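
One of the four term-ranking criteria named in this abstract, the chi-squared statistic, can be sketched for a two-class (positive/negative) review set as below. This is a minimal illustration using the standard 2x2 contingency form of the statistic, not the authors' pipeline. The function names and the representation of each review as a set of terms are assumptions, and `top_k=20` simply mirrors the 20 ranked terms mentioned in the abstract.

```python
def chi_squared(a, b, c, d):
    """Chi-squared statistic for a 2x2 term/class contingency table.

    a: positive docs containing the term   b: negative docs containing it
    c: positive docs without the term      d: negative docs without it
    """
    n = a + b + c + d
    denominator = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - b * c) ** 2 / denominator if denominator else 0.0


def rank_terms(docs, labels, top_k=20):
    """Rank terms by chi-squared score; ties are broken alphabetically.

    docs: list of sets of terms (assumed already POS-filtered)
    labels: list of bools, True for a positive review
    """
    vocabulary = set().union(*docs)
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    scores = {}
    for term in vocabulary:
        a = sum(1 for doc, y in zip(docs, labels) if y and term in doc)
        b = sum(1 for doc, y in zip(docs, labels) if not y and term in doc)
        scores[term] = chi_squared(a, b, n_pos - a, n_neg - b)
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


# Usage: four toy reviews, two positive and two negative.
docs = [{"great", "phone"}, {"great", "battery"},
        {"poor", "phone"}, {"poor", "battery"}]
labels = [True, True, False, False]
selected = rank_terms(docs, labels, top_k=2)
```

The selected terms would then serve as the feature set for a classifier such as an SVM. Terms that occur equally often in both classes, like "phone" here, score zero and are ranked last.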