• Title/Summary/Keyword: similarity weight


TF-IDF Based Association Rule Analysis System for Medical Data (의료 정보 추출을 위한 TF-IDF 기반의 연관규칙 분석 시스템)

  • Park, Hosik;Lee, Minsu;Hwang, Sungjin;Oh, Sangyoon
    • KIPS Transactions on Software and Data Engineering / v.5 no.3 / pp.145-154 / 2016
  • Owing to recent interest in u-Health and advances in IT, the need to utilize medical information data has increased. Among previous studies that apply data mining algorithms to medical information data, some use association rule analysis. These studies aim to discover associations between symptoms and specific diseases; however, in most cases they do not consider infrequent terms that can carry important information for diagnosing a disease. In this paper, we propose a new association rule mining system that uses TF-IDF weights to account for the importance of each term, so that infrequent but important items are considered. In addition, the proposed system can predict candidate diagnoses from medical text records using term similarity analysis based on a medical ontology.
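
To make the TF-IDF weighting in this abstract concrete, here is a minimal sketch of the standard formula (term frequency times log inverse document frequency). The abstract does not give the paper's exact variant, and the `tf_idf` helper and the symptom terms are hypothetical. It illustrates how a term that appears in every record gets weight zero while a rare term gets a high weight.

```python
import math
from collections import Counter

def tf_idf(records):
    """Return one {term: weight} dict per record (each record is a list of terms)."""
    n = len(records)
    # Document frequency: number of records that contain each term.
    df = Counter(term for rec in records for term in set(rec))
    weights = []
    for rec in records:
        tf = Counter(rec)
        weights.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / len(rec)) * math.log(n / df[term])
            for term, count in tf.items()
        })
    return weights

# Hypothetical symptom terms, not taken from the paper's data.
records = [
    ["fever", "cough", "headache"],
    ["fever", "cough", "fatigue"],
    ["fever", "rash", "joint_pain"],
]
weights = tf_idf(records)
```

Here "fever" occurs in every record and is weighted 0, whereas "rash" occurs once and receives the largest weight in its record, which is the property the abstract relies on to keep infrequent items in play.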

Cloning and Identification of Essential Residues for Thermostable β-glucosidase (BglB) from Thermotoga maritima (Thermotoga maritima로부터 고온성 β-glucosidase (BglB)의 클로닝과 필수아미노산 잔기의 확인)

  • Hong, Su-Young;Cho, Kye-Man;Kim, Yong-Hee;Hong, Sun-Joo;Cho, Soo-Jeong;Cho, Yong-Un;Kim, Hoon;Yun, Han-Dae
    • Journal of Life Science / v.16 no.7 s.80 / pp.1148-1157 / 2006
  • The hyperthermophilic bacterium Thermotoga maritima produces a thermostable $\beta$-glucosidase. The gene encoding $\beta$-glucosidase from T. maritima MSB8 was cloned and expressed in Escherichia coli. The enzyme (BglB) hydrolyzed $\beta$-glucosidic linkages between glucose and alkyl, aryl, or saccharide groups in substrates such as salicin, arbutin, and pNPG. The inserted DNA contained an ORF of 2,166 bp encoding 721 amino acids (calculated molecular mass of 80,964 Da and pI of 4.93). The amino acid sequence of BglB showed similarity to family 3 glycosyl hydrolases. The molecular weight of the enzyme was estimated to be approximately 81 kDa by MUG-nondenaturing PAGE (4-methylumbelliferyl $\beta$-D-glucoside-nondenaturing polyacrylamide gel electrophoresis) and SDS-PAGE. The $\beta$-glucosidase exhibited maximal activity at pH 7.0 and $80^{\circ}C$. Replacing two candidate residues (Glu-232 and Asp-242) with Ala by site-directed mutagenesis showed that both are essential for enzymatic activity.

Protease Properties of Protease-Producing Bacteria Isolated from the Digestive Tract of Octopus vulgaris (Octopus vulgaris의 장관으로부터 분리한 단백질 분해효소 생성 균주와 생성된 효소의 특성)

  • Liu, Qing;Ren, Pei;Piao, Meizi;Yang, Ji-Young
    • Journal of Life Science / v.23 no.12 / pp.1486-1494 / 2013
  • A high protease-producing strain was isolated from the digestive tract of Octopus vulgaris by detecting protease hydrolysis circles and measuring protease activity. The strain was identified by morphological observation, biochemical experiments, and 16S rRNA sequence analysis. The protease obtained from the strain was purified by a three-step process involving ammonium sulfate precipitation, carboxymethyl-cellulose (CM-52) cation-exchange chromatography, and DEAE-Sephadex A50 anion-exchange chromatography, and its properties were characterized. The strain with the highest protease activity, Bacillus sp. QDV-3, was isolated. On the basis of phenotypic and biochemical characterization and 16S rRNA gene sequencing, the isolate was classified as follows: domain: Bacteria; phylum: Firmicutes; class: Bacilli; order: Bacillales; family: Bacillaceae; and genus: Bacillus. The isolate showed 99.2% similarity with Bacillus flexus. A highly active protease, designated QDV-E, with a molecular weight of 61.6 kDa was obtained. The enzyme was active in the pH range of 9.0-9.5, and its optimum temperature was $40^{\circ}C$. More than 96% of the protease activity was retained after 60 min at $50^{\circ}C$. Phenylmethylsulfonyl fluoride (PMSF) inhibited the enzyme activity, confirming that the protease isolated from Bacillus sp. QDV-3 is an alkaline serine protease. The metal ions $Mn^{2+}$ and $Mg^{2+}$ enhanced the protease activity, whereas $Ba^{2+}$, $Zn^{2+}$, and $Cu^{2+}$ inactivated the enzyme.

Identification and inhibiting effect of Lactobacillus salivarius on the formation of plaque and the production of volatile sulfur compounds by anaerobic bacteria (치태형성과 혐기성 세균의 황화합물 생성에 대한 Lactobacillus salivarius의 억제효과 및 동정)

  • Kim, Mi-Hyung;Sun, Gem-Ju;Ahn, Yeon-Jun
    • Journal of Korean society of Dental Hygiene / v.5 no.2 / pp.131-145 / 2005
  • Some normal inhabitants of the body perform medically useful functions, and many kinds of bacteria perform specific functions in the oral cavity. Two strains of lactic acid bacteria that inhibited the production of volatile sulfur compounds by anaerobic bacteria were isolated from the normal inhabitants of children's oral cavities. The authors identified the isolates by 16S rDNA partial sequencing.
    1. Both isolates were Gram-positive bacilli and produced hydrogen peroxide.
    2. When Streptococcus mutans was cultured alone, the mean weight of artificial plaque formed on the orthodontic wires was $124.4 \pm 30.4$ mg; it was reduced to $5.2 \pm 2.0$ mg and $10.6 \pm 6.6$ mg when Streptococcus mutans was cultured together with each isolate, respectively (p < 0.05).
    3. The number of viable Streptococcus mutans cells was $3.4 \times 10^9$ per ml in the culture solution, whereas in the combined culture with each isolate it was $4.6 \times 10^8$ and $2.4 \times 10^8$ per ml.
    4. The optical density of the Fusobacterium nucleatum supernatant after vortexing for 30 minutes was 1.286, whereas that of the supernatant of Fusobacterium nucleatum combined with each isolate was reduced to 0.628 and 0.497; the percentages of coaggregation were 29.4% and 57.8%, respectively.
    5. The optical density of the Fusobacterium nucleatum precipitate was 1.794 in culture media containing cysteine and $FeSO_4$, and it was reduced to 1.144 and 0.915 in the coaggregated precipitates of Fusobacterium nucleatum and each isolate.
    6. The 16S rDNA sequence similarity between each isolate and Lactobacillus salivarius subsp. salicinius was 99.60% and 99.73%, respectively, indicating that the isolates were Lactobacillus salivarius subsp. salicinius.
    These results indicate that the two strains isolated from children's saliva, which inhibited the formation of plaque and the production of volatile sulfur compounds, were Lactobacillus salivarius subsp. salicinius.


Methodology of Prior Art Search Based on Hierarchical Citation Analysis (계층적 인용관계분석을 통한 선행기술 탐색방법론)

  • Kang, Jiho;Kim, Jongchan;Lee, Joonhyuck;Park, Sangsung;Jang, Dongsik
    • Journal of the Korean Institute of Intelligent Systems / v.27 no.1 / pp.72-78 / 2017
  • Prior art search is a core process of technology management performed by inventors and applicants, patent examiners, and employees in the patent industry. Because academic research on systematic prior art search methodology has been insufficient, the process has often depended on the subjective judgment of researchers. Previous studies that explore prior art based on semantics also risk underestimating the similarity of major prior art, because in patent documents the same technical ideas are expressed in various terms. In this study, we propose an effective prior art search methodology based on hierarchical citation analysis, which provides a clear criterion for selecting core prior art by calculating weights according to the relative importance of the collected patents. To verify the feasibility of the proposed methodology, a case study was conducted to explore the core prior art of one patent in the display field. As a result, 10 core prior art candidates were selected out of 206 precedent patents.
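
The abstract does not state how the weights are calculated, so the following is only an illustrative sketch of one way to weight patents in a citation hierarchy. The `citation_weights` function, the equal-split rule, and the patent labels are all assumptions for illustration, not the authors' method. Each patent splits its weight equally among the patents it cites, and a patent reached along several citation paths accumulates weight.

```python
from collections import defaultdict

def citation_weights(cites, target, max_depth=2):
    """Accumulate a weight for every patent reachable from `target` via backward citations."""
    weights = defaultdict(float)
    frontier = {target: 1.0}  # the target patent starts with weight 1
    for _ in range(max_depth):
        next_frontier = defaultdict(float)
        for patent, w in frontier.items():
            cited = cites.get(patent, [])
            for prior in cited:
                # Assumed rule: split the citing patent's weight equally among its citations.
                share = w / len(cited)
                weights[prior] += share
                next_frontier[prior] += share
        frontier = next_frontier
    return dict(weights)

# Hypothetical citation data: T cites A and B, A cites C, B cites C and D.
cites = {"T": ["A", "B"], "A": ["C"], "B": ["C", "D"]}
weights = citation_weights(cites, "T")
ranking = sorted(weights, key=weights.get, reverse=True)
```

In this toy hierarchy, patent C is cited by both first-level patents and ends up ranked first, which shows how a citation-based weight can surface a core prior art candidate without relying on shared terminology.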

Molecular Cloning of cDNA Encoding a Putative Eugenol Synthase in Tomato (Solanum lycopersicum 'Micro-Tom') and Prediction of 3D Structure and Physiochemical Properties (토마토 'Micro-Tom' 과실의 eugenol synthase 유전자 클로닝, 단백질의 3차 구조 및 생리화학적 특성 예측)

  • Kang, Seung-Won;Seo, Sang-Gyu;Lee, Tai-Ho;Lee, Gung-Pyo
    • Journal of agriculture & life science / v.46 no.4 / pp.9-20 / 2012
  • Eugenol is a volatile phenylpropene compound synthesized by eugenol synthase in various plants. However, the characteristics of eugenol synthase in tomato have not been reported. Therefore, we cloned a full-length cDNA of a putative eugenol synthase from tomato 'Micro-Tom' using the rapid amplification of cDNA ends (RACE) technique and named the clone SlEGS. The open reading frame of SlEGS was 921 bp long, and its deduced amino acid sequence was 307 amino acids. BLAST analysis indicated that SlEGS shared high similarity with PhEGS1 (67.1%) and CbEGS2 (69.4%). The amino acid composition of SlEGS was determined with the CLC genomics workbench tool, and the 3D structure of SlEGS was constructed by homology modeling using Swiss-PDB viewer and validated using the PROCHECK and ProSA-web tools. In addition, the physiochemical properties of SlEGS were evaluated using ExPASy's ProtParam tool. The molecular weight was 33.93 kDa and the isoelectric point was 5.85, indicating an acidic nature. Other properties such as the extinction coefficient, instability index, aliphatic index, and grand average of hydropathy were also analyzed.

Cloning and Expression of an Alkaline Protease from Bacillus clausii I-52 (Bacillus clausii I-52로부터 alkaline protease 유전자의 클로닝 및 발현)

  • Joo, Han-Seung;Choi, Jang Won
    • Journal of agriculture & life science / v.45 no.6 / pp.201-212 / 2011
  • The alkaline protease gene was cloned from the halo-tolerant alkalophilic Bacillus clausii I-52, which was isolated from a heavily polluted tidal mud flat of the West Sea in Inchon, Korea, and which produces a strong extracellular alkaline protease (BCAP). Based on the full genome sequence of Bacillus subtilis, PCR primers were designed to amplify and clone the intact pro-BCAP gene including its promoter region. The full-length gene consists of 1,143 bp and encodes 381 amino acids, including a putative signal peptide of 29 residues and an additional 77-amino-acid propeptide at its N-terminus. The mature BCAP deduced from the nucleotide sequence consists of 275 amino acids with an N-terminal Ala, and has a molecular weight of 27,698.7 Da and a pI of 6.3. The amino acid sequence shares the highest similarity (99%) with the nattokinase precursor from B. subtilis and the subtilisin E precursor from B. subtilis BSn5. Substrate specificity analysis indicated that the recombinant BCAP efficiently hydrolyzed the synthetic substrate N-Suc-Ala-Ala-Pro-Phe-pNA but did not hydrolyze substrates with basic amino acids at the P1 site. The recombinant BCAP was strongly inhibited by the typical serine protease inhibitor PMSF, indicating that BCAP is a member of the serine proteases.
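
The sequence lengths quoted in this abstract can be cross-checked with simple arithmetic. The short sketch below uses only the numbers given in the abstract; the variable names are illustrative, and the calculation assumes every codon of the 1,143 bp is counted as coding, which is the reading under which the reported figures agree.

```python
orf_bp = 1143          # full-length gene, from the abstract
signal_peptide = 29    # residues in the putative signal peptide
propeptide = 77        # residues in the N-terminal propeptide
mature_mass_da = 27698.7

# Three nucleotides per codon, one amino acid per codon.
codons = orf_bp // 3
# The mature enzyme is what remains after the signal peptide and propeptide are removed.
mature_residues = codons - signal_peptide - propeptide
# Average mass per residue, a rough plausibility check on the reported molecular weight.
mean_residue_mass = mature_mass_da / mature_residues

print(codons, mature_residues, round(mean_residue_mass, 1))
```

The script reproduces the 381 encoded residues and the 275-residue mature protein, and the mean residue mass of about 100.7 Da is in the range expected for a typical protein.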

Utilization of age information for speaker verification using multi-task learning deep neural networks (멀티태스크 러닝 심층신경망을 이용한 화자인증에서의 나이 정보 활용)

  • Kim, Ju-ho;Heo, Hee-Soo;Jung, Jee-weon;Shim, Hye-jin;Kim, Seung-Bin;Yu, Ha-Jin
    • The Journal of the Acoustical Society of Korea / v.38 no.5 / pp.593-600 / 2019
  • Similarity in tone between speakers can lower the performance of speaker verification. To improve the performance of speaker verification systems, we propose a multi-task learning technique in which a deep neural network learns both speaker information and age information. Multi-task learning can improve generalization performance because it helps prevent the hidden layers of a deep neural network from overfitting to one task. However, we found in experiments that age information is not learned well during training of the deep neural network. To improve the learning, we propose a method that dynamically changes the objective function weights of speaker identification and age estimation during training. On the RSR2015 evaluation data set, the equal error rate was 6.91% for the speaker verification system without age information, 6.77% using age information only, and 4.73% using age information when the weight change technique was applied.
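
As an illustration of dynamically changing the objective function weights, here is a minimal sketch of a two-task loss whose weights vary with the training epoch. The abstract does not describe the actual schedule, so the linear schedule, the `task_weights` and `combined_loss` names, and the start and end values are all hypothetical.

```python
def task_weights(epoch, total_epochs, age_start=0.5, age_end=0.1):
    """Return (speaker_weight, age_weight) for the given epoch.

    Assumed schedule: the age-task weight moves linearly from age_start
    to age_end, and the speaker-task weight takes the remainder.
    """
    progress = epoch / max(total_epochs - 1, 1)
    w_age = age_start + (age_end - age_start) * progress
    return 1.0 - w_age, w_age

def combined_loss(loss_speaker, loss_age, epoch, total_epochs):
    """Weighted sum of the speaker identification and age estimation losses."""
    w_spk, w_age = task_weights(epoch, total_epochs)
    return w_spk * loss_speaker + w_age * loss_age

# Example: with equal starting weights, losses of 2.0 and 4.0 combine to 3.0.
first_epoch_loss = combined_loss(2.0, 4.0, epoch=0, total_epochs=5)
```

In a real training loop the two loss values would come from the network's speaker and age output layers; only the weighting logic is shown here.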

Characteristics of Dioscorea alata L. Introduced from Tropical and Subtropical Regions (도입 마(Dioscorea alata L.)의 특성 분석)

  • Chang, K.J.;Yoo, K.O.;Park, C.H.;Park, J.I.;Hong, K.H.;Park, J.H.
    • Journal of Practical Agriculture & Fisheries Research / v.3 no.1 / pp.48-69 / 2001
  • Many clones of the genus Dioscorea have been introduced from tropical and subtropical regions since 1997. Morphological characteristics of 33 clones of water yam (Dioscorea alata L.) were investigated in the field. The total weight per stump ranged from 90 to 2,147 g with an average of 610 g, and the tuber number per stump ranged from 1.3 to 4.7 with an average of 2.8. The color tones of the tuber flesh were sorted into 3 categories (white, pale brown, and pale purple), and those of the leaves were sorted into 3 categories (green, heavy green, and purplish green). The intraspecific genetic relationships of 19 variation types of the yam, classified by external morphological characteristics such as leaf and tuber shape, were assessed at the DNA level using random and specific primers. Twenty-two out of 113 primers (100 random [10-mer] primers, two 15-mer [M13 core sequence and (GGAT)4 sequence]) were used in PCR amplification. Only 12 primers, however, amplified DNA successfully in all of the analyzed plants, resulting in 93 randomly and specifically amplified DNA fragments. The analyzed taxa showed very high polymorphism (69 bands, 71.0%), allowing each taxon to be identified by DNA fingerprinting. The proportion of monomorphic bands among the total amplified DNA bands of each primer was low, under 50%. Similarity indices between accessions were computed from the PCR (polymerase chain reaction) data, and the intraspecific variants were closely related, with similarity indices ranging from 0.66 to 0.90.
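
A similarity index between two accessions can be computed from the presence or absence of amplified bands. The abstract does not say which index was used; the sketch below assumes the Dice (Nei and Li) coefficient, a common choice for band data, and the `dice_similarity` name and band labels are hypothetical.

```python
def dice_similarity(bands_a, bands_b):
    """Dice coefficient: twice the shared bands divided by the total bands in both accessions."""
    a, b = set(bands_a), set(bands_b)
    if not a and not b:
        return 1.0  # two empty profiles are treated as identical
    return 2 * len(a & b) / (len(a) + len(b))

# Hypothetical band profiles, labeled as primer-fragment size.
accession_1 = {"P1-450", "P1-800", "P2-300", "P2-1200", "P3-650"}
accession_2 = {"P1-450", "P1-800", "P2-300", "P3-650", "P3-900"}

similarity = dice_similarity(accession_1, accession_2)
```

The two example profiles share four of their five bands each, giving an index of 0.8, which falls within the 0.66 to 0.90 range reported for these accessions.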

Production of a New Biosurfactant by a New Yeast Species Isolated from Prunus mume Sieb. et Zucc.

  • Jeong-Seon Kim;Miran Lee;Dae-Won Ki;Soon-Wo Kwon;Young-Joon Ko;Jong-Shik Kim;Bong-Sik Yun;Soo-Jin Kim
    • Journal of Microbiology and Biotechnology / v.33 no.8 / pp.1023-1029 / 2023
  • Biosurfactants reduce surface and interfacial tension owing to their amphiphilic properties and are an eco-friendly alternative to chemical surfactants. In this study, a new biosurfactant-producing yeast strain, JAF-11, was selected using the drop collapse method, and the properties of its extracts were investigated. The strain was identified by comparing its nucleotide sequences with those of closely related strains, based on the D1/D2 domain of the large subunit ribosomal DNA (LSU) and the internal transcribed spacer (ITS) regions. Neodothiora populina CPC 39399T, the species closest to strain JAF-11, showed sequence similarities of 97.75% for LSU and 94.27% for ITS. This result suggests that strain JAF-11 represents a distinct species that cannot be assigned to any existing genus or species in the family Dothideaceae. Strain JAF-11 produced a biosurfactant that reduced the surface tension of water from 72 mN/m to 34.5 mN/m on the sixth day of culture, and the critical micelle concentration (CMC) of the extracted crude biosurfactant was 24 mg/l. The molecular weight of the purified biosurfactant, 502, was confirmed by fast atom bombardment mass spectrometry. The chemical structure was analyzed by $^{1}H$ nuclear magnetic resonance (NMR), $^{13}C$ NMR, and two-dimensional NMR of the compound. The molecular formula was $C_{26}H_{46}O_{9}$, and the compound consisted of one octanoyl group and two hexanoyl groups attached to a myo-inositol moiety. This is the first report of a compound produced by the new yeast strain JAF-11.