• Title/Abstract/Keywords: similarity weight

376 search results, processing time 0.029 s

PMCN: Combining PDF-modified Similarity and Complex Network in Multi-document Summarization

  • Tu, Yi-Ning;Hsu, Wei-Tse
    • International Journal of Knowledge Content Development & Technology / Vol. 9, No. 3 / pp.23-41 / 2019
  • This study combines the concept of degree centrality in complex networks with the Term Frequency $^*$ Proportional Document Frequency ($TF^*PDF$) algorithm; the combined method, called PMCN (PDF-Modified similarity and Complex Network), constructs relationship networks among sentences for writing news summaries. The PMCN method is a multi-document summarization extension of the ideas of Bun and Ishizuka (2002), who first published the $TF^*PDF$ algorithm for detecting hot topics. In their $TF^*PDF$ algorithm, Bun and Ishizuka defined the publisher of a news item as its channel. If the PDF weight of a term is higher than the weights of other terms, then the term is hotter than the other terms. This study, however, aims to produce summaries of news items. Because the $TF^*PDF$ algorithm summarizes daily news, PMCN replaces the concept of "channel" with "the date of the news event" and uses the resulting chronological ordering in a multi-document summarization algorithm, whose F-measure scores were 0.042 and 0.051 higher than those of LexRank on the well-known d30001t and d30003t tasks, respectively.
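The idea of ranking sentences by degree centrality in a sentence network can be illustrated with a minimal Python sketch. It is not the PMCN method itself: the abstract does not give the PDF-modified similarity, so plain term-frequency cosine similarity stands in for it, and the whitespace tokenizer, the 0.2 edge threshold, and the function names are all assumptions made for illustration.

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity of two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def degree_centrality(sentences, threshold=0.2):
    """Number of other sentences each sentence is linked to.

    Two sentences are linked when their cosine similarity reaches the
    threshold (an assumed value, not one taken from the paper).
    """
    vecs = [Counter(s.lower().split()) for s in sentences]
    degree = [0] * len(vecs)
    for i in range(len(vecs)):
        for j in range(i + 1, len(vecs)):
            if cosine(vecs[i], vecs[j]) >= threshold:
                degree[i] += 1
                degree[j] += 1
    return degree

def summarize(sentences, k=1, threshold=0.2):
    """Return the k most central sentences in their original order."""
    degree = degree_centrality(sentences, threshold)
    # Ties are broken in favour of the earlier sentence.
    top = sorted(range(len(sentences)), key=lambda i: (-degree[i], i))[:k]
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(sentences, k=3)` on a list of news sentences would return the three most connected sentences; swapping `cosine` for a different similarity changes which edges exist but not the ranking logic.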

Relevancy contemplation in medical data analytics and ranking of feature selection algorithms

  • P. Antony Seba;J. V. Bibal Benifa
    • ETRI Journal / Vol. 45, No. 3 / pp.448-461 / 2023
  • This article performs a detailed data scrutiny on a chronic kidney disease (CKD) dataset to select efficient instances and relevant features. Data relevancy is investigated using feature extraction, hybrid outlier detection, and handling of missing values. Data instances that do not influence the target are removed using data envelopment analysis to enable reduction of rows. Column reduction is achieved by ranking the attributes through feature selection methodologies, namely, extra-trees classifier (ETC), recursive feature elimination, chi-squared test, analysis of variance, and mutual information. These methodologies are ranked via Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) using weight optimization to identify the optimal features for model building from the CKD dataset to facilitate better prediction while diagnosing the severity of the disease. An efficient hybrid ensemble and novel similarity-based classifiers are built using the pruned dataset, and the results are thereafter compared with random forest, AdaBoost, naive Bayes, k-nearest neighbors, and support vector machines. The hybrid ensemble classifier yields a better prediction accuracy of 98.31% for the features selected by ETC, which is ranked as the best by TOPSIS.
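The TOPSIS ranking step can be sketched in a few lines of Python. This is the textbook form of TOPSIS (vector normalization, weighting, distance to the ideal and anti-ideal points), not the authors' implementation: the abstract does not describe their criteria or their weight optimization, so the weights are simply passed in, and the function name `topsis` is an assumption.

```python
import math

def topsis(matrix, weights, benefit):
    """Closeness-to-ideal score in [0, 1] for each alternative.

    matrix[i][j] is the value of alternative i on criterion j;
    benefit[j] is True when larger values of criterion j are better.
    """
    n_crit = len(weights)
    # Vector-normalize each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0
             for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)]
         for row in matrix]
    cols = list(zip(*v))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)   # distance to the ideal solution
        d_neg = math.dist(row, anti)    # distance to the anti-ideal solution
        total = d_pos + d_neg
        scores.append(d_neg / total if total else 0.0)
    return scores
```

Each row could hold, for example, the accuracy reached with one feature selection method and the number of features it kept, with `benefit=[True, False]`; the method with the highest score ranks first.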

Molecular Identification and Expression of Myosin Light Chain in Shortspine Spurdog (Squalus mitsukurii)

  • Kim, Soo Cheol;Sumi, Kanij Rukshana;Sharker, Md Rajib;Kho, Kang Hee
    • 한국해양생명과학회지 / Vol. 3, No. 1 / pp.1-8 / 2018
  • Myosin is considered the vital motor protein in vertebrates and invertebrates. The present study was conducted to examine the occurrence of myosin in the dogfish (Squalus mitsukurii). We isolated one clone containing a 979 bp cDNA sequence, which consisted of a complete coding sequence of 453 bp and a deduced sequence of 150 amino acids from the open reading frame, with a molecular weight, isoelectric point and aliphatic index of 16.72 kDa, 4.49 and 78.00, respectively. It contained a 428 bp 3' UTR with a single potential polyadenylation signal (AATAAA). The predicted EF-hand Ca2+-binding domains were identified at residues 6-41, 83-118 and 133-150. A BLAST search indicated that this protein exhibits strong similarity to whale shark (Rhincodon typus) MLC3 (91% identical) and to house mouse (Mus musculus) MLC isoform 3f (81% identical). Phylogenetic analysis revealed that this protein is an MLC3 isoform-like protein. The protein also shows regions that are highly conserved among other myosin proteins. Homology modeling of the S. mitsukurii protein was performed using the crystal structure of Gallus gallus skeletal muscle myosin II, chosen for its high similarity. Reverse transcription-polymerase chain reaction (PCR) and quantitative PCR results show that the dogfish myosin protein is highly expressed in muscle tissue.

퍼지 유사관계를 이용한 다차원 특징들의 가중치 결정과 감성기반 음악검색 (The Weight Decision of Multi-dimensional Features using Fuzzy Similarity Relations and Emotion-Based Music Retrieval)

  • 임지혜;이준환
    • 한국지능시스템학회논문지 / Vol. 21, No. 5 / pp.637-644 / 2011
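  • With the digitization of music, it has become easy to purchase and listen to music. However, users still have difficulty finding music that matches their taste among the large volume available when they rely on traditional music information such as artist, genre, title, and album title. To address this difficulty, content-based and emotion-based music retrieval methods have been proposed and developed. For emotion-based music retrieval, this paper proposes a new method for determining the importance of MPEG-7 low-level audio descriptors, which take the form of multi-dimensional vectors. The proposed method measures, from the viewpoint of each multi-dimensional descriptor, the similarity of music pieces that represent mutually opposing emotions, and determines the importance of each descriptor from this similarity relation using rough approximation and the ratio of within-cluster to between-cluster similarity. The weights determined from this importance are used to aggregate the similarities of several audio descriptors, and emotion-based music retrieval is performed with the aggregated similarity. In experiments on an emotion-based music retrieval framework built on content-based music retrieval, the proposed method gave better retrieval results than the existing heuristic method in terms of the average number of retrieved items.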

유한요소법 및 다구찌 기법에 의한 소형항공기용 HUMS 하우징 경량화 (Weight Lightening of HUMS Housing for Small Aircraft by Using FEM and Taguchi Method)

  • 김진수;윤대원;박태상;정재은;오재응
    • 한국소음진동공학회논문집 / Vol. 23, No. 12 / pp.1045-1055 / 2013
  • The safety inspection systems of domestic aircraft currently depend heavily on imports, and localization of the HUMS for small aircraft is now required. In this study, design factors for reducing the weight of a HUMS for small aircraft were selected using the design tool Pro-Engineer and Abaqus. Nine models were generated through a design of experiments based on the Taguchi method, and the weight-reduction model was selected through in-operation vibration and shock analyses using the test profile values. After the HUMS was fabricated, tests with the same profile values as in the analysis confirmed that the analyzed and measured values were similar. As a result of the weight reduction, which was the purpose of the study, the electronic performance required for small aircraft was assured and a design reducing the weight by 15 % relative to the target weight was derived. In addition, the lightweight model was verified to satisfy the maximum allowable displacement of the PCB (printed circuit board) and therefore the electronic properties of the HUMS. In this study, the reliability of the product was verified through ground tests. If the reliability of the HUMS is also verified through future flight tests, it is expected to contribute greatly to the localization of aerospace electronic equipment.

면 객체 매칭을 위한 판별모델의 성능 평가 (Evaluation of Classifiers Performance for Areal Features Matching)

  • 김지영;김정옥;유기윤;허용
    • 한국측량학회지 / Vol. 31, No. 1 / pp.49-55 / 2013
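  • This study applies classifier performance evaluation methods from data mining and biometrics to the matching of heterogeneous spatial datasets in order to identify a classifier that yields good matching results. To this end, the distance values of candidate matching object pairs are computed for each matching criterion and normalized with the Min-Max and Tanh methods to obtain similarities. Classifiers were then applied that derive shape similarity by combining the computed similarities with the CRITIC, Matcher Weighting, and Simple Sum methods. When each classifier was evaluated with the PR curve and AUC-PR, the classifier using Tanh normalization and the Simple Sum method had the highest AUC-PR, 0.893. Therefore, for matching heterogeneous spatial datasets, a classifier that computes the similarity for each matching criterion with Tanh normalization and obtains shape similarity with the Simple Sum method is considered suitable.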
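The normalize-then-combine pipeline described in this abstract can be sketched as follows. Several details are assumptions rather than the authors' procedure: distances are turned into similarities by subtracting the normalized value from 1, the Tanh normalization uses the plain mean and population standard deviation in place of robust estimators, the constant 0.01 is the value commonly quoted for the tanh estimator in the score normalization literature, and all function names are illustrative.

```python
import math
from statistics import mean, pstdev

def min_max_similarity(distances):
    """Min-Max normalize distances to [0, 1] and flip them into similarities."""
    lo, hi = min(distances), max(distances)
    if hi == lo:
        return [1.0] * len(distances)
    return [1.0 - (d - lo) / (hi - lo) for d in distances]

def tanh_similarity(distances):
    """Tanh-normalize distances to (0, 1) and flip them into similarities."""
    mu, sigma = mean(distances), pstdev(distances)
    if sigma == 0:
        return [0.5] * len(distances)
    return [1.0 - 0.5 * (math.tanh(0.01 * (d - mu) / sigma) + 1.0)
            for d in distances]

def simple_sum(per_criterion):
    """Simple Sum fusion: add the similarities of each candidate pair.

    per_criterion is a list with one similarity list per matching
    criterion, all ordered by the same candidate pairs.
    """
    return [sum(vals) for vals in zip(*per_criterion)]
```

For two hypothetical criteria, `simple_sum([tanh_similarity(position_dist), tanh_similarity(shape_dist)])` gives one fused score per candidate pair, and a threshold on that score would decide whether the pair is accepted as a match.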

In silico characterisation, homology modelling and structure-based functional annotation of blunt snout bream (Megalobrama amblycephala) Hsp70 and Hsc70 proteins

  • Tran, Ngoc Tuan;Jakovlic, Ivan;Wang, Wei-Min
    • Journal of Animal Science and Technology / Vol. 57, No. 12 / pp.44.1-44.9 / 2015
  • Background: Heat shock proteins play an important role in protection from stress stimuli and metabolic insults in almost all organisms. Methods: In this study, computational tools were used to deeply analyse the physicochemical characteristics and, using homology modelling, reliably predict the tertiary structure of the blunt snout bream (Ma-) Hsp70 and Hsc70 proteins. Derived three-dimensional models were then used to predict the function of the proteins. Results: Previously published predictions regarding the protein length, molecular weight, theoretical isoelectric point and total number of positive and negative residues were corroborated. Among the new findings are: the extinction coefficient (33725/33350 and 35090/34840 - Ma-Hsp70/Ma-Hsc70, respectively), instability index (33.68/35.56 - both stable), aliphatic index (83.44/80.23 - both very stable), half-life estimates (both relatively stable), grand average of hydropathicity (-0.431/-0.473 - both hydrophilic) and amino acid composition (alanine-lysine-glycine/glycine-lysine-aspartic acid were the most abundant, no disulphide bonds, the N-terminal of both proteins was methionine). Homology modelling was performed with the SWISS-MODEL program and the proposed model was evaluated as highly reliable based on PROCHECK's Ramachandran plot, ERRAT, PROVE, Verify 3D, ProQ and ProSA analyses. Conclusions: The research revealed a high structural similarity to Hsp70 and Hsc70 proteins from several taxonomically distant animal species, corroborating a remarkably high level of evolutionary conservation among the members of this protein family. Functional annotation based on structural similarity provides reliable additional indirect evidence for a high level of functional conservation of these two genes/proteins in blunt snout bream, but it is not sensitive enough to functionally distinguish the two isoforms.

Molecular Characterization of a Defensin-like Peptide from Larvae of a Beetle, Protaetia brevitarsis

  • Hwang, Jae-Sam;Kang, Bo-Ram;Kim, Seong-Ryul;Yun, Eun-Young;Park, Kwan-Ho;Jeon, Jae-Pil;Nam, Sung-Hee;Suh, Hwa-Jin;Hong, Mee-Yeon;Kim, Ik-Soo
    • International Journal of Industrial Entomology and Biomaterials / Vol. 17, No. 1 / pp.131-135 / 2008
  • A cDNA encoding a defensin-like peptide (Protaetiamycine) from the larvae of a beetle, Protaetia brevitarsis, was cloned. The cDNA encoded a deduced propeptide of 79 amino acid residues with a predicted molecular weight of 8.4 kDa and a pI of 8.24. The overall amino acid sequence of this protein has 39% similarity to that of Rhodnius prolixus defensin, 43% similarity to that of Acalolepta luxuriosa defensin, and 72% similarity to that of Oryctes rhinoceros defensin, suggesting that this gene is an insect defensin. In an attempt to apply the antibacterial peptide to the development of therapeutic agents, a 12-mer peptide amidated at its C-terminus, ACAAHCLAIGRG-$NH_2$ (Ala55-Lys66-$NH_2$, 12Pbn), was synthesized. This peptide showed some antifungal activity against Candida albicans. To increase antifungal activity, six 9-mer peptides were synthesized by modifying the amino acid sequence of the 12Pbn fragment. Among these peptides, 9Pbm3-9Pbm6 exhibited strong activity compared with cecropin B and melittin.

단백질 서열의 n-Gram 자질을 이용한 세포내 위치 예측 (Classification Protein Subcellular Locations Using n-Gram Features)

  • 김진숙
    • 한국콘텐츠학회:학술대회논문집 / Proceedings of the Korea Contents Association 2007 Fall Conference / pp.12-16 / 2007
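  • The function of a protein is closely related to the subcellular location where it performs that function. Therefore, once the sequence of a new protein is determined, identifying its subcellular location is biologically very important. This paper presents a new subcellular location prediction method that uses protein n-grams and a kNN (k-Nearest Neighbor) classifier. The method estimates the subcellular location of an input protein by aggregating the subcellular location information of the k proteins with the highest similarity weights to the input sequence. The similarity weight between two proteins is computed by comparing the 5-gram features of the two sequences. To verify the prediction accuracy, a large test collection was built by extracting 51,885 sequences with known subcellular locations from the SWISS-PROT protein database, and a second, small test collection provided by other researchers was also used in the experiments. The prediction method achieved about 93% accuracy on the large test collection, and a comparison with previous experiments on the small test collection also showed that it outperforms other systems.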
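A minimal Python sketch of kNN classification over sequence 5-grams is given below. The abstract does not state how the similarity weight is computed from the 5-gram features, so Jaccard overlap of the 5-gram sets is used here as an assumed stand-in, votes are weighted by that similarity, and the function names and toy sequences are hypothetical.

```python
from collections import Counter

def ngrams(seq, n=5):
    """Set of all overlapping n-grams in a sequence string."""
    return {seq[i:i + n] for i in range(len(seq) - n + 1)}

def similarity(a, b, n=5):
    """Jaccard overlap of the n-gram sets (an assumed similarity weight)."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga or not gb:
        return 0.0
    return len(ga & gb) / len(ga | gb)

def predict_location(query, labelled, k=3, n=5):
    """Similarity-weighted vote among the k most similar labelled sequences.

    labelled is a list of (sequence, location) pairs. Returns None when
    no labelled sequence shares an n-gram with the query.
    """
    scored = sorted(
        ((similarity(query, seq, n), loc) for seq, loc in labelled),
        key=lambda pair: -pair[0],
    )[:k]
    votes = Counter()
    for weight, loc in scored:
        votes[loc] += weight
    if not votes or max(votes.values()) == 0:
        return None
    return max(votes, key=votes.get)

# Toy data, invented for illustration only.
labelled = [
    ("MKTAYIAKQRQISFVK", "cytoplasm"),
    ("MKTAYIAKQRQLSFVK", "cytoplasm"),
    ("GSHMLEDPVVVLLLAA", "membrane"),
]
print(predict_location("MKTAYIAKQRQISFVR", labelled))
```

A collection of the size mentioned in the abstract would need an inverted index from 5-grams to sequences rather than this pairwise comparison, but the voting step would stay the same.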


고객 맞춤 서비스를 위한 HPPS(Hybrid Preference Prediction System) 설계 (A Design of HPPS(Hybrid Preference Prediction System) for Customer-Tailored Service)

  • 정은희;이병관
    • 한국멀티미디어학회논문지 / Vol. 14, No. 11 / pp.1467-1477 / 2011
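  • To accurately predict preferences for customer-tailored services, this paper proposes the design of HPPS (Hybrid Preference Prediction System), which uses user profile analysis and inter-user similarity analysis. Unlike the existing NBCFA (Neighborhood Based Collaborative Filtering Algorithm), the proposed design first uses the average value for an item in the preference prediction formula when no neighbor has rated the item; second, reflects in the preference prediction formula a weight derived from an analysis of user characteristics; and finally, improves the preference accuracy of HPPS by reflecting similarity, whether the item has been rated, and the number of ratings when selecting nearest neighbors. As a result, with the first and second preference prediction formulas, the accuracy of HPPS improved by 97.24% over the existing NBCFA, and with the nearest-neighbor selection method, the accuracy of the HPPS system improved by 75%.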
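The first modification, falling back to the item's average rating when no neighbor has rated the item, can be illustrated with a small neighborhood-based collaborative filtering sketch. This is not the HPPS prediction formula, which the abstract does not give: Pearson correlation as the user similarity, the similarity-weighted average, and the names `pearson` and `predict` are assumptions, and the user-characteristic weight is left out because its form is not described.

```python
import math

def pearson(ra, rb):
    """Pearson correlation over the items both users rated (0.0 if undefined)."""
    common = [i for i in ra if i in rb]
    if len(common) < 2:
        return 0.0
    ma = sum(ra[i] for i in common) / len(common)
    mb = sum(rb[i] for i in common) / len(common)
    num = sum((ra[i] - ma) * (rb[i] - mb) for i in common)
    den = (math.sqrt(sum((ra[i] - ma) ** 2 for i in common))
           * math.sqrt(sum((rb[i] - mb) ** 2 for i in common)))
    return num / den if den else 0.0

def predict(ratings, user, item, k=2):
    """Predict a rating from the k most similar users who rated the item.

    ratings maps user -> {item: rating}. When no positively similar
    neighbor has rated the item, the item's mean rating is returned;
    None is returned when nobody has rated the item at all.
    """
    neighbours = sorted(
        ((pearson(ratings[user], r), u) for u, r in ratings.items()
         if u != user and item in r),
        reverse=True,
    )[:k]
    neighbours = [(s, u) for s, u in neighbours if s > 0]
    if neighbours:
        total = sum(s for s, _ in neighbours)
        return sum(s * ratings[u][item] for s, u in neighbours) / total
    # Fallback: average rating of the item over all users who rated it.
    item_ratings = [r[item] for r in ratings.values() if item in r]
    return sum(item_ratings) / len(item_ratings) if item_ratings else None
```

The fallback keeps the system able to answer for sparsely rated items, at the cost of returning a non-personalized value in those cases.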