• Title/Abstract/Keywords: score sequence

Search results: 175 items; processing time: 0.031 s

An end-to-end synthesis method for Korean text-to-speech systems

  • 최연주;정영문;김영관;서영주;김회린
    • 말소리와 음성과학 / Vol. 10, No. 1 / pp.39-48 / 2018
  • A typical statistical parametric speech synthesis (text-to-speech, TTS) system consists of separate modules, such as a text analysis module, an acoustic modeling module, and a speech synthesis module. This causes two problems: 1) expert knowledge of each module is required, and 2) errors generated in one module accumulate as they pass through the subsequent modules. An end-to-end TTS system can avoid these problems by synthesizing voice signals directly from an input string. In this study, we implemented an end-to-end Korean TTS system using Google's Tacotron, an end-to-end TTS system based on a sequence-to-sequence model with an attention mechanism. We used 4392 utterances spoken by a Korean female speaker, an amount that corresponds to 37% of the dataset Google used for training Tacotron. Our system obtained a mean opinion score (MOS) of 2.98 and a degradation mean opinion score (DMOS) of 3.25. We discuss the factors that affected training of the system. Experiments demonstrate that the post-processing network needs to be designed with the output language and input characters in mind, and that the maximum value of n for the n-grams modeled by the encoder should be kept small enough for the amount of training data.

Clinical validation of the 3-dimensional double-echo steady-state with water excitation sequence of MR neurography for preoperative facial and lingual nerve identification

  • Kwon, Dohyun;Lee, Chena;Chae, YeonSu;Kwon, Ik Jae;Kim, Soung Min;Lee, Jong-Ho
    • Imaging Science in Dentistry / Vol. 52, No. 3 / pp.259-266 / 2022
  • Purpose: This study aimed to evaluate the clinical usefulness of magnetic resonance (MR) neurography using the 3-dimensional double-echo steady-state with water excitation (3D-DESS-WE) sequence for the preoperative delineation of the facial and lingual nerves. Materials and Methods: Patients who underwent MR neurography for a tumor in the parotid gland area or for lingual neuropathy from January 2020 to December 2021 were reviewed. Preoperative MR neurography using the 3D-DESS-WE sequence was evaluated. The visibility of the facial nerve and lingual nerve was scored on a 5-point scale, with poor visibility as 1 point and excellent as 5 points. The facial nerve course relative to the tumor was identified as superficial, deep, or encased. This was compared to the actual nerve course identified during surgery. The operative findings in lingual nerve surgery were also described. Results: Ten patients with parotid tumors and 3 patients with lingual neuropathy were included. Among the 10 parotid tumor patients, 8 were diagnosed with benign tumors and 2 with malignant tumors. The median facial nerve visibility score was 4.5 points. The distribution of scores was as follows: 5 points in 5 cases, 4 points in 1 case, 3 points in 2 cases, and 2 points in 2 cases. The lingual nerve continuity score in the affected area was lower than in the unaffected area in all 3 patients. The average visibility score of the lingual nerve was 2.67 on the affected side and 4 on the unaffected side. Conclusion: This study confirmed that the preoperative localization of the facial and lingual nerves using MR neurography with the 3D-DESS-WE sequence was feasible and contributed to surgical planning for the parotid area and lingual nerve.
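As a quick arithmetic check on the facial nerve results above, the reported score distribution can be expanded into ten values and summarized. This is only a sketch that re-derives summary statistics from the counts given in the abstract; the variable names are illustrative and the mean is not a figure reported by the authors.

```python
from statistics import mean, median

# Facial nerve visibility scores as listed in the abstract:
# 5 points in 5 cases, 4 points in 1 case, 3 points in 2 cases, 2 points in 2 cases.
facial_scores = [5] * 5 + [4] * 1 + [3] * 2 + [2] * 2

facial_median = median(facial_scores)  # average of the two middle values of 10 cases
facial_mean = mean(facial_scores)      # not reported in the abstract; shown for context

print(len(facial_scores), facial_median, facial_mean)
```

The ten expanded values match the 10 parotid tumor patients, and the computed median agrees with the reported 4.5 points.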

A DNA Sequence Alignment Algorithm Using Quality Information and a Fuzzy Inference Method

  • 김광백
    • 지능정보연구 / Vol. 13, No. 2 / pp.55-68 / 2007
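  • In computational molecular biology, DNA sequence alignment algorithms have been improved in various ways. To improve on the existing alignment method that uses DNA base quality information, this paper proposes a DNA sequence alignment algorithm that combines quality information with fuzzy inference by applying a fuzzy logic system and the characteristics of DNA sequence fragments. The existing algorithm computes the alignment score by applying the quality information of each DNA base to the global alignment algorithm proposed by Needleman and Wunsch. However, because it uses the quality information of all bases, errors arise in computing the alignment score when the quality at the ends of a sequence is low. This paper improves the existing quality-based algorithm so that sequences are aligned accurately even when the quality at the ends of the DNA sequence is low. In addition, the length of the DNA sequence fragment and the frequency of low-quality bases are fed to the fuzzy logic system to dynamically adjust the mapping score parameter used in computing the alignment score. To evaluate the proposed algorithm, real genome data from NCBI (National Center for Biotechnology Information) were analyzed, and the results confirmed that the proposed algorithm aligns DNA sequences more efficiently than the existing algorithm that uses quality information alone.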
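The baseline idea in this abstract, a Needleman-Wunsch global alignment whose scores take base quality into account, can be sketched roughly as follows. This is a minimal illustration and not the paper's algorithm: the weighting rule (scaling each match or mismatch by the lower of the two base confidences, given as values in [0, 1]), the function name `nw_quality_score`, and the default scores are all assumptions, and the fuzzy adjustment of the mapping score parameter is not modeled.

```python
def nw_quality_score(a, qa, b, qb, match=1.0, mismatch=-1.0, gap=-1.0):
    """Global (Needleman-Wunsch) alignment score with quality weighting.

    a, b   : sequences (strings)
    qa, qb : per-base confidences in [0, 1], same lengths as a and b
    Assumption: each substitution score is scaled by min(confidence) of
    the two aligned bases, so low-quality bases contribute less.
    """
    n, m = len(a), len(b)
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [i * gap] + [0.0] * m
        for j in range(1, m + 1):
            weight = min(qa[i - 1], qb[j - 1])
            base = match if a[i - 1] == b[j - 1] else mismatch
            cur[j] = max(
                prev[j - 1] + base * weight,  # align a[i-1] with b[j-1]
                prev[j] + gap,                # gap in b
                cur[j - 1] + gap,             # gap in a
            )
        prev = cur
    return prev[m]


# A mismatch at a low-quality sequence end is penalized less than at full quality.
full = nw_quality_score("ACGT", [1.0] * 4, "ACGA", [1.0] * 4)
low_end = nw_quality_score("ACGT", [1.0, 1.0, 1.0, 0.1], "ACGA", [1.0] * 4)
print(full, low_end)
```

Under this weighting, a disagreement at a poorly sequenced end has less influence on the total score, which is the kind of end-quality effect the abstract is concerned with.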

Determination of Design Parameters for Automobile Parts Recycling

  • 목학수;문광섭;박홍석;성재현;최흥원
    • 한국정밀공학회지 / Vol. 20, No. 1 / pp.159-171 / 2003
  • In this paper, the same parts of domestic and foreign automobiles, specifically the door trim and bumper, are disassembled to evaluate disassemblability. The factors influencing disassembly are determined by classifying the bottleneck processes in the disassembly process. On the basis of the disassembly sequence and the structure of parts and subassemblies, disassemblability is classified into five categories, and the influencing factors related to the five categories are determined. From these relations, a checklist for disassembly evaluation is drawn up and score tables for the checked factors are established. To establish the disassembly score tables, the weighting values of the five categories are calculated from disassembly tests of automobiles, and the weighting values of the influencing factors within each category are then calculated by the Analytic Hierarchy Process (AHP). Finally, the weighting values are modified and recalculated from the disassembly test. Using these weighting values, the scores of the influencing factors are determined, and the score tables are established based on those scores.
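The AHP weighting step mentioned above can be illustrated with a small sketch. It derives priority weights from a pairwise comparison matrix using the row geometric-mean approximation, a common substitute for the principal-eigenvector calculation. The matrix values and the function name `ahp_weights` are hypothetical and are not taken from the paper.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights via row geometric means.

    pairwise[i][j] states how much more important factor i is than factor j
    (so pairwise[j][i] should be its reciprocal). Returns weights summing to 1.
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Hypothetical comparison of three influencing factors.
comparison = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(comparison)
print(weights)
```

For a perfectly consistent matrix such as this one, the geometric-mean weights coincide with the eigenvector solution; for inconsistent judgments the two can differ slightly.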

A Reranking Model for Korean Morphological Analysis Based on Sequence-to-Sequence Model

  • 최용석;이공주
    • 정보처리학회논문지:소프트웨어 및 데이터공학 / Vol. 7, No. 4 / pp.121-128 / 2018
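  • The sequence-to-sequence (Seq2seq) model can be applied even when the input and output sequences differ in length, and it is widely used in Korean morphological analysis. Seq2seq-based Korean morphological analysis generally processes the source text syllable by syllable and outputs morphemes and part-of-speech tags syllable by syllable. Syllable-level analysis handles out-of-vocabulary words easily, but it cannot reflect morpheme-level dictionary information. In this study, we propose adding a reranking model as a post-processing step to the Seq2seq model to improve the final performance of morphological analysis. Beam search is applied to the Seq2seq model to generate K candidate analyses, and a reranking model then reorders these candidates. The reranking model uses morpheme-level embedding information and n-gram context information that syllable-level processing could not reflect. The proposed reranking model improved the F1 score by about 1.17% over the baseline Seq2seq model.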
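The generate-then-rerank pipeline described above can be sketched as follows. This is an illustration under stated assumptions: the linear interpolation with weight `alpha`, the function name `rerank`, and the toy scorer standing in for the morpheme-level reranking model are all hypothetical, since the abstract does not specify how the two scores are combined.

```python
def rerank(candidates, rerank_score, alpha=0.5):
    """Reorder K beam-search candidates by a combined score.

    candidates   : list of (analysis, seq2seq_log_prob) pairs
    rerank_score : callable returning a score for an analysis
                   (placeholder for a morpheme-level reranking model)
    alpha        : assumed interpolation weight for the Seq2seq score
    """
    def combined(candidate):
        analysis, log_prob = candidate
        return alpha * log_prob + (1.0 - alpha) * rerank_score(analysis)

    return sorted(candidates, key=combined, reverse=True)


# Toy example: the Seq2seq model prefers "analysis_A", the reranker prefers "analysis_B".
beam = [("analysis_A", -0.1), ("analysis_B", -0.3)]
toy_scores = {"analysis_A": -2.0, "analysis_B": -0.5}
best = rerank(beam, toy_scores.get)[0][0]
print(best)
```

Here the reranker's score outweighs the small Seq2seq preference, so the second beam candidate is promoted to first place.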

Genetic assessment of BoLA-DRB3 polymorphisms by comparing Bangladesh, Ethiopian, and Korean cattle

  • Mandefro, Ayele;Sisay, Tesfaye;Edea, Zewdu;Uzzaman, Md. Rasel;Kim, Kwan-Suk;Dadi, Hailu
    • Journal of Animal Science and Technology / Vol. 63, No. 2 / pp.248-261 / 2021
  • Owing to their major function in pathogen recognition, bovine leukocyte antigens (BoLA) are well established as disease markers for immunological traits in cattle. However, few reports exist on polymorphism of the BoLA gene in zebu cattle breeds using high-resolution typing methods. We therefore used a polymerase chain reaction sequence-based typing (PCR-SBT) method to sequence exon 2 of the BoLA class II DRB3 gene from 100 animals: Ethiopian cattle (Boran, n = 13; Sheko, n = 20; Fogera, n = 16; Horro, n = 19), Hanwoo cattle (n = 18), and Bangladesh Red Chittagong zebu (n = 14). Of the 59 detected alleles, 43 were already deposited in the Immuno Polymorphism Database for major histocompatibility complex (IPD-MHC), while 16 were unique to this study. Assessment of genetic variability at the population and sequence levels, together with genetic distance among the breeds considered, showed that Zebu breeds had a gene diversity score greater than 0.752, a nucleotide diversity score greater than 0.152, and a mean number of pairwise differences higher than 14, values comparable to those reported for other cattle breeds. In the neutrality tests, all breeds except Hanwoo had an excess number of alleles, as could be expected from a recent population expansion or genetic hitchhiking. However, the observed heterozygosity was not significantly (p < 0.05) higher than the expected heterozygosity. The Hardy-Weinberg equilibrium (HWE) analysis revealed a non-significant excess of heterozygote animals, indicative of plausible over-dominant selection. The pairwise FST values suggested low genetic variation among all the breeds (FST = 0.056; p < 0.05), besides the rooting from the evolutionary or domestication history of the cattle. No detached clade was observed in the evolutionary divergence study of the BoLA-DRB3 gene, inferred from the phylogenetic tree based on the maximum likelihood model. This investigation indicated clear differences in BoLA-DRB3 gene variability between African and Asian cattle breeds.

A Study on Satisfaction of Clinical Practice of Dental Technology Students - Focused on Daegu region -

  • 이화식;배봉진;박명호
    • 대한치과기공학회지 / Vol. 31, No. 4 / pp.45-52 / 2009
  • Recognizing the importance of clinical practice in the Department of Dental Technology, this study was conducted to improve on-site practice and to provide basic material for clinical practice. Students studying dental technology at D college in Daegu were surveyed by questionnaire. The results are as follows. 1. The average satisfaction score for the operating method of clinical practice was 3.26. By item, 'credit assignment (10 credits)' scored highest at 3.65, followed by 'execution period (vacation)' at 3.50 and 'choice of the clinical practice organization' at 3.25; 'measures after practice' and 'pre-education' scored lowest, both at 2.98. 2. For the clinical practice itself, 'experience of new equipment and technology' scored highest at 3.64, followed by 'choice of lecturer' at 3.61, 'guidance method' at 3.49, 'appropriateness of contents' at 3.44, 'environment of the practice organization' at 3.36, and 'evaluation method' at 3.35; 'practical use of the evaluation material' scored lowest at 3.18. 3. The average satisfaction score of participating students after clinical practice was 3.46, higher than both 'satisfaction with the operating method' (3.26), which concerns the college's clinical practice, and 'satisfaction with the organization' (3.44), which concerns the environment of dental laboratories, the lecturer's guidance, and evaluation. 4. Regarding the preferred 'practice organization : college' ratio for clinical practice evaluation, '7:3' received the most responses at 35.77%, followed by '6:4' at 25.20%, '8:2' at 22.76%, and '4:6' at 16.26%. 5. Regarding how much the on-site evaluation should be reflected in the clinical practice grade, 50% reflection was the most common response at 32.93%, followed by 60% reflection at 26.83%, 20% reflection at 20.73%, and 80% reflection at 6.10%. For the attendance score, reflecting 50% and 40% received 26.98%, reflecting 30% received 25.40%, reflecting 10% received 20.63%, and no reflection received 0%. These results indicate that clinical practice deserves further study, given its importance in dental technology education, and that problems in both college and on-site practice need improvement.

SCORE SEQUENCES IN ORIENTED GRAPHS

  • Pirzada, S.;Naikoo, T.A.;Shah, N.A.
    • Journal of applied mathematics & informatics / Vol. 23, No. 1_2 / pp.257-268 / 2007
  • An oriented graph is a digraph with no symmetric pairs of directed arcs and without loops. The score of a vertex $v_i$ in an oriented graph $D$ is $a_{v_i}$ (or simply $a_i$) $= n-1+d_{v_i}^{+}-d_{v_i}^{-}$, where $d_{v_i}^{+}$ and $d_{v_i}^{-}$ are the outdegree and indegree, respectively, of $v_i$ and $n$ is the number of vertices in $D$. In this paper, we give a new proof of Avery's theorem and obtain some stronger inequalities for scores in oriented graphs. We also characterize strongly transitive oriented graphs.
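The score definition above translates directly into a short computation. The sketch below assumes vertices are labeled 0 to n-1 and arcs are given as (tail, head) pairs; the function name `oriented_scores` is illustrative. It also rejects loops and symmetric arc pairs so that the input is a valid oriented graph.

```python
def oriented_scores(n, arcs):
    """Return the scores a_i = n - 1 + outdeg(i) - indeg(i) for vertices 0..n-1."""
    seen = set()
    out_deg = [0] * n
    in_deg = [0] * n
    for u, v in arcs:
        if u == v:
            raise ValueError("loops are not allowed in an oriented graph")
        if (u, v) in seen or (v, u) in seen:
            raise ValueError("repeated or symmetric arcs are not allowed")
        seen.add((u, v))
        out_deg[u] += 1
        in_deg[v] += 1
    return [n - 1 + out_deg[i] - in_deg[i] for i in range(n)]


# Transitive tournament on three vertices: 0 -> 1, 0 -> 2, 1 -> 2.
scores = oriented_scores(3, [(0, 1), (0, 2), (1, 2)])
print(scores)
```

Because every arc adds one to an outdegree and one to an indegree, the scores of any oriented graph on n vertices sum to n(n-1), which gives a simple sanity check on the output.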

Global Sequence Homology Detection Using Word Conservation Probability

  • Yang, Jae-Seong;Kim, Dae-Kyum;Kim, Jin-Ho;Kim, Sang-Uk
    • Interdisciplinary Bio Central / Vol. 3, No. 4 / pp.14.1-14.9 / 2011
  • Protein homology detection is an important issue in comparative genomics. Because of the exponential growth of sequence databases, fast and efficient homology detection tools are urgently needed. Currently, for homology detection, sequence comparison methods using local alignment such as BLAST are generally used as they give a reasonable measure for sequence similarity. However, these methods have drawbacks in offering overall sequence similarity, especially in dealing with eukaryotic genomes that often contain many insertions and duplications on sequences. Also these methods do not provide the explicit models for speciation, thus it is difficult to interpret their similarity measure into homology detection. Here, we present a novel method based on Word Conservation Score (WCS) to address the current limitations of homology detection. Instead of counting each amino acid, we adopted the concept of 'Word' to compare sequences. WCS measures overall sequence similarity by comparing word contents, which is much faster than BLAST comparisons. Furthermore, evolutionary distance between homologous sequences could be measured by WCS. Therefore, we expect that sequence comparison with WCS is useful for the multiple-species-comparisons of large genomes. In the performance comparisons on protein structural classifications, our method showed a considerable improvement over BLAST. Our method found bigger micro-syntenic blocks which consist of orthologs with conserved gene order. By testing on various datasets, we showed that WCS gives faster and better overall similarity measure compared to BLAST.
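The general idea of comparing sequences by their word contents rather than residue by residue can be illustrated with a simple sketch. This is not the Word Conservation Score itself, whose formula is not given in the abstract: the sketch substitutes a plain Jaccard overlap of k-letter words, and the word length `k=3` and the function names are assumptions.

```python
def words(seq, k=3):
    """Return the set of overlapping k-letter words in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def word_overlap(a, b, k=3):
    """Jaccard overlap of the word sets of two sequences (0.0 to 1.0).

    Stand-in similarity measure; NOT the paper's WCS definition.
    """
    wa, wb = words(a, k), words(b, k)
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


# Two short peptide-like strings that differ at a single position.
similarity = word_overlap("MKTAYIAK", "MKTAYLAK")
print(similarity)
```

Set-based word comparison needs no alignment step, which is one reason word-content methods can run faster than alignment-based comparison on large sequence sets.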

Reinterpretation of the protein identification process for proteomics data

  • Kwon, Kyung-Hoon;Lee, Sang-Kwang;Cho, Kun;Park, Gun-Wook;Kang, Byeong-Soo;Park, Young-Mok
    • Interdisciplinary Bio Central / Vol. 1, No. 3 / pp.9.1-9.6 / 2009
  • Introduction: In mass spectrometry-based proteomics, biological samples are analyzed to identify proteins by mass spectrometry and database search. Database search is the process of selecting the best matches to the experimental mass spectra from an amino acid sequence database, and the protein is identified as the matched sequence. A match score is defined to find matches in the database, and the highest-scoring hit is declared the most probable protein. The search result varies according to the score definition. In this study, the differences among the search results of different search engines and different databases were investigated in order to suggest a better way to identify more proteins with higher reliability. Materials and Methods: The protein extract of human mesenchymal stem cells was separated into several bands by one-dimensional electrophoresis. The bands of the one-dimensional gel were excised one by one, digested with trypsin, and analyzed by a mass spectrometer, FT LTQ. The tandem mass (MS/MS) spectra of peptide ions were searched with the X!Tandem, Mascot, and Sequest search engines against the IPI human database and the SwissProt database. The search results were filtered at several threshold probability values using the Trans-Proteomic Pipeline (TPP) of the Institute for Systems Biology, and the output generated by TPP was analyzed. Results and Discussion: For each MS/MS spectrum, the peptide sequences identified under different conditions, such as search engine, threshold probability, and sequence database, were compared. The main difference in peptide identification at high threshold probability was caused not by the difference in sequence database but by the difference in score. As the threshold probability decreases, the missed peptides appeared. Conversely, at an extremely high threshold level, we missed many true assignments. Conclusion and Prospects: The different identification results of the search engines were mainly caused by their different scoring algorithms. Usually in proteomics high-scored peptides are selected and low-scored peptides are discarded. Many of them are true negatives. By integrating the search results from different parameters and different search engines, the protein identification process can be improved.