• Title/Abstract/Keywords: word length

문장 길이와 단어 정렬에 기반한 한-영 문장 정렬 (Korean-English Sentence Alignment Based on Sentence Length and Word Alignment)

  • 임재수;서희철;이상주;임해창
    • 한국정보과학회 언어공학연구회:학술대회논문집(한글 및 한국어 정보처리) / 한국정보과학회언어공학연구회 2001년도 제13회 한글 및 한국어 정보처리 학술대회 / pp.302-309 / 2001
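  • With corpus-based statistical natural language processing now being actively studied in multilingual processing as well, this paper proposes an effective method for sentence alignment, the basis for building and using parallel corpora. First, we apply an existing sentence-length-based method to Korean-English sentence alignment and point out the limitations of using length information alone. We then present a method that compensates for the problems of length-based alignment through word alignment using a dictionary and part-of-speech correspondence probabilities. Experiments show that the proposed method outperforms the length-based method. We also examine which factors can cause problems when lexical information is used for Korean-English sentence alignment.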

연결 숫자음 인식 시스템의 구현과 성능 변화 (A Study on the Implementation of Connected-Digit Recognition System and Changes of its Performance)

  • 윤영선;박윤상;채의근
    • 대한음성학회지:말소리 / No. 45 / pp.47-61 / 2003
  • In this paper, we consider the implementation of a connected-digit recognition system and several approaches to improving its performance. To implement a fixed- or variable-length digit recognition system efficiently, a finite state network (FSN) is required. For fast search, we merge the word network algorithm that implements the FSN with the one-pass dynamic programming search algorithm used in general speech recognition systems. To find an efficient modeling of the digit recognition system, we perform experiments under various conditions that affect performance and summarize the results.

한국어의 중간구 오름조 현상에 대하여 (On the Rising Tone of Intermediate Phrase in Standard Korean)

  • 곽동기
    • 대한음성학회지:말소리 / No. 40 / pp.13-27 / 2000
  • It is generally accepted that a rising tone appears at the end of the intermediate phrase in standard Korean. There have been discussions about whether the syllable with the rising tone, even if it is a particle or an ending, might be accented or not. The accented syllable is the most prominent one in a given phonological string. It is determined by the nondistinctive stress, which is located on the first or second syllable of a lexical word according to vowel length and syllable weight. Pitch therefore has no close relationship with accent. The intermediate phrase-final rising tone, therefore, is not associated with accent but is used to convey other pragmatic meanings, namely that i) the speech style is more friendly, ii) the speaker tries to convey the information so that the hearer can hear it more clearly, and iii) the speaker wants the hearer to keep listening because the speaker's utterance is not complete.

CNC 공작기계의 실시간 3차원 NURBS 보간기 개발 (Development of the Real-Time 3D NURBS Interpolator for CNC Machines)

  • 홍원표;양민양
    • 한국정밀공학회:학술대회논문집 / 한국정밀공학회 2000년도 춘계학술대회 논문집 / pp.1032-1035 / 2000
  • Increasing demands on precision machining with computerized numerical control (CNC) machines have necessitated that the tool move not only with as small a position error as possible but also with smoothly varying feedrates in space. This paper presents a new high-precision interpolation algorithm for 3-dimensional (3D) Non-Uniform Rational B-Spline (NURBS) curves in the reference-pulse CNC technique. Based on the minimum path error strategy, a real-time NURBS interpolator was developed in software and implemented on a PC-NC milling machine. Several experimental results have shown that the proposed NURBS interpolator is useful for the high-precision machining of complex shapes. It is expected that this algorithm can be applied to CNC machines for the machining of 3D free-form surfaces.

신경망을 이용한 우리말 음성의 인식에 관한 연구 - 복합 신경망을 이용한 초성자음 인식에 관한 연구 (A Study on the Word Recognition of Korean Speech using Neural Network- A study on the initial consonant Recognition using composite Neural Network)

  • 김석동;이행세
    • 한국음향학회지 / Vol. 11, No. 3 / pp.14-24 / 1992
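  • This paper is a study on consonant recognition using neural networks. First, the consonant portion is separated from speech that contains both consonants and vowels. The consonants are divided into several groups, and the zero-crossing rate is examined in relation to the consonant interval. Finally, to recognize consonants, we propose a composite neural network made up of a control network and several small-scale networks. The control network decides which group an input consonant belongs to, and the small-scale networks recognize the consonants within each group.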
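
The zero-crossing rate mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the frame length, and the hop size are assumptions chosen only for illustration, and the rate is taken here as the fraction of adjacent sample pairs whose signs differ.

```python
def zero_crossing_rate(samples):
    """Return the fraction of adjacent sample pairs whose signs differ."""
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1
        for prev, cur in zip(samples, samples[1:])
        if (prev >= 0) != (cur >= 0)  # sign change between neighbours
    )
    return crossings / (len(samples) - 1)


def frame_zcr(samples, frame_len, hop):
    """Compute the zero-crossing rate of each full frame (hypothetical helper)."""
    return [
        zero_crossing_rate(samples[start:start + frame_len])
        for start in range(0, len(samples) - frame_len + 1, hop)
    ]
```

A frame that alternates in sign at every sample yields a rate of 1.0, while a frame that never changes sign yields 0.0, which is why the measure is often used to tell noisy consonant segments from voiced ones.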

문서 이미지에서 문자 추출과 3차원 면적-가중치 그래프를 이용한 단어 그룹핑 (Text Extraction and Word Grouping using 3D Area-Weighted Graph in Document)

  • 옥세영;박환철;조환규
    • 한국정보과학회:학술대회논문집 / 한국정보과학회 1998년도 가을 학술발표논문집 Vol.25 No.2 (2) / pp.556-558 / 1998
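  • Image analysis, database indexing, and the conversion of paper documents into electronic documents have been major concerns in computer vision applications. To handle these problems, the essential first step is to automatically separate text from images in documents where the two are mixed. This paper proposes an algorithm that extracts only the text from documents, such as newspapers and advertisements, in which images, intaglio (engraved) characters, and relief (embossed) characters are mixed. The algorithm uses run-length code to extract features of the boundary shapes of characters and images, and uses them to distinguish intaglio characters, images, and relief characters. The extracted characters are then mapped into 3D space, and a 3D grouping algorithm is presented that groups them into related words using a 3D area-weighted graph. The experimental results show the extracted characters and the grouping output.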
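
As a rough illustration of the run-length code the abstract above relies on, the sketch below encodes one row of pixels as (value, run length) pairs. It shows only the generic encoding step; the boundary-feature extraction and the 3D grouping described in the paper are not reproduced, and the function name is an assumption.

```python
def run_lengths(row):
    """Encode a sequence of pixel values as (value, run_length) pairs."""
    runs = []
    for pixel in row:
        if runs and runs[-1][0] == pixel:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([pixel, 1])   # start a new run
    return [tuple(run) for run in runs]
```

For a binarized scan line, long runs tend to correspond to background or image regions and short alternating runs to character strokes, which is the kind of cue a boundary-shape feature could build on.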

MSVQ를 이용한 HMM에 의한 단독어 인식 (Isolated Word Recognition By HMM using Multisection MSVQ)

  • 안태옥;변용규;김순협
    • 대한전자공학회논문지 / Vol. 27, No. 9 / pp.1468-1475 / 1990
  • In this paper, isolated words are recognized using multisection VQ and HMM. The recognition vocabulary consists of 20 area names, each uttered 5 times by 3 speakers. To generate the codebook, we divide each word in the recognition vocabulary into sections of equal length, build a standard VQ codebook for each section, calculate the observations section by section, and then recognize isolated words through HMM training. Because the multisection VQ codebook carries time information and the observations are calculated for each section, the computation is lower and the recognition rate is higher than with a whole-word codebook. As a result, it is shown that the recognition rate is higher when the HMM uses the multisection VQ codebook.

찰벼품종을 달리하여 제조한 유과의 품질 특성 비교 (Comparison of Some Characteristics Relevent to Yukwa (Fried Rice Cookie) made from Different Waxy Rice Cultivars)

  • 최영희;강미영
    • 동아시아식생활학회지 / Vol. 10, No. 1 / pp.71-76 / 2000
  • This study was carried out to investigate the degree of expansion and the textural and sensory characteristics of Yukwa made from various cultivars of waxy rice. Yukwa was prepared from 5 varieties of waxy rice and one nonwaxy rice by a standardized method with established optimum preparation conditions. Yukwa made from Shinsunchalbyeo and Whasunchal showed a lower degree of expansion than Yukwa made from Hangangchalbyeo and IR 29, but showed higher crispness and a softer texture among the tested waxy rice cultivars. In sensory evaluation, these cultivars scored high in flavor, crispness, and preference. Whasunchalbyeo and Shinsunchalbyeo were appropriate varieties for Yukwa preparation, and both were short-grain by length/width ratio. Whasunchalbyeo had the highest water uptake and reducing sugar content.

4-수준 계량인자가 포함된 반사계획에 관한 연구 (A Study on Developing Fold-Over Designs with Four-Level Quantitative Factors)

  • 최규필;변재현
    • 대한산업공학회지 / Vol. 28, No. 3 / pp.283-290 / 2002
  • Two-level fractional factorial designs are widely used when many factors are considered. When two-level fractional factorial designs are used, some effects are confounded with each other. To break the confounding between effects, we can use fractional factorial designs, called fold-over designs, in which certain signs in the design generators are switched. In this paper, optimal fold-over designs with four-level quantitative and two-level factors are presented for (1) the initial designs without curvature effect and (2) those with curvature effect. Optimal fold-over design tables are provided for 8-run, 16-run, and 32-run experiments.
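
The idea of breaking confounding by switching signs can be sketched on the smallest textbook case. The example below is an illustration only: it uses a two-level 2^(3-1) design with the assumed generator C = A*B and a full fold-over, and does not reproduce the four-level designs or the optimal tables of the paper. All names are hypothetical.

```python
from itertools import product


def half_fraction():
    """Build a 2^(3-1) design whose third column follows the generator C = A*B."""
    return [(a, b, a * b) for a, b in product((-1, 1), repeat=2)]


def fold_over(design, columns):
    """Return a copy of the design with the signs of the given columns switched."""
    return [
        tuple(-level if index in columns else level
              for index, level in enumerate(run))
        for run in design
    ]


initial = half_fraction()                       # 4 runs, C confounded with AB
folded = fold_over(initial, columns={0, 1, 2})  # full fold-over: C = -A*B
combined = initial + folded                     # 8 runs in total
```

In this small case the combined runs form the full 2^3 factorial, so the main effect C is no longer confounded with the AB interaction.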

한국어 다음절 단어의 초성, 중성, 종성단위의 음절간 조건부 확률 (Conditional Probability of a 'Choseong', a 'Jungseong', and a 'Jongseong' Between Syllables in Multi-Syllable Korean Words)

  • 이재홍;이재학
    • 전자공학회논문지B / Vol. 28B, No. 9 / pp.692-703 / 1991
  • A Korean word is composed of syllables. A Korean syllable is regarded as a random variable according to its probabilistic property in occurrence. A Korean syllable is divided into 'choseong', 'jungseong', and 'jongseong', which are regarded as random variables. We can consider the conditional probability of a syllable as an index that represents the occurrence correlation between syllables in Korean words. Since the number of syllables is enormous, we use the conditional probability of a 'choseong', a 'jungseong', and a 'jongseong' between syllables as an index that represents the occurrence correlation between syllables in Korean words. The length distribution of Korean words is computed according to frequency and to kind. From the cumulative frequency of Korean syllables computed from multi-syllable Korean words, all probabilities and conditional probabilities are computed for the three random variables. The conditional probabilities of 'choseong'-'choseong', 'jungseong'-'jungseong', 'jongseong'-'jongseong', and 'jongseong'-'choseong' between adjacent syllables in multi-syllable Korean words are computed.
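
The conditional probabilities described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes precomposed Hangul syllables in the Unicode range U+AC00 to U+D7A3, identifies each 'choseong', 'jungseong', and 'jongseong' by its Unicode index (index 0 of 'jongseong' means no final consonant), and estimates probabilities by simple relative frequency. The function names and the tiny word list in the test are hypothetical.

```python
from collections import Counter

HANGUL_BASE = 0xAC00                 # code point of the first precomposed syllable
NUM_CHO, NUM_JUNG, NUM_JONG = 19, 21, 28


def decompose(syllable):
    """Split one precomposed Hangul syllable into (choseong, jungseong, jongseong) indices."""
    index = ord(syllable) - HANGUL_BASE
    if not 0 <= index < NUM_CHO * NUM_JUNG * NUM_JONG:
        raise ValueError(f"not a precomposed Hangul syllable: {syllable!r}")
    cho = index // (NUM_JUNG * NUM_JONG)
    jung = (index % (NUM_JUNG * NUM_JONG)) // NUM_JONG
    jong = index % NUM_JONG
    return cho, jung, jong


def conditional_prob(words, first_part, second_part):
    """Estimate P(second_part of next syllable | first_part of current syllable).

    Parts are selected by position: 0 = choseong, 1 = jungseong, 2 = jongseong.
    """
    pair_counts = Counter()
    first_counts = Counter()
    for word in words:
        parts = [decompose(ch) for ch in word]
        for current, following in zip(parts, parts[1:]):
            given = current[first_part]
            outcome = following[second_part]
            pair_counts[(given, outcome)] += 1
            first_counts[given] += 1
    return {
        pair: count / first_counts[pair[0]]
        for pair, count in pair_counts.items()
    }
```

Calling `conditional_prob(words, 2, 0)` gives the 'jongseong'-'choseong' case from the abstract, and `conditional_prob(words, 0, 0)` the 'choseong'-'choseong' case.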
