• Title/Abstract/Keywords: Lexical model

98 search results, processing time 0.02 s

어절 내 형태소 출현 정보와 클러스터링 기법을 이용한 어휘지식 자동 획득 (The automatic Lexical Knowledge acquisition using morpheme information and Clustering techniques)

  • 유원희;서태원;임희석
    • 컴퓨터교육학회논문지
    • /
    • Vol. 13, No. 1
    • /
    • pp.65-73
    • /
    • 2010
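  • This paper proposes an unsupervised learning model for automatically acquiring lexical knowledge, in order to overcome the limitations of manually building lexical knowledge for supervised learning in natural language processing research. The proposed model automatically acquires lexical knowledge from an input word list through vectorization, clustering, and lexical knowledge acquisition steps. We show how the number of acquired lexical knowledge items changes as the model's parameters vary, and present part of the resulting lexical knowledge dictionary, which exhibits the characteristics of the acquired knowledge. In the experiments, the number of clusters of lexical category knowledge, one of the kinds of lexical knowledge acquired, was observed to converge to a fixed value, which confirms the feasibility of automatically building electronic dictionaries that require lexical knowledge. We also built a lexical dictionary that reflects the characteristics of Korean by including left and right syntactic information.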

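The vectorization and clustering steps named in the abstract above can be illustrated with a minimal sketch. This is not the authors' model: the stem/ending toy data, the cosine-similarity measure, the greedy single-pass clustering, and the `threshold` parameter are all assumptions chosen for illustration. It only shows how the number of clusters can change as a parameter varies.

```python
from collections import Counter
from math import sqrt

def vectorize(stem_to_endings):
    """Map each stem to a sparse count vector over the endings seen with it."""
    return {stem: Counter(endings) for stem, endings in stem_to_endings.items()}

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def cluster(vectors, threshold):
    """Greedy single-pass clustering (an assumed, simplified technique):
    an item joins the first cluster whose seed vector is similar enough,
    otherwise it starts a new cluster."""
    clusters = []  # list of (seed_vector, members)
    for item, vec in vectors.items():
        for seed, members in clusters:
            if cosine(seed, vec) >= threshold:
                members.append(item)
                break
        else:
            clusters.append((vec, [item]))
    return [members for _, members in clusters]

# Hypothetical toy input: stems and the endings observed after them.
observed = {
    "먹": ["다", "고", "어서"],
    "가": ["다", "고", "서"],
    "학교": ["에", "를", "가"],
    "집": ["에", "을"],
}
vectors = vectorize(observed)
loose = cluster(vectors, threshold=0.3)   # verb-like and noun-like stems group together
strict = cluster(vectors, threshold=0.9)  # every stem stays in its own cluster
```

With the looser threshold the verb-like stems and the noun-like stems fall into two groups, while the stricter threshold keeps every stem separate.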

한중일영 다국어 어휘 데이터베이스의 모형 (A Model of a Korean-Chinese-Japanese-English Multilingual Lexical Database)

  • 차재은;강범모
    • 한국언어정보학회:학술대회논문집
    • /
    • 한국언어정보학회 2002년도 학술대회 발표논문집
    • /
    • pp.48-67
    • /
    • 2002
  • This paper is a report on part of the results of a research project entitled "Research and Model Development for a Multi-Lingual Lexical Database". It is a six-year project in which we aim to construct a model of a multilingual lexical database of Korean, Chinese, Japanese, and English. We have now finished the first two-year stage of the project. In this paper, we present the goal of the project, the construction model of items in the lexical database, and the possible (semi-)automatic methods of acquisition of lexical information. As an appendix, we present some sample items of the database as an illustration.


한국어 어휘습득의 계산주의적 모델 (A Computational Model for Lexical Acquisition in Korean)

  • 유원희;박기남;류기곤;임희석;남기춘
    • 대한음성학회:학술대회논문집
    • /
    • 대한음성학회 2007년도 한국음성과학회 공동학술대회 발표논문집
    • /
    • pp.135-137
    • /
    • 2007
  • This study implemented and tested a computational lexical processing model that hybridizes the full-list model and the decomposition model, applying it to lexical acquisition in Korean, one of the early stages of human lexical processing. As a result, we could simulate the lexical acquisition process for linguistic input through experiments and training, and suggest a theoretical foundation for the order in which certain grammatical categories are acquired. The model also provided evidence from which the form of the human mental lexicon can be inferred, through the full-list dictionary and decomposition dictionary that were automatically produced in the study.


트리 구조 어휘 사전을 이용한 연결 숫자음 인식 시스템의 구현 (Implementation of Connected-Digit Recognition System Using Tree Structured Lexicon Model)

  • 윤영선;채의근
    • 대한음성학회지:말소리
    • /
    • No. 50
    • /
    • pp.123-137
    • /
    • 2004
  • In this paper, we consider the implementation of a connected-digit recognition system using a tree-structured lexicon model. To implement a fixed- or variable-length digit recognition system efficiently, a finite state network (FSN) is required. We merge the word network algorithm that implements the FSN with the lexical tree search algorithm that is used in general speech recognition systems for fast search and large vocabularies. To find an efficient model for the digit recognition system, we investigate the performance changes when the lexical tree search is applied.

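As a rough illustration of what a tree-structured lexicon is, the sketch below stores pronunciations in a prefix tree so that words with a common initial phoneme sequence share nodes. The romanized digit pronunciations and the function names are hypothetical, and the sketch covers only the data structure, not the paper's FSN or search algorithm.

```python
def build_lexical_tree(lexicon):
    """Build a prefix tree from a {word: [phoneme, ...]} lexicon."""
    root = {"children": {}, "word": None}
    for word, phones in lexicon.items():
        node = root
        for p in phones:
            # Reuse an existing child when the prefix is already in the tree.
            node = node["children"].setdefault(p, {"children": {}, "word": None})
        node["word"] = word  # mark the node where this word ends
    return root

def count_nodes(node):
    """Count the nodes below the given node (the root itself is excluded)."""
    return sum(1 + count_nodes(child) for child in node["children"].values())

def lookup(root, phones):
    """Return the word that ends exactly at this phoneme sequence, or None."""
    node = root
    for p in phones:
        node = node["children"].get(p)
        if node is None:
            return None
    return node["word"]

# Hypothetical, simplified pronunciations of four Korean digits.
lexicon = {
    "il": ["i", "l"],        # 1
    "i": ["i"],              # 2
    "sam": ["s", "a", "m"],  # 3
    "sa": ["s", "a"],        # 4
}
tree = build_lexical_tree(lexicon)
flat_size = sum(len(p) for p in lexicon.values())  # phonemes in a flat lexicon
tree_size = count_nodes(tree)                      # nodes after prefix sharing
```

Here the flat lexicon holds 8 phonemes while the tree needs only 5 nodes, which is the kind of sharing that makes tree search attractive for larger vocabularies.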

어휘판단 과제 시 보이는 언어현상의 계산주의적 모델 설계 및 구현 (Design and Implementation of Computational Model Simulating Language Phenomena in Lexical Decision Task)

  • 박기남;임희석;남기춘
    • 컴퓨터교육학회논문지
    • /
    • Vol. 9, No. 2
    • /
    • pp.89-99
    • /
    • 2006
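  • This paper proposes a computational model that can simulate the language phenomena observed in the lexical decision task (LDT), a method widely used in cognitive neuroscience research. The proposed model is designed to simulate language-independent phenomena observed in the LDT: the frequency effect, lexicality effect, word similarity effect, visual degradation effect, semantic priming effect, and repetition priming effect. In the experiments, the proposed model simulated the frequency, lexicality, word similarity, visual degradation, and semantic priming effects with statistical significance, and showed performance patterns similar to those of human subjects in the LDT.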


악성 URL 탐지를 위한 URL Lexical Feature 기반의 DL-ML Fusion Hybrid 모델 (DL-ML Fusion Hybrid Model for Malicious Web Site URL Detection Based on URL Lexical Features)

  • 김대엽
    • 정보보호학회논문지
    • /
    • Vol. 33, No. 6
    • /
    • pp.881-891
    • /
    • 2023
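  • Recently, various studies have used artificial intelligence to detect malicious URLs, and most report high detection performance. However, classical machine learning incurs the additional cost of analyzing and selecting features, and detection performance depends on the skill of the data analyst. To address these issues, this paper proposes a DL-ML Fusion Hybrid model, in which part of a deep learning model that automatically extracts URL lexical features is combined with a classical machine learning model. Training the proposed model on a total of 60,000 malicious and benign URLs collected by the authors improved detection performance by up to 23.98 percentage points, and automated feature engineering enabled efficient machine learning.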
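
For context, the sketch below shows the kind of hand-crafted URL lexical features that a classical machine learning pipeline typically needs, which is the manual step the abstract above aims to automate. It is not the paper's model: the feature set, the function name `url_lexical_features`, and the example URL are assumptions for illustration only.

```python
from urllib.parse import urlsplit

def url_lexical_features(url):
    """Compute a few simple lexical features from the URL string alone
    (an assumed, illustrative feature set; no network access is needed)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_digits": sum(ch.isdigit() for ch in url),
        "num_dots_in_host": host.count("."),
        "num_hyphens": url.count("-"),
        # Number of non-empty path segments, e.g. /a/b/index.php has 3.
        "path_depth": len([seg for seg in parts.path.split("/") if seg]),
        "has_at_sign": "@" in url,
    }

# Hypothetical example URL on a reserved test domain.
example = "http://login-example.test/a/b/index.php?id=42"
feats = url_lexical_features(example)
```

A feature dictionary like `feats` would then be turned into a numeric vector for a classical classifier; choosing which features to include is exactly the analyst-dependent step the abstract describes.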

Lexical and Semantic Incongruities between the Lexicons of English and Korean

  • Lee, Yae-Sheik
    • 한국언어정보학회지:언어와정보
    • /
    • Vol. 5, No. 2
    • /
    • pp.21-37
    • /
    • 2001
  • Pustejovsky (1995) rekindled debate on the dual problems of how to represent lexical meaning and on the information that is to be encoded in a lexicon. For natural language processing such as machine translation, these are important issues. When a lexical-conceptual mismatch occurs in translation of corresponding words from two different languages, the appropriate representation of their meanings is very important. This paper proposes a new formalism for representing lexical entries by first analysing observable mismatches in comparable pairs of nouns, verbs, and adjectives in English and Korean. Inherent mis-interpretations and mis-readings in each pair are identified. Then, concept theories such as those presented by Ganter and Wille (1996) and Priss (1998) are extended in order to reflect the cognitivist view that meaning resides in concept, and also to incorporate the propositions of the so-called ‘multiple inheritance’ system. An alternative to the formalism of Pustejovsky (1995) and Pollard & Sag (1994) is then proposed. Finally, representative examples of lexical mismatches are analysed using the new model.


한국어 용언 어절 재인에 미치는 어휘 변인의 영향 -모어 화자와 고급 학습자의 예- (The Influence of Lexical Factors on Verbal Eojeol Recognition: Evidence from L1 Korean Speakers and L2 Korean Learners)

  • 김영주;이선진;이은하;남기춘;전현애;이선영
    • 한국어교육
    • /
    • Vol. 29, No. 3
    • /
    • pp.25-53
    • /
    • 2018
  • This study examined the influence of lexical factors on verbal Eojeol recognition. To this end, forty-five L2 Korean learners and twenty-two Korean native speakers performed Eojeol decision tasks, which were analyzed against lexical factors such as 'number of strokes', 'number of consonants and vowels', 'number of syllables', 'number of morphemes', 'whole Eojeol frequency', 'root frequency', 'first-syllable-sharing frequency', and 'number of dictionary meanings.' As a result, 'whole Eojeol frequency' was the most effective factor for predicting Eojeol recognition reaction time for both native speakers and L2 learners, which supports the full-list model. The other lexical factors influencing Eojeol recognition reaction time in L2 learners differed according to their proficiency level.

부모-유아 어휘 상호작용 척도의 개발 및 타당화 (Development and Validation of Parent-child Lexical Interaction Scale for Preschoolers (PLIS-P))

  • 정수지;최나야
    • Human Ecology Research
    • /
    • Vol. 58, No. 3
    • /
    • pp.429-445
    • /
    • 2020
  • This study developed and validated a 'Parent-child Lexical Interaction Scale for Preschoolers (PLIS-P)'. First, we developed the preliminary scale with 7 factors after reviewing previous literature related to vocabulary and literacy instruction for young children and reflecting on feedback from child studies experts and mothers with young children. Subsequently, to validate the scale, an online survey was conducted on mothers with 5- to 6-year-old children who live in Seoul, Gyeonggi, Incheon, Gyeongsang, Chungcheong, Jeolla, Gangwon, and Jeju. Responses from 309 mothers were used to conduct exploratory and confirmatory factor analysis and correlation analysis. The results were as follows. First, the result of exploratory analysis showed that the model with 7 factors was satisfactory: (1) vocabulary exposure, (2) word elaboration, (3) scaffolding, (4) play activity, (5) conventional instruction, (6) word type awareness instruction, (7) word morphology instruction. Second, confirmatory factor analysis confirmed the good fit of the model. Third, the concurrent validity was confirmed by correlation analysis using EC-HOME. Last, the internal consistency reliability of each factor of PLIS-P was also confirmed. This study developed both a theoretical framework of parent-child lexical interaction and a Parent-child Lexical Interaction Scale for Preschoolers. This scale can be used by parents, practitioners, and researchers to acquire knowledge about interaction related to words between Korean parents and young children.

DHMM과 어휘해석을 이용한 Voice dialing 시스템 (The Voice Dialing System Using Dynamic Hidden Markov Models and Lexical Analysis)

  • 최성호;이강성;김순협
    • 전자공학회논문지B
    • /
    • Vol. 28B, No. 7
    • /
    • pp.548-556
    • /
    • 1991
  • In this paper, Korean spoken continuous digits are recognized using DHMM (Dynamic Hidden Markov Model) and lexical analysis to provide the basis for developing a voice dialing system. Recognition is performed after segmentation into phoneme units. The system can be divided into the segmentation section, the standard speech design section, the recognition section, and the lexical analysis section. In the segmentation section, speech is segmented using the ZCR, the 0th-order LPC cepstrum, and Ai, a voice speech detection parameter that changes over time. In the standard speech design section, 19 phonemes or syllables are trained by DHMM and designed as standard speech. In the recognition section, the phoneme stream is recognized by the Viterbi algorithm. In the lexical decoder section, the finally recognized continuous digits are output. The experiment showed a recognition rate of 85.1% using data from 10 male speakers, each uttering 7 times the 21 classes of 7 continuous digits, which combine all of the occurrences.
