• Title/Summary/Keyword: Word order

The Method of the Evaluation of Verbal Lexical-Semantic Network Using the Automatic Word Clustering System (단어클러스터링 시스템을 이용한 어휘의미망의 활용평가 방안)

  • Kim, Hae-Gyung;Song, Mi-Young
    • Korean Journal of Oriental Medicine
    • /
    • v.12 no.3 s.18
    • /
    • pp.1-15
    • /
    • 2006
  • In recent years, there has been much interest in lexical-semantic networks. However, it remains very difficult to evaluate their effectiveness and correctness and to devise methods for applying them to various problem domains. To offer fundamental ideas about how to evaluate and utilize lexical-semantic networks, we developed two automatic word clustering systems, called system A and system B. Both systems were trained on 68,455,856 words. We compared the clustering results of system A with those of system B, which is extended by the lexical-semantic network: its feature vectors are reconstructed using the elements of the lexical-semantic network of 3,656 '-ha' verbs. The target data is the 'multilingual Word Net-CoreNet'. System B achieved an accuracy of 46.6%, which is better than the 45.3% of system A.

A Method for Automatic Detection of Character Encoding of Multi Language Document File (다중 언어로 작성된 문서 파일에 적용된 문자 인코딩 자동 인식 기법)

  • Seo, Min Ji;Kim, Myung Ho
    • KIISE Transactions on Computing Practices
    • /
    • v.22 no.4
    • /
    • pp.170-177
    • /
    • 2016
  • Character encoding is a method of converting a document into a binary document file, using a code table, for storage in a computer. To read a binary document file, one must know the code table that was applied at the encoding stage in order to recover the original document. Identifying the code table used to encode the file is thus an essential part of decoding. In this paper, we propose a method for automatically detecting the character encoding of a given binary document file. The method combines several techniques to increase the detection rate: character code range detection, escape character detection, character code characteristic detection, and commonly used word detection. The commonly used word detection step uses multiple word databases, so the method achieves a much higher detection rate for multi-language files than other methods. When a language accounts for less than 20% of the document, the conventional method recognizes the encoding in about 50% of cases. The proposed method recognizes the encoding in up to 96% of cases, regardless of the proportion of each language.
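
The commonly used word detection idea in the abstract above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: each candidate encoding is tried, candidates that cannot decode the bytes are discarded, and the rest are scored by how many common words the decoded text contains. The candidate list, the tiny word list, and the function name `guess_encoding` are all assumptions made for the example.

```python
# Hypothetical candidate encodings and a tiny common-word list (both assumptions).
CANDIDATES = ["utf-8", "euc-kr", "shift_jis", "cp1252"]
COMMON_WORDS = {"the", "and", "of", "그리고", "있다", "これ", "です"}


def guess_encoding(data: bytes):
    """Return the candidate whose decoding contains the most common words.

    Returns None if no candidate can decode the bytes. Ties go to the
    candidate listed first in CANDIDATES.
    """
    best, best_score = None, -1
    for enc in CANDIDATES:
        try:
            text = data.decode(enc)
        except UnicodeDecodeError:
            continue  # the byte sequence is invalid under this code table
        # Score: total occurrences of the common words in the decoded text.
        score = sum(text.count(word) for word in COMMON_WORDS)
        if score > best_score:
            best, best_score = enc, score
    return best
```

For example, `guess_encoding("그리고 있다".encode("euc-kr"))` decodes cleanly under more than one candidate, but only the EUC-KR decoding contains words from the list. A realistic detector would need far larger per-language word databases plus the range and escape-sequence checks the abstract mentions.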

The Processing System of English for Korean: Focused on the Interaction with Native Language Processing (한국인의 영어처리의 기제: 모국어처리와의 상호작용을 중심으로)

  • Lee, Chang-H.;Kang, Bong-Kyeng
    • Korean Journal of Cognitive Science
    • /
    • v.15 no.2
    • /
    • pp.43-53
    • /
    • 2004
  • The purpose of this study was to investigate the role of phonology in lexical access during bilingual processing by Korean-English bilinguals. Four experiments were conducted to adjudicate between the nonselective lexical access hypothesis, which argues for simultaneous phonological activation of both of a bilingual's languages, and the selective lexical access hypothesis, which argues for phonological activation of only one language. The results showed that Korean target word processing was significantly affected by the phonological manipulation of the English prime word (Exp. 2). Similarly, English target word processing tended to be affected by the phonological manipulation of the Korean prime word (Exp. 2). These results indicate that the phonological information of the other language is automatically activated when a bilingual processes one language, and that English, which is the second language for most Koreans, is phonologically activated during processing.

A Study on the Deduction of Social Issues Applying Word Embedding: With an Emphasis on News Articles related to the Disabled (단어 임베딩(Word Embedding) 기법을 적용한 키워드 중심의 사회적 이슈 도출 연구: 장애인 관련 뉴스 기사를 중심으로)

  • Choi, Garam;Choi, Sung-Pil
    • Journal of the Korean Society for Information Management
    • /
    • v.35 no.1
    • /
    • pp.231-250
    • /
    • 2018
  • In this paper, we propose a new methodology for extracting and formalizing subjective topics at a specific time using a set of keywords extracted automatically from online news articles. To do this, we first extracted a set of keywords by applying the TF-IDF method, which was selected through a series of comparative experiments on various statistical weighting schemes that measure the importance of individual words in a large set of texts. To calculate the semantic relation between the extracted keywords effectively, a set of word embedding vectors was constructed from about 1,000,000 news articles collected separately. The extracted keywords were represented as numerical vectors and clustered with the K-means algorithm. A qualitative in-depth analysis of the resulting keyword clusters found that most clusters were appropriate topics with enough semantic concentration for labels to be assigned to them easily.
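
The keyword extraction step in the abstract above can be illustrated with a small TF-IDF sketch in Python. It assumes one common TF-IDF variant (term frequency normalized by document length, times the natural log of the inverse document frequency); the abstract does not say which variant the authors chose, and the function name `tfidf_keywords` and the toy documents are made up for the example. The embedding and K-means steps are not covered here.

```python
import math
from collections import Counter


def tfidf_keywords(docs, top_k=3):
    """Return the top_k highest-scoring TF-IDF terms for each tokenized document.

    docs is a list of token lists. Ties are broken alphabetically so the
    output is deterministic.
    """
    n_docs = len(docs)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    results = []
    for doc in docs:
        tf = Counter(doc)
        scores = {
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        results.append(ranked[:top_k])
    return results


# Toy, already-tokenized "articles" (illustrative data only).
articles = [
    ["welfare", "policy", "news"],
    ["employment", "policy", "news"],
    ["access", "transport", "news"],
]
top_terms = tfidf_keywords(articles, top_k=1)
```

A term such as "news" that occurs in every document scores zero, which is why it never surfaces as a keyword in this toy corpus.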

Voice Command Web Browser Using Variable Vocabulary Word Recognizer (가변어휘 단어 인식기를 사용한 음성 명령 웹 브라우저)

  • 이항섭
    • The Journal of the Acoustical Society of Korea
    • /
    • v.18 no.2
    • /
    • pp.48-52
    • /
    • 1999
  • In this paper, we describe a Voice Command Web Browser that uses a variable vocabulary word recognizer to support Internet surfing through Korean speech recognition. The distinguishing feature of this browser is that its links and menus can be operated by speech, so a speech interface can be used together with the mouse for web browsing. Because the recognition candidates change dynamically from one Web page to the next, we use a variable vocabulary word recognizer. The recognizer was trained on the 3,848 words of POW (Phonetically Optimized Words), so that it can recognize new words that did not appear in the training data. Preliminary test results showed that speaker-independent and vocabulary-independent recognition accuracy is 93.8% for 32 Korean words. The Voice Command Web Browser was developed on Windows 95/NT using Netscape Navigator, and it incorporates usability test results in order to offer an easy interface to users unfamiliar with speech interfaces. In an on-line experiment under speaker-independent and environment-independent conditions, the Voice Command Web Browser showed a recognition accuracy of 90%.

A Study on the Law2Vec Model for Searching Related Law (연관법령 검색을 위한 워드 임베딩 기반 Law2Vec 모형 연구)

  • Kim, Nari;Kim, Hyoung Joong
    • Journal of Digital Contents Society
    • /
    • v.18 no.7
    • /
    • pp.1419-1425
    • /
    • 2017
  • The ultimate goal of legal knowledge search is to obtain optimal legal information based on laws and precedents. Text mining research is actively being undertaken to meet the need for efficient retrieval from large-scale data. A typical approach is to use a word embedding algorithm based on a neural network. This paper demonstrates how to search for relevant information by applying word embedding to Korean law information. First, we extract the reference laws from precedents in order and use them as the input of Law2Vec. The model learns a law by predicting its surrounding context laws. The algorithm then moves over each law in the corpus and repeats the training step. After training, the relationships between laws can be inferred from the embedding. Search performance was evaluated using precision and recall, which are computed from how closely the results are associated with the search terms. The test results showed that the proposed approach is much more useful for extracting related laws than existing systems that rely only on keyword search.
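
The training setup described in the abstract above, in which a law is learned by predicting its surrounding context laws, resembles skip-gram training. The Python sketch below shows only how (center, context) training pairs could be generated from the ordered list of laws cited in one precedent. The window size, the function name `context_pairs`, and the placeholder law identifiers are assumptions; the abstract does not give these details.

```python
def context_pairs(law_sequence, window=2):
    """Return (center_law, context_law) pairs for laws within `window` positions.

    law_sequence is the ordered list of laws referenced in one precedent.
    """
    pairs = []
    for i, center in enumerate(law_sequence):
        start = max(0, i - window)
        end = min(len(law_sequence), i + window + 1)
        for j in range(start, end):
            if j != i:  # a law is not its own context
                pairs.append((center, law_sequence[j]))
    return pairs


# Placeholder identifiers standing in for laws cited by a single precedent.
cited_laws = ["law_a", "law_b", "law_c"]
training_pairs = context_pairs(cited_laws, window=1)
```

Pairs like these would then be fed to an embedding model that moves over every law in the corpus, as the abstract describes.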

Spatial Gap Estimation for Word Separation in Handwritten Legal Amounts on Bank Check (필기체 수표 금액 문장에서의 단어 분리를 위한 공간적 간격 추정)

  • Kim In-cheol;Kim Kyoung-min
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.9 no.5
    • /
    • pp.1096-1101
    • /
    • 2005
  • An efficient method of estimating the spatial gaps between connected components is proposed to separate the individual words in a handwritten legal amount on a bank check. Owing to their inherent tendency to underestimate or overestimate, previous gap measures are difficult to apply to legal amounts, which usually show great shape variability due to the writer's unconstrained writing style, as well as touching words or irregular gaps between words due to limited space. To alleviate this problem and improve word separation performance, we developed a modified version of each distance measure. Through a series of word separation experiments, we found that the modified distance measures outperform their corresponding original distance measures by over 2-3% in word separation rate.
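
Gap-based word separation, as discussed in the abstract above, can be sketched in Python under simple assumptions. The sketch uses the plain horizontal bounding-box distance between neighboring connected components and a fixed threshold. It does not reproduce the modified distance measures the authors propose, and the function name `split_words`, the `(x_min, x_max)` box format, and the threshold value are all illustrative.

```python
def split_words(boxes, threshold):
    """Group connected-component boxes (x_min, x_max) into words.

    A new word starts whenever the horizontal gap to the previous
    component exceeds `threshold`.
    """
    if not boxes:
        return []
    boxes = sorted(boxes)  # left-to-right order by x_min
    words, current = [], [boxes[0]]
    for prev, box in zip(boxes, boxes[1:]):
        gap = box[0] - prev[1]  # bounding-box gap; negative if the boxes overlap
        if gap > threshold:
            words.append(current)
            current = []
        current.append(box)
    words.append(current)
    return words


# Three components: the first two are close together, the third is far away.
components = [(0, 10), (12, 20), (35, 50)]
grouped = split_words(components, threshold=5)
```

A fixed threshold like this is exactly where underestimation and overestimation arise when gaps are irregular, which is the difficulty the abstract attributes to the original gap measures.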

Research of the Relationship between the Hotel Wedding Service Qualities and Customer Satisfaction, and the Word-of-Mouth Intention as a Moderating Variable (호텔예식 서비스품질과 만족간의 관계 및 구전의도의 조절효과 연구)

  • Song, Young-Seok;Kim, Yeon-Sun
    • The Journal of the Korea Contents Association
    • /
    • v.12 no.7
    • /
    • pp.406-414
    • /
    • 2012
  • The purpose of this study is to find competitive marketing strategies for hotel wedding services. To that end, the study empirically analyzes the following: first, the causal relationship between the quality of hotel wedding service and customer satisfaction; second, the effect of word-of-mouth intention on that relationship. The study is based on a survey of customers of five-star hotels in Seoul. Twenty questionnaires were distributed at each of 12 five-star hotels in Seoul. The number of samples used in this study is 315. The study draws two conclusions from the survey analysis: first, hotel wedding service quality influences customer satisfaction; second, customers' word-of-mouth intention plays a meaningful moderating role between the quality of hotel wedding service and customer satisfaction. Given the results and limitations of this study, more research is needed on hotel wedding service products and on sample segmentation.

The Processing System of English for Korean : Focused on the Interaction with Native Language Processing (한국인의 영어처리의 기제 : 모국어처리와의 상호작용을 중심으로)

  • Lee, Chang-H.;Kang, Bong-Kyeng
    • Annual Conference on Human and Language Technology
    • /
    • 2004.10d
    • /
    • pp.240-247
    • /
    • 2004
  • The purpose of this study was to investigate the role of phonology in lexical access during bilingual processing by Korean-English bilinguals. Four experiments were conducted to adjudicate between the nonselective lexical access hypothesis, which argues for simultaneous phonological activation of both of a bilingual's languages, and the selective lexical access hypothesis, which argues for phonological activation of only one language. The results showed that Korean target word processing was significantly affected by the phonological manipulation of the English prime word (Exp. 2). Similarly, English target word processing tended to be affected by the phonological manipulation of the Korean prime word (Exp. 2). These results indicate that the phonological information of the other language is automatically activated when a bilingual processes one language, and that English, which is the second language for most Koreans, is phonologically activated during processing.

The influence of Instagram's posts information attributes on acceptable intentions and word of mouth effect: focusing on college students in South Korea and the United States (인스타그램의 게시글 정보특성과 수용의도 및 구전효과의 영향관계 연구: 한국, 미국 대학생을 중심으로)

  • Park, Se-June;Cho, Seung-Ho
    • Journal of Digital Convergence
    • /
    • v.13 no.9
    • /
    • pp.115-128
    • /
    • 2015
  • With the arrival of the Web 2.0 generation, an enormous amount of corporate information is being produced on various platforms. To attract the public amid this flood of information, corporations need to understand the features of each platform and adopt appropriate strategies. Numerous studies have examined SNS, which has grown rapidly in recent years, but studies of a specific medium are relatively scarce. This study therefore analyzed how information in Instagram posts is accepted from the perspective of consumers in Korea and the United States. We examined the relationship between acceptance intention and the word-of-mouth (WOM) effect through the mediating effect of information usefulness. To answer the research question, we conducted an online survey of Korean and U.S. college students. The results showed that information usefulness was the major mediating variable between the information characteristics of posts and both acceptance intention and WOM.