• Title/Summary/Keyword: Korean nouns


Efficient Keyword Extraction from Social Big Data Based on Cohesion Scoring

  • Kim, Hyeon Gyu
    • Journal of the Korea Society of Computer and Information
    • /
    • v.25 no.10
    • /
    • pp.87-94
    • /
    • 2020
  • Social reviews such as SNS feeds and blog articles are widely used to extract keywords that reflect users' opinions and complaints, and they often include proper nouns or new words reflecting recent trends. These words are generally not in a dictionary, so conventional morphological analyzers may fail to detect and extract them properly. In addition, because of their long processing time, such analyzers are ill-suited to providing analysis results in a timely manner. This paper presents a method for efficient keyword extraction from social reviews based on cohesion scoring. Cohesion scores are calculated from word frequencies, so keyword extraction can be performed without a dictionary. However, their accuracy can degrade when the input data has poor spacing. To address this, an algorithm is presented that improves the existing cohesion scoring mechanism using a word tree structure. Our experimental results show that the proposed method took only 0.008 seconds to extract keywords from 1,000 reviews, with an error ratio of 15.5%, which is better than that of existing morphological analyzers.
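
    As a rough illustration of frequency-based cohesion scoring, the sketch below uses one common formulation: a prefix is scored by the geometric mean of its character-to-character continuation frequencies. This is an assumption for illustration only. The abstract does not give the paper's exact formula or its word-tree improvement, and the function names here are hypothetical.

```python
from collections import Counter


def count_prefixes(tokens, max_len=8):
    """Count every prefix (up to max_len characters) of each space-delimited token."""
    counts = Counter()
    for token in tokens:
        for end in range(1, min(len(token), max_len) + 1):
            counts[token[:end]] += 1
    return counts


def cohesion(prefix, counts):
    """Assumed formulation: (count(c1..cn) / count(c1)) ** (1 / (n - 1)).

    Returns 0.0 for prefixes shorter than two characters or unseen first characters.
    """
    n = len(prefix)
    if n < 2 or counts[prefix[0]] == 0:
        return 0.0
    return (counts[prefix] / counts[prefix[0]]) ** (1.0 / (n - 1))


def best_prefix(token, counts):
    """Return the prefix of token (length >= 2) with the highest cohesion score."""
    candidates = [token[:end] for end in range(2, len(token) + 1)]
    return max(candidates, key=lambda p: cohesion(p, counts), default=token)
```

    Because only substring counts are needed, no dictionary is involved. The score does depend on how the text is split into tokens, which is consistent with the abstract's remark that poor spacing degrades accuracy.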

An investigation of Function Analysis patterns for the Effective VE at the Design Phase (효과적인 설계VE 활동을 위한 기능분석 유형조사)

  • Min Kyung-Seok
    • Korean Journal of Construction Engineering and Management
    • /
    • v.5 no.6 s.22
    • /
    • pp.63-71
    • /
    • 2004
  • This study analyzes function analysis patterns for the effective application of VE (Value Engineering) and presents function analysis methods. 1. Function analysis in VE activities can be summarized in six patterns: generating ideas without a function analysis process; using function analysis as an inspection tool for already generated ideas; illogical jumps caused by subjective terms; using duplicate terms for the same function; subjective ranking of function definitions; and overly broad classification when approaching function definition. 2. Effective function analysis requires the following: checking the level according to how far the project has advanced, reclassifying main nouns in order of frequency of use, and selecting the main objects to check based on importance and degree of satisfaction. This not only supports effective function analysis but also leads to an effective FAST Diagram for function arrangement.

Foreign Language Education of Korean Peninsula: Insights from Nogeldae (『노걸대』 분석을 통해서 바라본 우리 반도의 외국어 교육)

  • Kim, Jeong-ryeol
    • The Journal of the Korea Contents Association
    • /
    • v.17 no.6
    • /
    • pp.408-414
    • /
    • 2017
  • This paper investigates the value and resilience of Nogeoldae, which was written at the end of the Koryo dynasty and was used as the most important foreign language education material throughout the 500 years of the Chosun dynasty. To this end, 106 volumes of dialogues, 12 of meeting, 17 of lodging, 21 of Daedo bound, 34 of Daedo lives and 11 of return in Nogeoldae are analyzed in terms of average sentence length, average word length, type-token ratio, number of words before main verbs and number of words before nouns to identify how progressively the complexity increases. The analysis shows that Nogeoldae exhibits the kind of progressive complexity desired in modern foreign language textbooks.
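
    Three of the measures named above (average sentence length, average word length, and type-token ratio) can be computed directly from raw text. The sketch below is a minimal illustration and assumes a naive tokenizer (sentences split on `.`, `!`, `?`; words matched by `\w+`). The function name is hypothetical, and the two position-based measures are omitted because they require syntactic analysis.

```python
import re


def complexity_profile(text):
    """Return simple surface-complexity measures for a passage of text."""
    # Naive sentence split on terminal punctuation; empty fragments are dropped.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Lowercase so that "The" and "the" count as the same type.
    words = re.findall(r"\w+", text.lower())
    if not sentences or not words:
        return {"avg_sentence_length": 0.0, "avg_word_length": 0.0, "type_token_ratio": 0.0}
    return {
        "avg_sentence_length": len(words) / len(sentences),          # words per sentence
        "avg_word_length": sum(len(w) for w in words) / len(words),  # characters per word
        "type_token_ratio": len(set(words)) / len(words),            # distinct words / all words
    }
```

    Computing this profile for each unit of a textbook in order would show whether the values rise from the earlier units to the later ones.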

Mention Detection with Pointer Networks (포인터 네트워크를 이용한 멘션탐지)

  • Park, Cheoneum;Lee, Changki
    • Journal of KIISE
    • /
    • v.44 no.8
    • /
    • pp.774-781
    • /
    • 2017
  • Mention detection systems take a noun or noun phrase as a head and construct a chunk of text, including modifiers, that carries a meaning. The term "mention detection" refers to the extraction of mentions from a document. Coreference resolution then determines whether different mentions refer to the same thing. A pointer network is a model based on a recurrent neural network (RNN) encoder-decoder that outputs a list of elements corresponding to positions in the input sequence. In this paper, we propose mention detection using pointer networks. By applying a pointer network to mention detection, the proposed model can handle overlapped mentions, which could not be handled by sequence labeling. In the experiments, the proposed mention detection model achieved an F1 of 80.07%, 7.65%p higher than rule-based mention detection. Coreference resolution using this mention detection model achieved a CoNLL F1 of 52.67% (mention boundary) and 60.11% (head boundary), which are 7.68%p and 1.5%p higher, respectively, than coreference resolution using rule-based mention detection.

Korean Compound Nouns Decomposition Suitable for Embedded Systems (임베디드 시스템에 적합한 한국어 복합명사 분해)

  • Choi, Min-Seok;Kim, Chang-Hyun;Cheon, Min-Ah;Park, Ho-Min;Namgoong, Young;Yoon, Ho;Kim, Jae-Hoon
    • Annual Conference on Human and Language Technology
    • /
    • 2018.10a
    • /
    • pp.316-320
    • /
    • 2018
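  • A compound noun is a noun formed by combining two or more words, and it is treated as a single word in a sentence. However, in areas such as spelling and spacing checking, index term extraction for information retrieval, and unknown word estimation in machine translation, the individual words that make up a compound noun need to be identified. This process is called compound noun decomposition. Methods for decomposing compound nouns are broadly divided into rule-based and statistics-based methods, and this paper proposes a rule-based method that uses a minimal amount of statistical information. Four decomposition rules are applied to generate decomposition candidates, and the candidates are then prioritized to select the best one. A trie is built from basic words (nouns), bidirectional longest matching is applied using the trie, and ambiguity is resolved using syllable-pair statistics. To evaluate performance, we built a noun dictionary of about 70,000 entries along with syllable-pair statistics and decomposed compound nouns on that basis; the decomposition accuracy is 96.63% when the word composition ratio is taken into account. The proposed method performs compound noun decomposition with minimal data and uses the trie data structure to reduce the dictionary size and improve dictionary lookup speed. As a result, we were able to implement a compound noun decomposition system suited to small devices such as embedded systems.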
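
    The trie-based bidirectional longest matching described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' system: the four decomposition rules and the syllable-pair statistics are not given in the abstract, so the tie-break used here (prefer the split with fewer parts, then the forward split) is a hypothetical stand-in, and all names are invented for the example.

```python
class Trie:
    """Character trie over a noun dictionary; the key None marks the end of a word."""

    def __init__(self, words=()):
        self.root = {}
        for word in words:
            self.insert(word)

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})
        node[None] = True  # end-of-word marker (never collides with a character)

    def longest_prefix(self, text, start=0):
        """Length of the longest dictionary word starting at text[start], or 0."""
        node, best = self.root, 0
        for i in range(start, len(text)):
            node = node.get(text[i])
            if node is None:
                break
            if None in node:
                best = i + 1 - start
        return best


def forward_split(text, trie):
    """Greedy left-to-right longest matching; unknown characters become single parts."""
    parts, i = [], 0
    while i < len(text):
        n = trie.longest_prefix(text, i) or 1
        parts.append(text[i:i + n])
        i += n
    return parts


def backward_split(text, reversed_trie):
    """Right-to-left longest matching, done as a forward pass over reversed text."""
    parts = forward_split(text[::-1], reversed_trie)
    return [p[::-1] for p in reversed(parts)]


def decompose(text, nouns):
    """Run both directions and pick one result (hypothetical tie-break, see lead-in)."""
    nouns = list(nouns)
    fwd = forward_split(text, Trie(nouns))
    bwd = backward_split(text, Trie(w[::-1] for w in nouns))
    if fwd == bwd:
        return fwd
    return min((fwd, bwd), key=len)  # ties fall back to the forward split
```

    For example, with the nouns 대학, 대학생, 생선, 선교, 교회 and 선교회, calling `decompose("대학생선교회", nouns)` yields the two parts 대학생 and 선교회 in both directions. When the two directions disagree, a real system would consult additional evidence such as the syllable-pair statistics mentioned in the abstract.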


Named Entity Recognition and Dictionary Construction for Korean Title: Books, Movies, Music and TV Programs (한국어 제목 개체명 인식 및 사전 구축: 도서, 영화, 음악, TV프로그램)

  • Park, Yongmin;Lee, Jae Sung
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.3 no.7
    • /
    • pp.285-292
    • /
    • 2014
  • Named entity recognition is used to improve the performance of information retrieval systems, question answering systems, machine translation systems and so on. Its targets are usually PLOs (persons, locations and organizations). These are usually proper nouns or unregistered words, and traditional named entity recognizers use these characteristics to find named entity candidates. The titles of books, movies and TV programs have different characteristics from PLO entities: they sometimes consist of multiple phrases, a whole sentence, or special characters, which makes it difficult to find named entity candidates. In this paper we propose a method to quickly extract title named entities from news articles and automatically build a named entity dictionary for the titles. For candidate identification, the word phrases enclosed in special symbols in a sentence are first extracted and then verified by an SVM using feature words and their distances. For classification of the extracted title candidates, an SVM is used with the mutual information of word contexts.
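
    The first step, extracting phrases enclosed in special symbols, can be sketched with regular expressions. The symbol pairs below are an assumption chosen for illustration, since the abstract does not list which symbols the authors used. The later SVM verification and classification steps are not shown.

```python
import re

# Hypothetical set of opening/closing symbol pairs that often enclose titles.
SYMBOL_PAIRS = [("<", ">"), ("《", "》"), ("「", "」"), ("『", "』"), ("‘", "’"), ("“", "”")]


def extract_title_candidates(sentence):
    """Return phrases enclosed in any of SYMBOL_PAIRS, in order of appearance."""
    found = []
    for open_sym, close_sym in SYMBOL_PAIRS:
        # Non-greedy match so that each enclosed phrase is captured separately.
        pattern = re.escape(open_sym) + r"(.+?)" + re.escape(close_sym)
        for match in re.finditer(pattern, sentence):
            found.append((match.start(), match.group(1).strip()))
    return [text for _, text in sorted(found) if text]
```

    Candidates produced this way are noisy (quotations and emphasized phrases are enclosed in the same symbols), which is why the abstract describes a separate verification step.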

A Process Model for Effective Idea Creation and Administration of Value Engineering at Design Phase Activity (설계VE활동의 효과적인 아이디어 창출 및 관리를 위한 프로세스 모델)

  • Kim, Hong-Hyun;Min, Kyung-Seok
    • Korean Journal of Construction Engineering and Management
    • /
    • v.10 no.3
    • /
    • pp.13-21
    • /
    • 2009
  • Ideas generated through Value Engineering (VE) during the design phase are usually proposed in a scattered way because of the nature of brainstorming. As a result, similar ideas are likely to be proposed more than once, and these overlapping ideas may each receive different analyses and evaluations. Analyzing and evaluating such similar ideas efficiently is not easy; verifying their adequacy and objectivity becomes harder, and objective evaluation becomes more difficult and time-consuming. Even when the preparatory step for idea creation is carried out perfectly, the difficulty of systematically managing a large quantity of similar ideas reduces the reliability of VE activities during the design phase. Thus, this study proposes a process model for efficient idea creation and management in VE activities during the design phase.

An Implementation of Search System based on Natural Language Index Incorporating considering Image Characteristics (이미지 특성을 고려한 자연어 색인 기반의 검색시스템 구현)

  • Kim, Jung-Yee;Lee, Ki-Wook;Lee, Kang-Ho
    • Journal of the Korea Society of Computer and Information
    • /
    • v.11 no.5 s.43
    • /
    • pp.337-343
    • /
    • 2006
  • The number of digital camera users is increasing rapidly, and countless photos circulate on the internet, especially through the widespread use of Cyworld and blogs. Although portraits account for a large percentage of those photos, because of property rights nearly all of them are unavailable to web-page producers, advertising companies, web designers, and others who need a variety of portraits with differing expressions and characteristics. This study offers a search engine that incorporates image characteristics based on a natural language index and can provide fast and reliable search results. It will create an opportunity for digital photographers to more easily sell their pictures while giving would-be users of the photos a better and easier way to find the pictures they are looking for. Once the search engine is realized, it will become possible to use not only nouns but also verbs as keywords and categories when searching for portraits that reveal feelings, expressions, dress, and other characteristics.


Language Lateralization Using Magnetoencephalography (MEG): A Preliminary Study (뇌자도를 이용한 언어 편재화: 예비 연구)

  • Lee, Seo-Young;Kang, Eunjoo;Kim, June Sic;Lee, Sang-Kun;Kang, Hyejin;Park, Hyojin;Kim, Sung Hun;Lee, Seung Hwan;Chung, Chun Kee
    • Annals of Clinical Neurophysiology
    • /
    • v.8 no.2
    • /
    • pp.163-170
    • /
    • 2006
  • Background: MEG can measure task-specific neurophysiologic activity with good spatial and temporal resolution. Language lateralization using noninvasive methods has been a subject of interest in resective brain surgery. We aimed to develop a paradigm for language lateralization using MEG and validate its feasibility. Methods: Magnetic fields were recorded in 12 neurosurgical candidates and one volunteer during language tasks, with a 306-channel whole-head MEG. The language tasks were word listening, reading and picture naming. We tested two word listening paradigms: semantic decision on the meaning of abstract nouns, and recognition of repeated words. The subjects were instructed to silently name or read, and to respond by pushing a button or not. We determined language dominance according to the number of acceptable equivalent current dipoles (ECD) modeled by sequential single dipole, and the mean magnetic field strength by root mean square value, in each hemisphere. We collected clinical data including Wada test results. Results: Magnetic fields evoked by word listening were generally distributed in bilateral temporoparietal areas with variable hemispheric dominance. Language tasks using visual stimuli frequently evoked magnetic fields in the posterior midline area, which made laterality decisions difficult. Responding during the task produced more artifacts, and results differed depending on the responding hand. Laterality decisions based on mean magnetic field strength were more concordant with the Wada test than those based on the number of ECDs in each hemisphere. Conclusions: A word listening task without hand response is the most feasible paradigm for language lateralization using MEG. Mean magnetic field strength in each hemisphere is a proper index for hemispheric dominance.
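
    The root-mean-square comparison between hemispheres can be illustrated with a small sketch. It assumes each hemisphere's recording has already been reduced to a list of field-strength samples and that dominance is assigned to the side with the larger RMS value. The abstract does not state the channel selection, time windows, or decision thresholds used in the study, and the function names are hypothetical.

```python
import math


def rms(values):
    """Root mean square of a non-empty sequence of field-strength samples."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def dominant_hemisphere(left_samples, right_samples):
    """Compare RMS field strength of the two hemispheres (assumed decision rule)."""
    left, right = rms(left_samples), rms(right_samples)
    if left == right:
        return "undetermined"
    return "left" if left > right else "right"
```

    Squaring the samples before averaging keeps positive and negative field deflections from cancelling each other, which is the usual reason for using RMS rather than a plain mean.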


An Error Analysis on Business E-mails in English : A Case-Study (비지니스 이메일 영작문에 나타난 오류분석: 사례연구)

  • Hwang, Seon-Yoo
    • Journal of Convergence for Information Technology
    • /
    • v.8 no.6
    • /
    • pp.273-279
    • /
    • 2018
  • This study aimed to provide a comprehensive account of the sources and causes of errors in business emails that Korean college students wrote using a machine translator. Data were collected from 21 emails written by students taking a business English course. Findings indicated that the students made frequent errors in verb use and verb tense as well as in the definite article, countable/noncountable nouns, time adverbs and prepositions. The study therefore suggested that these common errors imply that the students have difficulty learning these linguistic features. Given that learners' errors can give us valuable insights into teaching and learning how to write in English, pedagogical suggestions are put forward based on the study results.