• Title/Summary/Keyword: Sentence Analysis


E-commerce data based Sentiment Analysis Model Implementation using Natural Language Processing Model (자연어처리 모델을 이용한 이커머스 데이터 기반 감성 분석 모델 구축)

  • Choi, Jun-Young;Lim, Heui-Seok
    • Journal of the Korea Convergence Society / v.11 no.11 / pp.33-39 / 2020
  • In natural language processing, research on tasks such as translation, POS tagging, question answering, and sentiment analysis is being carried out worldwide. For sentiment analysis, pretrained sentence embedding models achieve high classification performance on English single-domain datasets. This paper compares classification performance on a Korean e-commerce dataset with varied domain attributes, using six neural network models: BOW (Bag of Words), LSTM[1], Attention, CNN[2], ELMo[3], and BERT (KoBERT)[4]. The results confirm that pretrained sentence embedding models outperform word embedding models. In addition, a practical neural network model composition is proposed after comparing classification performance on a dataset with 17 categories. Finally, compressing the sentence embedding model is noted as future work, considering the trade-off between inference time and model capacity in a real-time service.
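
As a rough illustration of the bag-of-words (BOW) input representation named in the abstract above, the following is a minimal sketch and not the authors' implementation. It assumes whitespace tokenization and two invented English reviews; the function names `build_vocab` and `bow_vector` are hypothetical.

```python
from collections import Counter


def build_vocab(docs):
    """Map each distinct token to a column index, in order of first appearance."""
    vocab = {}
    for doc in docs:
        for tok in doc.lower().split():
            vocab.setdefault(tok, len(vocab))
    return vocab


def bow_vector(doc, vocab):
    """Count how often each vocabulary token occurs in the document."""
    counts = Counter(tok for tok in doc.lower().split() if tok in vocab)
    # dicts preserve insertion order, so positions match the vocab indices
    return [counts.get(tok, 0) for tok in vocab]


# Toy reviews (assumed data, for illustration only)
reviews = ["fast delivery good quality", "bad quality slow delivery"]
vocab = build_vocab(reviews)
vectors = [bow_vector(r, vocab) for r in reviews]
```

A count vector of this kind discards word order, which is one plausible reason a BOW baseline would trail the pretrained sentence embedding models in the comparison.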

A comparative study of Entity-Grid and LSA models on Korean sentence ordering (한국어 텍스트 문장정렬을 위한 개체격자 접근법과 LSA 기반 접근법의 활용연구)

  • Kim, Youngsam;Kim, Hong-Gee;Shin, Hyopil
    • Korean Journal of Cognitive Science / v.24 no.4 / pp.301-321 / 2013
  • For the task of sentence ordering, this paper applies the Entity-Grid model, an entity-based modeling approach, and Latent Semantic Analysis (LSA), which is based on vector space modeling. The task is well known as one of the fundamental tools for measuring text coherence and enhancing text generation. In implementing the Entity-Grid model, we use the syntactic roles of nouns in Korean text for the ordering task and measure their impact on the result, since their contribution has been discussed in previous research. Contrary to the case of German, the result is positive. To obtain the syntactic roles, we rely on the Korean case markers attached to the nouns. The results show that these cues help in measuring text coherence. In addition, we compare the results with those of the LSA-based model and discuss the advantages and disadvantages of the two models as well as options for future studies.
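
The entity-grid idea in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's code: each sentence is assumed to arrive already reduced to a mapping from entity to syntactic role (S = subject, O = object, X = other), for example as read off from case markers, and the names `build_grid` and `transition_probs` as well as the toy text are hypothetical.

```python
from collections import Counter
from itertools import product

ROLES = ("S", "O", "X", "-")  # "-" marks an entity absent from a sentence


def build_grid(sentences):
    """Return {entity: [role in sentence 0, role in sentence 1, ...]}."""
    entities = sorted({e for s in sentences for e in s})
    return {e: [s.get(e, "-") for s in sentences] for e in entities}


def transition_probs(grid, n=2):
    """Relative frequency of each length-n role transition over all entities."""
    counts = Counter()
    for column in grid.values():
        for i in range(len(column) - n + 1):
            counts[tuple(column[i:i + n])] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {t: counts[t] / total for t in product(ROLES, repeat=n)}


# Toy three-sentence text with roles assumed to be pre-assigned
sentences = [
    {"Kim": "S", "book": "O"},
    {"Kim": "S"},
    {"book": "S", "library": "X"},
]
grid = build_grid(sentences)
probs = transition_probs(grid)
```

The resulting transition distribution can serve as a feature vector for one candidate ordering; comparing such vectors across permutations of the same sentences is one common way to rank orderings by coherence.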


Syntactic Category Prediction for Improving Parsing Accuracy in English-Korean Machine Translation (영한 기계번역에서 구문 분석 정확성 향상을 위한 구문 범주 예측)

  • Kim Sung-Dong
    • The KIPS Transactions:PartB / v.13B no.3 s.106 / pp.345-352 / 2006
  • A practical English-Korean machine translation system should translate long sentences quickly and accurately. Intra-sentence segmentation has been proposed previously and has helped speed up syntactic analysis. This paper proposes a syntactic category prediction method that uses decision trees to obtain more accurate parsing results. In parsing with segmentation, each segment is parsed separately and the results are combined to generate the sentence structure. Syntactic category prediction helps select more accurate analysis structures after partial parsing and thus improves parsing accuracy. We construct features for predicting syntactic categories from the parsed Wall Street Journal corpus and generate decision trees. In the experiments, we compare the performance with predictions by human-built rules, trigram probabilities, and neural networks. We also show how much category prediction contributes to improving translation quality.

The analysis of mathematics error type that appears from the process of solving problem related to real life (실생활 문장제의 해결과정에 나타나는 오류유형 분석)

  • Park, Jang Hee;Ryu, Shi Kyu;Lee, Joong Kwoen
    • Journal of the Korean School Mathematics Society / v.15 no.4 / pp.699-718 / 2012
  • The purpose of mathematics education is to develop the ability to think mathematically, and it teaches how to solve problems through mathematical thinking. Errors in problem solving can be regarded as errors in mathematical thinking. Therefore, analyzing and classifying mathematics errors is important for teaching mathematics. This study reviews preceding studies on mathematics errors and presents their characteristics together with the models analyzed. The results of analyzing the problem-solving process are as follows: ▸ Students find word problems much harder to solve than multiple-choice problems. ▸ Sentence length affects how well word problems are understood; students understand short-sentence problems more easily than long-sentence problems. ▸ If students have difficulty with previously learned mathematical content, they have the same difficulty with word problems based on that content.


A Documentational Study of Doinqigong in The Oriental Medicine Classics (고전의서(古典醫書) 중 도인기공법(導引氣功法)에 관한 문헌(文獻) 연구(硏究))

  • Kim, Hyun-Tai;Han, Chang-Hyun;Lee, Sang-Nam;Kwon, Young-Kyu;Ahn, Sang-Woo;Park, Ji-Ha
    • Journal of Korean Medical classics / v.22 no.3 / pp.7-29 / 2009
  • Objectives: Because Oriental medicine emphasizes preventive medicine, interest in Doinqigong (導引氣功: physical and breathing exercise) has risen recently. However, its sphere of application in present-day South Korea is limited. We therefore aim to identify its sphere of application and detailed methods in the Oriental medicine classics. Methods: We examined the theory and methods of Doinqigong in the Junghwauijeon (中華醫典: Oriental medicine classic collections) DB according to the following procedure. (1) Making a list of related words: We used existing studies of Doinqigong to make a list of terms connected with Doinqigong, including not only technical terms but also general terms. (2) Searching sentences: We searched the Junghwauijeon DB for sentences containing terms related to Doinqigong. (3) Analysis of related sentences: We classified the retrieved sentences by theory and method. Conclusions: (1) The total number of Oriental medicine classics connected with Doinqigong is twelve. (2) The number of Oriental medicine classics connected with the theory of Doinqigong is four, and their contents include the working principle of Doinqigong, Doinqigong according to time, the control of life cultivation, the importance of consciousness, the consciousness of running qigong, and so on.


A Study of bathing therapy on the 「Wai-Tai-Mi-Yao(外臺秘要)」 ("외대비요(外臺秘要)"의 약욕요법(藥浴療法) 활용에 관한 연구)

  • Heo, Kyung-Ja;Lee, Byung-Wook;Kim, Eun-Ha
    • Korean Journal of Oriental Medicine / v.11 no.1 / pp.43-60 / 2005
  • Objective: 「Wai-Tai-Mi-Yao」 was compiled by Wang-Dao (王燾) in the Tang Dynasty (唐朝). It contains not only the medical knowledge of its day but also that of earlier periods, so it is regarded as an important classic of Oriental medicine. It also describes various bathing therapy methods. We therefore aim to identify the sphere of use and detailed methods of bathing therapy in the Tang Dynasty and earlier. Methodologies: We examined the bathing therapy of 「Wai-Tai-Mi-Yao」 according to the following procedure. (1) Making a list of related words: We used existing technical books on external treatments to make a list. The list consists of 23 words and includes not only technical terms but also general terms. (2) Searching sentences: We searched for sentences containing terms related to bathing therapies. (3) Analysis of related sentences: We classified the retrieved sentences by disease. Conclusions: (1) 「Wai-Tai-Mi-Yao」 contains 15,180 records. Bathing therapies are used in 726 of these records, which is 4.8% of the whole volume. (2) 「Wai-Tai-Mi-Yao」 describes 1,104 diseases. Bathing therapies are used to treat 293 of them, which is 26.5% of all the diseases described. (3) These diseases belong to the dermatologic, internal, ophthalmic, otolaryngologic, obstetric, gynecologic, pediatric, surgical, and veterinary categories.


Three-Phase English Syntactic Analysis for Improving the Parsing Efficiency (영어 구문 분석의 효율 개선을 위한 3단계 구문 분석)

  • Kim, Sung-Dong
    • KIPS Transactions on Software and Data Engineering / v.5 no.1 / pp.21-28 / 2016
  • The performance of an English-Korean machine translation system depends heavily on its English parser. The parser in this paper is part of a rule-based English-Korean MT system that includes many syntactic rules and performs chart-based parsing. Because of the large number of syntactic rules, the parser generates too many structures and therefore requires much time and memory. The rule-based parser has difficulty analyzing and translating long sentences that contain commas because they cause high parsing complexity. In this paper, we propose a 3-phase parsing method with sentence segmentation to efficiently translate the long sentences that commonly appear. Each phase of the syntactic analysis applies its own independent syntactic rules in order to reduce parsing complexity. For this purpose, we classify the syntactic rules into 3 classes and design a 3-phase parsing algorithm. In particular, the syntactic rules in the 3rd class handle sentence structures composed with commas. We present an automatic method for acquiring 3rd-class rules from the syntactic analysis of a corpus, with which we aim to continuously improve parsing coverage. The experimental results show that the proposed 3-phase parsing method is superior to the prior parsing method, which uses only intra-sentence segmentation, in terms of parsing speed and memory efficiency while maintaining translation quality.

Comparison of Significant Term Extraction Based on the Number of Selected Principal Components (주성분 보유수에 따른 중요 용어 추출의 비교)

  • Lee Chang-Beom;Ock Cheol-Young;Park Hyuk-Ro
    • The KIPS Transactions:PartB / v.13B no.3 s.106 / pp.329-336 / 2006
  • In this paper, we propose a method of significant term extraction within a document. The technique used is Principal Component Analysis (PCA), one of the multivariate analysis methods. PCA can make full use of term-term relationships within a document through term-term correlations. We perform PCA on a correlation matrix between terms instead of a covariance matrix. We also try to determine thresholds for both the number of components to select and the correlation coefficients between the selected components and terms. The experimental results on 283 Korean newspaper articles show that using the first six components with correlation coefficients of |0.4| is the best condition for extracting sentences based on the selected significant terms.
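
The selection procedure described above can be sketched in a few lines of NumPy. This is a minimal sketch under assumptions, not the authors' implementation: the rows of `X` are assumed to be sentences and the columns term counts, the toy matrix and term names are invented, and `significant_terms` is a hypothetical helper. It relies on the standard fact that, for PCA on a correlation matrix, the correlation between a term and a component equals the eigenvector entry scaled by the square root of the eigenvalue.

```python
import numpy as np


def significant_terms(X, terms, n_components=2, threshold=0.4):
    """Select terms whose correlation with any retained component is >= threshold in absolute value."""
    corr = np.corrcoef(X, rowvar=False)           # term-term correlation matrix
    eigvals, eigvecs = np.linalg.eigh(corr)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    top_vals = np.clip(eigvals[order], 0.0, None)  # guard against tiny negative values
    loadings = eigvecs[:, order] * np.sqrt(top_vals)
    keep = np.any(np.abs(loadings) >= threshold, axis=1)
    return [t for t, k in zip(terms, keep) if k], loadings


# Toy sentence-by-term count matrix (assumed data; every column must vary)
terms = ["economy", "market", "weather", "rain"]
X = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 1],
    [0, 0, 2, 1],
    [0, 1, 1, 2],
    [2, 2, 0, 0],
], dtype=float)

selected, loadings = significant_terms(X, terms)
```

The abstract reports the first six components and a threshold of |0.4| as the best setting on its newspaper corpus; the toy call here keeps only two components because the example has just four terms.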

Korean Morphological Analysis Method Based on BERT-Fused Transformer Model (BERT-Fused Transformer 모델에 기반한 한국어 형태소 분석 기법)

  • Lee, Changjae;Ra, Dongyul
    • KIPS Transactions on Software and Data Engineering / v.11 no.4 / pp.169-178 / 2022
  • Morphemes are the most primitive units in a language; they lose their original meaning when segmented into smaller parts. In Korean, a sentence is a sequence of eojeols (words) separated by spaces, and each eojeol comprises one or more morphemes. Korean morphological analysis (KMA) divides the eojeols in a given Korean sentence into morpheme units and assigns appropriate part-of-speech (POS) tags to the resulting morphemes. KMA is one of the most important tasks in Korean natural language processing (NLP), and improving its performance is closely tied to improving the performance of other Korean NLP tasks. Recent research on KMA has begun to adopt machine translation (MT) models. MT converts a sequence (sentence) of units in one domain into a sequence (sentence) of units in another domain, and neural machine translation (NMT) refers to MT approaches that exploit neural network models. From the perspective of MT, KMA transforms an input sequence of units in the eojeol domain into a sequence of units in the morpheme domain. In this paper, we propose a deep learning model for KMA. The backbone of our model is the BERT-fused model, which has been shown to achieve high performance on NMT. The BERT-fused model combines Transformer, a representative model used in NMT, with BERT, a language representation model that has enabled significant advances in NLP. The experimental results show that our model achieves an F1-score of 98.24.

Meta Information Retrieval using Sentence Analysis of Korean Dialogue Style (한국어 대화체 문장 분석을 이용한 메타 정보검색)

  • 박인철
    • Journal of the Korea Computer Industry Society / v.4 no.10 / pp.703-712 / 2003
  • Today, the number of documents on the Internet is increasing with the development of communication networks, and an information retrieval system that can efficiently acquire the necessary information is required. Most information retrieval systems retrieve documents using a simple keyword or a Boolean query of keywords. However, this method is not suitable for novice users and is harder to use than a dialogue-style query in terms of convenience and precise understanding of the query. This paper therefore proposes a method that addresses these problems, and designs and implements a meta query processing system for information retrieval using Korean dialogue sentences. The implemented system generates a new Boolean query for a given Korean dialogue sentence and resolves lexical ambiguities through morphological analysis, syntactic analysis, and query expansion using a thesaurus.
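
The step from a dialogue-style sentence to a Boolean query can be illustrated with a toy sketch. It is not the system described in the abstract: a simple stopword filter stands in for Korean morphological and syntactic analysis, English is used instead of Korean, and the stopword list, thesaurus entries, and function name `to_boolean_query` are all hypothetical.

```python
# Hypothetical function-word list and thesaurus, for illustration only
STOPWORDS = {"please", "show", "me", "documents", "about", "the", "of", "a"}
THESAURUS = {"car": ["automobile"], "price": ["cost"]}


def to_boolean_query(sentence, stopwords=STOPWORDS, thesaurus=THESAURUS):
    """Keep content words, expand each with synonyms (OR), and join them with AND."""
    tokens = [t.strip(".,?!").lower() for t in sentence.split()]
    keywords = []
    for tok in tokens:
        if tok and tok not in stopwords and tok not in keywords:
            keywords.append(tok)
    groups = []
    for kw in keywords:
        variants = [kw] + thesaurus.get(kw, [])
        if len(variants) > 1:
            groups.append("(" + " OR ".join(variants) + ")")
        else:
            groups.append(kw)
    return " AND ".join(groups)


query = to_boolean_query("Please show me documents about the price of a used car.")
```

Joining synonym groups with AND keeps every content word of the original sentence as a constraint, while the OR inside each group widens recall; a real system would also need the ambiguity resolution the abstract mentions.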
