• Title/Abstract/Keywords: Linguistic Information

Search results: 533 (processing time: 0.028 s)

An Induced Hesitant Linguistic Aggregation Operator and Its Application for Creating Fuzzy Ontology

  • Kong, Mingming;Ren, Fangling;Park, Doo-Soon;Hao, Fei;Pei, Zheng
    • KSII Transactions on Internet and Information Systems (TIIS), Vol. 12, No. 10, pp. 4952-4975, 2018
  • This paper investigates an induced hesitant linguistic aggregation operator in which hesitant fuzzy linguistic evaluation values are associated with probabilistic information. To deal with this hesitant fuzzy linguistic information, an induced hesitant fuzzy linguistic probabilistic ordered weighted averaging (IHFLPOWA) operator is proposed, and its monotonicity, boundedness, and idempotency are proved. The andness, orness, and entropy of dispersion of IHFLPOWA, which characterize the weighting vector of the operator, are then analyzed; these properties show that IHFLPOWA extends both the induced linguistic ordered weighted averaging operator and the linguistic probabilistic aggregation operator. IHFLPOWA is used to gather linguistic information and create fuzzy ontologies. A movie fuzzy ontology serves as an illustrative case study that demonstrates the proposed method and compares it with existing linguistic aggregation operators. The results suggest that IHFLPOWA is a useful alternative operator for handling hesitant fuzzy linguistic information with probabilistic information.
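
The weighting-vector measures named in this abstract can be illustrated on the classical (Yager) ordered weighted averaging operator. The following is a minimal sketch, not the paper's IHFLPOWA operator: it assumes crisp numeric arguments (for example, indices of linguistic terms) instead of hesitant fuzzy linguistic sets, and all function names and sample values are hypothetical.

```python
import math


def owa(values, weights):
    """Classical OWA: weights are applied to the values sorted in descending order."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


def induced_owa(pairs, weights):
    """Induced OWA: arguments are reordered by a separate order-inducing value.

    `pairs` holds (inducing_value, argument) tuples; the arguments are ranked
    by descending inducing value before the weights are applied.
    """
    if len(pairs) != len(weights):
        raise ValueError("pairs and weights must have the same length")
    ordered = [arg for _, arg in sorted(pairs, key=lambda p: p[0], reverse=True)]
    return sum(w * a for w, a in zip(weights, ordered))


def orness(weights):
    """Degree to which the operator behaves like max (1.0) rather than min (0.0)."""
    n = len(weights)
    return sum((n - i) * w for i, w in enumerate(weights, start=1)) / (n - 1)


def andness(weights):
    """Complement of orness."""
    return 1.0 - orness(weights)


def dispersion(weights):
    """Entropy of dispersion of the weighting vector (zero weights contribute 0)."""
    return -sum(w * math.log(w) for w in weights if w > 0)


# Hypothetical example: four evaluations given as linguistic term indices.
w = [0.4, 0.3, 0.2, 0.1]
print(owa([2, 5, 3, 4], w))
print(induced_owa([(3, 2), (1, 5), (4, 3), (2, 4)], w))
print(orness(w), andness(w), dispersion(w))
```

With weights concentrated on the first position the operator approaches the maximum (orness 1), and with uniform weights the dispersion reaches its largest value, ln n.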

Development of a Deep Learning Model for Detecting Fake Reviews Using Author Linguistic Features

  • 신동훈;신우식;김희웅
    • 한국정보시스템학회지:정보시스템연구, Vol. 31, No. 4, pp. 1-23, 2022
  • Purpose: This study proposes a deep learning-based fake review detection model that combines authors' linguistic features with the semantic information of reviews. Design/methodology/approach: We used 358,071 Yelp reviews to develop the fake review detection model. We employed Linguistic Inquiry and Word Count (LIWC) to extract 24 linguistic features of authors, then used deep learning architectures such as multilayer perceptron (MLP), long short-term memory (LSTM), and transformer to learn linguistic and semantic features for fake review detection. Findings: Detection models using both linguistic and semantic features outperformed models using a single type of feature. In addition, the differences in linguistic features between fake and authentic reviewers were significant. That is, linguistic features complement the semantic information of reviews and further enhance the predictive power of the fake review detection model.

Quantitative Linguistic Analysis on Literary Works

  • Choi, Kyung-Ho
    • Journal of the Korean Data and Information Science Society, Vol. 18, No. 4, pp. 1057-1064, 2007
  • From the viewpoint of natural language processing, quantitative linguistic analysis is a linguistic study that relies on statistical methods; it is a form of mathematical linguistics that attempts to discover linguistic characteristics by interpreting linguistic facts quantitatively. This study introduces a computer-assisted quantitative linguistic analysis method for literary works. It also introduces the use of SynKDP, a synthesized Korean data process, and shows the relation between the distribution of the linguistic unit elements used by the hero of the novel Sassinamjunggi and the thematic analysis of literary works.

Linguistic Modeling for Multilingual Machine Translation based on Common Transfer

  • 최승권;김영길
    • 한국언어정보학회지:언어와정보, Vol. 18, No. 1, pp. 77-97, 2014
  • Multilingual machine translation is machine translation that covers more than two languages. Common transfer is a transfer approach in which transfer rules are reused among similar languages according to linguistic typology. Multilingual machine translation based on common transfer can therefore share transfer rules among languages with similar linguistic typology. This paper describes the linguistic modeling for a multilingual machine translation system based on common transfer that is under development. The modeling consists of four linguistic devices: 1) a multilingual common part-of-speech set, 2) a multilingual common transfer format, 3) multilingual common transfer chunking, and 4) multilingual common transfer rules based on linguistic typology. The validity of this linguistic modeling is shown in a simulation. The multilingual machine translation system based on common transfer, covering Korean, English, Chinese, Spanish, and French, is to be developed by 2018.

Sentiment Analysis of Korean Using Effective Linguistic Features and Adjustment of Word Senses

  • Jang, Ha-Yeon;Shin, Hyo-Pil
    • 한국언어정보학회지:언어와정보, Vol. 14, No. 2, pp. 33-46, 2010
  • This paper introduces a new linguistic-focused approach for sentiment analysis (SA) of Korean. In order to overcome shortcomings of previous works that focused mainly on statistical methods, we made effective use of various linguistic features reflecting the nature of Korean. These features include contextual shifters, modal affixes, and the morphological dependency of chunk structures. Moreover, in order to eschew possible confusion caused by ambiguous words and to improve the results of SA, we also proposed simple adjustment methods of word senses using KOLON ontology mapping information. Through experiments we contend that effective use of linguistic features and ontological information can improve the results of sentiment analysis of Korean.

Automatic Mapping Between Large-Scale Heterogeneous Language Resources for NLP Applications: A Case of Sejong Semantic Classes and KorLexNoun for Korean

  • Park, Heum;Yoon, Ae-Sun
    • 한국언어정보학회지:언어와정보, Vol. 15, No. 2, pp. 23-45, 2011
  • This paper proposes a statistics-based linguistic methodology for automatic mapping between large-scale heterogeneous language resources for NLP applications in general. As a particular case, it treats automatic mapping between two large-scale heterogeneous Korean language resources: Sejong Semantic Classes (SJSC) in the Sejong Electronic Dictionary (SJD) and nouns in KorLex. KorLex is a large-scale Korean WordNet, but it lacks syntactic information. SJD contains refined semantic-syntactic information, with semantic labels depending on SJSC, but its list of entry words is much smaller than that of KorLex. The goal of our study is to build a rich language resource by integrating useful information from SJD into KorLex. We use both linguistic and statistical methods to construct an automatic mapping methodology. The linguistic aspect of the methodology focuses on three linguistic clues: monosemy/polysemy of word forms, instances (example words), and semantically related words. The statistical aspect uses three statistical formulae, $\chi^2$, Mutual Information, and Information Gain, to obtain candidate synsets. Compared with manual mapping, the automatic mapping based on the proposed statistical linguistic methods performs well in terms of correctness, giving recall 0.838, precision 0.718, and F1 0.774.
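
The relation between the three reported scores can be checked with the standard F1 formula, the harmonic mean of precision and recall. This is a small sketch for illustration only; the function name is hypothetical, and the inputs are the rounded values quoted in the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; defined as 0.0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded figures quoted in the abstract.
precision = 0.718
recall = 0.838
print(f"F1 = {f1_score(precision, recall):.4f}")
```

From the rounded inputs the formula gives roughly 0.773, close to the reported 0.774; the small gap is plausibly due to rounding of the published precision and recall.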

Crowdfunding Scams: The Profiles and Language of Deceivers

  • Lee, Seung-hun;Kim, Hyun-chul
    • 한국컴퓨터정보학회논문지, Vol. 23, No. 3, pp. 55-62, 2018
  • In this paper, we propose a model that detects crowdfunding scams, which have reportedly been occurring over the last several years, based on their project information and linguistic features. To this end, we first collect and analyze crowdfunding scam projects, and then identify which project-related information and linguistic features are particularly useful in distinguishing scam projects from non-scams. Our proposed model, built with the selected features and the Random Forest machine learning algorithm, detects scam campaigns with 84.46% accuracy.

The Deduction of Objective Linguistic Information Using Statistical Methods: Exploring the Possibility of Interdisciplinary Research

  • 최경호;이용욱
    • Journal of the Korean Data and Information Science Society, Vol. 22, No. 1, pp. 49-55, 2011
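  • Recently, attempts to achieve convergence through consilience have appeared frequently in many fields. Academia is no exception, and interdisciplinary research is one example. One area of interdisciplinary research related to statistics is the linguistic research known as linguistic informatics or quantitative linguistics. However, interdisciplinary research between statistics and linguistics has been carried out mainly by linguists. Consequently, from a statistical point of view, the linguists' results clearly have some shortcomings. This study therefore considers how to raise the quality of interdisciplinary research between statistics and linguistics by using statistical methods to compensate for the lack of objectivity found in some linguistic studies. In other words, it introduces several statistical methods that can help derive more objective linguistic information in linguistic research and presents application examples.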

Development of Linguistic Contents for Contextual Dialogue

  • Moon, Sang-Ho
    • Journal of information and communication convergence engineering, Vol. 8, No. 1, pp. 116-121, 2010
  • New teaching and studying methods that use educational contents are gradually becoming widespread with the advancement of information and communication technology. In this paper, we design and implement linguistic contents, as a form of educational contents, for studying essential expressions that apply to various real-life situations. Specifically, the linguistic contents run in web environments and provide animations suited to learning essential expressions in several foreign languages through contextual dialogues. The contents also include functions that reinforce what users have learned.

Effectual Fuzzy Query Evaluation Method based on Fuzzy Linguistic Matrix in Information Retrieval

  • 최명복;김민구
    • 한국지능시스템학회논문지, Vol. 10, No. 3, pp. 218-227, 2000
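  • This paper proposes a new thesaurus-based fuzzy information retrieval method. In the proposed method, the thesaurus is represented as a fuzzy linguistic matrix whose entries give the degree of relatedness between thesaurus terms as qualitative linguistic values, and three kinds of relations between terms are provided: synonymy, hierarchy, and association. Relatedness degrees that are left unspecified between thesaurus terms are inferred by a transitive closure algorithm on the fuzzy linguistic matrix, grounded in fuzzy theory. The proposed method also allows qualitative linguistic values, which directly reflect subjective and imprecise human measures, in the representation of information items such as user queries and documents. It is therefore more useful than the methods proposed in [1-3]. In addition, because query evaluation uses the fuzzy linguistic matrix and AON (Associated Ordinary Number) values, it is more time-efficient than the methods used in [1-3]. As a result, it lets users process queries in a more useful and intelligent way.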
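
The idea of inferring unspecified relatedness degrees by transitive closure can be sketched with the common max-min closure of a fuzzy relation. This is an illustration under stated assumptions, not the paper's algorithm: it uses numeric membership degrees in [0, 1] where the paper uses qualitative linguistic values, and the function names and the three-term matrix are hypothetical.

```python
def max_min_compose(r, s):
    """Max-min composition of two square fuzzy relation matrices."""
    n = len(r)
    return [
        [max(min(r[i][k], s[k][j]) for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def transitive_closure(relation):
    """Repeat R <- max(R, R o R) until the matrix stops changing."""
    closure = [row[:] for row in relation]
    n = len(closure)
    while True:
        composed = max_min_compose(closure, closure)
        updated = [
            [max(closure[i][j], composed[i][j]) for j in range(n)]
            for i in range(n)
        ]
        if updated == closure:
            return closure
        closure = updated


# Hypothetical thesaurus with three terms; 0.0 marks an unspecified degree
# between term 0 and term 2.
relatedness = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.6],
    [0.0, 0.6, 1.0],
]
for row in transitive_closure(relatedness):
    print(row)
```

In this example the missing degree between term 0 and term 2 is inferred through term 1 as min(0.8, 0.6) = 0.6. The loop terminates because entries never decrease and only take values already present in the matrix.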
