• Title/Summary/Keyword: Linguistic Information

An Induced Hesitant Linguistic Aggregation Operator and Its Application for Creating Fuzzy Ontology

  • Kong, Mingming;Ren, Fangling;Park, Doo-Soon;Hao, Fei;Pei, Zheng
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.10 / pp.4952-4975 / 2018
  • This paper investigates an induced hesitant linguistic aggregation operator in which hesitant fuzzy linguistic evaluation values are associated with probabilistic information. To handle such information, an induced hesitant fuzzy linguistic probabilistic ordered weighted averaging (IHFLPOWA) operator is proposed, and its monotonicity, boundedness, and idempotency are proved. The andness, orness, and entropy of dispersion of IHFLPOWA, which characterize the weighting vector of the operator, are then analyzed; these properties show that IHFLPOWA extends both the induced linguistic ordered weighted averaging operator and the linguistic probabilistic aggregation operator. IHFLPOWA is then used to gather linguistic information and create fuzzy ontologies. A movie fuzzy ontology serves as an illustrative case study to demonstrate the proposed method and to compare it with existing linguistic aggregation operators. The results suggest that IHFLPOWA is a useful alternative operator for hesitant fuzzy linguistic information that carries probabilistic information.
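
The abstract above names several standard notions from the ordered weighted averaging (OWA) literature. As a rough illustration only, the sketch below implements a plain numeric OWA operator, an induced variant, and the usual orness and dispersion measures of a weighting vector. It is not the IHFLPOWA operator from the paper: it assumes crisp numeric arguments rather than hesitant fuzzy linguistic values with probabilities, and all function names are hypothetical.

```python
import math


def _check_weights(values, weights):
    """Validate that the weighting vector matches the arguments and sums to 1."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if not math.isclose(sum(weights), 1.0, abs_tol=1e-9):
        raise ValueError("weights must sum to 1")


def owa(values, weights):
    """Plain OWA: the j-th weight applies to the j-th largest argument."""
    _check_weights(values, weights)
    ordered = sorted(values, reverse=True)
    return sum(w * b for w, b in zip(weights, ordered))


def induced_owa(pairs, weights):
    """Induced OWA: arguments are reordered by an order-inducing value.

    `pairs` is a list of (inducing_value, argument) tuples; the argument with
    the largest inducing value receives the first weight.
    """
    _check_weights(pairs, weights)
    ordered = [arg for _, arg in sorted(pairs, key=lambda p: p[0], reverse=True)]
    return sum(w * a for w, a in zip(weights, ordered))


def orness(weights):
    """Orness of a weighting vector: 1 for max-like, 0 for min-like weights."""
    n = len(weights)
    if n < 2:
        raise ValueError("orness needs at least two weights")
    return sum((n - j) * w for j, w in enumerate(weights, start=1)) / (n - 1)


def andness(weights):
    """Andness is the complement of orness."""
    return 1.0 - orness(weights)


def dispersion(weights):
    """Entropy of dispersion: -sum(w * ln w), skipping zero weights."""
    return -sum(w * math.log(w) for w in weights if w > 0)
```

With the weighting vector `[1.0, 0.0, 0.0]` the operator reduces to the maximum (orness 1, dispersion 0), and with equal weights it reduces to the arithmetic mean (orness 0.5, maximal dispersion), which is the sense in which these measures characterize the weighting vector.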

Development of a Deep Learning Model for Detecting Fake Reviews Using Author Linguistic Features (작성자 언어적 특성 기반 가짜 리뷰 탐지 딥러닝 모델 개발)

  • Shin, Dong Hoon;Shin, Woo Sik;Kim, Hee Woong
    • The Journal of Information Systems / v.31 no.4 / pp.01-23 / 2022
  • Purpose: This study proposes a deep learning-based fake review detection model that combines authors' linguistic features with the semantic information of reviews. Design/methodology/approach: We used 358,071 Yelp reviews to develop the model. We employed Linguistic Inquiry and Word Count (LIWC) to extract 24 linguistic features of authors, and then used deep learning architectures such as the multilayer perceptron (MLP), long short-term memory (LSTM), and transformer to learn linguistic and semantic features for fake review detection. Findings: Detection models using both linguistic and semantic features outperformed models using a single type of feature. The study also confirmed that the differences in linguistic features between fake and authentic reviewers are significant. That is, linguistic features complement the semantic information of reviews and further enhance the predictive power of the fake review detection model.

Quantitative Linguistic Analysis on Literary Works

  • Choi, Kyung-Ho
    • Journal of the Korean Data and Information Science Society / v.18 no.4 / pp.1057-1064 / 2007
  • From the viewpoint of natural language processing, quantitative linguistic analysis is a linguistic study that relies on statistical methods; it is a form of mathematical linguistics that attempts to discover various linguistic characteristics by interpreting linguistic facts quantitatively. In this study, I introduce a quantitative linguistic analysis method that applies a computer and statistical methods to literary works. I also introduce the use of SynKDP, a synthesized Korean data process, and show the relation between the distribution of the linguistic unit elements used by the hero of the novel Sassinamjunggi and the thematic analysis of literary works.

Linguistic Modeling for Multilingual Machine Translation based on Common Transfer (공통변환 기반 다국어 자동번역을 위한 언어학적 모델링)

  • Choi, Sungkwon;Kim, Younggil
    • Language and Information / v.18 no.1 / pp.77-97 / 2014
  • Multilingual machine translation is machine translation for more than two languages. Common transfer is transfer in which the transfer rules can be reused among similar languages according to linguistic typology. Multilingual machine translation based on common transfer is therefore multilingual machine translation that shares transfer rules among languages with similar linguistic typology. This paper describes the linguistic modeling, currently under development, for multilingual machine translation based on common transfer. The modeling consists of linguistic devices such as 1) a multilingual common part-of-speech set, 2) a multilingual common transfer format, 3) multilingual common transfer chunking, and 4) multilingual common transfer rules based on linguistic typology. The validity of this linguistic modeling is shown in a simulation. A multilingual machine translation system based on common transfer that covers Korean, English, Chinese, Spanish, and French is to be developed by 2018.

Sentiment Analysis of Korean Using Effective Linguistic Features and Adjustment of Word Senses

  • Jang, Ha-Yeon;Shin, Hyo-Pil
    • Language and Information / v.14 no.2 / pp.33-46 / 2010
  • This paper introduces a new linguistically focused approach to sentiment analysis (SA) of Korean. To overcome the shortcomings of previous work, which focused mainly on statistical methods, we made effective use of various linguistic features that reflect the nature of Korean. These features include contextual shifters, modal affixes, and the morphological dependency of chunk structures. Moreover, to avoid possible confusion caused by ambiguous words and to improve the results of SA, we also propose simple methods for adjusting word senses using KOLON ontology mapping information. Based on our experiments, we contend that effective use of linguistic features and ontological information can improve the results of sentiment analysis of Korean.

Automatic Mapping Between Large-Scale Heterogeneous Language Resources for NLP Applications: A Case of Sejong Semantic Classes and KorLexNoun for Korean

  • Park, Heum;Yoon, Ae-Sun
    • Language and Information / v.15 no.2 / pp.23-45 / 2011
  • This paper proposes a statistics-based linguistic methodology for automatic mapping between large-scale heterogeneous language resources for NLP applications in general. As a particular case, it treats automatic mapping between two large-scale heterogeneous Korean language resources: Sejong Semantic Classes (SJSC) in the Sejong Electronic Dictionary (SJD) and nouns in KorLex. KorLex is a large-scale Korean WordNet, but it lacks syntactic information. SJD contains refined semantic-syntactic information, with semantic labels that depend on SJSC, but its list of entry words is much smaller than that of KorLex. The goal of our study is to build a rich language resource by integrating the useful information in SJD into KorLex. We use both linguistic and statistical methods to construct the automatic mapping methodology. The linguistic aspect focuses on three linguistic clues: monosemy/polysemy of word forms, instances (example words), and semantically related words. The statistical aspect uses three statistical formulae, $\chi^2$, Mutual Information, and Information Gain, to obtain candidate synsets. Measured against manual mapping, the automatic mapping based on the proposed statistical linguistic methods performs well in terms of correctness, with recall 0.838, precision 0.718, and F1 0.774.

Crowdfunding Scams: The Profiles and Language of Deceivers

  • Lee, Seung-hun;Kim, Hyun-chul
    • Journal of the Korea Society of Computer and Information / v.23 no.3 / pp.55-62 / 2018
  • In this paper, we propose a model that detects crowdfunding scams, which have reportedly been occurring over the last several years, based on project information and linguistic features. To this end, we first collect and analyze crowdfunding scam projects, and then identify which project-related information and linguistic features are particularly useful for distinguishing scam projects from non-scams. The proposed model, built with the selected features and the Random Forest machine learning algorithm, detects scam campaigns with 84.46% accuracy.

The deduction of objective linguistic information using statistical methods - The grouping of the possibility of interdisciplinary research (통계적 방법을 활용한 객관적 언어정보 도출 - 학제적 연구의 가능성 모색)

  • Choi, Kyoung-Ho;Lee, Yong-Wook
    • Journal of the Korean Data and Information Science Society / v.22 no.1 / pp.49-55 / 2011
  • Many fields are attempting to unite through consilience, and interdisciplinary research is one instance of this. The field of linguistic study called linguistic informatics, or quantitative linguistics, is an interdisciplinary field between statistics and linguistics that has been studied chiefly by linguists. From a statistical standpoint, the results of research by linguists need some supplementing. This study shows that statistical methods can compensate for insufficient objectivity in linguistic studies, and examines ways to raise the degree of completion of interdisciplinary research on statistics and linguistics. It also shows that introducing and applying statistical methods can be useful for deriving objective linguistic information in linguistic studies.

Development of Linguistic Contents for Contextual Dialogue

  • Moon, Sang-Ho
    • Journal of information and communication convergence engineering / v.8 no.1 / pp.116-121 / 2010
  • New teaching and studying methods that use educational contents are gradually becoming widespread with the advancement of information and communication technology. In this paper, we design and implement linguistic contents, as a form of educational contents, for studying essential expressions that apply to various real-life situations. Specifically, the linguistic contents run in web environments and provide suitable animations for learning essential expressions in several foreign languages through contextual dialogues. The contents also include useful functions to reinforce what users have learned.

Effectual Fuzzy Query Evaluation Method based on Fuzzy Linguistic Matrix in Information Retrieval (정보검색에서 퍼지 언어 매트릭스에 근거한 효율적인 퍼지 질의 평가 방법)

  • 최명복;김민구
    • Journal of the Korean Institute of Intelligent Systems / v.10 no.3 / pp.218-227 / 2000
  • In this paper, we present a new fuzzy information retrieval method based on a thesaurus. In the proposed method, the thesaurus is represented by a fuzzy linguistic matrix whose elements represent qualitative linguistic values between terms. The fuzzy linguistic matrix contains three kinds of fuzzy relationships between terms: the similar relation, the hierarchical relation, and the associative relation. The implicit fuzzy relationships between terms are inferred by the transitive closure of the fuzzy linguistic matrix, based on fuzzy theory. The proposed method can handle qualitative linguistic weights in a query and in the indexing of information items, so that it reflects qualitative human measures based on vague and uncertain decisions rather than a quantitative measure. The proposed method is therefore more flexible than the ones presented in papers[1-3]. Moreover, our method is more time-efficient than the ones presented in papers[1-3] because we use a fuzzy linguistic matrix and AON (Associate Ordinary Number) values in the query evaluation process. As a result, the proposed method allows users to perform fuzzy queries in a more flexible and more intelligent manner.
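
The abstract above infers implicit relationships between terms through the transitive closure of a fuzzy matrix. As a minimal sketch of that general idea only, the code below computes a max-min transitive closure of a square fuzzy relation. It assumes numeric membership degrees in [0, 1] rather than the qualitative linguistic values and AON values the paper uses, and the function names and sample terms are hypothetical.

```python
def max_min_compose(r, s):
    """Max-min composition of two n x n fuzzy relations given as nested lists."""
    n = len(r)
    return [
        [max(min(r[i][k], s[k][j]) for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def transitive_closure(relation):
    """Max-min transitive closure: repeat C <- max(C, C o C) until it is stable.

    The loop terminates because every entry is drawn from the finite set of
    values in the input matrix and entries never decrease.
    """
    closure = [row[:] for row in relation]
    while True:
        composed = max_min_compose(closure, closure)
        updated = [
            [max(a, b) for a, b in zip(row_c, row_p)]
            for row_c, row_p in zip(closure, composed)
        ]
        if updated == closure:
            return closure
        closure = updated


# Hypothetical thesaurus over three terms: index 0 = "car", 1 = "vehicle", 2 = "truck".
# "car" and "truck" have no explicit relation (degree 0.0).
thesaurus = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.6],
    [0.0, 0.6, 1.0],
]
closed = transitive_closure(thesaurus)
```

In this example the closure assigns "car" and "truck" an implicit degree equal to the weaker of the two links through "vehicle", which is how a relation that was never stated explicitly can still contribute to query evaluation.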