• Title/Summary/Keyword: word selection

Ranking Translation Word Selection Using a Bilingual Dictionary and WordNet

  • Kim, Kweon-Yang;Park, Se-Young
    • Journal of the Korean Institute of Intelligent Systems / v.16 no.1 / pp.124-129 / 2006
  • This paper presents a method for ranking translation word selections for Korean verbs, based on lexical knowledge contained in a bilingual Korean-English dictionary and WordNet, both easily obtainable knowledge resources. We focus on deciding which translation of the target word is most appropriate by measuring semantic relatedness, through the 45 extended relations, between the possible translations of the target word and indicative clue words that serve as predicate arguments in the source-language text. To reduce the weight given to possibly unwanted senses, we rank the possible word senses for each translation word by measuring the semantic similarity between the translation word and its near synonyms. We report an average accuracy of $51\%$ on ten ambiguous Korean verbs. The evaluation suggests that our approach outperforms the default baseline and previous work.

The Analysis of a Causal Relationship of Traditional Korean Restaurant's Well-Being Attribute Selection on Customers' Re-Visitation and Word-of-Mouth

  • Baek, Hang-Sun;Shin, Chung-Sub;Lee, Sang-Youn
    • East Asian Journal of Business Economics (EAJBE) / v.4 no.2 / pp.48-60 / 2016
  • This study analyzes the effects of a restaurant's well-being attribute selection on word-of-mouth intention. Based on the results, it aims to provide basic data for establishing service and marketing strategies for Korean restaurants. The researchers surveyed 350 customers who visited a Korean restaurant located in Kangbook, Seoul. The gathered data were coded and analyzed using the SPSS 17.0 statistics package. The results are as follows. First, under hypothesis 1 (a Korean restaurant's well-being attribute selection will have a positive influence on re-visitation intention), sufficiency, healthiness, and steadiness were shown to have similar influence on re-visitation intention. Second, under hypothesis 2 (a Korean restaurant's well-being attribute selection will have a positive influence on word-of-mouth intention), sufficiency, healthiness, environment, and steadiness were shown to have similar influence on word-of-mouth intention. Third, under hypothesis 3 (a Korean restaurant's re-visitation intention will have a positive influence on word-of-mouth intention), eliciting customers' re-visitation intention was considered to also influence word-of-mouth intention. It will be necessary to consider how to elicit customers' re-visitation and word-of-mouth intention by taking into account the factors that customers of traditional Korean restaurants value.

Microblog User Geolocation by Extracting Local Words Based on Word Clustering and Wrapper Feature Selection

  • Tian, Hechan;Liu, Fenlin;Luo, Xiangyang;Zhang, Fan;Qiao, Yaqiong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.10 / pp.3972-3988 / 2020
  • Existing methods rely on statistical features to extract local words for microblog user geolocation. The extracted words include many non-local words, which lowers geolocation accuracy. Considering both the statistical and semantic features of local words, this paper proposes a microblog user geolocation method that extracts local words based on word clustering and wrapper feature selection. First, ordinary words without positional indications are filtered out based on statistical features. Second, a word clustering algorithm based on word vectors is proposed: the remaining semantically similar words are clustered together according to the distance between their word vectors. Next, a wrapper feature selection algorithm based on sequential backward subset search is proposed: the cluster subset with the best geolocation effect is selected, and the words in the selected cluster subset are extracted as local words. Finally, a Naive Bayes classifier is trained on the local words to geolocate the microblog user. The proposed method is validated on two different types of microblog data: Twitter and Weibo. The results show that the proposed method outperforms two typical existing methods based on statistical features in terms of accuracy, precision, recall, and F1-score.
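As an illustration of the sequential backward subset search mentioned in the abstract above, the following is a minimal sketch and not the authors' implementation. It starts from all clusters and greedily drops the one whose removal most improves a score. The function name `sequential_backward_selection`, the cluster labels, and the toy scoring function (a stand-in for geolocation accuracy on held-out data) are all assumptions made for this example.

```python
from typing import Callable, FrozenSet, Iterable, Tuple


def sequential_backward_selection(
    candidates: Iterable[str],
    score: Callable[[FrozenSet[str]], float],
) -> Tuple[FrozenSet[str], float]:
    """Greedy backward search: repeatedly drop the item whose removal
    raises the score the most; stop when no removal helps."""
    current = frozenset(candidates)
    best_score = score(current)
    improved = True
    while improved and len(current) > 1:
        improved = False
        best_subset = current
        for item in sorted(current):  # sorted for deterministic tie-breaking
            trial = current - {item}
            trial_score = score(trial)
            if trial_score > best_score:
                best_score, best_subset = trial_score, trial
                improved = True
        current = best_subset
    return current, best_score


# Hypothetical scoring function: in a real wrapper this would train and
# evaluate a classifier using only the words in the chosen clusters.
USEFUL = {"c1", "c3"}


def toy_score(subset: FrozenSet[str]) -> float:
    useful = len(subset & USEFUL)
    noise = len(subset - USEFUL)
    return useful - 0.5 * noise


selected, best = sequential_backward_selection(["c1", "c2", "c3", "c4"], toy_score)
```

In this toy run the search discards the two noise clusters and keeps `c1` and `c3`. Because a wrapper method re-evaluates the model for every trial subset, the cost grows with the number of clusters, which is one reason to filter and cluster words before searching.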

Target Word Selection for English-Korean Machine Translation System using Multiple Knowledge (다양한 지식을 사용한 영한 기계번역에서의 대역어 선택)

  • Lee, Ki-Young;Kim, Han-Woo
    • Journal of the Korea Society of Computer and Information / v.11 no.5 s.43 / pp.75-86 / 2006
  • Target word selection is one of the most important and difficult tasks in English-Korean machine translation, and it affects the translation accuracy of machine translation systems. In this paper, we present a new approach to selecting the Korean target word for an English noun with translation ambiguities, using multiple kinds of knowledge: verb frame patterns, sense vectors based on collocations, statistical Korean local context information, and co-occurring POS information. Verb frame patterns, constructed from a dictionary and a corpus, play an important role in resolving the sparseness problem of collocation data. A sense vector is a set of collocation data for the case in which an English word with target selection ambiguities is translated to a specific Korean target word. Statistical Korean local context information is N-gram information generated from a Korean corpus. Co-occurring POS information is a statistically significant POS clue that appears with the ambiguous word. The experiment showed promising results for diverse sentences from web documents.

Investigating Opinion Mining Performance by Combining Feature Selection Methods with Word Embedding and BOW (Bag-of-Words) (속성선택방법과 워드임베딩 및 BOW (Bag-of-Words)를 결합한 오피니언 마이닝 성과에 관한 연구)

  • Eo, Kyun Sun;Lee, Kun Chang
    • Journal of Digital Convergence / v.17 no.2 / pp.163-170 / 2019
  • Over the past decade, the development of the Web has caused an explosive increase in data. Feature selection is an important step in extracting valuable data from a large amount of data. This study proposes a novel opinion mining model that combines feature selection (FS) methods with Word2vec word embeddings and BOW (Bag-of-Words). The FS methods adopted for this study are CFS (correlation-based FS) and IG (information gain). To select an optimal FS method, a number of classifiers were used, ranging from LR (logistic regression), NN (neural network), and NBN (naive Bayesian network) to RF (random forest), RS (random subspace), and ST (stacking). Empirical results with electronics and kitchen datasets showed that the LR and ST classifiers combined with IG applied to BOW features yield the best performance in opinion mining. Results with laptop and restaurant datasets revealed that the RF classifier using IG applied to Word2vec features gives the best performance in opinion mining.

Differential effects of online word-of-mouth about attractive and one-dimensional Kano attributes on hospital selection (온라인 입소문이 병원선택에 미치는 영향의 카노속성에 따른 차이)

  • Kim, Sujung;Kim, Junyong
    • Korea Journal of Hospital Management / v.27 no.3 / pp.1-14 / 2022
  • Purposes: The purpose of this study was to examine how much online word of mouth influences customers' hospital selection according to Kano's model. Methodology: Kano classified the attributes that affect customer satisfaction into attractive, one-dimensional, indifferent, must-be, and reverse attributes. Among them, attractive and one-dimensional attributes make up the largest portion in hospital selection. Based on this, the influence of positive or negative online reviews on the selection of hospitals was investigated. Differentiated service was selected as the attractive attribute, and a kind, sufficient explanation was selected as the one-dimensional attribute. A questionnaire then asked how much positive and negative online reviews, respectively, influence hospital selection. It was conducted from August 7 to September 7, 2021 among medical consumers in their 20s and older who had used medical services in the past 3 years, and the final 142 questionnaires were analyzed. All data were analyzed by chi-square test and two-way ANOVA using SPSS ver 25.0. Findings: The results showed that, for the one-dimensional attribute, the difference between positive and negative reviews was not statistically significant, but for the attractive attribute, positive and negative reviews showed a statistically significant difference. This suggests that positive reviews on attractive attributes had a greater influence on hospital selection. In terms of hospital selection, when the experimental participants were exposed to positive reviews, the hospital selection ratio did not differ by Kano attribute, but when they were exposed to negative reviews, it did. The hospital selection ratio, even after exposure to negative reviews, was higher for the attractive attribute than for the one-dimensional attribute.
Practical Implication: This study confirmed that hospital selection is influenced differently depending on the Kano attribute and the direction of the reviews, and suggests that marketers should respond differently to each Kano attribute when they deal with online reviews of hospitals.

Practical Target Word Selection Using Collocation in English to Korean Machine Translation (영한번역 시스템에서 연어 사용에 의한 실용적인 대역어 선택)

  • 김성묵
    • Journal of Korea Society of Industrial Information Systems / v.5 no.2 / pp.56-61 / 2000
  • The quality of English-to-Korean machine translation depends on how well it handles target word selection for verbs, which are highly ambiguous. Verb sense disambiguation can be done using collocations, but constructing verb collocations takes considerable effort and expense. Existing methods should therefore be examined from a practical point of view. This paper describes a practical method of target word selection using existing collocations and a semantic distance computed from minimal semantic features of nouns.

English Bible Text Visualization Using Word Clouds and Dynamic Graphics Technology (단어 구름과 동적 그래픽스 기법을 이용한 영어성경 텍스트 시각화)

  • Jang, Dae-Heung
    • The Korean Journal of Applied Statistics / v.27 no.3 / pp.373-386 / 2014
  • A word cloud is a visualization of word frequency in a given text. The importance of each word is shown by font size or color. This plot is useful for quickly perceiving the most prominent words and for locating a word alphabetically to determine its relative prominence. With dynamic graphics, we can find the changing pattern of prominent words and their frequencies as the selection of chapters in a given text changes. We can also define a word frequency matrix, in which rows are the chapters of the text and columns are the ranks of the words by frequency in the text, and draw a word frequency matrix plot from it. Dynamic graphics can show how the word frequency matrix changes as the selected range of word ranks changes. We carry out an English Bible text visualization using word clouds and dynamic graphics technology.
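The word frequency matrix described in the abstract above can be sketched as follows. This is one possible reading, not the paper's code: it assumes that column k holds the per-chapter count of the word with overall frequency rank k, that tokens are split on whitespace, and that ties in rank are broken alphabetically. The function name `word_frequency_matrix` and the sample chapters are hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def word_frequency_matrix(
    chapters: List[str], top_k: int
) -> Tuple[List[str], List[List[int]]]:
    """Return the top_k words by overall frequency and a matrix whose
    rows are chapters and whose columns follow that frequency ranking."""
    # Count words per chapter (lower-cased, whitespace tokenization).
    per_chapter = [Counter(chapter.lower().split()) for chapter in chapters]

    # Overall counts determine the rank order of the columns.
    total: Counter = Counter()
    for counts in per_chapter:
        total.update(counts)
    ranked = sorted(total.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
    ranked_words = [word for word, _ in ranked]

    # Counter returns 0 for words that do not occur in a chapter.
    matrix = [[counts[word] for word in ranked_words] for counts in per_chapter]
    return ranked_words, matrix


words, matrix = word_frequency_matrix(
    ["light and light and day", "day and night"], top_k=3
)
# words  == ["and", "day", "light"]
# matrix == [[2, 1, 2], [1, 1, 0]]
```

Selecting a different range of ranks, as the dynamic graphics in the abstract do, would correspond to slicing the columns of this matrix.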

A Study on Statistical Feature Selection with Supervised Learning for Word Sense Disambiguation (단어 중의성 해소를 위한 지도학습 방법의 통계적 자질선정에 관한 연구)

  • Lee, Yong-Gu
    • Journal of the Korean BIBLIA Society for library and Information Science / v.22 no.2 / pp.5-25 / 2011
  • This study aims to identify the most effective statistical feature selection method and context window size for word sense disambiguation using supervised methods. Features were selected by four different methods: information gain, document frequency, chi-square, and relevancy. The comparison of weights showed that identifying the most appropriate features could improve word sense disambiguation performance, with information gain scoring highest. The SVM classifier was not affected by feature selection and performed better with a larger feature set and context size. The Naive Bayes classifier performed best with 10 percent of the feature set, and the kNN classifier with under 10 percent of the feature set. When feature selection methods are applied to word sense disambiguation, combining a small feature set with a larger context window, or a large feature set with a small context window, yields the best performance improvements.
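Information gain, one of the four selection methods named in the abstract above, can be sketched as follows. This is a minimal textbook formulation and not the study's code: it assumes each context is a set of words, each feature is the presence or absence of one word, and the function names, contexts, and sense labels are invented for illustration.

```python
import math
from collections import Counter
from typing import List, Set


def entropy(labels: List[str]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(contexts: List[Set[str]], senses: List[str], word: str) -> float:
    """Reduction in sense entropy from knowing whether `word` is present."""
    present = [s for ctx, s in zip(contexts, senses) if word in ctx]
    absent = [s for ctx, s in zip(contexts, senses) if word not in ctx]
    n = len(senses)
    # Weighted entropy of the two partitions; empty partitions contribute 0.
    conditional = sum(len(p) / n * entropy(p) for p in (present, absent) if p)
    return entropy(senses) - conditional


# Hypothetical contexts for an ambiguous word such as "bank".
contexts = [{"river", "water"}, {"river", "shore"}, {"money", "loan"}, {"money", "water"}]
senses = ["river", "river", "finance", "finance"]

ig_river = information_gain(contexts, senses, "river")  # separates the senses fully
ig_water = information_gain(contexts, senses, "water")  # carries no sense information
```

Ranking all candidate words by this value and keeping the top fraction gives a reduced feature set of the kind the abstract compares across classifiers.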

A Study on Word Selection Method and Device Improvement for Improving Speech Recognition Rate of Speech-Language-impaired in Severe Noise Environment (심한 소음환경에서 언어장애인 음성 인식률 향상을 위한 단어선정 방법 및 장치 개선에 관한 연구)

  • Yang, Ki-Woong;Lee, Hyung-keun
    • Journal of the Korea Institute of Information and Communication Engineering / v.23 no.5 / pp.555-567 / 2019
  • The speech recognition rate drops in noisy environments, which makes speech recognition difficult for people with speech or language impairments to use in daily life. To reduce this inconvenience, 280 words were selected using an improved word selection method that considers the pronunciation characteristics of people with language impairments. The MEMS device used in the experiment was built with consideration for material, lead wire type, length, and direction. Using the developed word selection method and MEMS device, we improved a speech recognition rate that had been degraded by incorrect pronunciation and severe noise. The paper presents the improved word selection method and MEMS device together with the results.