• Title/Summary/Keyword: Word Frequency


A Study on Optimization of Support Vector Machine Classifier for Word Sense Disambiguation (단어 중의성 해소를 위한 SVM 분류기 최적화에 관한 연구)

  • Lee, Yong-Gu
    • Journal of Information Management / v.42 no.2 / pp.193-210 / 2011
  • This study varied the context window size and the weighting method to find the best-performing configuration for word sense disambiguation with a support vector machine. The context windows tested around the target word were a 3-word window, the sentence, a 50-byte window, and the whole document. The weighting methods tested were Binary, Term Frequency (TF), TF $\times$ Inverse Document Frequency (IDF), and Log TF $\times$ IDF. Among the context window sizes, the 50-byte window performed best. Among the weighting methods, Binary showed the best performance.
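
The four weighting methods named in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the abstract gives no formulas, so the IDF variant `log(N / df)`, the log-TF variant `1 + log(tf)`, the function name `weight_vectors`, and the toy contexts are all assumptions.

```python
import math
from collections import Counter


def weight_vectors(docs):
    """Compute four term weightings for each context window.

    docs: list of token lists, one list per context window.
    Assumed formulas (not taken from the paper):
      idf        = log(N / df)
      log_tf_idf = (1 + log(tf)) * idf
    """
    n_docs = len(docs)
    # Document frequency: number of contexts that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n_docs / count) for term, count in df.items()}

    weighted = []
    for doc in docs:
        tf = Counter(doc)
        weighted.append({
            term: {
                "binary": 1,                      # presence only
                "tf": freq,                       # raw count
                "tf_idf": freq * idf[term],
                "log_tf_idf": (1 + math.log(freq)) * idf[term],
            }
            for term, freq in tf.items()
        })
    return weighted


# Hypothetical contexts around the ambiguous word "bank".
contexts = [
    ["bank", "river", "water", "bank"],
    ["bank", "money", "loan"],
    ["river", "water", "fish"],
]
weights = weight_vectors(contexts)
print(weights[0]["bank"])
```

Binary weighting discards counts entirely, which is one plausible reason it can hold up well on very short context windows where most terms occur only once.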

The Impact of Word of Mouth on Customer Perceived Value for the Malaysian Restaurant Industry

  • Oluwafemi, Adebusoye Shedrack;Dastane, Omkar
    • Asian Journal of Business Environment / v.6 no.3 / pp.21-31 / 2016
  • Purpose - This research determines the impact of word of mouth (WoM) on customer perceived value for restaurants in Malaysia. Its objectives include determining how five WoM factors (frequency of messages, reputation of the messenger, richness of the message, dispersion of conversations, and manner of delivery) affect customer perceived value in the Malaysian restaurant industry. Research Design, Data, and Methodology - The research follows a causal/explanatory method based on quantitative data. A sample of 150 restaurant customers in Kuala Lumpur, Malaysia was selected by convenience sampling. A Likert-scale questionnaire was used to collect data, which were analysed by regression analysis in SPSS 22. Results - The statistical analysis revealed that the independent variable 'manner of delivery' significantly and positively affects customer perceived value for restaurants in Malaysia. Conclusions - To build strong positive customer perception, Malaysian restaurants can enhance the 'manner of delivery' of their word of mouth campaigns by making them passionate, exciting, and high in emotional appeal.

A study on Korean language processing using TF-IDF (TF-IDF를 활용한 한글 자연어 처리 연구)

  • Lee, Jong-Hwa;Lee, MoonBong;Kim, Jong-Weon
    • The Journal of Information Systems / v.28 no.3 / pp.105-121 / 2019
  • Purpose: One reason for the expansion of information systems in the enterprise is the increased efficiency of data analysis, in particular for the rapidly growing volume of complex, unstructured data such as video, voice, images, and conversations in and out of social networks. The purpose of this study is to analyze customer needs from customer voices, i.e., text data, in the web environment. Design/methodology/approach: According to previous studies, extracting the words that interpret a sentence is more effective than simple frequency analysis. In this study, for Korean language research, we applied the TF-IDF method, which extracts important keywords from real sentences, rather than the TF method, a word extraction technique that represents sentences by simple frequency only. We visualized the two techniques by cluster analysis and describe the differences. Findings: Applying the TF and TF-IDF techniques to Korean natural language processing, the research showed the value of moving from frequency analysis to semantic analysis, and it is expected to change the techniques used by Korean language processing researchers.

Research Trend Analysis of the Retail Industry: Focusing on the Department Store (유통업태 연구동향 분석: 백화점을 중심으로)

  • Hoe-Chang YANG
    • The Journal of Economics, Marketing and Management / v.11 no.5 / pp.45-55 / 2023
  • Purpose: As one of the continuous studies on the offline distribution industry, the purpose of this study is to find ways for offline stores to respond to the growth of online shopping by identifying research trends on department stores. Research design, data and methodology: To this end, this study conducted word frequency analysis, word co-occurrence frequency analysis, BERTopic, LDA, and dynamic topic modeling using Python 3.7 on a total of 551 English abstracts searched with the keyword 'department store' in scienceON as of October 10, 2022. Results: The results of word frequency analysis and co-occurrence frequency analysis revealed that research related to department stores frequently focuses on factors such as customers, consumers, products, satisfaction, services, and quality. BERTopic and LDA analyses identified five topics, including 'store image,' with 'shopping information' showing relatively high interest, while 'sales systems' were observed to have relatively lower interest. Conclusions: Based on the results of this study, it was concluded that research related to department stores has so far been conducted in a limited scope, and it is insufficient to provide clues for department stores to secure competitiveness against online platforms. Therefore, it is suggested that additional research be conducted on topics such as the true role of department stores in the retail industry, consumer reinterpretation, customer value and lifetime value, department stores as future retail spaces, ethical management, and transparent ESG management.
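
The word frequency and word co-occurrence frequency analyses mentioned in this abstract might look roughly like the sketch below. It assumes that two words co-occur when they appear in the same abstract and that each pair is counted once per abstract; the abstract does not state the unit of co-occurrence, and the function name and toy token lists are hypothetical.

```python
from collections import Counter
from itertools import combinations


def word_and_pair_counts(abstracts):
    """Count word frequencies and per-abstract word co-occurrences.

    abstracts: list of token lists, one list per abstract.
    """
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in abstracts:
        word_counts.update(tokens)
        # Sort the distinct words so each unordered pair has one key,
        # and count every pair at most once per abstract.
        unique = sorted(set(tokens))
        pair_counts.update(combinations(unique, 2))
    return word_counts, pair_counts


# Hypothetical, already-tokenized abstracts.
abstracts = [
    ["customer", "service", "quality", "customer"],
    ["customer", "satisfaction", "service"],
    ["product", "quality"],
]
word_counts, pair_counts = word_and_pair_counts(abstracts)
print(word_counts.most_common(3))
print(pair_counts.most_common(3))
```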

Extracting High-Frequency Optimal Korean Word Set by Word Frequency Statistics (어절 빈도 조사에 의한 최적의 고빈도 어절 집합 추출)

  • Kang, Seung-Shik
    • Annual Conference on Human and Language Technology / 2001.10d / pp.85-88 / 2001
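  • Korean eojeol (space-delimited word unit) frequencies were surveyed in three raw corpora of 15 million, 7 million, and 100,000 eojeols. The eojeol frequency results for each corpus were compared and analyzed to obtain a set of high-frequency eojeols with high practical value. To verify the usefulness of the high-frequency eojeol set, eojeol hit rates were measured on general documents. As a result, the top 563 high-frequency eojeols achieved a hit rate of 24.5%, 9,484 eojeols achieved 51.5%, and 184,246 eojeols achieved 81.6%.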
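
A hit rate of this kind can be illustrated with a short sketch. It assumes the hit rate is the fraction of tokens in a test document that belong to the high-frequency set, which is one natural reading of the abstract but is not spelled out there. The function names and the toy data are hypothetical.

```python
from collections import Counter


def top_k_words(corpus_tokens, k):
    """Return the set of the k most frequent tokens in a corpus."""
    counts = Counter(corpus_tokens)
    return {word for word, _ in counts.most_common(k)}


def hit_rate(high_freq_set, document_tokens):
    """Fraction of document tokens found in the high-frequency set."""
    if not document_tokens:
        return 0.0
    hits = sum(1 for token in document_tokens if token in high_freq_set)
    return hits / len(document_tokens)


# Toy stand-ins for a corpus and a test document; real eojeols would be
# obtained by splitting Korean text on whitespace.
corpus = "a b a c a b d e a b".split()
high_freq = top_k_words(corpus, 2)
document = "a b c a f".split()
rate = hit_rate(high_freq, document)
print(f"{rate:.1%}")
```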


Design and Implementation of Minutes Summary System Based on Word Frequency and Similarity Analysis (단어 빈도와 유사도 분석 기반의 회의록 요약 시스템 설계 및 구현)

  • Heo, Kanhgo;Yang, Jinwoo;Kim, Donghyun;Bok, Kyoungsoo;Yoo, Jaesoo
    • The Journal of the Korea Contents Association / v.19 no.10 / pp.620-629 / 2019
  • An automated minutes summary system is required to objectively summarize and classify the contents of debates or discussions held for decision making. This paper designs and implements a minutes summary system that uses a word2vec model to complement existing minutes summary systems. The proposed system uses the word2vec model to remove index words during morpheme analysis and to extract representative sentences with common opinions from documents. It automatically classifies documents collected during the meeting process and extracts representative sentences that represent the agenda from among various opinions. The conference host can quickly identify and manage all the agendas discussed at the meeting through the proposed system. The proposed system analyzes the various agendas of large-scale debates or discussions and summarizes sentences that can serve as representative opinions, supporting fast and accurate decision making.

Digital Isolated Word Recognition System based on MFCC and DTW Algorithm (MFCC와 DTW에 알고리즘을 기반으로 한 디지털 고립단어 인식 시스템)

  • Zang, Xian;Chong, Kil-To
    • Proceedings of the KIEE Conference / 2008.10b / pp.290-291 / 2008
  • The most popular speech feature used in speech recognition today is the Mel-Frequency Cepstral Coefficients (MFCC), which reflect the perception characteristics of the human ear more accurately than other parameters. This paper adopts MFCC and its first-order difference, which reflects the dynamic character of the speech signal, as a combined parametric representation. Furthermore, we use the Dynamic Time Warping (DTW) algorithm to search for matching paths in the pattern recognition process. We use the software "GoldWave" to record English digits in a lab environment, and the simulation results indicate that the algorithm has higher recognition accuracy than others that use LPCC, etc. as feature parameters in the experiment for the Digital Isolated Word Recognition (DIWR) system.
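
The DTW matching step and the first-order difference described in this abstract can be sketched as below. This is the textbook dynamic-programming form of DTW with a Euclidean frame distance, not necessarily the variant the authors used. The frames here are toy one-dimensional vectors standing in for MFCC frames, and the function names are hypothetical. `math.dist` requires Python 3.8 or later.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Classic DTW cost between two sequences of feature frames.

    Each frame is a sequence of floats of the same dimension.
    """
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = best cumulative cost aligning seq_a[:i] with seq_b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            frame_dist = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = frame_dist + min(
                cost[i - 1][j],      # seq_a advances, seq_b frame repeats
                cost[i][j - 1],      # seq_b advances, seq_a frame repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def first_order_difference(frames):
    """Frame-to-frame difference, a simple form of delta features."""
    return [
        [cur - prev for prev, cur in zip(prev_frame, cur_frame)]
        for prev_frame, cur_frame in zip(frames, frames[1:])
    ]


# Toy example: the query is the template with its first frame held longer.
template = [[0.0], [1.0], [2.0]]
query = [[0.0], [0.0], [1.0], [2.0]]
print(dtw_distance(query, template))
print(first_order_difference(template))
```

In an isolated-word recognizer built this way, a recorded word would be compared against one stored template per vocabulary word, and the template with the smallest DTW cost would be chosen.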


A Comparative Study on the Occurrence Loci of Disfluency between Neurogenic and Developmental Stuttering (신경인성과 발달성 말더듬의 비유창성 발생 자리에 대한 연구)

  • Shin, Myung-Sun;Kwon, Do-Ha;Yoon, Chi-Yeon
    • Speech Sciences / v.13 no.3 / pp.185-195 / 2006
  • This study aims to clarify disfluency loci in a neurogenic stuttering group and to examine how their characteristics differ from those of a developmental stuttering group. Spoken language samples were collected from 11 adults with developmental stuttering and 11 adults with neurogenic stuttering during speaking tasks including reading, monologue, and conversation. Using the collected samples, the disfluency characteristics of the two groups were investigated by analyzing the adaptation effect, the consistency effect, and the frequency of disfluency occurrence by word position, all of which relate to the occurrence loci of disfluency. The results were as follows. First, while the neurogenic stuttering group showed no adaptation effect, the developmental stuttering group did: the percentage of disfluent words decreased as they read the same material repeatedly. Second, there was no meaningful difference in the consistency effect between the two groups. Third, the neurogenic stuttering group showed a higher frequency of disfluency on word-final sounds than the developmental stuttering group.


A Study on Sound Changes affecting Noun-final Consonant (체언말 자음의 음성적 교체 현상에 대한 연구)

  • Oh, Jea-hyuk;Shin, Ji-young
    • Proceedings of the KSPS conference / 2005.11a / pp.193-198 / 2005
  • The aim of this paper is to examine why nouns with /kh, ph, ts, th/ as the final phoneme changed. Assuming that these changes are related to aspects of word usage, we collected the word frequency and the phonetic forms of the words. The results are as follows: ① The realization of the standard phonetic form is related to the frequency of the case marker that cannot be omitted when combined with the word. ② The change of a coronal consonant into /s/ is related to the case marker [i].


Research Paper Classification Scheme based on Word Embedding (워드 임베딩 기반 연구 논문 분류 기법)

  • Dipto, Biswas;Gil, Joon-Min
    • Proceedings of the Korea Information Processing Society Conference / 2021.11a / pp.494-497 / 2021
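  • Text classification, which sorts large amounts of text data into areas of interest using techniques that extract information from raw text data, has recently attracted attention. This paper proposes a method that uses word embedding to classify and recommend research papers in a specific field. CBOW (Continuous Bag-of-Words) and Sg (Skip-gram) word embeddings are applied to research paper classification, and their performance is compared with that of the existing TF-IDF (Term Frequency-Inverse Document Frequency) approach. The performance evaluation shows that the word-embedding-based research paper classification method outperforms the TF-IDF-based method.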