• Title/Summary/Keyword: high-frequency vocabulary

Search Result 23, Processing Time 0.018 seconds

A study on Korean tourism trends using social big data -Focusing on sentiment analysis- (소셜 빅데이터를 활용한 한국관광 트렌드에 관한연구 -감성분석을 중심으로-)

  • Youn-hee Choi;Kyoung-mi Yoo
    • The Journal of the Convergence on Culture Technology
    • /
    • v.10 no.3
    • /
    • pp.97-109
    • /
    • 2024
  • In the field of domestic tourism, tourism trend analysis of tourism consumers, both international tourists and domestic tourists, is essential not only for the Korean tourism market but also for local and governmental tourism policy makers. e will explore the keywords and sentiment analysis on social media to establish a marketing strategy plan and revitalize the domestic tourism industry through communication and information from tourism consumers. This study utilized TEXTOM 6.0 to analyze recent trends in Korean tourism. Data was collected from September 31, 2022, to August 31, 2023, using 'Korean tourism' and 'domestic tourism' as keywords, targeting blogs, cafes, and news provided by Naver, Daum, and Google. Through text mining, 100 key words and TF-IDF were extracted in order of frequency, and then CONCOR analysis and sentiment analysis were conducted. For Korean tourism keywords, words related to tourist destinations, travel companions and behaviors, tourism motivations and experiences, accommodation types, tourist information, and emotional connections ranked high. The results of the CONCOR analysis were categorized into five clusters related to tourist destinations, tourist information, tourist activities/experiences, tourism motivation/content, and inbound related. Finally, the sentiment analysis showed a high level of positive documents and vocabulary. This study analyzes the rapidly changing trends of Korean tourism through text mining on Korean tourism and is expected to provide meaningful data to promote domestic tourism not only for Koreans but also for foreigners visiting Korea.

A New Approach to Automatic Keyword Generation Using Inverse Vector Space Model (키워드 자동 생성에 대한 새로운 접근법: 역 벡터공간모델을 이용한 키워드 할당 방법)

  • Cho, Won-Chin;Rho, Sang-Kyu;Yun, Ji-Young Agnes;Park, Jin-Soo
    • Asia pacific journal of information systems
    • /
    • v.21 no.1
    • /
    • pp.103-122
    • /
    • 2011
  • Recently, numerous documents have been made available electronically. Internet search engines and digital libraries commonly return query results containing hundreds or even thousands of documents. In this situation, it is virtually impossible for users to examine complete documents to determine whether they might be useful for them. For this reason, some on-line documents are accompanied by a list of keywords specified by the authors in an effort to guide the users by facilitating the filtering process. In this way, a set of keywords is often considered a condensed version of the whole document and therefore plays an important role for document retrieval, Web page retrieval, document clustering, summarization, text mining, and so on. Since many academic journals ask the authors to provide a list of five or six keywords on the first page of an article, keywords are most familiar in the context of journal articles. However, many other types of documents could not benefit from the use of keywords, including Web pages, email messages, news reports, magazine articles, and business papers. Although the potential benefit is large, the implementation itself is the obstacle; manually assigning keywords to all documents is a daunting task, or even impractical in that it is extremely tedious and time-consuming requiring a certain level of domain knowledge. Therefore, it is highly desirable to automate the keyword generation process. There are mainly two approaches to achieving this aim: keyword assignment approach and keyword extraction approach. Both approaches use machine learning methods and require, for training purposes, a set of documents with keywords already attached. In the former approach, there is a given set of vocabulary, and the aim is to match them to the texts. In other words, the keywords assignment approach seeks to select the words from a controlled vocabulary that best describes a document. Although this approach is domain dependent and is not easy to transfer and expand, it can generate implicit keywords that do not appear in a document. On the other hand, in the latter approach, the aim is to extract keywords with respect to their relevance in the text without prior vocabulary. In this approach, automatic keyword generation is treated as a classification task, and keywords are commonly extracted based on supervised learning techniques. Thus, keyword extraction algorithms classify candidate keywords in a document into positive or negative examples. Several systems such as Extractor and Kea were developed using keyword extraction approach. Most indicative words in a document are selected as keywords for that document and as a result, keywords extraction is limited to terms that appear in the document. Therefore, keywords extraction cannot generate implicit keywords that are not included in a document. According to the experiment results of Turney, about 64% to 90% of keywords assigned by the authors can be found in the full text of an article. Inversely, it also means that 10% to 36% of the keywords assigned by the authors do not appear in the article, which cannot be generated through keyword extraction algorithms. Our preliminary experiment result also shows that 37% of keywords assigned by the authors are not included in the full text. This is the reason why we have decided to adopt the keyword assignment approach. In this paper, we propose a new approach for automatic keyword assignment namely IVSM(Inverse Vector Space Model). The model is based on a vector space model. which is a conventional information retrieval model that represents documents and queries by vectors in a multidimensional space. IVSM generates an appropriate keyword set for a specific document by measuring the distance between the document and the keyword sets. The keyword assignment process of IVSM is as follows: (1) calculating the vector length of each keyword set based on each keyword weight; (2) preprocessing and parsing a target document that does not have keywords; (3) calculating the vector length of the target document based on the term frequency; (4) measuring the cosine similarity between each keyword set and the target document; and (5) generating keywords that have high similarity scores. Two keyword generation systems were implemented applying IVSM: IVSM system for Web-based community service and stand-alone IVSM system. Firstly, the IVSM system is implemented in a community service for sharing knowledge and opinions on current trends such as fashion, movies, social problems, and health information. The stand-alone IVSM system is dedicated to generating keywords for academic papers, and, indeed, it has been tested through a number of academic papers including those published by the Korean Association of Shipping and Logistics, the Korea Research Academy of Distribution Information, the Korea Logistics Society, the Korea Logistics Research Association, and the Korea Port Economic Association. We measured the performance of IVSM by the number of matches between the IVSM-generated keywords and the author-assigned keywords. According to our experiment, the precisions of IVSM applied to Web-based community service and academic journals were 0.75 and 0.71, respectively. The performance of both systems is much better than that of baseline systems that generate keywords based on simple probability. Also, IVSM shows comparable performance to Extractor that is a representative system of keyword extraction approach developed by Turney. As electronic documents increase, we expect that IVSM proposed in this paper can be applied to many electronic documents in Web-based community and digital library.

A Study on Classic Fashion Image and Sensible Vocabularies - Focusing on Women of Baby Boom and Y Generations - (클래식 패션 이미지와 감성 어휘 연구 - 베이비붐, Y세대 여성을 중심으로 -)

  • Sang, Yoon-Jin;Yoo, Jung-Min;Park, Minjung;Lee, Inseong
    • Journal of the Korea Fashion and Costume Design Association
    • /
    • v.17 no.3
    • /
    • pp.85-98
    • /
    • 2015
  • Modern fashion shows the trend of various styles and the period focusing on only product functions is changed to the period focusing on consumer's sensibility. Consumers show different sensitivities and preference by individual at the stage cognizing and recognizing the stimulation of given image and the method of objective measurement based on the fashion sensible vocabularies is necessary to measure fashion sensibility. Therefore, this research is significant to examine differences of preference to classic fashion by generation and awareness for sensible vocabularies and suggest methodology of design sensible evaluation research through the quantitative evaluation objectifying subjective sensibility. For the method of research, precedent theses related to classic, concept and characteristics of classic in books and definition and characteristics by generation were examined, the best 3 domestic portal sites were selected and adjective vocabularies and images related to classic were collected from 2010 to 2014. Among the 206 adjectives collected, vocabularies whose average is more than 3.5 were drawn by 5-point Likert scale for fashion expert group. And, among the total 306 images collected, 21 representative images were selected by preliminary investigation of fashion expert group. For the classic images and vocabularies selected, frequency analysis, factor analysis and variance analysis were conducted by SPSS 19.0. The results of analysis are as follows. Preference to classic fashion image by generation was analyzed. As a result, both of two generations selected classic fashion as the most classic one. The images of the next orders were analyzed. As a result, Y generation selected basic classic fashion image which is casual with high activity as a classic one. Baby boom generation selected ancient classic fashion image, so there were differences in preference for classic by generation. As a factor analysis on classic adjective vocabularies, they could be divided into 5 factors such as basic form, attractive form, traditional form, vintage form and active form and they verified that credibility of all measuring variables for classic sensible vocabularies was achieved. Differences of classic sensible vocabularies by classic fashion image and generation were examined. As a result, generation and classic fashion image made a significant effect on five factors. Therefore, there were differences of the awareness on classic fashion images and sensible vocabularies among the generations and this thesis can be a fundamental material which objectifies subjective sensibility and suggests the methodology of new research.

  • PDF