• Title/Summary/Keyword: Vocabulary Analysis

A Morpheme Analyzer based on Transformer using Morpheme Tokens and User Dictionary (사용자 사전과 형태소 토큰을 사용한 트랜스포머 기반 형태소 분석기)

  • DongHyun Kim;Do-Guk Kim;ChulHui Kim;MyungSun Shin;Young-Duk Seo
    • Smart Media Journal / v.12 no.9 / pp.19-27 / 2023
  • Since morphemes are the smallest units of meaning in Korean, an accurate morpheme analyzer is needed to improve the performance of Korean language models. However, most existing analyzers produce morpheme analysis results by learning from word-unit tokens as input. Because Korean words consist of postpositions and affixes attached to a root, the meaning of words sharing the same root tends to change with the postposition or affix. Learning morphemes from word-unit tokens can therefore lead to misclassification of postpositions or affixes. In this paper, we use morpheme-level tokens to capture the inherent meaning of Korean sentences and propose a morpheme analyzer based on a sequence generation method using a Transformer. In addition, a user dictionary is constructed from corpus data to address the out-of-vocabulary problem. In the experiments, the morphemes and morpheme tags output by each morpheme analyzer were compared with the correct answer data, and the results showed that the morpheme analyzer presented in this paper performed better than the existing morpheme analyzers.

A study on Korean tourism trends using social big data -Focusing on sentiment analysis- (소셜 빅데이터를 활용한 한국관광 트렌드에 관한연구 -감성분석을 중심으로-)

  • Youn-hee Choi;Kyoung-mi Yoo
    • The Journal of the Convergence on Culture Technology / v.10 no.3 / pp.97-109 / 2024
  • In the field of domestic tourism, analyzing the tourism trends of tourism consumers, both international and domestic tourists, is essential not only for the Korean tourism market but also for local and governmental tourism policy makers. We explore keywords and sentiment on social media to establish a marketing strategy plan and revitalize the domestic tourism industry through communication and information from tourism consumers. This study used TEXTOM 6.0 to analyze recent trends in Korean tourism. Data were collected from September 31, 2022, to August 31, 2023, using 'Korean tourism' and 'domestic tourism' as keywords, targeting blogs, cafes, and news provided by Naver, Daum, and Google. Through text mining, the top 100 keywords by frequency and their TF-IDF values were extracted, and then CONCOR analysis and sentiment analysis were conducted. For Korean tourism keywords, words related to tourist destinations, travel companions and behaviors, tourism motivations and experiences, accommodation types, tourist information, and emotional connections ranked high. The CONCOR analysis produced five clusters: tourist destinations, tourist information, tourist activities/experiences, tourism motivation/content, and inbound-related. Finally, the sentiment analysis showed a high proportion of positive documents and vocabulary. This study analyzes the rapidly changing trends of Korean tourism through text mining and is expected to provide meaningful data for promoting domestic tourism not only for Koreans but also for foreigners visiting Korea.
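
The keyword step in the abstract above relies on TF-IDF weighting. As a rough illustration only, the sketch below computes one common TF-IDF variant (relative term frequency times log inverse document frequency) in plain Python. The exact weighting used by TEXTOM 6.0 is not stated in the abstract, so the formula, the function name `tf_idf`, and the toy documents are all assumptions.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per tokenized, non-empty document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return scores


# Hypothetical, already-tokenized posts (not data from the study).
docs = [
    ["jeju", "travel", "hotel"],
    ["jeju", "festival", "travel"],
    ["busan", "travel", "beach"],
]
scores = tf_idf(docs)
# Highest-scoring term of the third post; ties keep the first term seen.
top = max(scores[2], key=scores[2].get)
```

In this toy corpus "travel" appears in every post, so its score is zero, which is why a frequency ranking and a TF-IDF ranking of the same keywords can differ.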

A Study on the Landscape Characteristics and Implications of the Royal Garden through 「The 36 Scenery of Seongdeok Summer Mountain Resort」 by Kangxi Emperor (강희제(康熙帝)의 「승덕 피서산장(避暑山莊) 36경」에 담긴 황가원림의 경관 특성과 함의)

  • RHO Jaehyun;MENG Zijun
    • Korean Journal of Heritage: History & Science / v.55 no.4 / pp.212-240 / 2022
  • This study is a multi-layered exploration of 「The Thirty-Six Scenery of Seongdeok Summer Mountain Resort(承德避暑山莊三十六景)」 (The 36th view of Kangxi), composed by Emperor Kangxi of China, through literature study, ancient calligraphy diagrams, and field studies. The landscape characteristics and implications contained in 「The 36th view of Kangxi」 were traced through analysis of the headwords(標題語) and interpretation of the Jeyeong poems(題詠詩), with the following conclusions. 「The 36th view of Kangxi」 is an extension of the outer edge of the Eight Sceneries, and when compared to the existing Eight Sceneries poems and Eight Sceneries paintings, its landscapes are found to be centered on the 'viewpoint' rather than the landscape object. In particular, it aimed to create a structured landscape centered on nine types of buildings represented by 'Jeon(殿)' and 'Jeong(亭)'. Yeouiju, located in the Lake district, is a scenic area endowed with the character of a garden within a garden, composed by collecting famous representative Chinese landscapes and landscapes of Sansu-si and Sanshu Painting. In the headword analysis conducted to understand the characteristics of the landscape components, 14 landscapes (38.9%) were related to water elements and 13 landscapes (36.1%) to mountain elements, followed by elements related to architecture and civil engineering with 3 cases (8.3%) and elements related to the skylight with 2 cases (5.6%). However, in Jeyeong-si, the mention of landscape vocabulary for climate elements was overwhelming. In other words, it is no coincidence that scenery vocabulary symbolizing 'coolness', such as 雲(cloud), 水(water), 泉(spring), 清(clear), 波(wave), 流(flow), 風(wind), and 無暑(without heat), appears in the poems of 「The 36th view of Kangxi」; it is strongly tied to the sense of place of the Summer Mountain Resort in Rehe(熱河).
Among the 23 landscapes whose seasonal background was confirmed, the fact that the lower landscape is portrayed as the majority, and that the climate elements of the resort area are portrayed in three-dimensional and multi-dimensional ways, is closely related to the period in which Kangxi, the main subject of the landscape, enjoyed the gardens. In addition, the many animal and plant landscapes appearing in Jeyeong-si appear to be in the same context as the spatial attributes of not only recreation but also contemplation and hunting. On the other hand, in Jeyeong-si, 33 scenes (91.7%) cite famous people and famous books through ancient poems and old stories, which is a prominent tendency. It is inferred that this was based on Kangxi's understanding of and pride in traditional Chinese culture. In 「The 36th view of Kangxi」, not only a book-writing description of the feelings of being entrusted to the family sutras, but also the spirit of patriotism, love, self-discipline, respect for one's mother, and filial piety are strongly implied. Ultimately, 「The 36th view of Kangxi」 shows the real scenery of the resort, as well as its spiritual dimension, in a multi-faceted and three-dimensional way, and it deserves to be called a collection of imperial records intended to reveal the spirit of an emperor based on the dignity of the royal family and the sentiments of a writer.

Relationship between Result of Sentiment Analysis and User Satisfaction -The case of Korean Meteorological Administration- (감성분석 결과와 사용자 만족도와의 관계 -기상청 사례를 중심으로-)

  • Kim, In-Gyum;Kim, Hye-Min;Lim, Byunghwan;Lee, Ki-Kwang
    • The Journal of the Korea Contents Association / v.16 no.10 / pp.393-402 / 2016
  • To compensate for the limitations of the satisfaction survey currently conducted by the Korea Meteorological Administration (KMA), sentiment analysis of social networking service (SNS) data can be utilized. Tweets mentioning 'KMA' from 2011 to 2014 were collected and classified into three sentiments (positive, negative, and neutral) using Naïve Bayes classification. An additional dictionary was built from morphemes that appeared only in the positive, negative, or neutral sentiment class of the basic Naïve Bayes classification, which improved the accuracy of the sentiment analysis. With the basic Naïve Bayes classification, the training data were reproduced with an accuracy of about 75%, whereas classification with the additional dictionary showed an accuracy of 97%. With the additional dictionary, the sentiments of the verification data were classified with an accuracy of about 75%. This lower classification accuracy could be improved not only by a better dictionary built from a larger amount of training data, including diverse weather-related keywords, but also by continuous updating of the dictionary. Meanwhile, in contrast to sentiment analysis based on the dictionary definitions of individual words, classifying sentiment by the meaning of the whole sentence could explain the increased rate of negative sentiment and the change in satisfaction. Therefore, sentiment analysis via SNS can be considered a useful tool for complementing surveys in the future.
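
The abstract above does not say how the additional dictionary is combined with the Naïve Bayes classifier. The sketch below shows one plausible reading, stated as an assumption: tokens seen in exactly one sentiment class during training form the dictionary, a tweet containing such tokens is labeled by majority vote of those tokens, and everything else falls back to multinomial Naïve Bayes with add-one smoothing. The function names and the toy tweets are hypothetical and are not the authors' data or implementation.

```python
import math
from collections import Counter, defaultdict


def train(samples):
    """samples: list of (tokens, label) pairs. Returns the model as a tuple."""
    word_counts = defaultdict(Counter)  # label -> token counts
    label_counts = Counter()            # label -> number of training documents
    for tokens, label in samples:
        label_counts[label] += 1
        word_counts[label].update(tokens)
    vocab = {t for counts in word_counts.values() for t in counts}
    # Assumed "additional dictionary": tokens that occur in exactly one class.
    exclusive = {}
    for token in vocab:
        owners = [lab for lab, counts in word_counts.items() if token in counts]
        if len(owners) == 1:
            exclusive[token] = owners[0]
    return word_counts, label_counts, vocab, exclusive


def classify(tokens, model, use_dictionary=True):
    word_counts, label_counts, vocab, exclusive = model
    if use_dictionary:
        # Dictionary step (assumption): majority vote of class-exclusive tokens.
        votes = Counter(exclusive[t] for t in tokens if t in exclusive)
        if votes:
            return votes.most_common(1)[0][0]
    # Fallback: multinomial Naive Bayes in log space with add-one smoothing.
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        n_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total_docs)
        for t in tokens:
            score += math.log((word_counts[label][t] + 1) / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical tokenized tweets, for illustration only.
train_data = [
    (["forecast", "accurate", "thanks"], "positive"),
    (["forecast", "wrong", "again"], "negative"),
    (["forecast", "rain", "tomorrow"], "neutral"),
]
model = train(train_data)
with_dictionary = classify(["forecast", "wrong"], model)
bayes_only = classify(["thanks", "forecast"], model, use_dictionary=False)
```

Under this reading, a dictionary built from the training tweets will by construction fit those tweets very well, which is consistent with the gap the abstract reports between training accuracy (97%) and verification accuracy (about 75%).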

A Study on Classic Fashion Image and Sensible Vocabularies - Focusing on Women of Baby Boom and Y Generations - (클래식 패션 이미지와 감성 어휘 연구 - 베이비붐, Y세대 여성을 중심으로 -)

  • Sang, Yoon-Jin;Yoo, Jung-Min;Park, Minjung;Lee, Inseong
    • Journal of the Korea Fashion and Costume Design Association / v.17 no.3 / pp.85-98 / 2015
  • Modern fashion shows trends of various styles, and the era focused only on product function has given way to an era focused on consumer sensibility. Consumers show different sensibilities and preferences at the stage of perceiving and recognizing the stimulus of a given image, so an objective measurement method based on fashion sensible vocabularies is necessary to measure fashion sensibility. This research is therefore significant in that it examines differences by generation in preference for classic fashion and in awareness of sensible vocabularies, and suggests a methodology for design sensibility evaluation research through quantitative evaluation that objectifies subjective sensibility. As for the research method, previous studies related to classic, the concept and characteristics of classic in books, and the definitions and characteristics of each generation were examined; the top 3 domestic portal sites were selected; and adjective vocabularies and images related to classic were collected from 2010 to 2014. Among the 206 adjectives collected, vocabularies with an average of more than 3.5 on a 5-point Likert scale rated by a fashion expert group were drawn. Among the total of 306 images collected, 21 representative images were selected through a preliminary investigation by the fashion expert group. For the selected classic images and vocabularies, frequency analysis, factor analysis, and analysis of variance were conducted using SPSS 19.0. The results of the analysis are as follows. In the analysis of preference for classic fashion images by generation, both generations selected classic fashion as the most classic one. For the images ranked next, the Y generation selected a basic classic fashion image, which is casual with high activity, as classic, whereas the Baby boom generation selected an ancient classic fashion image, so there were differences in preference for classic by generation.
In the factor analysis of classic adjective vocabularies, they could be divided into 5 factors (basic form, attractive form, traditional form, vintage form, and active form), and the reliability of all measured variables for classic sensible vocabularies was verified. Differences in classic sensible vocabularies by classic fashion image and generation were examined, and generation and classic fashion image had a significant effect on the five factors. Therefore, there were differences among the generations in awareness of classic fashion images and sensible vocabularies, and this thesis can serve as fundamental material that objectifies subjective sensibility and suggests a new research methodology.

Exploring the Research Trends of Learning Strategies in Korean Language Education Using Co-word Analysis (동시출현단어 분석을 활용한 한국어교육에서의 학습전략 연구 동향 탐색)

  • Heo, Youngsoo;Park, Ji-Hong
    • Journal of the Korean Society for Information Management / v.38 no.2 / pp.65-86 / 2021
  • In foreign language education, learners are an important part of education; however, in Korean language education, research on learners has been insufficient compared with research on educational content, teaching methods, and textbooks. Therefore, it is meaningful to analyze how learner research, especially learning strategy research, has been conducted and to identify areas that need research for better education. In this study, co-word analysis was conducted on the titles of academic journal articles and dissertations in order to analyze learning strategy research in Korean language education. The results show that most studies on Korean language learners' learning strategies concerned 'reading', and that their subjects were mostly 'Chinese international students' and 'marriage-immigrants'. In addition, the subgroup analysis of research topics shows four major subgroups: a group related to 'reading for academic purposes', a group related to 'request, rejection, conversation, etc.', a group related to 'writing', and a group related to 'vocabulary, listening'. This shows that researchers' major interests in studying Korean learners' strategies are 'reading' and 'speaking' and that their studies have been concentrated in specific areas. Therefore, researchers need to study a wider variety of functions and subjects in Korean language learners' learning strategies.
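
Co-word analysis, as used in the abstract above, starts from counts of how often two keywords appear in the same title. A minimal sketch of that counting step follows, assuming keywords have already been extracted from each title; the function name `coword_counts` and the sample keyword lists are hypothetical, and the subgroup (clustering) step of the study is not reproduced.

```python
from collections import Counter
from itertools import combinations


def coword_counts(keyword_lists):
    """Count how many titles contain each unordered pair of keywords."""
    pairs = Counter()
    for keywords in keyword_lists:
        # Sort and de-duplicate so (a, b) and (b, a) are counted as one pair.
        for a, b in combinations(sorted(set(keywords)), 2):
            pairs[(a, b)] += 1
    return pairs


# Hypothetical keyword lists, one per title (not the study's data).
titles = [
    ["reading", "strategy", "Chinese students"],
    ["reading", "strategy", "academic purposes"],
    ["writing", "strategy"],
]
pairs = coword_counts(titles)
most_common_pair = pairs.most_common(1)[0]
```

The resulting pair counts form a symmetric co-occurrence matrix, which is the usual input to the network or cluster analysis that yields subgroups such as those reported in the abstract.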

Automatic Error Correction System for Erroneous SMS Strings (SMS 변형된 문자열의 자동 오류 교정 시스템)

  • Kang, Seung-Shik;Chang, Du-Seong
    • Journal of KIISE:Software and Applications / v.35 no.6 / pp.386-391 / 2008
  • Spoken-word errors that violate grammatical or writing rules occur frequently in communication environments such as mobile phones and messengers. These unexpected errors cause problems in language processing systems for many applications such as speech recognition and text-to-speech translation. In this paper, we propose and implement an automatic correction system for ill-formed words and word spacing errors in SMS sentences, which have been the major causes of poor accuracy. We experimented with three methods of constructing the word correction dictionary and evaluated their results: (1) manual construction of error words from the vocabulary list of ill-formed communication languages, (2) automatic construction of an error dictionary from a manually constructed corpus, and (3) a context-dependent method of automatic construction of an error dictionary.

The Design of Component Repository Management System for Semantic Web (시멘틱 웹 기반 컴포넌트 저장소 관리 시스템 설계)

  • Kim, Yang-Hoon;Jang, Joon-Sik;Kim, Guk-Boh
    • Journal of Korea Multimedia Society / v.11 no.2 / pp.237-246 / 2008
  • As information and web technology develop and the amount of information increases, numerous problems have been exposed. Although software engineers have tried to overcome these limitations using software agents and web services, the results have not been satisfactory under the current paradigm, in which a software agent acts as a provider responding to the user's demand. Moreover, recent software development is based on CBD (Component Based Development). However, constructing a new component using CBD is costly, so a new model is required that can acquire component information on the web promptly and accurately at low cost. In this paper, a repository management system that acquires and manages component information on the Semantic Web is designed and compared with existing component repository management, and the results of the analysis are presented. In addition, to overcome the limitations of existing component repository systems (low accuracy of search results, restrictive search vocabulary, and faulty information), a specific plan is presented for performing knowledge search for components.

A Study on the Deconstruction Characteristics of Traditional Space Analyzed by Aesthetic Idea of Lao-tzu (노자의 미학적 관점으로 본 전통공간의 해체적 특성 연구)

  • Lee, Jong-Hee;Kim, Ji-Eun
    • Korean Institute of Interior Design Journal / v.21 no.2 / pp.56-64 / 2012
  • This paper analyzes the characteristics of space in Korean traditional architecture through a deconstructive concept by connecting Lao-tzu's theory and Derrida's deconstruction theory, the main discourses of East and West. Derrida's philosophical term différance is similar to the Tao of Lao-tzu, because with this term Derrida emphasized relationships with others as a strategy for overcoming dichotomous thinking. The Tao of Lao-tzu also has a relative character that cannot be reduced to a single meaning. In this way, both Derrida and Lao-tzu oppose traditional, dichotomous ways of thinking. From this point of view, this study takes Derrida's deconstruction theory and Lao-tzu's thinking as a common viewpoint on the world. From phrases of the Tao Te Ching that express a deconstructive Tao, a deconstructive space design vocabulary was derived: mixed no-boundary, shape of no-shape, and transcendence of time and space. The deconstructive characteristics of traditional space found through case study analysis based on Lao-tzu's deconstructive space design are as follows. First, it is not a specific or detailed shape but an unlimited possibility that can be transformed into something else, moving and changing endlessly, and it has a borderless beauty. Second, it is nothing in itself but creates various shapes, as if it exists without shape. Third, it is a relative and unlimited space that pursues free form as a non-conceptual shape without any system or value.

The Equality of Key Words of the Journal of Korean Dental Society of Anesthesiology with Medical Subject Headings (MeSH) (2001-2014) (대한치과마취과학회지 게재 논문들의 핵심용어와 MeSH 용어의 일치도)

  • Shim, Youn-Soo;Kim, Ah-Hyeon;You, Yong-Ouk;Kim, Il-Ho;Yu, Song-Yi;Lee, Kwang-Seok;Jeong, Chae-Yul;Kim, Eun-Hee;Maeng, Sun-Woo;An, So-Youn
    • Journal of The Korean Dental Society of Anesthesiology / v.14 no.3 / pp.143-149 / 2014
  • Background: The purpose of this study was to analyze the equality between key words used in the Journal of Korean Dental Society of Anesthesiology and Medical Subject Headings (MeSH). Methods: A total of 666 English key words in 187 papers (average 3.5 words per paper) from 2001 to 2014 were eligible for this study. We classified them into matched and non-matched terms. After descriptive analysis, we examined patterns of errors in using MeSH and reviewed frequently used non-MeSH terms. Results: Of the total key words, 59.6% were completely coincident with MeSH terms and 40.39% were not MeSH terms. Conclusions: The results show that the coincidence rate of key words with MeSH terms was at a moderate level. However, we need to understand MeSH more specifically and accurately. Using proper key words aligned with international standards such as MeSH is important for papers to be properly cited. Authors should pay attention to, and be educated on, the correct use of MeSH terms as key words.
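
The matched/non-matched classification described in the abstract above can be pictured as a set-membership check followed by a percentage. The sketch below is only an illustration under assumptions: it treats a key word as "completely coincident" when it equals a MeSH term after trimming and case folding, and it uses a tiny hand-picked list of terms rather than the full MeSH vocabulary. The study's own matching rules are not detailed in the abstract.

```python
def coincidence_rate(keywords, mesh_terms):
    """Split keywords into exact MeSH matches and non-matches; return the match rate in %."""
    # Assumed matching rule: equal after stripping whitespace and case folding.
    mesh = {term.strip().casefold() for term in mesh_terms}
    matched = [k for k in keywords if k.strip().casefold() in mesh]
    non_matched = [k for k in keywords if k.strip().casefold() not in mesh]
    rate = 100 * len(matched) / len(keywords)
    return matched, non_matched, rate


# Illustrative sample only; a real check would load the complete MeSH vocabulary.
mesh_sample = ["Anesthesia, Dental", "Conscious Sedation", "Midazolam"]
paper_keywords = ["Midazolam", "conscious sedation", "Dental anesthesia", "Laughing gas"]

matched, non_matched, rate = coincidence_rate(paper_keywords, mesh_sample)
```

Here "Dental anesthesia" counts as non-matched because the heading is worded "Anesthesia, Dental", which shows how a strict exact-match rule can lower the coincidence rate even when the concept exists in MeSH.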