• Title/Summary/Keyword: word frequency (단어빈도)

A Semi-Automatic Semantic Mark Tagging System for Building Dialogue Corpus (대화 말뭉치 구축을 위한 반자동 의미표지 태깅 시스템)

  • Park, Junhyeok;Lee, Songwook;Lim, Yoonseob;Choi, Jongsuk
    • KIPS Transactions on Software and Data Engineering / v.8 no.5 / pp.213-222 / 2019
  • Determining the meaning of a keyword in a speech dialogue system is an important technology for the future implementation of intelligent speech dialogue interfaces. After keywords are extracted from a user's utterance to grasp its intention, the intention is determined using the semantic marks of the keywords. One keyword can have several semantic marks, and we treat the task of attaching the semantic mark that matches the user's intention to each keyword as a word sense disambiguation problem. In this study, about 23% of all keywords in the corpus are manually tagged to build a semantic mark dictionary, a synonym dictionary, and a context vector dictionary, and the remaining 77% are then tagged automatically. The semantic mark of a keyword is determined by calculating context vector similarity against the context vector dictionary. For an unregistered keyword, the semantic mark of the most similar keyword is attached using the synonym dictionary. We compare the performance of the system with a manually constructed training set and with a semi-automatically expanded training set on 3 high-frequency and 3 low-frequency keywords selected from the corpus. In experiments, we obtained an accuracy of 54.4% with the manually constructed training set and 50.0% with the semi-automatically expanded training set.
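
The context-vector step described in this abstract can be illustrated with a minimal sketch. It assumes bag-of-words context vectors compared by cosine similarity; the abstract does not state which similarity measure or vector weighting the authors used, and the dictionary contents, the mark names `FINANCE` and `RIVER`, and the function names below are hypothetical.

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity between two sparse bag-of-words vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def pick_semantic_mark(context_words, context_vector_dict):
    """Return the semantic mark whose stored context vector is most similar
    to the words surrounding the keyword in the utterance."""
    query = Counter(context_words)
    return max(context_vector_dict,
               key=lambda m: cosine(query, context_vector_dict[m]))

# Hypothetical context vector dictionary for one ambiguous keyword ("bank").
context_vectors = {
    "FINANCE": Counter({"money": 3, "account": 2, "loan": 1}),
    "RIVER": Counter({"water": 3, "fishing": 2, "shore": 1}),
}

mark = pick_semantic_mark(["open", "account", "money"], context_vectors)
print(mark)  # FINANCE
```

Here the utterance context shares words only with the `FINANCE` vector, so that mark is selected. A real system would also need the synonym-dictionary fallback for unregistered keywords, which this sketch omits.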

Interpretation of the Forest Therapy Process and Effect Verification through KeyWord Analysis of Literature on Forest Therapy (산림치유 효과 검증 연구의 주요어 분석을 통한 치유 발현과정 해석)

  • Park, Kyeong-Ja;Shin, Chang-Seob;Kim, Dongsoo
    • Journal of Korean Society of Forest Science / v.110 no.1 / pp.82-90 / 2021
  • In this study, the validity of the forest therapy process, in which forest activities using forest therapy factors lead to immunity promotion and health promotion, was analyzed theoretically and qualitatively to refine and systematize the forest therapy concept. Research and analysis data were collected from the websites of institutions related to forest therapy; 33 theses and 33 original research articles from 2000 to March 2020, as well as the prize-winning work of the 2016 forest therapy experience essay contest, were searched for forest therapy keywords. A word cloud was generated from the frequency of nouns and adjectives and from the keywords in the web pages, theses, articles, and the forest therapy experience essay. Through interpretation of word frequency, the systemic flow of forest therapy was defined. The results suggest that the source of forest therapy's power was a positive experience of the forest and an improved attitude toward nature, as well as forest therapeutic factors. The therapeutic effect is maximized through the forest healing program, leading to physical and mental resilience and resistance; consequently, health and immunity are promoted. From this study, forest therapy is proposed as "a health promotion activity for the psychological, physical, and spiritual resilience of the subjects through various environmental factors of the forest, positive experiences, and attitudes toward the forest."

Analysis of Descriptive Course Evaluation of University Chemistry Laboratory Class using Text Mining (텍스트 마이닝을 활용한 대학 화학 실험 수업의 서술형 강의 평가 내용 분석)

  • Yun, Jeonghyun;Park, Geumju
    • Journal of the Korean Chemical Society / v.66 no.3 / pp.218-227 / 2022
  • The purpose of this study is to analyze students' opinions by applying text mining to the good points and suggested improvements in the descriptive course evaluations written by students who took a university chemistry laboratory class, and to derive improvements for the class. We analyzed the frequency of occurrence, co-occurrence, and network of keywords. In the network of good points, the most frequent link was between class and professor, along with explanation, understanding, student, passion, fun, TA, experiment, help, etc. In the network of improvements, the most frequent link was between class and student, along with professor, content, explanation, exam, wish, experiment, understanding, difficult, thought, problem, etc. In other words, as good points of the class, the students reported that they understood the class content well and found the experimental process fun and satisfying thanks to 'easy and detailed explanation' and 'TA's assistance'. On the other hand, as points to improve, the students reported that their understanding and concentration in class decreased due to 'difficulty of content and exam', 'excessive assignments', and 'class environment'.
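
The keyword frequency and co-occurrence counts mentioned in this abstract can be sketched as follows. This is only an illustration under assumptions: the three tokenized comments are invented, and the abstract does not say which tool or co-occurrence window the authors used (here, two keywords co-occur when they appear in the same comment).

```python
from collections import Counter
from itertools import combinations

# Hypothetical, already-tokenized course evaluation comments.
comments = [
    ["class", "professor", "explanation"],
    ["class", "professor", "passion"],
    ["class", "student", "experiment"],
]

# Frequency of occurrence: how often each keyword appears overall.
word_freq = Counter(word for doc in comments for word in doc)

# Co-occurrence: count each unordered keyword pair once per comment.
cooc = Counter()
for doc in comments:
    for a, b in combinations(sorted(set(doc)), 2):
        cooc[(a, b)] += 1

print(word_freq.most_common(2))
print(cooc.most_common(1))
```

The pair counts in `cooc` are the edge weights of the keyword network: the heaviest edge corresponds to the "most frequent link" a network analysis would report.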

Multidimensional Analysis of Unstructured Data and Trends in Architectural Review Opinions of Small and Medium-Sized Apartment Projects (다차원 분석방법을 활용한 중소규모 공동주택 건축심의 의견의 경향과 비정형 데이터로서의 특성분석)

  • Kim, Jinhee;Hwang, Taeeon;Kim, Jae-Sik;Huh, Youngki
    • Korean Journal of Construction Engineering and Management / v.24 no.6 / pp.74-80 / 2023
  • This study examines the characteristics of architectural review opinions as unstructured data, focusing on the most challenging risk for developers of small and medium-sized apartment projects in response to the increasing number of single-person households in Korea. Using multidimensional analysis methods, the study analyzes the review opinions of 25 projects in B City. Correspondence analysis and MDS (multidimensional scaling) analysis show that, consistent with prior research, keywords related to 'structure' and 'planning' dominate architectural review opinions in B City. While the MDS model's stress is very poor at 34.4%, correspondence analysis reveals that this is due to the characteristics of unstructured data in architectural reviews. In addition, the unstructured data analyzed in this study exhibited a probability distribution with low kurtosis and high skewness, because architectural review opinions involve various combinations and occurrences of data depending on the discretion of the review committee members and the specific formats of different local governments. This often led to the emergence of keywords that differed significantly from commonly mentioned terms. Although the study has some limitations, it provides a foundation for future detailed analysis by identifying the characteristics of architectural review opinions as unstructured data.
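
Skewness and kurtosis, the two distribution measures cited in this abstract, can be computed from central moments. The sketch below uses the population (biased) moment definitions and reports excess kurtosis (normal distribution = 0); the abstract does not say which estimator the authors used, and the keyword counts are hypothetical.

```python
def skewness_and_excess_kurtosis(xs):
    """Population skewness and excess kurtosis from central moments."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n  # variance
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    if m2 == 0:
        raise ValueError("all values are identical; moments are undefined")
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

# Hypothetical keyword counts: most keywords are rare, one is very common.
keyword_counts = [1, 1, 1, 2, 10]
skew, kurt = skewness_and_excess_kurtosis(keyword_counts)
print(f"skewness={skew:.3f}, excess kurtosis={kurt:.3f}")
```

A long right tail of this kind (a few keywords far more frequent than the rest) yields positive skewness, which is the sort of asymmetry the abstract attributes to review opinions.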

Literary Research Using Digital Analysis Tools: A Case Study of 『Dangerous Liaisons』 (디지털 분석 도구를 활용한 문학 연구 : 라클로의 『위험한 관계Les liaisons dangereuses』를 중심으로)

  • RYU Sun-Jung;YOU Eun-Soon
    • The Journal of the Convergence on Culture Technology / v.10 no.3 / pp.173-180 / 2024
  • This study aimed to quantitatively analyze the theme of 'libertinage' and the associated issues of reason and emotion in 『Dangerous Liaisons』, a novel considered a masterpiece of libertine literature and an epistolary novel of the 18th century, using digital analysis tools. First, based on frequency analysis of word usage using Voyant and LIWC 22, we confirmed that libertinage is manifested in keywords such as 'love' and 'time'. Second, using Voyant's 'Contexts' feature, we found that the letters sent by Valmont to Madame de Tourvel and those sent by Madame de Merteuil both have 'love' as the central theme. However, emotional vocabulary was more frequent in the former, whereas strategic vocabulary was more prevalent in the latter. Additionally, the most frequently used word in the letters sent by Madame de Merteuil was 'time', which had a higher frequency than 'love'. Third, using LIWC 22, we measured the analytical thinking and emotional tone of the letters exchanged by the main characters and analyzed how these values changed across the chapters. Through these analyses, we confirmed that this novel, alongside Rousseau's "New Eloise," anticipates romanticism by embracing the theme of 'emotion,' which was rejected by 18th-century Enlightenment ideals.

Mass Media and Social Media Agenda Analysis Using Text Mining : focused on '5-day Rotation Mask Distribution System' (텍스트 마이닝을 활용한 매스 미디어와 소셜 미디어 의제 분석 : '마스크 5부제'를 중심으로)

  • Lee, Sae-Mi;Ryu, Seung-Eui;Ahn, Soonjae
    • The Journal of the Korea Contents Association / v.20 no.6 / pp.460-469 / 2020
  • This study analyzes online news articles and cafe posts on the '5-day Rotation Mask Distribution System', which emerged as a recent issue due to the COVID-19 outbreak, to identify the mass media and social media agendas containing media and public reactions. The study identified the differences between mass media and social media. For analysis, we collected 2,096 full-text articles from Naver and 1,840 posts from Naver Cafe, and conducted word frequency analysis, word cloud analysis, and LDA topic modeling after data preprocessing and refinement. The analysis showed that social media featured real-life topics such as 'family members' purchase', 'the postponement of school opening', 'mask usage', and 'mask purchase', reflecting the characteristics of personal media. Social media was found to play the role of exchanging personal opinions, emotions, and information rather than delivering information. With the research method applied in this study, social issues can be publicized through analysis of various media and used as a reference in the process of establishing a policy agenda that evolves into a government agenda.

Addressing Low-Resource Problems in Statistical Machine Translation of Manual Signals in Sign Language (말뭉치 자원 희소성에 따른 통계적 수지 신호 번역 문제의 해결)

  • Park, Hancheol;Kim, Jung-Ho;Park, Jong C.
    • Journal of KIISE / v.44 no.2 / pp.163-170 / 2017
  • Despite the rise of studies on spoken-to-sign language translation, the low-resource problems of sign language corpora have rarely been addressed. As a first step towards translating from spoken to sign language, we addressed the problems arising from resource scarcity when translating spoken language to manual signals using statistical machine translation techniques. More specifically, we proposed three preprocessing methods: 1) paraphrase generation, which increases the size of the corpora, 2) lemmatization, which increases the frequency of each word in the corpora and the translatability of new input words in spoken language, and 3) elimination of function words that are not glossed into manual signals, which matches the corresponding constituents of the bilingual sentence pairs. In our experiments, we used different types of English-American Sign Language parallel corpora. The experimental results showed that the system with each method and with the combination of the methods improved the quality of manual signals translation, regardless of the type of corpora.
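
Two of the three preprocessing methods named in this abstract, lemmatization and function-word elimination, can be sketched in a few lines. This is a toy illustration only: the lemma table and the function-word list are hypothetical stand-ins, and the abstract does not describe the lemmatizer or the word list the authors actually used.

```python
# Hypothetical lemma lookup table; a real system would use a full lemmatizer.
LEMMAS = {"went": "go", "goes": "go", "stores": "store"}

# Hypothetical list of function words assumed to have no manual-signal gloss.
FUNCTION_WORDS = {"a", "the", "to", "is"}

def preprocess(sentence):
    """Lowercase, lemmatize by table lookup, then drop function words."""
    tokens = sentence.lower().split()
    lemmas = [LEMMAS.get(token, token) for token in tokens]
    return [token for token in lemmas if token not in FUNCTION_WORDS]

print(preprocess("She went to the stores"))  # ['she', 'go', 'store']
```

Mapping inflected forms to one lemma raises the count of each word type in a small corpus, and dropping unglossed function words leaves a source sentence whose tokens line up more closely with the sign glosses.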

A Study on the Semantic Network Structure of the Regime in the Image Contents (영상콘텐츠분야의 정권별 의미연결망 연구)

  • Hwang, Go-Eun;Moon, Shin-Jung
    • Journal of the Korean BIBLIA Society for library and Information Science / v.28 no.3 / pp.217-240 / 2017
  • The purpose of this study was to apply semantic network analysis to understand the field of image contents and to examine the degree to which words and word clusters contributed to the formation of a semantic map within the field. For this research, a total of 2,624 papers in the field of image contents published from 1993 to 2016 were collected. The words appearing in the titles were analyzed as a social network using the big data analysis program R. The results were as follows. First, the field of image contents is based on research related to 'image', 'media', and 'contents'. Second, there is a three-step flow ('education' -> 'media' -> 'contents') of research in the field of image contents. Third, research related to 'broadcasting', 'digital', 'technology', and 'production' was carried out continuously. Finally, new research subjects emerged under each regime.

Experience of Nursing University Students Participating in Low-Salt Diet Campaign (간호대학생의 교내 저염식이 캠페인 참여 경험)

  • Kim, Su-I;Woo, Sang-Jun;Kim, Eun-A
    • Journal of the Korea Convergence Society / v.11 no.8 / pp.361-368 / 2020
  • The purpose of this study is to investigate the educational effect of nursing university students' participation experience and to analyze its content. The subjects were 32 of the 36 second- and third-year nursing students at D University in N City who participated. For data analysis, unique numbers were assigned to meaningful words in the original data, which were then classified into primary, secondary, and tertiary domains. The participants' experiences after taking part in the campus low-salt diet campaign were classified into three types. The results indicate that nursing university students' experience of participating in the education program has an educational effect.

Exploratory Research for Happiness-related Curriculum Introduction in Medical Education (의학교육에서의 행복 관련 교육과정 도입을 위한 탐색적 연구)

  • Yoo, Hyo Hyun
    • The Journal of the Korea Contents Association / v.17 no.3 / pp.400-407 / 2017
  • The purpose of this research was to analyze pre-post changes in the structure of medical students' concept of happiness and the activities in which they feel happiness, in order to develop a corresponding curriculum. The research subjects were a total of 36 sophomores attending medical school, and the day reconstruction method and network analysis were applied. According to the findings, medical students experienced a great deal of happiness while eating, talking, and doing leisure activities, whereas the frequency of feeling happiness through learning activities was low. The words used to express happiness before and after were largely similar; 'economy' showed the highest degree centrality before and 'work' showed the highest degree centrality after. The structure of the concept of happiness divided from 1 group into 4 groups, including one's work, positive self, health of family, and value of life, indicating that the perception of happiness changed from a superficial concept to an actual one. Therefore, for prospective doctors, who will care for human health, to establish a proper concept and value of happiness, education about happiness is necessary, and a curriculum related to happiness must be developed systematically for this purpose.
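
Degree centrality, the network measure reported in this abstract, counts how many other words a word is linked to, usually normalized by the maximum possible number of links (n - 1). The sketch below assumes an undirected, unweighted word network; the edges are invented for illustration and are not the study's data.

```python
from collections import defaultdict

def degree_centrality(edges):
    """Normalized degree centrality for an undirected, unweighted graph."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}

# Hypothetical links between words that students used together.
edges = [
    ("work", "family"),
    ("work", "health"),
    ("work", "life"),
    ("family", "health"),
]
centrality = degree_centrality(edges)
top_word = max(centrality, key=centrality.get)
print(top_word, centrality[top_word])  # work 1.0
```

In this toy network 'work' is linked to every other word, so it has the highest degree centrality, which is how a statement such as "'work' showed the highest degree centrality" would be read.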