• Title/Summary/Keyword: Frequency based Text Analysis


An Analysis of Keywords on 'School Space Innovation' Policies using Text Mining - Focused on News Articles - (텍스트 마이닝을 활용한 '학교 공간 혁신' 정책 키워드 분석 - 뉴스 기사를 중심으로 -)

  • Lee, Dongkuk
    • The Journal of Sustainable Design and Educational Environment Research / v.19 no.2 / pp.11-20 / 2020
  • The goal of this study was to investigate the implementation of school space innovation and related issues, as reported by key Korean mass media, using text mining. To accomplish this goal, this study collected 519 news articles on school space innovation published by 54 Korean mass media companies. Based on this data, this study performed frequency analysis and network analysis of the keywords. Based on the findings, the characteristics of school space innovation are summarized as follows: First, school space innovation has progressed in response to future education. Second, users are actively participating in school space innovation. Third, experts are supporting the innovation of school space by establishing a cooperative system. Fourth, the community is actively considering the innovation of school space. Fifth, the main projects of the Ministry of Education and the Provincial Offices of Education are actively conducted in a mix of top-down and bottom-up approaches. The findings of this study will contribute to providing a clear direction for contemporary school space innovation and implications for future research agendas and implementation.

The Research Trends and Keywords Modeling of Shoulder Rehabilitation using the Text-mining Technique (텍스트 마이닝 기법을 활용한 어깨 재활 연구분야 동향과 키워드 모델링)

  • Kim, Jun-hee;Jung, Sung-hoon;Hwang, Ui-jae
    • Journal of the Korean Society of Physical Medicine / v.16 no.2 / pp.91-100 / 2021
  • PURPOSE: This study analyzed the trends and characteristics of shoulder rehabilitation research through keyword analysis, and modeled the relationships among keywords using text mining techniques. METHODS: Abstracts of 10,121 articles registered in the MEDLINE database of PubMed with 'shoulder' and 'rehabilitation' as keywords were collected using Python. By analyzing the frequency of words, the 10 most frequent keywords were selected. Word embedding was performed using the word2vec technique to analyze the similarity of words. In addition, the words were grouped and analyzed based on distance (cosine similarity) through the t-SNE technique. RESULTS: The number of studies related to shoulder rehabilitation is increasing year after year, and the keywords most frequently used in these studies are 'patient', 'pain', and 'treatment'. The word2vec results showed that the words were highly correlated with 12 keywords from studies related to shoulder rehabilitation. Furthermore, through t-SNE, the keywords of the studies were divided into 5 groups. CONCLUSION: This study was the first to model the keywords, and the relationships among them, that make up the abstracts of research in the MEDLINE database of PubMed related to 'shoulder' and 'rehabilitation' using text-mining techniques. The results of this study will help diversify the research topics of future shoulder rehabilitation studies.
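
The frequency ranking and cosine-similarity comparison described in this abstract can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' pipeline: the function names `top_keywords` and `cosine_similarity`, the tokenizer, the stopword list, and the sample sentences are all assumptions made for the example, and the word2vec and t-SNE steps are not reproduced here.

```python
import math
import re
from collections import Counter

def top_keywords(abstracts, k=10, stopwords=frozenset()):
    """Return the k most frequent lowercase word tokens across all abstracts."""
    counts = Counter()
    for text in abstracts:
        # Hypothetical tokenizer: runs of ASCII letters only.
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in stopwords)
    return counts.most_common(k)

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)

# Made-up sample data, for illustration only.
docs = [
    "Shoulder pain in the patient improved after treatment.",
    "Treatment reduced pain for each patient.",
    "The patient reported pain.",
]
stop = frozenset({"in", "the", "after", "for", "each"})
print(top_keywords(docs, k=3, stopwords=stop))
print(cosine_similarity([1.0, 2.0], [2.0, 4.0]))
```

In a real study the vectors passed to `cosine_similarity` would come from a trained embedding model rather than hand-written lists.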

Trend Analysis of Fraudulent Claims by Long Term Care Institutions for the Elderly using Text Mining and BIGKinds (텍스트 마이닝과 빅카인즈를 활용한 노인장기요양기관 부당청구 동향 분석)

  • Youn, Ki-Hyok
    • Journal of Internet of Things and Convergence / v.8 no.2 / pp.13-24 / 2022
  • To explore the context of fraudulent claims by long-term care institutions for the elderly, which are increasing every year in Korea, and measures for preventing them, this study conducted a text mining analysis of media reports. The articles were collected from the news big data analysis system 'BIG KINDS' for about 15 years, from July 2008, when the Long-Term Care Insurance for the Elderly took effect, to February 28th 2022. During this period, a total of 2,627 articles were collected under keywords such as 'elderly care+fraudulent claims' and 'long-term care+fraudulent claims', and 946 of them were selected after excluding duplicate articles. The results of the text mining analysis were as follows. First, the top 10 keywords by frequency over the whole period (July 1st 2008-February 28th 2022) were, in order, long-term care institution for the elderly, fraudulent claims, National Health Insurance Service, Long-Term Care Insurance for the Elderly, long-term care benefits (expenses), elderly care facilities, the Ministry of Health & Welfare, the elderly, report, and reward (payment). Second, in the N-gram analysis, the most frequent pairs were, in order, long-term care benefits (expenses) and fraudulent claims, fraudulent claims and long-term care institution for the elderly, falsehood and fraudulent claims, report and reward (payment), and long-term care institution for the elderly and report. Third, the TF-IDF results were similar to those of the frequency analysis, while the rankings of report, reward (payment), and increase moved up. Based on these results, this study presented future directions for the prevention of fraudulent claims by long-term care institutions for the elderly.
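
The N-gram step mentioned above amounts to counting adjacent keyword sequences. The sketch below shows one simple way to do this for bigrams; it assumes the articles have already been reduced to lists of keywords, and the function name `ngram_counts` and the sample keyword lists are hypothetical rather than taken from the study.

```python
from collections import Counter

def ngram_counts(token_lists, n=2):
    """Count every run of n adjacent tokens within each token list."""
    counts = Counter()
    for tokens in token_lists:
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts

# Made-up keyword sequences, for illustration only.
docs = [
    ["long-term care benefits", "fraudulent claims", "report", "reward"],
    ["falsehood", "fraudulent claims", "report", "reward"],
]
bigrams = ngram_counts(docs, n=2)
print(bigrams.most_common(2))
```

N-grams are counted within a document only, so a pair never spans the boundary between two articles.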

Convergence Study of Relation between Job Stress and Self-efficacy of Nurses (간호사의 직무 스트레스와 자기효능감 관련 연구에 대한 융합적 고찰)

  • Moon, Heakyung;Jung, Miran;Noh, Wonjung
    • Journal of Convergence for Information Technology / v.9 no.3 / pp.146-151 / 2019
  • This study was performed to identify the relationship between job stress and self-efficacy based on a review of related research and text network analysis. For the literature review, we searched three domestic databases and one foreign database using the keywords 'nurse', 'stress', and 'self-efficacy'. A total of 18 papers were selected as the target literature. Nine of these studies reported a statistically significant negative correlation between nurses' job stress and self-efficacy. It was difficult to compare results across studies because the choice of questionnaire varied. In addition, a text network analysis was conducted by extracting keywords from the 18 papers. The keyword with the highest frequency of appearance was job stress, and other high-frequency words were self-efficacy, hospital, and correlation. To clarify the relationship between the keywords, it is proposed to perform a survey of the influencing factors through the development of a Korean-version measurement instrument.

A Case Study on Characteristics of Gender and Major in Career Preparation of University Students from Low-income Families: Application of Text Frequency Analysis and Association Rules (저소득층 대학생들의 진로준비과정에서의 성별·전공별 특성에 대한 사례연구: 텍스트 빈도분석과 연관분석의 적용)

  • Lee, Jihye;Lee, Shinhye
    • Journal of Digital Convergence / v.16 no.12 / pp.61-69 / 2018
  • This study aims to understand, and draw implications from, the career preparation experiences of low-income university students in the context of a high youth unemployment rate and the polarization of social classes. For this purpose, we selected 13 university students who received scholarships from the S scholarship foundation and analyzed six rounds of interviews using text mining techniques. According to the results, university students seem to be influenced by home environment and income level when recalling previous academic experience or planning a career during the interview process. These influences were also found to differ according to gender and major. This study is meaningful in that qualitative research data are analyzed by applying text mining techniques in a convergent way. As a result, the college life and career preparation of low-income university students were explored through the frequency and relation of words.

A Study on FIFA Partner Adidas of 2022 Qatar World Cup Using Big Data Analysis

  • Kyung-Won, Byun
    • International Journal of Internet, Broadcasting and Communication / v.15 no.1 / pp.164-170 / 2023
  • The purpose of this study is to analyze big data on the Adidas brand, which participated in the 2022 Qatar World Cup as a FIFA partner, in order to extract useful information, semantic connections, and context from unstructured data. To this end, this study collected data about Adidas generated on major portal sites during the World Cup. According to the text mining analysis, 'Adidas' was the most frequent keyword, appearing 3,340 times, followed by 'World Cup', 'Qatar World Cup', 'Soccer', 'Lionel Messi', 'Qatar', 'FIFA', 'Korea', and 'Uniform'. In addition, the TF-IDF rankings were 'Qatar World Cup', 'Soccer', 'Lionel Messi', 'World Cup', 'Uniform', 'Qatar', 'FIFA', 'Ronaldo', 'Korea', and 'Nike'. Semantic network analysis and CONCOR analysis yielded four groups. First, Cluster A was named 'Qatar World Cup Sponsor' because it grouped words such as 'Adidas', 'Nike', 'Qatar World Cup', 'Sponsor', 'Sponsor Company', 'Marketing', 'Nation', 'Launch', 'Official', 'Commemoration' and 'National Team'. Second, Cluster B was named 'Group stage' because it grouped words such as 'Qatar', 'Uruguay', 'FIFA' and 'group stage'. Third, Cluster C was named 'Winning' because it grouped words such as 'World Cup Winning', 'Champion', 'France', 'Argentina', 'Lionel Messi', 'Advertising' and 'Photograph'. Fourth, Cluster D was named 'Official Ball' because it grouped words such as 'Official Ball', 'World Cup Official Ball', 'Soccer Ball', 'All Times', 'Al Rihla', 'Public', and 'Technology'.
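
The gap between a raw frequency ranking and a TF-IDF ranking, as reported in this abstract, can be illustrated with a small sketch. It assumes one common TF-IDF variant (raw term count multiplied by the natural log of N/df, summed over documents); the study does not state which variant it used, and the function name `tfidf_totals` and the toy documents are hypothetical.

```python
import math
from collections import Counter

def tfidf_totals(token_lists):
    """Sum tf * log(N / df) over all documents for each term."""
    n_docs = len(token_lists)
    df = Counter()
    for tokens in token_lists:
        df.update(set(tokens))  # document frequency: count each term once per doc
    totals = Counter()
    for tokens in token_lists:
        for term, count in Counter(tokens).items():
            totals[term] += count * math.log(n_docs / df[term])
    return totals

# Made-up documents, for illustration only.
docs = [
    ["adidas", "world_cup", "uniform"],
    ["adidas", "messi", "world_cup"],
    ["adidas", "official_ball"],
]
scores = tfidf_totals(docs)
for term, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {score:.3f}")
```

Under this variant a term that occurs in every document receives an IDF of zero, which is one plausible reason a ubiquitous term can lead a frequency list and still rank low by TF-IDF.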

A Study on the Research Trends in Domestic/International Information Science Articles by Co-word Analysis (동시출현단어 분석을 통한 국내외 정보학 학회지 연구동향 파악)

  • Kim, Ha Jin;Song, Min
    • Journal of the Korean Society for Information Management / v.31 no.1 / pp.99-118 / 2014
  • This paper carried out a co-word analysis of nouns and noun phrases using text-mining techniques in order to grasp research trends in domestic and international information science articles. The analysis was based on collected titles and articles of papers published in the Journal of the Korean Society for Information Management (KOSIM) and the Journal of the American Society for Information Science and Technology (JASIST) from 1990 to 2013. Dividing the whole period into five publication windows, the paper proceeded as follows: 1) analysis of high-frequency co-word pairs to examine the overall trends in both sets of information science articles; 2) analysis of the words appearing with each high-frequency keyword to grasp the detailed subjects; 3) focused network analysis of trends after 2010, when distinctively new keywords appeared. The results show that KOSIM has a considerable portion of studies on topics such as library, information service, information user, and information organization, whereas JASIST has focused on studies regarding information retrieval, information user, web information, and bibliometrics.
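
Co-word analysis starts from counts of how often two terms appear in the same document. A minimal sketch of that counting step follows; it assumes each paper has already been reduced to a list of noun or noun-phrase keywords, and the function name `coword_counts` and the sample keyword lists are illustrative only.

```python
from collections import Counter
from itertools import combinations

def coword_counts(keyword_lists):
    """Count unordered keyword pairs that co-occur in the same document."""
    pairs = Counter()
    for keywords in keyword_lists:
        # Deduplicate and sort so each pair has one canonical ordering.
        unique = sorted(set(keywords))
        pairs.update(combinations(unique, 2))
    return pairs

# Made-up keyword lists, for illustration only.
papers = [
    ["information retrieval", "web information", "information user"],
    ["information retrieval", "information user"],
    ["library", "information service", "information user"],
]
pairs = coword_counts(papers)
print(pairs.most_common(1))
```

Running the same function separately on each publication window would give per-period pair counts that can then be compared.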

A Study on the Smart Tourism Awareness through Bigdata Analysis

  • LEE, Song-Yi;LEE, Hwan-Soo
    • The Journal of Industrial Distribution & Business / v.11 no.5 / pp.45-52 / 2020
  • Purpose: In the 4th industrial revolution, services that incorporate various smart technologies in the tourism sector have begun to gain popularity. Accordingly, academic discussions on smart tourism have also become active in various fields. Despite recent research, the definition of smart tourism is still ambiguous, and it is not easy to differentiate its scope or characteristics from traditional tourism concepts. Thus, this study aims to analyze the perception of smart tourism expressed online in order to identify the current state of smart tourism in Korea and present a research direction for conceptualizing smart tourism suited to the domestic situation. Research design, data, and methodology: This study analyzes the perception of smart tourism expressed online based on 20,198 news items from portal sites over the past six years. Data on words used with smart tourism were collected from the leading portal sites Naver, Daum, and Google. Text mining techniques were applied to identify the state of social awareness of smart tourism. Network analysis was used to visualize the relationships between words related to smart tourism, and CONCOR analysis was conducted to derive clusters of similar words. Results: Keyword analysis showed that words related to the development and construction of smart tourism areas were highly frequent. The analysis of connection centrality between words showed that the frequency of keywords was similar, and that the words "smartphones" and "China" had relatively high connection centrality. The results of network analysis and CONCOR indicated that words formed eight groups: related technologies, promotion, globalization, service introduction, innovation, regional society, activation, and utilization guide. Overall, the data analysis showed that the development of smart tourism cities was a noticeable issue. Conclusions: This study is meaningful in that it clearly reflects the differences between online perceptions of smart tourism and research trends, despite various efforts to develop smart tourism in Korea. In addition, this study highlights the need to understand smart tourism concepts and enhance academic discussions. It is expected that such academic discussions will contribute to improving the competitiveness of smart tourism research in Korea.
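
The centrality figures reported here come from a word co-occurrence network. The sketch below computes normalized degree centrality from a list of edges, on the assumption that "connection centrality" in the abstract refers to degree centrality; the function name `degree_centrality` and the edge list are hypothetical, and the CONCOR clustering step is not shown.

```python
from collections import defaultdict

def degree_centrality(edges):
    """Return degree / (n - 1) for each node of an undirected graph."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}

# Made-up co-occurrence edges, for illustration only.
edges = [
    ("smart tourism", "smartphone"),
    ("smart tourism", "China"),
    ("smart tourism", "city"),
    ("smartphone", "China"),
]
centrality = degree_centrality(edges)
for word, value in sorted(centrality.items(), key=lambda kv: -kv[1]):
    print(f"{word}: {value:.2f}")
```

Dividing by n - 1 puts every score between 0 and 1, so words from networks of different sizes can be compared.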

Study on Chinese Consumers' Perceptions of Samsung Smartphones through Social Media Data Analysis (소셜 미디어 데이터 분석을 통한 중국 소비자의 삼성 스마트폰에 대한 인식 연구)

  • Cui Ran;Inyong Nam
    • The Journal of the Convergence on Culture Technology / v.10 no.4 / pp.311-321 / 2024
  • This study comprehensively analyzed the perceptions of Chinese consumers who have and have not purchased Samsung smartphones, based on data from the social media platform Weibo. Various big data analysis techniques were used, including text mining, frequency analysis, centrality analysis, semantic network analysis, and CONCOR analysis. The results indicate that positive perceptions of Samsung smartphones include aspects such as design aesthetics, camera functionality, AI features, screen quality, specifications, and performance, and their status as a premium brand. On the other hand, negative perceptions include issues with pricing, a yellow tint in photos, slow charging speeds, and safety concerns. These findings will provide a crucial basis for making significant improvements in Samsung's market strategy in China.

An Attempt to Measure the Familiarity of Specialized Japanese in the Nursing Care Field

  • Haihong Huang;Hiroyuki Muto;Toshiyuki Kanamaru
    • Asia Pacific Journal of Corpus Research / v.4 no.2 / pp.57-74 / 2023
  • Having a firm grasp of technical terms is essential for learners of Japanese for Specific Purposes (JSP). This research aims to analyze Japanese nursing care vocabulary based on objective corpus-based frequency and subjectively rated word familiarity. For this purpose, we constructed a text corpus centered on the National Examination for Certified Care Workers to extract nursing care keywords. The Log-Likelihood Ratio (LLR) was used as the statistical criterion for keyword identification, giving a list of 300 keywords as target words for a further word recognition survey. The survey involved 115 participants of whom 51 were certified care workers (CW group) and 64 were individuals from the general public (GP group). These participants rated the familiarity of the target keywords through crowdsourcing. Given the limited sample size, Bayesian linear mixed models were utilized to determine word familiarity rates. Our study conducted a comparative analysis of word familiarity between the CW group and the GP group, revealing key terms that are crucial for professionals but potentially unfamiliar to the general public. By focusing on these terms, instructors can bridge the knowledge gap more efficiently.
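
Keyword identification with the Log-Likelihood Ratio compares a word's frequency in a target corpus against a reference corpus. The sketch below uses the common two-term form G2 = 2 * sum(O * ln(O / E)); whether the study used this exact variant is an assumption, and the function name `log_likelihood` and all numbers are invented for illustration.

```python
import math

def log_likelihood(freq_target, freq_ref, size_target, size_ref):
    """Two-term log-likelihood (G2) keyness score for one word."""
    total_freq = freq_target + freq_ref
    total_size = size_target + size_ref
    # Expected frequencies if the word were equally likely in both corpora.
    expected_target = size_target * total_freq / total_size
    expected_ref = size_ref * total_freq / total_size
    g2 = 0.0
    for observed, expected in (
        (freq_target, expected_target),
        (freq_ref, expected_ref),
    ):
        if observed > 0:  # a zero observation contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2

# Made-up counts: a word seen 50 times in a 1,000-token target corpus
# and 5 times in a 1,000-token reference corpus.
print(round(log_likelihood(50, 5, 1000, 1000), 2))
```

A larger score indicates a stronger departure from equal relative frequency; the score alone does not say in which corpus the word is over-represented, so the relative frequencies need to be compared as well.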