• 제목/요약/키워드: Scattertext

검색결과 4건 처리시간 0.016초

토픽모델링을 활용한 해운물류 뉴스 분석 (Analysis of Shipping and Logistics News Articles using Topic Modeling)

  • 윤희영;곽일엽
    • 무역학회지
    • /
    • 제46권4호
    • /
    • pp.61-76
    • /
    • 2021
  • This study focuses on three logistics-related news (Logistics Newspaper, Korea Shipping Gadget, and Korea Shipping Newspaper) in order to present changes in logistics issues, centering on Corona 19, which has recently had the greatest impact in the world. For data collection, two-year news articles in 2019 and 2020 (title, article, content, date, article classification, article URL) were collected through web crawling (using Python's BeautifulSoup, requests module) on the homepages of three representative logistics-related media companies. As for the data analysis methods, fundamental statistical analysis, Latent Dirichlet Allocation (LDA) for topic modeling, and Scattertext were performed. The analysis results were as follows. First, among the three news media related to logistics, the Korea Shipping Newspaper was carrying out the most active media activities. Second, through topic modeling with LDA, eight logistics-related topics were identified, and keywords and significant issues of each topic were presented. Third, the keywords were visually expressed through Scattertext. This is the first study to present changes in the logistics field, focusing on articles from representative logistics-related media in 2019 and 2020. In particular, 2019 and 2020 can be divided into before and after the outbreak of Corona 19, which has had a great impact not only on the logistics field but also on our lives as a whole. For future work, a multi-faceted approach is required, such as comparative studies of logistics issues between countries or presenting implications based on long-term time-series articles.

비정형 텍스트 데이터 분석을 활용한 기록관리 분야 연구동향 (Research Trends in Record Management Using Unstructured Text Data Analysis)

  • 홍덕용;허준석
    • 한국기록관리학회지
    • /
    • 제23권4호
    • /
    • pp.73-89
    • /
    • 2023
  • 본 연구에서는 텍스트 마이닝 기법을 활용하여 국내 기록관리 연구 분야의 비정형 텍스트 데이터인 국문 초록에서 사용된 키워드 빈도를 분석하여 키워드 간 거리 분석을 통해 국내기록관리 연구 동향을 파악하는 것이 목적이다. 이를 위해 한국학술지인용색인(Korea Citation Index, KCI)의 학술지 기관통계(등재지, 등재후보지)에서 대분류(복합학), 중분류 (문헌정보학)으로 검색된 학술지(28종) 중 등재지 7종 1,157편을 추출하여 77,578개의 키워드를 시각화하였다. Word2vec를 활용한 t-SNE, Scattertext 등의 분석을 수행하였다. 분석 결과, 첫째로 1,157편의 논문에서 얻은 77,578개의 키워드를 빈도 분석한 결과, "기록관리" (889회), "분석"(888회), "아카이브"(742회), "기록물"(562회), "활용"(449회) 등의 키워드가 연구자들에 의해 주요 주제로 다뤄지고 있음을 확인하였다. 둘째로, Word2vec 분석을 통해 키워드 간의 벡터 표현을 생성하고 유사도 거리를 조사한 뒤, t-SNE와 Scattertext를 활용하여 시각화하였다. 시각화 결과에서 기록관리 연구 분야는 두 그룹으로 나누어졌는데 첫 번째 그룹(과거)에는 "아카이빙", "국가기록관리", "표준화", "공문서", "기록관리제도" 등의 키워드가 빈도가 높게 나타났으며, 두 번째 그룹(현재)에는 "공동체", "데이터", "기록정보서비스", "온라인", "디지털 아카이브" 등의 키워드가 주요한 관심을 받고 있는 것으로 나타났다.

초록데이터를 활용한 국내외 FTA 연구동향: 2000-2020 (Trends in FTA Research of Domestic and International Journal using Paper Abstract Data)

  • 윤희영;곽일엽
    • 무역학회지
    • /
    • 제45권5호
    • /
    • pp.37-53
    • /
    • 2020
  • This study aims to provide the implications of research development by comparing domestic and international studies conducted on the subject of FTA. To this end, among the papers written during the period from 2000 to July 23, 2020, papers whose title is searched by FTA (Free Trade Agreement) were selected as research data. In the case of domestic research, 1,944 searches from the Korean Citation Index (KCI) and 970 from the Web of Science and SCOPUS were selected for international research, and the research trend was analyzed through keywords and abstracts. Frequency analysis and word embedding (Word2vec) were used to analyze the data and visualized using t-SNE and Scattertext. The results of the analysis are as follows. First, in the top 30 keywords of domestic and international research, 16 out of 30 were found to be the same. In domestic research, many studies have been conducted to analyze the outcomes or expected effects of countries that have concluded or discussed FTAs with Korea, on the other hand there are diverse range of study subjects in international research. Second, in the word embedding analysis, t-SNE was used to visually represent the research connection of the top 60 keywords. Finally, Scattertext was used to visually indicate which keywords were frequently used in studies from 2000 to 2010, and from 2011 to 2020. This study is the first to draw implications for academic development through abstract and keyword analysis by applying various text mining approaches to the FTA related research papers. Further in-depth research is needed, including collecting a variety of FTA related text data, comparing and analyzing FTA studies in different countries.

Association Modeling on Keyword and Abstract Data in Korean Port Research

  • Yoon, Hee-Young;Kwak, Il-Youp
    • Journal of Korea Trade
    • /
    • 제24권5호
    • /
    • pp.71-86
    • /
    • 2020
  • Purpose - This study investigates research trends by searching for English keywords and abstracts in 1,511 Korean journal articles in the Korea Citation Index from the 2002-2019 period using the term "Port." The study aims to lay the foundation for a more balanced development of port research. Design/methodology - Using abstract and keyword data, we perform frequency analysis and word embedding (Word2vec). A t-SNE plot shows the main keywords extracted using the TextRank algorithm. To analyze which words were used in what context in our two nine-year subperiods (2002-2010 and 2010-2019), we use Scattertext and scaled F-scores. Findings - First, during the 18-year study period, port research has developed through the convergence of diverse academic fields, covering 102 subject areas and 219 journals. Second, our frequency analysis of 4,431 keywords in 1,511 papers shows that the words "Port" (60 times), "Port Competitiveness" (33 times), and "Port Authority" (29 times), among others, are attractive to most researchers. Third, a word embedding analysis identifies the words highly correlated with the top eight keywords and visually shows four different subject clusters in a t-SNE plot. Fourth, we use Scattertext to compare words used in the two research sub-periods. Originality/value - This study is the first to apply abstract and keyword analysis and various text mining techniques to Korean journal articles in port research and thus has important implications. Further in-depth studies should collect a greater variety of textual data and analyze and compare port studies from different countries.