• Title/Summary/Keyword: document topic


Development of Scaffolding Strategies Model by Information Search Process (ISP) (정보탐색과정(ISP)에 의한 스캐폴딩 전략 모형 개발)

  • Jeong-Hoon Lim
    • Journal of Korean Library and Information Science Society / v.54 no.1 / pp.143-165 / 2023
  • This study aims to propose a scaffolding strategy that can be applied to the information search process by using Kuhlthau's ISP model, which presented a design and implementation strategy for the mediation role in the learning process. To this end, the relevant literature was reviewed to categorize scaffolding strategies, and impressions were collected through student surveys after 150 middle school students in the Daejeon area took a project class that applied the scaffolding strategy based on the ISP model. The collected data were preprocessed so that word frequencies could be extracted, and topic analysis was performed using STM (Structural Topic Modeling). After determining the optimal number of topics and extracting topics for each stage of the ISP model, the extracted topics were classified into three types: cognitive domain-macro perspective, cognitive domain-micro perspective, and emotional domain perspective. In this process, we focused on cognitive verbs and emotional verbs among the words extracted through text mining, and presented a scaffolding strategy model related to each topic by reviewing representative document cases. The results suggest that providing an appropriate scaffolding strategy at each stage of the ISP model can be expected to have a positive effect on learners' self-directed task solving.
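
The preprocessing step described in this abstract (turning free-text survey responses into word frequencies before topic analysis) can be sketched roughly as follows. This is a minimal illustration, not the study's pipeline: the stop-word list and sample responses are hypothetical, and real Korean text would presumably need a morphological analyzer rather than the simple regular-expression tokenizer assumed here.

```python
from collections import Counter
import re

# Hypothetical stop-word list; the study's actual list is not given.
STOP_WORDS = {"the", "a", "i", "to", "was", "it", "and", "of"}

def word_frequencies(responses):
    """Lowercase each response, keep alphabetic tokens, drop stop words, count terms."""
    counts = Counter()
    for text in responses:
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts

# Hypothetical survey responses, for illustration only.
responses = [
    "I was confused when choosing the topic",
    "Choosing a topic was hard, and I felt confused",
]
freq = word_frequencies(responses)
print(freq.most_common(3))
```

The resulting counts are the kind of document-term input a topic model such as STM would consume.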

The Research Features Analysis of Leisure and Recreation based on Co-authors Network and Topic Model (공저자 네트워크 및 토픽 모델링 기반 여가레크리에이션 학술 연구 특징 분석)

  • Park, SungGeon;Park, Kwang-Won;Kang, Hyun-Wook
    • 한국체육학회지인문사회과학편 / v.57 no.2 / pp.279-289 / 2018
  • The purpose of this study is to investigate the features of leisure and recreation scholarship in The Korean Journal of Physical Education based on a co-author network and topic modeling, using word clouds and LDA (Latent Dirichlet Allocation) topic modeling. The data collected for this study are 2,697 papers published online in The Korean Journal of Physical Education from January 2008 to March 2017. Research trends were explored using 369 related documents, with the analysis targets ordered as first author, corresponding author, co-author 1, co-author 2, through co-author n. The co-author network analysis found that 451 researchers were linked in the research network; on average, researchers had 1.52 relationships, and the average distance between researchers was 2.33. Degree centrality (concentration of connection) was highest, in order, for Lee. K. M., Hwang. S. H., H., Lee. C. S., and closeness centrality was highest for Seo K. B., Han. J. H., Kim. K. J. Finally, betweenness centrality was highest for Lee. C. W. and Seo. K. B., in that order, who most actively connected researchers of leisure-related academic papers. Future research needs discussion among scholars regarding the trend and direction of leisure research.
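
Two of the network statistics reported in this abstract, the average number of relationships per researcher and the average distance between researchers, can be illustrated with a small sketch. It assumes that "relationships" means co-authorship ties (node degree) and that distance means shortest-path length between connected researchers; the author lists are hypothetical.

```python
from collections import deque
from itertools import combinations

def build_graph(papers):
    """Build an undirected co-author graph: authors of the same paper are linked."""
    graph = {}
    for authors in papers:
        for a in authors:
            graph.setdefault(a, set())
        for a, b in combinations(authors, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

def average_degree(graph):
    """Mean number of distinct co-authors per researcher."""
    return sum(len(neighbors) for neighbors in graph.values()) / len(graph)

def average_distance(graph):
    """Mean shortest-path length over all connected ordered pairs (BFS per node)."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0

# Hypothetical author lists, one per paper.
papers = [["A", "B"], ["B", "C"], ["C", "D"]]
g = build_graph(papers)
print(average_degree(g), average_distance(g))
```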

Analysis of major issues in the field of Maritime Autonomous Surface Ships using text mining: focusing on S.Korea news data (텍스트 마이닝을 활용한 자율운항선박 분야 주요 이슈 분석 : 국내 뉴스 데이터를 중심으로)

  • Hyeyeong Lee;Jin Sick Kim;Byung Soo Gu;Moon Ju Nam;Kook Jin Jang;Sung Won Han;Joo Yeoun Lee;Myoung Sug Chung
    • Journal of the Korean Society of Systems Engineering / v.20 no.spc1 / pp.12-29 / 2024
  • The purpose of this study is to identify the social issues discussed in Korea regarding Maritime Autonomous Surface Ships (MASS), the most advanced ICT field in the shipbuilding industry, and to suggest policy implications. In recent years, it has become important to reflect social issues of public interest in the policymaking process. For this reason, an increasing number of studies use media data and social media to identify public opinion. In this study, we collected 2,843 domestic media articles related to MASS from 2017 to 2022, when MASS was officially discussed at the International Maritime Organization, and analyzed them using text mining techniques. Through term frequency-inverse document frequency (TF-IDF) analysis, major keywords such as 'shipbuilding,' 'shipping,' 'US,' and 'HD Hyundai' were derived. For LDA topic modeling, we selected the eight-topic model, which had the highest coherence score (-2.2), and analyzed the main news for each topic. According to the combined analysis of five years, the topics '1. Technology integration of the shipbuilding industry' and '3. Shipping industry in the post-COVID-19 era' received the most media attention, each accounting for 16%. Conversely, the topic '5. MASS pilotage areas' received the least media attention, accounting for 8%. Based on the results of the study, the implications for policy, society, and international security are as follows. First, from a policy perspective, the government should consider the current situation of each industry sector and introduce MASS carefully and in stages, as they will affect the shipbuilding, port, and shipping industries, and a radical introduction may cause various adverse effects. Second, from a social perspective, while the positive aspects of MASS are often reported, there are also negative issues such as cybersecurity and the loss of seafarer jobs, which require institutional development and strategic commercialization timing. Third, from a security perspective, MASS are expected to change the paradigm of future maritime warfare, and South Korea is promoting the construction of a force based on maritime unmanned systems, which calls for a clear plan and military leadership to secure and develop the technology. This study has academic and policy implications in that it sheds light on the multidimensional political and social issues of MASS through news data analysis and suggests implications from national, regional, strategic, and security perspectives beyond legal and institutional discussions.
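
The TF-IDF keyword step mentioned in this abstract can be sketched as below. The abstract does not state which weighting variant the study used, so this assumes one common form (relative term frequency multiplied by the natural log of N over document frequency); the tokenized articles are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical tokenized news articles.
docs = [
    ["mass", "shipbuilding", "shipbuilding"],
    ["mass", "shipping"],
    ["mass", "pilotage"],
]
w = tf_idf(docs)
print(w[0])
```

Under this variant, a term that appears in every article (here "mass") receives a weight of zero, which is why TF-IDF tends to surface terms that distinguish individual documents.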

Multilingual Story Link Detection based on Properties of Event Terms (사건 어휘의 특성을 반영한 다국어 사건 연결 탐색)

  • Lee Kyung-Soon
    • The KIPS Transactions:PartB / v.12B no.1 s.97 / pp.81-90 / 2005
  • In this paper, we propose a novel approach that models multilingual story link detection by adapting features such as timelines and multilingual spaces as weighting components, giving distinctive weights to terms related to events. On timelines, term significance is calculated by comparing the term distribution of the documents on a given day with that of the total document collection reported, and is used to represent the document vectors on that day. Since two languages can provide more information than one, term significance is measured in each language space and used as a bridge to the other language space. Evaluated on Korean and Japanese news articles, our method achieved 14.3% and 16.7% improvement for mono- and multi-lingual story pairs, and for multilingual story pairs, respectively. By measuring the space density, the proposed weighting components are verified with a high density of intra-event stories and a low density of inter-event stories. This result indicates that the proposed method is helpful for multilingual story link detection.
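
The timeline idea in this abstract, scoring a term by comparing its distribution on one day against the whole collection, can be illustrated with a simple stand-in. The abstract does not give the paper's formula, so the ratio of relative frequencies used here is an assumption, as are the function name and the sample documents. The sketch also assumes the day's documents are part of the collection.

```python
from collections import Counter

def term_significance(day_docs, all_docs):
    """Score each term of the day as (relative frequency on the day) / (relative frequency overall).

    A stand-in measure, not the paper's formula: values above 1 mean the term
    is over-represented on that day compared with the whole collection.
    """
    day_counts = Counter(t for doc in day_docs for t in doc)
    all_counts = Counter(t for doc in all_docs for t in doc)
    day_total = sum(day_counts.values())
    all_total = sum(all_counts.values())
    scores = {}
    for term, count in day_counts.items():
        if all_counts[term] == 0:
            continue  # term missing from the collection; skipped under the subset assumption
        scores[term] = (count / day_total) / (all_counts[term] / all_total)
    return scores

# Hypothetical tokenized news stories; the first one is "today's" document.
all_docs = [
    ["earthquake", "tokyo"],
    ["election", "seoul"],
    ["election", "vote"],
    ["weather", "seoul"],
]
scores = term_significance(all_docs[:1], all_docs)
print(scores)
```

Scores like these could then serve as per-day term weights when building document vectors.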

Investigating the Combination of Bag of Words and Named Entities Approach in Tracking and Detection Tasks among Journalists

  • Mohd, Masnizah;Bashaddadh, Omar Mabrook A.
    • Journal of Information Science Theory and Practice / v.2 no.4 / pp.31-48 / 2014
  • The proliferation of interactive Topic Detection and Tracking (iTDT) systems has motivated researchers to design systems that can track and detect news better. iTDT focuses on user interaction, user evaluation, and user interfaces. Recently, increasing effort has been devoted to user interfaces to improve TDT systems by investigating not just the user interaction aspect but also user- and task-oriented evaluation. This study investigates the combination of the bag of words and named entities approaches implemented in an iTDT interface called Interactive Event Tracking (iEvent), including which TDT tasks these approaches facilitate. iEvent is composed of three components: Cluster View (CV), Document View (DV), and Term View (TV). User experiments were carried out among journalists to compare three settings of iEvent: Setup 1 and Setup 2 (baseline setups), and Setup 3 (experimental setup). Setup 1 used bag of words and Setup 2 used named entities, while Setup 3 used a combination of bag of words and named entities. Journalists were asked to perform two TDT tasks: Tracking and Detection. Findings revealed that the combination of the bag of words and named entities approaches generally helped the journalists perform well in the TDT tasks. This study confirms that the combination approach in iTDT is useful and enhances the effectiveness of users' performance in the TDT tasks. It also offers suggestions on which features, and which approaches behind them, helped the journalists perform the TDT tasks.

Text Data Analysis Model Based on Web Application (웹 애플리케이션 기반의 텍스트 데이터 분석 모델)

  • Jin, Go-Whan
    • The Journal of the Korea Contents Association / v.21 no.11 / pp.785-792 / 2021
  • Since the Fourth Industrial Revolution, advances in technologies such as artificial intelligence and big data have brought various changes to society as a whole. The amount of data that can be collected in the process of applying these technologies tends to increase rapidly. In academia in particular, existing literature data is analyzed to grasp research trends; such analysis organizes the flow of research, methodologies, and themes, and by identifying the subjects currently being discussed, it contributes greatly to setting the direction of future research. However, collecting and analyzing document data is difficult without programming expertise. In this paper, we propose a text mining-based topic modeling web application model. With the proposed model, even users who lack specialized knowledge of data analysis methods can perform tasks such as collecting, storing, and text-analyzing research papers, and researchers can analyze previous research and research trends. The model is expected to reduce the time and effort required for the data analysis needed to understand research trends.

What has Korea told in the WTO? : An analysis on the Ministerial Conference Statements (WTO에서 한국은 무슨 말을 해왔나?: 각료회의 대표발언문 분석을 중심으로)

  • Jeong-meen Suh
    • Korea Trade Review / v.48 no.1 / pp.29-53 / 2023
  • This study analyzes the statements made by representatives of member countries at the WTO Ministerial Conference (MC), the highest decision-making body of the WTO, to examine the position and attitude that Korea has shown at the WTO during the last 27 years. After constructing a text dataset by extracting about 1,800 statement documents made by member countries from the WTO document database, a text mining technique is applied to identify the characteristics of Korea's statements compared to those of other member countries. Formal characteristics, such as the number of remarks and the length of speech, are used to measure basic attitudes such as the continuity and level of Korea's interest in the WTO. In terms of substantive characteristics, the topics in Korea's statements are categorized through the LDA topic model, and Korea's keywords for each session are analyzed through comparison with statements by other member countries.

An Improved Combined Content-similarity Approach for Optimizing Web Query Disambiguation

  • Kamal, Shahid;Ibrahim, Roliana;Ghani, Imran
    • Journal of Internet Computing and Services / v.16 no.6 / pp.79-88 / 2015
  • Web search engines face uncertainty when ambiguous queries are submitted to retrieve accurate results. Ambiguous queries constitute a significant fraction of such instances and pose real challenges to web search engines. Moreover, web search has created interest among researchers in handling search by considering context from a location perspective. Our proposed disambiguation approach is designed to improve user experience by combining location relevance with document relevance. The premise is that providing the user with a comprehensive location perspective of a topic is more informative than retrieving a result that contains only temporal or context information. From a user perspective, the ability to use this information in a location-aware manner is potentially useful for several tasks, including query understanding and location-based clustering. To carry out the approach, we developed a Java-based prototype that derives contextual information from the web results for queries drawn from well-known datasets. Among those results, queries are further classified in order to perform search in a broad way. After results are presented to users and they make a selection, feedback is recorded implicitly to improve web search based on contextual information. The experimental results demonstrate the performance of our approach: precision 75%, accuracy 73%, recall 81%, and F-measure 78% when compared with the generic temporal evaluation approach, and precision 86%, accuracy 71%, recall 67%, and F-measure 75% when compared with the web document clustering approach.
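
The reported F-measures are consistent with the standard harmonic mean of precision and recall, which can be checked with a short calculation. This assumes the balanced F1 form; the abstract does not say which F-measure variant was used.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall values taken from the abstract.
print(round(f_measure(0.75, 0.81), 2))  # 0.78
print(round(f_measure(0.86, 0.67), 2))  # 0.75
```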

Wrapper-based Economy Data Collection System Design And Implementation (래퍼 기반 경제 데이터 수집 시스템 설계 및 구현)

  • Piao, Zhegao;Gu, Yeong Hyeon;Yoo, Seong Joon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2015.05a / pp.227-230 / 2015
  • Analyzing and predicting economic trends requires collecting specific economic news and stock data. A typical web crawler analyzes page content, collects documents, and extracts URLs automatically. There are also crawlers that collect only documents on a particular topic. To collect economic news from a particular website, we need a crawler that can directly analyze the site's structure and gather data from it, which calls for a wrapper-based web crawler design. In this paper, we design a crawler wrapper for a big-data-based economic news analysis system and implement it to collect data. With the wrapper-based crawler, we collect stock data and sales data from the USA auto market since 2000, as well as economic news data from the USA and South Korea. The system determines the data update frequency of each site and updates the collected data periodically. We remove duplicate data and build a structured data set for subsequent analysis, first removing noise data such as advertising and public relations material.
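
The duplicate-removal step mentioned in this abstract could look roughly like the sketch below. The paper does not describe its method, so exact-match hashing of whitespace- and case-normalized article bodies is an assumption, and the record fields (`title`, `body`) are hypothetical.

```python
import hashlib

def deduplicate(articles):
    """Keep the first article for each distinct normalized body text."""
    seen = set()
    unique = []
    for article in articles:
        # Normalize case and whitespace so trivially different copies match.
        normalized = " ".join(article["body"].lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(article)
    return unique

# Hypothetical crawled records.
articles = [
    {"title": "Auto sales rise", "body": "US auto sales rose in May."},
    {"title": "Auto sales rise (repost)", "body": "US  auto sales rose in May. "},
    {"title": "Stocks fall", "body": "Korean stocks fell on Tuesday."},
]
unique = deduplicate(articles)
print([a["title"] for a in unique])
```

Exact hashing only catches copies that are identical after normalization; near-duplicates with edited wording would need a similarity-based method.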


A Circle Labeling Scheme without Re-labeling for Dynamically Updatable XML Data (동적으로 갱신가능한 XML 데이터에서 레이블 재작성하지 않는 원형 레이블링 방법)

  • Kim, Jin-Young;Park, Seog
    • Journal of KIISE:Databases / v.36 no.2 / pp.150-167 / 2009
  • XML has become the new standard for storing, exchanging, and publishing data over both the internet and the ubiquitous data stream environment. As demand for efficient handling of XML documents grows, labeling schemes have become an important topic in data storage. Recently proposed labeling schemes reflect the dynamic XML environment, which itself provides motivation for the discovery of an efficient labeling scheme. However, previously proposed labeling schemes have several problems, including: 1) inserting a new node into the XML document triggers re-labeling of pre-existing nodes, and 2) they need more memory space to store the full labels. In this paper, we introduce a new labeling scheme called the Circle Labeling Scheme (CLS). In CLS, XML documents are represented in a circular form, and efficient storage of labels is supported by the concepts of Rotation Number and Parent Circle/Child Circle. The concept of Radius is applied to support the insertion of new nodes at arbitrary positions in the tree. This eliminates the need to re-label existing nodes or to increase label length, and mitigates conflicts with existing labels. A detailed experimental study demonstrates the efficiency of CLS.