• Title/Summary/Keyword: Topic-Relevance

Search Result 60, Processing Time 0.026 seconds

The Influence of Topic Exploration and Topic Relevance On Amplitudes of Endogenous ERP Components in Real-Time Video Watching (실시간 동영상 시청시 주제탐색조건과 주제관련성이 내재적 유발전위 활성에 미치는 영향)

  • Kim, Yong Ho;Kim, Hyun Hee
    • Journal of Korea Multimedia Society
    • /
    • v.22 no.8
    • /
    • pp.874-886
    • /
    • 2019
  • To delve into the semantic gap problem of the automatic video summarization, we focused on an endogenous ERP responses at around 400ms and 600ms after the on-set of audio-visual stimulus. Our experiment included two factors: the topic exploration of experimental conditions (Topic Given vs. Topic Exploring) as a between-subject factor and the topic relevance of the shots (Topic-Relevant vs. Topic-Irrelevant) as a within-subject factor. For the Topic Given condition of 22 subjects, 6 short historical documentaries were shown with their video titles and written summaries, while in the Topic Exploring condition of 25 subjects, they were asked instead to explore topics of the same videos with no given information. EEG data were gathered while they were watching videos in real time. It was hypothesized that the cognitive activities to explore topics of videos while watching individual shots increase the amplitude of endogenous ERP at around 600 ms after the onset of topic relevant shots. The amplitude of endogenous ERP at around 400ms after the onset of topic-irrelevant shots was hypothesized to be lower in the Topic Given condition than that in the Topic Exploring condition. The repeated measure MANOVA test revealed that two hypotheses were acceptable.

A Method of Calculating Topic Keywords for Topic Labeling (토픽 레이블링을 위한 토픽 키워드 산출 방법)

  • Kim, Eunhoe;Suh, Yuhwa
    • Journal of Korea Society of Digital Industry and Information Management
    • /
    • v.16 no.3
    • /
    • pp.25-36
    • /
    • 2020
  • Topics calculated using LDA topic modeling have to be labeled separately. When labeling a topic, we look at the words that represent the topic, and label the topic. Therefore, it is important to first make a good set of words that represent the topic. This paper proposes a method of calculating a set of words representing a topic using TextRank, which extracts the keywords of a document. The proposed method uses Relevance to select words related to the topic with discrimination. It extracts topic keywords using the TextRank algorithm and connects keywords with a high frequency of simultaneous occurrence to express the topic with a higher coverage.

An Automated Topic Specific Web Crawler Calculating Degree of Relevance (연관도를 계산하는 자동화된 주제 기반 웹 수집기)

  • Seo Hae-Sung;Choi Young-Soo;Choi Kyung-Hee;Jung Gi-Hyun;Noh Sang-Uk
    • Journal of Internet Computing and Services
    • /
    • v.7 no.3
    • /
    • pp.155-167
    • /
    • 2006
  • It is desirable if users surfing on the Internet could find Web pages related to their interests as closely as possible. Toward this ends, this paper presents a topic specific Web crawler computing the degree of relevance. collecting a cluster of pages given a specific topic, and refining the preliminary set of related web pages using term frequency/document frequency, entropy, and compiled rules. In the experiments, we tested our topic specific crawler in terms of the accuracy of its classification, crawling efficiency, and crawling consistency. First, the classification accuracy using the set of rules compiled by CN2 was the best, among those of C4.5 and back propagation learning algorithms. Second, we measured the classification efficiency to determine the best threshold value affecting the degree of relevance. In the third experiment, the consistency of our topic specific crawler was measured in terms of the number of the resulting URLs overlapped with different starting URLs. The experimental results imply that our topic specific crawler was fairly consistent, regardless of the starting URLs randomly chosen.

  • PDF

A Video Summarization Study On Selecting-Out Topic-Irrelevant Shots Using N400 ERP Components in the Real-Time Video Watching (동영상 실시간 시청시 유발전위(ERP) N400 속성을 이용한 주제무관 쇼트 선별 자동영상요약 연구)

  • Kim, Yong Ho;Kim, Hyun Hee
    • Journal of Korea Multimedia Society
    • /
    • v.20 no.8
    • /
    • pp.1258-1270
    • /
    • 2017
  • 'Semantic gap' has been a year-old problem in automatic video summarization, which refers to the gap between semantics implied in video summarization algorithms and what people actually infer from watching videos. Using the external EEG bio-feedback obtained from video watchers as a solution of this semantic gap problem has several another issues: First, how to define and measure noises against ERP waveforms as signals. Second, whether individual differences among subjects in terms of noise and SNR for conventional ERP studies using still images captured from videos are the same with those differently conceptualized and measured from videos. Third, whether individual differences of subjects by noise and SNR levels help to detect topic-irrelevant shots as signals which are not matched with subject's own semantic topical expectations (mis-match negativity at around 400m after stimulus on-sets). The result of repeated measures ANOVA test clearly shows a 2-way interaction effect between topic-relevance and noise level, implying that subjects of low noise level for video watching session are sensitive to topic-irrelevant visual shots, while showing another 3-way interaction among topic-relevance, noise and SNR levels, implying that subjects of high noise level are sensitive to topic-irrelevant visual shots only if they are of low SNR level.

Curriculum Relevance Analysis of Physics Book Report Text Using Topic Modeling (토픽모델링을 활용한 물리학 독서감상문 텍스트의 교육과정 연계성 분석)

  • Lim, Jeong-Hoon
    • Journal of Korean Library and Information Science Society
    • /
    • v.53 no.2
    • /
    • pp.333-353
    • /
    • 2022
  • This study analyzed the relevance of the curriculum by applying topic modeling to book reports written as content area reading activities in the 'physics' class. In order to carry out the research, 332 physics book reports were collected to analyze the relevance among keywords and topics were extracted using STM. The result of the analysis showed that the main keywords of the physics book reports were 'thought', 'content', 'explain', 'theory', 'person', 'understanding'. To examine the influence and connection relationship of the derived keywords, the study presented degree centrality, between centrality, and eigenvetor centrality. As a result of the topic modeling analysis, eleven topics related to the physics curriculum were extracted, and the curriculum linkage could be drawn in three subjects (Physics I, Physics II, Science History), and six areas (force and motion, modern physics, wave, heat and energy, Western science history, and What is science). The analyzed results can be used as evidence for a more systematic implementation of content area reading activities which reflect the subject characteristics in the future.

Variations in relevance assessments and evaluation of the performance of full-text retrieval system (상이한 적합성 판정과 전문검색시스템의 평가에 관한 연구)

  • 문성빈
    • Journal of the Korean Society for information Management
    • /
    • v.14 no.2
    • /
    • pp.123-141
    • /
    • 1997
  • This study examined the extent to which variations in relevance assessments affect the evaluation of the performance of full-text retrieval system. Four sets of relevance judgments obtained by examining the full-text of documents were used to test the retrieval effectiveness. There was no noticeable difference in retrieval performance among the four relevance judgment sets. It implies that a variety of definitions of relevance has no effect on the evaluation of the performance of the full-text retrieval system. Furth r retrieval experiments on this topic incorporating relevance feedback, which is one of the sophisticated retrieval techniques using relevance information, are suggested.

  • PDF

Automatic Extraction Techniques of Topic-relevant Visual Shots Using Realtime Brainwave Responses (실시간 뇌파반응을 이용한 주제관련 영상물 쇼트 자동추출기법 개발연구)

  • Kim, Yong Ho;Kim, Hyun Hee
    • Journal of Korea Multimedia Society
    • /
    • v.19 no.8
    • /
    • pp.1260-1274
    • /
    • 2016
  • To obtain good summarization algorithms, we need first understand how people summarize videos. 'Semantic gap' refers to the gap between semantics implied in video summarization algorithms and what people actually infer from watching videos. We hypothesized that ERP responses to real time videos will show either N400 effects to topic-irrelevant shots in the 300∼500ms time-range after stimulus on-set or P600 effects to topic-relevant shots in the 500∼700ms time range. We recruited 32 participants in the EEG experiment, asking them to focus on the topic of short videos and to memorize relevant shots to the topic of the video. After analysing real time videos based on the participants' rating information, we obtained the following t-test result, showing N400 effects on PF1, F7, F3, C3, Cz, T7, and FT7 positions on the left and central hemisphere, and P600 effects on PF1, C3, Cz, and FCz on the left and central hemisphere and C4, FC4, P8, and TP8 on the right. A further 3-way MANOVA test with repeated measures of topic-relevance, hemisphere, and electrode positions showed significant interaction effects, implying that the left hemisphere at central, frontal, and pre-frontal positions were sensitive in detecting topic-relevant shots while watching real time videos.

Users' Relevance Criteria in Universal Search in Korea : An Exploratory Study (통합 검색 환경에서 이용자 적합성 판단 기준에 관한 탐색적 연구)

  • Park, Jung-Ah
    • Journal of the Korean Society for information Management
    • /
    • v.29 no.2
    • /
    • pp.113-133
    • /
    • 2012
  • This study is an exploratory research on the user relevance criteria in Korean search service environments that provide integrated search results. Data were collected from 10 participants using a semi-structured interview technique. The participants conducted a web search using integrated search services, such as Naver or Daum on a self-selected topic. They were asked to judge the relevance of retrieved documents and to report their relevance criteria. As a result, the research indicated 8 user-defined relevance and non-relevance criteria. The research shows that specificity and richness are the two most important criteria yet, the user's relevance criteria have not changed much despite the change in search environment.

Construction of Event Networks from Large News Data Using Text Mining Techniques (텍스트 마이닝 기법을 적용한 뉴스 데이터에서의 사건 네트워크 구축)

  • Lee, Minchul;Kim, Hea-Jin
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.1
    • /
    • pp.183-203
    • /
    • 2018
  • News articles are the most suitable medium for examining the events occurring at home and abroad. Especially, as the development of information and communication technology has brought various kinds of online news media, the news about the events occurring in society has increased greatly. So automatically summarizing key events from massive amounts of news data will help users to look at many of the events at a glance. In addition, if we build and provide an event network based on the relevance of events, it will be able to greatly help the reader in understanding the current events. In this study, we propose a method for extracting event networks from large news text data. To this end, we first collected Korean political and social articles from March 2016 to March 2017, and integrated the synonyms by leaving only meaningful words through preprocessing using NPMI and Word2Vec. Latent Dirichlet allocation (LDA) topic modeling was used to calculate the subject distribution by date and to find the peak of the subject distribution and to detect the event. A total of 32 topics were extracted from the topic modeling, and the point of occurrence of the event was deduced by looking at the point at which each subject distribution surged. As a result, a total of 85 events were detected, but the final 16 events were filtered and presented using the Gaussian smoothing technique. We also calculated the relevance score between events detected to construct the event network. Using the cosine coefficient between the co-occurred events, we calculated the relevance between the events and connected the events to construct the event network. Finally, we set up the event network by setting each event to each vertex and the relevance score between events to the vertices connecting the vertices. The event network constructed in our methods helped us to sort out major events in the political and social fields in Korea that occurred in the last one year in chronological order and at the same time identify which events are related to certain events. Our approach differs from existing event detection methods in that LDA topic modeling makes it possible to easily analyze large amounts of data and to identify the relevance of events that were difficult to detect in existing event detection. We applied various text mining techniques and Word2vec technique in the text preprocessing to improve the accuracy of the extraction of proper nouns and synthetic nouns, which have been difficult in analyzing existing Korean texts, can be found. In this study, the detection and network configuration techniques of the event have the following advantages in practical application. First, LDA topic modeling, which is unsupervised learning, can easily analyze subject and topic words and distribution from huge amount of data. Also, by using the date information of the collected news articles, it is possible to express the distribution by topic in a time series. Second, we can find out the connection of events in the form of present and summarized form by calculating relevance score and constructing event network by using simultaneous occurrence of topics that are difficult to grasp in existing event detection. It can be seen from the fact that the inter-event relevance-based event network proposed in this study was actually constructed in order of occurrence time. It is also possible to identify what happened as a starting point for a series of events through the event network. The limitation of this study is that the characteristics of LDA topic modeling have different results according to the initial parameters and the number of subjects, and the subject and event name of the analysis result should be given by the subjective judgment of the researcher. Also, since each topic is assumed to be exclusive and independent, it does not take into account the relevance between themes. Subsequent studies need to calculate the relevance between events that are not covered in this study or those that belong to the same subject.

Analysis of Issues Related to Artificial Intelligence Based on Topic Modeling (토픽모델링을 활용한 인공지능 관련 이슈 분석)

  • Noh, Seol-Hyun
    • Journal of Digital Convergence
    • /
    • v.18 no.5
    • /
    • pp.75-87
    • /
    • 2020
  • The present study determined new value that can be created through the convergence between artificial intelligence technology (AIT) and all industries by deriving and thoroughly analyzing major issues related to artificial intelligence (AI). This study analyzes domestic articles related to AI using topic modeling method based on LDA algorithm. Keywords were extracted from 3,889 articles of eleven metropolitan newspapers, eight business newspapers and major broadcasting companies; articles were selected by searching for the keyword "artificial intelligence". Keywords were extracted by optimizing the relevance parameter λ to improve the measure of pointwise mutual information (PMI), which shows the association among the keywords of each topic, and topic names were inferred from keywords based on valid evidence. The extracted topics widely showed changes occurring throughout society, economy, industries, culture, and the support policy and vision of the government.