Trend Properties and a Ranking Method for Automatic Trend Analysis

Oh, Heung-Seon; Choi, Yoon-Jung; Shin, Wook-Hyun; Jeong, Yoon-Jae; Myaeng, Sung-Hyon

Journal of KIISE: Software and Applications (한국정보과학회논문지:소프트웨어및응용)

Volume 36, Issue 3 / Pages 236-243 / 2009 / 1229-6848 (pISSN)

Korean Institute of Information Scientists and Engineers (한국정보과학회)

Original title (Korean): 자동 트렌드 탐지를 위한 속성의 정의 및 트렌드 순위 결정 방법

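Oh, Heung-Seon (Department of Information and Communications Engineering, KAIST);
Choi, Yoon-Jung (Department of Information and Communications Engineering, KAIST);
Shin, Wook-Hyun (Department of Information and Communications Engineering, KAIST);
Jeong, Yoon-Jae (Department of Information and Communications Engineering, KAIST);
Myaeng, Sung-Hyon (Department of Information and Communications Engineering, KAIST)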

Published: 2009.03.15

Abstract

Automatic trend analysis over collections of time-stamped documents such as patents, newspapers, and blog pages is emerging, alongside topic detection and tracking (TDT), as an important and challenging research problem. Past research in this area has mainly focused on measuring the strength of a given concept from the frequencies of trend-associated terms and showing its trend line over time. To detect emerging trends, earlier work either applied a simple criterion such as a change in frequency or compared the data as a whole against training data and reported the differences. However, presenting the most salient trends among many candidates to a user requires a ranking function. To this end, we define four properties of trend lines drawn from frequency information (change, persistency, stability, and volume) to quantify various aspects of trends, and we propose a method that uses them to rank trend lines. In a series of experiments, we examine the validity of each property individually and analyze how combinations of the properties affect the ranking. The results show that a judicious combination of the four properties is a better indicator of salient trends than any single criterion previously used for ranking or detecting emerging trends.
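
To make the idea of ranking trend lines by several properties concrete, the following is a minimal Python sketch. The abstract names the four properties but does not give their formulas, so every definition here is an assumption chosen for illustration: change is the net rise relative to the mean level, persistency is the longest run of consecutive increases, stability is an inverse of the spread of step-to-step differences, and volume is the total frequency. The function names `trend_properties` and `rank_trends`, the min-max scaling, and the equal default weights are likewise hypothetical and should not be read as the paper's method.

```python
from statistics import mean, pstdev


def trend_properties(freqs):
    """Return illustrative (change, persistency, stability, volume) values.

    All four formulas are assumptions made for this sketch; the abstract
    names the properties but does not define them.
    """
    if len(freqs) < 2:
        raise ValueError("need at least two time steps")
    diffs = [b - a for a, b in zip(freqs, freqs[1:])]

    # Change (assumed): net rise from first to last period, relative to the mean level.
    avg = mean(freqs)
    change = (freqs[-1] - freqs[0]) / avg if avg else 0.0

    # Persistency (assumed): longest run of consecutive increases, as a fraction of steps.
    longest = run = 0
    for d in diffs:
        run = run + 1 if d > 0 else 0
        longest = max(longest, run)
    persistency = longest / len(diffs)

    # Stability (assumed): higher when step-to-step differences vary less.
    stability = 1.0 / (1.0 + pstdev(diffs))

    # Volume (assumed): total frequency over the whole period.
    volume = float(sum(freqs))
    return change, persistency, stability, volume


def rank_trends(trend_lines, weights=(0.25, 0.25, 0.25, 0.25)):
    """Rank named trend lines by a weighted sum of min-max scaled properties."""
    names = list(trend_lines)
    props = [trend_properties(trend_lines[n]) for n in names]

    def scale(column):
        lo, hi = min(column), max(column)
        return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in column]

    # Scale each property across all trends, then regroup the values per trend.
    scaled_rows = zip(*(scale(col) for col in zip(*props)))
    scores = {
        name: sum(w * v for w, v in zip(weights, row))
        for name, row in zip(names, scaled_rows)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    trends = {
        "rising": [1, 2, 3, 4, 5, 6],
        "flat": [3, 3, 3, 3, 3, 3],
        "spiky": [0, 9, 0, 9, 0, 9],
    }
    for name, score in rank_trends(trends):
        print(f"{name}: {score:.3f}")
```

On this toy data, equal weights place the steadily rising line first, whereas ranking by volume alone (`weights=(0, 0, 0, 1)`) places the spiky line first. This only illustrates how a combined score and a single criterion can disagree; it is not evidence for the paper's experimental results.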
