http://dx.doi.org/10.1633/JISTaP.2014.2.4.3

Investigating the Combination of Bag of Words and Named Entities Approach in Tracking and Detection Tasks among Journalists  

Mohd, Masnizah (Japan Advanced Institute of Science and Technology (JAIST))
Bashaddadh, Omar Mabrook A. (Universiti Kebangsaan Malaysia)
Publication Information
Journal of Information Science Theory and Practice, v.2, no.4, 2014, pp. 31-48
Abstract
The proliferation of interactive Topic Detection and Tracking (iTDT) systems has motivated researchers to design systems that track and detect news more effectively. iTDT focuses on user interaction, user evaluation, and user interfaces. Recently, increasing effort has been devoted to user interfaces for TDT systems, investigating not only user interaction but also user- and task-oriented evaluation. This study investigates the combination of the bag of words and named entities approaches as implemented in an iTDT interface called Interactive Event Tracking (iEvent), including which TDT tasks these approaches facilitate. iEvent is composed of three components: Cluster View (CV), Document View (DV), and Term View (TV). User experiments were carried out among journalists to compare three settings of iEvent: Setup 1 and Setup 2 (baseline setups) and Setup 3 (experimental setup). Setup 1 used bag of words, Setup 2 used named entities, and Setup 3 used a combination of the two. Journalists were asked to perform two TDT tasks: Tracking and Detection. The findings revealed that the combination of the bag of words and named entities approaches generally helped the journalists perform well in the TDT tasks. The study confirms that the combination approach in iTDT is useful and enhances users' effectiveness in performing the TDT tasks. It also suggests which features, under which approach, helped the journalists perform the TDT tasks.
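The three setups compared in the abstract can be illustrated with a minimal sketch. This is not the iEvent implementation. The stopword list, the capitalization heuristic standing in for a real named entity recognizer, the `NE:` prefix, and all function names are assumptions made for illustration only, and the abstract does not say how the two approaches were actually combined.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "in", "of", "on", "and", "to", "was", "is", "by"}


def bag_of_words(text):
    """Setup 1 style: counts of lowercased words, stopwords removed."""
    tokens = re.findall(r"[A-Za-z]+", text)
    return Counter(t.lower() for t in tokens if t.lower() not in STOPWORDS)


def named_entities(text):
    """Setup 2 style: crude stand-in for named entity recognition.

    Groups consecutive capitalized, non-stopword tokens within a sentence
    into one entity. A real system would use a trained NER component.
    """
    entities = Counter()
    for sentence in re.split(r"[.!?]\s*", text):
        run = []
        # The trailing empty string flushes a run that ends the sentence.
        for token in re.findall(r"[A-Za-z]+", sentence) + [""]:
            if token[:1].isupper() and token.lower() not in STOPWORDS:
                run.append(token)
            else:
                if run:
                    entities[" ".join(run)] += 1
                run = []
    return entities


def combined(text):
    """Setup 3 style: both feature sets side by side (assumed combination)."""
    features = Counter(bag_of_words(text))
    for entity, count in named_entities(text).items():
        features["NE:" + entity] += count
    return features


SAMPLE = (
    "The earthquake struck Kuala Lumpur on Monday. "
    "In Kuala Lumpur, officials said the damage was minor."
)

if __name__ == "__main__":
    print(bag_of_words(SAMPLE).most_common(3))
    print(named_entities(SAMPLE).most_common())
    print(len(combined(SAMPLE)))
```

In this sketch the bag of words view splits "Kuala Lumpur" into two unrelated terms, while the named entities view keeps it as a single unit but drops descriptive words such as "earthquake". The combined view keeps both kinds of feature, which is the general idea behind comparing Setup 3 against the two baselines.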
Keywords
Bag of Words; Named Entity Recognition; Interactive Topic Detection and Tracking; User; Journalists