http://dx.doi.org/10.3743/KOSIM.2019.36.4.083

Enhancing Classification Performance of Temporal Keyword Data by Using Moving Average-based Dynamic Time Warping Method  

Jeong, Do-Heon (Department of Library and Information Science, Duksung Women's University)
Publication Information
Journal of the Korean Society for Information Management / v.36, no.4, 2019, pp. 83-105
Abstract
This study proposes an effective method for automatically classifying keywords with similar temporal patterns by calculating the pattern similarity of their time series. To this end, a large-scale collection of Web news articles was gathered, and time series data consisting of 120 time segments were built. To create a training data set for evaluating the proposed model, 440 representative keywords were manually classified into 8 trend types. The study introduces Dynamic Time Warping (DTW), a method commonly used in time series analysis, and proposes an applied model, MA-DTW, based on the Moving Average (MA) method, which describes the tendency of a trend curve well. In automatic classification with a k-Nearest Neighbor (kNN) algorithm, Euclidean Distance (ED) and DTW achieved maximum micro-averaged F1 scores of 48.2% and 66.6%, respectively, whereas the proposed model achieved the best micro-averaged F1 score of 74.3%. In all of the experiments, the proposed model outperformed both ED and DTW.
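The abstract does not spell out how the moving average and DTW are combined in MA-DTW. As a rough illustration only, the sketch below assumes one plausible reading: each series is smoothed with a simple moving average, the smoothed series are compared with standard DTW, and a kNN majority vote assigns the label. The function names, the window size, the toy series, and the labels "rising" and "falling" are all hypothetical and do not come from the paper.

```python
from collections import Counter


def moving_average(series, window):
    """Simple moving average; returns len(series) - window + 1 points."""
    if not 1 <= window <= len(series):
        raise ValueError("window must be between 1 and len(series)")
    total = sum(series[:window])
    out = [total / window]
    for i in range(window, len(series)):
        total += series[i] - series[i - window]  # slide the window by one step
        out.append(total / window)
    return out


def dtw_distance(a, b):
    """Standard DTW with absolute difference as the local cost."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)  # row 0 of the cumulative cost matrix
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            cost = abs(x - y)
            # extend the cheapest of: insertion, deletion, match
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(b)]


def ma_dtw_distance(a, b, window):
    """Assumed MA-DTW: DTW computed on moving-average-smoothed series."""
    return dtw_distance(moving_average(a, window), moving_average(b, window))


def knn_classify(query, training, k, window):
    """Majority vote over the k training series nearest to the query."""
    ranked = sorted(training, key=lambda item: ma_dtw_distance(query, item[0], window))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): two trend types instead of the paper's eight.
train = [
    ([1, 2, 3, 4, 5, 6], "rising"),
    ([0, 1, 3, 4, 6, 7], "rising"),
    ([6, 5, 4, 3, 2, 1], "falling"),
    ([7, 6, 4, 3, 1, 0], "falling"),
]
query = [1, 3, 2, 5, 4, 7]  # noisy upward trend
predicted = knn_classify(query, train, k=3, window=3)
```

Smoothing before warping is meant to keep DTW from aligning on short-lived spikes rather than on the overall shape of the trend curve; whether the paper applies the moving average in exactly this way is not stated in the abstract.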
Keywords
dynamic time warping; moving average; k-nearest neighbor; temporal analysis; pattern mining
Citations & Related Records
Times Cited by KSCI: 6