http://dx.doi.org/10.12815/kits.2020.19.4.81

A Study on the Compression and Major Pattern Extraction Method of Origin-Destination Data with Principal Component Analysis  

Kim, Jeongyun (Dept. of Civil and Environmental Engineering, KAIST)
Tak, Sehyun (Center for Connected and Automated Driving Research, The Korea Transport Institute)
Yoon, Jinwon (Dept. of Civil and Environmental Engineering, KAIST)
Yeo, Hwasoo (Dept. of Civil and Environmental Engineering, KAIST)
Publication Information
The Journal of The Korea Institute of Intelligent Transport Systems, v.19, no.4, 2020, pp. 81-99
Abstract
Origin-destination data are collected and used for demand analysis and service design in fields such as public transportation and traffic operation. As big data analysis grows in importance, so does the need to store raw origin-destination data. However, storing and analyzing the raw data over a long period is impractical because the data size grows as a power of the number of collection points. To overcome this storage limitation and enable long-period pattern analysis, this study proposes a methodology, based on principal component analysis, for compressing origin-destination data and analyzing the compressed data. The proposed methodology is applied to public transit data from Sejong and Seoul. We first measure the reconstruction error and the data size for each truncated matrix. Then, to determine the range of principal components that filters out random data, we measure the level of regularity using covariance coefficients of the demand data reconstructed with each range of principal components. Based on the distribution of the covariance coefficients, we find the range of principal components that covers the regular demand: components 1 to 60 for Sejong and components 1 to 80 for Seoul.
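The compression step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the origin-destination data are arranged as a matrix with one row per observation period (for example, one day) and one column per origin-destination pair, and it uses synthetic data in place of the Sejong and Seoul transit records. The function names (compress_od, reconstruct_od, relative_error, stored_values) and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def compress_od(X, k):
    """Truncated PCA of X (periods x OD pairs), keeping the first k components."""
    mean = X.mean(axis=0)
    # SVD of the mean-centered data; rows of Vt are the principal directions.
    U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    scores = U[:, :k] * s[:k]   # shape (periods, k)
    components = Vt[:k]         # shape (k, OD pairs)
    return mean, scores, components


def reconstruct_od(mean, scores, components):
    """Approximate the original matrix from the stored pieces."""
    return mean + scores @ components


def relative_error(X, X_hat):
    """Frobenius-norm reconstruction error relative to the original data."""
    return np.linalg.norm(X - X_hat) / np.linalg.norm(X)


def stored_values(mean, scores, components):
    """Number of values that must be kept to reconstruct the data."""
    return mean.size + scores.size + components.size


# Synthetic stand-in for real data (assumption): 60 days, 20 stops,
# demand built from 3 regular patterns plus small random noise.
rng = np.random.default_rng(0)
n_days, n_stops = 60, 20
n_pairs = n_stops * n_stops
patterns = rng.random((3, n_pairs)) * 100.0
weights = rng.random((n_days, 3))
X = weights @ patterns + rng.normal(0.0, 1.0, (n_days, n_pairs))

errors = {}
sizes = {}
for k in (1, 3, 10):
    mean, scores, components = compress_od(X, k)
    errors[k] = relative_error(X, reconstruct_od(mean, scores, components))
    sizes[k] = stored_values(mean, scores, components)
    print(f"k={k}: relative error={errors[k]:.4f}, "
          f"stored values={sizes[k]} of {X.size}")
```

Keeping more components lowers the reconstruction error and increases the stored size. The range of components to keep reflects that trade-off, and the study selects it using the regularity of the reconstructed demand.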
Keywords
Big data; Public transit data; Origin-destination data; Principal component analysis; Pattern analysis;