• Title/Summary/Keyword: Taxi Trajectory Data


Labeling Big Spatial Data: A Case Study of New York Taxi Limousine Dataset

  • AlBatati, Fawaz;Alarabi, Louai
    • International Journal of Computer Science & Network Security / v.21 no.6 / pp.207-212 / 2021
  • Clustering unlabeled spatial datasets to convert them into labeled spatial datasets is a challenging task, especially for geographical information systems. In this study, we investigated the NYC Taxi and Limousine Commission dataset and found that all of its spatio-temporal trajectories are unlabeled, which makes the data unsuitable for data mining tasks such as classification and regression. It is therefore necessary to convert the unlabeled spatial datasets into labeled ones, and we use a clustering technique to do so for all the trajectory datasets. A key difficulty in applying machine learning classification algorithms in many applications is that they require large amounts of labeled data, and labeling big data is in many cases a costly process. In this paper, we show the effectiveness of using a clustering technique to label spatial data, which leads to a high-accuracy classifier.
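
A minimal sketch of the labeling idea described in this abstract, assuming plain k-means over pick-up coordinates as the clustering technique (the abstract does not name the algorithm). The function name `kmeans_labels`, the sample coordinates, and the crude "first k points" initialization are illustrative assumptions, and Euclidean distance on raw latitude/longitude is only a rough approximation over a small area.

```python
import math

def kmeans_labels(points, k, iterations=10):
    """Cluster 2-D points with plain k-means and return (labels, centroids)."""
    # Assumption: seed the centroids with the first k points (crude but deterministic).
    centroids = list(points[:k])

    def nearest(p):
        # Index of the centroid closest to point p.
        return min(range(k), key=lambda c: math.dist(p, centroids[c]))

    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        labels = [nearest(p) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    # Final assignment against the last centroid positions.
    labels = [nearest(p) for p in points]
    return labels, centroids

# Made-up (latitude, longitude) pick-up points forming two separated groups.
pickups = [
    (40.750, -73.990), (40.752, -73.988), (40.748, -73.992),
    (40.640, -73.780), (40.642, -73.778), (40.638, -73.782),
]
labels, centroids = kmeans_labels(pickups, k=2)
# Each point paired with its cluster id can serve as a labeled training example.
labeled = list(zip(pickups, labels))
```

The cluster id stands in for a class label here; a classifier would then be trained on `labeled` in a separate step.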

T-START: Time, Status and Region Aware Taxi Mobility Model for Metropolis

  • Wang, Haiquan;Lei, Shuo;Wu, Binglin;Li, Yilin;Du, Bowen
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.7 / pp.3018-3040 / 2018
  • The mobility model is one of the most important factors affecting the simulation-based evaluation of any transportation vehicular networking protocol. However, obtaining a realistic mobility model in a dynamic urban environment is very challenging. In recent years, several studies have extracted mobility models from large-scale real data sets (mostly taxi GPS data), but they do not consider taxi status, which is an important factor affecting a taxi's mobility. In this paper, we report three simple observations about taxi status obtained by mining real taxi trajectories: (1) the behavior of a taxi is influenced by its status, (2) macroscopic movement in a given status is related to different geographic features, and (3) taxi load/drop events vary with the time period. Based on these three observations, we propose a novel taxi mobility model (T-START) that accounts for taxi status, geographic region, and time period. Simulation results show that the proposed mobility model closely approximates reality in terms of trajectory samples and the distribution of nodes in four typical time periods.

Analysis of the taxi telematics history data based on a state diagram (상태도에 기반한 택시 텔레매틱스 히스토리 데이터 분석)

  • Lee, Jung-Hoon;Kwon, Sang-Cheol
    • Journal of Korea Spatial Information System Society / v.10 no.1 / pp.41-49 / 2008
  • This paper presents a data analysis method for a taxi telematics system that generates a great deal of location history data. Because each record consists of basic GPS receiver-generated fields, device-added fields such as taxi operation status, and framework-attached fields such as the matched link identifier and the position ratio within a link, each taxi can be represented by a state diagram. The state and transition definitions enable us to efficiently extract information such as pick-up time, pick-up distance, dispatch time, and dispatch distance. The analysis results can help verify the efficiency of a specific taxi dispatch algorithm, while the analysis framework can open the way to challenging new services, including future traffic estimation and trajectory clustering.
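
A hedged sketch of how a state-diagram view of the history records might be used to extract dispatch time and distance. The status names (`EMPTY`, `DISPATCHED`, `OCCUPIED`), the `Record` fields, the cumulative odometer, and the definition of a dispatch leg as the span from entering `DISPATCHED` to entering `OCCUPIED` are all assumptions made for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass

@dataclass
class Record:
    t: int             # seconds since the start of the log (hypothetical field)
    status: str        # "EMPTY", "DISPATCHED" or "OCCUPIED" (hypothetical names)
    odometer_m: float  # cumulative distance in meters (hypothetical field)

def transitions(records):
    """Yield (previous_status, new_status, record) at every state change."""
    for prev, cur in zip(records, records[1:]):
        if prev.status != cur.status:
            yield prev.status, cur.status, cur

def dispatch_legs(records):
    """Return (duration_s, distance_m) for each DISPATCHED -> OCCUPIED leg."""
    legs = []
    start = None
    for prev, new, rec in transitions(records):
        if new == "DISPATCHED":
            start = rec  # the taxi has just accepted a call
        elif prev == "DISPATCHED":
            if new == "OCCUPIED" and start is not None:
                legs.append((rec.t - start.t, rec.odometer_m - start.odometer_m))
            start = None  # the leg ended, either by pick-up or by cancellation
    return legs

# Made-up history for one taxi.
history = [
    Record(0, "EMPTY", 0.0),
    Record(60, "EMPTY", 500.0),
    Record(120, "DISPATCHED", 900.0),
    Record(180, "DISPATCHED", 1500.0),
    Record(300, "OCCUPIED", 2400.0),
    Record(900, "OCCUPIED", 8000.0),
    Record(960, "EMPTY", 8100.0),
]
legs = dispatch_legs(history)
```

Only state changes are inspected, so the cost is one pass over the records regardless of how long a taxi stays in each state.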


A Benchmark Test of Spatial Big Data Processing Tools and a MapReduce Application

  • Nguyen, Minh Hieu;Ju, Sungha;Ma, Jong Won;Heo, Joon
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.35 no.5 / pp.405-414 / 2017
  • Spatial data processing often poses challenges due to the unique characteristics of spatial data, and these become more complex in spatial big data processing. Some tools have been developed and made available; however, they are not familiar to regular users. This paper presents a benchmark test of two notable spatial big data processing tools: GIS Tools for Hadoop and SpatialHadoop. In addition, a MapReduce application is introduced as a baseline to evaluate the effectiveness of the two tools and to determine the impact of the number of maps/reduces on performance. Using these tools and New York taxi trajectory data, we perform a spatial processing task that filters the drop-off locations within the Manhattan area. The performance of the tools is observed as the data size increases and the number of worker nodes changes. The results of this study are as follows: 1) GIS Tools for Hadoop automatically creates a Quadtree index in each spatial processing task, so its performance improves significantly. However, users should be familiar with Java to use this tool conveniently. 2) SpatialHadoop does not automatically create a spatial index for the data. As a result, its performance is much lower than that of GIS Tools for Hadoop on the same spatial processing task. However, SpatialHadoop achieved the best result for range queries. 3) The performance of our MapReduce application increased by a factor of four after the number of reduces was changed from 1 to 12.
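
The filtering task in this abstract can be pictured with a small in-process sketch of the map, shuffle, and reduce steps. It does not use Hadoop or either benchmarked tool. The rectangle `MANHATTAN_BBOX` is a rough illustrative box rather than the real borough boundary, and the record layout and function names are assumptions.

```python
from collections import defaultdict

# Rough illustrative rectangle (min_lon, min_lat, max_lon, max_lat).
# Assumption: a box stands in for the true Manhattan polygon.
MANHATTAN_BBOX = (-74.02, 40.70, -73.91, 40.88)

def in_bbox(lon, lat, bbox):
    """Range test: is the point inside the axis-aligned rectangle?"""
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat

def map_phase(record):
    """Emit one (key, 1) pair per drop-off record."""
    _trip_id, lon, lat = record
    key = "inside" if in_bbox(lon, lat, MANHATTAN_BBOX) else "outside"
    yield key, 1

def shuffle(pairs):
    """Group the emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Sum the values of each key."""
    return {key: sum(values) for key, values in grouped.items()}

# Made-up drop-off records: (trip_id, longitude, latitude).
dropoffs = [
    ("t1", -73.985, 40.758),
    ("t2", -73.780, 40.640),
    ("t3", -73.970, 40.780),
]
pairs = [pair for record in dropoffs for pair in map_phase(record)]
counts = reduce_phase(shuffle(pairs))
```

In a real cluster the map calls run in parallel over input splits, and the number of reducers controls how the grouped keys are partitioned.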

An Unified Spatial Index and Visualization Method for the Trajectory and Grid Queries in Internet of Things

  • Han, Jinju;Na, Chul-Won;Lee, Dahee;Lee, Do-Hoon;On, Byung-Won;Lee, Ryong;Park, Min-Woo;Lee, Sang-Hwan
    • Journal of the Korea Society of Computer and Information / v.24 no.9 / pp.83-95 / 2019
  • Recently, a variety of IoT data has been collected by attaching geosensors to many vehicles on the road. IoT data inherently carries time and space information and comprises various measurements such as temperature, humidity, fine dust, and CO2. Although specific sensor data can be retrieved using time, latitude, and longitude, which are the keys of IoT data, advanced search engines that handle high-level user queries over IoT data are still limited. Searching large amounts of IoT data without indexes is also a problem, because sequential scans waste a great deal of time. In this paper, we propose a unified spatial index model that handles both grid and trajectory queries using a cell-based space-filling curve method. We also present a visualization method that helps users grasp the results intuitively. The trajectory query aggregates the traffic of the trajectory cells that taxis passed through on the road searched by the user. The grid query finds the cells on the road searched by the user and aggregates the fine dust readings. Based on the generated spatial index, the user interface quickly summarizes trajectory and grid queries for a specific road and for all roads, and we propose a Web-based prototype system that supports intuitive analysis through road and heat map visualization.
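
A minimal sketch of a cell-based space-filling curve index, assuming a Z-order (Morton) curve; the abstract does not say which curve the authors use. The grid origin, the cell size, and all function names are hypothetical, and points are assumed to lie north-east of the origin so that cell indices are non-negative.

```python
import math
from collections import defaultdict

def interleave_bits(col, row, bits=16):
    """Z-order (Morton) key: interleave the bits of the column and row indices."""
    key = 0
    for i in range(bits):
        key |= ((col >> i) & 1) << (2 * i)      # column bits go to even positions
        key |= ((row >> i) & 1) << (2 * i + 1)  # row bits go to odd positions
    return key

def cell_key(lon, lat, origin=(126.0, 37.0), cell_deg=0.01):
    """Map a coordinate to the key of the grid cell that contains it."""
    col = math.floor((lon - origin[0]) / cell_deg)
    row = math.floor((lat - origin[1]) / cell_deg)
    return interleave_bits(col, row)

def build_index(readings):
    """Group fine-dust readings by cell key so queries avoid a sequential scan."""
    index = defaultdict(list)
    for lon, lat, dust in readings:
        index[cell_key(lon, lat)].append(dust)
    return index

def grid_query(index, lon, lat):
    """Mean fine-dust value of the cell containing (lon, lat), or None if empty."""
    values = index.get(cell_key(lon, lat), [])
    return sum(values) / len(values) if values else None

# Made-up readings: (longitude, latitude, fine dust value).
readings = [
    (126.985, 37.565, 30.0),
    (126.986, 37.566, 50.0),
    (127.105, 37.405, 20.0),
]
index = build_index(readings)
mean_dust = grid_query(index, 126.985, 37.565)
```

A trajectory query could reuse the same keys by mapping each GPS point of a trip to its cell and aggregating per cell, which is why one index can serve both query types.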

A new Clustering Algorithm for GPS Trajectories with Maximum Overlap Interval (최대 중첩구간을 이용한 새로운 GPS 궤적 클러스터링)

  • Kim, Taeyong;Park, Bokuk;Park, Jinkwan;Cho, Hwan-Gue
    • KIISE Transactions on Computing Practices / v.22 no.9 / pp.419-425 / 2016
  • In navigation systems, keeping map data up to date is an important task. Manual updates are costly and make it difficult to reflect changes immediately. In this paper, we present a method for trajectory-center extraction, which is essential for automatic road map generation from GPS data. Although clustered trajectories are needed to extract the center road, real trajectories are not clustered. To address this problem, this paper proposes a new method that uses the maximum overlapping interval and trajectory clustering. Finally, we apply the Virtual Running method to extract the center road from the clustered trajectories. We conducted experiments on massive real taxi GPS data sets collected throughout Gang-Nam-Gu, Sung-Nam city, and all parts of Seoul city. Experimental results showed that our method is stable and efficient for extracting the center trajectory of real roads.

Frequent Origin-Destination Sequence Pattern Analysis from Taxi Trajectories (택시 기종점 빈번 순차 패턴 분석)

  • Lee, Tae Young;Jeon, Seung Bae;Jeong, Myeong Hun;Choi, Yun Woong
    • KSCE Journal of Civil and Environmental Engineering Research / v.39 no.3 / pp.461-467 / 2019
  • Advances in location-aware and IoT (Internet of Things) technology have accelerated the generation of massive movement data. Knowledge discovery from massive movement data helps us understand urban flow and supports traffic management. This paper proposes a method to analyze frequent origin-destination sequence patterns from irregular spatiotemporal taxi pick-up locations. The proposed method first conducts cluster analysis and then runs a frequent sequence pattern analysis using the identified clusters as the base unit. The experimental data are Seoul taxi trajectory data recorded between 7 a.m. and 9 a.m. over one week. The experimental results show that significant frequent sequence patterns occur within Gangnam, and that significant frequent sequence patterns between different regions are identified between Gangnam and the Seoul City Hall area. Further, this study also uses administrative boundaries as the base unit; the results based on administrative boundaries fail to detect the frequent sequence patterns between different regions. The proposed method can be applied not only to decrease taxis' empty-loaded rate but also to improve urban flow management.
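
A simplified sketch of the second stage described in this abstract, assuming the cluster analysis has already assigned a cluster id to each trip's origin and destination. A plain support count over origin-destination pairs is swapped in for a full frequent sequence pattern miner; the cluster ids, the threshold, and the function name are hypothetical.

```python
from collections import Counter

def frequent_od_patterns(trips, min_support):
    """Return the (origin, destination) cluster pairs seen at least min_support times."""
    counts = Counter(trips)
    return {od: n for od, n in counts.items() if n >= min_support}

# Made-up trips as (origin_cluster, destination_cluster) pairs.
trips = [
    ("C1", "C1"), ("C1", "C2"), ("C1", "C1"), ("C2", "C1"),
    ("C1", "C2"), ("C1", "C1"), ("C3", "C1"),
]
patterns = frequent_od_patterns(trips, min_support=2)
```

Using clusters rather than fixed administrative zones as the base unit means the pairs being counted follow where pick-ups actually concentrate.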