Spatial Big Data Management and Analytics - Taxi Trajectory Analysis for Finding Pick-up Hotspots

The sixth module, entitled "Practical Applications of Spatial Data Science", introduces five real-world problems and presents a solution for each, with step-by-step procedures that follow the solution structures and use the related open source software discussed in Module 2. The first lecture presents an example of desktop GIS, using only QGIS, to find the top five counties for timberland investment in the southeastern states of the U.S.; simple differencing of demand and supply is applied to identify counties with a large deficit of timber supply relative to timber demand. The second lecture presents an example of server GIS, using QGIS and PostgreSQL/PostGIS, as a solution to a given problem of an NYC spatial data center, which requires multi-user access and different levels of privileges. The third lecture presents an example of spatial data analytics, using QGIS and R, to identify regional factors that contribute to higher or lower disease prevalence in administrative districts; spatial autocorrelation analysis is conducted and decision tree analysis is applied. The fourth lecture is another example of spatial data analytics: finding an optimal infiltration route with network analysis, in which a cost surface is produced and Dijkstra's algorithm is used. The fifth lecture is an example of spatial big data management and analytics, in which QGIS, PostGIS, R, and Hadoop MapReduce are all used to build "Passenger Finder", a solution that can guide taxi drivers to the places where more passengers are waiting for taxi cabs. For this solution, spatial big data in the form of taxi trajectories are collected, and noise removal and map matching are conducted in a Hadoop environment. Then a series of spatial data processing and analysis steps, such as a spatial join in PostGIS and hotspot analysis in R, are conducted to provide the solution.
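
To make the idea of a pick-up hotspot concrete, the following is a minimal Python sketch and not the workflow used in the lecture, which relies on PostGIS for the spatial join and R for the hotspot analysis. It assumes the pick-up points have already been cleaned and map matched, bins them into a regular grid, and flags cells whose pick-up count is well above the average of the occupied cells. The function names, the 0.01-degree cell size, and the z-score threshold of 1.5 are all hypothetical choices made for illustration; a z-score on raw counts is a simplified stand-in for a formal hotspot statistic.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def grid_cell(lon, lat, cell_size=0.01):
    """Return the integer (column, row) index of the grid cell holding a point.

    The 0.01-degree cell size is an illustrative assumption.
    """
    return (math.floor(lon / cell_size), math.floor(lat / cell_size))


def pickup_hotspots(pickups, cell_size=0.01, z_threshold=1.5):
    """Flag grid cells with unusually many pick-ups.

    pickups: iterable of (lon, lat) pairs, assumed already noise-filtered.
    Returns a list of (cell, count) pairs sorted by count, highest first.
    Only cells containing at least one pick-up enter the mean and the
    standard deviation; empty cells are ignored in this simplified version.
    """
    counts = Counter(grid_cell(lon, lat, cell_size) for lon, lat in pickups)
    if not counts:
        return []
    values = list(counts.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Every occupied cell has the same count, so nothing stands out.
        return []
    hot = [(cell, n) for cell, n in counts.items()
           if (n - mu) / sigma >= z_threshold]
    return sorted(hot, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up coordinates: six pick-ups in one cell, four isolated ones.
    sample = [(127.0251 + i * 0.0001, 37.5052) for i in range(6)]
    sample += [(127.0355, 37.5155), (127.0455, 37.5255),
               (127.0555, 37.5355), (127.0655, 37.5455)]
    for cell, n in pickup_hotspots(sample):
        print(cell, n)
```

A grid count is used here only because it is easy to follow; a local spatial statistic that also weighs neighboring cells would be closer to what hotspot analysis usually means.
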
All in all, learners will realize the value of spatial big data and the power of the solution structure, which combines four disciplines.

