Mon, Jul 12, 2021 @ 04:00 PM - 06:00 PM
PhD Candidate: Mingxuan Yue
Title: Inferring Mobility Behaviors from Trajectory Datasets
Time: July 12, 2021, 4pm - 6pm
Committee: Haipeng Luo, Craig Knoblock, Mahdi Soltanolkotabi, Tianshu Sun (external) and Cyrus Shahabi (advisor)
Zoom link: https://usc.zoom.us/j/96196937726?pwd=Z2JvNkhueHZHUzF5dk8ySGp2elpaZz09
Identifying people's mobility behaviors (e.g., work commute, shopping) in rich trajectory data is of great economic and social interest to various applications, including location/trip recommendations, geo-targeting/advertisements, urban planning, anomaly detection, and epidemiology. Inferring mobility behaviors is challenging because it requires a robust unsupervised clustering technique and effective mobility-related features to cluster trajectories of various spatial and temporal scales into groups, each of which follows the same mobility behavior. Specifically, my thesis tackles the following three challenges.
First, it is difficult to infer the mobility behavior directly from a trajectory since the raw coordinates do not provide useful information about the surrounding environment of the visited locations. Existing trajectory clustering approaches usually rely on pre-defined distance measures and group together trajectories with similar shapes and spatial (and temporal) scales rather than trajectories with the same mobility behavior. As a result, trajectories in different groups may still belong to the same mobility behavior, e.g., school commutes may occur at different locations and by different transportation modes, and are therefore assigned to different groups by these approaches. To overcome this challenge, we propose DETECT, which extracts salient points in the trajectories and augments them with auxiliary geographical features retrieved from Point of Interest data. In this way, each trajectory is transformed into a context sequence, i.e., an ordered list of real-valued feature vectors, each describing the "context" of a visited location (e.g., sports, shopping, or dining venues) in the trajectory. Rather than using pre-defined distance measures, DETECT is data-driven: it employs a two-phase deep learning procedure that first learns fixed-size embeddings of variable-length trajectories and then optimizes a clustering objective for a better separation of clusters.
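The idea of a context sequence can be illustrated with a minimal Python sketch. It is not the DETECT implementation: the category list, the 200 m radius, the normalization, and the function names are all assumptions made for illustration. Each visited point is mapped to a vector giving the share of nearby Points of Interest in each category.

```python
import math

# Hypothetical POI categories; the actual feature set used by DETECT is not stated here.
CATEGORIES = ["dining", "shopping", "sports"]


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    earth_radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def context_sequence(points, pois, radius_m=200.0):
    """Turn visited points into an ordered list of real-valued context vectors.

    points: list of (lat, lon) for the salient points of one trajectory
    pois:   list of (lat, lon, category) tuples
    Each vector holds the share of nearby POIs per category (all zeros if none).
    The radius and the normalization are assumptions of this sketch.
    """
    sequence = []
    for lat, lon in points:
        counts = [0.0] * len(CATEGORIES)
        for poi_lat, poi_lon, category in pois:
            if haversine_m(lat, lon, poi_lat, poi_lon) <= radius_m:
                counts[CATEGORIES.index(category)] += 1.0
        total = sum(counts)
        sequence.append([c / total for c in counts] if total else counts)
    return sequence
```

Two trajectories in different parts of a city can yield similar context sequences under such a mapping, which is what lets a downstream clustering model group them by behavior rather than by shape or location.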
Second, the robustness of clustering approaches on context sequences can be further improved to obtain a more accurate and stable inference of mobility behaviors. Existing deep-learning-based clustering approaches (including DETECT) usually employ a two-phase procedure and are sensitive to a lossy initialization. Therefore, we propose a variational clustering method called VAMBC, which simultaneously learns the fixed-size embeddings and the cluster assignments in a single phase and produces robust clustering results. In addition, unlike other variational approaches that could collapse to trivial solutions, VAMBC separates the information of individual trajectories from the common patterns of clusters in the embedding space, and encourages sufficient involvement of the cluster membership in creating the embeddings to avoid producing poor clustering results.
Third, effective mobility-related features are of great importance in this problem, but one cannot assume that auxiliary geographical data is always available for generating these features. Hence, we also investigate approaches to learn region representations from trajectories without using any auxiliary data such as Points of Interest. We first propose DeepMARK for learning a representation of regions to support delivery time estimation. Then, we study how to learn a representation for broader applications, including mobility behavior inference and next location prediction. In this study, we propose a framework based on a heterogeneous graph neural network, in which we exploit rich mobility-related attributes and relationships of the regions, users, and temporal periods involved in trajectories. Within this framework, we design a mobility-related objective via customized random walks and learn effective region embeddings by encoding information from neighboring nodes in the graph. We demonstrate the advantage of the learned region representation over various baseline approaches in three downstream tasks using a real-world dataset.
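A random walk over a graph of regions, users, and temporal periods can be sketched as follows. This is only an illustration under stated assumptions: the toy graph, the type-pattern rule for choosing the next node, and the function names are hypothetical, and the thesis's own customized walks and objective may differ.

```python
import random

# Hypothetical toy heterogeneous graph: node -> list of neighbors.
# Each node name carries its type as a prefix ("region:", "user:", "time:").
GRAPH = {
    "region:A": ["user:u1", "time:morning"],
    "region:B": ["user:u1", "user:u2", "time:evening"],
    "region:C": ["user:u2", "time:morning"],
    "user:u1": ["region:A", "region:B"],
    "user:u2": ["region:B", "region:C"],
    "time:morning": ["region:A", "region:C"],
    "time:evening": ["region:B"],
}


def node_type(node):
    """Return the type prefix of a node name, e.g. 'region'."""
    return node.split(":", 1)[0]


def typed_walk(graph, start, pattern, length, rng):
    """Random walk whose i-th node must have type pattern[i % len(pattern)].

    The walk stops early if the current node has no neighbor of the wanted type.
    """
    walk = [start]
    while len(walk) < length:
        wanted = pattern[len(walk) % len(pattern)]
        candidates = [n for n in graph[walk[-1]] if node_type(n) == wanted]
        if not candidates:
            break
        walk.append(rng.choice(candidates))
    return walk


def region_pairs(walk, window=1):
    """Pair each region with the regions within `window` positions of it
    among the regions of the walk (an assumed way to form training pairs)."""
    regions = [n for n in walk if node_type(n) == "region"]
    pairs = []
    for i, region in enumerate(regions):
        lo, hi = max(0, i - window), min(len(regions), i + window + 1)
        pairs.extend((region, regions[j]) for j in range(lo, hi) if j != i)
    return pairs
```

For example, `typed_walk(GRAPH, "region:A", ["region", "user"], 7, random.Random(0))` alternates between regions and the users who visit them; the resulting region pairs could feed a skip-gram-style embedding objective, which is an assumption of this sketch rather than a description of the proposed framework.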
Audiences: Everyone Is Invited
Contact: Lizsl De Leon