Predicting Rigid Body Dynamics with Graph Neural Networks

This project employs Graph Neural Networks (GNNs) to simulate and predict the behavior of rigid bodies. By discretising objects into graph structures, the model captures detailed physical interactions. The performance is evaluated by comparing the predicted trajectory of the object with the ground truth generated by PyBullet.

Figure by Stanford CS224W (lecture 6.1)

Graph Neural Networks

A Graph Neural Network (GNN) processes data structured as graphs, capturing relationships and interactions between nodes and edges. This unique capability allows GNNs to excel in tasks where the connections and patterns among data points are crucial for analysis. Unlike traditional neural networks that assume independent and identically distributed data, GNNs embrace the interconnected nature of real-world data, making them highly effective in domains like social networks or biology.

The focus is not just on processing individual data points, but on understanding how these points are interlinked, uncovering deeper insights that are often missed by other forms of analysis.
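To make the idea of nodes exchanging information along edges concrete, here is a minimal NumPy sketch of one message-passing round with sum aggregation. It is an illustration only, not the architecture used in this project: the function name `message_passing_step`, the weight matrices `w_msg` and `w_upd`, and the `tanh` non-linearity are all assumptions chosen for brevity.

```python
import numpy as np

def message_passing_step(node_feats, edges, w_msg, w_upd):
    """One round of sum-aggregation message passing (illustrative only).

    node_feats: (N, F) array of node features
    edges:      (E, 2) int array of directed (sender, receiver) pairs
    w_msg:      (F, H) weights applied to sender features (hypothetical)
    w_upd:      (F + H, F) weights for the node update (hypothetical)
    """
    senders, receivers = edges[:, 0], edges[:, 1]
    # Each edge carries a message computed from its sender's features.
    messages = np.tanh(node_feats[senders] @ w_msg)            # (E, H)
    # Sum the incoming messages for every receiver node.
    aggregated = np.zeros((node_feats.shape[0], w_msg.shape[1]))
    np.add.at(aggregated, receivers, messages)
    # Update each node from its own features plus the aggregate.
    combined = np.concatenate([node_feats, aggregated], axis=1)
    return np.tanh(combined @ w_upd)                           # (N, F)
```

A node with no incoming edges receives a zero aggregate, so its update depends only on its own features. Stacking several such rounds lets information travel across multiple edges.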

In this context, the GNN is employed to learn the physical properties and behaviors of rigid objects during descent and collision with the ground. The aim is to enable the model to accurately predict an object's form and path as it falls onto a surface.

Graph discretisation

By converting the object into a graph of nodes, edges, and faces, its physical structure is translated into a format that a Graph Neural Network (GNN) can process. This discretisation allows precise tracking of node positions and edge displacements, which together capture the object's motion and structure and help the GNN learn the object's behavior. Typically, a denser mesh yields more accurate predictions because it represents the object's physical structure more faithfully.
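The conversion described above can be sketched for a triangle mesh as follows. This is a minimal example under stated assumptions, not the project's actual preprocessing: the helper name `mesh_to_graph` is hypothetical, and the mesh is assumed to be given as a vertex array plus triangular faces.

```python
import numpy as np

def mesh_to_graph(vertices, faces):
    """Build nodes, unique undirected edges and edge displacements
    from a triangle mesh (hypothetical helper)."""
    vertices = np.asarray(vertices, dtype=float)
    faces = np.asarray(faces, dtype=int)
    # Each triangle (a, b, c) contributes the edges a-b, b-c and c-a.
    pairs = np.concatenate([faces[:, [0, 1]], faces[:, [1, 2]], faces[:, [2, 0]]])
    # Sort endpoints so (i, j) and (j, i) collapse into one undirected edge.
    edges = np.unique(np.sort(pairs, axis=1), axis=0)
    # Edge displacement: vector from the first endpoint to the second.
    displacements = vertices[edges[:, 1]] - vertices[edges[:, 0]]
    return vertices, edges, displacements

# Example: a tetrahedron with 4 vertices and 4 triangular faces.
tet_vertices = [[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
tet_faces = [[0, 1, 2], [0, 1, 3], [0, 2, 3], [1, 2, 3]]
nodes, edges, disp = mesh_to_graph(tet_vertices, tet_faces)
print(edges.shape)  # (6, 2): a tetrahedron has six edges
```

Because the body is rigid, the lengths of these edge displacements stay constant over time, while their directions change as the object rotates.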

Dataset

The dataset encompasses a variety of shapes, including cubes, cylinders, and spheres, each with randomised starting conditions when dropped towards the ground. It records graph-related data at each time step, capturing details like the position of nodes, edge displacements, and the distance to the nearest object.
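As a rough illustration of what one recorded time step might contain, the sketch below computes node positions, edge displacements, and a ground distance from a rigid pose. The function name `frame_record` and the dictionary keys are hypothetical, not the dataset's actual schema. It also assumes that the ground plane z = 0 is the only other object, so the distance to the nearest object reduces to the node height.

```python
import numpy as np

def frame_record(rest_vertices, edges, rotation, translation):
    """Hypothetical per-time-step record for one rigid body above z = 0.

    rest_vertices: (N, 3) node positions in the body frame
    edges:         (E, 2) int array of node index pairs
    rotation:      (3, 3) rotation matrix of the current pose
    translation:   (3,) position of the body origin in world coordinates
    """
    # Rigid transform of the rest-pose nodes into world coordinates.
    positions = rest_vertices @ rotation.T + translation
    # Edge displacements in world coordinates.
    edge_disp = positions[edges[:, 1]] - positions[edges[:, 0]]
    # Assumption: the ground plane z = 0 is the nearest object,
    # so the distance to it is simply each node's height.
    ground_dist = positions[:, 2]
    return {
        "positions": positions,
        "edge_displacements": edge_disp,
        "ground_distance": ground_dist,
    }
```

In a PyBullet-based pipeline the pose at each step would come from the simulator; here it is passed in directly so the example stays self-contained.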

The dataset is designed to capture collision dynamics, detailing the impact forces and subsequent changes in motion when these shapes interact with the ground or with each other. This aspect is crucial for applications in virtual reality environments, where realistic object interactions are essential.

Figure by DeepMind

Rollout

Once training is complete, the model is evaluated through rollout: it predicts the trajectory and behaviour of the object during the fall and its interaction with the ground. The GIFs below show the model’s prediction (red) against the ground truth (green). Each GIF plays at real-time speed first, followed by a slow-motion segment that makes the trajectory of each object easier to observe.
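A rollout of this kind can be sketched as an autoregressive loop in which each prediction is fed back as the next input, followed by a per-step error against the ground truth. This is a minimal sketch, not the project's evaluation code: `step_fn` is a hypothetical stand-in for the trained model, and it is assumed here to map current node positions directly to the next positions. Root-mean-square position error is one reasonable metric, not necessarily the one used in this project.

```python
import numpy as np

def rollout(step_fn, initial_positions, num_steps):
    """Autoregressively apply step_fn, feeding each prediction back in."""
    trajectory = [np.asarray(initial_positions, dtype=float)]
    for _ in range(num_steps):
        trajectory.append(step_fn(trajectory[-1]))
    return np.stack(trajectory)                      # (num_steps + 1, N, 3)

def per_step_rmse(predicted, ground_truth):
    """Root-mean-square node position error at every time step."""
    sq_dist = np.sum((predicted - ground_truth) ** 2, axis=-1)   # (T, N)
    return np.sqrt(sq_dist.mean(axis=-1))                        # (T,)

# Toy stand-in for a trained model: every node sinks by 0.1 per step.
toy_step = lambda pos: pos + np.array([0.0, 0.0, -0.1])
predicted = rollout(toy_step, np.zeros((2, 3)), num_steps=3)
print(predicted.shape)  # (4, 2, 3)
```

Because errors are fed back into the next step, they tend to accumulate over the rollout, so it is informative to look at the error per time step rather than a single average.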

The model predicts the dynamics of each shape, although denser graphs could improve learning and hence performance. The next step is to investigate non-rigid, deformable objects and to develop a heterogeneous GNN model.

Rollout performance