Modeling Expectation Violation in Intuitive Physics with Coarse Probabilistic Object Representations


From infancy, humans have expectations about how objects will move and interact. Even young children expect objects not to move through one another, teleport, or disappear. They are surprised by mismatches between physical expectations and perceptual observations, even in unfamiliar scenes with completely novel objects. A model that exhibits human-like understanding of physics should be similarly surprised, and adjust its beliefs accordingly. We propose ADEPT, a model that uses a coarse (approximate geometry) object-centric representation for dynamic 3D scene understanding. Inference integrates deep recognition networks, extended probabilistic physical simulation, and particle filtering for forming predictions and expectations across occlusion. We also present a new test set for measuring violations of physical expectations, using a range of scenarios derived from developmental psychology. We systematically compare ADEPT, baseline models, and human expectations on this test set. ADEPT outperforms standard network architectures in discriminating physically implausible scenes, and often performs this discrimination at the same level as people. We will release all code and data.
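To make the idea of forming expectations across occlusion concrete, here is a minimal sketch of a particle filter that scores each observation by how surprising it is. It is not the ADEPT implementation. It assumes a toy one-dimensional world with a single object moving at a known constant velocity, where the abstract describes 3D scenes, recognition networks, and a physics simulator. The function names, parameters, and noise levels are all hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def surprise_trace(observations, n_particles=500, velocity=1.0,
                   process_std=0.1, obs_std=0.2, seed=0):
    """Per-frame surprise: negative log predictive density of each observation.

    observations: list of 1D positions, with None for occluded frames.
    The first entry must be a visible position (assumption of this sketch).
    """
    rng = random.Random(seed)
    # Initialize beliefs around the first observed position.
    particles = [observations[0] + rng.gauss(0.0, obs_std)
                 for _ in range(n_particles)]
    surprises = [0.0]
    for obs in observations[1:]:
        # Predict: push every particle through the noisy toy dynamics.
        particles = [p + velocity + rng.gauss(0.0, process_std)
                     for p in particles]
        if obs is None:
            # Occluded frame: nothing to compare against, keep predicting.
            surprises.append(0.0)
            continue
        # Weight each particle by how well it explains the observation.
        weights = [gaussian_pdf(obs, p, obs_std) for p in particles]
        total = sum(weights)
        likelihood = total / n_particles
        surprises.append(-math.log(max(likelihood, 1e-300)))
        if total > 0.0:
            # Resample in proportion to the weights.
            particles = rng.choices(particles, weights=weights, k=n_particles)
        else:
            # All weights underflowed: restart beliefs at the observation.
            particles = [obs + rng.gauss(0.0, obs_std)
                         for _ in range(n_particles)]
    return surprises


# The object passes behind an occluder for two frames.
plausible = [0.0, 1.0, 2.0, None, None, 5.0, 6.0]    # reappears where expected
implausible = [0.0, 1.0, 2.0, None, None, 9.0, 10.0]  # "teleports" while hidden

plausible_surprise = surprise_trace(plausible)
implausible_surprise = surprise_trace(implausible)
```

In this sketch the filter keeps predicting while the object is hidden, so the frame where it reappears is scored against the positions the particles have carried through the occlusion. A reappearance far from those predictions yields a much larger surprise value than one consistent with them.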

Advances in Neural Information Processing Systems 32
Tomer Ullman
Primary Investigator

My research focuses on the structure and origin of knowledge, guided by perspectives and methods from cognitive science, cognitive development, and computational modeling. By combining these, I hope to better understand the form and development of the basic commonsense reasoning that guides our interaction with the world and the people in it.