This workshop focused on several related questions that go beyond the typical computer vision question of “what is where?”: What functions might a seen object serve? What physical causes might account for the position and motion of seen objects? What intentions might underlie the observed actions of the seen agents?
The accepted papers are available, along with an overview slideshow on the same page. A few papers I found interesting:
- Interpreting Manipulation Actions: a Cognitive Approach
- Inferring the Why in Images
- Failure Prediction in Vision Systems
This workshop’s physics and causality themes remind me of Matt Brand’s ~1994 dissertation on a system that could predict whether a column of blocks detected in a bitmap would fall.