Abstract: This talk is about scene understanding with neural networks. More precisely, it starts with a brief introduction to the classification of aerial and satellite images and the advent of deep learning for this task, then discusses various kinds of deep networks for dense semantic classification, the fusion of heterogeneous data (especially with residual correction), and joint learning with additional cartography. In a second part, it moves to 3D with the semantic labeling of point clouds and presents SnapNet, a multi-view convnet that can classify 3D data from LiDAR or photogrammetry. It then discusses various strategies for urban modeling and robotic exploration. Building on the latest developments of recent years, we will see how it is now possible to semantize the world that surrounds us.