Spatial Forcing: Implicit Spatial Representation Alignment for Vision-language-action Model - Explained Simply | ArXiv Explained