FastVLM: Efficient Vision Encoding for Vision Language Models - Explained Simply | ArXiv Explained