DINOv2: Learning Robust Visual Features without Supervision - Explained Simply | ArXiv Explained