Next-Embedding Prediction Makes Strong Vision Learners - Explained Simply | ArXiv Explained