Emerging Properties in Unified Multimodal Pretraining - Explained Simply | ArXiv Explained