Interpreting CLIP's Image Representation via Text-Based Decomposition - Explained Simply | ArXiv Explained