Optimizing Large Language Model Training Using FP4 Quantization - Explained Simply | ArXiv Explained