Scaling Laws for Neural Language Models - Explained Simply | ArXiv Explained