Towards Understanding Bugs in Distributed Training and Inference Frameworks for Large Language Models - Explained Simply | ArXiv Explained