Spark: Strategic Policy-Aware Exploration via Dynamic Branching for Long-Horizon Agentic Learning - Explained Simply | ArXiv Explained