SVTRv2: CTC Beats Encoder-Decoder Models in Scene Text Recognition - Explained Simply | ArXiv Explained