
Below is a conceptual diagram showing how Bengali text maps from graphemes to UTF‑8 bytes, and how BPE merges can either respect or split grapheme boundaries.
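
The same mapping can also be traced in code. The following is a minimal sketch, not a production tokenizer: it assumes a simplified grapheme rule (a base character plus any combining marks that follow it) rather than full Unicode UAX #29 segmentation, so conjuncts joined by a hasanta are not handled. The helper names `approx_graphemes`, `byte_cluster_ids`, and `merge_respects_graphemes` are hypothetical and chosen only for illustration.

```python
import unicodedata


def approx_graphemes(text):
    """Group each base character with the combining marks that follow it.

    Simplified approximation (an assumption of this sketch), not UAX #29.
    """
    clusters = []
    for ch in text:
        # Unicode general categories Mn, Mc, Me all start with "M".
        if clusters and unicodedata.category(ch).startswith("M"):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


def byte_cluster_ids(clusters):
    """For every UTF-8 byte, record the index of the cluster it belongs to."""
    ids = []
    for idx, cluster in enumerate(clusters):
        ids.extend([idx] * len(cluster.encode("utf-8")))
    return ids


def merge_respects_graphemes(ids, i):
    """True if merging byte i with byte i + 1 stays inside one cluster."""
    return ids[i] == ids[i + 1]


word = "\u09ac\u09be\u0982\u09b2\u09be"  # the Bengali word "বাংলা"
clusters = approx_graphemes(word)
ids = byte_cluster_ids(clusters)

# Show grapheme -> code points -> UTF-8 bytes for each cluster.
for cluster in clusters:
    code_points = " ".join(f"U+{ord(ch):04X}" for ch in cluster)
    utf8_bytes = " ".join(f"{b:02X}" for b in cluster.encode("utf-8"))
    print(f"{cluster}: {code_points} -> {utf8_bytes}")

# A byte-pair merge inside the first cluster respects the boundary;
# a merge of the last byte of cluster 0 with the first byte of cluster 1 splits it.
print(merge_respects_graphemes(ids, 0))
print(merge_respects_graphemes(ids, 8))
```

Under this rule the word splits into two clusters, and each Bengali code point occupies three UTF-8 bytes. A byte-level BPE merge is only guaranteed to respect grapheme boundaries when both bytes in the pair carry the same cluster index.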