UTF-8 is a variable-length encoding: each character (code point) takes between one and four bytes. Read up on how the encoding works first. The usual approach is to read 1 to 4 bytes (one character), decode them into a 32-bit integer (`char32_t` or `std::uint32_t`), then repeat until the input is exhausted.
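A minimal sketch of that loop, assuming C++17 and that the bytes are already in memory as a `std::string_view`. The names `decode_one` and `decode_all` are made up for illustration, not from any library. The first byte tells you how long the sequence is, and each following byte contributes 6 more bits:

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>
#include <vector>

// Hypothetical helper: decodes the code point that starts at s[pos].
// On success, advances pos past the sequence and returns the code point.
// On malformed input, returns std::nullopt and leaves pos unchanged.
std::optional<char32_t> decode_one(std::string_view s, std::size_t& pos) {
    if (pos >= s.size()) return std::nullopt;
    const auto b0 = static_cast<unsigned char>(s[pos]);

    std::size_t len = 0;
    char32_t cp = 0;
    if (b0 < 0x80)                { len = 1; cp = b0; }         // 0xxxxxxx
    else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; }  // 110xxxxx
    else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; }  // 1110xxxx
    else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; }  // 11110xxx
    else return std::nullopt;  // stray continuation byte or invalid lead byte

    if (len > s.size() - pos) return std::nullopt;  // sequence is cut off

    for (std::size_t i = 1; i < len; ++i) {
        const auto b = static_cast<unsigned char>(s[pos + i]);
        if ((b & 0xC0) != 0x80) return std::nullopt;  // must be 10xxxxxx
        cp = (cp << 6) | (b & 0x3F);
    }

    // Reject overlong encodings, UTF-16 surrogates, and out-of-range values.
    static constexpr char32_t min_for_len[] = {0, 0, 0x80, 0x800, 0x10000};
    if (cp < min_for_len[len]) return std::nullopt;
    if (cp >= 0xD800 && cp <= 0xDFFF) return std::nullopt;
    if (cp > 0x10FFFF) return std::nullopt;

    pos += len;
    return cp;
}

// Hypothetical helper: decodes a whole buffer. Each undecodable byte is
// replaced with U+FFFD and skipped, which is one common policy among several.
std::vector<char32_t> decode_all(std::string_view s) {
    std::vector<char32_t> out;
    std::size_t pos = 0;
    while (pos < s.size()) {
        if (auto cp = decode_one(s, pos)) {
            out.push_back(*cp);
        } else {
            out.push_back(0xFFFD);
            ++pos;
        }
    }
    return out;
}
```

For example, `decode_all("\xE2\x82\xAC")` should yield a single element, `0x20AC` (the euro sign). If you read from a stream instead of a buffer, the same logic applies: read one byte, work out the length from it, then read the remaining bytes.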