UTF-8 encoding system

| Number of bytes | First code point | Last code point | Byte 1 | Byte 2 | Byte 3 | Byte 4 | | 1 | U+0000 | U+007F | 0xxx xxxx | | | | | 2 | U+0080 | U+07FF | 110x xxxx | 10xx xxxx | | | | 3 | U+0800 | U+FFFF | 1110 xxxx | 10xx xxxx | 10xx xxxx | | | 4 | U+10000 | [18]U+10FFFF | 1111 0xxx | 10xx xxxx | 10xx xxxx | 10xx xxxx |

References:

wikipedia - UTF-8