Here's the file "Hello" in various encodings:
48 65 6C 6C 6F
-
This is the traditional ANSI encoding.
48 00 65 00 6C 00 6C 00 6F 00
-
This is the Unicode (little-endian) encoding with no BOM.
FF FE 48 00 65 00 6C 00 6C 00 6F 00
-
This is the Unicode (little-endian) encoding with BOM. The BOM (FF FE) serves two purposes: First, it tags the file as a Unicode document, and second, the order in which the two bytes appear indicate that the file is little-endian.
00 48 00 65 00 6C 00 6C 00 6F
-
This is the Unicode (big-endian) encoding with no BOM. Notepad does not support this encoding.
FE FF 00 48 00 65 00 6C 00 6C 00 6F
-
This is the Unicode (big-endian) encoding with BOM. Notice that this BOM is in the opposite order from the little-endian BOM.
EF BB BF 48 65 6C 6C 6F
-
This is UTF-8 encoding. The first three bytes are the UTF-8 encoding of the BOM.
2B 2F 76 38 2D 48 65 6C 6C 6F
-
This is UTF-7 encoding. The first five bytes are the UTF-7 encoding of the BOM. Notepad doesn't support this encoding.
No comments:
Post a Comment