HTML tutorials: Encode UTF-8 without BOM

Some people don't know that not every editor can handle non-Latin characters such as Japanese, and that these characters will then show up wrong in your HTML file as well.
If you want to use those characters, you need to save your files as UTF-8 without BOM instead of ANSI.

If you have already tried to put some Japanese words or sentences on your website, you may know the problem of getting question marks instead of Japanese characters. If your website is coded in PHP, the wrong encoding can cause further trouble: your included files may not be displayed.

The major problem

If you use an HTML editor such as (early) Adobe Dreamweaver or Microsoft Script Editor, your files will be saved in ANSI character encoding by default, but ANSI can only represent Latin letters. That's why you have to change your preferences/settings to make sure your files are saved in Unicode, "[...] an industry standard allowing computers to consistently represent and manipulate text expressed in most of the world's writing systems" (1). That way your documents will be able to display Arabic, Hebrew and Japanese characters, as well as many other scripts, on your website.
The most common Unicode encoding is UTF-8 (8-bit UCS/Unicode Transformation Format). If your HTML editor is able to save files in UTF-8, you should set this as your default.
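
To see where the question marks come from, here is a minimal sketch in Python (used only for illustration). The sample text is made up, and treating "ANSI" as Windows code page 1252 is an assumption; on other systems ANSI may mean a different code page.

```python
# Sample text is an assumption chosen for illustration.
text = "日本語"

# "ANSI" is assumed to be Windows code page 1252 here.
# Characters that this code page cannot represent are replaced with "?".
ansi_bytes = text.encode("cp1252", errors="replace")
print(ansi_bytes)        # b'???'

# UTF-8 can represent every Unicode character; each of these takes 3 bytes.
utf8_bytes = text.encode("utf-8")
print(len(utf8_bytes))   # 9

# Decoding the UTF-8 bytes gives back the original text unchanged.
print(utf8_bytes.decode("utf-8") == text)  # True
```

Once the characters have been replaced by question marks they are gone for good, so the file has to be saved in UTF-8 before the Japanese text is typed or pasted in.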

The BOM-Case

The second problem is that many editors (e.g. Adobe Dreamweaver) add an invisible signature to the beginning of PHP files saved in UTF-8. This is called the BOM (byte order mark).
The result is similar to what I described above: the include function does not work, but unlike before, non-Latin characters are displayed this time.
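
In UTF-8 the BOM is the three bytes EF BB BF at the very start of the file, so a PHP file saved with it no longer begins with its opening PHP tag. The following Python sketch shows one way to check for those bytes; the helper name has_utf8_bom and the sample PHP line are hypothetical.

```python
import codecs

def has_utf8_bom(data: bytes) -> bool:
    """Return True if the bytes start with the UTF-8 byte order mark."""
    return data.startswith(codecs.BOM_UTF8)

# Hypothetical file content, encoded once with and once without BOM.
# The "utf-8-sig" codec writes the BOM when encoding; plain "utf-8" does not.
source = "<?php echo 'hi'; ?>"
with_bom = source.encode("utf-8-sig")
without_bom = source.encode("utf-8")

print(codecs.BOM_UTF8)            # b'\xef\xbb\xbf'
print(has_utf8_bom(with_bom))     # True
print(has_utf8_bom(without_bom))  # False
```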

How to solve this problem?

  • First, add a meta tag to the head of your web document to indicate that it is encoded in UTF-8 and may contain non-Latin characters. Just add this code somewhere within the head tag:
    <meta http-equiv='Content-Type' content='text/html; charset=utf-8' />
  • Next, check whether your editor is able to save files as UTF-8 without BOM. If it is not, you may try Notepad++. It's freeware, easy to use and available in various languages. In Notepad++ you can change the encoding for each file (Encoding/ Encode UTF-8 without BOM).
  • In Adobe Dreamweaver CS4, check the preferences and enable UTF-8:
    Preferences/ New Document/ Default Encoding - Unicode (UTF-8). Don't forget to remove the checkmark at Include Unicode Signature (BOM).
    Finally, check what is displayed in the dialog: the default encoding should now read Unicode (UTF-8) and the BOM checkbox should be unchecked.
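
For files that were already saved the wrong way, the same fix can be scripted. This is only a rough Python sketch: the function name to_utf8_without_bom is hypothetical, and it assumes that any file which is not valid UTF-8 was saved as Windows code page 1252 ("ANSI").

```python
import codecs

def to_utf8_without_bom(data: bytes, source_encoding: str = "cp1252") -> bytes:
    """Return the file contents re-encoded as UTF-8 without BOM."""
    if data.startswith(codecs.BOM_UTF8):
        # Already UTF-8: just drop the three BOM bytes.
        return data[len(codecs.BOM_UTF8):]
    try:
        # Valid UTF-8 without BOM needs no change.
        data.decode("utf-8")
        return data
    except UnicodeDecodeError:
        # Assumption: anything else was saved in the given ANSI code page.
        return data.decode(source_encoding).encode("utf-8")

print(to_utf8_without_bom(b"\xef\xbb\xbf<html>"))  # b'<html>'
print(to_utf8_without_bom(b"caf\xe9"))             # b'caf\xc3\xa9'
```

To apply it to a real file, read the file in binary mode, pass the bytes through the function and write the result back in binary mode. Keep a backup first, because the guess about the original code page may be wrong.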

As you can see, it is not that difficult to solve this problem if you use the right software. Before you go ahead, check whether your HTML editor is able to save files as UTF-8 without BOM. Sometimes you just need to change your preferences a little. Try to find out by reading the built-in help or FAQ.
(1) Definition taken from wikipedia.org
last updated on January 10, 2010