Accessible Web Design Tip: get your 'character encoding' sorted
Added on Friday 25 Jul 2003
If you have ever used the World Wide Web Consortium's (W3C) HTML validator (http://validator.w3.org/) to check the validity of your web pages, you will have noticed that it requires information about character encoding before it will carry out the check. But what does 'character encoding' mean, and what encoding should you use?
I will not try to explain it in detail, as it is a complicated subject, but the following simplification should give you the basic idea.
A character encoding is a particular match-up between a set of numbers and a set of characters. For example, the character encoding called US ASCII (the US version of international standard ISO 646) consists of 128 numbers (0 to 127), and each of these numbers represents a character (a letter of the alphabet, a digit, a punctuation mark or a control character): 65 represents A, 66 represents B, and so on.
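The number-to-character match-up can be seen directly in a few lines of Python. This is only an illustrative sketch: Python's `ord()` and `chr()` work with Unicode code points, which happen to coincide with US ASCII for the values 0 to 127.

```python
# Each character corresponds to a number; ord() and chr() convert between them.
print(ord("A"))  # 65
print(chr(66))   # B

# Encoding text as US ASCII produces one number (byte) per character.
encoded = "ABC".encode("ascii")
print(list(encoded))  # [65, 66, 67]

# US ASCII only defines the numbers 0 to 127, so an accented letter
# such as e-acute (written here as an escape) cannot be encoded with it.
try:
    "\u00e9".encode("ascii")
    fits_in_ascii = True
except UnicodeEncodeError:
    fits_in_ascii = False
print(fits_in_ascii)  # False
```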
Not all languages use the Latin alphabet, so there are encodings that match the numbers up with different characters (e.g. Japanese characters). The browser on your local computer checks which character encoding the page declares, so that it can display the page appropriately. For example, if you have indicated that you are using 'ISO 646', the browser knows that this is US ASCII, so it will display the page using English characters.
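To see why the declared encoding matters, the sketch below takes a single number (233, or E9 in hexadecimal) and interprets it under three encodings from the ISO 8859 family. The choice of byte and of encodings is just an example picked for illustration; the character names are printed instead of the characters themselves so the output does not depend on your terminal.

```python
import unicodedata

# One byte value, interpreted under three different character encodings.
raw = bytes([0xE9])

as_latin1 = raw.decode("iso-8859-1")    # Western European (Latin 1)
as_greek = raw.decode("iso-8859-7")     # Greek
as_cyrillic = raw.decode("iso-8859-5")  # Cyrillic

for label, char in [("iso-8859-1", as_latin1),
                    ("iso-8859-7", as_greek),
                    ("iso-8859-5", as_cyrillic)]:
    # Print the official character name for each interpretation.
    print(label, "->", unicodedata.name(char))
```

The same number comes out as an accented Latin letter, a Greek letter, or a Cyrillic letter, depending purely on which encoding the reader assumes.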
So how do you add character encoding information?
For HTML 4.01, you can indicate the character encoding in the head of your page using the following meta tag:
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
ISO-8859-1 (commonly called Latin 1) is the default character set for HTTP 1.1, and covers the characters used in English and most other Western European languages. It is the character set most commonly indicated in web pages, and is likely to be the one you will use.
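If you want to check which encoding a page declares without opening it by hand, a small script can read the meta tag for you. The following is a minimal sketch using Python's standard `html.parser` module; the names `CharsetFinder` and `declared_charset` are made up for this example, and it only looks at the `http-equiv` meta tag shown above, not at HTTP headers.

```python
from html.parser import HTMLParser


class CharsetFinder(HTMLParser):
    """Records the charset declared in a Content-Type meta tag, if any."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        # Only the first matching meta tag is used.
        if tag != "meta" or self.charset is not None:
            return
        attributes = dict(attrs)
        if (attributes.get("http-equiv") or "").lower() != "content-type":
            return
        # The content value looks like "text/html; charset=iso-8859-1".
        for part in (attributes.get("content") or "").split(";"):
            key, _, value = part.strip().partition("=")
            if key.lower() == "charset" and value:
                self.charset = value.strip().lower()


def declared_charset(html_text):
    """Return the charset named in the page's meta tag, or None."""
    finder = CharsetFinder()
    finder.feed(html_text)
    return finder.charset


page = """<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<title>Example</title>
</head><body><p>Hello</p></body></html>"""

print(declared_charset(page))                         # iso-8859-1
print(declared_charset("<html><head></head></html>")) # None
```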
Creating valid HTML is one of the most important steps you can take when designing accessible websites. If you do not declare the appropriate character encoding, characters in your page may not display correctly, which will of course affect the accessibility of your content.
Links
- W3C: HTML Document Representation: http://www.w3.org/TR/html401/charset.html
- Character Encoding Standards: http://www.webreference.com/dlab/books/html/39-1.html
- List of common character sets: http://www.w3.org/International/O-charset-lang.html
- W3C: Internationalization: http://www.w3.org/International/O-HTTP-charset