Character Encoding
I wanted to post about character encoding since it is an important part of developing a website. It can seem like a small detail with little or no impact on anything, but that is not true. Character encoding refers to the way the characters in an HTML document are stored as bytes, and a page declares its encoding with a meta tag such as:
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
This tag says that the content of the HTML document is encoded in the ISO-8859-1 character set (also known as Latin-1), which is a common encoding for English and other Western European languages. Under such a scheme, every character in the page, including letters, accented characters, and symbols, has to be stored using the byte values that ISO-8859-1 defines, so that the browser has no problems when it decodes them for presentation. If the characters are encoded some other way, the browser will not decode them properly, since ISO-8859-1 is the encoding it is expecting.
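To make "stored as bytes" concrete, here is a minimal Python sketch showing that the same text produces different bytes under different encodings. The sample word is my own illustration and not anything from our site.

```python
# Illustrative sample text: "café", written with an escape so the
# example does not depend on how this file itself is saved.
text = "caf\u00e9"

# ISO-8859-1 stores every character it supports in exactly one byte.
latin1_bytes = text.encode("iso-8859-1")

# UTF-8 stores ASCII characters in one byte, but "é" needs two.
utf8_bytes = text.encode("utf-8")

print(latin1_bytes)  # b'caf\xe9'
print(utf8_bytes)    # b'caf\xc3\xa9'
```

The browser only sees the bytes, so it relies on the declared charset to know which of these two interpretations to use.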
Using a character that the declared character set cannot represent, or saving the page in a different encoding than the one it declares (such as putting non-ISO-8859-1 bytes on a page that is expecting ISO-8859-1), will result in the browser being unable to display the character correctly. Depending on the mismatch, it will show either the wrong characters or a small diamond with a question mark in it. Obviously this is not an ideal situation, and it looks really unprofessional. Your users will notice.
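The mismatch described above can be reproduced outside the browser. This is a rough Python sketch, assuming the browser decodes roughly the way Python's codecs do; the sample word is again just an illustration.

```python
# Page saved as UTF-8 but declared (and therefore decoded) as ISO-8859-1:
# the two bytes of "é" are shown as two separate wrong characters.
utf8_bytes = "caf\u00e9".encode("utf-8")
misread = utf8_bytes.decode("iso-8859-1")
print(misread)  # cafÃ©

# Page saved as ISO-8859-1 but decoded as UTF-8: the lone byte 0xE9 is
# not valid UTF-8, so it is replaced with U+FFFD, the replacement
# character that browsers draw as a diamond with a question mark.
latin1_bytes = "caf\u00e9".encode("iso-8859-1")
replaced = latin1_bytes.decode("utf-8", errors="replace")
print(replaced == "caf\ufffd")  # True
```

Either way the fix is the same: the encoding the file is saved in and the encoding the page declares have to agree.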
Since one of the crowning attractions of the Internet is its worldwide accessibility, I personally believe that all webpages should be encoded using UTF-8, an encoding of the Unicode character set. It is simply another way of encoding characters, but because Unicode covers the characters of virtually every written language, it is in my opinion a more universal choice. I won't really go into the details here, but I wanted to post about it. Most blog platforms publish in UTF-8, and as we at the magazine have been seeing more and more international traffic, I will probably be looking at switching the whole website over to UTF-8. That is going to be a long and tough job, and it might need to wait until we do an overhaul, but I think it will be a small (but valuable) optimization for the site.
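For a sense of what switching a single page involves, here is a minimal Python sketch. It assumes the page really is stored as ISO-8859-1 and declares its charset in the meta tag form shown earlier; the function name convert_page_to_utf8 and the sample page are hypothetical, and a real migration would also have to cover the database, templates, and server headers.

```python
import re

def convert_page_to_utf8(raw: bytes) -> bytes:
    """Re-encode an ISO-8859-1 page as UTF-8 and update its declaration.

    Hypothetical helper for illustration only.
    """
    # Decode with the encoding the page was actually saved in.
    html = raw.decode("iso-8859-1")
    # Update the declared charset so it matches the new bytes.
    html = re.sub(r"charset=iso-8859-1", "charset=utf-8", html,
                  flags=re.IGNORECASE)
    # Write the same characters back out as UTF-8 bytes.
    return html.encode("utf-8")

# Made-up sample page: the meta tag followed by "café" in ISO-8859-1.
page = (b'<meta http-equiv="Content-Type" '
        b'content="text/html; charset=iso-8859-1" />caf\xe9')
converted = convert_page_to_utf8(page)
print(converted)
```

Changing the declaration and the bytes together is the important part; doing only one of the two produces exactly the display problems described earlier.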