UTF-8 in MySQL, should I?



micdhack
March 2nd, 2010, 06:41 PM
Hello,
I am creating a website that is going to be multicultural. That means it has to print text in multiple languages and accept text posted by users in those languages.

In the HTML <head> I have <meta http-equiv="Content-Type" content="text/html; charset=utf-8">, which works like a charm.

Also, anything posted and stored in the db appears to be stored correctly, because when I read it back through mysqli in PHP it is printed exactly as originally posted.

My MySQL TEXT and VARCHAR columns use the latin1 charset with the latin1_swedish_ci collation.

I am not sure what effect utf8 fields in MySQL have on the text in the db, but so far with the latin1_swedish_ci fields I have been able to store Greek, Czech and Japanese letters without a problem.

So my question is: if everything is working correctly, should I change to utf8, or does it not really matter?

kahumba
March 2nd, 2010, 07:34 PM
In Ubuntu, MySQL doesn't default to utf8, which causes trouble when reading Russian strings (unless I encode/decode the strings by hand, which I shouldn't have to).
I've been surprised that the typical Linux distro is almost completely utf8 by default but MySQL is not. Somebody is trying to save bytes in exchange for confusion and trouble. Way to go.

The Cog
March 2nd, 2010, 07:51 PM
It's my belief that the MySQL tables should be set to utf8 if that's what you're putting in them. Putting utf8 byte sequences into a latin1 table might work for now, but you may well find later that you want to use tools that are aware of the database encoding. They will (correctly) treat the database contents as latin1 and end up displaying something different from what you expect.
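
For anyone wondering why it seems to work and where it goes wrong, here is a rough Python sketch. It is Python only so that it runs without a database, the variable names are made up, and Python's latin-1 codec stands in for MySQL's latin1 (an assumption that holds for this sample string). It also assumes both the connection and the column are latin1, so the UTF-8 bytes are stored untouched.

```python
# Hypothetical illustration; none of these names come from the thread.
text = "αβγ"                     # what the user typed: 3 characters
stored = text.encode("utf-8")    # the 6 bytes the browser sends; a latin1
                                 # connection writing to a latin1 column
                                 # keeps them as-is

# Reading back over the same connection returns the same bytes, and a
# UTF-8 page renders them correctly, so everything looks fine.
round_trip = stored.decode("utf-8")

# A tool that trusts the column's declared charset decodes as latin1.
as_latin1 = stored.decode("latin-1")

print(round_trip)                    # αβγ
print(as_latin1)                     # Î±Î²Î³
print(len(text), len(as_latin1))     # 3 6

# Undoing the damage by hand, the kind of encode/decode step mentioned
# earlier in the thread:
repaired = as_latin1.encode("latin-1").decode("utf-8")
print(repaired == text)              # True
```

As far as the database is concerned the column holds the six latin1 characters, not three Greek letters, so length limits, sorting and searching all work on the wrong thing.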

Hellkeepa
March 3rd, 2010, 02:21 AM
HELLo!

Yes, you should always stick to one charset when manipulating strings in a system. Otherwise you're begging for problems, and they can be quite annoying to rectify later.

To make sure that your system is UTF-8 all the way through, you need to do the following:
- Send the following header with PHP, or via the web server, on all of the pages displayed:

    content-type: text/html; charset=utf-8

  Using the HTML meta tags alone is not enough, as a charset in the HTTP header sent by the server overrides them.
- Send "SET NAMES 'utf8'" to the MySQL server, right after selecting the database you're using.
- Make sure all tables are created with ") DEFAULT CHARSET=utf8;" at the end of the table definition.
- Add 'accept-charset="utf-8"' to all <form> elements in the HTML code.
- Make sure you use UTF-8 enabled functions, and actually tell them to use UTF-8, in the PHP source code.
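
The last point is easy to underestimate. Here is a small Python sketch of the idea (not PHP, and the sample string is just an assumption for illustration): a byte-oriented function counts and cuts bytes, while a UTF-8 aware one works on characters. In PHP the same split is roughly strlen()/substr() versus mb_strlen()/mb_substr() with 'UTF-8' passed as the encoding.

```python
# Hypothetical sample text; any non-ASCII string shows the same effect.
text = "日本語"                  # 3 characters
raw = text.encode("utf-8")      # 9 bytes, 3 per character

print(len(raw))     # 9  (what a byte-oriented length function reports)
print(len(text))    # 3  (what a UTF-8 aware function reports)

# Cutting at a byte offset can split a character in half.
cut = raw[:4]
damaged = cut.decode("utf-8", errors="replace")
print(len(damaged))  # 2: the first character plus one U+FFFD replacement
```

So a byte-level "first 4 characters" gives you one good character and one broken one, which is the kind of bug that only shows up once real users start posting.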

Happy codin'!

micdhack
March 3rd, 2010, 08:54 AM
Wow man, thanks! Your answer is perfect and super thorough.
I'll start making the appropriate changes right away.

Hellkeepa
March 3rd, 2010, 04:48 PM
HELLo!

You're welcome, glad you found it useful. :-)

Happy codin'!