Supporting UTF-8 in PHP
Posted by Jonathan Ng | Filed under Technical
Up till now, I haven’t mentioned much about the little project I’m working on: an RSS feed reader. It’s called SpoonFeed, as I want to literally spoon-feed users with news, LOL. One of the reasons for the discretion is that hardly any work has been done on it. I’m still working on the layouts and stuff like that. A basic model that allows adding and viewing feeds might take another week or two, depending on how tough the advanced diploma is.
So I was working on scripts that’ll sort the feed items alphabetically, and I thought about Unicode characters. Throughout the project, I had assumed that enabling UTF-8 support only required minor tweaks like changing the charset in MySQL and in the HTML output. But boy, was I wrong. I did some testing and I was sure that the problem was in my PHP code, since phpMyAdmin could display the UTF-8 characters in my sample database without a problem.
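Simply put, the solution is to execute a “SET NAMES 'utf8';” SQL command right after connecting, which tells MySQL to use UTF-8 for the data it exchanges with the client (note that MySQL spells the charset name utf8, without the hyphen). Check here for more information. However, that wasn’t all. I still had to run query results through the following function, which converts each multi-byte UTF-8 character into an HTML numeric entity.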
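function html_encode_utf8($s)
{
    // Replace every multi-byte UTF-8 sequence in $s with an HTML numeric
    // entity (&#NNN;). Four-byte sequences (lead byte 240 and above) are
    // not handled here.
    $result = '';
    $len = strlen($s);
    $x = 0;
    while ($x < $len) {
        if (ord($s[$x]) < 128) {
            // Plain ASCII: copy the byte through unchanged.
            $result .= $s[$x];
        } else if (ord($s[$x]) < 224) {
            // Two-byte sequence (110xxxxx 10xxxxxx): join the payload bits.
            $result .= "&#" . bindec(substr(decbin(ord($s[$x])), 3) . substr(decbin(ord($s[$x + 1])), 2)) . ";";
            $x = $x + 1;
        } else if (ord($s[$x]) < 240) {
            // Three-byte sequence (1110xxxx 10xxxxxx 10xxxxxx).
            $result .= "&#" . bindec(substr(decbin(ord($s[$x])), 4) . substr(decbin(ord($s[$x + 1])), 2) . substr(decbin(ord($s[$x + 2])), 2)) . ";";
            $x = $x + 2;
        }
        $x++;
    }
    return $result;
}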
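For what it’s worth, the same decoding can also be done with bit masks instead of building binary strings with decbin() and bindec(). The version below is only a rough sketch of that idea, not the code I actually used: the name html_encode_utf8_bitwise is made up for illustration, the four-byte branch is an addition, and it assumes the input is mostly well-formed UTF-8 (stray or truncated bytes are simply dropped).

```php
<?php
// Hypothetical variant of html_encode_utf8() that uses bit masks.
// Assumption: invalid or truncated sequences are dropped, not reported.
function html_encode_utf8_bitwise(string $s): string
{
    $result = '';
    $len = strlen($s);
    for ($i = 0; $i < $len; $i++) {
        $byte = ord($s[$i]);
        if ($byte < 0x80) {
            // Plain ASCII: copy the byte through unchanged.
            $result .= $s[$i];
            continue;
        }
        // The lead byte tells us how many continuation bytes follow
        // and which of its own low bits belong to the code point.
        if (($byte & 0xE0) === 0xC0) {          // 110xxxxx
            $extra = 1;
            $code = $byte & 0x1F;
        } elseif (($byte & 0xF0) === 0xE0) {    // 1110xxxx
            $extra = 2;
            $code = $byte & 0x0F;
        } elseif (($byte & 0xF8) === 0xF0) {    // 11110xxx
            $extra = 3;
            $code = $byte & 0x07;
        } else {
            continue; // stray continuation byte or invalid lead byte
        }
        if ($i + $extra >= $len) {
            break; // sequence is cut off at the end of the input
        }
        // Each continuation byte (10xxxxxx) contributes its low 6 bits.
        for ($j = 1; $j <= $extra; $j++) {
            $code = ($code << 6) | (ord($s[$i + $j]) & 0x3F);
        }
        $result .= '&#' . $code . ';';
        $i += $extra;
    }
    return $result;
}

// "café €" written as raw UTF-8 bytes.
echo html_encode_utf8_bitwise("caf\xC3\xA9 \xE2\x82\xAC"), "\n"; // prints: caf&#233; &#8364;
```

The masks keep the same payload bits that the substr() calls keep in the original function, so both approaches should produce identical entities for two- and three-byte characters.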
Ah, finally things work now. I’m still trying to figure out the best way to integrate that function into my database abstraction layer. But since I didn’t get much sleep last night, I’m pretty sleepy now. I’ll be going back to Ipoh tomorrow together with some friends, so that’s how the last few days of my holidays will be spent. Sigh, and I thought I could do tons of stuff in 4 weeks…
Tags: php