Chinese, Korean, Japanese - It's All the Same
in CodeSOD on 2014-03-25

Matt H. and company recently added support for Cyrillic script to their PDF invoice generator, only to discover that none of the characters would print. The script used DOMPDF to convert the HTML invoices to PDF, and font handling across scripts can get a bit hairy, so it was not really a surprise. However, as he was digging through the code that generated the invoices, he found this little gem:
<?php
...
$body = mb_convert_encoding($body, 'HTML-ENTITIES', 'UTF-8');
$body = preg_replace_callback('/(&#)([0-9]{4,})(;)/', function($matches) {
    $code = $matches[2];
    if ($code<=19968 && $code>=40895) return $matches[0]; // not CJK
    return '<span style="font-family:kochi-gothic">'.$matches[0].'</spank>';
}, $body);
$body = mb_convert_encoding($body, 'UTF-8', 'HTML-ENTITIES');
...
?>
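
The guard in the callback requires a code point to be both at most 19968 and at least 40895, which no number can satisfy. The "not CJK" branch therefore never runs, and every entity with four or more digits gets wrapped in the span, Cyrillic ones (1024 to 1279) included. Below is a minimal sketch of what the check was presumably meant to do. The helper name `wrapCjkEntities` and its `$font` parameter are hypothetical; the range bounds are just the constants from the original code, and the sketch works on text that has already been converted to numeric entities.

```php
<?php
// Hypothetical helper, not part of the original invoice code.
// Wraps numeric entities in the 19968..40895 range (U+4E00..U+9FBF)
// in a span that selects a CJK-capable font.
function wrapCjkEntities(string $html, string $font = 'kochi-gothic'): string
{
    return preg_replace_callback(
        '/&#([0-9]{4,});/',
        function (array $matches) use ($font): string {
            $code = (int) $matches[1];
            // Outside the range means below the start OR above the end.
            if ($code < 19968 || $code > 40895) {
                return $matches[0]; // not CJK, leave the entity untouched
            }
            return '<span style="font-family:' . $font . '">'
                . $matches[0] . '</span>';
        },
        $html
    );
}
```

With this version, `wrapCjkEntities('&#1046;&#20013;')` leaves the Cyrillic entity `&#1046;` alone and wraps only `&#20013;`, with a properly closed `</span>`.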