{"id":13,"date":"2011-02-06T09:21:47","date_gmt":"2011-02-06T09:21:47","guid":{"rendered":"http:\/\/www.extradrm.com\/?p=13"},"modified":"2013-05-26T21:37:48","modified_gmt":"2013-05-26T19:37:48","slug":"developpements-solr-pour-koha","status":"publish","type":"post","link":"https:\/\/www.extradrm.com\/?p=13","title":{"rendered":"Fix UTF-8 issues in osCommerce 2.3"},"content":{"rendered":"<p>osCommerce 2.3.1 is a bit messy when it comes to its newly introduced UTF-8 character set support. In older versions, everything was ISO-8859-1:<\/p>\n<p>* Core source code (ISO-8859-1)<br \/>\n* Language pack source code (ISO-8859-1)<br \/>\n* MySQL tables (ISO-8859-1)<br \/>\n* MySQL I\/O (ISO-8859-1)<br \/>\n* The HTML output (ISO-8859-1)<br \/>\n* HTML charset for browser decoding (ISO-8859-1)<br \/>\n* Incoming HTML form data (ISO-8859-1)<\/p>\n<p>In the new version it&#8217;s different:<\/p>\n<p>* Core source code (ISO-8859-1)<br \/>\n* Language pack source code (ISO-8859-1)<br \/>\n* MySQL tables (UTF-8)<br \/>\n* MySQL I\/O (UTF-8)<br \/>\n* The HTML output (ISO-8859-1)<br \/>\n* HTML charset for browser decoding (UTF-8)<br \/>\n* Incoming HTML form data (UTF-8)<\/p>\n<p><strong><span style=\"text-decoration: underline;\">WHAT&#8217;S THE PROBLEM?<\/span><\/strong><br \/>\nThis of course causes serious problems if you upload a language pack<br \/>\nwhose source code is ISO-8859-1 and contains non-ASCII characters like<br \/>\n\u00e5\u00e4\u00f6.<\/p>\n<p>You might ask: why not convert the language file contents to UTF-8<br \/>\nthen? 
Well, PHP 5.3 treats strings as raw bytes and does not fully support UTF-8, so<br \/>\nthe source code is better left in ISO-8859-1 format.<\/p>\n<p><span style=\"text-decoration: underline;\"><strong>WHAT&#8217;S THE GUIDELINE THEN?<\/strong><\/span><br \/>\nWell, the advice I was given is to use HTML entities for special characters, such as<br \/>\n\u00e4 =&gt; &amp;auml; and so on, because plain ASCII characters are<br \/>\nencoded identically in UTF-8 and ISO-8859-1. That way your language pack<br \/>\nworks with both charsets. But entities mess up any non-HTML output,<br \/>\nso you&#8217;re still going to get some unpleasant results.<\/p>\n<p><span style=\"text-decoration: underline;\"><strong>BUT CAN&#8217;T I JUST DEFINE OUTPUT AS ISO-8859-1 LIKE BEFORE?<\/strong><\/span><br \/>\nNo! When a customer submits HTML form data containing<br \/>\nforeign characters, these will not be displayed correctly after<br \/>\nswitching to a UTF-8 language pack, for example English. Nor<br \/>\nwill the data be saved correctly in the database.<\/p>\n<p><span style=\"text-decoration: underline;\"><strong>SO WHAT DO I DO?<\/strong><\/span><br \/>\nWe implement dual support for both ISO-8859-1 and UTF-8 language packs.<\/p>\n<p># 1.1 In {catalog}\/{admin}\/includes\/application_top.php<\/p>\n<p>*** On line ~166, find ***<\/p>\n<pre>\/\/ include the language translations\r\nrequire(DIR_WS_LANGUAGES . $language . '.php');\r\n$current_page = basename($PHP_SELF);\r\nif (file_exists(DIR_WS_LANGUAGES . $language . '\/' . $current_page)) {\r\ninclude(DIR_WS_LANGUAGES . $language . '\/' . 
$current_page);\r\n}<\/pre>\n<p>*** After that, add ***<\/p>\n<pre>\/\/ BOF: [osC Solutions] ISO-8859-1\/UTF-8 dual support\r\nswitch (strtolower(CHARSET)) {\r\ncase 'utf-8':\r\ntep_db_query(\"SET character set utf8\");\r\nbreak;\r\ncase 'iso-8859-1':\r\ntep_db_query(\"SET character set latin1\");\r\nbreak;\r\n}\r\n\/\/ EOF: [osC Solutions] ISO-8859-1\/UTF-8 dual support<\/pre>\n<p># 1.2 In {catalog}\/{admin}\/includes\/classes\/email.php<\/p>\n<p>*** On line ~148, find ***<\/p>\n<pre> function add_text($text = '') {<\/pre>\n<p>*** After that, add ***<\/p>\n<pre>\/\/ BOF: [osC Solutions] ISO-8859-1\/UTF-8 dual support\r\nif (strtolower(CHARSET) != 'utf-8') $text = utf8_encode($text);\r\n\/\/ EOF: [osC Solutions] ISO-8859-1\/UTF-8 dual support\r\n<\/pre>\n<p>______________________________________________________________________<\/p>\n<p>*** On line ~161, find ***<\/p>\n<pre>function add_html($html, $text = NULL, $images_dir = NULL) {<\/pre>\n<p>*** After that, add ***<\/p>\n<pre>\/\/ BOF: [osC Solutions] ISO-8859-1\/UTF-8 dual support\r\nif (strtolower(CHARSET) != 'utf-8') $html = utf8_encode($html);\r\nif (strtolower(CHARSET) != 'utf-8') $text = utf8_encode($text);\r\n\/\/ EOF: [osC Solutions] ISO-8859-1\/UTF-8 dual support<\/pre>\n<p># 2.1 In {catalog}\/includes\/application_top.php<\/p>\n<p>*** On line ~258, find ***<\/p>\n<pre>\/\/ include the language translations\r\nrequire(DIR_WS_LANGUAGES . $language . 
'.php');<\/pre>\n<p>*** After that, add ***<\/p>\n<pre>\r\n\/\/ BOF: [osC Solutions] ISO-8859-1\/UTF-8 dual support\r\nswitch (strtolower(CHARSET)) {\r\ncase 'utf-8':\r\ntep_db_query(\"SET character set utf8\");\r\nbreak;\r\ncase 'iso-8859-1':\r\ntep_db_query(\"SET character set latin1\");\r\nbreak;\r\n}\r\n\/\/ EOF: [osC Solutions] ISO-8859-1\/UTF-8 dual support\r\n<\/pre>\n<p># 2.2 In {catalog}\/includes\/classes\/email.php<\/p>\n<p>*** On line ~148, find ***<\/p>\n<pre>function add_text($text = '') {<\/pre>\n<p>*** After that, add ***<\/p>\n<pre>\/\/ BOF: [osC Solutions] ISO-8859-1\/UTF-8 dual support\r\nif (strtolower(CHARSET) != 'utf-8') $text = utf8_encode($text);\r\n\/\/ EOF: [osC Solutions] ISO-8859-1\/UTF-8 dual support<\/pre>\n<p>______________________________________________________________________<\/p>\n<p>*** On line ~161, find ***<\/p>\n<pre>function add_html($html, $text = NULL, $images_dir = NULL) {<\/pre>\n<p>*** After that, add ***<\/p>\n<pre>\/\/ BOF: [osC Solutions] ISO-8859-1\/UTF-8 dual support\r\nif (strtolower(CHARSET) != 'utf-8') $html = utf8_encode($html);\r\nif (strtolower(CHARSET) != 'utf-8') $text = utf8_encode($text);\r\n\/\/ EOF: [osC Solutions] ISO-8859-1\/UTF-8 dual support<\/pre>\n<p><strong>CHECKLISTS<\/strong><\/p>\n<p>Full ISO-8859-1 Language Pack<br \/>\n[ ] Source code saved in ANSI\/ISO-8859-1<br \/>\n[ ] {languagename}.php states define(&#8216;CHARSET&#8217;, &#8216;iso-8859-1&#8217;);<\/p>\n<p>Full UTF-8 Language Pack<br \/>\n[ ] Source code saved in UTF-8 without BOM<br \/>\n[ ] {languagename}.php states define(&#8216;CHARSET&#8217;, &#8216;utf-8&#8217;);<br \/>\n[ ] {languagename}.php includes mb_internal_encoding(&#8216;utf-8&#8217;);<\/p>\n","protected":false},"excerpt":{"rendered":"<p>osCommerce 2.3.1 is a bit messy when it comes to its newly introduced UTF-8 character set support. 
In older versions, everything was ISO-8859-1 * Core source code (ISO-8859-1) * Language pack source code (ISO-8859-1)&#46;&#46;&#46;<\/p>\n","protected":false},"author":1,"featured_media":2841,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[18,182,5],"tags":[],"youtube_video":null,"_links":{"self":[{"href":"https:\/\/www.extradrm.com\/index.php?rest_route=\/wp\/v2\/posts\/13"}],"collection":[{"href":"https:\/\/www.extradrm.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.extradrm.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.extradrm.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.extradrm.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=13"}],"version-history":[{"count":0,"href":"https:\/\/www.extradrm.com\/index.php?rest_route=\/wp\/v2\/posts\/13\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.extradrm.com\/index.php?rest_route=\/wp\/v2\/media\/2841"}],"wp:attachment":[{"href":"https:\/\/www.extradrm.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=13"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.extradrm.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=13"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.extradrm.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=13"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}