Fix UTF-8 issues in osCommerce 2.3

osCommerce 2.3.1 is a bit messy when it comes to its newly introduced UTF-8 character set support. In older versions, everything was ISO-8859-1:

* Core source code (ISO-8859-1)
* Language pack source code (ISO-8859-1)
* MySQL tables (ISO-8859-1)
* MySQL I/O (ISO-8859-1)
* HTML output (ISO-8859-1)
* HTML charset for browser decoding (ISO-8859-1)
* Incoming HTML form data (ISO-8859-1)

In the new version it’s different:

* Core source code (ISO-8859-1)
* Language pack source code (ISO-8859-1)
* MySQL tables (UTF-8)
* MySQL I/O (UTF-8)
* HTML output (ISO-8859-1)
* HTML charset for browser decoding (UTF-8)
* Incoming HTML form data (UTF-8)

WHAT’S THE PROBLEM?
This of course causes terrible problems if you upload a language pack
whose source code is ISO-8859-1 and contains foreign characters like
åäö: the browser is told to decode the page as UTF-8, but the bytes it
receives are ISO-8859-1.
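
A minimal sketch of why those characters break, assuming only the PHP core plus the iconv extension. The helper name is_valid_utf8 is made up for this illustration, and the strings are written as byte escapes so the result does not depend on how this file itself is saved.

```php
<?php
// "åäö" as ISO-8859-1 bytes (one byte per character).
$latin1 = "\xE5\xE4\xF6";
// The same text as UTF-8 bytes (two bytes per character).
$utf8 = "\xC3\xA5\xC3\xA4\xC3\xB6";

// Hypothetical helper: the /u modifier makes PCRE reject subjects
// that are not valid UTF-8, so a successful match means valid UTF-8.
function is_valid_utf8($bytes) {
  return preg_match('//u', $bytes) === 1;
}

var_dump(is_valid_utf8($latin1)); // bool(false)
var_dump(is_valid_utf8($utf8));   // bool(true)

// Converting the ISO-8859-1 bytes gives exactly the UTF-8 bytes.
var_dump(iconv('ISO-8859-1', 'UTF-8', $latin1) === $utf8); // bool(true)
```

A browser that is told the page is UTF-8 but receives the first byte sequence has nothing valid to decode, which is why it typically shows replacement characters.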

Then you say, why not convert the language file contents to UTF-8
then? Well, PHP 5.3 has no native Unicode string type and its core
string functions work on bytes, so it does not fully support UTF-8
encoded files. The source code is therefore better left in ISO-8859-1 format.
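
One way to read that limitation, sketched with byte escapes and the iconv extension (an assumption; the mbstring functions would show the same thing): the core string functions count and cut bytes, not characters.

```php
<?php
// "åäö" encoded as UTF-8: three characters, six bytes.
$utf8 = "\xC3\xA5\xC3\xA4\xC3\xB6";

echo strlen($utf8), "\n";                // 6 (bytes)
echo iconv_strlen($utf8, 'UTF-8'), "\n"; // 3 (characters)

// substr() cuts between bytes, so it can split a character in half.
$cut = substr($utf8, 0, 3);
var_dump(preg_match('//u', $cut) === 1); // bool(false): no longer valid UTF-8
```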

WHAT’S THE GUIDELINE THEN?
Well, I am told to use entities for special characters, such as
ä => &auml; and so on… Because all plain ASCII characters are encoded
identically in UTF-8 and ISO-8859-1, your language pack then works
with both charsets. But this messes up any non-HTML output (plain-text
email, for example), where the entities show up literally.
So you’re still going to have some unpleasant results.
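
To illustrate the trade-off, here is a small sketch using a made-up language string ($entry is hypothetical, not taken from a real language pack). It shows that an entity-encoded string is pure ASCII, and that non-HTML output needs the entities decoded into whichever charset is active.

```php
<?php
// Hypothetical language string written with an entity; pure ASCII.
$entry = 'V&auml;lkommen';

// Every byte is below 0x80, so the string is identical in both charsets.
var_dump(preg_match('/^[\x00-\x7F]*$/', $entry) === 1); // bool(true)

// In non-HTML output the entity would be printed literally, so it has
// to be decoded into the target charset first.
$utf8_text   = html_entity_decode($entry, ENT_QUOTES, 'UTF-8');
$latin1_text = html_entity_decode($entry, ENT_QUOTES, 'ISO-8859-1');

var_dump($utf8_text === "V\xC3\xA4lkommen"); // bool(true)
var_dump($latin1_text === "V\xE4lkommen");   // bool(true)
```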

BUT CAN’T I JUST DEFINE OUTPUT AS ISO-8859-1 LIKE BEFORE?
No! Because when a customer submits HTML form data containing
foreign characters, these will not be displayed correctly after
switching to a UTF-8 language pack, for example English. Neither
will the data be saved correctly in the database.
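
A rough sketch of what such a charset mismatch looks like at the byte level, here for input submitted as UTF-8 and later read as ISO-8859-1 (the opposite direction fails in a similar way). It assumes the iconv extension and uses byte escapes instead of literal characters.

```php
<?php
// "å" as a browser submits it from a UTF-8 page: two bytes.
$submitted = "\xC3\xA5";

// Reading those bytes as ISO-8859-1 treats each byte as its own character.
$misread = iconv('ISO-8859-1', 'UTF-8', $submitted);

// The result is U+00C3 followed by U+00A5, i.e. two characters instead of one.
var_dump($misread === "\xC3\x83\xC2\xA5");  // bool(true)
var_dump(iconv_strlen($misread, 'UTF-8'));  // int(2)
```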

SO WHAT DO I DO?
We implement dual support for both ISO-8859-1 and UTF-8 language packs.

# 1.1 In {catalog}/{admin}/includes/application_top.php

*** On line ~166, find ***

// include the language translations
require(DIR_WS_LANGUAGES . $language . '.php');
$current_page = basename($PHP_SELF);
if (file_exists(DIR_WS_LANGUAGES . $language . '/' . $current_page)) {
  include(DIR_WS_LANGUAGES . $language . '/' . $current_page);
}

*** After that, add ***

// BOF: [osC Solutions] ISO-8859-1/UTF-8 dual support
switch (strtolower(CHARSET)) {
  case 'utf-8':
    tep_db_query("SET character set utf8");
    break;
  case 'iso-8859-1':
    tep_db_query("SET character set latin1");
    break;
}
// EOF: [osC Solutions] ISO-8859-1/UTF-8 dual support
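
The mapping done by the switch above can also be written as a small lookup. This is only a sketch: osc_mysql_charset is a hypothetical helper name, not part of osCommerce, and it covers just the two charsets handled in this guide.

```php
<?php
// Hypothetical helper: maps a language pack CHARSET value to the MySQL
// character set name used in the switch. Returns null for anything else.
function osc_mysql_charset($charset) {
  $map = array(
    'utf-8'      => 'utf8',
    'iso-8859-1' => 'latin1',
  );
  $key = strtolower($charset);
  return isset($map[$key]) ? $map[$key] : null;
}

// Possible use in application_top.php (tep_db_query is osCommerce's own function):
// $cs = osc_mysql_charset(CHARSET);
// if ($cs !== null) tep_db_query("SET character set " . $cs);
```

Returning null for an unknown charset keeps the behaviour of the switch, which sends no query when neither case matches.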

# 1.2 In {catalog}/{admin}/includes/classes/email.php

*** On line ~148, find ***

function add_text($text = '') {

*** After that, add ***

// BOF: [osC Solutions] ISO-8859-1/UTF-8 dual support
if (strtolower(CHARSET) != 'utf-8') $text = utf8_encode($text);
// EOF: [osC Solutions] ISO-8859-1/UTF-8 dual support

______________________________________________________________________

*** On line ~161, find ***

function add_html($html, $text = NULL, $images_dir = NULL) {

*** After that, add ***

// BOF: [osC Solutions] ISO-8859-1/UTF-8 dual support
if (strtolower(CHARSET) != 'utf-8') $html = utf8_encode($html);
if (strtolower(CHARSET) != 'utf-8') $text = utf8_encode($text);
// EOF: [osC Solutions] ISO-8859-1/UTF-8 dual support
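
The guard in these email.php patches matters because converting text that is already UTF-8 would encode it twice. A sketch of the same rule as a standalone function follows. The name osc_mail_to_utf8 is hypothetical, and iconv is used in place of utf8_encode, which performs the same ISO-8859-1 to UTF-8 conversion but is deprecated as of PHP 8.2.

```php
<?php
// Hypothetical helper mirroring the patch: convert to UTF-8 only when
// the active language pack charset is not already UTF-8.
function osc_mail_to_utf8($text, $charset) {
  if ($text === null || strtolower($charset) == 'utf-8') {
    return $text; // nothing to convert
  }
  // Assumption: any non-UTF-8 pack is ISO-8859-1, as in this guide.
  return iconv('ISO-8859-1', 'UTF-8', $text);
}

// "ä" from an ISO-8859-1 pack becomes the two-byte UTF-8 sequence.
var_dump(osc_mail_to_utf8("\xE4", 'iso-8859-1') === "\xC3\xA4"); // bool(true)
// Text from a UTF-8 pack is passed through untouched.
var_dump(osc_mail_to_utf8("\xC3\xA4", 'UTF-8') === "\xC3\xA4");  // bool(true)
```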

# 2.1 In {catalog}/includes/application_top.php

*** On line ~258, find ***

// include the language translations
require(DIR_WS_LANGUAGES . $language . '.php');

*** After that, add ***

// BOF: [osC Solutions] ISO-8859-1/UTF-8 dual support
switch (strtolower(CHARSET)) {
  case 'utf-8':
    tep_db_query("SET character set utf8");
    break;
  case 'iso-8859-1':
    tep_db_query("SET character set latin1");
    break;
}
// EOF: [osC Solutions] ISO-8859-1/UTF-8 dual support

# 2.2 In {catalog}/includes/classes/email.php

*** On line ~148, find ***

function add_text($text = '') {

*** After that, add ***

// BOF: [osC Solutions] ISO-8859-1/UTF-8 dual support
if (strtolower(CHARSET) != 'utf-8') $text = utf8_encode($text);
// EOF: [osC Solutions] ISO-8859-1/UTF-8 dual support

______________________________________________________________________

*** On line ~161, find ***

function add_html($html, $text = NULL, $images_dir = NULL) {

*** After that, add ***

// BOF: [osC Solutions] ISO-8859-1/UTF-8 dual support
if (strtolower(CHARSET) != 'utf-8') $html = utf8_encode($html);
if (strtolower(CHARSET) != 'utf-8') $text = utf8_encode($text);
// EOF: [osC Solutions] ISO-8859-1/UTF-8 dual support

CHECKLISTS

Full ISO-8859-1 Language Pack
[ ] Source code saved in ANSI/ISO-8859-1
[ ] {languagename}.php states define('CHARSET', 'iso-8859-1');

Full UTF-8 Language Pack
[ ] Source code saved in UTF-8 without BOM
[ ] {languagename}.php states define('CHARSET', 'utf-8');
[ ] {languagename}.php includes mb_internal_encoding('utf-8');
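
The first item of the UTF-8 checklist can be checked with a few lines of PHP instead of by eye. This is a sketch only: osc_check_utf8_source is a hypothetical helper, and the file path in the usage comment is illustrative.

```php
<?php
// Hypothetical checker: takes the raw bytes of a language file and
// returns a list of problems (an empty array means the file looks fine).
function osc_check_utf8_source($bytes) {
  $problems = array();
  // A UTF-8 byte order mark is the three bytes EF BB BF at the very start.
  if (substr($bytes, 0, 3) === "\xEF\xBB\xBF") {
    $problems[] = 'file starts with a UTF-8 BOM';
  }
  // The /u modifier makes PCRE reject input that is not valid UTF-8.
  if (preg_match('//u', $bytes) !== 1) {
    $problems[] = 'file is not valid UTF-8';
  }
  return $problems;
}

// Usage (path is illustrative):
// print_r(osc_check_utf8_source(file_get_contents('includes/languages/swedish.php')));
```

A file that contains only ASCII passes this check too, since ASCII is valid UTF-8. That matches the entity-based approach described earlier.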

extradrmtech

