PHP messing with HTML Charset Encoding

You have probably mixed encoding types at some point.
For example, a page that is sent as iso-8859-1 but receives UTF-8 encoded text from MySQL or XML will typically show garbled characters.
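
To see what that failure looks like, the sketch below takes the UTF-8 bytes for "é" and reinterprets them as iso-8859-1, which is roughly what a browser does when the header and the data disagree. The variable names are only for illustration.

```php
<?php
// "é" encoded as UTF-8 is the two bytes 0xC3 0xA9.
$utf8 = "\xC3\xA9";

// A page declared as iso-8859-1 makes the browser read each byte as one character.
// Simulate that by treating the bytes as iso-8859-1 and re-encoding them as UTF-8.
$seen_by_browser = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo bin2hex($seen_by_browser), "\n"; // c383c2a9, i.e. "Ã©" instead of "é"
```

One character has turned into two, which is the typical sign of UTF-8 text shown on an iso-8859-1 page.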

To solve this problem, you must keep control of the encoding of your input in relation to the encoding you have chosen to use internally.

If you send the page as iso-8859-1, the input from the user will normally also be iso-8859-1.

header("Content-Type: text/html; charset=iso-8859-1");

And if MySQL sends latin1, you do not have to do anything.
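
If the database does not already send latin1, the connection character set can be set explicitly so that no conversion is needed. This is a minimal sketch, assuming the mysqli extension is in use; the function name `use_latin1_connection` is hypothetical.

```php
<?php
// Hypothetical helper: ask MySQL to send and receive latin1 on this connection,
// so the data matches a page that is sent as iso-8859-1.
function use_latin1_connection(mysqli $db): bool
{
    // mysqli::set_charset() returns false if the character set could not be set.
    return $db->set_charset('latin1');
}
```

It would be called once, right after connecting, with your own open mysqli connection as the argument.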

But if your input is not iso-8859-1, you must convert it before it is sent to the user, or adapt it to MySQL before it is stored.

$text = mb_convert_encoding($text, mb_internal_encoding(), 'UTF-8'); // From UTF-8 to the internal encoding
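
Before converting, it can help to confirm that the input really is UTF-8, because converting text that is already iso-8859-1 would damage it. Here is a small sketch of that check; `utf8_to_internal` is a name made up for this example, and it assumes the internal encoding has been set to iso-8859-1.

```php
<?php
mb_internal_encoding('ISO-8859-1'); // assumed internal encoding for this example

// Hypothetical helper: convert only when the bytes are valid UTF-8.
function utf8_to_internal(string $text): string
{
    if (!mb_check_encoding($text, 'UTF-8')) {
        // Not valid UTF-8, so assume it is already in the internal encoding.
        return $text;
    }
    return mb_convert_encoding($text, mb_internal_encoding(), 'UTF-8');
}

echo bin2hex(utf8_to_internal("caf\xC3\xA9")), "\n"; // 636166e9 (UTF-8 input converted)
echo bin2hex(utf8_to_internal("caf\xE9")), "\n";     // 636166e9 (iso-8859-1 input left alone)
```

Plain ASCII text passes the UTF-8 check too, but converting it changes nothing, so that case is harmless.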

In short, you must always convert input to fit the internal encoding and convert output to match the external encoding.
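
In outline, that rule looks like this: convert on the way in, work in the internal encoding, convert on the way out. The names `to_internal` and `to_external` are hypothetical, and the encodings are assumptions chosen for the example.

```php
<?php
mb_internal_encoding('ISO-8859-1'); // assumed internal encoding

// Hypothetical helper: bring input from a known source encoding into the internal one.
function to_internal(string $text, string $source_encoding): string
{
    return mb_convert_encoding($text, mb_internal_encoding(), $source_encoding);
}

// Hypothetical helper: convert internal text to whatever the receiver expects.
function to_external(string $text, string $target_encoding): string
{
    return mb_convert_encoding($text, $target_encoding, mb_internal_encoding());
}

$from_xml = "\xC3\xA6\xC3\xB8\xC3\xA5";       // "æøå" as UTF-8, e.g. from an XML feed
$internal = to_internal($from_xml, 'UTF-8');  // now iso-8859-1, one byte per character
echo bin2hex($internal), "\n";                // e6f8e5

$back_out = to_external($internal, 'UTF-8');  // e.g. for a client that expects UTF-8
var_dump($back_out === $from_xml);            // bool(true)
```

Keeping the two directions in separate functions makes it harder to convert the same text twice by accident.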


This is the internal encoding I have chosen to use.

mb_internal_encoding('iso-8859-1'); // Internal encoding

This is the code I use.

mb_language('uni'); // Mail encoding
mb_internal_encoding('iso-8859-1'); // Internal encoding
mb_http_output('pass'); // Skip

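// Convert $text from $from_code to $to_code.
// An empty $from_code is detected; an empty $to_code means the internal encoding.
function convert_encoding($text, $from_code="", $to_code="")
{
    if (empty($from_code))
    {
        // 'auto' uses a detection list that depends on mb_language(); strict mode avoids false matches.
        $from_code = mb_detect_encoding($text, 'auto', true);
        // Pure ASCII is also valid iso-8859-1, and undetected text is assumed to be iso-8859-1.
        if ($from_code === false || $from_code == 'ASCII')
        {
            $from_code="iso-8859-1";
        }
    }

    if (empty($to_code))
    {
        $to_code = mb_internal_encoding();
    }
    return mb_convert_encoding($text, $to_code, $from_code);
}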

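// Encode HTML entities. ENT_NOQUOTES leaves quotes alone, so do not use the result inside attribute values.
function encoding_html($text, $code="")
{
    if (empty($code))
    {
        return htmlentities($text, ENT_NOQUOTES, mb_internal_encoding());
    }

    // $text is in $code: encode it first, then convert the result to the internal encoding.
    return mb_convert_encoding(htmlentities($text, ENT_NOQUOTES, $code), mb_internal_encoding(), $code);
}

// Decode HTML entities back to characters in the internal encoding.
function decoding_html($text, $code="")
{
    if (empty($code))
    {
        return html_entity_decode($text, ENT_NOQUOTES, mb_internal_encoding());
    }

    // $text is in $code: decode it first, then convert the result to the internal encoding.
    return mb_convert_encoding(html_entity_decode($text, ENT_NOQUOTES, $code), mb_internal_encoding(), $code);
}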
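
A quick round trip shows what the two built-in functions wrapped above do with iso-8859-1 text. This sketch calls `htmlentities()` and `html_entity_decode()` directly, with the encoding passed explicitly, rather than going through the wrappers.

```php
<?php
$latin1 = "caf\xE9 <b>"; // "café <b>" in iso-8859-1

// Encode: é becomes a named entity, < and > are escaped, quotes would be left alone.
$encoded = htmlentities($latin1, ENT_NOQUOTES, 'ISO-8859-1');
echo $encoded, "\n"; // caf&eacute; &lt;b&gt;

// Decode: back to the original iso-8859-1 bytes.
$decoded = html_entity_decode($encoded, ENT_NOQUOTES, 'ISO-8859-1');
var_dump($decoded === $latin1); // bool(true)
```

Because the encoded string is plain ASCII, it displays the same whether the page is sent as iso-8859-1 or UTF-8.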
