surknight

Weird problem with language pack

I'm having a problem with the manufacturer's box sometimes showing the wrong characters in the title and sometimes the wrong wording in the dropdown list. Here's what it looks like on the store's index page:

post-323813-0-10482100-1395664211_thumb.jpg

The language is Croatian and the word that should be showing is "Proizvođači" (Manufacturers). The word "Odaberite" under it means 'select'. The problem is that it doesn't show up like that all the time. If I select one of the manufacturer's names from the list I get this:

post-323813-0-77179900-1395664633_thumb.jpg

The box title has obviously switched to another codepage, but it's still not right. Going further down the list of manufacturers, I get this:

post-323813-0-08433900-1395664825_thumb.jpg

All of a sudden the manufacturer's box is showing the correct characters and the top selection in the dropdown has changed to 'Molima odaberite' (Please select) which is the default dropdown defined in hrvatski.php.

Everything is in UTF-8. All the language files are in UTF-8 without BOM (verified in Notepad++). The database is in UTF-8. The setlocale for the language file is hr_HR.UTF-8 and the defined character set is UTF-8. Everywhere else on the page the characters show correctly (see the box under the one in question - it never changes). I searched every language file, and nowhere does either 'Odaberite' by itself as a dropdown selection or 'Proizvođači' with incorrect characters appear. Where else would the cart be pulling this from for the manufacturers box?

Cart version is 2.3.3, localhost installation. PHP version is 5.4.7.

Thanks in advance.

First, confirm that the page is actually displaying in UTF-8. In the browser, View > Character Encoding (or similar) should show 'UTF-8'. If it doesn't, some bit of code is changing the encoding. Is this all standard osC, or is a template or add-on involved where you're having a problem? Under View > Character Encoding, try some other encodings to see if one makes the right characters appear. That might give a clue that it's an encoding issue (possibly the encoding changing from page to page). Some servers are misconfigured to force Latin-1, but usually you see that consistently, unless there's something else in the HTTP headers that's triggering the Latin-1 force.

In the first picture, characters showing as empty boxes usually means that a character glyph is not defined for that code point (but it is a legal character, otherwise). Are you using a browser and PC with a font set that doesn't cover these characters? That it sometimes shows up is odd, but could be explained by its being another font. UTF-8 itself covers every known character, but if somehow you're getting into one of the Latin-x/ISO-8859-x character sets, not all the characters needed for Croatian may be in that set. However, something should show up for display. Can you use a debugger such as Firebug (for Firefox) to determine if the same font is in use each time? It should tell you the list of fonts being used.

Note that the normal behavior of UTF-8 fonts is to show a box with a 4 digit hex code inside it for a character that is legal but has no glyph defined in the font. An empty box like in your first example is usually a sign of no glyph defined for a single byte font. A displayed page must be all one encoding, so it's puzzling what's going on here. Can you report back with confirmation of the character set that the page claims to be using, and whether the font(s) in use contain the necessary Croatian characters? Hopefully that will give you a lead on what went wrong.
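
To make the single-byte versus UTF-8 mismatch concrete, here is a minimal Python sketch (not osCommerce code; the helper name `decode_as` is made up for illustration). It encodes the expected title in a Central European single-byte encoding and then decodes those bytes as UTF-8, which is roughly what a browser does when a fragment in the wrong encoding lands on a UTF-8 page.

```python
# Illustration only: what happens when single-byte Central European text
# is interpreted as UTF-8. Nothing here is osCommerce code.
TITLE = "Proizvođači"


def decode_as(raw: bytes, encoding: str) -> str:
    """Decode bytes, substituting U+FFFD for anything that is invalid."""
    return raw.decode(encoding, errors="replace")


utf8_bytes = TITLE.encode("utf-8")        # đ and č take two bytes each
latin2_bytes = TITLE.encode("iso-8859-2")  # đ and č take one byte each
cp1250_bytes = TITLE.encode("cp1250")      # same bytes as ISO-8859-2 for this word

# Single-byte text read as UTF-8: đ and č become replacement characters.
print(decode_as(latin2_bytes, "utf-8"))

# The same bytes read with the matching encoding come out right.
print(decode_as(latin2_bytes, "iso-8859-2"))

# UTF-8 text read as ISO-8859-2: each accented letter turns into two wrong ones.
print(decode_as(utf8_bytes, "iso-8859-2"))
```

If switching the browser to a Central European encoding fixes one fragment while breaking the rest of the page, that fragment most likely contains single-byte text like `latin2_bytes` above.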

Yes, the page encoding does show as UTF-8 at all times. Changing it to either Central European ISO or Central European Windows corrects the title bar for the manufacturer's box on the index page, but the rest of the page is in the wrong characters then. The second instance with the 'Canon' manufacturer selected doesn't show correctly no matter what the page encoding is set to. As far as the fonts are concerned it's Lucida sans-serif every time.

It seems to be behaving almost as if there were a separate manufacturer's box for each manufacturer_id, but there isn't. :mellow:

OK, the page is UTF-8, except for this title bar entry, where it is Latin-x (Central European). I can't look at the code until tonight, but this information I believe would be coming from a language file, and so should be in UTF-8. It sounds like the file data (text) was entered in the wrong encoding when editing the file. Find the Croatian language .php file with the MANUFACTURER_* defined entries, and confirm that everything is UTF-8 encoded and the right text.
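
One way to confirm the encoding of every language file in bulk, rather than opening each one in an editor, is a small script like the following. This is a rough Python sketch, not part of osCommerce; the function names and the directory passed to `scan_language_files` are assumptions you would adapt to your own installation.

```python
import codecs
from pathlib import Path
from typing import Dict


def check_utf8(raw: bytes) -> str:
    """Classify raw file bytes as clean UTF-8, UTF-8 with BOM, or not UTF-8."""
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8 with BOM"
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return f"not utf-8 (bad byte at offset {exc.start})"
    return "ok"


def scan_language_files(root: str) -> Dict[str, str]:
    """Return {path: problem} for every .php file under root that is not clean UTF-8."""
    problems = {}
    for path in sorted(Path(root).rglob("*.php")):
        status = check_utf8(path.read_bytes())
        if status != "ok":
            problems[str(path)] = status
    return problems
```

For example, `scan_language_files("catalog/includes/languages")` (the path is a guess at a typical layout) returns an empty dict when every file is UTF-8 without BOM. Note that a pure-ASCII file always passes, so this only catches files that actually contain accented characters in the wrong encoding.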

As to why it keeps changing, that's a puzzler. The master language file (croatian.php, I think) should be the only place that defines the page encoding and locale, and it sounds like it's always staying UTF-8. Is there any chance that you're pulling in a wrong language file at times (a different language, or an old version of the correct one)? If that was happening, I would expect problems all over the page. Maybe if your application_top.php is not consistently defining the right $language setting, that might cause problems, but again, I think the whole page would be affected.

Found the problem - it's the cache. Hitting reset on the three entries (Categories box, Manufacturers box, and Also Purchased module) in Tools --> Cache Control got rid of the encoding problem on the manufacturer's box.
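
For anyone hitting the same thing: stale cache entries written before the language files were converted to UTF-8 can be spotted without guessing. The following is a minimal Python sketch, assuming the cache is a directory of plain files holding rendered box HTML; the function name and the directory you pass in are hypothetical, and the actual cache location depends on your configuration.

```python
from pathlib import Path
from typing import List


def find_non_utf8_cache_files(cache_dir: str) -> List[str]:
    """List the names of files in cache_dir whose contents are not valid UTF-8."""
    stale = []
    for path in sorted(Path(cache_dir).iterdir()):
        if not path.is_file():
            continue  # skip subdirectories
        try:
            path.read_bytes().decode("utf-8")
        except UnicodeDecodeError:
            stale.append(path.name)
    return stale
```

Any file this reports was most likely written in a single-byte encoding and will show wrong characters on a UTF-8 page until it is reset. If the box is cached separately for each selected manufacturer (an assumption about the cache layout, not something confirmed here), that would also explain why the title changed depending on which manufacturer was picked.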
