Topic: Charset trouble on a drop-down menu

Mateusz Viste

  • Full Member
  • ***
  • Posts: 64
  • Karma: 7
Charset trouble on a drop-down menu
« on: April 27, 2010, 06:11:07 PM »
Hi Everyone!

I think that I found a bug related to translations.

When I edit a SIP provider to configure incoming call routing, I can select the internal destination for calls.
There is an entry named "Outgoing only". In Polish that would be "tylko połączenia wychodzące". However, the drop-down box doesn't display this string correctly. It seems that the charset used to display this specific drop-down box is different from the charset used to display the rest of the webGUI...

I posted a screenshot of the problem here:
http://www.viste-family.net/mateusz/Temp/Askozia-menu-charset-bug/

Is there any chance to have that fixed for final 2.0? :)

chocho

  • Guest
Re: Charset trouble on a drop-down menu
« Reply #1 on: May 01, 2010, 01:43:50 PM »
This also happens with Cyrillic characters :-\

Michael

  • Askozia Staff
  • Hero Member
  • *
  • Posts: 1020
  • Karma: 49
Re: Charset trouble on a drop-down menu
« Reply #2 on: May 01, 2010, 03:24:42 PM »
I'd really like to get this fixed for 2.0.0 but haven't had a chance to look into it. It will be out Monday or Tuesday depending on some other factors but I have no dev time free before then.

chocho

  • Guest
Re: Charset trouble on a drop-down menu
« Reply #3 on: May 01, 2010, 04:12:22 PM »
OK,
Back to English for now.

Mateusz Viste

Re: Charset trouble on a drop-down menu
« Reply #4 on: May 01, 2010, 05:11:18 PM »
The drop-down menu is in fact generated by a JavaScript function called "add_incoming_extension_selector()". The function has a properly declared charset (<script type="text/javascript" charset="utf-8">).

However, all entries are generated as hex-escaped bytes (\xNN) rather than as plain text, so I guess that's the reason why the browser is not applying any charset to them... For example, here is the entry which should be displayed as "Tylko połączenia wychodzące":

Code:
<option value="">\x54\x79\x6c\x6b\x6f\x20\x70\x6f\xc5\x82\xc4\x85\x63\x7a\x65\x6e\x69\x61\x20\x77\x79\x63\x68\x6f\x64\x7a\xc4\x85\x63\x65</option>' + '<option value=""></option>
The byte values themselves are okay (C5 82 is "ł" in UTF-8, and C4 85 is "ą", so everything is right), but I don't see any reason to emit the whole string as escaped bytes instead of simply generating proper UTF-8 text...
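
A likely explanation for the garbling, as a minimal sketch: inside a JavaScript string literal, \xNN stands for the single character U+00NN rather than a raw byte, so \xc5\x82 becomes two separate characters instead of being decoded as the UTF-8 encoding of "ł". The snippet assumes the option markup ends up inside a quoted JavaScript string (as the ' + ' concatenation in the output above suggests); the variable names are made up for illustration.

```javascript
// What the escaped output amounts to once JavaScript parses the literal.
const escaped = "po\xc5\x82\xc4\x85czenia";
// What was intended: "połączenia" (ł = U+0142, ą = U+0105).
const intended = "po\u0142\u0105czenia";

console.log(escaped.length, intended.length); // 12 10
console.log(escaped === intended);            // false

// Each \xNN became its own character with code point 0xNN.
const codePoints = Array.from(escaped.slice(2, 6), (ch) =>
  ch.codePointAt(0).toString(16)
);
console.log(codePoints); // [ 'c5', '82', 'c4', '85' ]

// Only when those code points are treated as bytes and decoded as UTF-8
// does the intended text come back.
const decoded = new TextDecoder("utf-8").decode(
  Uint8Array.from(escaped, (ch) => ch.charCodeAt(0))
);
console.log(decoded === intended); // true
```

So the bytes are right, but the JavaScript engine never gets a chance to interpret them as UTF-8.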

I looked into the www/javascript.inc file, and commented out the following line:
Code:
$options[$i][1] = escape_jstring($options[$i][1]);
This fixed the issue, and all entries of the menu are now displayed nicely, without the binary crap I had before.

Of course, I believe that the escape_jstring() function is there for a reason. I don't know what it is supposed to do, but in this situation, it does it wrong...

@Michael - do you think that skipping escape_jstring() on the $options variable might cause any side effects? If not, then the fix is ready ;-)

giovanni.v

  • Hero Member
  • *****
  • Posts: 694
  • Karma: 53
Re: Charset trouble on a drop-down menu
« Reply #5 on: May 01, 2010, 07:55:36 PM »
Quote
Of course, I believe that the escape_jstring() function is there for a reason. I don't know what it is supposed to do, but in this situation, it does it wrong...

https://wush.net/trac/askozia/ticket/39

Glad to see that fixing one thing broke something else  ;)
Escaping strings before passing them to JS code is often necessary, but I'm not a JS guru (nor a fan!).

Mateusz Viste

Re: Charset trouble on a drop-down menu
« Reply #6 on: May 01, 2010, 08:15:39 PM »

Quote
https://wush.net/trac/askozia/ticket/39

Glad to see that fixing one thing broke something else  ;)
Escaping strings before passing them to JS code is often necessary, but I'm not a JS guru (nor a fan!).

That's funny :)
I didn't know that this escaping thing was a "fix" for some previous bugs...
I think that the problem here is not the idea of escaping things before passing them to JS, but rather *how* this escaping is done. I mean, normal ASCII or UTF-8 strings shouldn't be escaped at all - only characters which are likely to cause trouble (, ; ' [ ] { } : " < > etc...).
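
The idea of escaping only the troublesome characters could look roughly like this. It is written in JavaScript purely for illustration (the real escape_jstring() lives in PHP), escapeForJsString is a hypothetical name, and the exact set of characters to escape is an assumption rather than what the project uses.

```javascript
// Hypothetical helper, not part of the AskoziaPBX code base.
// Escapes only characters that could terminate or corrupt a quoted
// JavaScript string embedded in an HTML page; all other text,
// including non-ASCII characters, passes through unchanged.
function escapeForJsString(input) {
  return input.replace(/[\\'"<>&\r\n\u2028\u2029]/g, (ch) =>
    "\\u" + ch.charCodeAt(0).toString(16).padStart(4, "0")
  );
}

console.log(escapeForJsString("Tylko połączenia wychodzące"));
// Tylko połączenia wychodzące

console.log(escapeForJsString("O'Brien </script>"));
// O\u0027Brien \u003c/script\u003e
```

Because \uNNNN names a character rather than a byte, this kind of escaping does not depend on the page charset.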

Anyway - it sounds then that the fix is more complicated than just commenting out the line I found guilty... :P

giovanni.v

Re: Charset trouble on a drop-down menu
« Reply #7 on: May 01, 2010, 08:54:08 PM »
Quote
normal ASCII or UTF-8 strings shouldn't be escaped at all - only characters which are likely to cause trouble (, ; ' [ ] { } : " < > etc...).
Almost true, but the idea was something like "I want nothing I forgot to be able to break it...".

Quote
Anyway - it sounds then that the fix is more complicated than just commenting out the line I found guilty...

Probably not... At the moment the charset in the GUI is declared as ISO-8859-1 using a meta tag in the HTML header (fbegin.inc, line 232), while some translations require UTF-8.
Changing the meta might solve the problem. I'm unable to test at this moment.

Mateusz Viste

Re: Charset trouble on a drop-down menu
« Reply #8 on: May 02, 2010, 08:42:52 AM »
Hi all,

Here is my fix for the escape_jstring() function of the javascript.inc file:

Code:
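function escape_jstring($in_str) {
        $out_str = '';
        $len_in_str = strlen($in_str);
        for ($i = 0; $i < $len_in_str; $i++) {
                $char = substr($in_str, $i, 1);
                $dec = ord($char);
                if ($dec < 127) {
                        // low byte: emit a \xNN escape, padded to two hex
                        // digits because JavaScript requires exactly two
                        $out_str .= sprintf('\\x%02x', $dec);
                } else {
                        // high byte (127 or above): pass through untouched
                        // so multi-byte UTF-8 sequences stay intact
                        $out_str .= $char;
                }
        }
        return $out_str;
}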

Basically, the modification I made is that now only low bytes are escaped (0...126), not high bytes (127...255). In UTF-8, every non-ASCII character is encoded as a sequence of two or more high bytes, so this way I avoid breaking UTF-8 strings by escaping them.
I tested this fix using every weird string I could think of (with apostrophes, commas, Cyrillic characters, dots, quotes, etc...), and I didn't notice any regression.
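
To illustrate why leaving the high bytes alone works, here is a rough JavaScript port of the patched logic. It is only a sketch: escapeLowBytes is a made-up name, and it works on characters rather than bytes, which is equivalent here because every byte of a multi-byte UTF-8 sequence is 128 or above. Values below 0x10 are padded because a JavaScript \x escape needs exactly two hex digits.

```javascript
// Hypothetical port of the patched escaping logic, for illustration only.
function escapeLowBytes(text) {
  let out = "";
  for (const ch of text) {
    const code = ch.codePointAt(0);
    // ASCII below 127 becomes \xNN; everything else is copied as-is.
    out += code < 127 ? "\\x" + code.toString(16).padStart(2, "0") : ch;
  }
  return out;
}

const label = "po\u0142\u0105czenia"; // "połączenia"
const escapedLabel = escapeLowBytes(label);
console.log(escapedLabel); // \x70\x6fłą\x63\x7a\x65\x6e\x69\x61

// Parsing the result as a JavaScript string literal gives back the
// original text, since the non-ASCII characters were never escaped.
const parsed = new Function('return "' + escapedLabel + '"')();
console.log(parsed === label); // true
```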