Freelancers Network
 
Freelancers.net is owned and operated by Andy Stowell and Dan Winchester

FN-FORUM: Japanese encoding/charsets in RTF files

date posted 20th March 2006 13:50

Hi

I have some code that tracks hits to websites, picks up search terms from search engines, and so on. One of our clients gets a lot of hits from Japan. This is not a problem for display in the browser, as I set the correct content-type and urldecode the search term in PHP, and all is well. However, one of the features our client likes is the ability to receive monthly reports via email in RTF format.

The problem is that the RTF encoding doesn't come out correctly.

The search terms come in from the search engine looking like this:

%E3%83%AA%E3%83%90%E3%83%97%E3%83%BC%E3%83%AB%E5%A4%A7%E5%AD%A6

This is because the Japanese characters have been URL-encoded.
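To make the encoding concrete, here is a minimal sketch of the decoding step. It is in Python rather than the PHP used on the site, purely for illustration, and it assumes the search engine sent the term as UTF-8, which the byte pattern of the sample suggests. The variable names are made up for the example.

```python
from urllib.parse import unquote

# The search term exactly as it arrives from the search engine.
encoded = "%E3%83%AA%E3%83%90%E3%83%97%E3%83%BC%E3%83%AB%E5%A4%A7%E5%AD%A6"

# Percent-decode and interpret the resulting bytes as UTF-8 (assumption:
# the search engine encoded the query as UTF-8).
decoded = unquote(encoded, encoding="utf-8", errors="strict")

print(decoded)       # リバプール大学
print(len(decoded))  # 7 characters, from 21 bytes
```

Each Japanese character here takes three bytes in UTF-8, which is why every character shows up as three %XX groups.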

So we decode the string and get something like: ãƒªãƒãƒ—ãƒ¼ãƒ«å¤§å­¦
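The garbled output looks like UTF-8 bytes being read one byte at a time as Windows-1252. The sketch below (Python, for illustration only; the helper names are hypothetical) reproduces that misreading and shows it is reversible, assuming nothing has dropped or altered bytes along the way.

```python
from urllib.parse import unquote_to_bytes

ENCODED = "%E3%83%AA%E3%83%90%E3%83%97%E3%83%BC%E3%83%AB%E5%A4%A7%E5%AD%A6"

def bytes_as_cp1252(raw: bytes) -> str:
    """Misread raw bytes as Windows-1252, one byte per character."""
    chars = []
    for b in raw:
        try:
            chars.append(bytes([b]).decode("cp1252"))
        except UnicodeDecodeError:
            # Bytes that Windows-1252 leaves undefined (such as 0x90) fall
            # back to the control character with the same code point.
            chars.append(chr(b))
    return "".join(chars)

def cp1252_as_bytes(text: str) -> bytes:
    """Undo bytes_as_cp1252, recovering the original byte values."""
    raw = bytearray()
    for ch in text:
        try:
            raw += ch.encode("cp1252")
        except UnicodeEncodeError:
            raw.append(ord(ch))  # the undefined-byte fallback, reversed
    return bytes(raw)

raw = unquote_to_bytes(ENCODED)          # the 21 UTF-8 bytes
mojibake = bytes_as_cp1252(raw)          # 21 accented Latin characters
repaired = cp1252_as_bytes(mojibake).decode("utf-8")

print(len(raw), len(mojibake), len(repaired))  # 21 21 7
```

If this is what is happening, the data itself is intact and only the declared encoding is wrong.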

To get around this in the browser we set the charset to UTF-8 and all is well, but in RTF documents you are limited to the encoding formats listed below.

I have tried a number of them, but nothing translates the characters correctly. Instead the text displays as Korean, which is wrong!

I've tried:

ANSI
Default
Shift-JIS, the one that should work, which just converts the text to Korean.
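One thing that may be relevant: charset 128 declares that the bytes in the document are Shift-JIS, so UTF-8 bytes written under it would not be read as intended. A minimal sketch of converting first, in Python for illustration, assuming the reader accepts \'xx hex escapes in Windows code page 932 (Microsoft's Shift-JIS variant). The function names and the font name "MS Mincho" are assumptions for the example, not something from the existing report code.

```python
def rtf_hex_escape(text: str, codec: str = "cp932") -> str:
    """Escape text for an RTF body as \\'xx bytes in the given code page."""
    out = []
    for ch in text:
        if ch in "\\{}":
            out.append("\\" + ch)      # RTF control characters
        elif ord(ch) < 128:
            out.append(ch)             # plain ASCII passes through
        else:
            # Raises UnicodeEncodeError if the code page lacks the character.
            out.extend(f"\\'{b:02x}" for b in ch.encode(codec))
    return "".join(out)

def build_rtf_sjis(text: str) -> str:
    """Wrap escaped text in a minimal RTF document with a charset 128 font."""
    # "MS Mincho" is only an assumed font name; any Japanese font would do.
    header = "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0\\fnil\\fcharset128 MS Mincho;}}"
    return header + "\\f0 " + rtf_hex_escape(text) + "}"

print(rtf_hex_escape("リ"))  # \'83\'8a
```

The file stays pure ASCII this way, so the encoding used to write it to disk or attach it to the email no longer matters.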

/*
ANSI = 0
Default = 1
Symbol = 2
Invalid = 3
Mac = 77
Shift Jis = 128
Hangul = 129
Johab = 130
GB2312 = 134
Big5 = 136
Greek = 161
Turkish = 162
Vietnamese = 163
Hebrew = 177
Arabic = 178
Arabic Traditional = 179
Arabic user = 180
Hebrew user = 181
Baltic = 186
Russian = 204
Thai = 222
Eastern European = 238
PC 437 = 254
OEM = 255
*/


So does anyone know of an effective way to translate foreign characters correctly in RTF documents, or can anyone point me in the right direction?
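For reference, the RTF format also has a \uN escape that carries a Unicode value directly, independent of the charset list above. N is a signed 16-bit decimal number, and each escape is followed by a fallback character for readers that do not understand it. A minimal sketch in Python, for illustration only: the function names are hypothetical, and whether a given reader picks a font with Japanese glyphs is not something this guarantees.

```python
def rtf_unicode_escape(text: str) -> str:
    """Escape text for an RTF body using \\uN Unicode escapes."""
    out = []
    for ch in text:
        if ch in "\\{}":
            out.append("\\" + ch)      # RTF control characters
        elif ord(ch) < 128:
            out.append(ch)             # plain ASCII passes through
        else:
            # \uN takes a signed 16-bit value, so emit UTF-16 code units;
            # characters outside the BMP become two escapes (a surrogate pair).
            units = ch.encode("utf-16-le")
            for i in range(0, len(units), 2):
                n = int.from_bytes(units[i:i + 2], "little", signed=True)
                out.append(f"\\u{n}?")  # "?" is the one-character fallback
    return "".join(out)

def build_rtf_unicode(text: str) -> str:
    """Wrap escaped text in a minimal RTF document."""
    # \uc1 says that exactly one fallback character follows each \uN.
    return "{\\rtf1\\ansi\\uc1 " + rtf_unicode_escape(text) + "}"

print(rtf_unicode_escape("大学"))  # \u22823?\u23398?
```

Values above 32767 come out negative (for example U+FF21 becomes \u-223?), which is how the signed 16-bit rule plays out in practice.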

Many thanks and kind regards
Ash


