
making sure the right text encoding is applied
Jack Campin - bogus address
Posted: Wednesday, April 16, 2008 6:21:35 PM


Date: Wed, 16 Apr 2008 17:21:35 +0100

http://www.campin.me.uk/Cerdanos/ is a mirror of a Geocities site of
Turkish art music scores, heavily edited to remove absolute URLs,
JavaScript, CSS and Yahooisms. I want it to work offline and on
the most basic browsers there are. I've made it HTML 4.01 strict.

It's all in Turkish. I've set up the META stuff the W3C suggests to
get the right encoding. On my usual browser at home (iCab/MacOS 9.1)
it Just Works. Looks exactly the same on the latest Firefox under
MacOS 10.3.9.

When I try it on the machine at work (Windows 2000 Professional with
some version of Internet Explorer) it comes up as nonsense since it
chooses US English text encoding. If I change the default to "auto-
select" the text encoding, it picks a Central European one which makes
the page a different kind of nonsense. It looks okay if I manually
select Turkish text encoding, but the point of the META tag was to
obviate that.
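
The two "kinds of nonsense" described above are what you would expect when the same bytes are read under three different single-byte code pages. A minimal sketch, assuming the pages are stored as Windows-1254 (cp1254) and using the Turkish word "şarkı" purely as a sample; the function and variable names are illustrative, not from the post:

```python
# Sample Turkish word, written with escapes so the source stays ASCII-safe.
TURKISH_WORD = "\u015fark\u0131"  # s-cedilla, a, r, k, dotless i

# The bytes as they would sit in a page saved as Windows-1254.
RAW = TURKISH_WORD.encode("cp1254")


def render_as(raw: bytes, encoding: str) -> str:
    """Decode the stored bytes the way a browser would under `encoding`."""
    return raw.decode(encoding)


# Western (US English default), Central European, and Turkish code pages.
renderings = {enc: render_as(RAW, enc) for enc in ("cp1252", "cp1250", "cp1254")}

for enc, text in renderings.items():
    # Print escaped code points so the output is safe on any console.
    print(enc, text.encode("unicode_escape").decode("ascii"))
```

Only the cp1254 reading gives back the original word; the other two swap the Turkish letters for whatever those code pages keep at the same byte values.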

Is there some juju which will make IE/W2000 (and other Windows browsers)
do the right thing without also screwing up W3C compliance?

There are lots of missing files, some of the scores are mis-classified,
and the scans are mostly of poor quality. I may be able to do something
about that, but meanwhile there's nothing else on this scale out there.

==== j a c k at c a m p i n . m e . u k === <http://www.campin.me.uk> ====
Jack Campin, 11 Third St, Newtongrange EH22 4PU, Scotland == mob 07800 739 557
CD-ROMs and free stuff: Scottish music, food intolerance, and Mac logic fonts
Andreas Prilop
Posted: Wednesday, April 16, 2008 8:41:21 PM


Date: Wed, 16 Apr 2008 18:41:21 +0200

On Wed, 16 Apr 2008, Jack Campin - bogus address wrote:

> It's all in Turkish.

You might use ISO-8859-9 or Windows-1254. But you should really
be using Unicode UTF-8 in AD 2008.

> I've set up the META stuff the W3C suggests to get the right encoding.

No, no! The correct and recommended way is to specify the encoding
(charset) in the HTTP header:
http://www.w3.org/International/O-HTTP-charset
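
To see what charset a Content-Type header value actually declares, something like the following may help. It is only a sketch: the header strings are made-up examples, and `charset_from_content_type` is a hypothetical helper name. It relies on the standard library's `email.message.Message`, which knows how to pull parameters out of such a header.

```python
from email.message import Message
from typing import Optional


def charset_from_content_type(header_value: str) -> Optional[str]:
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()


# Example header values (assumed, not taken from any real server).
print(charset_from_content_type("text/html; charset=windows-1254"))
print(charset_from_content_type("text/html"))
```

The headers object on a `urllib.request.urlopen` response offers the same `get_content_charset()` method, so the same check should work against a live server.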

The fake META is only an illusion:
http://www.unics.uni-hannover.de/nhtcapri/meta-http-equiv.1
http://www.unics.uni-hannover.de/nhtcapri/meta-http-equiv.2

Test your browser(s) for Turkish characters here:
http://www.unics.uni-hannover.de/nhtcapri/multilingual1.html

--
In memoriam Alan J. Flavell
http://groups.google.com/groups/search?q=author:Alan.J.Flavell
Jack Campin - bogus address
Posted: Thursday, April 17, 2008 8:29:00 AM


Date: Thu, 17 Apr 2008 07:29:00 +0100

>> It's all in Turkish.
> You might use ISO-8859-9 or Windows-1254. But you should really
> be using Unicode UTF-8 in AD 2008.

I used cp1254, which I assumed was more widely supported (at least
for old cheap machines, which was what I was after).

>> I've set up the META stuff the W3C suggests to get the right encoding.
> No, no! The correct and recommended way is to specify the encoding
> (charset) in the HTTP header:
> http://www.w3.org/International/O-HTTP-charset

As I said in the original post, my primary motivation was to make
it work offline - it's only on my site for testing purposes in that
form, I intend to ZIP the whole lot into a downloadable archive.
So no HTTP would be involved in the end product. And I don't have
the privs to hack Apache configs at the site where I'm hosting that
right now anyway.

Anahata
Posted: Thursday, April 17, 2008 10:11:25 AM


Date: Thu, 17 Apr 2008 09:11:25 +0100

>>you should really
>>be using Unicode UTF-8 in AD 2008.

I certainly agree that UTF-8 is the way forward, and you can use it for
everything instead of having to choose an encoding according to the
language of the page.

> As I said in the original post, my primary motivation was to make
> it work offline

I should think your best bet is to
(a) encode in UTF-8
(b) add whatever META charset directives you can to help browsers
(c) as a last resort, in case some browser still doesn't autodetect it,
include instructions on the page to set it manually.
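
Steps (a) and (b) could be scripted roughly as below. This is a sketch under assumptions: the files are taken to be Windows-1254 at the moment, the first "charset=" in each file is taken to be the META declaration, and `to_utf8` and `META_CHARSET` are hypothetical names.

```python
import re

# Matches the charset label inside a META content attribute,
# e.g. content="text/html; charset=windows-1254".
META_CHARSET = re.compile(r"(charset\s*=\s*)[\w-]+", re.IGNORECASE)


def to_utf8(raw: bytes, source_encoding: str = "windows-1254") -> bytes:
    """Transcode one page to UTF-8 and update its first charset label to match."""
    text = raw.decode(source_encoding)
    # Assumption: the first "charset=" in the file is the META declaration.
    text = META_CHARSET.sub(r"\g<1>utf-8", text, count=1)
    return text.encode("utf-8")
```

Applied to every .html file before zipping, this keeps the declared charset and the actual bytes in step, which matters more than which encoding is chosen.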

Anahata
Andreas Prilop
Posted: Thursday, April 17, 2008 6:14:22 PM


Date: Thu, 17 Apr 2008 16:14:22 +0200

On Thu, 17 Apr 2008, Jack Campin - bogus address wrote:

> I used cp1254, which I assumed was more widely supported (at least
> for old cheap machines, which was what I was after).

I don't know what you mean by "old cheap machines" - but you
can use "charset=utf-8" with Netscape 4 and Internet Explorer 4.
Just test your "old cheap machines" on
http://www.unics.uni-hannover.de/nhtcapri/multilingual1.html

--
http://niwo.mnsys.org/saved/~flavell/charset/
Owen Rees
Posted: Thursday, April 17, 2008 11:28:33 PM


Date: Thu, 17 Apr 2008 22:28:33 +0100

On Thu, 17 Apr 2008 07:29:00 +0100, Jack Campin - bogus address
<bogus@purr.demon.co.uk> wrote in
<bogus-FBA4F4.07290017042008@news.news.demon.net>:

>I used cp1254, which I assumed was more widely supported (at least
>for old cheap machines, which was what I was after).

A quick test of an offline copy of the page with IE6 on Windows XP Home
suggests that UTF-8 declared in the meta does the right thing with
UTF-8 encoded characters in the body of the page, but that with the
original 'cp1254' declaration and encoding IE gets it wrong.

If you replace the 'cp1254' with 'windows-1254' in the meta then IE
seems to find the right characters. ('windows-1254' is the official name
for the charset - see <http://www.iana.org/assignments/character-sets>.)
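
If the files stay in their current encoding, the label swap described above can be done without touching the page bytes. A minimal sketch, assuming the declaration appears as "charset=cp1254"; `fix_charset_label` and `CP1254_LABEL` are hypothetical names:

```python
import codecs
import re

# Works on raw bytes so the Turkish text in the body is never re-encoded.
CP1254_LABEL = re.compile(rb"(charset\s*=\s*)cp1254", re.IGNORECASE)


def fix_charset_label(raw: bytes) -> bytes:
    """Replace the 'cp1254' label with 'windows-1254', leaving all other bytes alone."""
    return CP1254_LABEL.sub(rb"\g<1>windows-1254", raw)


# Python treats both labels as the same codec, which is one way the
# problem can go unnoticed: a tool accepting a name says nothing about browsers.
same_codec = codecs.lookup("cp1254").name == codecs.lookup("windows-1254").name
print(same_codec)
```

The bytes of the body are identical before and after; only the name the browser is asked to look up changes.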

--
Owen Rees
[one of] my preferred email address[es] and more stuff can be
found at <http://www.users.waitrose.com/~owenrees/index.html>