rm ([personal profile] rm) wrote 2009-03-31 12:21 pm

HTML question

- Freaky control characters showing up in Firefox on Macs.
- They show up on nothing else -- not in Safari, not on PCs.

What the hell should I be looking for that will make this stop?

[identity profile] filkerdave.livejournal.com 2009-03-31 04:23 pm (UTC)(link)
Are they in the page source?

[identity profile] rm.livejournal.com 2009-03-31 04:26 pm (UTC)(link)
No, but looking at the page source I had a thought.

ampersands. ampersands as regular text. could they be triggering it?

[identity profile] brewsternorth.livejournal.com 2009-03-31 04:27 pm (UTC)(link)
As opposed to ampersand-amp-semicolon? I don't know, maybe?

[identity profile] rm.livejournal.com 2009-03-31 04:30 pm (UTC)(link)
Yeah.

I've never had to deal with that before. How do I do code for just plain old ampersand?

I bet that will fix a chunk of 'em.

[identity profile] brewsternorth.livejournal.com 2009-03-31 04:33 pm (UTC)(link)
The XML code is certainly & (& amp ;).

[identity profile] rm.livejournal.com 2009-03-31 04:34 pm (UTC)(link)
Okay. This is me again, super not getting this.

I know you just made a clever joke.

But I do not understand it.

If I want the screen to display: &

Do I just type a: &

Or do I type something else?

[personal profile] dipping_sauce 2009-03-31 06:00 pm (UTC)(link)
It's preferable to type & amp ; (without the spaces).

[identity profile] dsmoen.livejournal.com 2009-04-01 06:53 am (UTC)(link)
Ampersands need to be encoded. & amp ; (without the spaces) is syntactically correct in ANY form of HTML. Ampersand alone never has been.
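
A minimal sketch of the advice above, assuming the page text passes through a script before it is published. The helper name `escape_text` is made up for illustration; `html.escape` is from Python's standard library.

```python
import html


def escape_text(text: str) -> str:
    """Escape the characters that are special in HTML text content."""
    # html.escape rewrites & as &amp;, < as &lt; and > as &gt;.
    # quote=False leaves quotation marks untouched.
    return html.escape(text, quote=False)


print(escape_text("Tom & Jerry <3"))  # Tom &amp; Jerry &lt;3
```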

[identity profile] filkerdave.livejournal.com 2009-03-31 04:34 pm (UTC)(link)
&amp; should give you the & safely.

[identity profile] brewsternorth.livejournal.com 2009-03-31 04:37 pm (UTC)(link)
...yeah, I don't know enough HTML to show up the XML code without LJ automatically parsing it into the entity it should be. D'oh!

[identity profile] misch.livejournal.com 2009-03-31 04:49 pm (UTC)(link)
There isn't a good way to do that.

&lt; for <
&gt; for >
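
One way to show the entity code itself on a page, sketched here as an illustration rather than as anything LJ does internally, is to escape it a second time so the browser undoes only one layer. The helper name `show_entity_literally` is hypothetical.

```python
import html


def show_entity_literally(entity: str) -> str:
    """Return the source to type so that the entity code itself is displayed."""
    # Escaping the leading & once more means the browser decodes only one layer.
    return html.escape(entity, quote=False)


source = show_entity_literally("&amp;")
print(source)                 # &amp;amp;
print(html.unescape(source))  # &amp;  (what the reader would see)
```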

[identity profile] filkerdave.livejournal.com 2009-03-31 04:38 pm (UTC)(link)
De rien :)

[personal profile] marcmagus 2009-03-31 04:41 pm (UTC)(link)
Bare ampersands could *easily* be causing problems with some browsers. They're not allowed by the specification, but many browsers will make a guess and render them anyway. (I think often weird HTML problems that only occur on some browsers are because there's something wrong with your source and the popular browsers are made to make guesses in those situations so the page will render, even though it's wrong.)

If you ever want me to test things in FF, links, and/or lynx on Linux, I'd be happy to, btw.

[identity profile] brewsternorth.livejournal.com 2009-03-31 04:26 pm (UTC)(link)
Apparently FF and Safari use two different character sets as their defaults: Firefox uses Western (ISO-8859-1), and Safari (and probably others) uses Western (ISO Latin 1). I don't know if ISO-8859-1 is more restrictive than Latin 1, but it might explain the discrepancy in how accents and other characters are rendered.

[identity profile] rm.livejournal.com 2009-03-31 04:27 pm (UTC)(link)
Wow. Learn something new you never wanted to know every day.

Oi.

[identity profile] filkerdave.livejournal.com 2009-03-31 04:39 pm (UTC)(link)
Interesting. Surely that would take it from the system default character set?

[personal profile] sethg 2009-03-31 04:51 pm (UTC)(link)
Latin-1 and ISO-8859-1 are the same thing.

However, the default character set on English-language Windows systems is Windows-1252, which is almost like Latin-1 except that in some places where Latin-1 has control characters, Windows-1252 has curly quotes (“”), long dashes, and other visible characters. The Web is so full of pages that represent themselves as Latin-1 (or even ASCII) and are actually coded in Windows-1252 that some browsers have just thrown in the towel and treat all these pages as Windows-1252. grumble grumble grumble Microsoft hegemony grumble grumble.

(which of course may have nothing to do with the OP's problem.)
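
A small sketch of the Latin-1 versus Windows-1252 difference described above. The sample bytes and the helper name `decode_both` are made up for illustration; the codec names are the ones Python ships with.

```python
import unicodedata


def decode_both(raw: bytes) -> tuple[str, str]:
    """Decode the same bytes as Windows-1252 and as Latin-1."""
    return raw.decode("cp1252"), raw.decode("latin-1")


# 0x93 and 0x94 are the curly double quotes in Windows-1252.
raw = bytes([0x93]) + b"Hi" + bytes([0x94])
as_cp1252, as_latin1 = decode_both(raw)

print(as_cp1252)        # “Hi”
print(repr(as_latin1))  # '\x93Hi\x94'

# In Latin-1 the same byte is a control character (category Cc), not a quote.
print(unicodedata.category(as_latin1[0]))  # Cc
```

A browser that takes the Latin-1 label literally has nothing visible to draw for those two bytes, which is one plausible source of "freaky control characters".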

[identity profile] browneyedgirl65.livejournal.com 2009-03-31 04:54 pm (UTC)(link)
In particular, the Mac made some different decisions with respect to the general issue of rendering non-standard characters. These decisions are not outside the scope of the relevant standards because there's some room for interpretation. Anyway, browsers on the Macintosh will simply handle some of these characters differently. It's easy enough to put out HTML and encodings that will work properly on both once you understand this issue. HTML entities are the safest; Unicode encodings are also good to go. For your purposes the entities should be just fine.

[identity profile] browneyedgirl65.livejournal.com 2009-03-31 04:27 pm (UTC)(link)
Mac has all kinds of issues with different font sets. Without knowing more, it's hard to say, but is this on a non-English page? With more than standard ASCII?

[identity profile] brewsternorth.livejournal.com 2009-03-31 04:28 pm (UTC)(link)
It has a small quantity of non-English accents, but it's also failing to recognise whichever Latin-1 character represents opening and closing quotation marks.

[identity profile] browneyedgirl65.livejournal.com 2009-03-31 04:29 pm (UTC)(link)
Oi. Replace all those quotes with ampersand-quot-semicolon instead, and see how that works.

[identity profile] rm.livejournal.com 2009-03-31 04:31 pm (UTC)(link)
Dumb, dumb HTML question:

do I write out "ampersand-quot-semicolon" or do I write out &";

am unclear.
Edited 2009-03-31 16:33 (UTC)

[identity profile] browneyedgirl65.livejournal.com 2009-03-31 04:34 pm (UTC)(link)
Sorry, you would use the single character ampersand where I say ampersand. I can't tell if LJ is going to render it or not, sometimes.

Use this:

&amp; to print out an ampersand character

&quot; to print out a properly escaped quotation character

These are called html entities, and there's a list of them here: http://htmlhelp.com/reference/html40/entities/ plus other places if you google

[identity profile] rm.livejournal.com 2009-03-31 04:35 pm (UTC)(link)
Thank you thank you thank you!

I feel embarrassed that I've never needed to know this before.

[identity profile] filkerdave.livejournal.com 2009-03-31 04:37 pm (UTC)(link)
Yes, and they're also the safest way to do accented characters if you do the occasional entry (as I do) in German or French.
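
For anyone who would rather not look each entity up by hand, here is a rough sketch that swaps characters for their named HTML entities. It assumes the text is already a Unicode string; `to_named_entities` is a made-up helper, while `codepoint2name` comes from the standard `html.entities` module.

```python
from html.entities import codepoint2name


def to_named_entities(text: str) -> str:
    """Replace every character that has a named HTML entity with that entity."""
    pieces = []
    for ch in text:
        name = codepoint2name.get(ord(ch))
        # Characters without a named entity (plain letters, digits, ...) pass through.
        pieces.append(f"&{name};" if name else ch)
    return "".join(pieces)


print(to_named_entities("“Fräulein” & café"))
# &ldquo;Fr&auml;ulein&rdquo; &amp; caf&eacute;
```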

[identity profile] stardragonca.livejournal.com 2009-03-31 08:01 pm (UTC)(link)
Hey! Something I can understand and use.
(Normal procedure: search for non-English word, phrase, or character online, then cut and paste. Clumsy, but me.)
Techno-peasantly yours.

[identity profile] filkerdave.livejournal.com 2009-03-31 08:24 pm (UTC)(link)
It works, certainly.

For email, I'm familiar enough with the alt-keypad method and I know most of the characters I need by heart :)

further confusion

[identity profile] jinian.livejournal.com 2009-03-31 04:36 pm (UTC)(link)
Neither: it should look like &quot; (once my HTML is processed).

" (if you are seeing this in plain text) is what to type.

[identity profile] misch.livejournal.com 2009-03-31 04:54 pm (UTC)(link)
There are always problems doing copy-pasta from word processors. Most notably, when Word or Open Office do double quote substitution.

Word:
Tools | Autocorrect Options | Auto-format as you type (tab)
Uncheck: Replace as you type "Straight quotes" with "smart quotes"
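
If the text has already been pasted with smart quotes in it, a cleanup pass is another option. This is only a sketch: the mapping covers the four common curly quotes and nothing else, and `straighten_quotes` is a hypothetical helper name.

```python
# Map the four common "smart" quotes to their straight ASCII equivalents.
SMART_TO_STRAIGHT = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
})


def straighten_quotes(text: str) -> str:
    """Replace curly quotes with straight ones; everything else is unchanged."""
    return text.translate(SMART_TO_STRAIGHT)


print(straighten_quotes("\u201cIt\u2019s fine,\u201d she said."))
# "It's fine," she said.
```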

[identity profile] browneyedgirl65.livejournal.com 2009-03-31 05:13 pm (UTC)(link)
And that is the reason Word is the bane of my existence :-P. Thanks for that tip, I'll have to remember it...

[identity profile] browneyedgirl65.livejournal.com 2009-03-31 08:25 pm (UTC)(link)
Yup, found my way there shortly thereafter. Cute :-)

[identity profile] rm.livejournal.com 2009-03-31 04:30 pm (UTC)(link)
Mostly English. A few foreign-language words. Also, there are ampersands, which I'm starting to suspect are the major issue.

[identity profile] browneyedgirl65.livejournal.com 2009-03-31 04:30 pm (UTC)(link)
Ampersands, as in ampersands that should show as ampersands? You can show the latter safely with ampersand-amp-semicolon

[identity profile] mecurtin.livejournal.com 2009-03-31 04:40 pm (UTC)(link)
In case it's not clear from what other people have said:

Use this table or something like it. Notorious offenders: ampersand, smart quotes (right and left "angled" quote marks), accented letters, everything after ASCII 126.
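
As an alternative to a lookup table, a sketch that turns everything outside plain ASCII into numeric character references in one pass. `to_ascii_html` is a made-up name; the `xmlcharrefreplace` error handler is part of Python's standard codecs.

```python
import html


def to_ascii_html(text: str) -> str:
    """Escape & < > and turn every non-ASCII character into a numeric reference."""
    escaped = html.escape(text, quote=False)
    # xmlcharrefreplace writes each unencodable character as &#NNN; (decimal).
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(to_ascii_html("café & “quotes”"))
# caf&#233; &amp; &#8220;quotes&#8221;
```

The result is pure ASCII, so it should not matter which character set the browser guesses.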

[personal profile] sethg 2009-03-31 05:02 pm (UTC)(link)
The "code charts" on the Unicode Web site provide the ultimate reference for this kind of thing. For example, this table shows (most of) the characters from the Zapf Dingbats font that have been incorporated into the Unicode standard; if you want to include a little scissors on your page then you can represent it in HTML with &#x2702; to get ✂.

(Of course if the person reading your page doesn't have a font with that character on his or her computer then it will show up as a funny block instead of a scissors--but memory and disk space are so cheap these days that most operating systems come with fonts that cover the majority of the Unicode space.)
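
A quick way to go between a character and its code-chart number, sketched with the standard library only. `numeric_reference` is a hypothetical helper name.

```python
import html
import unicodedata


def numeric_reference(ch: str) -> str:
    """Build the hexadecimal HTML reference for a single character."""
    return f"&#x{ord(ch):X};"


print(numeric_reference("✂"))       # &#x2702;
print(html.unescape("&#x2702;"))    # ✂
print(unicodedata.name("✂"))        # BLACK SCISSORS
```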