Re: [xep-support] Question about rendering character references

From: Eliot Kimber <ekimber@innodata-isogen.com>
Date: Fri Jul 22 2005 - 11:34:08 PDT

Powell, Todd wrote:
> However, in all these cases, when the FO is pushed through XEP, the
> character doesn't show in the resulting PDF.
>
> Any suggestions on what I'm doing wrong.

You need to use a font that has a glyph for that character. On Windows
2000 and XP you can use the Character Map utility to check whether the
font you're using has one. If you are specifying a font list in which
only some of the fonts have a glyph for the character, you may also need
to change the font selection strategy to "character-by-character". See a
recent response on this list regarding another font issue for details on
font selection strategy.

The encoding you use doesn't matter--it's the same abstract character in
the parsed XML.
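
To illustrate that point, here is a minimal Python sketch. It assumes
U+2022 (BULLET) as a stand-in, since the actual character isn't named in
this message, and the helper name text_of is made up for the example. It
parses the same tiny document serialized three ways and compares what
the parser hands back.

```python
import xml.etree.ElementTree as ET

# U+2022 BULLET is only a stand-in for the character in question.
CHAR = "\u2022"

def text_of(xml_bytes):
    """Parse an XML document given as bytes; return the root element's text."""
    return ET.fromstring(xml_bytes).text

# The same document in three serializations.
as_utf8 = ('<?xml version="1.0" encoding="UTF-8"?><p>' + CHAR + '</p>').encode("utf-8")
# Python's "utf-16" codec writes a byte order mark before the content.
as_utf16 = ('<?xml version="1.0" encoding="UTF-16"?><p>' + CHAR + '</p>').encode("utf-16")
# Pure ASCII bytes, with the character written as a numeric reference.
as_charref = b'<?xml version="1.0"?><p>&#x2022;</p>'

results = [text_of(doc) for doc in (as_utf8, as_utf16, as_charref)]
print(results == [CHAR, CHAR, CHAR])
```

After parsing, nothing distinguishes the three inputs, so the choice of
encoding or character reference can't be what hides the glyph.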

The reason your UTF-8 representation looked wrong but the UTF-16 version
looked correct is that whatever you used to look at the UTF-8 version
interpreted it as a single-byte encoding (ASCII or a Windows code page),
not UTF-8. This doesn't happen with UTF-16 XML documents because they
must start with the "byte order mark", which unambiguously marks them as
UTF-16. UTF-8 doesn't require a byte order mark (but one can be used).
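
A small Python sketch of the effect, again using U+2022 as a stand-in
character and assuming the viewer fell back to Windows-1252 (a guess on
my part; any single-byte code page gives similar garbage):

```python
import codecs

CHAR = "\u2022"  # stand-in character

utf8_bytes = CHAR.encode("utf-8")    # three bytes, no byte order mark
utf16_bytes = CHAR.encode("utf-16")  # byte order mark, then one 16-bit unit

# A viewer that wrongly assumes a single-byte code page turns each
# UTF-8 byte into a separate character.
misread = utf8_bytes.decode("cp1252")

# The UTF-16 bytes announce themselves with a byte order mark.
starts_with_bom = utf16_bytes.startswith(
    (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)
)

print(utf8_bytes)                          # raw UTF-8 bytes
print(len(misread))                        # one character per byte
print(starts_with_bom)
print(utf8_bytes.decode("utf-8") == CHAR)  # the right decoder recovers it
```

The bytes in the file are fine in both cases; only the decoder chosen by
the viewer differs.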

UTF-8 is designed to be compatible with ASCII: the first 128 characters
(code points 0 through 127) are encoded exactly as in ASCII. Thus, if
the tool opening the file doesn't know it's UTF-8 and tries to guess, it
might guess wrong if all it sees are ASCII characters. Most encoding
guessers look at the first 1000 or so characters; if they find a
non-ASCII UTF-8 sequence they conclude the file is UTF-8, and if they
don't, they assume ASCII. But, as in your case, they can assume wrong,
and then you see what you saw--apparent garbage characters.
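
The kind of guessing described above can be sketched roughly like this
in Python. The function guess_encoding and its 1000-byte sample size are
hypothetical, not taken from any particular editor, and real tools use
more elaborate heuristics.

```python
import codecs

def guess_encoding(data, sample_size=1000):
    """Naive guess based only on the first sample_size bytes (illustrative)."""
    sample = data[:sample_size]
    if sample.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    if sample.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if all(b < 0x80 for b in sample):
        return "ascii"  # nothing but ASCII bytes in the sample
    try:
        # final=False tolerates a multi-byte sequence cut off at the
        # sample boundary.
        codecs.getincrementaldecoder("utf-8")().decode(sample, final=False)
        return "utf-8"
    except UnicodeDecodeError:
        return "unknown"

early = "\u2022 item".encode("utf-8")           # non-ASCII right at the start
late = ("x" * 2000 + "\u2022").encode("utf-8")  # non-ASCII past the sample

print(guess_encoding(early))
print(guess_encoding(late))
```

The second file is valid UTF-8, but its only non-ASCII character lies
beyond the sampled bytes, so a guesser of this kind reports ASCII.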

Using a tool like SC UniPad or TextPad, you can force the encoding on
open and then get an accurate picture of your file's contents.
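
If you'd rather check from a script, the same idea (stating the encoding
explicitly instead of letting a tool guess) looks roughly like this in
Python. The temporary file, its .fo suffix, and the fo:inline content
are made up for the example, and U+2022 is again a stand-in character.

```python
import os
import tempfile

CHAR = "\u2022"  # stand-in character

# Write a small UTF-8 file to inspect (hypothetical content).
with tempfile.NamedTemporaryFile("wb", suffix=".fo", delete=False) as f:
    f.write(("<fo:inline>" + CHAR + "</fo:inline>").encode("utf-8"))
    path = f.name

try:
    # Force the correct encoding on open.
    with open(path, encoding="utf-8") as f:
        forced = f.read()
    # Force a wrong single-byte encoding to reproduce the garbage.
    with open(path, encoding="cp1252") as f:
        wrong = f.read()
finally:
    os.remove(path)

print(CHAR in forced)
print(CHAR in wrong)
```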

Cheers,

Eliot

-- 
W. Eliot Kimber
Professional Services
Innodata Isogen
9390 Research Blvd, #410
Austin, TX 78759
(512) 372-8155
ekimber@innodata-isogen.com
www.innodata-isogen.com
Received on Fri Jul 22 11:58:46 2005
