Re: [xep-support] Rendering Unicode combined diacritic characters

From: Jim Skelton (jimskelton@yahoo.com)
Date: Thu May 20 2004 - 11:32:59 PDT

    Hi Nikolai,

    Thanks for letting me know about this limitation in
    XEP.

    I'm not an expert on Unicode character rendering, but
    I understand that the Unicode specification allows for
    combining diacritics and leaves it up to the
    application to render them acceptably. Microsoft's
    proprietary Uniscribe has gone a long way toward
    accomplishing this (I'm not sure how they render the
    acute barred i, but it looks OK). I'm sure it wasn't
    easy.
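
    To make the two problem sequences from my original
    message concrete, here is a minimal Python sketch. It
    assumes only the standard unicodedata module; the
    helper names describe and has_precomposed_form are
    made up for illustration. It suggests that
    normalization alone cannot help, because neither
    sequence has a single precomposed code point, so the
    renderer has to position the accent itself.

```python
import unicodedata

# The two sequences reported in this thread (labels are informal).
SEQUENCES = {
    "e dieresis + acute": "\u00eb\u0301",
    "barred i + acute": "\u0268\u0301",
}


def describe(text):
    """Return (code point, Unicode name, canonical combining class) per character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.combining(ch))
        for ch in text
    ]


def has_precomposed_form(text):
    """True if NFC normalization collapses the sequence to one code point."""
    return len(unicodedata.normalize("NFC", text)) == 1


for label, seq in SEQUENCES.items():
    print(label, describe(seq), has_precomposed_form(seq))
```

    A non-zero combining class (230 for U+0301) marks a
    character that attaches above its base. By contrast,
    plain e followed by U+0301 does normalize to a single
    code point, which is why common accented letters
    usually render fine even without mark positioning.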

    You may want to take a look at SIL's Graphite at
    http://sourceforge.net/projects/silgraphite which is a
    system that enables rendering of non-Roman and other
    complex scripts. I'm not sure how portable it is, or
    whether they have any Java libraries available. It
    also renders the characters quoted below correctly.
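
    As a rough illustration of the kind of adjustment a
    rendering engine has to make, here is a sketch of
    anchor-based mark placement in Python. This is not
    Graphite's or Uniscribe's API. The Glyph class, the
    function anchored_mark_offset, the glyph names, and
    every metric value are hypothetical, chosen only to
    mirror the two symptoms I reported (accent too low,
    accent too far left).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Glyph:
    """Hypothetical glyph record; all values are in font units."""
    name: str
    advance: int              # horizontal advance width
    anchor: Tuple[int, int]   # attachment point in the glyph's own coordinates


def anchored_mark_offset(base: Glyph, mark: Glyph) -> Tuple[int, int]:
    """Offset to apply to the mark, relative to the pen position after
    the base glyph, so that the mark's anchor lands on the base's anchor.
    An offset of (0, 0) means the mark is drawn exactly as the font
    designed it, with no adjustment."""
    base_x, base_y = base.anchor
    mark_x, mark_y = mark.anchor
    dx = base_x - base.advance - mark_x
    dy = base_y - mark_y
    return (dx, dy)


# Made-up metrics: the acute is a zero-width mark drawn to the left of
# the pen, designed to sit over a 500-unit-wide letter of x-height 500.
ACUTE = Glyph("acutecomb", advance=0, anchor=(-250, 500))
E_DIERESIS = Glyph("edieresis", advance=500, anchor=(250, 700))
BARRED_I = Glyph("istroke_dotless", advance=280, anchor=(140, 500))

for base in (E_DIERESIS, BARRED_I):
    print(base.name, anchored_mark_offset(base, ACUTE))
```

    With these invented numbers the acute is raised by 200
    units over the e dieresis and shifted right by 110
    units over the narrow barred i. A real engine would
    read such anchors from the font's own tables, and
    would also need a dotless form of the base glyph.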

    I would think that it may be in RenderX's interests to
    develop Unicode rendering capability, though this may
    not be part of RenderX's market. Adding a Unicode
    rendering component could potentially expand RenderX's
    market...

    Thanks for your consideration of this.

    --Jim

    --- Nikolai Grigoriev <grig@renderx.com> wrote:
    > Jim,
    >
    > > I'm finding that, when displaying a PDF file
    > > generated from XEP, Unicode combining diacritics
    > > don't render correctly.
    >
    > You are right. XEP does not do any special processing
    > for combining diacritics: it just displays diacritic
    > signs as standalone characters. (They show above their
    > preceding characters only by virtue of their glyph
    > metrics.)
    >
    > I admit this is a drawback in XEP, but it is not an
    > easy one to overcome. Proper diacritic placement
    > requires parsing of complex font structures; we don't
    > have immediate plans in that direction.
    >
    > > You'll notice that the barred i does have
    > > the acute accent over it (though a little too far
    > > left), but the dot on the i is retained (it
    > > shouldn't be retained when combined with an acute
    > > accent). This character is represented by x0268
    > > x0301.
    >
    > I don't see how we could achieve this, unless there
    > is a special glyph for barred dotless i. XEP certainly
    > cannot decompose glyph descriptions from fonts.
    >
    > > Microsoft's Unicode rendering engine (Uniscribe)
    > > only recently became able to render these character
    > > combinations correctly. There are other Unicode
    > > rendering schemes, such as SIL's Graphite...
    >
    > We don't use any third-party Unicode processors:
    > our past experience made us very suspicious of
    > external components, as they seriously undermine
    > portability.
    >
    > Best regards,
    > Nikolai Grigoriev
    > RenderX
    >
    >
    > ----- Original Message -----
    > From: "Jim Skelton" <jimskelton@yahoo.com>
    > To: <xep-support@renderx.com>
    > Sent: Tuesday, May 18, 2004 3:36 AM
    > Subject: [xep-support] Rendering Unicode combined diacritic characters
    >
    >
    > I'm finding that, when displaying a PDF file generated
    > from XEP, Unicode combining diacritics don't
    > render correctly. An example of this is at
    > http://www3.telus.net/osis/CNT.2john.pdf -- in the
    > title, 3rd word from the left, you'll notice that the
    > acute accent over the e dieresis is too low. This
    > character is represented by the two code points x00EB
    > x0301. The latter is a combining diacritic,
    > meaning that it overstrikes the preceding character.
    > In theory, the rendering engine should place the
    > combining diacritic at the correct height, depending
    > on the metrics of the base character.
    >
    > Another example of this is in the acute accent over
    > barred i, found in the first line of regular text,
    > 2nd word over. You'll notice that the barred i does
    > have the acute accent over it (though a little too far
    > left), but the dot on the i is retained (it shouldn't
    > be retained when combined with an acute accent). This
    > character is represented by x0268 x0301.
    >
    > The embedded font used in this PDF file is a Unicode
    > font which contains many of the Latin character
    > subsets.
    >
    > Is there something that can be done to render Unicode
    > combining diacritics correctly? I noticed that
    > Microsoft's Unicode rendering engine (Uniscribe)
    > only recently became able to render these character
    > combinations correctly. There are other Unicode
    > rendering schemes, such as SIL's Graphite...
    >
    > --Jim
    > -------------------
    > (*) To unsubscribe, send a message with words
    > 'unsubscribe xep-support'
    > in the body of the message to majordomo@renderx.com
    > from the address
    > you are subscribed from.
    > (*) By using the Service, you expressly agree to
    > these Terms of Service
    > http://www.renderx.com/tos.html
    >

            
                    



    This archive was generated by hypermail 2.1.5 : Thu May 20 2004 - 11:46:28 PDT