Re: [xep-support] linefeed normalization

From: G. Ken Holman
Date: Sun Apr 04 2004 - 06:36:52 PDT

    At 2004-04-04 00:53 -0700, Victor Mote wrote:
    >Section 7.1 of the document "XSL Formatting Objects in XEP 3.7" says:
    >"Line break is forced by explicit linefeed characters: U+000A, U+000D,
    >U+2028, U+2029, unless they are suppressed by linefeed normalization;"
    >By "linefeed normalization", I assume you mean the linefeed-treatment
    >attribute. However, linefeed-treatment seems to only apply to U+000A
    >(although I suppose that they mean U+000D and combinations of the two also).

    Linefeed normalization is not linefeed treatment. The XML processor inside
    of the XSL-FO processor normalizes "natural" end-of-line sequences (naked
    LF, CR, CR+LF) into linefeed characters.

    An XML instance can bypass XML processor normalization by using numeric
    character references for these characters, at which point they are not
    naked and are not normalized.



    is different than:


    to an application reading these with an XML processor.
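
    The difference between naked and referenced end-of-line characters can be
    checked with a conforming XML parser. The following is only a minimal
    sketch using Python's standard ElementTree parser, not XEP itself; the
    element name "name" is borrowed from the stylesheet and the sample text
    is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Literal ("naked") CR+LF between the two words: the XML processor
# normalizes the pair to a single LF before the application sees it.
naked = ET.fromstring("<name>first\r\nsecond</name>")

# The same two characters written as numeric character references:
# they bypass end-of-line normalization and arrive unchanged.
referenced = ET.fromstring("<name>first&#xd;&#xa;second</name>")

print(repr(naked.text))       # 'first\nsecond'
print(repr(referenced.text))  # 'first\r\nsecond'
```

    The application only ever sees a CR if it was written as a reference.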

    >Section 13.2 of the Unicode 3.0 standard describes U+2028 as an "unambiguous
    >character ... line separator...". Is there a mechanism in XEP that will
    >allow one to do *both* of the following: 1) treat U+000A (and U+000D) as
    >spaces (the default behavior), and 2) force a line break at U+2028?

    Do you have the opportunity to do some translation in your stylesheet? An
    example is below of playing with linefeed preservation. I acknowledge that
    doing the transformation may mess up the "spaces adjacent to linefeed"
    processing in XSL-FO.
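
    For readers less familiar with XPath's translate(), the character mapping
    used in the translation technique can be sketched in Python. This is an
    illustration of the mapping only, assuming a made-up sample string; the
    function name translate_linefeeds is hypothetical and not part of any
    XSL-FO tool.

```python
# Same character-for-character mapping as the XPath call
# translate(., '&#xa;&#x2028;&#xd;', ' &#xa; '):
#   LF -> space, U+2028 -> LF, CR -> space
LINE_MAP = str.maketrans("\n\u2028\r", " \n ")


def translate_linefeeds(text: str) -> str:
    """Turn LF and CR into spaces and U+2028 into LF."""
    return text.translate(LINE_MAP)


sample = "first\nline\u2028second\rline"  # made-up sample text
print(repr(translate_linefeeds(sample)))  # 'first line\nsecond line'
```

    After this mapping the only linefeeds left in the text are the ones that
    stood for U+2028, so linefeed-treatment="preserve" breaks only there.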

    You could go to the extent of a recursive call processing the text nodes
    doing a replacement of U+2028 characters with empty blocks. I've included
    that in the example as well.
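
    The recursion itself is simple: take the text before the first U+2028,
    then repeat on whatever follows it. A rough Python sketch of that logic,
    assuming a hypothetical helper named split_on_line_separator (the
    stylesheet emits an empty block between pieces instead of building a
    list):

```python
from typing import List

LINE_SEPARATOR = "\u2028"


def split_on_line_separator(text: str) -> List[str]:
    """Return the pieces of text that lie between U+2028 characters."""
    if LINE_SEPARATOR not in text:
        # Base case: nothing left to split, emit the text as it is.
        return [text]
    # Like substring-before() and substring-after() around the first U+2028.
    before, _, after = text.partition(LINE_SEPARATOR)
    return [before] + split_on_line_separator(after)


print(split_on_line_separator("one\u2028two\u2028three"))
```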

    Both techniques give the desired result.

    I hope this helps.

    .......................... Ken

    T:\ftemp>type mote.xml

    T:\ftemp>type mote.xsl
    <?xml version="1.0" encoding="utf-8"?><!--mote.xsl-->
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                    xmlns="http://www.w3.org/1999/XSL/Format"
                    version="1.0">

    <xsl:template match="/names">
       <root>
         <layout-master-set>
           <simple-page-master master-name="frame"
                               page-height="297mm" page-width="210mm"
                               margin-top="15mm" margin-bottom="15mm"
                               margin-left="15mm" margin-right="15mm">
             <region-body region-name="frame-body"/>
           </simple-page-master>
           <page-sequence-master master-name="frame-pages">
             <single-page-master-reference master-reference="frame"/>
           </page-sequence-master>
         </layout-master-set>
         <page-sequence master-reference="frame-pages">
           <flow flow-name="frame-body">
             <xsl:apply-templates select="name" mode="translate"/>
             <xsl:apply-templates select="name" mode="recurse"/>
           </flow>
         </page-sequence>
       </root>
    </xsl:template>

    <!--preservation through translation and setting of properties-->
    <xsl:template match="name" mode="translate">
       <block linefeed-treatment="preserve">
         <xsl:apply-templates mode="translate"/>
       </block>
    </xsl:template>

    <xsl:template match="text()" mode="translate">
       <xsl:value-of select="translate(.,'&#xa;&#x2028;&#xd;',' &#xa; ')"/>
    </xsl:template>

    <!--preservation solely through recognition of special character-->
    <xsl:template match="name" mode="recurse">
       <block>
         <xsl:apply-templates mode="recurse"/>
       </block>
    </xsl:template>

    <xsl:template match="text()" mode="recurse" name="recurse">
       <xsl:param name="text" select="string(.)"/>
       <xsl:choose>
         <xsl:when test="contains($text,'&#x2028;')">
           <xsl:value-of select="substring-before($text,'&#x2028;')"/>
           <!--an empty block forces the break where U+2028 stood-->
           <block/>
           <xsl:call-template name="recurse">
             <xsl:with-param name="text"
                             select="substring-after($text,'&#x2028;')"/>
           </xsl:call-template>
         </xsl:when>
         <xsl:otherwise>
           <xsl:value-of select="$text"/>
         </xsl:otherwise>
       </xsl:choose>
    </xsl:template>

    </xsl:stylesheet>



