Re: [xep-support] Does XEP have an extension to conditional rendering of page number in xrefs?

From: Eliot Kimber <ekimber@innodata-isogen.com>
Date: Thu Sep 28 2006 - 07:48:51 PDT

Hood, Earl wrote:
>> You could do this at the XSLT level by using a two-pass process that
>> uses the XEP (or equivalent other proprietary intermediate form) area
>> tree to determine the relative page relationships of the different
>> elements and suppress them or not on the second pass.
>
> Have you implemented something that does this, and are willing to
> share?

We have done some limited things using Antenna House's Java API where we
pre-render specific constructs, such as tables or page sequences to see
how many pages they take up so that the second pass can then reflect
that (for example, when generating a list of effective pages that has to
reflect the length of the list of effective pages itself).

I don't think we've done anything more general but the basic technique
would be the same regardless of the details of the intermediate format:

1. Generate the initial FO instance, adding appropriate "marker"
elements or content (i.e., the rx:pinpoint Jirka mentioned or wrappers
with specific ID values or whatever will work) so that you can correlate
original input XML elements to their rendered location in the first pass.

2. Using the intermediate area tree, which reflects the paginated result
of processing the first FO instance, process the input XML again,
examining the pass 1 area tree as needed in order to make decisions
based on the pass-1 layout result to generate the pass 2 FO instance.

3. As Jirka points out, if necessary, run a third pass to resolve any
page break changes from pass 1 to pass 2. Ideally detect any oscillations
that result (i.e., a target on page X in pass 1, page Y in pass 2, back
on page X in pass 3).
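
As a rough illustration of step 3, here is a minimal Python sketch of the
"repeat until stable, but notice oscillation" loop. The `layout` callback is
hypothetical: it stands for whatever generates the FO instance, runs the
formatter, and returns a mapping from element ID to page number for that
pass. Nothing here is specific to XEP or any other engine.

```python
from typing import Callable, Dict, List, Optional, Tuple

PageMap = Dict[str, int]  # element ID -> page number it landed on


def run_until_stable(
    layout: Callable[[Optional[PageMap]], PageMap],
    max_passes: int = 5,
) -> Tuple[Optional[PageMap], str]:
    """Repeat formatting passes until the page map stops changing.

    `layout` is a hypothetical callback: it receives the previous pass's
    page map (None on the first pass) and returns the new page map.
    Returns (page_map, status), where status is "stable", "oscillating",
    or "gave-up".
    """
    history: List[PageMap] = []
    previous: Optional[PageMap] = None
    for _ in range(max_passes):
        current = layout(previous)
        # Same result as the immediately preceding pass: layout has settled.
        if history and current == history[-1]:
            return current, "stable"
        # Same result as some earlier pass: X, Y, X ... will never settle.
        if current in history[:-1]:
            return current, "oscillating"
        history.append(current)
        previous = current
    return previous, "gave-up"
```

When the status is "oscillating" the caller has to pick a tie-break policy
(for example, always emit the page number); the sketch only detects the
condition.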

There are several practical issues with this approach as a general
technique:

1. The different FO implementations have different non-standard ways of
representing their area trees. So this approach will by necessity be
engine-specific.

2. The area tree can be many times larger than the original input XML or
the resulting FO document, leading to potential performance issues with
either memory usage or I/O time needed to write and read the
intermediate file.

I have proposed a "standard" extension that would allow the creator of
the FO instance to write page-aware data to a "side file" that would
then only have the information you need to support a second pass.
However, none of the vendors have yet shown any interest in this
suggestion (see
http://exslfo.sourceforge.net/requirements.html#side-files for my
original statement of requirements). My initial idea was to have
something like this:

<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format"
xmlns:exslfo="http://exlso.org"
xmlns:msf="http://www.example.com/mysidefilemarkup"
>
<exslfo:side-file href="./sidefiles/mysidefile.xml"
   root-tag="msf:mysidefiledata"
/>
  ...
<fo:block id="block-01"
><exslfo:side-file-data xmlns="http://www.example.com/mysidefilemarkup">
<element-to-page-mapping-item>
<orig-element-id>para-01</orig-element-id>
<rendered-page><fo:page-number/></rendered-page>
</element-to-page-mapping-item>
</exslfo:side-file-data>
...{rest of content of the fo:block} ...
</fo:block>
...
</fo:root>

This would result in the following side file:

<?xml version="1.0"?>
<mysidefiledata xmlns="http://www.example.com/mysidefilemarkup">
<element-to-page-mapping-item>
<orig-element-id>para-01</orig-element-id>
<rendered-page>10</rendered-page>
</element-to-page-mapping-item>
</mysidefiledata>

This would let you create the smallest possible intermediate data set
needed to support your second pass processing.
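
To make the second-pass use concrete, here is a small Python sketch that
reads a side file shaped like the one above and answers the original
question (should an xref show a page number?). The element names come from
the example; the extra entries in SAMPLE, the function names, and the rule
"show the page number only when the target is on a different page" are my
assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Sample side file; "xref-01" and "para-02" are made-up extra entries.
SAMPLE = """<mysidefiledata xmlns="http://www.example.com/mysidefilemarkup">
<element-to-page-mapping-item>
<orig-element-id>para-01</orig-element-id>
<rendered-page>10</rendered-page>
</element-to-page-mapping-item>
<element-to-page-mapping-item>
<orig-element-id>xref-01</orig-element-id>
<rendered-page>10</rendered-page>
</element-to-page-mapping-item>
<element-to-page-mapping-item>
<orig-element-id>para-02</orig-element-id>
<rendered-page>12</rendered-page>
</element-to-page-mapping-item>
</mysidefiledata>"""


def local(tag: str) -> str:
    """Strip the '{namespace}' prefix ElementTree puts on a tag, if any."""
    return tag.rsplit("}", 1)[-1]


def load_page_map(side_file_xml: str) -> Dict[str, int]:
    """Return {original element ID: rendered page} from side-file text."""
    pages: Dict[str, int] = {}
    root = ET.fromstring(side_file_xml)
    for item in root.iter():
        if local(item.tag) != "element-to-page-mapping-item":
            continue
        # Collect the item's child elements by local name.
        fields = {local(c.tag): (c.text or "").strip() for c in item}
        pages[fields["orig-element-id"]] = int(fields["rendered-page"])
    return pages


def xref_needs_page(pages: Dict[str, int], source_id: str, target_id: str) -> bool:
    """Assumed rule: print the page number only if the target is elsewhere."""
    return pages[source_id] != pages[target_id]


pages = load_page_map(SAMPLE)
print(xref_needs_page(pages, "xref-01", "para-01"))  # same page
print(xref_needs_page(pages, "xref-01", "para-02"))  # different page
```

The parser matches on local names only, so it does not care which prefix
(or default namespace) the formatter happens to use when writing the file.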

The reason this approach would not be appropriate for the FO standard
itself is that it presumes a particular implementation processing
approach, namely a linear pass over the document, which the FO
specification does not require. However, for any tool that does in fact
do linear processing (which includes XEP, XSL Formatter, and FOP) it
seems like a perfectly reasonable approach.

Note also that you can get the same effect, although with a little less
convenience, by generalizing Ken Holman's technique for putting this type
of data in your generated PDF and then extracting it from there on the
second pass (see
http://www.cranesoftwrights.com/resources/bbi/index.htm). This approach
is FO-implementation-independent but does require the ability to extract
data from PDF (which isn't that hard, but isn't part of the usual standard
tool set).

Cheers,

Eliot

-- 
W. Eliot Kimber
Professional Services
Innodata Isogen
9390 Research Blvd, #410
Austin, TX 78759
(214) 954-5198
ekimber@innodata-isogen.com
www.innodata-isogen.com
Received on Thu Sep 28 07:54:52 2006
