RE: [xep-support] ePUB

From: Kevin Brown <kevin@renderx.com>
Date: Wed Feb 17 2010 - 13:34:02 PST

Greg/Dave/Werner:

Excellent discussions, thanks.

I am not going to rehash all that was said, except to say that I agree with
the general positions. Dave is exactly right -- I believe you don't need a
vendor beyond an XSLT vendor (or your own scripting language) and some other
tools to do things like zip up containers. Going from XML to XSL FO to ePUB
is the wrong path.
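
As a rough illustration of the "zip up containers" step, here is a minimal
Python sketch using only the standard library. It relies on two container
rules from the ePUB OCF specification: the "mimetype" entry comes first and is
stored uncompressed, and META-INF/container.xml points at the package file.
The function name build_epub and the path OEBPS/content.opf are assumptions
made for this example, not part of any RenderX tool.

```python
import io
import zipfile

# container.xml tells the reader where the package (OPF) file lives.
# The OEBPS/content.opf location is a common convention, assumed here.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf"
              media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def build_epub(files):
    """Pack {archive path: bytes or str} into an in-memory ePUB zip (hypothetical helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # The mimetype entry must be the first entry and must not be compressed.
        zf.writestr(zipfile.ZipInfo("mimetype"), "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", CONTAINER_XML)
        # Everything else (OPF, NCX, XHTML, CSS, images) can be deflated.
        for name in sorted(files):
            zf.writestr(name, files[name])
    return buf.getvalue()
```

The XHTML, OPF and NCX files themselves would still come out of your own
XSLT; this only covers the packaging.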

Keep in mind, though, that the XSLT you use to go from XML to XSL FO likely
has some of the same structural analysis (for TOC, bookmarks, formatting
differences for fo:blocks for headers, ...) that you can leverage for
conversion to the desired ePUB. In other words, you are probably halfway
there with your current XSLT. You have the analysis side; you just need to
change what is output when that structure is recognized.
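
To make the "same analysis, different output" point concrete, here is a small
sketch. It is written in Python rather than XSLT only to keep it
self-contained. The source element names (chapter, title), the function names,
and the one-XHTML-file-per-chapter naming are all assumptions for
illustration. The fo:bookmark-tree and NCX navMap vocabularies are the
standard ones.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
NCX_NS = "http://www.daisy.org/z3986/2005/ncx/"


def find_chapters(root):
    """Shared analysis step: (id, title) for every chapter in the source XML.

    Assumes each <chapter> carries an id attribute and a <title> child.
    """
    return [(ch.get("id"), ch.findtext("title", default=""))
            for ch in root.iter("chapter")]


def fo_bookmarks(chapters):
    """Print path: an fo:bookmark-tree, which the formatter turns into PDF bookmarks."""
    tree = ET.Element(f"{{{FO_NS}}}bookmark-tree")
    for chap_id, title in chapters:
        bm = ET.SubElement(tree, f"{{{FO_NS}}}bookmark",
                           {"internal-destination": chap_id})
        ET.SubElement(bm, f"{{{FO_NS}}}bookmark-title").text = title
    return tree


def ncx_navmap(chapters):
    """ePUB path: an NCX navMap built from the very same chapter list."""
    nav_map = ET.Element(f"{{{NCX_NS}}}navMap")
    for order, (chap_id, title) in enumerate(chapters, start=1):
        point = ET.SubElement(nav_map, f"{{{NCX_NS}}}navPoint",
                              {"id": chap_id, "playOrder": str(order)})
        label = ET.SubElement(point, f"{{{NCX_NS}}}navLabel")
        ET.SubElement(label, f"{{{NCX_NS}}}text").text = title
        # Assumed layout: one XHTML file per chapter, named after its id.
        ET.SubElement(point, f"{{{NCX_NS}}}content", {"src": f"{chap_id}.xhtml"})
    return nav_map
```

Only the two emitter functions differ; find_chapters plays the role of the
template matching you already have in your FO stylesheet.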

One other note on RenderX's HTML format (and SVG format). I want to be sure
everyone understands exactly what it is and why it came about. Several of
our larger customers did not want to force their customers on their
self-service portals to have Adobe software installed at all. They were
producing monthly investment statements and bills dynamically online. The
customer had the option of getting a PDF -- but it was not the main option.
BUT, they also had a requirement that the HTML view looked as close as
possible to the PDF view. So, we decided to create a conversion from XEP
Intermediate Format to HTML and SVG. It is post composition. The
Intermediate Format would have <page> elements and <text> fragments and
<image> and <line> and <rect> ... all with exact X,Y coordinates within our
page coordinate space. We convert this to HTML with extensive CSS and
absolute positioned containers. It doesn't flow .. at all. It looks like the
page no matter what.
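
For anyone who wants a feel for what "absolute positioned containers" means
in practice, here is a deliberately simplified Python sketch. It is not the
RenderX converter and it does not read the real XEP Intermediate Format. The
un-namespaced <page> and <text> elements, the attribute names (width, height,
x, y, value), the point units and the bottom-left origin are all assumptions
made for this example.

```python
import xml.etree.ElementTree as ET
from html import escape


def page_to_html(page):
    """Turn one simplified <page> element into absolutely positioned HTML.

    Assumed input: <page width=".." height=".."> containing
    <text x=".." y=".." value=".."/>, coordinates in points measured
    from the bottom-left corner of the page.
    """
    width, height = page.get("width"), page.get("height")
    parts = [f'<div class="page" style="position:relative;'
             f'width:{width}pt;height:{height}pt">']
    for t in page.iter("text"):
        # Each fragment is pinned to its coordinates; nothing reflows.
        style = (f'position:absolute;left:{t.get("x")}pt;'
                 f'bottom:{t.get("y")}pt;white-space:nowrap')
        parts.append(f'  <span style="{style}">{escape(t.get("value", ""))}</span>')
    parts.append("</div>")
    return "\n".join(parts)
```

Because every fragment carries its own coordinates, resizing the browser
window changes nothing, which is exactly why this output is a poor starting
point for ePUB.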

Kevin Brown
RenderX

-----Original Message-----
From: owner-xep-support@renderx.com [mailto:owner-xep-support@renderx.com]
On Behalf Of Greg Baryza
Sent: Wednesday, February 17, 2010 7:36 AM
To: xep-support@renderx.com
Subject: Re: [xep-support] ePUB

At 03:42 PM 2/17/2010, Werner Donné wrote:
>Hi Greg,
>
>Generating ePub from either PDF or XSL-FO will never produce optimal
>results. It is true you could flow the text in them on an eReader. You can
>even create a layout that comes close using CSS. The problem, however, is
>that for a good eReader-experience you need structure and navigation
>information. The chapter headings, for example, can be used to generate
>the NCX navigation table in ePub, but how can you tell what is a chapter
>heading in PDF or XSL-FO?
>
>You also need semantic information to cut up a large document into parts,
>because otherwise the eReader will choke on it.
>
>If I understand it correctly you would like to recover the effort you put
>in producing the formatting in XSL-FO. You could transform XSL-FO to
>XHTML, ignoring unsupported constructs. Formatting properties can be
>translated to CSS-properties in a "style" attribute. Properties for which
>there is no equivalent in CSS are simply dropped. The result will be a
>document containing a lot of "div", "span", "table", "img" and "a"
>elements and not much more. In order to obtain a navigation structure you
>can place ID-attributes on the fo:block elements that correspond to the
>chapters or sections in the original document. Your chapter structure
>would be encoded in those IDs.
>
>Best regards,
>
>Werner.
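
A rough Python sketch of the FO-to-XHTML translation Werner describes above,
assuming a very small property table. The mapping is illustrative and
approximate: space-before and space-after are treated as margins, which is
not an exact equivalent. The element table and the function names are
likewise assumptions. Properties with no entry are simply dropped, and id
attributes are carried over so they can later serve as navigation targets.

```python
import xml.etree.ElementTree as ET

FO = "{http://www.w3.org/1999/XSL/Format}"

# FO properties whose names and values can be copied to CSS unchanged.
SAME_NAME = {"font-size", "font-weight", "font-style", "font-family",
             "color", "text-align", "line-height"}
# FO properties with an approximate CSS counterpart (assumed mapping).
RENAMED = {"space-before": "margin-top", "space-after": "margin-bottom"}
# Element mapping; anything unknown falls back to div.
ELEMENTS = {"block": "div", "inline": "span"}


def css_style(attrib):
    """Build a CSS declaration list; properties without a mapping are dropped."""
    decls = []
    for name, value in attrib.items():
        css_name = name if name in SAME_NAME else RENAMED.get(name)
        if css_name:
            decls.append(f"{css_name}:{value}")
    return ";".join(decls)


def fo_to_xhtml(fo_elem):
    """Recursively convert an FO element into a div/span tree with style attributes."""
    local = fo_elem.tag.replace(FO, "")
    out = ET.Element(ELEMENTS.get(local, "div"))
    if fo_elem.get("id"):
        out.set("id", fo_elem.get("id"))  # kept as a navigation anchor
    style = css_style(fo_elem.attrib)
    if style:
        out.set("style", style)
    out.text = fo_elem.text
    for child in fo_elem:
        converted = fo_to_xhtml(child)
        converted.tail = child.tail
        out.append(converted)
    return out
```

As Werner says, the result is little more than div and span elements with
inline styles; the chapter structure still has to come from the ids or from
the original XML.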

Werner:

Thank you for that. It gives me further understanding and something to
work on.

Our documentation is presently in XML. It is loaded into a database (one
of my company's products) and stored as objects that know how to render
themselves in HTML. We use this capability to serve up documentation on
demand to browsers.

Each object also knows how to re-render itself in XML. To produce PDFs we
first get XML files of the top-level documents from the database (which
recursively renders the children). Then we pre-process them to add info
useful for printed material, but not so much for online access -- list of
tables, list of figures, legal boilerplate and so on. This we turn into
XSL-FO and give to XEP.

Sounds like a route to investigate would be to do another pre-processor that
would add the stuff needed by the e-reader and then transform that.

Much obliged.

>On 17 Feb 2010, at 14:10, Greg Baryza wrote:
>
> > At 01:25 PM 2/16/2010, Kevin Brown wrote:
> >> It has been a recent topic of discussion without any specific
> >> conclusions yet. From internal discussions, I can say ...
> >>
> >> Point (1) -- RenderX is a vendor that specializes in the "art" of
> >> composition -- meaning, layout of pages and composition of text like
> >> kerning, word and character spacing, calculating line ending and
> >> hyphenation ... the core engine of RenderX is used to do this. Consider
> >> this our special "ability".
> >>
> >> Point (2) -- ePUB is really not much at all of this. There are some
> >> elements that require analysis like embedding fonts, even restricted,
> >> subsetting fonts into the ePUB zip architecture. But it is designed to
> >> allow the device to flow the document, not create a pre-conceived flow
> >> of the document. It needs to do this to allow it to be viewed on any
> >> reader of any size, layout. In other words, XSL FO already has things
> >> that are not appropriate for ePUB or constructs that (are not or)
> >> should not be mapped.
> >>
> >> So ...
> >>
> >> Some feel that one should go from raw XML to ePUB with alternate
> >> mappings for appearance. There is a DITA module in development for
> >> this. There are google tools for DocBook and other formats ...
> >>
> >> http://code.google.com/p/epub-tools/
> >>
> >> We usually stay within the world of composition -- we produce PDF
> >> (including PDF/A, PDF/X, PDF Acroforms), PostScript, AFP, XPS and soon
> >> PPML. Our other formats include XHTML and SVG but these are special
> >> cases where you wish the XHTML and the SVG to look exactly like (or as
> >> close as possible to) the printed page.
> >
> > I'm still very much a newbie when it comes to ePUB, so if these
> > comments are off the mark, I welcome the educational opportunity...
> >
> > Drawing a line between XHTML and ePUB seems like making a distinction
> > without a difference. Isn't the browser "flowing" content into the page
> > in a way similar to that of an eReader? It appears that many of the
> > current conversion-to-ePUB products take PDFs as input so why not cut out
> > the middle step when you can work with the original markup?
> >
> > We have used XEP quite satisfactorily for half a decade now. We are
> > starting to get probes from our current customers as to whether we will
> > produce the content for their eReaders, or whether they will have to do
> > it themselves. Naturally, we would rather produce the content. We also
> > do not necessarily want to add another vendor unless it becomes necessary.
> >
> > That's a short explanation of what motivated the query.
> >
> >
> >> Of course, RenderX has always been a customer-focused vendor so your
> >> feedback is welcome. How important is this to our customers? Feedback
> >> on here is what we like to see.
> >>
> >> Kevin Brown
> >> RenderX
> >>
> >>
> >> -----Original Message-----
> >> From: owner-xep-support@renderx.com [mailto:owner-xep-support@renderx.com]
> >> On Behalf Of Greg Baryza
> >> Sent: Tuesday, February 16, 2010 12:41 PM
> >> To: xep-support@renderx.com
> >> Subject: [xep-support] ePUB
> >>
> >> Does RenderX have any near-term plans to produce ePUB documents from
> >> XSL-FO input?
> >>
> >> Thanks.
> >>
> >> <G>
> >>
> >> -------------------
> >> (*) To unsubscribe, send a message with words 'unsubscribe xep-support'
> >> in the body of the message to majordomo@renderx.com from the address
> >> you are subscribed from.
> >> (*) By using the Service, you expressly agree to these Terms of Service
> >> http://www.renderx.com/terms-of-service.html
> >
>
>--
>http://www.pincette.biz/
>Handling your documents with care, wherever you are.
>
>

Received on Wed Feb 17 14:02:15 2010

This archive was generated by hypermail 2.1.8 : Wed Feb 17 2010 - 14:02:22 PST