Re: [xep-support] PDF File size

From: Michael Sulyaev
Date: Mon Oct 20 2008 - 23:56:18 PDT

Justin Lipton wrote:
> Is there a way to make the rendered PDF ignore these redundant inlines
> such that the PDF output would be identical in both cases?
> We're trying to get PDF unit tests working and this seems to be tripping
> us up. Stripping them out as part of the FO generation would really
> complicate our existing XSLT transforms.

Hello Justin,

Generated PDF files are not guaranteed to be binary identical even for
one and the same input, if only because they contain date/time fields,
which obviously vary between runs. File size may also differ, if only
because compression is not expected to produce outputs of equal size
even when the inputs are of equal size.
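
To illustrate the compression point, here is a minimal sketch in Python (not part of XEP; zlib merely stands in for the Flate compression commonly used for PDF streams, and the variable names are made up for the example). Two inputs of exactly the same length compress to quite different sizes:

```python
import zlib

# Two inputs of identical length (1024 bytes each).
uniform = b"\x00" * 1024          # highly repetitive content
varied = bytes(range(256)) * 4    # every byte value, repeated four times

compressed_uniform = zlib.compress(uniform)
compressed_varied = zlib.compress(varied)

# Equal input sizes, unequal output sizes.
print(len(uniform), len(varied))
print(len(compressed_uniform), len(compressed_varied))
```

So a small change in content that leaves the uncompressed size untouched can still change the size of the resulting file.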

I'd suggest not wasting time on unit-testing the PDF itself, and
instead using XEP Intermediate Format output for this purpose. These
are text files (XML, actually), so diffing is fast, and if you need to
see what exactly differs, xmldiff may help.
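
As a rough sketch of what such a test helper might look like, the Python function below (standard library only, Python 3.9+) canonicalizes two XML documents before diffing them, so that attribute order, quoting style and indentation do not cause false failures. The function name xml_diff and the element names in the usage example are invented for illustration; they are not the real XEP Intermediate Format vocabulary.

```python
import difflib
from xml.etree.ElementTree import canonicalize  # available since Python 3.8


def xml_diff(expected_xml: str, actual_xml: str) -> list[str]:
    """Return unified-diff lines for two XML documents ([] if equivalent)."""
    # C14N 2.0 output sorts attributes and normalizes quoting;
    # strip_text=True also drops whitespace-only text such as indentation.
    expected = canonicalize(expected_xml, strip_text=True)
    actual = canonicalize(actual_xml, strip_text=True)
    # Put each element on its own line so the diff stays readable.
    expected_lines = expected.replace("><", ">\n<").splitlines()
    actual_lines = actual.replace("><", ">\n<").splitlines()
    return list(difflib.unified_diff(
        expected_lines, actual_lines, "expected", "actual", lineterm=""))


# Hypothetical documents: same content, different formatting.
a = '<page><text x="1" y="2">Hi</text></page>'
b = "<page>\n  <text y='2' x='1'>Hi</text>\n</page>"
print(xml_diff(a, b))  # an empty list means the documents are equivalent
```

A unit test would then simply assert that the returned list is empty and print the diff lines when it is not.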

Another approach is to use a rasterizer (e.g. Ghostscript) to produce
raster images from the PDF files and compare them pixel by pixel, page by
page. It sounds a bit complicated, but it works, and it can be extended to
produce visible 'symmetric' difference images for the pages that differ, if
you need them.
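
The comparison step itself can be sketched in a few lines of Python. This assumes the pages have already been rasterized (for instance by Ghostscript) and loaded as grayscale grids by whatever image library you use; that part is left out here. The names diff_page and differing_pages and the tolerance parameter are hypothetical, chosen for the example only.

```python
from typing import List, Tuple

Page = List[List[int]]  # one grayscale page: rows of pixel values 0..255


def diff_page(a: Page, b: Page, tolerance: int = 0) -> Tuple[int, Page]:
    """Compare two equally sized pages pixel by pixel.

    Returns (number of differing pixels, difference image), where the
    difference image is black (0) at differing pixels and white (255)
    everywhere else.
    """
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("pages have different dimensions")
    count = 0
    image: Page = []
    for row_a, row_b in zip(a, b):
        out_row = []
        for pa, pb in zip(row_a, row_b):
            differs = abs(pa - pb) > tolerance
            count += differs
            out_row.append(0 if differs else 255)
        image.append(out_row)
    return count, image


def differing_pages(doc_a: List[Page], doc_b: List[Page],
                    tolerance: int = 0) -> List[int]:
    """Return the 1-based numbers of the pages that differ."""
    if len(doc_a) != len(doc_b):
        raise ValueError("documents have different page counts")
    return [n for n, (a, b) in enumerate(zip(doc_a, doc_b), start=1)
            if diff_page(a, b, tolerance)[0] > 0]
```

A non-zero tolerance may be useful if the rasterizer applies anti-aliasing, so that tiny shade differences are not reported as failures.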

Michael Sulyaev

Received on Tue Oct 21 00:30:09 2008
