Re: [xep-support] PDF File size

From: Justin Lipton <justin.lipton@exari.com>
Date: Tue Nov 04 2008 - 16:37:10 PST
Hi Michael,

Thanks for your suggestion - I'm still struggling with this.
Here's the only difference between my input FO files:

$ diff a1.fo b1.fo
5c5
< <fo:list-item-body start-indent="body-start()"><fo:block id="Paragraph_4" font-size="12pt" font-family="Times New Roman, Times, Symbol, serif" font-weight="bold"><fo:inline>Project Name: </fo:inline><fo:inline>Amex</fo:inline></fo:block></fo:list-item-body>  </fo:list-item>
---
> <fo:list-item-body start-indent="body-start()"><fo:block id="Paragraph_4" font-size="12pt" font-family="Times New Roman, Times, Symbol, serif" font-weight="bold"><fo:inline>Project Name: Amex</fo:inline></fo:block></fo:list-item-body>  </fo:list-item>


Here's the difference in the XEP files (intermediate format):
$ diff a1.xep b1.xep
18,19c18,19
< <xep:text value="oject Name: " x="137894" y="737704" width="64632"/>
< <xep:text value="Amex" x="202526" y="737704" width="29988"/>
---
> <xep:text value="oject Name: " x="137894" y="737704" width="63972"/>
> <xep:text value="Amex" x="201866" y="737704" width="29988"/>

Here's the difference in the PostScript output:
173c173
< 202.526 737.704 moveto
---
> 201.866 737.704 moveto

Finally, when I convert the output to an image format (PNG), the word "Amex" is displaced by a pixel or two. The PNGs also have different file sizes.
I've attached all of the input and output files.
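
For reference, the size of the shift can be worked out from the numbers in the diffs above. This is only a rough sketch: it assumes 1000 XEP units per point (which the PS coordinates suggest, since 202526 becomes 202.526), and the function names and DPI values are just for illustration.

```python
# Coordinates and widths copied from the XEP diffs above.
X_SPLIT = 202526    # x of "Amex" in a1.xep (label and value in separate fo:inline elements)
X_MERGED = 201866   # x of "Amex" in b1.xep (label and value in a single fo:inline)
W_SPLIT = 64632     # width of "oject Name: " in a1.xep
W_MERGED = 63972    # width of "oject Name: " in b1.xep

# Assumption: 1000 XEP units per point, inferred from 202526 -> 202.526 in the PS.
UNITS_PER_POINT = 1000


def shift_in_points(a, b, units_per_point=UNITS_PER_POINT):
    """Difference between two XEP coordinates, expressed in points."""
    return (a - b) / units_per_point


def points_to_pixels(points, dpi):
    """Convert points (1/72 inch) to pixels at the given resolution."""
    return points * dpi / 72.0


shift_pt = shift_in_points(X_SPLIT, X_MERGED)

# The x offset of "Amex" equals the change in width of the preceding text run.
assert X_SPLIT - X_MERGED == W_SPLIT - W_MERGED

for dpi in (72, 150, 300):  # hypothetical rasterizer resolutions
    print(f"{dpi} dpi: {points_to_pixels(shift_pt, dpi):.2f} px")
```

So the whole difference comes from the width of the first text run, and at typical raster resolutions it is in the range of the pixel or two I'm seeing.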

Any idea what could be causing this difference?

Regards,
Justin.


Michael Sulyaev wrote:
Justin Lipton wrote:
Is there a way to make the rendered PDF ignore these redundant inlines such that the PDF output would be identical in both cases?
We're trying to get PDF unit tests working and this seems to be tripping us up. Stripping them out as part of the FO generation would really complicate our existing XSLT transforms.

Hello Justin,

Generated PDF files may not be byte-for-byte identical even for the same input, if only because they contain date/time fields that vary between runs. File size may also differ, since compression methods are not expected to produce outputs of equal size even when the inputs are of equal size.

I'd suggest not wasting time on unit tests against PDF, and instead using XEP Intermediate Format output for this purpose. These are text files (XML, actually), so diffing is fast, and if you need to inspect the differences, xmldiff may help.
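
As a rough illustration of diffing the intermediate format, here is a minimal sketch that pulls the text runs out of two XEP fragments and compares them. The wrapper element, the namespace URI in the sample, the function names, and the optional tolerance parameter are all assumptions made for the example; only the `xep:text` attributes (`value`, `x`, `y`, `width`) come from the diffs quoted in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper and namespace URI; the xep:text lines are from the thread.
SAMPLE_A = """<xep:page xmlns:xep="http://www.renderx.com/XEP/xep">
<xep:text value="oject Name: " x="137894" y="737704" width="64632"/>
<xep:text value="Amex" x="202526" y="737704" width="29988"/>
</xep:page>"""

SAMPLE_B = """<xep:page xmlns:xep="http://www.renderx.com/XEP/xep">
<xep:text value="oject Name: " x="137894" y="737704" width="63972"/>
<xep:text value="Amex" x="201866" y="737704" width="29988"/>
</xep:page>"""


def text_runs(xml_string):
    """Return (value, x, y, width) for every element whose local name is 'text'."""
    runs = []
    for el in ET.fromstring(xml_string).iter():
        if el.tag.rsplit("}", 1)[-1] == "text":
            runs.append((el.get("value"), int(el.get("x")),
                         int(el.get("y")), int(el.get("width"))))
    return runs


def diff_runs(runs_a, runs_b, tolerance=0):
    """List the pairs of runs whose text differs or whose geometry differs
    by more than `tolerance` XEP units."""
    if len(runs_a) != len(runs_b):
        return [("run-count", len(runs_a), len(runs_b))]
    diffs = []
    for a, b in zip(runs_a, runs_b):
        close = all(abs(p - q) <= tolerance for p, q in zip(a[1:], b[1:]))
        if a[0] != b[0] or not close:
            diffs.append((a, b))
    return diffs


for a, b in diff_runs(text_runs(SAMPLE_A), text_runs(SAMPLE_B)):
    print(a, "!=", b)
```

Whether a non-zero tolerance is acceptable depends on what the tests are meant to catch; with `tolerance=0` this behaves like a plain diff restricted to text runs.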

Another approach is to use a rasterizer (e.g. Ghostscript) to produce raster images from the PDF files and compare them pixel by pixel, page by page. It sounds a bit complicated, but it works, and it can be extended to produce visible 'symmetric' difference images for pages that differ, if you need that.
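
The pixel-wise comparison could look roughly like the sketch below. It is not a finished tool: pages are modelled as plain lists of pixel rows so the example stays self-contained, the function names are made up, and the Ghostscript command line is only built, not executed (the device and resolution are assumptions you would adjust).

```python
def gs_command(pdf_path, out_pattern, dpi=150):
    """Build a Ghostscript command that renders each page of a PDF to PNG.
    The png16m device and the default resolution are assumptions."""
    return ["gs", "-dBATCH", "-dNOPAUSE", "-sDEVICE=png16m",
            f"-r{dpi}", f"-sOutputFile={out_pattern}", pdf_path]


def symmetric_difference(page_a, page_b):
    """Return a mask of the same shape with 1 where the pixels differ, else 0."""
    if len(page_a) != len(page_b) or any(
            len(ra) != len(rb) for ra, rb in zip(page_a, page_b)):
        raise ValueError("pages must have identical dimensions")
    return [[int(pa != pb) for pa, pb in zip(ra, rb)]
            for ra, rb in zip(page_a, page_b)]


def count_differences(mask):
    """Number of differing pixels in a difference mask."""
    return sum(sum(row) for row in mask)


# Toy 2x5 "pages": the same two-pixel-wide mark, shifted right by one pixel.
PAGE_A = [[0, 0, 1, 1, 0],
          [0, 0, 1, 1, 0]]
PAGE_B = [[0, 0, 0, 1, 1],
          [0, 0, 0, 1, 1]]

mask = symmetric_difference(PAGE_A, PAGE_B)
print(count_differences(mask), "pixels differ")
```

In practice the rows would come from decoding the rendered PNGs with an image library, and the mask could be written back out as an image to show where the pages differ.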

Regards,
Michael Sulyaev
RenderX

-------------------
(*) To unsubscribe, send a message with words 'unsubscribe xep-support'
in the body of the message to majordomo@renderx.com from the address
you are subscribed from.
(*) By using the Service, you expressly agree to these Terms of Service http://www.renderx.com/terms-of-service.html


--
Justin Lipton
justin.lipton@exari.com
Level 7, 10-16 Queen Street, Melbourne 3000 Australia
Tel +61 3 9621 2775 | Fax +61 3 9621 2776

Exari Systems
Boston | London | Melbourne | Munich
www.exari.com


Received on Tue Nov 4 17:13:16 2008

This archive was generated by hypermail 2.1.8 : Tue Nov 04 2008 - 17:13:21 PST