[xep-support] Re: Searchable image PDFs

From: Kevin Brown <kevin@renderx.com>
Date: Wed May 25 2011 - 14:01:36 PDT

Brian:

1) Put the text in a block on the page. I would put it in an absolutely
positioned block-container, in a small font, so that it all fits and does
not cause a flow problem.

2) Stamp the image over the top using an absolutely positioned container
whose z-order is higher than the text block. Usually that means placing it
after the text block, at the end of the flow for that page.

3) You can control paging by inserting a simple structure like:
        <fo:block break-after="page"><fo:leader/></fo:block>

So you would have something like this (I have not tested it, but it should
work):

<fo:block-container> <!-- make absolute and place on the page -->
        <fo:block>Searchable text here</fo:block>
</fo:block-container>
<fo:block-container> <!-- make absolute and place on the page, over the text block above -->
        <fo:block>
                <fo:external-graphic src="url('mytif.tif')"/>
        </fo:block>
</fo:block-container>
<fo:block break-after="page"><fo:leader/></fo:block> <!-- go to next page -->
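
For reference, here is a fuller sketch of the same structure with the positioning attributes filled in. It is untested, like the skeleton above. The page size (US Letter), zero page margins, the 4pt font size, the overflow="hidden" setting, and the scale-to-fit image sizing are all assumptions made for illustration; only the file name mytif.tif comes from the example above. It relies on content that appears later in the flow being painted over earlier content, as described in step 2.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <!-- Assumed: one US Letter page per scanned image, no margins -->
    <fo:simple-page-master master-name="scan"
                           page-width="8.5in" page-height="11in">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="scan">
    <fo:flow flow-name="xsl-region-body">
      <!-- 1) OCR text comes first, so it sits underneath the image -->
      <fo:block-container absolute-position="absolute"
                          top="0in" left="0in"
                          width="8.5in" height="11in"
                          overflow="hidden">
        <fo:block font-size="4pt">Searchable text here</fo:block>
      </fo:block-container>
      <!-- 2) Page image comes second, stamped over the text -->
      <fo:block-container absolute-position="absolute"
                          top="0in" left="0in"
                          width="8.5in" height="11in">
        <fo:block>
          <fo:external-graphic src="url('mytif.tif')"
                               width="8.5in" height="11in"
                               content-width="scale-to-fit"
                               content-height="scale-to-fit"/>
        </fo:block>
      </fo:block-container>
      <!-- 3) Force a page break before the next text/image pair -->
      <fo:block break-after="page"><fo:leader/></fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```

For a multi-page document, repeat the three blocks inside the flow once per scanned page, changing the text and the image file each time.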

If you wish to do it in a flow ... this is also possible. Just design the
block containers to overlap each other. Just do the opposite of this
example:

http://www.dpawson.co.uk/xsl/sect3/fofixedposn.html#d12230e136

sample attached.

Kevin Brown
RenderX

-----Original Message-----
From: xep-support-bounces@renderx.com
[mailto:xep-support-bounces@renderx.com] On Behalf Of Brian Sheppard
Sent: Wednesday, May 25, 2011 1:34 PM
To: xep-support@renderx.com
Subject: [xep-support] Searchable image PDFs

I'm giving this a bump, since I haven't found a solution. I'm attempting to
build a PDF where the TIFF page-images are visible and the embedded OCR is
not. (Though the OCR is searchable.)

I've tried including the page-image file as an external graphic, with the
OCR in a hidden block (<fo:block visibility="hidden"> ... </fo:block>), but
the OCR remains visible.

I feel like I'm missing something obvious.

-Brian

> I'm unable to find any indication that XEP supports creating searchable
> image PDFs. Can anyone clarify?
>
> Specifically, I'd like to produce PDFs that incorporate dirty OCR beneath
> scanned page images.
>
> Thanks for any insight.
>
> -Brian
>
> _______________________________________________
> (*) To unsubscribe, please visit
> http://lists.renderx.com/mailman/options/xep-support
> (*) By using the Service, you expressly agree to these Terms of Service
> http://www.renderx.com/terms-of-service.html

hammer.jpg
Received on Wed May 25 14:02:30 2011

This archive was generated by hypermail 2.1.8 : Wed May 25 2011 - 14:02:32 PDT