[xep-support] Re: Creating embedded index in PDF for faster searching?

From: Jan Tošovský <j.tosovsky@gmc.net>
Date: Fri May 02 2014 - 00:46:48 PDT

Dear All,

I asked for this feature a couple of years ago: http://services.renderx.com/lists/xep-support/5866.html

Since then I have investigated various approaches. The findings can be summarized as follows:
1. While you can create a custom script in Acrobat, there is no way to execute it automatically from the command line. It has to be invoked from the GUI.
2. The format of the search index is not publicly documented, so it can hardly be incorporated into non-Adobe software without reverse engineering.

But I didn't give up... and finally succeeded. My solution is based on a custom Acrobat plugin that acts as a kind of hook. It works in Acrobat XI+ (tested on Windows only), so Acrobat must be installed on your machine. It can be executed from the command line as part of a post-processing step.

It is not available yet, as I first have to wrap everything into an installer. But this is on my TODO list. I'll update this post when it is ready.
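
In the meantime, here is a minimal sketch of how a post-processing step could be chained after rendering in a build script. It only shows the chaining logic: run the commands in order and skip indexing if rendering failed. The names run_steps, RENDER and INDEX are hypothetical, and both commands are placeholders (assumptions), not the real XEP or plugin invocations; substitute your own.

```python
import subprocess
import sys


def run_steps(steps):
    """Run each command in order and stop at the first failure.

    Returns the number of steps that completed successfully.
    """
    done = 0
    for cmd in steps:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            # Do not try to index a PDF that was not rendered correctly.
            break
        done += 1
    return done


# Placeholder commands (assumptions): replace them with the real
# renderer call and the real indexing call used on your machine.
RENDER = [sys.executable, "-c", "print('render step placeholder')"]
INDEX = [sys.executable, "-c", "print('index step placeholder')"]
```

Calling run_steps([RENDER, INDEX]) returns the number of steps that succeeded, so a build script can compare it with the number of steps it submitted and fail the build otherwise.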

Best regards,

Jan

-----Original Message-----
From: xep-support-bounces@renderx.com [mailto:xep-support-bounces@renderx.com] On Behalf Of Mark Giffin
Sent: Thursday, May 1, 2014 2:56 AM
To: dclunie@dclunie.com; RenderX Community Support List; kevin@renderx.com
Subject: [xep-support] Re: Creating embedded index in PDF for faster searching?

I don't think Word can do this. Adobe Acrobat Professional can do this and I agree, the index it produces is vastly faster, and it will also index a whole bunch of separate PDF files in one index. It's an old feature (used to be called "Catalog") that Adobe doesn't seem to talk about anymore. If you want to automate it you might look at Adobe ExtendScript for Acrobat. ExtendScript is Adobe's JavaScript-based scripting language for products like Photoshop, FrameMaker etc. but I don't know if Acrobat supports it. But if it does you could probably write a small script to kick off this Catalog indexing, and if you're really lucky there may be a way to kick it off from the command line, so you could incorporate it into your PDF build process.

Mark Giffin
http://markgiffin.com/

On 4/30/14 5:13 PM, David Clunie wrote:
> That's a bit disappointing. If Word can do it, it would be nice if
> RenderX could too (as a post-processing step if necessary), since
> doing it manually in Acrobat afterwards is painful, and I couldn't
> find a command line tool to do it.
>
> David
>
> On 4/20/14 5:45 PM, Kevin Brown wrote:
>> This is not supported by RenderX and there are no plans to add it.
>> This is an operation best performed after the entire document is
>> created and not "as" it is being created.
>>
>>
>> Kevin Brown
>> (650) 327-1000 Direct
>> (650) 328-8008 Fax
>> (925) 395-1772 Mobile
>> skype:kbrown01
>> kevin@renderx.com
>> sales@renderx.com
>> http://www.renderx.com
>>
>>
>>
>>
>> -----Original Message-----
>> From: xep-support-bounces@renderx.com
>> [mailto:xep-support-bounces@renderx.com] On Behalf Of David Clunie
>> Sent: Wednesday, April 16, 2014 6:10 AM
>> To: xep-support@renderx.com
>> Subject: [xep-support] Creating embedded index in PDF for faster
>> searching?
>>
>> Hi
>>
>> I am creating quite large PDF files that users frequently search
>> within, and the searches are relatively slow.
>>
>> I am using the ENABLE_ACCESSIBILITY in xep.xml to create tagged PDF.
>>
>> If I load these into Acrobat and then use Advanced > Document
>> Processing > Manage Embedded Index > Create Index, then the result is
>> a MUCH faster search.
>>
>> However, I would rather generate these in the pipeline with XEP (or
>> an additional pass with some other command line tool if anyone knows
>> of one).
>>
>> I couldn't find anything in the manual about this, or any obvious
>> option.
>>
>> David

_______________________________________________
(*) To unsubscribe, please visit http://lists.renderx.com/mailman/options/xep-support
(*) By using the Service, you expressly agree to these Terms of Service http://www.renderx.com/terms-of-service.html

Received on Fri May 2 00:52:54 2014

This archive was generated by hypermail 2.1.8 : Fri May 02 2014 - 00:52:59 PDT