#32507 - 11/16/09 01:51 PM Problem with PP Search's rebuild
Raphael
Junior Member

Registered: 01/16/09
Posts: 15
Hi guys,

I have the following problem.

I have a process that generates 60,000 invoices per day, so I'll have around 1,000,000 per month.

The process started a few days ago, and I now have around 250,000 invoices in my database.

The big problem is that when I rebuild my database in PlanetPress Search, the rebuild takes around 3 hours!

How can I speed up this rebuild?

Any ideas?

#32508 - 11/16/09 01:59 PM Re: Problem with PP Search's rebuild
Anonymous
Unregistered


Raphael,

250,000 entries taking 3 hours to rebuild is, I think, around what we'd expect. Remember that PlanetPress Search has to physically read every single PDI file in your directory and write its contents to the database.
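To put those numbers in perspective, here is a minimal back-of-the-envelope sketch. It assumes rebuild time scales linearly with the number of entries, which is only an approximation, and the function names are made up for illustration.

```python
def rebuild_rate(entries: int, hours: float) -> float:
    """Average number of entries indexed per second during a rebuild."""
    return entries / (hours * 3600)


def projected_hours(entries: int, rate_per_second: float) -> float:
    """Projected rebuild time in hours, ASSUMING linear scaling."""
    return entries / rate_per_second / 3600


# Figures from this thread: 250,000 entries rebuilt in about 3 hours.
rate = rebuild_rate(250_000, 3)

# A full month of invoices (about 1,000,000) at the same rate.
month_hours = projected_hours(1_000_000, rate)

print(f"{rate:.1f} entries/s, about {month_hours:.0f} hours for a full month")
```

At roughly 23 entries per second, a full month of invoices would take about 12 hours to rebuild if nothing else changes.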

As your database is built right now, the only way to improve the software's performance would be to improve your hardware: a faster hard drive (10,000+ RPM) would help.

Of course, that costs money, so an alternative would be to "merge" all your invoices into fewer, more digestible files. You could, for example, create one PDF for each day, so you'd have 30 PDFs per month instead of 1,000,000. Or separate them by client and then by month, so you would have as many files per month as you have clients.

This would make the rebuild a lot faster, and would not actually slow the search down that much (it would reduce performance slightly, but you have to expect that anyway as the number of entries grows over time).

Search will still be able to find the proper information. Instead of one result per file, the results would be pages spread across a smaller number of files.
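The effect of the two grouping schemes on the file count can be sketched as follows. This only illustrates the grouping logic, not the PDF merge itself, and the `Invoice` record, the client names, and the sample data are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Invoice:
    # Hypothetical record; real invoices would carry more fields.
    number: int
    client: str
    issued: date


def group_by_day(invoices):
    """One merged file per calendar day."""
    groups = defaultdict(list)
    for inv in invoices:
        groups[inv.issued].append(inv)
    return dict(groups)


def group_by_client_month(invoices):
    """One merged file per client per month."""
    groups = defaultdict(list)
    for inv in invoices:
        key = (inv.client, inv.issued.year, inv.issued.month)
        groups[key].append(inv)
    return dict(groups)


# Made-up sample: 600 invoices, 4 clients, spread over November 2009.
sample = [
    Invoice(n, f"client-{n % 4}", date(2009, 11, 1 + n % 30))
    for n in range(600)
]

daily_files = len(group_by_day(sample))            # one per day of the month
client_files = len(group_by_client_month(sample))  # one per client
print(daily_files, client_files)
```

Either way, the number of files Search has to open during a rebuild depends on the number of days or clients rather than on the number of invoices.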

Also, instead of rebuilding the database, you can simply refresh it. Refreshing only scans for files that have been added since the last scan, so it should be a lot faster (the more often you do it, the faster it will be, since it has fewer files to scan each time).
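The general idea behind a refresh can be sketched like this. It is not how PlanetPress Search is implemented internally; it is just an illustration of an incremental scan that compares file modification times to the last scan time. The function name and the file names are assumptions.

```python
import os
import tempfile


def files_added_since(directory: str, last_scan: float) -> list[str]:
    """Return paths of files modified after last_scan (seconds since epoch)."""
    new_files = []
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file() and entry.stat().st_mtime > last_scan:
                new_files.append(entry.path)
    return sorted(new_files)


# Small demonstration in a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as archive:
    old_file = os.path.join(archive, "old.pdi")
    new_file = os.path.join(archive, "new.pdi")
    for path in (old_file, new_file):
        open(path, "w").close()

    # Force known timestamps: one file before the last scan, one after it.
    os.utime(old_file, (1_000, 1_000))
    os.utime(new_file, (3_000, 3_000))

    found = files_added_since(archive, last_scan=2_000)
    found_names = [os.path.basename(p) for p in found]

print(found_names)
```

Only the file newer than the last scan is picked up, which is why a frequent refresh has much less work to do than a full rebuild.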

Hope this helps,
Eric
