If you're at the Unix process size limit (typically 2 GB), you may
have to start writing some code. One way to do it is to have a program
that traverses your file system, finds the files you want based on some
criteria (like newness), and hands them to swish in smaller batches (via
the 'prog' input mode). Swish would then create an index of just those
files. Then your program would use swish to merge that index into your
main index.
Of course, swish 2.4 limits a single index file to 2 GB in size, so if
you really want to index a lot of data, you'll have to go to multiple
indexes, which swish is perfectly happy to search.
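
To make the merge and multi-index steps concrete, here is a minimal Python sketch that only builds the swish-e command lines. It assumes the "-M" (merge), "-w" (query) and "-f" (index file) switches behave as described in this thread, with the merge output given as the last file name. The index file names in the usage note are hypothetical.

```python
def merge_command(main_index, batch_index, merged_index, swish="swish-e"):
    """Build argv for merging two existing indexes into a new output index.

    Assumes the form: swish-e -M index1 index2 outputindex
    """
    return [swish, "-M", main_index, batch_index, merged_index]


def search_command(query, index_files, swish="swish-e"):
    """Build argv for searching several index files in one run.

    Assumes the form: swish-e -w query -f index1 index2 ...
    """
    return [swish, "-w", query, "-f", *index_files]
```

For example, search_command("memory", ["site1.index", "site2.index"]) returns a list that can be passed to subprocess.run. Check the switches against the swish-e version you have installed before relying on them.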
Perl's File::Find module is especially handy for this sort of thing.
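
The traversal-and-batch idea can be sketched as follows. This uses Python's os.walk in place of the Perl module, purely as an illustration. The header names (Path-Name, Content-Length, Last-Mtime) are what I recall the 'prog' input mode expecting, so treat them as an assumption and check the swish-e documentation. The function names and the batching scheme are made up for this example.

```python
import os
import time


def find_recent_files(root, max_age_days, now=None):
    """Yield files under root modified within the last max_age_days days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a stable order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and os.path.getmtime(path) >= cutoff:
                yield path


def batches(paths, batch_size):
    """Group an iterable of paths into lists of at most batch_size items."""
    batch = []
    for path in paths:
        batch.append(path)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def emit_document(path, out):
    """Write one file to a binary stream: headers, a blank line, then content.

    The header names are assumed from the swish-e 'prog' input format.
    """
    with open(path, "rb") as handle:
        data = handle.read()
    out.write(b"Path-Name: " + os.fsencode(path) + b"\n")
    out.write(b"Content-Length: %d\n" % len(data))
    out.write(b"Last-Mtime: %d\n" % int(os.path.getmtime(path)))
    out.write(b"\n")
    out.write(data)
```

A small wrapper script would pick one batch from batches(find_recent_files(root, days), size) and call emit_document(path, sys.stdout.buffer) for each file in it. Swish would then run that script through its 'prog' input mode (-S prog) to build the per-batch index. Keeping each batch small is what keeps the indexing process under the memory limit.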
If you're not at the Unix process size limit, add some more swap space to
your machine to get more virtual memory.
Another option is to run a 64-bit processor/OS to get *lots* more
virtual memory.
Bill
Tuc wrote:
>Hi,
>
> I'm trying to index a few large sites, which I copy locally using
>"webcopy". Once I finish the copy, I run it with "-e" . It ran for 12 or so
>hours before I got :
>
>vjofn2# /usr/local/bin/swish-e -e -i /news/webcopy
>Indexing Data Source: "File-System"
>Indexing "/news/webcopy"
>swish-e in malloc(): error: allocation failed
>Abort (core dumped)
>
> There are a few subdirectories :
>
>348430 site1
>547210 site2
>6048 site3
>77880 site4
>16678 site5
>18 site6.a
>3138 site6.b
>4088 site7
>
>
> I saw that I could do it by individual directory, then use the "-M"
>to merge, or allow the searches to use "-f". I think that if I do the "-M"
>that even with "-e" it will cause the memory allocation issue. And I'm
>afraid with the "-f" that the search will take too long to join them all.
>
> Is there something I can do? This system already has 2G of memory
>in it and 2 Xeon 2.6G CPU's.
>
> Thanks, Tuc
>
>
>
Received on Wed Oct 13 14:34:42 2004