Well, I've had to revert to not indexing .pdf files, and that seems to work
all right. Does anyone have any suggestions for limiting memory usage when
indexing pdf files, or for getting the -e flag to work?
thanks
johnS
> -----Original Message-----
> From: Stewart, John
> Sent: Thursday, December 02, 2004 5:16 PM
> To: 'swish-e@sunsite.berkeley.edu'
> Subject: Using too much memory, or with -e failing out
> with too many files
>
>
> I'm trying to get swish-e working on our internal web server
> box (it has been working great for years on our external web
> server, I love it!). It's running Solaris 2.6 (yes, ancient).
>
> I initially tried the latest stable version (2.4.2), but am
> now trying 2.5.2.
>
> With 2.4.2, no -e flag:
>
> The initial issue, when indexing the files, is that swish-e
> is using up gobs and gobs of memory until it crashes and/or
> causes the various other processes to crash. The error
> message is "unable to allocate additional X bytes". I can
> watch (with "top") as the memory usage grows and grows during
> the indexing process.
>
> The biggest chunk to index (/home/manuals) is mostly .pdf
> files, and is about 15GB in size. In fact, with this testing,
> I am only trying to index this chunk of the filesystem.
>
> I tried the -e flag, after seeing some suggestions to do so.
> This gives me this error, after indexing less than 100 files:
>
> err: Couldn't create temporary file './swtmploc7wZ8j_' file
> descriptor: No such file or directory
>
> Furthermore, the directory where I started swish (and NOT any
> tmp dir) is now filled with 252 files named swtmplocXXXXX_,
> where XXXXX seems to be random characters (a hash of some
> sort?). I tried bumping up the file descriptor limit to 9999
> with the "ulimit" command, to no apparent effect. Should the
> -e flag be creating so many files? It's more than one per
> file indexed, apparently!
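
To check whether -e really leaves more than one temporary file per
indexed document, the leftovers can be counted. This is a minimal sketch,
assuming a POSIX shell; count_swish_tmp is a made-up helper name, and the
swtmploc*_ pattern is simply taken from the file names described above.

```shell
#!/bin/sh
# Hypothetical helper: count leftover swish-e temp files in a directory.
# The "swtmploc*_" pattern is an assumption based on the names seen above.
count_swish_tmp() {
    dir=${1:-.}          # directory to inspect, defaults to the current one
    count=0
    for f in "$dir"/swtmploc*_; do
        # An unmatched glob stays literal, so test that the file exists.
        if [ -f "$f" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

# Usage (not run here): count_swish_tmp /path/where/swish-e/was/started
```

Comparing that number with the output of "ulimit -n" (the open-file limit
of the current shell) shows how close the run is to the descriptor limit.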
>
> I then tried -e with the latest 2.5.2 (from today), and got
> the same behaviour.
>
> I am right now trying to re-index without the -e flag, but I
> see that swish-e has already consumed 230MB and is growing
> rapidly.
>
> For reference, here is what is in the swish.conf file (I
> commented out everything but the big /home/manuals section):
>
> ###########################################################
> IndexOnly .html .txt .pdf .htm .doc
> NoContents .gif .jpg
> #
> # Don't do these pdf's - crashing
> FileRules pathname contains marketing/competition
> #
> # Index the main internal section
> #
> #IndexDir /www/internal
> ReplaceRules remove /www/internal
> #
> # Index the internal_web section on titan
> #
> #IndexDir /home/groups/internal_web
> ReplaceRules remove /home/groups/internal_web
> #
> # Index the manuals section
> #
> IndexDir /home/manuals
> ReplaceRules remove /home
> ###########################################################
>
>
> And here is how I am invoking swish-e:
>
> /bin/nice -n 19 /usr/local/bin/swish-e \
>     -c /usr/local/swish/swish.conf -f /usr/local/swish/index.swish-e
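
One untested idea for keeping memory down without -e is to index
/home/manuals in smaller pieces. The sketch below only prints one swish-e
command per top-level subdirectory; it runs nothing. It assumes a POSIX
shell (ksh or bash rather than the old Bourne /bin/sh), and plan_batches
and the per-batch index file names are hypothetical. The -i switch names
the directory to index in place of IndexDir from the config file.

```shell
#!/bin/sh
# Hypothetical helper: print one swish-e command per top-level
# subdirectory of a tree, so each run indexes a smaller batch.
# Nothing is executed; the commands are only written to stdout.
plan_batches() {
    root=$1      # tree to split, e.g. /home/manuals
    conf=$2      # swish-e config file
    outdir=$3    # assumed location for the per-batch index files
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue      # skip an unmatched glob
        dir=${dir%/}                   # drop the trailing slash
        name=${dir##*/}                # last path component
        # Paths containing spaces would need quoting before being run.
        printf '%s\n' "swish-e -c $conf -i $dir -f $outdir/$name.index"
    done
}

# Usage (not run here):
#   plan_batches /home/manuals /usr/local/swish/swish.conf /usr/local/swish
```

The resulting index files could then be searched together or merged with
-M; whether the merge itself stays within memory on this box is an open
question.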
>
>
> Any suggestions?
>
> thanks!
>
> johnS
>
Received on Thu Dec 9 08:31:47 2004