RE: Using too much memory, or with -e failing out with too many files

From: Stewart, John <johns(at)not-real.artesyncp.com>
Date: Thu Dec 09 2004 - 16:31:41 GMT
Well, I've had to revert to not indexing .pdf files, and that seems to work
all right. Does anyone have any suggestions for limiting memory usage when
indexing PDF files, or for getting the -e flag to work?
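
For anyone trying to reproduce the -e problem, a rough way to check the two things mentioned in the original message (the file descriptor limit and the pile-up of temp files) might look like the sketch below. It is only a sketch: the helper name count_swish_tmp is made up for illustration, and the swtmploc*_ pattern is taken from the file names quoted below, not from any swish-e documentation.

```shell
#!/bin/sh
# Hypothetical helper: count files named swtmplocXXXXX_ in a directory
# (defaults to the current directory). The name pattern is an assumption
# based on the temp-file names seen in the error message.
count_swish_tmp() {
    dir="${1:-.}"
    # grep -c prints the number of matching names (0 when none match)
    ls "$dir" 2>/dev/null | grep -c '^swtmploc.*_$'
}

# Soft limit on open file descriptors for this shell and its children
echo "fd soft limit: $(ulimit -n)"

# How many swish-e temp files are sitting in the current directory
echo "swtmploc files here: $(count_swish_tmp .)"
```

Running this in the directory where swish-e was started, before and during an indexing run, would show whether the temp-file count climbs toward the descriptor limit.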

thanks

johnS

>  -----Original Message-----
> From: 	Stewart, John  
> Sent:	Thursday, December 02, 2004 5:16 PM
> To:	'swish-e@sunsite.berkeley.edu'
> Subject:	Using too much memory, or with -e failing out 
> with too many files
> 
> 
> I'm trying to get swish-e working on our internal web server 
> box (working great for years on our external web server, I 
> love it!). It's running Solaris 2.6 (yes, ancient).
> 
> I initially did the latest stable version (2.4.2), but am now 
> trying with 2.5.2.
> 
> With 2.4.2, no -e flag:
> 
> The initial issue, when indexing the files, is that swish-e 
> is using up gobs and gobs of memory until it crashes and/or 
> causes the various other processes to crash. The error 
> message is "unable to allocate additional X bytes". I can 
> watch (with "top") as the memory usage grows and grows during 
> the indexing process.
> 
> The biggest chunk to index (/home/manuals) is mostly .pdf 
> files, and is about 15GB in size. In fact, with this testing, 
> I am only trying to index this chunk of the filesystem.
> 
> I tried the -e flag, after seeing some suggestions to do so. 
> This gives me this error, after indexing less than 100 files:
> 
> err: Couldn't create temporary file './swtmploc7wZ8j_' file 
> descriptor: No such file or directory
> 
> Furthermore, the directory where I started swish (and NOT 
> any tmp dir) is now filled with 252 files named 
> swtmplocXXXXX_, where XXXXX seems to be random characters (a 
> hash of some sort?). I tried bumping up the file descriptor 
> limit to 9999 with the "ulimit" command, to no apparent 
> effect. Should the -e flag be creating so many files? It's 
> more than one per file indexed, apparently!
> 
> I then tried with -e with the latest 2.5.2 (from today), and 
> got the same behaviour.
> 
> I am right now trying to re-index without the -e flag, but 
> I see already that swish-e has consumed 230MB and is growing rapidly.
> 
> For reference, here is what is in the swish.conf file (I 
> commented out everything but the big /home/manuals section):
> 
> ###########################################################
> IndexOnly .html .txt .pdf .htm .doc
> NoContents .gif .jpg
> #
> # Don't do these pdf's - crashing
> FileRules pathname contains marketing/competition
> #
> # Index the main internal section
> #
> #IndexDir /www/internal
> ReplaceRules remove /www/internal
> #
> # Index the internal_web section on titan
> #
> #IndexDir /home/groups/internal_web
> ReplaceRules remove /home/groups/internal_web
> #
> # Index the manuals section
> #
> IndexDir /home/manuals
> ReplaceRules remove /home
> ###########################################################
> 
> 
> And here is how I am invoking swish-e:
> 
> /bin/nice -n 19 /usr/local/bin/swish-e -c 
> /usr/local/swish/swish.conf -f /usr/local/swish/index.swish-e
> 
> 
> Any suggestions?
> 
> thanks!
> 
> johnS
> 
Received on Thu Dec 9 08:31:47 2004