problems with temporary files (etc.) using swish-e -M

From: WWW server manager <webadm(at)not-real.info.cam.ac.uk>
Date: Sun Nov 23 1997 - 23:47:27 GMT
Summary: 

swish-e may use a lot of space for temporary files. This is not documented,
and the files are created insecurely. It also fails to check many (if not
all) reads and writes, so it blunders on in spite of problems such as large
temporary files filling the target partition, and it may loop consuming CPU
time (e.g. when attempting to search a truncated index, though I saw that
with swish and haven't re-tested with swish-e). Additionally, it does
nothing to ensure that the temporary files are deleted if it is interrupted.

In more detail:

After using swish for several years, I've just started looking at swish-e
and tried something which I'd not done with swish (which would quite 
probably behave the same) - creating multiple indexes and merging them,
so I could offer the choice of searching everything or searching distinct
areas of the data specifically. This amounted to around 90 separate indexes
since our server hosts a large number of University clubs and societies and I
wanted them to be individually searchable...

Indexing the same data (in its entirety, single overall index) with swish
results in a 6MB index file. The separate indexes from swish-e total around
60 MB. That in itself was not a problem, but when I attempted to use swish-e
-M to combine them all into a single index, I hit a problem. In fact,
several problems:

 * swish-e filled the /var partition (because it wrote large temporary
   files in /var/tmp, swish-e's use of which is not documented anywhere 
   that I could see).
 * swish-e failed to notice that writes were failing, and carried on
   trying to merge the indexes.
 * 2-3 hours later I logged on and found that swish-e was still running,
   still trying to merge indexes using the full partition for temporary
   space.
[Plus: I should have been paged soon after the partition filled up, but the
pager system's network interface wasn't working that evening... Can't blame
swish-e for that!]

The documentation mentions that memory use should be around half the total
size of the indexes to be merged, but gives no hint that it uses temporary files
as well. I was also surprised, when I checked with du after the problem, to
find that the many separate indexes totalled around 60MB when indexing
everything together with swish produces a 6MB index. Memory wouldn't have
been a problem, but the temporary files were, since /var is a relatively
small partition and had nothing like enough space.

There are a number of issues here:

 * It looks like tmpnam() is used to invent names for the temporary files.
   On Solaris 2, at least, that always uses /var/tmp (no way to redirect
   it to somewhere with more space, as allowed by tempnam()). [As a 
   separate issue, creating temporary files in world-writable directories
   (in particular, with more-or-less predictable names) is a security issue
   unless care is taken when opening them, which swish-e does not do.]

   Using tempnam() and documenting (a) the likely temporary file space
   requirements, and (b) that sites might need to redirect the temporary 
   files by defining the TMPDIR environment variable, would help with
   the file size/location issue; it wouldn't help with the security aspect
   except to the extent that the files could be placed in a non-world-
   writable directory. Safe use of /tmp is tricky...

 * swish-e does not check whether writing to a file succeeded; it failed
   to notice the partition was full. If it had noticed and terminated after
   deleting the files, the system wouldn't have been running with /var 
   full (as far as non-root users were concerned) for several hours.

 * swish-e does not (always) check whether reading from files succeeds; if
   it had done so, it would have noticed the temporary index files were 
   truncated (and wouldn't have run for 2-3 hours with no sign of
   stopping - see next point).

 * A related point, noted previously with swish (no bug report as it was
   not being maintained...), is that swish search processes do not notice
   if the index file is truncated, and instead loop consuming CPU time as
   fast as they can get it. I presume swish-e would behave similarly.
   
   [The overnight index build - single index - failed due to the target
   partition being filled by something else; no error report for that, the
   first hint was load average 50+ due to the search processes - with
   users repeating the failing searches and making matters worse, due to
   lack of response. It was that incident that prompted a cron job to 
   page me if partitions got overfull - but the pager system let me down
   yesterday. Sigh...]

 * When interrupted (control-C while running interactively, or kill PID
   when running in the background), swish-e does not take any action to
   ensure the temporary files are deleted, and since it's not documented
   that they are even created, they might be left consuming space with 
   no-one noticing.
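
A minimal sketch, in C, of how the temporary file and I/O checking points
above might be handled. This is not swish-e's actual code. It assumes a
POSIX system that provides mkstemp(); the helper names (open_temp,
write_all, read_exact), the "swtmp" file prefix and the /tmp fallback are
all hypothetical. mkstemp() creates the file exclusively and with
owner-only permissions, and the directory is taken from TMPDIR so that it
can be redirected to a partition with more space.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: create a private temporary file in $TMPDIR, or in
 * /tmp (an assumed fallback) if TMPDIR is unset. On success returns an
 * open descriptor and leaves the file name in path; on failure returns -1. */
static int open_temp(char *path, size_t len)
{
    assert(path != NULL && len > 0);
    const char *dir = getenv("TMPDIR");
    if (dir == NULL || *dir == '\0')
        dir = "/tmp";
    int n = snprintf(path, len, "%s/swtmpXXXXXX", dir);
    if (n < 0 || (size_t)n >= len) {
        errno = ENAMETOOLONG;
        return -1;
    }
    return mkstemp(path);   /* exclusive create, mode 0600 */
}

/* Write all len bytes, retrying short writes. Returns 0 on success, or -1
 * on any error (for example ENOSPC when the partition is full). */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t w = write(fd, p, len);
        if (w < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += w;
        len -= (size_t)w;
    }
    return 0;
}

/* Read exactly len bytes. Returns 0 on success, or -1 on error or if end
 * of file arrives early, which is how a truncated file shows up. */
static int read_exact(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t r = read(fd, p, len);
        if (r < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (r == 0) {           /* premature EOF: file is truncated */
            errno = EIO;
            return -1;
        }
        p += r;
        len -= (size_t)r;
    }
    return 0;
}
```

A caller would check every return value and, on failure, report the error,
unlink() the temporary file and exit, rather than carrying on with a full
partition or a truncated file.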

I suspect fixing the code to deal with all of those points may be quite a lot 
of work, unfortunately!
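
For the interrupt case, a minimal sketch of one possible approach, assuming
POSIX sigaction() and a single temporary file. The names temp_path,
cleanup_temp, on_signal and install_cleanup are hypothetical and do not come
from swish-e. The signal handler only calls unlink() and _exit(), both of
which are async-signal-safe.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Name of the one temporary file to remove (hypothetical; a real program
 * with several temporary files would need a list). */
static char temp_path[4096];

/* Remove the registered temporary file, if any. */
static void cleanup_temp(void)
{
    if (temp_path[0] != '\0') {
        unlink(temp_path);
        temp_path[0] = '\0';
    }
}

/* Signal handler: delete the file, then exit with the conventional
 * 128 + signal number status. */
static void on_signal(int sig)
{
    cleanup_temp();
    _exit(128 + sig);
}

/* Remember path and arrange for it to be removed on SIGINT, SIGTERM,
 * SIGHUP and normal exit. Returns 0 on success, -1 on failure. */
static int install_cleanup(const char *path)
{
    assert(path != NULL);
    if (strlen(path) >= sizeof temp_path)
        return -1;
    strcpy(temp_path, path);

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGINT, &sa, NULL) != 0 ||
        sigaction(SIGTERM, &sa, NULL) != 0 ||
        sigaction(SIGHUP, &sa, NULL) != 0)
        return -1;

    return atexit(cleanup_temp) == 0 ? 0 : -1;
}
```

install_cleanup() would be called once, right after the temporary file is
created. Nothing can be done about kill -9, so documenting that the files
exist would still be needed.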

                                John Line

-- 
University of Cambridge WWW manager account (usually John Line)
Send general WWW-related enquiries to webmaster@ucs.cam.ac.uk
Received on Sun Nov 23 15:55:35 1997