patch for filesystem indexing

From: John-Marc Chandonia <JMChandonia(at)>
Date: Thu May 20 2004 - 21:30:25 GMT
	When traversing a filesystem to build indices, the first thing
swish-e does when it indexes a directory is check whether the
directory has been visited before; if so, it skips it.  The problem
is that this check is done before the FileRules check that decides
whether we really want to index the directory.  As a result,
I have cases where a directory is visited once and skipped due to
a FileRules check, then visited again via a different pathname
(where the FileRules don't match) and skipped because it is recorded
as already visited.

	Why would anybody want to do this?  In my lab, we keep lab
notebooks in datestamped directories (e.g., /040520/) with symlinks to
the next and previous weeks (/040513/n/ is the same directory as
/040520/ or /040527/p/).  Because I want the indices to be clear, I
only want to index files in the directory according to their
non-symlinked path.  I therefore wrote FileRules to skip paths with
/n/ and /p/ in them, but when the directory is visited via its "real"
name, swish-e reports already having indexed the directory.  (Note, in
this case I can't just turn off following symlinks because there are
lots of other symlinks I do want to follow.)

	My fix was just to move the fs_already_indexed check to after
the FileRules dirname check; this solves the problem I mention above.
This is based on a quick glance at the code; you might want to move it
even later (after the FileRules directory check) if there's no potential
for an infinite loop:

diff fs.c~ fs.c
<     if (fs_already_indexed(sw, dir))
<         return;
<     /* and another stat if not set to follow symlinks */
<     if (!fs->followsymlinks && islink(dir))
<         return;
>     if (fs_already_indexed(sw, dir))
>         return;
>     /* and another stat if not set to follow symlinks */
>     if (!fs->followsymlinks && islink(dir))
>         return;
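
To illustrate why the order of the two checks matters, here is a minimal standalone sketch. It is not swish-e code: `already_indexed`, `rejected_by_rules`, `count_indexed`, and the integer `id` (standing in for whatever identity the real check uses, e.g. device and inode) are all hypothetical names chosen for the example. The one assumption that matters is that the "already indexed" check records the directory as seen as a side effect.

```c
#include <stdio.h>
#include <string.h>

#define MAX_SEEN 16

/* Stand-in for a directory: a pathname plus an identity shared by all
   pathnames that resolve to the same directory. */
struct dir_entry {
    const char *path;
    int id;
};

static int seen[MAX_SEEN];
static int n_seen;

/* Hypothetical "already indexed" check: returns 1 if the identity was seen
   before; otherwise records it (side effect) and returns 0. */
static int already_indexed(int id)
{
    for (int i = 0; i < n_seen; i++)
        if (seen[i] == id)
            return 1;
    if (n_seen < MAX_SEEN)
        seen[n_seen++] = id;
    return 0;
}

/* Hypothetical rule: reject any path that goes through an n or p symlink. */
static int rejected_by_rules(const char *path)
{
    return strstr(path, "/n/") != NULL || strstr(path, "/p/") != NULL;
}

/* Visit the entries in order and count how many actually get indexed.
   rules_first selects which of the two checks runs first. */
static int count_indexed(const struct dir_entry *dirs, int n, int rules_first)
{
    int indexed = 0;
    n_seen = 0; /* start each run with an empty "seen" set */
    for (int i = 0; i < n; i++) {
        if (rules_first) {
            if (rejected_by_rules(dirs[i].path)) continue;
            if (already_indexed(dirs[i].id)) continue;
        } else {
            /* The directory is recorded as seen even if the rules
               reject this particular pathname afterwards. */
            if (already_indexed(dirs[i].id)) continue;
            if (rejected_by_rules(dirs[i].path)) continue;
        }
        indexed++;
    }
    return indexed;
}
```

Given the three entries /040513/ (id 1), /040513/n/ (id 2), and /040520/ (id 2), visited in that order, `count_indexed` returns 1 with `rules_first` set to 0 and 2 with it set to 1: only the rules-first order indexes the directory under its non-symlinked name.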

John-Marc Chandonia
Structural Genomics Center, Berkeley National Lab
We're everywhere... for your convenience.  -- Psi Corps <*>
Received on Thu May 20 14:30:26 2004