an exclusion question

From: Bruce Bowler <bbowler(at)not-real.bigelow.org>
Date: Wed Jan 27 1999 - 20:32:55 GMT
Now that I've got the spider working, I'd like it to ignore some files.
Specifically, I'd like it to ignore files of the form *.wwwstat.html.  I
tried adding .wwwstat.html to the NoContents directive, but that reduced my
contents to 0, since it excluded all .html files :-).  Other than stuffing
them off in some other directory, which I could then exclude with robots.txt,
is there another way (when spidering) to tell it to ignore files that match
a certain pattern?
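
To make the desired behaviour concrete, here is a minimal Python sketch of
the kind of filter being asked for. It is only an illustration, not the
indexer's own configuration syntax: the function name should_skip, the
EXCLUDE_PATTERNS list, and the example.org URLs are all hypothetical. The
pattern is matched against the whole file name, so ordinary .html pages are
kept.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse
import posixpath

# Hypothetical exclusion list; the pattern is the one from the question.
EXCLUDE_PATTERNS = ["*.wwwstat.html"]

def should_skip(url, patterns=EXCLUDE_PATTERNS):
    """Return True if the last path component of url matches any pattern."""
    filename = posixpath.basename(urlparse(url).path)
    # fnmatchcase compares case-sensitively on every platform.
    return any(fnmatchcase(filename, pattern) for pattern in patterns)

# Made-up URLs standing in for what a spider might discover.
urls = [
    "http://example.org/index.html",
    "http://example.org/stats/jan.wwwstat.html",
    "http://example.org/docs/about.html",
]
kept = [u for u in urls if not should_skip(u)]
```

With this filter only the jan.wwwstat.html URL is dropped; index.html and
about.html remain in kept.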

Bruce

Bruce Bowler                             207.633.9600 (voice)
Research Associate                       207.633.9641 (fax)
Bigelow Laboratory for Ocean Sciences    bbowler@bigelow.org
West Boothbay Harbor ME  04575           http://www.bigelow.org/
Received on Wed Jan 27 12:32:34 1999