Now that I've got the spider working, I'd like it to ignore some files.
Specifically, I'd like it to ignore files of the form *.wwwstat.html. I
tried adding .wwwstat.html to the NoContents directive, but that reduced my
contents to 0, since it excluded all .html files :-). Other than stuffing
them off in some other directory, which I could then exclude with robots.txt,
is there another way (when spidering) to tell it to ignore files that match
a certain pattern?
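
To make the behaviour I'm after concrete, here is a rough Python sketch. It
is not how the spider works internally. The function names are made up for
illustration, and the extension-only matching is just my guess at why
NoContents caught every .html file.

```python
import fnmatch
import os

def excluded_by_extension(filename, extensions):
    # Guess at the current behaviour: only the final extension is compared,
    # so "usage.wwwstat.html" is seen as just ".html".
    return os.path.splitext(filename)[1] in extensions

def excluded_by_pattern(filename, patterns):
    # Desired behaviour: skip a file only if its whole name matches a
    # shell-style pattern such as "*.wwwstat.html".
    return any(fnmatch.fnmatchcase(filename, p) for p in patterns)

# Hypothetical file names, for illustration only.
files = ["index.html", "usage.wwwstat.html", "about.html"]

# Extension matching throws away every .html file.
kept_by_extension = [f for f in files
                     if not excluded_by_extension(f, [".html"])]

# Pattern matching drops only the wwwstat reports.
kept_by_pattern = [f for f in files
                   if not excluded_by_pattern(f, ["*.wwwstat.html"])]

print(kept_by_extension)
print(kept_by_pattern)
```

With pattern matching, only the wwwstat report is skipped and the ordinary
.html pages are still indexed, which is the result I'm hoping for.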
Bruce
Bruce Bowler                            207.633.9600 (voice)
Research Associate                      207.633.9641 (fax)
Bigelow Laboratory for Ocean Sciences   bbowler@bigelow.org
West Boothbay Harbor ME 04575           http://www.bigelow.org/
Received on Wed Jan 27 12:32:34 1999