Re: error indexing pdf files

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Tue Apr 15 2003 - 17:06:04 GMT
On Tue, 15 Apr 2003, Jody Cleveland wrote:

> > By the way:
> > 
> > moseley@bumby:~/apache$ HEAD www.oshkoshpubliclibrary.org/../index.html
> > 200 OK
> > Connection: close
> 
> So, does that mean it is actually working, despite the fact it has the .. in
> it?

It means that from the server's point of view /index.html and
/../index.html and /../../index.html ... are all the same file, but from
the spider's point of view they are all different paths.

My guess is if you tried indexing a file at the top level:

test.html:
<html>
<head><title>Go Dog, Go!</title></head>
<body><a href="../test.html">go around again!</a></body>
</html>

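The spider would go on forever, because each pass resolves the link to a
new, longer path (/../test.html, /../../test.html, ...) that it has not
seen before.  If test.html were in some subdirectory, the link would
return a 404.  But at the top level it returns a 200.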

By the way, 

  $URI::ABS_REMOTE_LEADING_DOTS = 1;

at the top of my spider config file does seem to fix it.  I'll make a note
in the spider.pl docs.
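
Here is a rough sketch of what that setting changes, using only the URI
module.  The follow_link() helper and the example.com host are made up
for illustration; this is not code from spider.pl.  It resolves the
"../test.html" link against its own result a few times, the way a spider
following the link would.  With the default, I'd expect the path to pick
up one more ".." on every hop; with the flag set, every hop should come
back to the same URL.

```perl
use strict;
use warnings;
use URI;

# Hypothetical helper: resolve $href against $start, then against the
# result, $hops times, and return every absolute URL produced.
sub follow_link {
    my ($start, $href, $hops) = @_;
    my @seen;
    my $base = URI->new($start);
    for (1 .. $hops) {
        $base = URI->new_abs($href, $base);
        push @seen, $base->as_string;
    }
    return @seen;
}

# example.com is a placeholder host, not the site from this thread.
my $start = 'http://example.com/test.html';

# Default behavior: excess ".." segments are kept in the path.
my @default = follow_link($start, '../test.html', 3);

# Same walk with leading dots removed during abs().
my @fixed;
{
    local $URI::ABS_REMOTE_LEADING_DOTS = 1;
    @fixed = follow_link($start, '../test.html', 3);
}

print "default: $_\n" for @default;
print "fixed:   $_\n" for @fixed;
```

Since the spider keys on the URL string, the first list looks like three
different documents to it, and the second looks like one.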


-- 
Bill Moseley moseley@hank.org
Received on Tue Apr 15 17:09:54 2003