Restatement: Problem with Spidering "file://" PDFs

From: McQuiggin, Kevin <kevin.mcquiggin(at)not-real.vancouver.ca>
Date: Tue Jul 05 2005 - 15:30:03 GMT

Hi All:

I've corrected a typo in the URL I included previously.  Thanks to Bill
Moseley for pointing this out!

I'm new to swish-e (about 3 weeks experience).  I have the package running
more or less successfully on a Linux box indexing some IIS pages.  Very
powerful, and my colleagues love it!

I have created a web page that has several references to PDF documents via
URLs that look like:

    <a href="file:///path/to/local/file/document.pdf">  ...  </a>

I note that spider.pl does not appear to be reading and indexing these
files.

I can think of workarounds to get the URL into an http: context, but it
would be simpler if there were a way to get this to work with file:// URLs.
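
For illustration, one such workaround might be sketched as below. This is only
a rough Python sketch and not part of swish-e or spider.pl: the function name
file_url_to_http, the local root /path/to/local, and the base URL
http://intranet.example/docs are all hypothetical, and it assumes the
directory holding the PDFs is also served by a web server at that base URL.

```python
import posixpath
from urllib.parse import quote, unquote, urlparse


def file_url_to_http(file_url, local_root, http_base):
    """Map a file:// URL under local_root to the matching URL under http_base.

    Hypothetical helper: assumes local_root is served at http_base.
    """
    parsed = urlparse(file_url)
    if parsed.scheme != "file":
        return file_url  # leave http: and other links untouched

    # Decode percent-escapes and collapse any "." or ".." segments.
    path = posixpath.normpath(unquote(parsed.path))
    root = posixpath.normpath(local_root)
    prefix = root if root.endswith("/") else root + "/"

    # Refuse to map files that live outside the served directory.
    if not path.startswith(prefix):
        raise ValueError(f"{path} is not under {root}")

    relative = path[len(prefix):]
    return http_base.rstrip("/") + "/" + quote(relative)


if __name__ == "__main__":
    print(file_url_to_http(
        "file:///path/to/local/file/document.pdf",
        "/path/to/local",                 # assumed local directory
        "http://intranet.example/docs",   # assumed web-server base URL
    ))
```

The links in the page would then be rewritten with such a mapping before the
spider sees them, so that it fetches the PDFs over http: like any other page.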

Someone must have asked/answered this before, but I could not find anything
in the docs or the FAQ.

Help appreciated,

Kevin
Received on Tue Jul 5 08:30:04 2005