does not follow Metadata redirects

From: Robert Keith <Robert(at)>
Date: Mon Jul 28 2003 - 08:43:03 GMT
I need to spider a web site whose main index.html contains a meta refresh
tag that redirects the browser to index.php.

Example: ->
    <meta http-equiv="refresh" content="0;

This works fine in browsers (albeit slowly).
I know there are better ways to do this (use the web server to set the head
html correctly, etc.), but we can't control foreign sites.  Shouldn't the
spider behave similarly to browsers?
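
To make the expected behavior concrete, here is a rough sketch of what I mean
by "following" a meta refresh. It is only an illustration in Python, not how
the spider actually works: the names MetaRefreshFinder and
meta_refresh_target are made up for this example, and example.com is a
placeholder host. It looks for the first meta refresh tag in a page and
resolves its target against the URL of the page that was fetched.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class MetaRefreshFinder(HTMLParser):
    """Hypothetical helper: records the target of the first meta refresh tag."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lowercases tag and attribute names.
        if tag != "meta" or self.target is not None:
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("http-equiv", "").lower() != "refresh":
            return
        # The content attribute looks like "0; url=index.php":
        # a delay in seconds, a separator, then an optional "url=" prefix.
        match = re.match(
            r"\s*\d+(?:\.\d*)?\s*[;,]\s*(?:url\s*=\s*)?(.+)",
            attr.get("content", ""),
            re.IGNORECASE | re.DOTALL,
        )
        if match:
            self.target = match.group(1).strip().strip("'\"")


def meta_refresh_target(html, base_url):
    """Return the absolute URL a meta refresh points to, or None if absent."""
    finder = MetaRefreshFinder()
    finder.feed(html)
    if finder.target is None:
        return None
    # Relative targets are resolved against the page that was fetched.
    return urljoin(base_url, finder.target)


if __name__ == "__main__":
    page = '<html><head><meta http-equiv="refresh" content="0; url=index.php"></head></html>'
    print(meta_refresh_target(page, "http://example.com/index.html"))
```

A spider doing something like this would queue the returned URL for fetching
instead of stopping at the nearly empty index.html.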

Is this the current behavior or did I miss something?

Robert Keith


The command I run is:

        /usr/bin/perl /fs/area/search/prog-bin/
/fs/area/search/conf/ | swish-e -S prog -c
/fs/area/search/conf/prof2 -i stdin -v3

The output is:

Parsing config file '/fs/area/search/conf/prof2'
Parsing config file '/fs/area/search/conf/common.config'
Indexing Data Source: "External-Program"
Indexing "stdin"
/fs/area/search/prog-bin/ Reading parameters from

 -- Starting to spider: --
>> +Fetched 0 Cnt: 1 200 OK text/html 130

Summary for:
Total Bytes: 130  (130.0/sec)
 Total Docs:   1  (1.0/sec)
Unique URLs:   1  (1.0/sec) - Using HTML parser -  (no words indexed)

Removing very common words...
no words removed.
Writing main index...
err: No unique words indexed!
Received on Mon Jul 28 08:43:56 2003