Re: help swish-e

From: <moseley(at)not-real.hank.org>
Date: Thu May 15 2003 - 21:46:39 GMT
On Thu, May 15, 2003 at 01:22:09PM -0700, Brenda G. Nieves wrote:
> Help swish-e Indexing a Directory in other server
> 
> 
> base_url        => 'http://thewebsite.com/directory1/',

If you only want to index that directory and below (and not follow links 
elsewhere, like to the root), then test the URL for "directory1".

In your spider config try something like:

     test_url => sub { $_[0]->path =~ m!^/directory1/! },

$_[0] (the first parameter passed to the function) is a URI object, and
thus $_[0]->path is the path part of the URL.  The subroutine returns true 
if that path starts with /directory1/ and that tells the spider that it's 
ok to process that URL.  Returning false (i.e. when it doesn't match) says 
to ignore that URL.
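
To see what that test accepts and rejects without running the spider, here 
is a small standalone sketch. It assumes the URI module is installed, and 
the URLs in the list are made up for illustration; only the callback itself 
is the same as the config line above.

```perl
use strict;
use warnings;
use URI;

# Same callback as the test_url line above: true only when the
# path part of the URL begins with /directory1/.
my $test_url = sub { $_[0]->path =~ m!^/directory1/! };

# Hypothetical URLs, invented for this illustration.
my @urls = (
    'http://thewebsite.com/directory1/',
    'http://thewebsite.com/directory1/sub/page.html',
    'http://thewebsite.com/',
    'http://thewebsite.com/directory2/page.html',
    'http://thewebsite.com/directory1',
);

for my $url (@urls) {
    my $uri = URI->new($url);    # the callback expects a URI object
    printf "%-50s %s\n", $url, $test_url->($uri) ? 'index' : 'skip';
}
```

Note that the last URL has no trailing slash, so its path is "/directory1" 
and the pattern does not match. If the spider's starting URL is written that 
way, either add the slash or loosen the pattern.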

-- 
Bill Moseley
moseley@hank.org
Received on Thu May 15 21:46:45 2003