[swish-e] Can I index by filename and directory over http

From: David Annis <david(at)not-real.ualconsulting.com>
Date: Tue Feb 12 2008 - 14:47:14 GMT
I am a longtime user of htdig and would like to switch to swish-e, but I
need to be able to index parts of several sites in different ways.  I need
to index particular page(s) on one site, a directory on a second, and a set
of pages on a third that all share a common naming convention, although the
page that links to them does not follow it.

Here's an example and how I think the swish configuration might work.  I
want to index:

http://site1.com/flowers.html
anything in http://www.my-site.com/flowers/
all of the pages linked from http://www.athirdsite.org/products.html
that match flowers_*.html

I think that the first two would be:
IndexDir http://www.site1.com/flowers.html
IndexDir http://www.my-site.com/flowers/

But the third line of the config is harder.  I don't see how to start at one
page (products.html) that I don't want indexed and still follow its links,
or how to apply a regex only to the links found on that particular page.
Is this doable with swish-e?
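
To make the third case concrete, the selection rule can be sketched in
Python.  This is only an illustration of the desired behavior, not swish-e
configuration: the sample HTML and the names LinkCollector and select_links
are made up for the example, and nothing is fetched over the network.

```python
from fnmatch import fnmatchcase
from html.parser import HTMLParser
from posixpath import basename
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def select_links(page_url, html, pattern):
    """Return absolute URLs linked from the page whose file name matches pattern.

    The starting page is only a source of links; it is never returned itself.
    """
    collector = LinkCollector()
    collector.feed(html)
    selected = []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)
        if absolute == page_url:
            continue  # do not index the starting page
        file_name = basename(urlparse(absolute).path)
        if fnmatchcase(file_name, pattern) and absolute not in selected:
            selected.append(absolute)
    return selected


# Made-up stand-in for the contents of products.html.
SAMPLE_HTML = """
<a href="flowers_roses.html">Roses</a>
<a href="/flowers_tulips.html">Tulips</a>
<a href="tools.html">Tools</a>
<a href="products.html">Products</a>
"""

urls = select_links(
    "http://www.athirdsite.org/products.html", SAMPLE_HTML, "flowers_*.html"
)
for url in urls:
    print(url)
```

With the sample above, only the two flowers_*.html links are selected;
tools.html and products.html itself are skipped.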

Thanks,
David

Received on Tue Feb 12 09:47:17 2008