Probably dumb newbie question.

From: Nic Gibson <nicg(at)not-real.noslogan.org>
Date: Thu Aug 26 2004 - 11:08:35 GMT

Hi

I'm having an odd problem with swish-e 2.4.2. I have an index generated using
spider.pl. Contrary to my expectations, it appears to be indexing the href content
of HTML anchors. I've attached the index configuration file to this message. The only
odd thing I can think of about this particular website is that the URLs don't have
file extensions (see http://pmr.corbas.co.uk/dynamic/). However, the content type
is definitely correct.


I'd be really grateful if someone more experienced in the ways of SWISH-E could
tell me that I've done something particularly stupid (failing that, clues as to
where to look next would be great).

cheers

nic gibson
-- 
love is the shit that makes life bloom


Attachment: site.conf

IndexDir /usr/local/src/pmr/bin/spider.pl
SwishProgParameters /usr/local/src/pmr/search/spider.conf
IndexFile /usr/local/src/pmr/search/index.site
IndexName 'PMR Site'
IndexDescription "Index of the PMR Site Content"
MetaNames title 
PropertyNames title category

ExtractPath category regex !/dynamic/([^/]+)/!$1!
ExtractPathDefault category general

DefaultContents HTML2
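
For anyone reading the ExtractPath line above, here is a minimal Python sketch of what that pattern does to a URL. It is an illustration only, not swish-e's own code. It assumes the regex form works as a sed-style substitution on the document path (text outside the match is left in place) and that ExtractPathDefault supplies the value when the pattern does not match. The function names and the sample path segments "news/item1" are hypothetical.

```python
import re

# Pattern copied from the ExtractPath line in site.conf.
# In the config, "!" is only the delimiter around pattern and replacement.
CATEGORY_RE = re.compile(r"/dynamic/([^/]+)/")


def captured_segment(url):
    """Return the text that $1 captures, or None if the pattern does not match."""
    match = CATEGORY_RE.search(url)
    return match.group(1) if match else None


def substituted_path(url, default="general"):
    """Apply the pattern as a sed-style substitution (assumed behaviour).

    Only the matched span is replaced by $1; anything before or after
    the match stays in the result. When nothing matches, fall back to
    the ExtractPathDefault value.
    """
    if CATEGORY_RE.search(url) is None:
        return default
    return CATEGORY_RE.sub(r"\1", url, count=1)


if __name__ == "__main__":
    # Hypothetical URL under the site mentioned in the message.
    sample = "http://pmr.corbas.co.uk/dynamic/news/item1"
    print(captured_segment(sample))
    print(substituted_path(sample))
    # No path segment follows /dynamic/ here, so the default applies.
    print(substituted_path("http://pmr.corbas.co.uk/dynamic/"))
```

Because the pattern is not anchored at either end, a substitution-style reading leaves the host and the trailing path in the result. Under the same assumption, a pattern that also consumes the text before and after the captured segment would leave only the segment itself.
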

Received on Thu Aug 26 04:08:56 2004