Paras Fadte wrote on 08/11/2009 04:38 AM:
> Hi,
>
> I have run into a strange problem while indexing with swish-e: it
> appears to start indexing the data all over again, as if it were stuck
> in a loop. When I set max_depth=1 or 2, it works fine. Can anybody
> point out what could be happening here?
>
Sounds like spider.pl (I assume you are using that) is not
identifying URLs as duplicates, so the same pages get fetched again
under different URLs. You could try turning on the use_md5 option
as described in the documentation:
http://swish-e.org/docs/spider.html#duplicate_documents
Search for 'use_md5' in the docs, and make sure you have the Digest::MD5
Perl module installed from CPAN.
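
For illustration only, a minimal spider config with the option turned on
might look like the sketch below. The base_url and email values are
placeholders I made up, not taken from your setup; adjust them, and keep
whatever other settings you already have.

```perl
# Spider config file sketch (hypothetical values, adapt to your site).
# spider.pl reads this file as Perl code and expects @servers to be set.
@servers = (
    {
        base_url => 'http://www.example.com/',   # placeholder start URL
        email    => 'you@example.com',           # placeholder contact address
        use_md5  => 1,    # fingerprint page content to skip duplicate documents
    },
);

1;    # the file must return a true value
```

If the re-indexing stops with use_md5 set, that would suggest the same
content is being reached through more than one URL.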
--
Peter Karman . http://peknet.com/ . peter(at)not-real.peknet.com
gpg key: 37D2 DAA6 3A13 D415 4295 3A69 448F E556 374A 34D9
_______________________________________________
Users mailing list
Users@lists.swish-e.org
http://lists.swish-e.org/listinfo/users
Received on Wed Aug 12 09:35:22 2009