Re: [swish-e] Indexing starts all over again

From: Paras Fadte <plfgoa(at)not-real.gmail.com>
Date: Fri Aug 14 2009 - 05:00:39 GMT

I tried it, but it doesn't seem to work.

On Wed, Aug 12, 2009 at 7:04 PM, Peter Karman <peter@peknet.com> wrote:
> Paras Fadte wrote on 08/11/2009 04:38 AM:
>> Hi,
>>
>> I have had a strange problem while indexing with swish-e: it
>> appears to start indexing the data all over again, as if it were
>> stuck in a loop. When I try with, say, max_depth=1 or 2, it works
>> fine. Can anybody point out what could be happening here?
>>
>
> Sounds like spider.pl (I assume you are using that) is not
> identifying URLs as duplicates. You could try turning on the md5 option
> as described in the documentation:
> http://swish-e.org/docs/spider.html#duplicate_documents
>
> Search for 'use_md5' in the docs and make sure you have the Digest::MD5
> perl module installed from CPAN.
>
>
> --
> Peter Karman  .  http://peknet.com/  .  peter(at)not-real.peknet.com
> gpg key: 37D2 DAA6 3A13 D415 4295  3A69 448F E556 374A 34D9
> _______________________________________________
> Users mailing list
> Users@lists.swish-e.org
> http://lists.swish-e.org/listinfo/users
>
Received on Fri Aug 14 01:00:41 2009
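
The duplicate detection suggested in the quoted reply can be sketched roughly as follows. This is only an illustration of the general idea (fingerprint each fetched page and skip any content already seen), written in Python rather than Perl; it is not spider.pl's actual code. The function name make_duplicate_filter and the example.com URLs are hypothetical.

```python
import hashlib


def make_duplicate_filter():
    """Return a function that reports whether page content was seen before."""
    seen = {}  # hex digest of content -> first URL that produced it

    def is_duplicate(url, content):
        # Fingerprint the raw bytes of the page, not the URL, so that two
        # different URLs serving identical content count as duplicates.
        digest = hashlib.md5(content).hexdigest()
        if digest in seen:
            return True
        seen[digest] = url
        return False

    return is_duplicate


# Hypothetical crawl results: the first two URLs serve the same bytes.
pages = [
    ("http://example.com/", b"<html>home</html>"),
    ("http://example.com/index.html", b"<html>home</html>"),
    ("http://example.com/about", b"<html>about</html>"),
]

is_duplicate = make_duplicate_filter()
unique_urls = [url for url, body in pages if not is_duplicate(url, body)]
print(unique_urls)
```

A content fingerprint only catches pages whose bytes are identical. If each request returns slightly different output (for example an embedded timestamp or session ID), the digests differ and the pages are still treated as distinct, which may be one reason such an option appears not to help.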