Re: [swish-e] Seeding a swish-e index

From: Peter Karman <peter(at)not-real.peknet.com>
Date: Mon Sep 08 2008 - 01:38:49 GMT
Sean wrote on 9/7/08 7:26 PM:
>>> Is there a weighting for each of the different elements of an HTML page?
>> http://swish-e.org/docs/swish-faq.html#how_is_ranking_calculated_
>>
>> ^^ read that first
>>
>> I would personally put all my keywords in a <meta> tag in the header and set a
>> MetaRankBias with RankScheme=1
>>
> 
> Thanks. That will help with this issue.
> 
> How does the ranking work with non-.html documents that don't have:
> <title>Titles</title>
> <h1>headers</h1>
> <meta name="keywords" content="META">
> <!-- Comments -->
> etc ?
> 
> For instance a word document or a PDF document
> 

Swish-e doesn't index Word or PDF documents in their original formats. They must
first be converted to HTML, text, or XML. That's what SWISH::Filter does, for
example.

So the same ranking strategy applies to the converted output. SWISH::Filter (for
example) assigns <title> and <meta> where it can for things like PDF and MS Word
docs.
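
To make the conversion step concrete, here is a minimal Python sketch of the
idea. It is not SWISH::Filter's actual code: the helper name to_indexable_html
and its arguments are hypothetical, and it assumes the text and metadata have
already been extracted from the PDF or Word file by some other tool. It only
shows how extracted content could be wrapped in HTML so that the title and
keywords land in elements the ranking can weight.

```python
from html import escape


def to_indexable_html(title, keywords, body_text):
    """Wrap already-extracted text in minimal HTML (hypothetical helper).

    title     -- document title taken from the file's metadata
    keywords  -- list of keyword strings (may be empty)
    body_text -- plain text, paragraphs separated by blank lines
    """
    # Escape everything so stray <, > or & in the source cannot break the markup.
    head = "<title>{}</title>\n".format(escape(title))
    if keywords:
        head += '<meta name="keywords" content="{}">\n'.format(
            escape(", ".join(keywords), quote=True)
        )
    # One <p> per non-empty paragraph of the extracted text.
    paragraphs = "".join(
        "<p>{}</p>\n".format(escape(p.strip()))
        for p in body_text.split("\n\n")
        if p.strip()
    )
    return "<html>\n<head>\n{}</head>\n<body>\n{}</body>\n</html>\n".format(
        head, paragraphs
    )


doc = to_indexable_html(
    "Annual Report",
    ["finance", "2008"],
    "First paragraph.\n\nSecond & last.",
)
print(doc)
```

Once a document looks like this, it is indexed and ranked like any other HTML
page, which is why no separate ranking rules are needed for PDF or Word files.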


-- 
Peter Karman  .  http://peknet.com/  .  peter(at)not-real.peknet.com
Received on Sun Sep 7 21:38:59 2008