Re: [swish-e] Change the indexed 'title'

From: Peter Karman <peter(at)not-real.peknet.com>
Date: Wed Oct 24 2007 - 18:50:50 GMT

On 10/24/2007 01:30 PM, josh@relativelysane.com wrote:

> 
> I am not concerned with filtered queries or anything fancy like that; I just
> want the title on the output to be pulled from the HTML tags mentioned
> previously. I looked up the ExtractPath metaname, but I am not 100% sure how
> that could help me use these fields as the 'title' on the output as opposed
> to the <title></title> field. I also looked up PropertyNames and set them
> in my cfg, but I am not sure how to populate them with the fields that I want
> from the HTML that it indexes.
> 

If you want the title on the output to be pulled from tags other than <title>
tags, then you either need to (a) filter the content before it reaches swish-e
to put your title content inside the <title> tagset, or (b) use existing
swish-e features to mimic that approach.
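
For option (a), a filter could look roughly like the Python sketch below. It
is only an illustration: the function name promote_title is made up, the
regex-based parsing assumes small, well-formed HTML like the sample docs in
this thread, and how the filter gets hooked into swish-e is left out.

```python
import re


def promote_title(html: str, tag: str) -> str:
    """Copy the text of the first <tag>...</tag> element into <title>.

    Hypothetical helper: returns the document unchanged when the tag
    is not present.
    """
    found = re.search(rf"<{tag}\b[^>]*>(.*?)</{tag}>", html, re.I | re.S)
    if found is None:
        return html
    # Strip any nested markup so only plain text lands in the title.
    new_title = re.sub(r"<[^>]+>", "", found.group(1)).strip()
    # A function replacement avoids backslash surprises in the new title.
    return re.sub(
        r"(<title\b[^>]*>).*?(</title>)",
        lambda t: t.group(1) + new_title + t.group(2),
        html,
        count=1,
        flags=re.I | re.S,
    )


doc = (
    "<html>\n"
    " <head><title>real title</title></head>\n"
    " <body><strong>title I want</strong></body>\n"
    "</html>\n"
)
print(promote_title(doc, "strong"))
```

With a filter like this, swishtitle would already hold the text you want and
no changes to the search side would be needed.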

In my example below, I created 3 dummy docs, one in each directory, using the
tags you specified. I configured swish-e to extract the first part of the file
path (the directory name) to a MetaName called 'flavor' (which I can then
search on later), and I added a PropertyName for each of the tags whose content
you want to save for display in results. I also set 'flavor' as a PropertyName,
just so you can easily tell which property holds the title for each result.
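
To see what the path extraction does, here is the same pattern applied in
Python. This is only an approximation of the ExtractPath regex: Python's re
module is not the regex engine swish-e uses, and the file name page.html is
hypothetical.

```python
import re

# Same pattern as in the ExtractPath line: capture everything before the
# first slash and replace the whole path with that capture.
PATTERN = r"^([^/]+)/.*$"


def flavor_from_path(path: str) -> str:
    """Return the leading directory name, or the path unchanged if no slash."""
    return re.sub(PATTERN, r"\1", path)


for path in (
    "docsthatarenormal/page.html",
    "docswith-ahref/page.html",
    "docswith-strong/page.html",
):
    print(path, "->", flavor_from_path(path))
```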

NOTE: if you're planning to use the swish.cgi example script in the
distribution, then you'd have to hack it to return different PropertyNames
instead of swishtitle for the title of each result, and test the flavor
property value to know which one to display. If you're writing your own search
app, you'd have to put the same logic in there.
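
In your own search app, that logic might look something like this Python
sketch. It assumes each result is already available as a dict of property
values (a hypothetical structure, not something swish-e hands you directly)
and that the flavor values match the directory names used in this example.

```python
# Which property holds the display title for each flavor (directory name).
TITLE_PROPERTY_BY_FLAVOR = {
    "docswith-strong": "strong",
    "docswith-ahref": "a",
}


def display_title(props: dict) -> str:
    """Pick the title to show for one result, falling back to swishtitle."""
    wanted = TITLE_PROPERTY_BY_FLAVOR.get(props.get("flavor", ""), "swishtitle")
    # Fall back to the real <title> if the preferred property is empty.
    return props.get(wanted) or props.get("swishtitle", "")


result = {
    "flavor": "docswith-strong",
    "strong": "title I want",
    "swishtitle": "real title",
}
print(display_title(result))
```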

[pek@dewpoint:~/tmp/josh]$ ls -1
conf
docsthatarenormal/
docswith-ahref/
docswith-strong/
index.swish-e
index.swish-e.prop
[pek@dewpoint:~/tmp/josh]$ cat docs*/*

docsthatarenormal:
<html>
 <head><title>real title is the title I want</title></head>
 <body><a href="bar">link text</a> <strong>strong text</strong> blah</body>
</html>

docswith-ahref:
<html>
 <head><title>real title</title></head>
 <body><a href="bar">title I want</a></body>
</html>

docswith-strong:
<html>
 <head><title>real title</title></head>
 <body><strong>title I want</strong></body>
</html>

[pek@dewpoint:~/tmp/josh]$ cat conf
ExtractPath flavor regex !^([^/]+)/.*$!$1!
PropertyNames strong a flavor


[pek@dewpoint:~/tmp/josh]$ swish-e -w title AND flavor=strong -x '"<strong>" "<swishtitle>" "<flavor>"\n'

# SWISH format: 2.5.6
# Search words: title AND flavor=strong
# Removed stopwords:
# Number of hits: 1
# Search time: 0.001 seconds
# Run time: 0.008 seconds
"title I want" "real title" "docswith-strong"
.
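
If you read that output from your own code, one way to turn each result line
back into named values is sketched below in Python. It assumes the quoted,
space-separated layout produced by the -x format string above; the function
name parse_results is made up, and a property value containing a double quote
would need more careful handling.

```python
import shlex


def parse_results(output: str, fields: list) -> list:
    """Turn result lines into dicts keyed by the given field names."""
    rows = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines, '#' header lines, and the '.' end marker.
        if not line or line.startswith("#") or line == ".":
            continue
        rows.append(dict(zip(fields, shlex.split(line))))
    return rows


sample = """\
# SWISH format: 2.5.6
# Number of hits: 1
"title I want" "real title" "docswith-strong"
.
"""
for row in parse_results(sample, ["strong", "swishtitle", "flavor"]):
    print(row)
```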


-- 
Peter Karman  .  peter(at)not-real.peknet.com  .  http://peknet.com/

Received on Wed Oct 24 14:50:49 2007