swish-e search difficulties

From: Chris Kantarjiev <cak(at)not-real.dimebank.com>
Date: Tue May 11 2004 - 19:24:57 GMT
I'm indexing a mail archive (one file per message) and searching
with swish.cgi. (I'm running 2.4.1.) It was recently pointed
out to me that "Subject & Body" searches don't find all the
messages that "Subject Only" searches do. That is, if a keyword
appears only in the subject field, which becomes swishtitle, a
"Subject & Body" search doesn't find it.

I'm guessing that my metanames aren't set up quite right, but I haven't
been able to figure out what's wrong. Help?

swish.conf:

IndexDir ./index_mh.pl
SwishProgParameters msgs

MetaNames swishtitle from 
PropertyNames from
PropertyNamesDate date
DefaultContents HTML
StoreDescription HTML <body> 10000
UndefinedMetaTags  ignore


swishcgi.conf:

use lib '/var/www/cgi-bin/modules';

return {
   title           => 'Search the Silent-Tristero Archives',
   swish_binary    => '/usr/local/bin/swish-e',
   swish_index     => '/WebPages/dimebank/s-t/index.swish-e',
   description_prop => 'swishdescription',

   template	   => {
	package    => 'stTemplateDefault',
   },
   timeout	   => 30,
   display_props   => [qw/ from date /],
   sorts           => [qw/swishrank swishtitle from date/],
   secondary_sort  => [qw/date desc/],  
   metanames       => [qw/swishdefault swishtitle from all/],
   name_labels     => {
       swishrank   =>  'Rank',
       all	   =>  'Entire message',
       swishtitle  =>  'Subject Only',
       from        =>  "Poster's Email",
       date        =>  'Message Date',
       swishdefault  =>  'Subject & Body',
   },
   meta_groups => {
	      all =>  [qw/swishdefault from /],
   },

   highlight       => {
       package         => 'PhraseHighlight',
       show_words      => 10,    # Number of swish words to show around each highlighted word
       max_words       => 100,   # If no words are highlighted, show this many words
       occurrences     => 6,     # Limit number of occurrences of highlighted words
       highlight_on    => '<font style="background:#FFFF99">',
       highlight_off   => '</font>',
       meta_to_prop_map => {   # this maps search metatags to display properties
	swishdefault    => [ qw/swishtitle swishdescription/ ],
	swishtitle      => [ qw/swishtitle/ ],
	from            => [ qw/from/ ],
	all		=> [ qw/swishdefault swishtitle from/ ],
	swishdocpath    => [ qw/swishdocpath/ ],
       },
    },
    date_ranges     => {
        property_name   => 'date',      # property name to limit by
        time_periods    => [  
            'All',
  	    'Today',
	    'Yesterday',
	    'This Week',
	    'Last Week',
	    'Last 90 Days',
	    'This Month',
	    'Last Month',
        ],

        line_break      => 0,
        default         => 'All',
        date_range      => 1,
    },
};
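
A possible workaround, sketched here and untested: swish.cgi's meta_groups,
which the config above already uses for "all", could also define a group that
searches swishdefault and swishtitle together. The group name subjbody is
hypothetical, and only the keys that would change are shown; they would need
to be merged into the hash returned above.

```perl
use strict;
use warnings;

# Keys that would change in the hash returned by swishcgi.conf.
# "subjbody" is a hypothetical group name, not a built-in swish-e metaname.
my %changes = (
    # Offer the group in place of plain swishdefault in the search form.
    metanames   => [qw/subjbody swishtitle from all/],

    # Each group expands to a search across every metaname it lists.
    # Adding swishtitle to "all" is an assumption made for consistency.
    meta_groups => {
        subjbody => [qw/swishdefault swishtitle/],
        all      => [qw/swishdefault swishtitle from/],
    },
);

# Matching additions to name_labels and to highlight's meta_to_prop_map.
my %label_for = ( subjbody => 'Subject & Body' );
my %props_for = ( subjbody => [qw/swishtitle swishdescription/] );
```

This leaves the index itself untouched, so no re-indexing would be needed to
try it.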

A typical message after index_mh.pl gets done with it:

Content-Length: 975
Last-Mtime: 1073517607
Path-Name: msgs/msgs24001-27000/24106

<html>
<head>
<title>

</title>
<meta name="precedence" content="list">
<meta name="swishtitle" content="Girls Aloud's year at the top">
<meta name="to" content="Name <your@name.here>">
<meta name="sender" content="your@name.here">
<meta name="date" content="1066685834">
<meta name="from" content="Another Name <my@name.here>">
<meta name="received" content="by wolfe.bbn.com (Postfix, from userid 13274)">
</head><body>


<quot>
Tweedy's life was transformed when she joined Girls Aloud, leaving behind life
on a council estate.
</quot>

<quot>
"Four months earlier I was sitting in a council house drinking tea and watching 
Oprah Winfrey on television all day."
</quot>

<http://news.bbc.co.uk/1/low/entertainment/tv_and_radio/3207926.stm>
<http://news.bbc.co.uk/1/low/england/3207822.stm>

</body>
</html>
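
In the sample above the <title> element is empty and the subject appears only
in a swishtitle meta tag, which may be why subject words never reach
swishdefault. An alternative sketch, assuming swish-e's HTML parser indexes
<title> text under swishdefault (and under swishtitle when that metaname is
declared): have the indexing program write the subject into <title>. The
helpers html_head and escape_html are hypothetical; index_mh.pl's actual code
is not shown here.

```perl
use strict;
use warnings;

# Escape the characters that are special in HTML text and attribute values.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Hypothetical helper: build the <head> for one message, putting the
# subject in <title> rather than in a swishtitle meta tag.
sub html_head {
    my (%header) = @_;
    my $head = "<html>\n<head>\n";
    $head .= '<title>' . escape_html($header{subject}) . "</title>\n";
    for my $name (qw/from date/) {
        next unless defined $header{$name};
        $head .= sprintf qq{<meta name="%s" content="%s">\n},
            $name, escape_html($header{$name});
    }
    return $head . "</head>";
}

print html_head(
    subject => "Girls Aloud's year at the top",
    from    => 'Another Name <my@name.here>',
    date    => 1066685834,
), "\n";
```

Escaping the header values also keeps addresses such as
"Another Name <my@name.here>" from being read as markup.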
Received on Tue May 11 12:24:57 2004