Re: Problem Indexing META Tags

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Thu Jan 02 2003 - 19:48:33 GMT
On Thu, 2 Jan 2003 Jeffrey.Grunstein@ny.frb.org wrote:

> I am trying to index META tags.  I am starting with the DESCRIPTION and
> KEYWORDS tags, and will
> soon expand to our own META tags which we intend to start using.
> 
> I can't get pages that have a search term in the KEYWORDS tag to
> appear if that search term appears only in the META tag.

If I follow... that's right.  Metanames are separate "fields", and if you
have a word in a metaname called "description" then you can only
search for it in that metaname.  That is, if "foo" appears only in the
description meta tag:

   -w description=foo

will find it, but

   -w foo 

will not find it.  To search for "foo" in both the default field and the
description metaname you have to say

   -w foo or description=foo
   -w swishdefault=foo or description=foo (same thing)

The one "exception" is <title>: it is stored in the "swishdefault"
metaname, and specifying swishtitle in MetaNames, as you do below, places
it in both the "swishdefault" and "swishtitle" metanames.

> MetaNames swishtitle swishdocpath description keywords

Ok.



> I'm also trying to add an option to the search form to select which META
> tags to search in.  Here's what I have but
> I can't figure out what to do with MetaName => 'site'.  I have no metaname
> defined as site.

No, you have that wrong.  What "select_by_meta" does is limit a search
based on the *value* of a meta tag.  For example, you might have a
metaname called "department" and then each document might have:

 <meta name="department" content="accounting">

And you have a limited selection of departments (sales, support,
marketing).  So then select_by_meta allows a checkbox selection for what
departments to limit the search to.
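
As a rough sketch of that setup: the fragment below assumes the
select_by_meta keys used in the stock swish.cgi example (method, columns,
metaname, values, labels, description), so check your copy of swish.cgi
for the exact names.  The "department" metaname, its values and the label
text are hypothetical, and "department" would also need to be listed in
MetaNames when indexing.

```perl
use strict;
use warnings;

# Hypothetical fragment of a swish.cgi configuration hash.
# Key names follow the stock swish.cgi example as best recalled; verify locally.
my %config = (
    select_by_meta => {
        method      => 'checkbox_group',   # one checkbox per department
        columns     => 3,
        metaname    => 'department',       # assumed metaname, indexed via MetaNames
        values      => [qw/ accounting sales support marketing /],
        labels      => {
            accounting => 'Accounting',
            sales      => 'Sales',
            support    => 'Support',
            marketing  => 'Marketing',
        },
        description => 'Limit search to these departments: ',
    },
);
```

The values listed must match the content="..." strings that actually
appear in the documents, since the limit is applied to the indexed value.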

The example in swish.cgi uses "site" because it's using the "ExtractPath"
directive to create a metaname "site" based on part of the path to the
file -- and in that specific case it's organizing the Apache documentation
into groups based on the path name (misc, mod, vhosts, other).

If you want an option that selects which field (i.e. metaname) to
search in, simply use:

   metanames => [ qw/ description swishdefault swishtitle / ],

Then, if you want friendlier labels, use the name_labels hash.

Also, you can use the meta_groups setting to search in more than one
metaname at a time.
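
Putting those three settings together might look roughly like the
fragment below.  This is a sketch, not a tested configuration: the group
name "all", the label text, and the idea that a group name is listed in
metanames alongside ordinary metanames are assumptions based on the
comments in the stock swish.cgi, so compare against your own copy.

```perl
use strict;
use warnings;

# Hypothetical fragment of a swish.cgi configuration hash.
my %config = (
    # Fields offered in the search form; "all" is an assumed group name.
    metanames   => [qw/ all swishdefault swishtitle description keywords /],

    # Friendlier names to display for each field (label text is made up).
    name_labels => {
        all          => 'All fields',
        swishdefault => 'Title & Body',
        swishtitle   => 'Title',
        description  => 'Description',
        keywords     => 'Keywords',
    },

    # Selecting "all" searches every metaname in the group at once.
    meta_groups => {
        all => [qw/ swishdefault description keywords /],
    },
);
```

With a group like this, picking "all" should behave like the manual
"-w swishdefault=foo or description=foo" query above, extended to the
keywords metaname as well.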

Yes, it's confusing.  Too many options.


-- 
Bill Moseley moseley@hank.org
Received on Thu Jan 2 19:48:41 2003