On Sunday 24 February 2002 06:03 pm, Fred Toth wrote:
[I seem to have failed to queue this message a few days ago -- sorry]
> It appears that when meta names are defined ("author" for example), the
> data accumulated for the "author" meta name is no longer included in the
> default meta name "swishdefault". Is this correct?
Yes. "swishdefault" is just another metaname, the one that's used if
something else doesn't match.
> Meaning, if I have <author>smith</author> somewhere in my input, and I
> search:
>
> swish-e -w author=smith
>
> I get a hit. This is good. However, if I search:
>
> swish-e -w smith
>
> I don't get a hit, since, presumably, "smith" does not exist in
> "swishdefault".
>
> So, to my questions: Is there any way to change this? I'd like
> "swishdefault" to be the "full text" of the input, including any and all
> meta names. Is this possible? If so, how do I express that in the
> configuration?
So you want -w author=smith to only search the author field, but -w smith
(which is the same as swishdefault=smith) to search all fields?
-w smith or author=smith
is the current way to do that.
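With a long list of metanames, that OR query can be generated instead of typed by hand. A minimal sketch in POSIX shell, assuming only the "or" and metaname=word syntax shown above; build_query is a hypothetical helper name, not part of swish-e:

```shell
# build_query WORD META...
# Print a query that searches WORD in swishdefault and in each listed metaname.
build_query() {
    word=$1
    shift
    query=$word
    for meta in "$@"; do
        query="$query or $meta=$word"
    done
    printf '%s\n' "$query"
}
```

It could then be used as: swish-e -w "$(build_query smith author abstract keywords)"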
There has been discussion about extending the search syntax to do something
like
-w swishdefault,author,subject=smith
to search all those listed fields. I've also thought about something like
-w *=smith
to say search all metaIDs. But for my own use I think I'd want more control --
that is, I'd want to specify which metaIDs are also part of an "ALL" search.
That means either defining which metaIDs should also be indexed as
"swishdefault", or allowing multiple-metaID searches as shown above.
> However, this could get very cumbersome if there are a lot of meta names:
>
> swish-e -w 'smith or author=smith or abstract=smith or keywords=smith'
> (etc. etc.)
One workaround is to use nested metanames in your source documents:
<html>
<head>
<title>Title</title>
<group>
<meta name="author" content="bla">
<meta name="abstract" content="foo">
</group>
metanames author abstract group
And then use libxml2 as your HTML parser.
Now, it's not always possible to change the source. There's a way to alias a
collection of metanames onto one metaname. You can say that author, abstract,
and keywords are all aliases for the "group" metaname, but then you can't
search for just "author". Being able to do both might be a nice feature.
> Another thing I thought of was to have 2 indexes. One has all my meta names
> defined,
> and the other has none. A full text query then queries the index with no
> meta names.
> Any other field specific query goes against the meta name index.
That would be an easy workaround.
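A wrapper script could pick the index from the query itself. A rough sketch in POSIX shell, assuming any query containing "=" is a field-specific one; the index file names and the pick_index helper are hypothetical:

```shell
# Hypothetical index file names; adjust to your own setup.
FULLTEXT_INDEX=fulltext.index   # built with no metanames defined
META_INDEX=meta.index           # built with all metanames defined

# pick_index QUERY
# Print the index file to search: the metaname index when the query
# names a field (contains "="), the full-text index otherwise.
pick_index() {
    case $1 in
        *=*) printf '%s\n' "$META_INDEX" ;;
        *)   printf '%s\n' "$FULLTEXT_INDEX" ;;
    esac
}
```

The search would then be run as: swish-e -f "$(pick_index "$q")" -w "$q"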
Received on Tue Feb 26 19:08:36 2002