Re: StoreDescription question

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Thu Apr 08 2004 - 01:57:42 GMT
On Wed, Apr 07, 2004 at 01:41:26PM -0700, Phil Matt wrote:
> A quick question:
> 
> Is the StoreDescription directive used like this when the config file has already been set 
> to index only .SHTML files?
> 
> StoreDescription SHTML <body> 100

No, that first parameter specifies the *parser* it applies to.  It's kind
of like this (from the docs):

For example:

    PropertyNamesMaxLength 1000 swishdescription
    PropertyNameAlias swishdescription body

Is somewhat like

    StoreDescription HTML <body> 1000
    StoreDescription XML <body> 1000
    StoreDescription HTML2 <body> 1000
    StoreDescription XML2 <body> 1000

but StoreDescription allows setting the tag for each parser type. 
 


> 
> Or is there another way to do this? The docs only show file types of XML, HTML and 
> TXT for this directive.

Those are the available parsers: XML, HTML, and TXT, and, if linked with
libxml2, XML2, HTML2 and TXT2.

You set the default parser used with:

   DefaultContents HTML2

which says use the libxml2 parser by default.  And

   IndexContents XML2 .foo .bar

says that files ending in .foo or .bar are to be indexed using the XML2
parser.

It's slightly confusing on this point:  Documents, by default, have NO
type assigned and documents without a type assigned are indexed using the HTML* 
parser (HTML2 if available, otherwise HTML).  But, StoreDescription's
first parameter is a parser type.

What that means is that a complete config of only:

   StoreDescription HTML* <body> 1000

won't work without using DefaultContents or IndexContents to set a
document's type.  That is, even though documents are indexed by default
with the HTML* parser, StoreDescription won't work unless the document
has a type specifically assigned with IndexContents or DefaultContents.
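
For the .shtml case asked about above, a minimal sketch might look like
the following.  It assumes the .shtml files are ordinary HTML; the
IndexOnly line is only an illustration of restricting indexing to that
extension, and the 100 is the length from the original question:

   # Illustrative only: index just .shtml files
   IndexOnly .shtml

   # Explicitly assign a type to .shtml files (HTML2 if available, otherwise HTML)
   IndexContents HTML* .shtml

   # Now that the documents have a type, this applies to them
   StoreDescription HTML* <body> 100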

Does that make any sense?

-- 
Bill Moseley
moseley@hank.org
Received on Wed Apr 7 18:57:43 2004