At 11:24 AM 02/26/02 -0800, Fred Toth wrote:
>Regarding nested meta names: Is it reasonable to use "html" as
>the top level of nested meta names (using HTML2 as the IndexContents value)?
I'm not sure how reasonable it is ;)
Best to test:
> cat 1.html
<html>
<head>
<title>Titletext</title>
<group>
<meta name="meta1" content="meta1text">
<meta name="meta2" content="meta2text">
<meta name="meta3" content="meta3text">
</group>
</head>
<body>
Bodytext
</body>
</html>
> cat c
defaultcontents HTML2
metanames html meta1 meta2 meta3 group
> ./swish-e -c c -i 1.html -T indexed_words -v0
Indexing Data Source: "File-System"
Adding:[1:html(10)] 'titletext' Pos:2 Stuct:0x87 ( META HEAD TITLE FILE )
Adding:[1:html(10)] 'meta1text' Pos:6 Stuct:0x85 ( META HEAD FILE )
Adding:[1:meta1(11)] 'meta1text' Pos:6 Stuct:0x85 ( META HEAD FILE )
Adding:[1:group(14)] 'meta1text' Pos:6 Stuct:0x85 ( META HEAD FILE )
Adding:[1:html(10)] 'meta2text' Pos:9 Stuct:0x85 ( META HEAD FILE )
Adding:[1:meta2(12)] 'meta2text' Pos:9 Stuct:0x85 ( META HEAD FILE )
Adding:[1:group(14)] 'meta2text' Pos:9 Stuct:0x85 ( META HEAD FILE )
Adding:[1:html(10)] 'meta3text' Pos:12 Stuct:0x85 ( META HEAD FILE )
Adding:[1:meta3(13)] 'meta3text' Pos:12 Stuct:0x85 ( META HEAD FILE )
Adding:[1:group(14)] 'meta3text' Pos:12 Stuct:0x85 ( META HEAD FILE )
Adding:[1:html(10)] 'bodytext' Pos:16 Stuct:0x89 ( META BODY FILE )
Indexing done!
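So you can see how the words are nested: each meta content word is indexed under its own metaname, under "group", and under "html", while the title and body words are indexed only under "html". Everything has the "META" structure flag set since everything is within <html>.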
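If you want to check the nesting without reading the output by eye, you can post-process it. Here is a rough sketch in Python, assuming the "Adding:" line format stays as shown above. The function names are made up for illustration, and the flag bit values are inferred from the three hex values in that output (0x87, 0x85, 0x89), not taken from the swish-e source:

```python
import re
from collections import defaultdict

# Pattern for the "Adding:" lines printed by -T indexed_words above.
# "Stuct" is spelled the way the output spells it.
ADDING = re.compile(
    r"Adding:\[(?P<file>\d+):(?P<meta>\w+)\((?P<id>\d+)\)\]\s+"
    r"'(?P<word>[^']*)'\s+Pos:(?P<pos>\d+)\s+Stuct:0x(?P<flags>[0-9A-Fa-f]+)"
)

# Assumption: bit values inferred from the flag bytes in the output above.
FLAG_BITS = {0x01: "FILE", 0x02: "TITLE", 0x04: "HEAD", 0x08: "BODY", 0x80: "META"}


def metanames_by_word(log_text):
    """Map each indexed word to the metanames it was added under, in order."""
    words = defaultdict(list)
    for line in log_text.splitlines():
        m = ADDING.search(line)
        if m:
            words[m.group("word")].append(m.group("meta"))
    return dict(words)


def decode_flags(value):
    """Return the set of structure names whose bit is set in value."""
    return {name for bit, name in FLAG_BITS.items() if value & bit}


# A few lines copied from the output above.
SAMPLE = """\
Adding:[1:html(10)] 'titletext' Pos:2 Stuct:0x87 ( META HEAD TITLE FILE )
Adding:[1:html(10)] 'meta1text' Pos:6 Stuct:0x85 ( META HEAD FILE )
Adding:[1:meta1(11)] 'meta1text' Pos:6 Stuct:0x85 ( META HEAD FILE )
Adding:[1:group(14)] 'meta1text' Pos:6 Stuct:0x85 ( META HEAD FILE )
Adding:[1:html(10)] 'bodytext' Pos:16 Stuct:0x89 ( META BODY FILE )
"""

if __name__ == "__main__":
    for word, metas in metanames_by_word(SAMPLE).items():
        print(word, metas)
```

With the sample lines, 'meta1text' comes back under html, meta1 and group, while 'bodytext' comes back under html only, which is the nesting you are after.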
>Then, the "all" search would be:
>
> swish -w html=smith
Yep.
Note that this will only work with libxml2 linked in.
This has only been possible for the last month or so, since I relaxed the requirements for metanames -- before that, metanames could only be non-HTML tags.
--
Bill Moseley
mailto:moseley@hank.org
Received on Tue Feb 26 19:34:28 2002