For the archive:
libxml2 did solve my problem. Thanks!
pek
Bill Moseley wrote:
> On Thu, Oct 02, 2003 at 03:04:10PM -0700, Peter Karman wrote:
>
>>I have an HTML document that contains this markup:
>>
>>
>><tt CLASS="literal">
>>-h
>> [
>><span CLASS="optional">
>>no
>></span>
>>]
>>aggress
>></tt>
>>
>>
>>And I would like a search for the following phrase to find that doc:
>>
>>"-h [no]aggress"
>
>
> I guess this is a bug.
>
> moseley@bumby:~$ cat t.xml
> <xml>
> <tt CLASS="literal">
> -h
> [
> <span CLASS="optional">
> no
> </span>
> ]
> aggress
> </tt>
> </xml>
>
> Here's with the XML parser:
>
> moseley@bumby:~$ swish-e -c c -i t.xml -T indexed_words -v0
> Adding:[1:swishdefault(1)] 'h' Pos:3 Stuct:0x1 ( FILE )
> Adding:[1:swishdefault(1)] 'no' Pos:5 Stuct:0x1 ( FILE )
> Adding:[1:swishdefault(1)] 'aggress' Pos:7 Stuct:0x1 ( FILE )
>
> The thing to note is that the word position got bumped due to the tag.
>
> If I use the XML2 parser I get:
>
> moseley@bumby:~$ swish-e -c c -i t.xml -T indexed_words -v0
> Adding:[1:swishdefault(1)] 'h' Pos:7 Stuct:0x1 ( FILE )
> Adding:[1:swishdefault(1)] 'no' Pos:8 Stuct:0x1 ( FILE )
> Adding:[1:swishdefault(1)] 'aggress' Pos:9 Stuct:0x1 ( FILE )
>
> moseley@bumby:~$ swish-e -w '"-h [no] aggress"' -H0
> 1000 t.xml "t.xml" 93
>
> Can you use libxml2?
>
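A rough sketch of why the position bump in the output above matters. It assumes a phrase search only matches when the words sit at consecutive positions; that is an assumption about swish-e's behaviour, not something taken from its source. The function `phrase_match` and the two dictionaries are hypothetical, and the positions are copied from the two indexed_words listings.

```python
def phrase_match(index, phrase):
    """Return True if the words of `phrase` occur at consecutive positions.

    `index` maps each indexed word to the set of positions it was seen at.
    This is a hypothetical model of phrase matching, not swish-e's code.
    """
    words = phrase.split()
    if not words or any(w not in index for w in words):
        return False
    # Try every position of the first word as a possible phrase start.
    return any(
        all((start + i) in index[w] for i, w in enumerate(words))
        for start in index[words[0]]
    )


# Positions reported with the XML parser (bumped at each tag boundary).
xml_parser = {"h": {3}, "no": {5}, "aggress": {7}}

# Positions reported with the XML2 (libxml2) parser.
libxml2_parser = {"h": {7}, "no": {8}, "aggress": {9}}

print(phrase_match(xml_parser, "h no aggress"))
print(phrase_match(libxml2_parser, "h no aggress"))
```

Under that assumption the 3/5/7 positions cannot satisfy the phrase, while the 7/8/9 positions can, which agrees with the hit shown for the libxml2 index.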
--
Peter Karman - Software Publications Programmer - Cray Inc
phone: 651-605-9009 - mailto:karman@cray.com
Received on Fri Oct 3 20:55:09 2003