I wonder if you could set PropertyNamesNoStripChars for each property,
then append something like a ^D (that's ASCII char 04) at the end of
each <tagset>.
config:
PropertyNamesNoStripChars bar
XML:
<bar>foo^D</bar>
<bar>someelse^D</bar>
and then, like Bill suggests, split on ^D in your postprocessor.
my @bar = split(/\x04/, $bar_property);
(does that perl work?)
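
For what it's worth, here is a minimal sketch of the split step. The
helper name split_property is made up for illustration, and the sample
value assumes Swish-e still puts a space between the concatenated values
(as the docprop.c snippet quoted below suggests), so each piece gets
trimmed after the split.

```perl
use strict;
use warnings;

# Hypothetical helper: split a concatenated property value on ^D (ASCII 04).
sub split_property {
    my ($value) = @_;
    return () unless defined $value;

    # \x04 is the ^D character; split drops the trailing empty field.
    my @parts = split /\x04/, $value;

    # Assumption: Swish-e may have put a space between the joined values,
    # so strip leading/trailing whitespace from each piece.
    s/^\s+|\s+$//g for @parts;

    # Discard anything that is empty after trimming.
    return grep { length } @parts;
}

# Sample value as it might come back from the index (an assumption).
my $bar_property = "foo\x04 someelse\x04";
my @bar = split_property($bar_property);
print "$_\n" for @bar;
```

The resulting list could then be handed to the template as an array and
looped over, one value per line.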
I haven't actually tried this, but it might be an interesting experiment.
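
The pre-parse step might look something like the sketch below. The
helper name add_separator and the sample input are assumptions, and a
regex is only reasonable here because the files are machine-generated
by a stylesheet; a real XML parser would be more robust. One caveat:
^D is not a legal character in XML 1.0, so a strict parser may reject
the file. The separator is therefore a parameter, and something
printable like '|' may turn out to be safer.

```perl
use strict;
use warnings;

# Hypothetical helper: insert a separator just before every closing tag
# of the element named $tag. Defaults to ^D (ASCII 04).
sub add_separator {
    my ($xml, $tag, $sep) = @_;
    $sep = "\x04" unless defined $sep;

    # \Q...\E quotes the tag name so it is matched literally.
    $xml =~ s{(</\Q$tag\E\s*>)}{$sep$1}g;
    return $xml;
}

# Sample input modelled on the <bar> example above.
my $xml = "<bar>foo</bar>\n<bar>someelse</bar>\n";
print add_separator($xml, 'bar');
```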
pek
Thoreau Lovell supposedly wrote on 04/07/2004 11:27 AM:
> >You could either pre-parse the files and add your own separator
> >character(s) that you can split on in the view,
>
> I experimented with this, but couldn't get anything to work. Can you expand
> a bit?
>
> Thanks.
>
> At 04:54 PM 4/6/2004 -0700, Bill Moseley wrote:
>
>>On Tue, Apr 06, 2004 at 03:39:48PM -0700, Thoreau Lovell wrote:
>>
>>>I'm new to Swish-e, so please forgive another newbie question. I'm indexing
>>>a set of xml docs that have been generated by an XSLT stylesheet. Each doc
>>>represents a record for an electronic journal. The problem I'm having is
>>>that many of the records have multiple instances of a given element tag.
>>>For instance, the <provider> tag. Instead of displaying each instance
>>>separately, Swish-e concatenates them into a list when they are displayed
>>>as part of a search result.
>>
>>That's how it works[1]. I thought it was a config.h setting, but it turns
>>out to be a docprop.c setting:
>>
>>    if ( add_a_space )
>>        p->propValue[p->propLen++] = ' ';
>>
>>So you might be able to change that to a pipe, for example, and then
>>split on that pipe when displaying the property. Hmm, that actually
>>might not work because long properties might be added to the index
>>in chunks and have that pipe added to them, too.
>>
>>
>>>I'm using TemplateToolkit to generate the output. Any suggestions on how to
>>>display each <provider> instance on a separate line? Or, should I
>>>restructure the xml files? If so, how?
>>
>>You could either pre-parse the files and add your own separator
>>character(s) that you can split on in the view, or rename the tags
>>on-the-fly by preparsing <providerA> <providerB> ... and then try and
>>pull them out one-by-one. Not a very good solution, either.
>>
>>
>>[1] Swish used to internally create a linked-list of properties each
>>time a property of the same name was added to the index, but that caused
>>problems when using the SAX parser where data was added in chunks, and
>>also there was no way to print out those properties once in the index.
>>
>>--
>>Bill Moseley
>>moseley@hank.org
>
>
> Thoreau Lovell
> Digital Systems Design and Development Coordinator
> J. Paul Leonard Library, San Francisco State University
> 415-338-2285 | tlovell@sfsu.edu
--
Peter Karman - Software Publications Programmer - Cray Inc
phone: 651-605-9009 - mailto:karman@cray.com
Received on Wed Apr 7 09:57:40 2004