Bernhard Weisshuhn wrote on 9/10/04 9:08 AM:
> On Fri, Sep 10, 2004 at 06:54:30AM -0700, Peter Karman <karman@cray.com> wrote:
>
>
>>Bill Moseley wrote on 9/9/04 2:04 AM:
>>
>>
>>>Which, of course, we use the SAX interface. I also see on
>>>
>>> http://www.xmlsoft.org/html/index.html
>>>
>>>that our SAX usage of libxml2 is deprecated. Looks like a trip to the
>>>xml list might be in my future.
>>
>>If you do consider rewriting swish-e to use the DOM interface, consider
>>making it optional/configurable. I suspect that folks use swish-e with
>>XML that might be derived from a database (which SAX seems better for),
>>as well as 'real' XML documents (which DOM seems better for -- as in
>>this case with resolving entities).
>
>
> I seriously doubt whether using the DOM interface would solve more problems
> than it would create. Some XML files get *huge*, and might be indexed
> for exactly that reason. Indexing huge files via DOM will drive
> indexing speed down and resource requirements up. Maintaining both
> interfaces within swish-e drives the load on our cherished developers
> up, something we also don't want, do we?
>
> I personally find filtering stuff through xmllint acceptable; swish-e
> users are used to filtering all kinds of documents prior to indexing.
Point(s) well taken (esp. the cherished developers). Filtering through
xmllint seems like a better solution and is consistent with the filter
model.
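
For what it's worth, a filter along those lines could be quite small. The
sketch below is only an illustration, not anything shipped with swish-e: it
assumes xmllint is on the PATH, that the filter is handed the document path
as its first argument, and that the indexer reads the filtered document from
standard output. The function names are made up for the example. The only
xmllint option used is --noent, which substitutes entity references with
their values.

```python
import subprocess
import sys


def build_xmllint_command(xml_path):
    """Build the argument list for an entity-expanding xmllint run.

    --noent tells xmllint to replace entity references with their values.
    """
    return ["xmllint", "--noent", str(xml_path)]


def expand_entities(xml_path):
    """Run xmllint on xml_path and return the expanded document as bytes.

    Raises subprocess.CalledProcessError if xmllint exits with an error,
    for example when the document is not well-formed.
    """
    result = subprocess.run(
        build_xmllint_command(xml_path),
        check=True,
        capture_output=True,
    )
    return result.stdout


# Assumed calling convention: the document path arrives as the first
# argument and the filtered XML is written to standard output.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.stdout.buffer.write(expand_entities(sys.argv[1]))
```

Since xmllint does the entity resolution before the document reaches the
indexer, the SAX-based parsing inside swish-e would not need to change.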
--
Peter Karman 651-605-9009 karman@cray.com
Received on Fri Sep 10 07:42:27 2004