On Fri, Sep 10, 2004 at 06:54:30AM -0700, Peter Karman <karman@cray.com> wrote:
> Bill Moseley wrote on 9/9/04 2:04 AM:
>
> > Which, of course, we use the SAX interface. I also see on
> >
> > http://www.xmlsoft.org/html/index.html
> >
> > that our SAX usage of libxml2 is deprecated. Looks like a trip to the
> > xml list might be in my future.
>
> If you do consider rewriting swish-e to use the DOM interface, consider
> making it optional/configurable. I suspect that folks use swish-e with
> XML that might be derived from a database (which SAX seems better for),
> as well as 'real' XML documents (which DOM seems better for -- as in
> this case with resolving entities).

I seriously doubt that using the DOM interface would solve more problems
than it would create. Some XML files get *huge*, and might be indexed
for exactly that reason. Indexing huge files via DOM will drive
indexing speed down and resource requirements up. Maintaining both
interfaces within swish-e drives the load on our cherished developers
up, something we also don't want, do we?
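
To illustrate the resource argument, here is a minimal sketch in Python's
standard library. It is not swish-e code, and the names ElementCounter,
count_with_sax and count_with_dom are made up for this example. Both
functions compute the same element and text counts, but the SAX version
only keeps two integers while the document streams past, whereas the DOM
version has to build the whole tree in memory before it can look at it.

```python
import io
import xml.dom.minidom
import xml.sax


class ElementCounter(xml.sax.ContentHandler):
    """Hypothetical SAX handler: state is two counters, nothing else."""

    def __init__(self):
        super().__init__()
        self.elements = 0
        self.chars = 0

    def startElement(self, name, attrs):
        # Called once per opening tag as the parser streams the input.
        self.elements += 1

    def characters(self, content):
        # Text may arrive in several chunks; summing lengths is chunk-safe.
        self.chars += len(content)


def count_with_sax(xml_bytes):
    """Stream the document; memory use does not grow with the tree size."""
    handler = ElementCounter()
    xml.sax.parse(io.BytesIO(xml_bytes), handler)
    return handler.elements, handler.chars


def count_with_dom(xml_bytes):
    """Build the full tree first, then walk it; memory grows with the file."""
    doc = xml.dom.minidom.parseString(xml_bytes)
    elements = 0
    chars = 0
    stack = [doc]
    while stack:
        node = stack.pop()
        if node.nodeType == node.ELEMENT_NODE:
            elements += 1
        elif node.nodeType == node.TEXT_NODE:
            chars += len(node.data)
        stack.extend(node.childNodes)
    doc.unlink()  # release the tree explicitly
    return elements, chars
```

On a small file the two are interchangeable; the difference only shows
once the input is large enough that the tree itself becomes the problem.
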
I personally find filtering stuff through xmllint acceptable; swish-e
users are used to filtering all kinds of documents prior to indexing.

Just my 2 cents though,
Bernie
Received on Fri Sep 10 07:09:15 2004