From: Peter Flynn <pflynn(at)>
Date: Wed Dec 01 2010 - 16:18:22 GMT
There is a much more serious problem in indexing XML documents.

Unlike HTML, where the element type most people are interested in is
<body>, the element types of an XML document are not fixed, and can be
anything. In Word .docx files the text is in <w:body> and in
OpenOffice .odt files it is in <office:body>, but in other XML
documents it could be in <article>, <book>, <report>, or virtually
anything else.

Is there a way to tell the StoreDescription directive to fall back to
the root element type, whatever it happens to be, if the named element
type is not present in a particular document?

If not, can this be put on the list? If we could be sure that namespace
prefixes like w: and office: were being ignored, a syntax such as this
would work:
StoreDescription XML <body>,<article>,<book>,<>

where it would use one of the named element types if it existed, and
otherwise the <> would mean "the whole document".
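
To make the intended behaviour concrete, here is a minimal sketch of the
fallback logic in Python. It is not anything the indexer does today; the
function names (local_name, description_text) and the default candidate
list are my own assumptions, chosen only for illustration.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the '{namespace-uri}' part that ElementTree puts before a tag name."""
    return tag.rsplit("}", 1)[-1]


def description_text(xml_source, candidates=("body", "article", "book")):
    """Return the text of the first candidate element type found in the
    document, ignoring namespaces; fall back to the whole document (the
    root element) when none of the candidates is present."""
    root = ET.fromstring(xml_source)
    chosen = root  # plays the role of "<>": the whole document
    for name in candidates:
        # First element, in document order, whose local name matches.
        match = next(
            (e for e in root.iter() if local_name(e.tag) == name), None
        )
        if match is not None:
            chosen = match
            break
    # Join text nodes with spaces and collapse runs of whitespace.
    # Note: this can add a stray space around inline markup.
    return " ".join(" ".join(chosen.itertext()).split())


if __name__ == "__main__":
    # Hypothetical sample with no <body>, <article> or <book>.
    sample = "<report><title>Q3</title><para>Sales rose.</para></report>"
    print(description_text(sample))
```

The candidates are tried in the order given, so <body> takes priority
over <article> if a document happens to contain both.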

