Back to the list.
On Fri, Feb 03, 2006 at 01:17:08PM -0600, Jon Sorensen wrote:
> > Are you sure you want to preserve entities?
> >
>
> maybe I'm misunderstanding the documentation
>
> I want entities such as &reg; to be preserved in the index
> so that when swish.cgi returns results it doesn't return the
> ascii character for &reg;, but the entity in the html.
Do you want people to be able to search for "reg" in your documents?
The entities are (in part) a way to include characters that are not
in the encoding you are delivering your web pages in.
Many of the common entities like &reg; are in Latin-1 so swish can
handle those and you don't need to use entities in your output if you
state that your encoding is 8859-1.
If you are using any entities that do not map to 8859-1 then swish
will replace those with a space. (Swish only indexes 8 bit chars).
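To make that concrete, here is a rough Python illustration of the effect described above. It is not swish's actual code, and the function name index_view is made up for this example: entities are decoded, and anything that does not fit in 8859-1 becomes a space.

```python
import html

def index_view(source: str) -> str:
    """Approximate what ends up indexed (hypothetical helper, not swish code)."""
    decoded = html.unescape(source)  # "&reg;" becomes the character U+00AE
    # Keep characters that fit in 8 bits (ISO-8859-1); replace the rest with a space.
    return "".join(ch if ord(ch) < 256 else " " for ch in decoded)

if __name__ == "__main__":
    # &eacute; and &reg; are in Latin-1, &mdash; is not.
    print(repr(index_view("Caf&eacute; &reg; &mdash; x")))
```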
If you still want to use entities then you should convert the text in
your search script back into entities when generating results. Just
like you would escape < > &.
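A minimal sketch of that conversion, in Python rather than the Perl that swish.cgi is written in, so treat it as the idea and not a drop-in patch. The function name to_entities is my own; html.escape and html.entities.codepoint2name are from the Python standard library.

```python
import html
from html.entities import codepoint2name

def to_entities(text: str) -> str:
    """Escape < > & and quotes, then turn non-ASCII characters into entities."""
    out = []
    for ch in html.escape(text, quote=True):
        cp = ord(ch)
        if cp < 128:
            out.append(ch)                       # plain ASCII passes through
        elif cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")  # named entity, e.g. &reg;
        else:
            out.append(f"&#{cp};")               # numeric reference as a fallback
    return "".join(out)

if __name__ == "__main__":
    # Text coming back from the index as Latin-1 bytes (an assumption for this example).
    raw = b"Acme\xae <b>widgets</b>"
    print(to_entities(raw.decode("latin-1")))
```

The markup characters are escaped first so that the ampersands introduced by the entity step are not escaped a second time.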
The entities are only needed when sending the text to a web browser
and your encoding does not include those characters.
--
Bill Moseley
moseley@hank.org
Received on Fri Feb 3 11:46:35 2006