Question: indexing of html pages containing tabular numeric data

From: Bruce Rodney <brodney(at)not-real.exprodat.com>
Date: Thu Jun 13 2002 - 22:35:57 GMT
I've just started with swish-e 2.0.5, compiled and running under Solaris 8
with initial quick success. It's great! But I can't figure out how to
achieve optimal indexing for the following scenario:

The HTML files contain header info plus tables of floating point data, e.g.
a coordinate such as 1532727.45. The tables contain hundreds of rows of this
rather tedious data, which I wish to EXCLUDE from the index. My initial
approach was to use BeginCharacters to index only words starting with [a-z].
The problem is that there may be valid keywords in the header info above the
table which are in fact integers, e.g. a unique object identifier such as
102320000. So I really want to let integers be indexed, but not floats.
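
To make the integer-versus-float distinction concrete, here is a minimal
sketch in Python of a pre-processing filter that could run over the text
before indexing. It is independent of swish-e; the names FLOAT_RE,
keep_token and filter_text are hypothetical, and it assumes a "float" is
simply digits, a decimal point, then more digits, with an optional sign.

```python
import re

# Assumption: a float token is an optional sign, digits, a dot, digits.
# Plain integers such as 102320000 do not match and are therefore kept.
FLOAT_RE = re.compile(r"[+-]?\d+\.\d+")


def keep_token(token: str) -> bool:
    """Return True for tokens that should stay in the text to be indexed."""
    return FLOAT_RE.fullmatch(token) is None


def filter_text(text: str) -> str:
    """Drop whitespace-separated float tokens and keep everything else."""
    return " ".join(t for t in text.split() if keep_token(t))
```

For example, filter_text("id 102320000 x 1532727.45") keeps the identifier
102320000 and drops the coordinate 1532727.45. A float with trailing
punctuation (e.g. "1532727.45,") would slip through this simple pattern.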

I've scoured the documentation and the header files and hope I've not missed
something obvious... any help much appreciated.

Bruce Rodney
--
brodney@exprodat.com
+1 303 972 4236 (w)
+1 720 394 9404 (c)

Received on Thu Jun 13 22:40:00 2002