Hi Rainer,
On 18 Jul 2000, at 7:34, Rainer.Scherg@rexroth.de wrote:
> >2. Add Files to the index file
> >3. Delete Files from the index file
> maybe useful in some cases.
> Add can be implemented as "index a file and merge indexes" (?)
>
Merging can work fine with small index files. But for large
index files, merging is a very memory-heavy process: it reads the
original files into memory and creates a different one.
> >4. Better XML integration
> IMO necessary, but not only XML.
> See other mail from me. - There are new formats upcoming - like wml (WAP).
Sure, we are also working with wml.
I like your idea of implementing something like:
IndexContents HTML .html .htm .shtml .htm. .html. .shtml.
IndexContents XML .xml
IndexContents WAP .wap
IndexContents TXT .txt .txt.
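
A rough sketch of how such lines could be read, just to make the idea concrete. This is Python, not swish-e code; the function names, the fallback to TXT, and matching each suffix at the end of the filename are all my assumptions. How the trailing-dot forms (like .html.) should match is left open here.

```python
# Hypothetical sketch: map file suffixes to a parser type from
# "IndexContents <TYPE> <suffix> ..." lines. Not actual swish-e code.

def parse_index_contents(config_text):
    """Return a dict mapping each suffix to its parser type."""
    suffix_to_type = {}
    for line in config_text.splitlines():
        fields = line.split()
        if len(fields) < 3 or fields[0] != "IndexContents":
            continue  # skip blank lines, comments and other directives
        parser_type = fields[1]
        for suffix in fields[2:]:
            suffix_to_type[suffix.lower()] = parser_type
    return suffix_to_type


def parser_for(filename, suffix_to_type, default="TXT"):
    """Pick the parser type for a filename; the longest matching suffix wins.

    The default of "TXT" is an assumption, not something decided yet.
    """
    name = filename.lower()
    matches = [s for s in suffix_to_type if name.endswith(s)]
    if not matches:
        return default
    return suffix_to_type[max(matches, key=len)]
```

With the four lines above as input, parser_for("doc.xml", ...) would pick the XML parser and an unknown suffix would fall back to the default.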
>
>
> >5. Multidocument Files. This will allow to write filters for
> >SQL databases. Needs to define a document separator.
> ?? sorry, to understand this fully, I need an example.
> Basically, you can use the filter feature to index e.g. a database.
> But you need a search (cgi script), which generates a proper URL
> to retrieve and display this information stored in the db.
Here is a file with two documents (using a line with '---' as a
separator):
<meta1>Doc 1</meta1>
<meta2>Some text text</meta2>
---
<meta1>Doc 2</meta1>
<meta2>Some text text</meta2>
E.g., this can be obtained via an SQL report but, at present, using the
filter option, it will be indexed as just one file.
To store these documents in the index file, we can add the start
position in the file to the entry description. So, results can look
like this:
rank filename title start size [props]
start will be 0 for the most common case: one document
per file. The problem is backwards compatibility.
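
To illustrate the start/size idea, here is a minimal sketch in Python (not swish-e code) that finds the byte range of each document in such a file. The function name is hypothetical, and I am assuming the separator is a line containing only '---' and that start and size are counted in bytes.

```python
# Hypothetical sketch: locate each document in a multi-document file
# so that (start, size) could be stored with the index entry.

def split_documents(data, separator=b"---"):
    """Yield (start, size) byte ranges for each document in `data`.

    A document ends at a line that contains only `separator`.
    Empty documents (two separators in a row) are skipped.
    """
    start = 0  # byte offset where the current document begins
    pos = 0    # byte offset of the line being examined
    for line in data.splitlines(keepends=True):
        if line.strip() == separator:
            if pos > start:
                yield (start, pos - start)
            start = pos + len(line)  # next document starts after the separator
        pos += len(line)
    if pos > start:
        yield (start, pos - start)  # last document has no trailing separator
```

A file without any separator line gives a single range starting at 0, which is the one-document-per-file case.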
> >10. Option to retrieve documents with words highlighted
> >in some way.
> This is to be done in the search cgi script and the output generator...
> Could be easily done for static html files - but not for dynamic or
> SSI files (you would need a postprocessor for the searchengine build
> into the webservers output stream).
>
You are right. I was thinking of static HTML or text.
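
For the plain text case, something along these lines might be enough. It is only a sketch in Python with a hypothetical function name; it assumes whole-word, case-insensitive matching and <b> tags as the highlight, and it does not handle existing HTML markup (static HTML would need the tags skipped first).

```python
import html
import re

# Hypothetical sketch: highlight search words in plain text and
# produce HTML output. Not part of swish-e.

def highlight(text, words, open_tag="<b>", close_tag="</b>"):
    """Escape plain text for HTML and wrap whole-word matches in tags."""
    if not words:
        return html.escape(text)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(w) for w in words) + r")\b",
        re.IGNORECASE,
    )
    parts = []
    last = 0  # end of the previous match
    for m in pattern.finditer(text):
        parts.append(html.escape(text[last:m.start()]))
        parts.append(open_tag + html.escape(m.group(0)) + close_tag)
        last = m.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)
```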
>
> >12. Stemming modules for non english languages.
> Ok would be nice, but how to configure?
>
Perhaps something like this in the config file...
Stemming german
# For backwards compatibility Stemming Yes will be english
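
The backwards-compatible reading of the value could be sketched like this (Python, hypothetical function name). Treating "No" or an empty value as "stemming off" is my assumption; only "Yes" meaning english comes from the proposal above.

```python
# Hypothetical sketch: interpret the value of a "Stemming" directive.

def stemming_language(value):
    """Map a Stemming config value to a language name, or None if off."""
    v = value.strip().lower()
    if v in ("", "no"):
        return None       # assumption: "No" or empty disables stemming
    if v == "yes":
        return "english"  # backwards compatibility: Yes means english
    return v              # otherwise the value names the language
```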
cu
Jose
Received on Wed Jul 19 10:01:49 2000