Re: Q: Swish-E foreign language character support

From: <Rainer.Scherg(at)not-real.rexroth.de>
Date: Mon Feb 05 2001 - 23:25:43 GMT
> -----Original Message-----
> From: Kati Gäbler [mailto:katigaebler@topmail.de]
> Sent: Monday, February 05, 2001 11:34 PM
> To: Multiple recipients of list
> Subject: [SWISH-E] Re: Q: Swish-E foreign language character support

> 
> Also, I have some ideas I'd like to contribute to the Swish developers
> on this list, in case some of the features don't already exist.

feel free to do so...
project help is always appreciated ...

> For example, if an index has been created from 100 HTML pages 
> or so, all 
[...]
> however the website author feels like placing them, e.g. in 
> some META tag:
> 
> <meta name="department" content="first_floor">

Meta search is possible...


> Or within some script tag:
> 
> <script language="javascript">
> var department = 'first_floor';
> </script>
> 
> Or it could just be a keyword or phrase contained somewhere 
> in the files. 
> Would it be possible for Swish to index ONLY those FILES 
> where the spider 
> finds the keyword? or the other way, those where keyword does 
> NOT exist.

Mhh, not easy,
but this could be done via the filter feature.
Swish would still index the document path, but not the content, if the
filter returns an empty document.
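
A minimal sketch of such a filter, in Python. It assumes the filter program
is called with the path of the file to read as its first argument, and that
whatever it writes to stdout is what gets indexed; check the filter
documentation of your Swish-E version for the real calling convention. The
keyword and the function names are made up for illustration.

```python
import sys

# Hypothetical marker the site author places in the pages to be indexed.
KEYWORD = "first_floor"


def filter_document(text, keyword=KEYWORD, invert=False):
    """Return the text unchanged if it should be indexed, else an empty string.

    With invert=False the document is kept only when the keyword is present;
    with invert=True it is kept only when the keyword is absent.
    """
    found = keyword in text
    return text if found != invert else ""


def main(path):
    # latin-1 maps every byte to a character, so reading never fails on
    # pages in an unknown 8-bit encoding.
    with open(path, encoding="latin-1") as handle:
        text = handle.read()
    sys.stdout.write(filter_document(text))


# Only act as a filter when a file path was actually passed (assumed convention).
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1])
```

Documents without the keyword come out empty, so only their path ends up in
the index, as described above.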


> Another thing that might be useful would be if the spider 
> could recognize and 
> ignore any frameset files or, the reverse, index only 
> framesets, as the 
> administrator likes it. Because framesets can be pretty 
> useless from an

A few ways to do this: ;-)
  - use the same filenames for all your framesets, as a kind of template.
  - use config rules to keep certain files out of the index.
  - I've done this in my cgi search script.
    This script tries to recognize e.g. a navigation frame and will
    try to display the entire frameset as a search result.

The method you are proposing is also not easy. 


> no-frames state or not. Maybe Swish could therefore 
> avoid indexing any 
> files containing a part of a string that is always found in a 
> frameset file, e.g.:
> 
> <frame src=
> 
> Or just:
> 
> </frameset>
> 

An idea to be discussed...
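
To make the idea concrete, here is a rough sketch in Python of how a
pre-processing step outside of swish could spot frameset files by the
strings quoted above. It is not something swish does today; the function
names and the dict of pages are assumptions for illustration only.

```python
import re

# Matches an opening or closing <frameset> tag, or a <frame ... src=...> tag.
# <iframe> is deliberately not matched.
FRAMESET_PATTERN = re.compile(
    r"<\s*/?\s*frameset\b|<\s*frame\s+[^>]*\bsrc\s*=",
    re.IGNORECASE,
)


def is_frameset(html):
    """Return True if the page source looks like a frameset file."""
    return FRAMESET_PATTERN.search(html) is not None


def select_for_indexing(pages, keep_framesets=False):
    """Pick the paths to index from a {path: html} mapping.

    keep_framesets=False drops frameset files; True keeps only them.
    """
    return [path for path, html in pages.items()
            if is_frameset(html) == keep_framesets]
```

A plain string match like this can be fooled, for example by a page that
merely shows frameset markup as sample text, which is one reason the
approach needs discussion.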



> Your search for "blabla" returned "X" number of hits.
> 
> 1. Some Title Link
>    Description of the page... - [modification date]
>    http://www.blabla.com/pagethis.html
>   
> 2. Another Title
>    Description of another page.. - [modification date]
>    http://www.blabla.com/hello.html
>  
> 3. Yet Another Title
>    Description Blabla... - [modification date]
>    http://www.blabla.com/whatever.html
>  
>    etc....
> 
>    page: 1 2 3


This will be available in the next version (exactly as you described) ;-).
The new version will be able to return last modification dates and
descriptions/summaries of a document, and other information as well.

We are still in beta status and still hacking new code.
There are still some things to be done (e.g. better result sorting).

> 
> Possible features on the HTML form page could include:
> 
> Allow the user to decide, display "X" number of hits p/page, 
> from within the 
> search form, drop menu or whatever. But this parameter 
> could also be placed 
> in a hidden form tag so that the administrator can fix it.

This is also included in swish.
The main problem is, we don't have a new swiss-army-knife-cgi
search script that fits the new features of swish.
This has yet to be done.
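
As a starting point for such a script, the paging part could look roughly
like the Python sketch below. It assumes the script already holds the full
list of hits in memory; the function name and the default of 10 hits per
page are placeholders, not anything swish prescribes.

```python
import math


def paginate(hits, page=1, per_page=10):
    """Return (hits for the requested page, total number of pages).

    Out-of-range values are clamped, so a bad form parameter
    never produces an error or an empty page past the end.
    """
    per_page = max(1, per_page)
    total_pages = max(1, math.ceil(len(hits) / per_page))
    page = min(max(1, page), total_pages)
    start = (page - 1) * per_page
    return hits[start:start + per_page], total_pages
```

The per_page value could come either from a drop menu in the search form or
from a hidden form field set by the administrator, as suggested in the
quoted text.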

> 
> For example something like this:
> 
> <script language="javascript">
> function changeInput(object){
> document.chooseIndex.mySelection.value=object.options[object.selectedIndex].value;
> }
> </script>

Javascript is bad... sorry.
It causes too many problems on different browsers (and is turned off
in some cases - some companies do not even allow javascript to be
switched on in their company networks).


> And for a more complex one: I don't know what ranking system 
> Swish uses and 
> how it could be converted into something else, for example, a GIF 
> star/relevance system something like that used on the 
> following search engine would be cool!


Can be easily done in a search script (I use a "bar" system to display
the ranking...) 
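
A small Python sketch of what such a bar display might look like in a
search script. It assumes the rank is an integer where 1000 means the best
hit; adjust max_rank if your version reports ranks on another scale. The
function name and the bar characters are arbitrary choices.

```python
def rank_bar(rank, max_rank=1000, width=10):
    """Turn a numeric rank into a fixed-width text bar, e.g. '#####-----'."""
    # Clamp so unexpected values cannot break the layout.
    rank = min(max(rank, 0), max_rank)
    # Integer arithmetic with round-half-up, to avoid float rounding surprises.
    filled = (rank * width + max_rank // 2) // max_rank
    return "#" * filled + "-" * (width - filled)
```

The same number of filled cells could just as well select one of a set of
GIF star images instead of text characters.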


> http://www.irt.org/cgi-bin/htwrap?method=and&format=builtin-long&sort=score&config=htdig&restrict=&exclude=&words=search+forms
> 
> Thanks for listening to my ideas!
> 
> Regards,
> Kati


No problem. As you see, many of the things you mentioned are not new
(this is not a complaint). But the main problem in developing such
things is time (some of us have a family and also have to earn some
money... ;-).

But what do you think about the following:
  - most features you mentioned are or will be implemented into swish.
  - what is missing is a new swish-army-knife-cgi as you described.
  - would you like to contribute some code for the cgi?


cu - rainer


Received on Mon Feb 5 23:31:18 2001