[swish-e] Wheel reinventing - can I use swish-e?

From: Simon Waters <simonw(at)not-real.zynet.net>
Date: Wed Feb 07 2007 - 17:49:43 GMT
Hi,

We use swish-e for basic website indexing at ZyNet. It works - we are happy!

But since we don't use it for much that is clever, I've only a minimal grasp 
of what it can do.

I'm looking at a project that needs a search-engine-like component that will
extract and store metadata from pages (Lastmodified, Metanames, properties 
etc). The basic input will be a URI with some associated custom metadata that 
doesn't appear in the URI contents (although I keep suggesting this is a 
really good place to put it!). A key use of this will be generation of 
HTML/XML from the search results, including RSS feeds, but I have to build a 
lot of different types of queries.

However we plan to do a fair bit with searching metanames/properties, and we 
also want prompt insertion and deletion(!) of records. For deletion I don't
think rebuilding will be acceptable (but hey the plans are fairly fluid 
still).

I assume that deletion (or disabling) of records isn't supported yet? I
couldn't see it in the docs.

After some thinking about what we needed, and some Googling, I decided I was
specifying something that looks a lot like what swish-e does already with 
some extra bits (although I'm not sure precisely how much swish-e can do).

I guess part of it is I'm familiar with SQL relational databases, and have a 
vague feel for what I can force into an index, and a good idea of what I can 
query. But I don't have that "comfort" or knowledge of swish-e although 
(superficially at least) it looks a lot closer to what I want to achieve
(indeed it may do all of it bar a little configuring and a couple of scripts, 
and with lots of Perl bits which is a plus point for us).
 
Perhaps if I saw a few more complex examples of swish-e searches, rather than 
me feebly struggling to figure out the syntax for "-s swishlastmodified 
desc", I might feel more confident.
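
To make the sort syntax concrete, here is a minimal sketch of how such a
search might be assembled. The index file name, the query string and the
`author` metaname are assumptions for illustration (a metaname has to be
declared in the index configuration before it can be searched); `-f`, `-w`,
`-s` and `-m` are the usual swish-e switches for index file, query, sort
order and result limit.

```python
import shlex


def build_swish_command(query, index_file="index.swish-e", max_results=10):
    """Return an argv list for a swish-e search sorted newest first.

    index_file and the default limit are hypothetical values.
    """
    return [
        "swish-e",
        "-f", index_file,                    # index to search (assumed name)
        "-w", query,                         # search words, may use metanames
        "-s", "swishlastmodified", "desc",   # sort on last-modified, newest first
        "-m", str(max_results),              # cap the number of results
    ]


# 'author' is an assumed metaname; it only works if the index defines it.
cmd = build_swish_command("author=(joe bloggs) and widgets", max_results=5)
print(shlex.join(cmd))
```

The list can be handed to `subprocess.run(cmd)` unchanged, which avoids any
shell quoting problems with the parentheses in the query.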

Of course the killer is I know that the SQL databases we use will do whatever 
I ask of them, including cascading deletes. I've no idea how efficiently
they will do it, but I know that I can get them to do the task. But I've not 
used the swish properties and metanames so I'm not sure whether something is 
possible.

I guess I can always store some additional metadata in (Postgres probably) 
alongside the list of URIs to index, if my needs for additional properties
get too great, but I'd really prefer it if my queries only included one type 
of system at a time! Anyone gone with swish-e and ended up finding they had 
to do extra stuff in another database? Or am I unduly pessimistic on this
point?

I'm thinking of queries like:
URI starts with 'http://www.example.com/somesubsystem/', and author is 'Joe 
Bloggs', the most recent N sorted by date, and some other property isn't set, 
and maybe after all that require a keyword.

Or is that madness, and should I run back to an SQL database straight away?
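
For comparison, this is roughly what that query looks like on the SQL side.
It is only a sketch using Python's built-in sqlite3 module: the `pages`
table, its columns (with `category` standing in for "some other property")
and the sample rows are all made up for illustration, and the `LIKE` match
on the body is a crude stand-in for real full-text search.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per indexed URI plus its metadata.
conn.execute(
    "CREATE TABLE pages (uri TEXT PRIMARY KEY, author TEXT, "
    "last_modified TEXT, category TEXT, body TEXT)"
)
base = "http://www.example.com/"
conn.executemany(
    "INSERT INTO pages VALUES (?, ?, ?, ?, ?)",
    [
        (base + "somesubsystem/a.html", "Joe Bloggs", "2007-02-01", None, "notes on widgets"),
        (base + "somesubsystem/b.html", "Joe Bloggs", "2007-02-05", None, "more widgets here"),
        (base + "somesubsystem/c.html", "Joe Bloggs", "2007-02-06", "draft", "widgets draft"),
        (base + "other/d.html", "Joe Bloggs", "2007-02-07", None, "widgets elsewhere"),
        (base + "somesubsystem/e.html", "Jane Doe", "2007-02-07", None, "widgets by Jane"),
    ],
)


def recent_pages(conn, uri_prefix, author, keyword, limit):
    """URIs under uri_prefix by author, category unset, containing keyword,
    newest first, at most `limit` rows."""
    sql = """
        SELECT uri FROM pages
        WHERE substr(uri, 1, length(?)) = ?   -- URI starts with the prefix
          AND author = ?
          AND category IS NULL                -- the other property isn't set
          AND body LIKE '%' || ? || '%'       -- crude keyword match
        ORDER BY last_modified DESC           -- ISO dates sort correctly as text
        LIMIT ?
    """
    args = (uri_prefix, uri_prefix, author, keyword, limit)
    return [row[0] for row in conn.execute(sql, args)]


result = recent_pages(conn, base + "somesubsystem/", "Joe Bloggs", "widgets", 2)
print(result)
```

The prefix test uses `substr` rather than `LIKE` so that a `%` or `_` in the
URI is not treated as a wildcard.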

 Simon
_______________________________________________
Users mailing list
Users@lists.swish-e.org
http://lists.swish-e.org/listinfo/users
Received on Wed Feb 7 12:49:17 2007