
Development

The current stable release is 2.4.7.

Swish-e is continually under development. This page contains a laundry list of requested features planned for a future Swish-e release. To request new features, bug fixes, or (best of all) to submit code patches, send e-mail to the Swish-e mailing list.

Daily Builds

Swish-e source is available for anonymous public download from the swish-e subversion server.

The daring and adventurous can download the daily build snapshot from the swish-daily page. This is not an official release of Swish-e, but rather the current development version. There is no guarantee that these packages will build or run. Please do not use this code in production.

For Windows development binary (pre-compiled) snapshots, please see http://www.webaugur.com/wares/files/swish-e/daily/.
The most current Windows development version is here.

Questions regarding daily development builds, or about using Swish-e in general, should be directed to the Swish-e mailing list.

Features planned for 2.6

  • Remove Expat and other older parsers. Libxml2 will be the default (and only) parser.
  • Remove -S http method.
  • Documentation overhaul.

Features planned for 3.0

Swish-e 3.0 (abbreviated Swish3) will be a complete overhaul of the code. You can track development progress here. Major feature improvements will include:

Unicode support
Unicode is the international standard for character encoding. Swish3 will implement support for the UTF-8 encoding of Unicode, which can represent every Unicode code point (1,114,112 under RFC 3629; the original UTF-8 design allowed up to 2,147,483,648) and should therefore handle all major languages in the world. The Swish-e developers need input from non-English language experts. Please contribute to the discussion on the Swish-e mailing list. Some significant known issues include:

lowercase vs. UPPERCASE
Version 2.x uses tolower() to lowercase all characters before searching and indexing. Should the same approach be used for UTF-8? Will this have significant impact on usability for non-English languages?
Wildcards
Version 2.x uses an internal table to support wildcard searching with *. The table assumes 8-bit (non-Unicode) character encoding. That approach will likely need to be re-thought for multibyte encodings like UTF-8.
Tokenizing
Version 2.x uses five different configuration options to control how a 'word' (token) is defined. The basic assumption is that a word is defined by which characters it includes. That assumption is based on a manageable character set of 256 characters. However, the sheer size of the Unicode character set makes that system unworkable. Instead, some kind of regular expression library will likely be used.
Stemming
The stemmers used will need full international support.
Configuration format
Since Swish-e depends on a configuration file for StopWords, Character definitions, etc., the parsing of the configuration file must support UTF-8 as well. The current idea is to switch to XML-style configuration files and use Libxml2 to parse them.
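The case, wildcard, and tokenizing questions above can be made concrete with a small sketch. It is written in Python purely for illustration and is not Swish3 code or a proposed design. The names `TOKEN_RE`, `normalize`, `tokenize`, and `match_prefix` are hypothetical, and the choices of NFC normalization, Unicode case folding, and a `\w+` token pattern are assumptions made for the example.

```python
import re
import unicodedata

# Hypothetical token pattern: a run of Unicode "word" characters.
# For str patterns, Python's re module treats \w as Unicode-aware.
TOKEN_RE = re.compile(r"\w+")


def normalize(text: str) -> str:
    """Normalize to NFC, then apply Unicode case folding."""
    return unicodedata.normalize("NFC", text).casefold()


def tokenize(raw: bytes) -> list[str]:
    """Decode UTF-8 bytes and split them into case-folded tokens."""
    text = raw.decode("utf-8")
    return TOKEN_RE.findall(normalize(text))


def match_prefix(terms: list[str], pattern: str) -> list[str]:
    """Match a trailing-* wildcard by comparing characters, not bytes."""
    prefix = normalize(pattern.rstrip("*"))
    return [term for term in terms if term.startswith(prefix)]


if __name__ == "__main__":
    print(tokenize("Straße ÉCOLE naïve".encode("utf-8")))
    # prints ['strasse', 'école', 'naïve']
    print(match_prefix(["école", "ecole", "strasse"], "ÉCO*"))
    # prints ['école']
```

Case folding differs from a per-character tolower(): the German "ß" folds to "ss", so "Straße" and "STRASSE" produce the same token. Whether that behavior is desirable for a given language is exactly the kind of question the developers are asking above.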
Incremental indexing
Swish3 will support true incremental indexing, allowing document records to be added, modified, and deleted in an existing index. This feature may or may not build on the version 2.x experimental btree/incremental feature.
Scaling
Swish3 will reliably scale to larger (multimillion) document collections.
Indexing API
Swish3 will include an indexing API in addition to the current searching API.
Streamlined feature set
Swish3 will not contain several features in the current version:
  • Expat parsers
  • -S http indexing method and related configuration options
  • Older stemmers
  • Current native index format
Alternate index backends
Swish3 will offer alternate index backends using available open source libraries, such as Xapian, HyperEstraier, Lucene, or Lemur.
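To illustrate what incremental indexing means in practice (adding, modifying, and deleting document records without rebuilding the index), here is a minimal in-memory sketch in Python. It is not the Swish3 design or any backend's API. The class `IncrementalIndex` and its methods are hypothetical, and whitespace tokenization is assumed only to keep the example short.

```python
from collections import defaultdict


class IncrementalIndex:
    """Toy inverted index that supports add, modify, and delete."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of document ids
        self.doc_terms = {}               # document id -> set of its terms

    def add(self, doc_id, text):
        """Add a document; re-adding an existing id replaces its record."""
        if doc_id in self.doc_terms:
            self.delete(doc_id)
        terms = set(text.casefold().split())
        self.doc_terms[doc_id] = terms
        for term in terms:
            self.postings[term].add(doc_id)

    def delete(self, doc_id):
        """Remove a document and drop any posting lists left empty."""
        for term in self.doc_terms.pop(doc_id, set()):
            docs = self.postings[term]
            docs.discard(doc_id)
            if not docs:
                del self.postings[term]

    def search(self, term):
        """Return the sorted ids of documents containing the term."""
        return sorted(self.postings.get(term.casefold(), set()))


if __name__ == "__main__":
    index = IncrementalIndex()
    index.add("a.html", "swish indexing")
    index.add("b.html", "swish searching")
    print(index.search("swish"))   # prints ['a.html', 'b.html']
    index.add("a.html", "xapian backend")  # modify an existing record
    print(index.search("swish"))   # prints ['b.html']
```

Keeping a per-document term set is what makes modification and deletion cheap here: the index knows which posting lists to touch without scanning every term. An on-disk index would need an equivalent structure, which is one reason a new index format or an alternate backend is under consideration.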


The Players

You can't tell the players without a program. And we wouldn't have a program without all these players! All these folks have made key contributions to Swish-e. If you are not listed here, and you should be, drop us a line.

On the Field

Bill Moseley
The person leading the charge. Rewrote much of the documentation and bundled it with the distribution (you now know who to complain to), added the "prog" document source feature, added Expat and libxml2 parsers, redesigned properties, and added many new and exciting features.
Jose Manuel Ruiz
Jose added phrase searching and has made huge contributions toward speed and memory usage improvements. He added result sorting, improved metanames and properties, merging, and searching. Swish is the powerful program it is today because of Jose. And there's more coming!
David Norris
David has provided ports to all flavors of Windows, as well as a Swish-e interface script written in PHP3. The Windows version is now bundled with a self-installer, making installation just a click away.
Peter Karman
Peter added improvements to the ranking code and a new website design. His main role is creating more work for Bill.
Roy Tennant
Roy was the one who originally rescued SWISH when Kevin Hughes, the original author, was no longer supporting it. He has remained active in the effort since the beginning, but can't code in C to save his life, and therefore must remain content with web site support and other such minor tasks.

Hall of Fame

Bill Meier
Bill improved the ranking code, and provided much help in memory optimizations and indexing speed.
Rainer Scherg
Rainer has worked on Swish-e for many years. Rainer added Swish-e's filters providing ways to index many document types. Rainer also added the powerful "-x" feature to easily control Swish-e's output.
Giulia Hill
Giulia was the first programmer to tackle upgrading SWISH to Swish-e, back when it was a project of the UC Berkeley Library. Without her, we would not have gotten out of the starting gate.
Ron Klatchko
Ron added the crawling capability to Swish-e, subsequently enhanced by others.
Kirk Hastings
Kirk programmed a neat Perl-based tool called "AutoSwish" that allowed anyone to easily set up and maintain indexes from a web page. Unfortunately, this program is no longer a part of the release due to security issues.
Bas Meijer
Bas has been an active member of the Swish-e team since 1999, providing code enhancements and user support. He converted Swish-e's build process to a GNU Autoconf configure script and ported Swish-e to a number of platforms. Bas has also provided add-on scripts to the Swish-e user community.
Marc Gaulin
Marc added code to support the document properties and stemming features, among other things.
Warren Jones
Prentiss Riddle, Rice University
The source of a number of SWISH bug fixes that were implemented in the first Swish-e release.
Mark Seiden

We owe a debt of gratitude to Kevin Hughes, without whom there would be no SWISH, and definitely no Swish-e. His dedication to building useful tools and making them widely available should be an inspiration to us all.