Re: Kudos

From: James <swish.enhanced(at)>
Date: Mon Jan 08 2007 - 14:45:41 GMT

Maybe you could steal one of these guys:

I am sure one of them has the needed experience.  :-)  :-P

On 1/8/07, Bill Moseley wrote:
> On Mon, Jan 08, 2007 at 01:11:14AM -0800, James wrote:
> > I would be EXTREMELY happy if Swish-e was UTF-8 compatible before the
> > end of the year.  Isn't that a reasonable goal?  I think the developers
> > should shoot for September 1st (that gives them the summer to work
> > through this too) as the date to release the UTF-8 compatible Swish-e.
> > I believe once Swish-e does this, you'll receive MUCH more attention
> > (not that you don't receive a lot of attention now!).
>
> I would be happy, too.  I'm just worried that the attention I would
> get would be from the bank repossessing my house.   Swish processes
> text -- so almost all the code deals with characters.  Plus, Swish has
> been worked on by a number of developers for a decade or longer now,
> so much of the code is showing its age and would need to be
> rewritten.
>
> Total Physical Source Lines of Code (SLOC)                = 62,677
> Development Effort Estimate, Person-Years (Person-Months) = 15.42 (185.00)
> (Basic COCOMO model, Person-Months = 2.4 * (KSLOC**1.05))
> Schedule Estimate, Years (Months)                         = 1.51 (18.17)
> (Basic COCOMO model, Months = 2.5 * (person-months**0.38))
> Estimated Average Number of Developers (Effort/Schedule)  = 10.18
> Total Estimated Cost to Develop                           = $ 2,082,597
> (average salary = $56,286/year, overhead = 2.40).
> SLOCCount, Copyright (C) 2001-2004 David A. Wheeler
>
> So we either need someone with lots of time or someone with lots of
> money. ;)
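
For anyone who wants to check the arithmetic in the SLOCCount output quoted above, here is a minimal sketch of the Basic COCOMO formulas it prints. The function names are hypothetical (they are not part of SLOCCount); the constants, salary and overhead are the ones shown in the quoted output.

```python
# Sketch of the Basic COCOMO numbers quoted above.
# Function names are made up for illustration; only the constants
# (2.4, 1.05, 2.5, 0.38, salary, overhead) come from the quoted output.

def effort_person_months(sloc):
    """Person-Months = 2.4 * (KSLOC ** 1.05)"""
    ksloc = sloc / 1000.0
    return 2.4 * (ksloc ** 1.05)

def schedule_months(person_months):
    """Months = 2.5 * (person-months ** 0.38)"""
    return 2.5 * (person_months ** 0.38)

def estimated_cost(person_months, salary=56286, overhead=2.40):
    """Cost = person-years * average salary * overhead"""
    return (person_months / 12.0) * salary * overhead

if __name__ == "__main__":
    effort = effort_person_months(62677)
    schedule = schedule_months(effort)
    print("effort (person-months): %.2f" % effort)
    print("schedule (months):      %.2f" % schedule)
    print("average developers:     %.2f" % (effort / schedule))
    print("cost (USD):             %.0f" % estimated_cost(effort))
```

The printed figures should land close to the ones in the quoted report, allowing for small rounding differences.
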
>
> A few days ago I was going over in my head an idea of how to patch the
> current code to get some level of UTF-8 in it for those that need it
> now.  I was basically wondering how much the current code could work
> by just not knowing the encoding -- that is, where byte comparisons
> would be fine and where they would not (I think there are cases where
> two UTF-8 chars would be the same but have different byte values).
>
> We might have to lose wildcard searches and the ability to do
> first-letter searches for words (there's currently a 256-wide table
> that handles that).  Many of the config options might not work, the
> concept of "WordCharacters" would likely not work, we would have to
> look at a new regex engine, and so on.
>
> Basically, start pushing UTF-8 into Swish and see where things break.
> That *might* be faster than a rewrite.  Or it could be a waste of time
> as it would lead to a rewrite, just doing it the hard way.
>
> Of course, I have not tried it yet.  I've been hoping someone with
> time and UTF-8 experience might show up one of these days.  Is that
> you?
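
To make the byte-comparison and 256-wide-table worries above concrete, here is a small sketch. It is plain Python with the standard `unicodedata` module, not Swish-e code, and the helper names are hypothetical. It assumes the "same character, different bytes" case Bill has in mind is precomposed versus combining forms, which is only one way this can happen.

```python
import unicodedata

# Two spellings of "café" that render the same but differ in bytes:
composed = "caf\u00e9"     # 'é' as a single code point (U+00E9)
decomposed = "cafe\u0301"  # 'e' followed by a combining acute accent (U+0301)

def same_bytes(a, b):
    """Encoding-unaware comparison: compare the raw UTF-8 bytes."""
    return a.encode("utf-8") == b.encode("utf-8")

def same_after_nfc(a, b):
    """Normalize both strings to NFC first, then compare the bytes."""
    na = unicodedata.normalize("NFC", a)
    nb = unicodedata.normalize("NFC", b)
    return na.encode("utf-8") == nb.encode("utf-8")

def first_byte(word):
    """What a 256-entry first-letter table would see for a UTF-8 word."""
    return word.encode("utf-8")[0]

if __name__ == "__main__":
    print(same_bytes(composed, decomposed))      # raw bytes differ
    print(same_after_nfc(composed, decomposed))  # equal once normalized
    # "été" and "über" start with different letters but the same lead byte,
    # so a table indexed by the first byte cannot tell them apart.
    print(hex(first_byte("\u00e9t\u00e9")), hex(first_byte("\u00fcber")))
```

So a byte-for-byte approach could still work for exact matches if everything is normalized to one form at index and query time, but anything keyed on a single byte (like the first-letter table) would need rethinking.
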
