Ok, you programmers. Anyone take compiler theory classes and studied
recursive descent parsers?
I'm back at looking at rewriting the swish-e query parsing code.
I've been reading a bit about recursive descent parsers but seem to have
a block in implementing one. Maybe someone here can help me past that
block.
Erik Corry offered up this almost a year ago:
http://swish-e.org/archive/4643.html
but I'm not seeing how the final parse tree should look from that
example.
We already have scanning code to tokenize a query string into an array
of search words and operators. So now I'm trying to work on the code to
generate the parse tree.
For a simple query like
foo and bar
I can see a tree with three nodes, with AND at the top and two leafs
of "foo" and "bar".
But how to then represent these two queries?:
foo and bar or baz
foo and (bar or baz)
Swish-e currently only has three boolean operators, "and", "or", and
"phrase_and". phrase_and is the special case of the and where the
position numbers must be only one apart.
Parenthesis and double quotes are for grouping -- I guess they represent
a sub-query.
The "not" operator is really just a search modifier. It simply negates
the results of a list of results, which can be either a single term or
applied to the result of a sub-query started by parenthesis or phrase
quotes. Metanames are somewhat similar in that they can apply to a
single search words, or a group.
foo and meta=bar or baz # only on bar
foo and meta=(bar or baz) # apply to both
I'm not sure what this should do:
foo and meta="some phrase"
BTW -- the Swish-e distribution contains a Perl module to parse a query
into words based on metanames. It's somewhat what I'm trying to do, but
I want to end up with a parse tree that can then be passed off to the
search code for processing.
http://cvs.sourceforge.net/viewcvs.py/*checkout*/swishe/swish-e/example/modules/SWISH/ParseQuery.pm?content-type=text%2Fplain&rev=1.1
--
Bill Moseley
moseley@hank.org
Received on Fri Nov 14 20:39:22 2003