Re: [swish-e] First time Swish-e user with some thoughts/feedback

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Mon Feb 26 2007 - 19:44:31 GMT
On Wed, Feb 21, 2007 at 09:51:25AM -0500, Jason Purdy wrote:
> The docs say that spider.pl is a better choice, but I found that it 
> really didn't work for me.  Come to find out, spider.pl does a decoding 
> of the content and then recoding vs. swishspider just gets the content 
> and doesn't worry about coding.  Then I found out some pages of our site 
> use 'utf-8' and others use 'ISO-8859-1' and there would be the odd 
> character that couldn't decode accordingly.  I didn't find this out 
> until I hacked spider.pl:
> 
> my %opts = ( 'raise_error' => 1 );
> $content = $request->decoded_content( %opts );
> 
> When I did this, spider.pl died when it was decoding content that had a 
> charset of utf-8 and there was an odd character ("\x92") in there 
> (see error msg below).  Took me way too long to figure that one out. 
> Perhaps we should raise_error by default.  The warning that the document 
> had no content was good, but it would be better if a warning was fired 
> before that if the content couldn't be retrieved b/c it had the wrong 
> charset.

What about just printing $@?

Something like:

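    # Without raise_error, decoded_content() returns undef on failure
    # but still sets $@, so the reason can be reported here.
    unless ( $content ) {
        warn "Failed decode of $uri: $@\n" if $@;
        my $empty = '';
        output_content( $server, \$empty, $uri, $response )
            unless $server->{no_index};
        return;
    }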
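
As for the "\x92" itself: on its own that byte is not valid UTF-8, but in
Windows-1252 (cp1252) it is the right single quotation mark (U+2019), so
my guess is that those pages are really cp1252 and just labelled wrong.
Here is a rough sketch of a fallback using only the core Encode module.
decode_with_fallback is a made-up name, not something in spider.pl, and
falling back to cp1252 is my assumption about your content:

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not part of spider.pl): try strict UTF-8 first,
# then fall back to cp1252 if the bytes are not valid UTF-8.
sub decode_with_fallback {
    my ($bytes) = @_;

    # FB_CROAK makes decode() die on malformed input;
    # LEAVE_SRC keeps $bytes intact for the second attempt.
    my $text = eval {
        Encode::decode( 'UTF-8', $bytes,
            Encode::FB_CROAK | Encode::LEAVE_SRC );
    };
    return ( $text, 'UTF-8' ) if defined $text;

    warn "UTF-8 decode failed: $@" if $@;
    return ( Encode::decode( 'cp1252', $bytes ), 'cp1252' );
}

my ( $text, $charset ) = decode_with_fallback("It\x92s here");
printf "%s U+%04X\n", $charset, ord substr( $text, 2, 1 );
# prints "cp1252 U+2019"
```

The warning still fires, so you at least see which documents had to take
the fallback path.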

> 2) Using a template system
> 
> I was excited to see that you could use HTML::Template w/ the search 
> results, as that's our template language of choice, but I couldn't find 
> really good documentation on how to configure .swishcgi.conf accordingly 
> until I dove into the source code for swish.cgi.  Here is my .swishcgi.conf:
> 
> return {
>      title        => 'QSR magazine search results',
>      swish_binary => '/usr/local/bin/swish-e',
>      swish_index  => '/var/www/qsr/web/search/index.swish-e',
>      template     => {
>              package         => 'SWISH::TemplateHTMLTemplate',
>              options         => {
>                  filename            => 'swish.tmpl',
>                  path                => '/var/www/qsr/web/search',
>                  die_on_bad_params   => 0,
>                  loop_context_vars   => 1,
>                  cache               => 1,
>              },
>          },
> }
> 
> I got stuck b/c I thought the file parameter was named 'file' and was 
> its own key/value vs. being nested in 'options'.

Ya, the example in the source isn't very clear:

        xtemplate => {
            package         => 'SWISH::TemplateHTMLTemplate',
            options         => {
                filename            => 'swish.tmpl',
                path                => '@@pkgdatadir@@',
                die_on_bad_params   => 0,
                loop_context_vars   => 1,
                cache               => 1,
            },
        },

Not sure why it's not consistent.
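
If you want to catch the 'file' vs. 'filename' mix-up before running the
CGI, a small sanity check on the config hash would do it. This is only a
sketch: check_template_config is a made-up helper, not part of swish.cgi,
and it only assumes the layout shown in the two examples above (package
plus a nested options hash containing filename).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of swish.cgi): returns a list of problems
# found in the template section of a .swishcgi.conf-style hash.
sub check_template_config {
    my ($config) = @_;

    my $tmpl = $config->{template};
    return ('template section is missing')
        unless ref($tmpl) eq 'HASH';

    my @problems;
    push @problems, 'template.package is missing'
        unless $tmpl->{package};

    my $opts = $tmpl->{options};
    if ( ref($opts) ne 'HASH' ) {
        push @problems, 'template.options is missing or not a hash';
    }
    elsif ( !defined $opts->{filename} ) {
        push @problems, 'template.options.filename is missing';
    }

    # Keys that are easy to put one level too high.
    for my $key (qw(file filename path)) {
        push @problems, "'$key' belongs inside template.options"
            if exists $tmpl->{$key};
    }
    return @problems;
}

# The mistake described above: 'file' as its own key next to 'package'.
my $config = {
    template => {
        package => 'SWISH::TemplateHTMLTemplate',
        file    => 'swish.tmpl',
    },
};
print "$_\n" for check_template_config($config);
```

Since .swishcgi.conf just returns a hash ref, you could load it with
"do" and pass the result straight to a check like this.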

BTW -- did you look at the search.cgi example?  swish.cgi is kind of a
mess since it tries to do so much.  If you have Perl experience, it's
not that hard to write a script customized to your needs.

-- 
Bill Moseley
moseley@hank.org

Received on Mon Feb 26 14:41:50 2007