Re: [swish-e] Parsing plain text emails to use the subject line as the title

From: Troy Wical <troy(at)not-real.wical.com>
Date: Sat Mar 20 2010 - 20:35:50 GMT
On Mar 19, 2010, at 10:29 PM, Peter Karman wrote:

> I wrote a MailFS aggregator tonight and uploaded to CPAN just now as
> SWISH::Prog 0.44 (ignore 0.43 if you see that -- I forgot to register
> the new aggregator so it can be used with swish3 -S).
>
> If you want to use it before your cpan mirror syncs, you can grab from
> here:
>
> http://svn.swish-e.org/perl/SWISH-Prog/trunk

I upgraded SWISH::Prog, and the build appeared to run into a speed bump,
though I'm not sure whether it's a deal killer for swish3.

#####################################################################
#####################################################################
<snipped>
Installing /usr/local/lib/perl5/5.8.8/man/man3/SWISH::Prog::Native::Searcher.3
Installing /usr/local/lib/perl5/5.8.8/man/man3/SWISH::Prog::Searcher.3
Installing /usr/local/lib/perl5/5.8.8/man/man3/SWISH::Prog::Aggregator::Mail.3
Installing /usr/local/lib/perl5/5.8.8/man/man3/SWISH::Prog::Native::InvIndex.3
Appending installation info to /usr/local/lib/perl5/5.8.8/mach/perllocal.pod
   KARMAN/SWISH-Prog-0.44.tar.gz
   /usr/bin/make install  -- OK
Failed during this command:
  MSCHWERN/ExtUtils-MakeMaker-6.56.tar.gz      : install NO -- is only 'build_requires'
#####################################################################
#####################################################################


> You'd change your command to be:
>
> % swish3 -S mailfs -c file.config -i path/to/mail

That command core dumped.

#####################################################################
#####################################################################
[/home/mail-archive/search]# swish3 -d -v 3 -S mailfs -c test.conf -i /home/mail-archive/test/
{
   Debug           => 0,
   Format          => "native",
   Headers         => 1,
   Limit           => [],
   Merge           => undef,
   Source          => "fs",
   Version         => 0,
   Warnings        => 2,
   aggregator      => "mailfs",
   begin           => 0,
   config          => "test.conf",
   debug           => 1,
   extended_output => undef,
   folder          => "index.swish3",
   help            => 0,
   indexer         => "native",
   input           => 1,
   invindex        => "index.swish3",
   links           => 0,
   max             => undef,
   newer_than      => undef,
   query           => "",
   sort_order      => "",
   test_mode       => 0,
   verbose         => 3,
}
creating indexer: SWISH::Prog::Native::Indexer at /usr/local/lib/perl5/site_perl/5.8.8/SWISH/Prog.pm line 114.
creating aggregator: SWISH::Prog::Aggregator::MailFS at /usr/local/lib/perl5/site_perl/5.8.8/SWISH/Prog.pm line 139.
do {
   my $a = bless({
     _start     => 1269116682,
     aggregator => bless({
       _ext_re          => qr/\.(html|htm|xml|txt|pdf|ps|doc|ppt|xls|mp3|css|ico|js|php)(\.gz)?/i,
       _mailer          => bless({
         _start => 1269116683,
         debug => 1,
         doc_class => "SWISH::Prog::Doc",
         indexer => bless({
           _start    => 1269116682,
           config    => bless({
             DefaultContents                   => ["TXT*"],
             "IgnoreTotalWordCountWhenRanking" => [0],
             IndexDir                          => ["/home/mail-archive/test"],
             IndexFile                         => ["/home/mail-archive/search/test.index"],
             IndexReport                       => [1],
             MetaNameAlias                     => ["swishdefault mail"],
             MetaNames                         => { url => 1 },
             PropertyNames                     => { url => 1 },
             ReplaceRules                      => [
               "replace \"/home/mail-archive/\" \"http://type2.com/mail-archives/\"",
             ],
             StoreDescription                  => ["XML* <body>"],
             _start                            => 1269116682,
             debug                             => 0,
             verbose                           => 0,
           }, "SWISH::Prog::Config"),
           debug     => 1,
           exe       => "swish-e",
           invindex  => bless({
             _start  => 1269116682,
             clobber => 0,
             debug   => 0,
             file    => bless({
               dir => bless({ dirs => ["index.swish3"], file_spec_class => undef, volume => "" }, "Path::Class::Dir"),
               file => "index.swish-e",
               file_spec_class => undef,
             }, "Path::Class::File"),
             path    => bless({ dirs => ["index.swish3"], file_spec_class => undef, volume => "" }, "Path::Class::Dir"),
             verbose => 0,
           }, "SWISH::Prog::Native::InvIndex"),
           test_mode => 0,
           verbose   => 3,
         }, "SWISH::Prog::Native::Indexer"),
         progress_size => 1000,
         swish_filter_obj => bless({
           doc_class    => "SWISH::Filter::Document",
           filters      => [
             bless({
               _mimetypes => bless({}, "SWISH::Filter::MIMETypes"),
               gz => { perl => 1 },
               mimetypes => [qr|application/x-gzip|],
               type => 1,
             }, "SWISH::Filters::Decompress"),
           ],
           mimetypes    => bless({}, "SWISH::Filter::MIMETypes"),
           skip_filters => {},
         }, "SWISH::Filter"),
         verbose => 3,
       }, "SWISH::Prog::Aggregator::Mail"),
       _start           => 1269116682,
       _swish3          => bless(do{\(my $o = 674251808)}, "SWISH::3"),
       debug            => 1,
       doc_class        => "SWISH::Prog::Doc",
       indexer          => 'fix',
       progress_size    => 1000,
       swish_filter_obj => bless({
         doc_class    => "SWISH::Filter::Document",
         filters      => [
           bless({
             _mimetypes => bless({}, "SWISH::Filter::MIMETypes"),
             gz => { perl => 1 },
             mimetypes => [qr|application/x-gzip|],
             type => 1,
           }, "SWISH::Filters::Decompress"),
         ],
         mimetypes    => bless({}, "SWISH::Filter::MIMETypes"),
         skip_filters => {},
       }, "SWISH::Filter"),
       test_mode        => 0,
       verbose          => 3,
     }, "SWISH::Prog::Aggregator::MailFS"),
     config     => "test.conf",
     debug      => 1,
     indexer    => 'fix',
     invindex   => "index.swish3",
     test_mode  => 0,
     verbose    => 3,
   }, "SWISH::Prog");
   $a->{aggregator}{indexer} = $a->{aggregator}{_mailer}{indexer};
   $a->{indexer} = $a->{aggregator}{_mailer}{indexer};
   $a;
} at /usr/local/bin/swish3 line 186
opening: swish-e  -f index.swish3/index.swish-e -v3 -W0 -S prog -i stdin -c /tmp/ZQ9GWja3Qa at /usr/local/lib/perl5/site_perl/5.8.8/SWISH/Prog.pm line 197
checking dir /home/mail-archive/test
   /home/mail-archive/test -> ok
crawling /home/mail-archive/test
checking file /home/mail-archive/test/00
   /home/mail-archive/test/00 -> ok
Bad realloc() ignored at /usr/local/lib/perl5/site_perl/5.8.8/SWISH/Prog/Aggregator/FS.pm line 273.
Parsing config file '/tmp/ZQ9GWja3Qa'
Indexing Data Source: "External-Program"
Indexing "stdin"

Removing very common words...
Segmentation fault: 11 (core dumped)
no words removed.
Writing main index...
err: No unique words indexed!
#####################################################################
#####################################################################
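
While the crash is being tracked down, the goal of this thread (using an
email's subject line as the document title) can be illustrated outside of
swish3. The following is only a minimal Python sketch using the standard
library's email package; it is not how SWISH::Prog::Aggregator::MailFS is
implemented, and the function name title_from_mail and the fallback title
are hypothetical names chosen for illustration.

```python
from email import message_from_string
from email.header import decode_header, make_header


def title_from_mail(raw_text, fallback="(no subject)"):
    """Return the Subject header of a plain-text email, or a fallback.

    Hypothetical helper for illustration only; not part of swish-e.
    """
    msg = message_from_string(raw_text)
    subject = msg["Subject"]
    if subject is None or not subject.strip():
        return fallback
    # Decode RFC 2047 encoded words, then collapse folded whitespace
    # so a header wrapped over several lines becomes one title line.
    decoded = str(make_header(decode_header(subject)))
    return " ".join(decoded.split())


# Made-up sample message; the address is a placeholder.
sample = (
    "From: someone@example.com\n"
    "Subject: Re: [swish-e] Parsing plain text emails\n"
    "\n"
    "Body text here.\n"
)
print(title_from_mail(sample))
```

A title extracted this way could then be handed to whatever builds the
document that gets indexed; how that hand-off works in swish3 is not shown
here.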

> thanks for working so hard. Hopefully your stumblings make it easier
> for everyone in the long run.

Your active support and development sure makes it a lot easier, and  
it's much appreciated.

Peace, Troy
Received on Sat Mar 20 16:35:59 2010