Hi Bill,
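Here's the output with the -v option, run as a normal user, as you
suggested. I'm still clueless: apart from ID3toHTML, all the filters load,
and catdoc is found at /usr/local/bin/catdoc, yet Doc2txt does not filter
the file.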
------------------------------------------
[jayaraj@tnt jayaraj]$ swish-filter-test -v test.doc
SWISH::Filter found at [/usr/local/lib/swish-e/perl/SWISH/Filter.pm]
>> Loading filter: [SWISH/Filters/Pdf2HTML.pm]
Find path of [pdftotext] in
/usr/kerberos/bin:/bin:/usr/bin:/usr/local/bin:/usr/bin/X11:/usr/X11R6/bin:/home/jayaraj/bin:/usr/local/lib/swish-e
Not found at path [/usr/kerberos/bin/pdftotext]
Not found at path [/bin/pdftotext]
* Found program at: [/usr/bin/pdftotext]
Find path of [pdfinfo] in
/usr/kerberos/bin:/bin:/usr/bin:/usr/local/bin:/usr/bin/X11:/usr/X11R6/bin:/home/jayaraj/bin:/usr/local/lib/swish-e
Not found at path [/usr/kerberos/bin/pdfinfo]
Not found at path [/bin/pdfinfo]
* Found program at: [/usr/bin/pdfinfo]
>> Loading filter: [SWISH/Filters/ID3toHTML.pm]
trying to load [MP3::Tag]
Can not use Filter SWISH::Filters::ID3toHTML -- need to install MP3::Tag: No such file or directory
:-( Filter [SWISH/Filters/ID3toHTML.pm] not loaded
>> Loading filter: [SWISH/Filters/XLtoHTML.pm]
trying to load [Spreadsheet::ParseExcel]
** Loaded Spreadsheet::ParseExcel **
trying to load [HTML::Entities]
** Loaded HTML::Entities **
>> Loading filter: [SWISH/Filters/Doc2txt.pm]
Find path of [catdoc] in
/usr/kerberos/bin:/bin:/usr/bin:/usr/local/bin:/usr/bin/X11:/usr/X11R6/bin:/home/jayaraj/bin:/usr/local/lib/swish-e
Not found at path [/usr/kerberos/bin/catdoc]
Not found at path [/bin/catdoc]
Not found at path [/usr/bin/catdoc]
* Found program at: [/usr/local/bin/catdoc]
>> Starting to process new document: application/x-msword
++Checking filter [SWISH::Filters::Pdf2HTML=HASH(0x805f83c)] for application/x-msword
++ application/x-msword was not filtered by SWISH::Filters::Pdf2HTML=HASH(0x805f83c)
++Checking filter [SWISH::Filters::XLtoHTML=HASH(0x83493c4)] for application/x-msword
++ application/x-msword was not filtered by SWISH::Filters::XLtoHTML=HASH(0x83493c4)
++Checking filter [SWISH::Filters::Doc2txt=HASH(0x835f794)] for application/x-msword
++ application/x-msword was not filtered by SWISH::Filters::Doc2txt=HASH(0x835f794)
Final Content type for test.doc is application/x-msword
*No filters were used
Document test.doc was not filtered.
Document: test.doc (test.doc)
Content-Type: application/x-msword
Parser type:
** /usr/local/bin/swish-filter-test:
Skipping binary [test.doc]
------------------------------------------------------------------
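For anyone retracing the lookup shown in the log, the PATH search that the
verbose output reports for catdoc can be approximated by hand. This is only
a sketch in plain shell, not the SWISH::Filter code itself; the function
name find_in_path is made up for illustration, and the messages simply
imitate the wording of the log above.

```shell
# Hypothetical helper (not part of swish-e): walk a colon-separated search
# path and report each directory tried, imitating the -v log output.
find_in_path() {
    prog=$1
    search_path=${2:-$PATH}   # default to the caller's PATH
    old_ifs=$IFS
    IFS=:                     # split the search path on colons
    for dir in $search_path; do
        IFS=$old_ifs
        # Accept only an executable that is not a directory.
        if [ -x "$dir/$prog" ] && [ ! -d "$dir/$prog" ]; then
            echo "* Found program at: [$dir/$prog]"
            return 0
        fi
        echo "Not found at path [$dir/$prog]"
    done
    IFS=$old_ifs
    return 1                  # nothing executable found in any directory
}
```

Calling `find_in_path catdoc` as the same user that runs the indexer should
end with the same "Found program at" line as the log, which would suggest
the lookup itself is not what is failing here.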
Bill Moseley wrote:
>On Fri, Oct 14, 2005 at 02:54:50PM -0700, Sebastian Jayaraj wrote:
>
>
>>Hello All,
>>
>> I have been using swish-e for a while and it works beautifully while
>>indexing PDF and XL files. I was trying to index MS word files and only
>>the filenames were being indexed. So I tried a simple swish-filter-test
>>and found this....
>>
>>-------------------------------------------------
>>[root@tnt filters]# catdoc -V
>>Catdoc Version 0.93.3
>>[root@tnt filters]# swish-e -V
>>SWISH-E 2.4.2
>>[root@tnt filters]# swish-filter-test test.doc
>>
>>Document test.doc was not filtered.
>> Document: test.doc (test.doc)
>> Content-Type: application/x-msword
>> Parser type:
>>
>>** /usr/local/bin/swish-filter-test:
>> Skipping binary [test.doc]
>>------------------------------------------------
>>
>>Catdoc by itself works fine and is in the right path. Any pointers or
>>suggestions would be helpful.
>>
>>
>
>One suggestion would be to try the above with the -v option.
>And maybe run as a normal user instead of root.
>
>
>
Received on Tue Oct 18 12:29:23 2005