Re: switched to server, still no luck (almost)

From: J. David Boyd <david(at)not-real.adboyd.com>
Date: Tue Oct 25 2005 - 17:14:51 GMT

Bill Moseley wrote:
> On Tue, Oct 25, 2005 at 09:02:53AM -0700, J. David Boyd wrote:
> 
>>swish-e -w rotable
>>
>>for example, shows nothing.  The word "rotable" is positively in my
>>index_words_only list.
>>
>>Any ideas what I am doing wrong this time?
> 
> 
> Perhaps it's indexed under a different metaname?  We can only guess
> since you are not providing any examples that we can reproduce.
> 

Hmm, how would I tell?

As for details, I have a directory that contains ~550 PDF files, each in
one of 12 subdirectories, like MOD_0, MOD_1, etc.

I'm in the ~/share/doc/swish-e/examples/conf directory,
and I'm running

swish-e -S prog -c example9.config

The only change I've made in example9.config was to set the
SwishProgParameters /usr/home/tsc0/public_html/add
line to point to my base PDF directory.

When I run the above command line, I get

/usr/home/tsc0/public_html/add/MOD_0/AAA-MOD0.TBL.pdf - Using XML parser
- !!!Adding automatic MetaName 'all' found in file
'/usr/home/tsc0/public_html/add/MOD_0/AAA-MOD0.TBL.pdf'
!!!Adding automatic MetaName 'headers' found in file
'/usr/home/tsc0/public_html/add/MOD_0/AAA-MOD0.TBL.pdf'
!!!Adding automatic MetaName 'creationdate' found in file
'/usr/home/tsc0/public_html/add/MOD_0/AAA-MOD0.TBL.pdf'
!!!Adding automatic MetaName 'encrypted' found in file
'/usr/home/tsc0/public_html/add/MOD_0/AAA-MOD0.TBL.pdf'
!!!Adding automatic MetaName 'file_size' found in file
'/usr/home/tsc0/public_html/add/MOD_0/AAA-MOD0.TBL.pdf'
!!!Adding automatic MetaName 'moddate' found in file
'/usr/home/tsc0/public_html/add/MOD_0/AAA-MOD0.TBL.pdf'
!!!Adding automatic MetaName 'optimized' found in file
'/usr/home/tsc0/public_html/add/MOD_0/AAA-MOD0.TBL.pdf'
!!!Adding automatic MetaName 'page_size' found in file
'/usr/home/tsc0/public_html/add/MOD_0/AAA-MOD0.TBL.pdf'
!!!Adding automatic MetaName 'pages' found in file
'/usr/home/tsc0/public_html/add/MOD_0/AAA-MOD0.TBL.pdf'
!!!Adding automatic MetaName 'pdf_version' found in file
'/usr/home/tsc0/public_html/add/MOD_0/AAA-MOD0.TBL.pdf'
!!!Adding automatic MetaName 'producer' found in file
'/usr/home/tsc0/public_html/add/MOD_0/AAA-MOD0.TBL.pdf'
!!!Adding automatic MetaName 'tagged' found in file
'/usr/home/tsc0/public_html/add/MOD_0/AAA-MOD0.TBL.pdf'
!!!Adding automatic MetaName 'content' found in file
'/usr/home/tsc0/public_html/add/MOD_0/AAA-MOD0.TBL.pdf'


Then I get several hundred of these (actually, one for each file, so I
get ~550 of these, with the only difference being the file name of the
PDF file):

Warning: XML parse error in file
'/usr/home/tsc0/public_html/add/MOD_0/AAA-MOD0.TBL.pdf' line 18.  Error:
not well-formed
 (36 words)


Then, finally, I get this:

Removing very common words...
no words removed.
Writing main index...
Sorting words ...
Sorting 4,935 words alphabetically
Writing header ...
Writing index entries ...
  Writing word text: Complete
  Writing word hash: Complete
  Writing word data: Complete
4,935 unique words indexed.
Sorting property: swishdocpath
Sorting property: swishtitle
Sorting property: swishdocsize
Sorting property: swishlastmodified
4 properties sorted.
557 files indexed.  4,493,431 total bytes.  151,792 total words.
Elapsed time: 00:00:45 CPU time: 00:00:01
Indexing done!


Then, like I said,

swish-e -T index_all_words shows me all the words I am looking for, but
I can't get one with the "-w".

I thought that a 'swish-e -w WORD' would be the least restrictive kind
of search...
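
One way to test the "different metaname" theory would be to try the word
under each of the automatic MetaNames listed in the indexing output above.
Below is a rough sketch in Python, assuming the warning lines keep the
format shown and that the metaname=word form of -w applies to this index.
The function names are made up for illustration and are not part of
swish-e. The script only prints command lines; it does not run them.

```python
import re

# Matches the warning lines printed during indexing, e.g.
#   !!!Adding automatic MetaName 'all' found in file
META_RE = re.compile(r"Adding automatic MetaName '([^']+)'")


def automatic_metanames(log_text):
    """Return the MetaNames mentioned in the log, first-seen order, no duplicates."""
    seen = []
    for name in META_RE.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


def candidate_queries(word, metanames):
    """Build one swish-e command line per MetaName (nothing is executed)."""
    return [f"swish-e -w {name}={word}" for name in metanames]


if __name__ == "__main__":
    # Two lines copied from the indexing output, used as sample input.
    sample = (
        "!!!Adding automatic MetaName 'all' found in file\n"
        "'/usr/home/tsc0/public_html/add/MOD_0/AAA-MOD0.TBL.pdf'\n"
        "!!!Adding automatic MetaName 'content' found in file\n"
        "'/usr/home/tsc0/public_html/add/MOD_0/AAA-MOD0.TBL.pdf'\n"
    )
    for cmd in candidate_queries("rotable", automatic_metanames(sample)):
        print(cmd)
```

If one of the printed commands returns hits while the plain
'swish-e -w rotable' does not, that would suggest the words were indexed
under that metaname rather than the default one.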

Hope this helps some, and thanks for your time!

Dave
Received on Tue Oct 25 10:14:53 2005