Re: filtered filenames

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Tue Nov 30 2004 - 05:28:46 GMT
On Mon, Nov 29, 2004 at 05:24:55PM -0800, Bill Conlon wrote:
> In this case the document name will be stored in the index as 
> 'viewdoc.taf?_uid1=71' instead of test.doc.  But if the same url is 
> viewed in a browser, the file will be downloaded and named test.doc.  
> Does it make more sense to modify spider.pl to test for the existence 
> of a filename in the Content-Disposition header or do this as part of 
> filtering?

I don't think so.  Search results give the url of where to find the
document -- having search results return "test.doc" doesn't provide
that information.

That said, if you wanted to modify the name (say, to the one
provided in the Content-Disposition header), I think you could do
that in the filter_content() callback.  You can modify the URI
object there, but I'm not sure whether you can change its scheme;
you would have to try.  There's also an "output_function" callback
(which may not be in 2.4.2) where you can override the output
generated by the spider.
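
As a rough illustration of the header handling involved, here is a minimal
sketch of pulling a filename out of a Content-Disposition value.  It is
written in Python only to show the logic; spider.pl itself is Perl, and
none of this is spider.pl's API.  The function name and the sample header
values are hypothetical.

```python
from email.message import Message


def filename_from_content_disposition(header_value):
    """Return the filename parameter of a Content-Disposition value, or None.

    Hypothetical helper for illustration; not part of spider.pl or swish-e.
    """
    if not header_value:
        return None
    # Let the standard library parse the header parameters (quoting etc.).
    msg = Message()
    msg["Content-Disposition"] = header_value
    return msg.get_filename()


if __name__ == "__main__":
    # Sample header value assumed for the example in the quoted question.
    print(filename_from_content_disposition('attachment; filename="test.doc"'))
    # A header with no filename parameter yields None.
    print(filename_from_content_disposition("inline"))
```

In a real setup the same check would live inside the Perl callback, and
the URL would still be what tells the searcher where to fetch the
document.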

-- 
Bill Moseley
moseley@hank.org

Received on Mon Nov 29 21:28:50 2004