On Tue, Jun 10, 2003 at 10:33:35AM -0700, Nathan Vonnahme wrote:
>
> I have to use IIS (arggg),
I always wonder about that. Throw Linux and Apache on an old P133
that's too slow for Windows, use Samba, and you have a nice stable web
server platform.
> swish_binary => 'C:\"Program Files"\SWISH-E\swish-e.exe', # Location of swish-e binary
> swish_index => 'C:\Program Files\SWISH-E\index.swish-e', # Location of your index file
>
> The space in "Program Files" needs to be quoted in the first, but not
> in the second, because of the way DOS and Perl interact.
Huh, interesting. Under Windows it uses IPC::Open2, which still goes
through the shell. I see a line that does:
my @command = map { s/"/\\"/g; $_ } $self->{prog}, $self->swish_command_array;
I think I was trying to prevent Windows from swallowing up quotes (like
ones used in phrase searches).
That still stinks. I just looked at the SWISH::Filter.pm module and it
does:
my @command = map { s/"/\\"/g; qq["$_"] } @args;
So quotes are escaped and then the entire parameter is placed inside
double quotes. Would that fix your problem above? Then you wouldn't
need to add the quotes in the "swish_binary" setting above.
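For reference, here is a minimal sketch of that escape-then-wrap idea.
The helper name quote_args and the sample arguments are made up for
illustration and are not taken from the swish-e source. It copies each
argument first, because substituting on $_ inside map changes the
caller's values in place. Whether the Windows shell honors \" in every
case is not something this sketch settles.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of swish-e): escape embedded double
# quotes, then wrap each argument in double quotes so that a path
# containing spaces is passed as a single word.
sub quote_args {
    return map {
        my $arg = $_;            # work on a copy, leave the caller's data alone
        $arg =~ s/"/\\"/g;       # " becomes \"
        qq["$arg"];              # wrap the whole argument in double quotes
    } @_;
}

# Sample values only: an unquoted binary path plus a phrase search.
my @command = quote_args(
    'C:\Program Files\SWISH-E\swish-e.exe',
    '-w', '"phrase search"',
);
print join(' ', @command), "\n";
```

With something like this in place the config could carry the plain
path, without hand-placed quotes around "Program Files".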
> ALSO, I had to modify doc2txt.pm and pdf2xml.pm (at least... maybe
> more I don't remember) to quote the filename when they use system() or
> backticks (``), since some directory names have spaces. E.g. in
> doc2txt.pm:
>
> 80c80
> < my $content = `$catdoc "$file"`;
Ok. This has come up before, and I can't remember if I had a reason not
to do that or if I just never got around to it.
There's also in pdf2html:
open $sym, "pdfinfo $file |" || die "$0: Failed to open $file $!";
open $sym, "pdftotext $file - |" or die "$0: failed to run pdftotext: $!";
Those should be quoted, too.
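Where the list form of a piped open is available, quoting can be
avoided altogether, since the arguments never pass through a shell.
The sketch below assumes a Unix-like perl; as far as I know older
Win32 perls do not support this form. The helper name read_command is
made up, and the perl binary itself ($^X) stands in for pdfinfo or
pdftotext so the example runs without those tools installed.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of swish-e): run a program and return
# its output, passing the arguments as a list so no shell parses them.
sub read_command {
    my ($prog, @args) = @_;
    # "or" rather than "||", so the die applies to open() itself
    open(my $fh, '-|', $prog, @args)
        or die "$0: failed to run $prog: $!";
    local $/;                    # slurp all of the output at once
    my $out = <$fh>;
    close $fh;
    return defined $out ? $out : '';
}

# A name with spaces and shell metacharacters arrives as one argument.
my $file = 'haha; echo oops; youlose.doc';
print read_command($^X, '-e', 'print "got: $ARGV[0]\n"', $file);
```

For the filters the call would look something like
read_command('pdftotext', $file, '-'), assuming pdftotext is on the
PATH.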
> Windows doesn't allow doublequotes in file or directory names so that
> should be an OK way to do it even though it doesn't escape anything.
> It might be better to add a more robust argument escaping method to
> prevent filenames with special characters from doing unexpected things
> (or better would be to not use backticks to avoid the shell
> completely).
Well, without fork/exec on Windows it's hard. I'm sure there are some
Win32-specific functions to do that, but I have never looked into it. I
spent a *year* posting to Win32 CGI lists asking how to securely run an
external program (like swish-e) from a CGI script and never got any
response.
> In some cases the current code could be a security hole
> on unix, because a user whose documents are indexed could execute
> arbitrary code as the webserver user by naming a document 'haha; rm
> -rf /; youlose.doc' or whatever.
Yes, that's true, if you are indexing via the file system (not
spidering) and can't trust the names of the files you are indexing.
> If the webserver user has
> sufficiently few rights it shouldn't be possible to cause a lot of
> damage that way though.
Please feel free to look over the utilities with Windows in mind:
filters/SWISH/Filter.pm (see run_program() )
files in prog-bin and filter-bin
There's also a risk when using the FileFilter directive if you don't
escape its arguments correctly.
> I just wanted to share my hard-won experience with the archive, and
> maybe these hints/changes could make it into at least the Windows
> release. Thanks to all you who have developed swish!
Thanks for the comments!
--
Bill Moseley
moseley@hank.org
Received on Tue Jun 10 19:18:31 2003