Re: & Windows Thread safety

From: Bill Moseley <moseley(at)>
Date: Sun May 23 2004 - 05:12:51 GMT

Hi James,

On Sat, May 22, 2004 at 08:41:16PM -0700, Job, James wrote:
> However, about 25-50 documents into my crawl, I'd start seeing "Skipped
> whatever.doc due to filter 'filter_content' user supplied function #1.

So the filter started to fail.

> Looking at task manager, I would see a running "catdoc" or "pdftotext"
> process.  After tearing my hair out for a while, I suspected there may be a
> threading issue (since I'm running a SMP system),

I don't know anything about SMP or threaded applications.  Can you
explain why just having two CPUs would result in such a problem?

> and made some changes to
> the windows_fork subroutine in  I eventually had success with the
> following:

Good.  I'll apply the patch, but I'd like to understand what's

>     my $pid = IPC::Open2::open2($rdrfh, $wtrfh, @command );
>     # --- BEGIN WIN32 SMP MODS
>     # Wait for Process to complete before we continue (max 10 sec), else kill it!
>     use POSIX ":sys_wait_h";
>     my ($stiff, $tcks);
>     $tcks = 0;
>     while (($stiff=waitpid(-1,&WNOHANG))>0 && $tcks<9) {
>     	sleep 1;
>     	$tcks++;
>     	}
>     if ($tcks>8) {
>     	$pid->Kill(9);
>     	}
>     # --- END WIN32 SMP MODS

OK, so is that waiting on the just-run program?  Seems like you would want
to do that after reading from the pipe.  I would think the OS would
block the program until the pipe was read from -- so it would always get

Or is it too late and I'm missing something obvious?
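
A minimal sketch of the read-then-wait ordering described above, offered
only as an illustration.  The helper name run_filter is hypothetical and not
from the original script, and the sketch assumes the filter needs no input
on stdin.  It drains the pipe first, then waits on the specific child pid
rather than on -1 (any child):

```perl
use strict;
use warnings;
use IPC::Open2;

# Hypothetical helper (not from the original script): run a filter command,
# read all of its output, and only then reap the child process.
sub run_filter {
    my @command = @_;

    my ($rdrfh, $wtrfh);
    my $pid = open2($rdrfh, $wtrfh, @command);

    # Assumption: we send no input, so close the write end and let the
    # child see EOF on its stdin.
    close $wtrfh;

    # Drain the pipe first.  A child that writes more than the pipe buffer
    # holds will block until someone reads, so waiting before reading can
    # stall until a timeout fires.
    my $output = do { local $/; <$rdrfh> };
    $output = '' unless defined $output;
    close $rdrfh;

    # Reap this specific child; open2 does not do it for us.
    waitpid($pid, 0);
    my $status = $? >> 8;

    return ($output, $status);
}

# Usage: $^X is the running perl binary, used here as a stand-in filter.
my ($out, $status) = run_filter($^X, '-e', 'print "hello\n"');
```

With this ordering the child is never left blocked on a full pipe, so a
fixed sleep-and-kill loop should only be needed for a filter that truly
hangs.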


BTW -- whatever happened with your other problem:

  Warning: Failed to uncompress Property. zlib uncompress returned: -5.
  uncompressed size: 140 buf_len: -1073746392

Did that go away after reindexing?

Bill Moseley
Received on Sat May 22 22:12:53 2004