I gave you wrong advice -- the FileFilter script is passed the path
where the file is located. Take a look at the docs on FileFilter; they
explain how the filter is called.
But I think it shouldn't matter. If you write your Perl script like

    while (<>) {
        ...
    }

then it can read either from STDIN or from files passed on the command
line.
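
To make that concrete, here is a rough sketch of such a filter. It is
only a guess at your format, based on your description: it assumes the
first non-empty line is the author and that the title sits on the same
line as "RELEASE:" or "EDITOR'S NOTE:". The sub names (esc,
extract_meta, to_html) are made up for the example, and the author meta
tag is only useful if your config lists it under MetaNames.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Escape the characters that are special in HTML.
sub esc {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    $s =~ s/"/&quot;/g;
    return $s;
}

# Assumed layout: author = first non-empty line; title = text after
# "RELEASE:" or "EDITOR'S NOTE:" on the same line.
sub extract_meta {
    my @lines = @_;
    my ($author, $title);
    for my $line (@lines) {
        $line =~ s/^\s+|\s+$//g;          # trim both ends
        next if $line eq '';
        $author = $line unless defined $author;
        if (!defined $title
            && $line =~ /^(?:RELEASE|EDITOR'S NOTE):\s*(.+)$/) {
            $title = $1;
        }
    }
    $author = 'Unknown'  unless defined $author;
    $title  = 'Untitled' unless defined $title;
    return ($author, $title);
}

# Wrap the original text in a small HTML page.
sub to_html {
    my @lines = @_;
    my ($author, $title) = extract_meta(@lines);
    return "<html><head>\n"
         . "<title>" . esc($title) . "</title>\n"
         . "<meta name=\"author\" content=\"" . esc($author) . "\">\n"
         . "</head><body><pre>\n"
         . esc(join '', @lines)
         . "</pre></body></html>\n";
}

# <> reads the files named on the command line, or STDIN if there are
# none. The -t check just avoids waiting on a terminal by accident.
if (@ARGV || !-t STDIN) {
    my @lines;
    while (<>) {
        push @lines, $_;
    }
    print to_html(@lines);
}
```

Both "perl gettxttitle.pl < 1996-032.txt" and
"perl gettxttitle.pl 1996-032.txt" should then give the same output.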
On Mon, Aug 09, 2004 at 12:09:54PM -0700, Alan Ivey wrote:
> It's 20MB of TXT files. I'm at home so I can't access
> any of them to show you. They're 14 years of press
> releases, and all of these text files follow the same
> format. The first line (unless there are empty lines)
> is the author. Further down, there's something like
> RELEASE: or EDITOR'S NOTE: and the document title
> follows that.
>
> I wrote a Perl script that reads the STDIN of the text
> file, grabs the author and title, and prints (STDOUT,
> right?) an HTML page with the document as the html
> <Title> and the author as Metadata.
I would instead write a program that can be used with the -S prog
option. I'm not sure there's that much advantage in doing it that
way over FileFilter (FileFilter will be slower), though.
Take a look at DirTree.pl in the prog-bin directory for an example.
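
Very roughly, a -S prog program prints a few headers, a blank line, and
then the document, once per file. The sketch below is from memory, so
check the header names against the docs and DirTree.pl before relying
on it. The sub names (prog_record, wrap_html) are made up, the HTML
wrapper is a stub standing in for your author/title code, and taking
the file names from the command line is just an assumption for the
example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build one record: headers, a blank line, then the document.
# Content-Length has to be the size of the document in bytes.
sub prog_record {
    my ($path, $mtime, $doc) = @_;
    return "Path-Name: $path\n"
         . "Content-Length: " . length($doc) . "\n"
         . "Last-Mtime: $mtime\n"
         . "Document-Type: HTML*\n"
         . "\n"
         . $doc;
}

# Stub: put your real author/title extraction here.
sub wrap_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return "<html><body><pre>\n$text</pre></body></html>\n";
}

# Assumption: the files to index are named on the command line.
for my $path (@ARGV) {
    open my $fh, '<', $path or do {
        warn "skipping $path: $!\n";
        next;
    };
    my $text = do { local $/; <$fh> };    # slurp the whole file
    close $fh;
    $text = '' unless defined $text;      # empty file
    my $mtime = (stat $path)[9];
    print prog_record($path, $mtime, wrap_html($text));
}
```

The upside over FileFilter is that one process handles every file
instead of a new one being started per document.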
> Reasons for doing this... The documents need to be
> searchable for an author, or within an author's
> documents. I understand how this is done on the search
> cgi script. Also, the cgi results page needs to have
> the document's title shown, not the filename (I think
> txt files show the filename.txt).
>
> I figured converting to HTML would be the best way to
> achieve this. When I tested my script, I did...
>
> $ perl gettxttitle.pl < 1996-032.txt
>
> and the output was the HTML that I was after. However,
> when I plug it into my SWISH-E settings, it hangs on
> the first file as if it's taking a long time to
> process it.
Can you run it like this?

    ./gettxttitle.pl 1996-032.txt

> If I need to supply more information, I can provide
> more examples at work tomorrow.
What "FileFilter" command are you using?
--
Bill Moseley
moseley@hank.org
Received on Mon Aug 9 12:49:21 2004