RE: timestamps in the database?

From: David Norris <kg9ae(at)not-real.geocities.com>
Date: Tue Aug 24 1999 - 21:39:56 GMT
>> timestamps will become outdated as well.  This may or may not be a

> I could live with this. If you update your index file with cron at
> midnight and move documents before midnight the index will be inconsistent too.
> The problem of inconsistency can only be solved by generating the index at
> runtime... nobody wants this.

I agree. In many cases this is insignificant because old documents don't change often.  Where
documents are updated more often, though, it would be bad to report that a document was last
updated a week ago when in fact it was updated yesterday.  How many people update their index as
often as every day?  Indexing a large site takes a long time; most folks I know do it on Sundays
or during their normal low-traffic times.  I'd be surprised if many people do it more than once a
week.  Grabbing filemtime at runtime gives an indicator that the file has changed since the index
was updated.  New documents wouldn't be shown, but old documents which have changed would be
shown as such.
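
As a rough illustration of the runtime filemtime check described above, here is a minimal sketch.
It is written in Python rather than PHP, with os.path.getmtime standing in for PHP's filemtime.
The function name flag_changed and the index_built_at parameter are hypothetical names chosen for
this example; they are not part of any indexer's actual interface.

```python
import os


def flag_changed(paths, index_built_at):
    """Compare each file's modification time with the index build time.

    paths          -- file paths returned by a search (hypothetical input)
    index_built_at -- Unix timestamp of the last index run (assumed to be known)

    Returns a dict mapping each path to True (modified after the index was
    built), False (unchanged since then), or None (file missing or unreadable).
    """
    status = {}
    for path in paths:
        try:
            mtime = os.path.getmtime(path)  # Python analogue of PHP's filemtime()
        except OSError:
            # The document was moved or deleted after indexing.
            status[path] = None
            continue
        status[path] = mtime > index_built_at
    return status
```

A results page could mark entries whose value is True as "changed since last index" and drop or
grey out entries whose value is None.  The cost is one stat call per displayed result.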

>> My search results rarely
>> takes more than 30 - 50 milliseconds to generate results while reading

> However it becomes more important if your server hardware is not that
> fast.

What do you mean by not fast?  386/486?  My server machine is a Pentium 100 MHz with 16 MB RAM
running Linux 2.2.5 (SuSE 6.1)...  PHP is extremely fast compared to almost anything,
especially on old hardware.  Similar Perl scripts take seconds to parse search results even
on a fast machine.  In comparison, the latency in my PHP script is caused almost entirely by
the file I/O routines in Linux and by the slow hardware.

>> in the index, then why even have them in the file system.  Convenience?
> :-) You are right. But it would not need more space than the file size
> actually takes.

I agree, adding one thing or another wouldn't take as much space as the file itself.  But adding
everything everyone wants returned in the results could grow to a significant percentage of it.
I was referring to the trend of trying to put everything into the index.  Paragraph-plus
descriptions, file sizes, time stamps, keywords, etc. would add up to a large index.  And the
only time it saves is measured in milliseconds.

,David Norris

World Wide Web - http://www.webaugur.com/dave
Page via mail - 412039@pager.mirabilis.com
ICQ Universal Internet Number - 412039
E-Mail - dave@webaugur.com
Received on Tue Aug 24 14:31:14 1999