read_stream optimized...

From: <Rainer.Scherg(at)not-real.rexroth.de>
Date: Sun Nov 19 2000 - 22:38:51 GMT
Hi Jose,

I have been thinking about the following routine, mentioned in the
earlier mails. Jose's introduction of "read_stream" (reading the file to be
indexed completely into memory) is a great idea. It may speed up the
indexing process when the indexer needs to reposition within the data
(although a good OS would achieve much the same with its file cache...).

----------------
char *read_stream(FILE *fp,int filelen)
{
int c=0,offset=0,bufferlen=0;
unsigned char *buffer;
	if(filelen)
	{
		buffer=emalloc(filelen+1);
		vread(buffer,1,filelen,fp);
		buffer[filelen]='\0';
	} else {    /* if we are reading from a popen call, filelen is 0 */
		buffer=emalloc((bufferlen=MAXSTRLEN)+1);
		while((c=fgetc(fp))!=EOF)
		{
			if(offset==bufferlen)
			{
				bufferlen+=MAXSTRLEN;
				buffer=erealloc(buffer,bufferlen+1);
			}
			buffer[offset++]=(unsigned char)c;
		}
		buffer[offset]='\0';
	}
	return (char *)buffer;
}
-----------------

As I mentioned, the routine is not optimized when reading e.g. data from
a stream: it calls fgetc for every character and may need many reallocs.
So the routine could look like this (the vread part with filelen might be
better, but for most files smaller than the initial READ_BUFFER_SIZE it
does not make any difference...):

-----------------

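#define READ_BUFFER_SIZE (128 * 1024)

char *read_stream(FILE *fp)
{
	unsigned char *buffer;
	size_t        n, rd_len = 0;

	/* one extra byte so the terminating '\0' always fits */
	buffer = emalloc(READ_BUFFER_SIZE + 1);

	/* fread() returns a short count only at EOF or on a read error,
	   so stop there; looping on feof() alone never ends after an error */
	for (;;) {
		n = fread(buffer + rd_len, sizeof(unsigned char), READ_BUFFER_SIZE, fp);
		rd_len += n;
		if (n < READ_BUFFER_SIZE)
			break;
		/* the chunk was full: make room for the next one */
		buffer = erealloc(buffer, rd_len + READ_BUFFER_SIZE + 1);
	}
	buffer[rd_len] = '\0';

	return (char *)buffer;
}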

--------------------------

Also possible is a different interface that returns the buffer size:

  long (FILE *fp, unsigned char **buffer)

The return value would be the file size or the size of the filter output
(depends on...).

The routine is not yet tested, but should work fine.
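
A minimal sketch of how the size-returning interface mentioned above
might look. The mail gives only the signature, so the function name
read_stream_len is hypothetical, as is the choice to return -1 on error.
Plain malloc/realloc stand in for emalloc/erealloc so that the example is
self-contained.

```c
#include <stdio.h>
#include <stdlib.h>

#define READ_BUFFER_SIZE (128 * 1024)

/* Hypothetical name: the mail gives only the signature.
 * Reads fp up to EOF into a newly allocated, '\0'-terminated buffer.
 * Returns the number of bytes read, or -1 on error (an assumption;
 * in that case *buffer is left as NULL). */
long read_stream_len(FILE *fp, unsigned char **buffer)
{
	unsigned char *buf, *tmp;
	size_t n, rd_len = 0;

	*buffer = NULL;
	buf = malloc(READ_BUFFER_SIZE + 1);   /* +1 for the '\0' */
	if (buf == NULL)
		return -1;

	for (;;) {
		n = fread(buf + rd_len, 1, READ_BUFFER_SIZE, fp);
		rd_len += n;
		if (n < READ_BUFFER_SIZE)
			break;                        /* EOF or read error */
		/* the chunk was full: grow by one more chunk */
		tmp = realloc(buf, rd_len + READ_BUFFER_SIZE + 1);
		if (tmp == NULL) {
			free(buf);
			return -1;
		}
		buf = tmp;
	}
	if (ferror(fp)) {
		free(buf);
		return -1;
	}
	buf[rd_len] = '\0';
	*buffer = buf;
	return (long)rd_len;
}
```

The caller receives the length without needing strlen, which also keeps
the length correct if the data contains embedded '\0' bytes.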


cu - rainer








Received on Sun Nov 19 22:41:07 2000