No title being returned for version 1.3.2

From: Steve Shearman <steve(at)not-real.webmasters.co.nz>
Date: Mon May 31 1999 - 03:22:02 GMT
Hi,

We have recently upgraded to version 1.3.2 to take advantage of returned
meta properties.

However, instead of the title being returned, we are now getting the filename.

Has anyone else seen this? Is there a fix, or has something changed?

Steve

At 11:34 PM 5/27/99 -0700, you wrote:
>			    SWISH-E Digest 379
>
>Topics covered in this issue include:
>
>  1) Problems with Indexing Meta Tags
>	by Chris Blackstone <cblackst@teacher.mail.arlington.k12.va.us>
>  2) Quick indexing
>	by RAGHAVV <RAGHAVV@inf.com>
>  3) Re: Problems with Indexing Meta Tags
>	by Ron Klatchko <ron@library.ucsf.edu>
>  4) RE: Problems with Indexing Meta Tags
>	by "David Norris" <kg9ae@geocities.com>
>  5) RE: Problems with Indexing Meta Tags
>	by "David Norris" <kg9ae@geocities.com>
>  6) Re: Problems with Indexing Meta Tags
>	by Chris Blackstone <cblackst@teacher.mail.arlington.k12.va.us>
>
>----------------------------------------------------------------------
>
>Topic No. 1
>
>Date: Thu, 27 May 1999 09:27:04 -0400
>From: Chris Blackstone <cblackst@teacher.mail.arlington.k12.va.us>
>To: swish-e@sunsite.berkeley.edu
>Subject: Problems with Indexing Meta Tags
>Message-ID: <374D482C.31A91EBB@tmail.arlington.k12.va.us>
>
>As a webmaster for a school district, I have many different people
>working on web pages. Each uses different software and client machines.
>My problem is that if a user is using Composer, for example, the meta
>tags that it inserts in the header are preventing swish-e from correctly
>spidering the page. Is there any way to turn OFF Meta tag indexing? I
>don't want to have to tell all the people working on pages that they
>have to go through and delete their meta tags so swish-e will work.
>
>Thanks in advance,
>
>Chris Blackstone
>--
>Chris Blackstone
>Web Services Coordinator
>
>Arlington Public Schools -- 1426 N. Quincy Street -- Arlington, VA 22207
>Tel: 703/228-6185
>Fax: 703/875-9491
>Pager: 703/612-3042
>http://www.arlington.k12.va.us
>
>------------------------------
>
>Topic No. 2
>
>Date: Thu, 27 May 1999 20:31:39 +0530
>From: RAGHAVV <RAGHAVV@inf.com>
>To: swish-e@sunsite.berkeley.edu
>Subject: Quick indexing
>Message-ID: <8EE756E49A17D21194860008C7F49AFE01A01635@TWRMSG01>
>
> 
>I want to index all the HTML pages that are accessed by my corporate users
>while surfing the Internet. I am planning to develop a plugin that will sit
>with the web-proxy and will capture all the HTMLs before giving them to the
>users. The corporate web-administrator can view these pages by querying
>the indexed database. This query can be done at any time of the
>day.
> 
>As all the search engine tools are designed to deliver search results much
>faster than indexing HTML pages, I am not sure which tool to choose for this
>requirement. Has anybody ever tried real time indexing with Swish-E? I would
>like to know if Swish-E can be suitable for this requirement. As an
>approximation I would like the system to be able to index up to 5 HTML pages /
>sec. I also need to keep these HTML pages for 7 days, which will be around
>210K pages.
> 
>Thanks,
>Raghvendra Varma,
>Infosys Technologies Limited
> 
>
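The figures in Topic No. 2 can be cross-checked with a little arithmetic. The sketch below is only an illustration: the three input numbers come from the message, while the function names and the reading of "5 HTMLs / sec" as a peak rather than a sustained rate are assumptions.

```python
# Back-of-the-envelope check of the indexing load described in Topic No. 2.
# Inputs are taken from the message; the interpretation is an assumption.

SECONDS_PER_DAY = 24 * 60 * 60

RETENTION_DAYS = 7        # pages are kept for 7 days
STATED_RATE = 5           # "up to 5 HTMLs / sec"
STATED_TOTAL = 210_000    # "around 210K pages"


def pages_at_sustained_rate(rate_per_sec, days):
    """Pages accumulated if the rate were held constantly for `days` days."""
    return rate_per_sec * SECONDS_PER_DAY * days


def average_rate(total_pages, days):
    """Average pages per second needed to reach `total_pages` in `days` days."""
    return total_pages / (SECONDS_PER_DAY * days)


sustained = pages_at_sustained_rate(STATED_RATE, RETENTION_DAYS)
average = average_rate(STATED_TOTAL, RETENTION_DAYS)
per_day = STATED_TOTAL // RETENTION_DAYS

print(f"5 pages/sec held for 7 days: {sustained} pages")
print(f"210K pages over 7 days:      {average:.3f} pages/sec on average")
print(f"Pages per day:               {per_day}")
```

The two stated figures differ by more than a factor of ten, so the 5 pages per second is most plausibly a burst rate, with the average load closer to one page every three seconds.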
>------------------------------
>
>Topic No. 3
>
>Date: Thu, 27 May 1999 15:31:50 -0700
>From: Ron Klatchko <ron@library.ucsf.edu>
>To: cblackst@teacher.mail.arlington.k12.va.us,
>Subject: Re: Problems with Indexing Meta Tags
>Message-ID: <3.0.5.32.19990527153150.00a4ac30@mail.ckm.ucsf.edu>
>
>
>Chris-
>
>Do you know if it's actually mucking up the spidering or the indexing?
>Also, could you provide a URL for a page that SWISH chokes on so that we
>could potentially fix the problem.
>
>moo
>
>
>At 06:26 AM 5/27/99 -0700, Chris Blackstone wrote:
>>As a webmaster for a school district, I have many different people
>>working on web pages. Each uses different software and client machines.
>>My problem is that if a user is using Composer, for example, the meta
>>tags that it inserts in the header are preventing swish-e from correctly
>>spidering the page. Is there any way to turn OFF Meta tag indexing? I
>>don't want to have to tell all the people working on pages that they
>>have to go through and delete their meta tags so swish-e will work.
>>
>>Thanks in advance,
>>
>>Chris Blackstone
>>--
>>Chris Blackstone
>>Web Services Coordinator
>>
>>Arlington Public Schools -- 1426 N. Quincy Street -- Arlington, VA 22207
>>Tel: 703/228-6185
>>Fax: 703/875-9491
>>Pager: 703/612-3042
>>http://www.arlington.k12.va.us
>>
>>
>----------------------------------------------------------------------
>          Ron Klatchko - Manager, Advanced Technology Group           
>           UCSF Library and Center for Knowledge Management           
>                        ron@library.ucsf.edu                
>
>------------------------------
>
>Topic No. 4
>
>Date: Thu, 27 May 1999 15:35:44 -0500
>From: "David Norris" <kg9ae@geocities.com>
>To: <cblackst@teacher.mail.arlington.k12.va.us>
>Cc: <swish-e@sunsite.berkeley.edu>
>Subject: RE: Problems with Indexing Meta Tags
>Message-ID: <NABBJAELJCIBPNFJODIGEEICENAA.kg9ae@geocities.com>
>
>> <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
>> Any ideas?
>
>No ideas at all, many of my documents are full of HTTP-EQUIV stuff.  It
>works fine for me.  And, I just tested it to be sure with the above meta
>tag.  Normally when I have had SWISH-E stop on a document it was because of
>some weird, invisible character being inserted into the file.  I suspect
>that you will find some non-ASCII characters lurking about.  Netscape, for
>instance, has been known to insert NUL (\0 bug in NSKB) characters into
>files for no apparent reason.  It probably wouldn't appear in the file with
>most text viewers, either.  This could be a disaster for a text-mode
>program.  I assume that SWISH-E isn't binary safe, since the docs seem to
>indicate as much.
>
>If you do find non-ASCII characters in the file, then you will probably need
>some method of cleaning the files before reading them.  Some folks have
>posted file filtering code that runs certain file types through a PERL
>script.  Perhaps you could make a filter to strip or convert non-ASCII stuff
>in HTML files.  Then again, you could make a filter for anything.  Figure
>out exactly why the file crashes SWISH-E and filter it.
>
>,David Norris
>
>World Wide Web - http://www.geocities.com/CapeCanaveral/Lab/1652/
>Home Computer - http://illusionary.tzo.cc/
>Page via mail - 412039@pager.mirabilis.com
>ICQ Universal Internet Number - 412039
>E-Mail - kg9ae@geocities.com
>
>
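The cleaning filter suggested in Topic No. 4 could look roughly like the sketch below. It is written in Python rather than Perl, and it assumes the files are ISO-8859-1, as the META tag quoted in this thread declares. The function names are hypothetical, and how such a filter is hooked into SWISH-E is not covered here.

```python
# Minimal sketch of a pre-indexing filter: drop NUL and other control
# characters, and turn high-bit bytes into numeric character references.
# Assumption: input is ISO-8859-1, so each byte value is its code point.

KEEP_CONTROLS = {0x09, 0x0A, 0x0D}  # tab, line feed, carriage return


def clean_html_bytes(data: bytes) -> bytes:
    """Return a 7-bit ASCII copy of `data` with stray control bytes removed."""
    out = bytearray()
    for b in data:
        if b < 0x20 and b not in KEEP_CONTROLS:
            continue  # drop NUL and the other C0 control characters
        if 0x7F <= b <= 0x9F:
            continue  # drop DEL and the C1 control range
        if b >= 0xA0:
            out += b"&#%d;" % b  # e.g. byte 0xE9 becomes &#233;
        else:
            out.append(b)
    return bytes(out)


def clean_file(src_path: str, dst_path: str) -> None:
    """Write a cleaned copy of src_path to dst_path (hypothetical helper)."""
    with open(src_path, "rb") as src:
        cleaned = clean_html_bytes(src.read())
    with open(dst_path, "wb") as dst:
        dst.write(cleaned)
```

Working on raw bytes avoids decoding errors on damaged files. Comparing a file before and after cleaning also shows whether it contained any of the invisible characters described above.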
>------------------------------
>
>Topic No. 5
>
>Date: Thu, 27 May 1999 13:46:51 -0500
>From: "David Norris" <kg9ae@geocities.com>
>To: <cblackst@teacher.mail.arlington.k12.va.us>
>Cc: "Multiple recipients of list" <swish-e@sunsite.berkeley.edu>
>Subject: RE: Problems with Indexing Meta Tags
>Message-ID: <NABBJAELJCIBPNFJODIGKEHPENAA.kg9ae@geocities.com>
>
>What specifically is causing it not to index correctly?  Read the comments
>for the 'MetaNames' property in the user.config.  You just don't specify any
>MetaNames.  If that doesn't do it, then there is likely some problem.
>
> ,David Norris
>
>World Wide Web - http://www.geocities.com/CapeCanaveral/Lab/1652/
>Home Computer - http://illusionary.tzo.cc/
>Page via mail - 412039@pager.mirabilis.com
>ICQ Universal Internet Number - 412039
>E-Mail - kg9ae@geocities.com
>
>
>------------------------------
>
>Topic No. 6
>
>Date: Thu, 27 May 1999 15:02:35 -0400
>From: Chris Blackstone <cblackst@teacher.mail.arlington.k12.va.us>
>To: David Norris <kg9ae@geocities.com>
>Subject: Re: Problems with Indexing Meta Tags
>Message-ID: <374D96C8.7BC662@tmail.arlington.k12.va.us>
>
>The following line of code in the header of a file prevents swish-e from
>spidering the file correctly:
>
><META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
>
>Netscape and other editors insert lines like this.
>If I leave the 'MetaNames' property blank with nothing after it, swish-e
>still doesn't spider correctly. Only if I remove the "<META HTTP...."
>line does it work fine.
>
>Any ideas?
>
>Chris
>
>
>David Norris wrote:
>> 
>> What specifically is causing it not to index correctly?  Read the comments
>> for the 'MetaNames' property in the user.config.  You just don't specify any
>> MetaNames.  If that doesn't do it, then there is likely some problem.
>> 
>>  ,David Norris
>> 
>> World Wide Web - http://www.geocities.com/CapeCanaveral/Lab/1652/
>> Home Computer - http://illusionary.tzo.cc/
>> Page via mail - 412039@pager.mirabilis.com
>> ICQ Universal Internet Number - 412039
>> E-Mail - kg9ae@geocities.com
>
>--
>Chris Blackstone
>Web Services Coordinator
>
>Arlington Public Schools -- 1426 N. Quincy Street -- Arlington, VA 22207
>Tel: 703/228-6185
>Fax: 703/875-9491
>Pager: 703/612-3042
>http://www.arlington.k12.va.us
>
>------------------------------
>
>End of SWISH-E Digest 379
>*************************
>
>
_______________________________________________________________________________
Steve Shearman               
WebMasters Ltd       Internet Marketing, Web Site Development and Hosting
P.O. 108189          Email             : steve@webmasters.co.nz
Auckland             WWW               : http://www.webmasters.co.nz/
New Zealand          Access NZ         : http://www.accessnz.co.nz/
                     Phone             : (09) 359 1116
                     Mobile            : 025 521 944
                     Customer Services : (09) 359 1111
                     Fax               : (09) 359 1110
                     Toll Free         : (0800) ACCESS or (0800) 222-377
Received on Sun May 30 20:14:33 1999