Re: [swish-e] multiple Warnings: 'could not be encoded to charset 'ISO-8859-1'

From: Peter Karman
Date: Thu, 15 Mar 2012 20:16:17 -0500
Dr Michael Daly wrote on 3/15/12 8:00 AM:
> for _docs/test3, I only see the two .doc files
> 
> I deleted some .html files in the _docs dir, and now I get a different output
> (it goes on & on, attempting to index .xls files in /_doc):
> 
>  swish-e -S prog -c /share/MD0_DATA/swish-e-files/swish-e-conf/web_1.conf
> Indexing Data Source: "External-Program"
> Indexing "spider.pl"
> External Program found: /opt/lib/swish-e/spider.pl
> Missing argument in sprintf at /opt/lib/swish-e/spider.pl line 38.
> Missing argument in sprintf at /opt/lib/swish-e/spider.pl line 38.
> /opt/lib/swish-e/spider.pl: Reading parameters from 'default'
> 
> Summary for: http://localhost:104/_docs/test3/Reception-duties.doc
>              Connection: Close:     1  (1.0/sec)
>                    Total Bytes: 1,217  (1217.0/sec)
>                     Total Docs:     1  (1.0/sec)
>                    Unique URLs:     1  (1.0/sec)
> application/msword->text/plain:     1  (1.0/sec)
> Warning: document 'http://localhost:104/_docs/test3/' could not be encoded
> to charset 'ISO-8859-1'
> Warning: document 'http://localhost:104/_docs/' could not be encoded to
> charset 'ISO-8859-1'
> Warning: document 'http://localhost:104/' could not be encoded to charset
> 'ISO-8859-1'
> http://localhost:104/_docs/2008%20CASH%20FLOW%20ESTIMATES.xls:317: error:
> Unexpected end tag : table
> </table>
>         ^
> http://localhost:104/_docs/2008%20CASH%20FLOW%20ESTIMATES.xls:318: error:
> Unexpected end tag : table
> </table>
>         ^
> Warning: document 'http://localhost:104/_docs/21st_aug/' could not be
> encoded to charset 'ISO-8859-1'
> http://localhost:104/_docs/%20sims%20st.xls:396: error: Unexpected end tag
> : table
> </table>
>         ^
> http://localhost:104/_docs/%20thomas%20st.xls:191: error: Unexpected end
> tag : table
> </table>
>         ^
> http://localhost:104/_docs/%20thomas%20st.xls:192: error: Unexpected end
> tag : table
> </table>
>         ^
> Syntax Error: Couldn't read xref table
> Syntax Warning: PDF file is damaged - attempting to reconstruct xref table...
> http://localhost:104/_docs/Book1.xls:14648: error: Unexpected end tag : table
> </table>
> 
> 
> What is wrong?


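Why is your .xls being indexed as a .pdf? The "Couldn't read xref table"
messages in your output come from a PDF parser, not from anything that
handles spreadsheets.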

What are the contents of
/share/MD0_DATA/swish-e-files/swish-e-conf/web_1.conf
?

Again, break this down to a single URL to isolate your problem. Try turning on
the spider debug options too:

http://swish-e.org/docs/spider.html#debug
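
One way to look at a single URL outside of swish-e is to ask the server what
Content-Type it reports and compare that with the file extension. The script
below is only a rough sketch: EXPECTED_TYPES, extension_of(), type_matches()
and reported_type() are names made up for this example, the type table is an
assumption you would adjust for your own files, and it assumes the converter
is picked from the reported type, which your setup may or may not do.

```python
import posixpath
import sys
from urllib.parse import unquote, urlsplit
from urllib.request import Request, urlopen

# Hypothetical table for illustration only; extend it for your own documents.
EXPECTED_TYPES = {
    ".doc": {"application/msword"},
    ".xls": {"application/vnd.ms-excel"},
    ".pdf": {"application/pdf"},
    ".html": {"text/html"},
}


def extension_of(url):
    """Return the lower-cased file extension of the URL path ('' if none)."""
    path = unquote(urlsplit(url).path)  # turns %20 back into a space
    return posixpath.splitext(path)[1].lower()


def type_matches(url, content_type):
    """Return True if content_type is plausible for the URL's extension."""
    expected = EXPECTED_TYPES.get(extension_of(url))
    if expected is None:
        return True  # unknown or missing extension: nothing to compare
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in expected


def reported_type(url):
    """Send a HEAD request and return the server's Content-Type header."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=10) as response:
        return response.headers.get("Content-Type", "")


if __name__ == "__main__":
    # Pass one or more URLs on the command line, one document at a time.
    for url in sys.argv[1:]:
        ctype = reported_type(url)
        verdict = "ok" if type_matches(url, ctype) else "MISMATCH"
        print(f"{verdict}: {url} -> {ctype or '(no Content-Type)'}")
```

If a .xls URL comes back as MISMATCH, the server's MIME mapping is worth
checking before looking at the swish-e config.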



-- 
Peter Karman  .  http://peknet.com/  .  peter(at)not-real.peknet.com
Received on Fri Mar 16 2012 - 01:16:19 GMT