Hi Peter,

Thanks for your help, but the problem is still not resolved. Similar errors also include:

When indexing:

https://www.lbl.gov/lists.archives/theta13-offline.archive/:
Warning: Unknown header line: 'https://www.lbl.gov/lists.archives/theta13-offline.archive/author.html' from program spider.pl
err: External program failed to return required headers Path-Name:

or

https://www.lbl.gov/lists.archives/theta13-eng.archive/:
Warning: Unknown header line: 'ive/author.html' from program spider.pl
err: External program failed to return required headers Path-Name:

and other similar error messages. It seems to me that spider.pl does not parse the hypermail archive correctly. Any help?

Best Regards

Xinchun Tian wrote on 3/2/08 12:59 AM:
> Hi experts,
>
> I am a swish-e newbie, and am trying to index hypermail archives on a remote web server.

> DefaultContents HTML*

Try:

DefaultContents XML*

instead.

> https://www.lbl.gov/lists.archives/theta13-general.archive/:1: error: htmlParseStartTag: invalid element name
> <?xml version="1.0" encoding="ISO-8859-1"?>
> ^

Looks like your hypermail server is returning XHTML with the XML declaration, and the libxml2 parser is expecting HTML.

-- 
Peter Karman . http://peknet.com/ . peter(at)not-real.peknet.com

Xinchun Tian
2008-03-02
_______________________________________________
Users mailing list
Users@lists.swish-e.org
http://lists.swish-e.org/listinfo/users

Received on Sun Mar 2 08:38:10 2008
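
For anyone hitting the same htmlParseStartTag error, the diagnosis quoted above (a page that opens with an XML declaration being handed to an HTML parser) can be checked before indexing. The following is only a minimal sketch in Python: choose_default_contents is a hypothetical helper, not part of swish-e or spider.pl, and it assumes the page body has already been fetched as bytes. It returns the DefaultContents value that the thread suggests for each case.

```python
import codecs


def choose_default_contents(body: bytes) -> str:
    """Guess a DefaultContents value ("XML*" or "HTML*") for a fetched page.

    Hypothetical helper for illustration only; swish-e and spider.pl do not
    provide this function.
    """
    # Skip a UTF-8 byte-order mark, if present, then any leading whitespace.
    if body.startswith(codecs.BOM_UTF8):
        body = body[len(codecs.BOM_UTF8):]
    head = body.lstrip()

    # An XML declaration such as <?xml version="1.0" ...?> marks XHTML/XML,
    # which an HTML-only parser rejects with "invalid element name".
    if head[:5].lower() == b"<?xml":
        return "XML*"
    return "HTML*"


if __name__ == "__main__":
    sample = b'<?xml version="1.0" encoding="ISO-8859-1"?>\n<html></html>'
    print(choose_default_contents(sample))
```

This only inspects the first bytes of the document; it does not validate the markup, and it says nothing about the separate "Unknown header line" warnings from spider.pl.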