Re: swishspider

From: Martial Chartoire <chartoir(at)not-real.ipnl.in2p3.fr>
Date: Thu Jul 26 2001 - 14:14:40 GMT
On 26 Jul, Bill Moseley wrote:
> At 12:53 AM 07/26/01 -0700, Martial Chartoire wrote:
> 
>>  sub linkcb {
>>      my($tag, %links) = @_;
>>!     my $link;
>>!     my($ok) = 0; 
>>!     
>>!     if (($tag =~ /^a/) && ($links{"href"}))  {
> 
> Would it be better to enumerate which tags instead of just tags that start
> with an "a"?  Or do you think that's a safe approach?
> 
> Thanks for the patch.
> 
> 
> 
> 
> Bill Moseley
> mailto:moseley@hank.org

I think that's a safe approach, because HTML::LinkExtor
only knows this list of link tags:

# Tags that might contain links and the link attribute name(s)
%LINK_ELEMENT =
(
 a       => 'href',
 applet  => [qw(archive codebase code)],
 area    => 'href',
 base    => 'href',
 bgsound => 'src',
 blockquote => 'cite',
 body    => 'background',
 del     => 'cite',
 embed   => [qw(pluginspage src)],
 form    => 'action',
 frame   => [qw(src longdesc)],
 iframe  => [qw(src longdesc)],
 ilayer  => 'background',
 img     => [qw(src lowsrc longdesc usemap)],
 input   => [qw(src usemap)],
 ins     => 'cite',
 isindex => 'action',
 head    => 'profile',
 layer   => [qw(background src)],
'link'   => 'href',
 object  => [qw(classid codebase data archive usemap)],
'q'      => 'cite',
 script  => [qw(src for)],
 table   => 'background',
 td      => 'background',
 th      => 'background',
 xmp     => 'href',
);

Three tags in this list match /^a/: 'a', 'applet' and 'area'. Of these,
only 'applet' has no 'href' attribute, so the test accepts just 'a' and 'area'.
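
As a quick check of that reasoning, here is a minimal sketch that compares the patch's test (tag matches /^a/ and has an 'href') with an explicit enumeration of tags. The hash is only the "a" subset of the table quoted above, and %link_element, tag_has_attr, @loose and @strict are names made up for this illustration; none of them are part of swishspider or HTML::LinkExtor.

```perl
use strict;
use warnings;

# Hypothetical subset of the table quoted above:
# only the entries whose tag name starts with "a".
my %link_element = (
    a      => 'href',
    applet => [qw(archive codebase code)],
    area   => 'href',
);

# Hypothetical helper: true if the tag can carry the given link attribute.
sub tag_has_attr {
    my ($tag, $attr) = @_;
    my $attrs = $link_element{$tag} or return 0;
    my @list = ref $attrs ? @$attrs : ($attrs);
    return scalar(grep { $_ eq $attr } @list);
}

# Tags accepted by the patch's test: name matches /^a/ and 'href' is possible.
my @loose = grep { /^a/ && tag_has_attr($_, 'href') } sort keys %link_element;

# Tags accepted by an explicit enumeration of tag names.
my @strict = grep { /^(?:a|area)$/ } sort keys %link_element;

print "loose:  @loose\n";
print "strict: @strict\n";
```

With this table both lists come out the same ('a' and 'area'), so the two approaches should behave identically as long as HTML::LinkExtor's list does not gain another tag starting with "a" that takes an 'href'.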

-- 
Martial Chartoire, Service Informatique | E-mail: m.chartoire@ipnl.in2p3.fr
Institut de Physique Nucleaire de Lyon  | phone : +33 472 448 430
43, BD du 11 Novembre 1918              | fax   : +33 472 448 004
F 69622 Villeurbanne Cedex              |
Received on Thu Jul 26 14:18:17 2001