Re: Ignore Question

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Sat Mar 01 2003 - 06:22:24 GMT
BTW -- Here's an example of looking at what the Perl variables contain.
It also shows how Perl 5.6.1 is broken.  It takes a single UTF-8 character
and splits it using a regular expression containing an 8-bit character (not
flagged as UTF-8).  That byte happens to match the middle byte of the
character's UTF-8 encoding.

#!perl -w
use strict;
use Devel::Peek;


my $x = "\x{263A}";      # one character; UTF-8 bytes are \342\230\272
Dump($x);
my $y = chr( 128+24 );   # one byte, \230 (0x98), not flagged as UTF-8
Dump($y);

print "\nsplit..\n\n";

my @foo = split /$y/, $x;
print "Split into ", scalar @foo, " scalars\n";

print "\nFirst element:\n";
Dump( $foo[0] );

print "\nSecond element\n";
Dump( $foo[1]);

print "Now try to print\n$foo[0]\n";


Run this with 5.6.1 and you get:

SV = PV(0x80f6344) at 0x80fd444
  REFCNT = 1
  FLAGS = (PADBUSY,PADMY,POK,pPOK,UTF8)
  PV = 0x80f9e58 "\342\230\272"\0  <<< there's the UTF-8 char.
  CUR = 3
  LEN = 4
SV = PV(0x80f63b0) at 0x80fd414
  REFCNT = 1
  FLAGS = (PADBUSY,PADMY,POK,pPOK)
  PV = 0x80fe300 "\230"\0   <<< non-utf
  CUR = 1
  LEN = 2

split..

Split into 2 scalars

First element:
SV = PV(0x80f64d0) at 0x80fd3c0
  REFCNT = 1
  FLAGS = (POK,pPOK,UTF8)
  PV = 0x8107168 "\342"\0   <<< broken character
  CUR = 1
  LEN = 2

Second element
SV = PV(0x80f6494) at 0x8115fa4
  REFCNT = 1
  FLAGS = (POK,pPOK,UTF8)
  PV = 0x8106e58 "\272"\0   <<< same here.
  CUR = 1
  LEN = 2
Now try to print


Perl 5.8 doesn't do this: the split leaves the character intact.
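
For comparison, here is a minimal sketch of the same split without Devel::Peek. It assumes Perl 5.8.1 or later, where utf8::is_utf8() is available without loading anything. The @parts name is just for illustration. The idea is that a fixed perl compares the pattern against the string character by character, so the lone 0x98 byte no longer matches inside the encoded character.

```perl
#!perl -w
use strict;

# U+263A is stored as the three UTF-8 bytes \342\230\272.
my $x = "\x{263A}";

# A single 8-bit character (0x98, octal 230), not flagged as UTF-8.
my $y = chr( 128+24 );

my @parts = split /$y/, $x;

# Report what came back using numbers only, so no wide characters
# are printed to an unprepared filehandle.
printf "Split into %d scalar(s)\n", scalar @parts;
printf "First element: %d character(s), code point U+%04X\n",
    length( $parts[0] ), ord( $parts[0] );
printf "UTF8 flag on first element: %s\n",
    utf8::is_utf8( $parts[0] ) ? "on" : "off";
```

If this reports one scalar holding the single code point U+263A, the perl in use does not have the 5.6.1 problem.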

Maybe this will help you debug.



-- 
Bill Moseley moseley@hank.org
Received on Sat Mar 1 06:23:12 2003