Does Perl's \w match all alphanumeric characters defined in the Unicode standard? For example, will \w match all (say) Chinese and Russian alphanumeric characters?
I wrote a simple test script (see below) which suggests that \w does indeed match "as expected" for the non-ASCII alphanumeric characters I tested. But the testing is obviously far from exhaustive.
#!/usr/bin/perl
use utf8;
binmode(STDOUT, ':utf8');
my @ok;
$ok[0] = "abcdefghijklmnopqrstuvwxyz";
$ok[1] = "éè?áà???????í?ń??áy?ó?????";
$ok[2] = "??ü??ai?ó?ń???íáυσνχατ???η";
$ok[3] = "τσιαιγολοχβ?αν???????тераб";
$ok[4] = "иневоаслк??иневоцеда?еволс";
$ok[5] = "рглсывызтоμ??κινα??γο";
foreach my $ok (@ok) {
    die unless ($ok =~ /^\w+$/);
}
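Since hand-picked strings only cover a few characters, a more systematic check could walk whole ranges of code points and compare \w against Unicode properties. The following is only a sketch under some assumptions: it needs Perl 5.14 or later for the /u modifier, the helper name count_unmatched is made up for illustration, the block ranges are just examples, and it takes "alphanumeric" to mean the Alphabetic property plus decimal digits (Nd).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count code points in [$from, $to] that Unicode
# classifies as Alphabetic or as a decimal digit (Nd) but that \w
# fails to match.
sub count_unmatched {
    my ($from, $to) = @_;
    my $missed = 0;
    for my $cp ($from .. $to) {
        next if $cp >= 0xD800 && $cp <= 0xDFFF;    # skip surrogates
        my $ch = chr($cp);
        # Only consider characters Unicode treats as alphanumeric.
        next unless $ch =~ /\A[\p{Alphabetic}\p{Nd}]\z/;
        # /u forces Unicode rules even for code points below 256.
        $missed++ unless $ch =~ /\A\w\z/u;
    }
    return $missed;
}

# Example ranges (illustrative, not exhaustive).
my @blocks = (
    [ 'Latin-1',                0x0000, 0x00FF ],
    [ 'Greek and Coptic',       0x0370, 0x03FF ],
    [ 'Cyrillic',               0x0400, 0x04FF ],
    [ 'CJK Unified Ideographs', 0x4E00, 0x9FFF ],
);

for my $block (@blocks) {
    my ($name, $from, $to) = @$block;
    printf "%-24s %d alphanumeric code point(s) not matched by \\w\n",
        $name, count_unmatched($from, $to);
}
```

A count of zero for a range means every alphabetic or decimal-digit code point in it was matched by \w. Passing 0 and 0x10FFFF to the helper would cover the whole code space, at the cost of a longer run.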