"\s is a whitespace character and represents
[\ \t\r\n\f]"
Hence, I suspect that if the data OP is dealing with has embedded tab chars, the count will be unreliable.
FTR, brian d. foy's remarks in perlfaq4.pod either ignore this case or indicate there's something wrong with my understanding (in which case, correction would be welcome).
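For anyone who wants to check the character class quoted above, here is a minimal sketch. The sample string and the variable names are made up for illustration; the point is only to compare what \s matches with what a space-and-tab tr counts.

```perl
use strict;
use warnings;

# Made-up sample: one each of space, tab, CR, LF and form feed between letters.
my $sample = "a b\tc\rd\ne\ff";

# A /g match in list context returns every match; assigning that list to ()
# and then to a scalar yields the number of matches without changing $sample.
my $ws_count = () = $sample =~ /\s/g;

# tr with an empty replacement list only counts; here just space and tab.
my $space_tab_count = ( $sample =~ tr/ \t// );

print "\\s matches:       $ws_count\n";
print "space or tab only: $space_tab_count\n";
```

Neither line modifies $sample, so the two counts can be compared on the same data.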
So, in the spirit of self-education, I tried this little experiment (Update: all tabs in the original are hard tabs; here they are written as \t escapes so they survive copying):

    #!/usr/bin/perl

    use strict;
    use warnings;

    my @var = (
        "now \tis the time",                      # space, tab between "now" and "is"
        "\tfor all good men",                     # leading tab, no space
        "to come to the aid of their \tparty.",   # space, tab before party
    );

    # hbm's method:
    my $count = 0;
    my $linecount = 0;
    for my $var (@var) {
        $linecount = $var =~ tr/ \t/ \t/;
        print "\$linecount: $linecount\n";
        $count += $linecount;
    }
    print "$count \n";

    # hbm's method with \t (tr doesn't know from "\s")
    my $count_s = 0;
    my $linecount_s = 0;
    for my $var (@var) {
        $linecount_s = $var =~ tr/\t/|\t/;
        print "\$linecount_s: $linecount_s\n";
        $count_s += $linecount_s;
    }
    print "$count_s \n";

    =head1 OUTPUT

    $linecount: 4
    $linecount: 4
    $linecount: 8
    16       # WTF? with tabs converted to spaces, I count 17 as I have my tabs set.
    $linecount_s: 1
    $linecount_s: 1
    $linecount_s: 1
    3

    =cut

Which largely undermines my supposition above.
Update 20090212 00:35
Ignore the "WTF?" comment in line 37 (the one next to the total of 16 in the OUTPUT section). That's not a Perl issue (nor a reflection of my inability to count, but it's waaaaay OT and way complicated). But having elaborated the code in this manner (still using the same array):
    # and now using \s & /g
    # output smells bad
    print "and now using \\s only, with /g modifier\n";
    my $count_s = 0;
    my $linecount_s = 0;
    for my $var (@var) {
        $linecount_s = scalar( $var =~ s/\s/_/g );
        print "\$linecount_s: $linecount_s, \$var after substitution: $var\n";
        $count_s += $linecount_s;
    }
    print "\$count_s: $count_s \n\n";
the output of that snippet confounds me:
    and now using \s only, with /g modifier
    $linecount_s: 3, $var after substitution: now_|is_the_time
    $linecount_s: 3, $var after substitution: |for_all_good_men
    $linecount_s: 7, $var after substitution: to_come_to_the_aid_of_their_|party.
    $count_s: 13
because -- while substituting "_" (only) for \s I now find pipes in the output where \t existed in @var. WTF????
More in the next node below, but it does NOT explain the pipes. :-(
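One reading that would account for the pipes, offered as a sketch and assuming the \s snippet ran in the same script after the two tr loops: tr/\t/|\t/ translates each tab to the first character of its replacement list, "|", and a foreach loop variable is an alias for the array element, so the change lands in @var itself. By the time s/\s/_/g runs, the tabs are already pipes, which would also explain a count of 3 rather than 4 on the first line. The names $after_tr, $copy, $fresh and $count_only below are mine, for illustration only.

```perl
use strict;
use warnings;

# One element of the test data, with the tab written as \t.
my @var = ("now \tis the time");

# Same tr as in the second loop: "\t" maps to "|" and the surplus "\t" in
# the replacement list is ignored. $var aliases the array element, so the
# tab inside @var itself becomes a pipe.
for my $var (@var) {
    my $tabs = ( $var =~ tr/\t/|\t/ );
    print "tabs translated: $tabs\n";
}
my $after_tr = $var[0];
print "after tr: $after_tr\n";

# A later substitution finds only the three spaces; the pipe is not \s.
my $copy = $after_tr;
my $subs = ( $copy =~ s/\s/_/g );
print "substitutions: $subs, result: $copy\n";

# Counting tabs without touching the data: leave the replacement list empty.
my $fresh = "now \tis the time";
my $count_only = ( $fresh =~ tr/\t// );
print "tabs counted: $count_only\n";
```

If that reading is right, writing the counting loop as tr/\t// (or working on a copy of each element) should leave @var untouched for whatever comes later in the script.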
In reply to "Re: counting leading white spaces" by ww, in thread "counting leading white spaces" by Spooky