Re: Split a string into items of constant length
by Roy Johnson (Monsignor) on Oct 04, 2005 at 12:41 UTC
|
Another solution is to use magical values for open and $/:
my $a = 'a(b)cd(e)fg(h)ij(k)l';
open IN, '<', \$a or die "Reading string: $!\n";
$/=\5;
my @s = <IN>;
print map "$_\n", @s;
(You need a reasonably modern perl for this: opening a filehandle on a scalar reference arrived in 5.8.0, I think.)
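For anyone who wants to reuse the trick, here is a minimal sketch of the same idea wrapped in a sub, with a lexical filehandle and a localized $/ so the change does not leak into other code. The name fixed_records is made up for illustration, and I have only considered plain ASCII data here.

```perl
use strict;
use warnings;

# Hypothetical helper: split $string into records of $size characters
# by reading it back through an in-memory filehandle.
sub fixed_records {
    my ($string, $size) = @_;
    open my $fh, '<', \$string or die "Reading string: $!\n";
    local $/ = \$size;        # reference to an integer: fixed-size records
    my @records = <$fh>;      # the last record may be shorter than $size
    close $fh;
    return @records;
}

print "$_\n" for fixed_records('a(b)cd(e)fg(h)ij(k)l', 5);
```

Unlike a plain /.{5}/g match, a short trailing remainder is returned as a final, shorter record rather than dropped.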
Caution: Contents may have been coded under pressure.
| [reply] [d/l] |
Re: Split a string into items of constant length
by Perl Mouse (Chaplain) on Oct 04, 2005 at 10:11 UTC
|
my @chunks = unpack "(A5)*", "a(b)cd(e)f";
print $chunks[0], "\n";
print $chunks[1], "\n";
__END__
a(b)c
d(e)f
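One detail worth knowing about the template: the A format strips trailing whitespace from each chunk, while lowercase a returns the chunk untouched. A small sketch, using a made-up input string that happens to contain spaces:

```perl
use strict;
use warnings;

my $text = "ab   cd   ef";              # 12 characters, including spaces

my @stripped = unpack "(A5)*", $text;   # 'A' drops trailing spaces per chunk
my @verbatim = unpack "(a5)*", $text;   # 'a' keeps every character

print "A: [$_]\n" for @stripped;
print "a: [$_]\n" for @verbatim;
```

If the data can contain meaningful spaces, (a5)* is probably the safer template. In both cases a short remainder (here "ef") is returned as a final chunk.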
| [reply] [d/l] |
|
#!/usr/bin/perl
use strict;
use warnings;
while (<DATA>) {
my $line = $_; chomp $line;
my @chunks = unpack "(A5)*", $line;
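print scalar(@chunks) . "\n"; # number of chunks on this line
# beware: without the chomp above, the trailing newline would show up as an extra, empty 'entry' in chunks; note the < and >!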
foreach my $i (@chunks) {
print ">" . $i . "<\n";
}
}
__DATA__
a(b)cd(e)fg(h)i
j(k)lm(n)o
--
if ( 1 ) { $postman->ring() for (1..2); }
| [reply] [d/l] |
|
my $line = $_; chomp $line;
my @chunks = unpack "(A5)*", $line;
Just a side note: what's wrong with
chomp;
my @chunks = unpack "(A5)*", $_;
?
(or else
while (my $line=<DATA>) { ...
instead.)
Also, more on a stylistic personal preference ground:
foreach my $i (@chunks) {
print ">" . $i . "<\n";
why not
print ">$_<\n" for @chunks;
instead? | [reply] [d/l] [select] |
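Putting both suggestions together, the loop might look roughly like this. This is only a sketch: @lines stands in for what would be read from <DATA>, and @report is a made-up name used to collect the output before printing it.

```perl
use strict;
use warnings;

# Stand-in for the lines that would come from <DATA> in the original script
my @lines = ("a(b)cd(e)fg(h)i\n", "j(k)lm(n)o\n");

my @report;
for my $line (@lines) {
    chomp $line;                            # drop the newline before unpacking
    my @chunks = unpack "(A5)*", $line;
    # first the number of chunks, then each chunk between > and <
    push @report, scalar(@chunks), map ">$_<", @chunks;
}
print "$_\n" for @report;
```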
Re: Split a string into items of constant length
by Samy_rio (Vicar) on Oct 04, 2005 at 10:22 UTC
|
$a="a(b)cd(e)f";
print "\n", substr $a, 0,5,'' until $a eq '';
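Note that the four-argument substr eats the string as it goes, so the variable is empty afterwards. If the original needs to survive, a sketch along these lines walks over the offsets instead (the variable names are just for illustration):

```perl
use strict;
use warnings;

my $text = "a(b)cd(e)fg";
my @chunks;

# Step through the string 5 characters at a time without modifying it
for (my $pos = 0; $pos < length $text; $pos += 5) {
    push @chunks, substr $text, $pos, 5;   # the last chunk may be shorter
}

print "$_\n" for @chunks;
```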
| [reply] [d/l] |
|
Just a minor nitpick that is always worth repeating, IMHO: $a should not be used as a general purpose variable -- see sort.
Update: (especially for those who didn't like this post) sauoq explained in greater detail the potential issues with using $a and $b as general purpose variables. He also pointed out that, as is well known, in most cases it won't do much harm. However, it was apparent that the person I was replying to was not aware of them, and the OP appeared to be a newbie. So, in this context, I'm still convinced it was an important point to bring to their attention. Sad to notice that more than one person's mileage does vary...
| [reply] [d/l] [select] |
|
$ perl -le '$a="foo"; my @n = (3,2,1); print for sort {$a<=>$b} @n; print $a'
1
2
3
foo
It only becomes an issue if you declare them as lexical variables.
$ perl -le 'my $a; my @n = (3,2,1); print for sort {$a<=>$b} @n'
Can't use "my $a" in sort comparison at -e line 1.
And the better fix is probably not to avoid $a and $b but to be explicit in your sort blocks and subs by using $::a and $::b (or $Foo::a and $Foo::b if you are in package Foo).
$ perl -le 'my $a; my @n = (3,2,1); print for sort {$::a<=>$::b} @n;'
1
2
3
That isn't to say avoiding $a and $b is a bad thing... they are generally lousy variable names anyway. But I often use them in one-liners. So long as you know when and, more importantly, why it can be an issue, there's no harm in it.
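The same point as a small script rather than a one-liner, as a sketch for package main under strict and warnings:

```perl
use strict;
use warnings;

my $a = 'foo';        # a lexical $a now hides the package variable $main::a
my @n = (3, 2, 1);

# sort still sets the package variables, so name them explicitly
my @sorted = sort { $::a <=> $::b } @n;

print "@sorted\n";    # 1 2 3
print "$a\n";         # foo
```

The lexical $a is left untouched, because sort only ever assigns to the package variables.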
-sauoq
"My two cents aren't worth a dime.";
| [reply] [d/l] [select] |
|
|
thanks for all those suggestions!
| [reply] |
Re: Split a string into items of constant length
by sauoq (Abbot) on Oct 04, 2005 at 10:22 UTC
|
my $string = "a(b)cd(e)f";
my @parts = split /(?=.{5}$)/, $string;
And if your strings are longer and you want to chop it all up the same way...
my $string = "a(b)cd(e)fg(h)ij(k)l";
my @parts = split /(?=(?:.{5})+$)/, $string;
It would be simpler to drop split in this case though...
my $string = "a(b)cd(e)fg(h)ij(k)l";
my @parts = $string =~ /(.{5})/g;
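The two approaches behave differently when the length is not a multiple of 5, which may matter when choosing between them. A quick sketch with a made-up 7-character string:

```perl
use strict;
use warnings;

my $string = "abcdefg";    # 7 characters, not a multiple of 5

# The lookahead is anchored at the end, so the short piece comes first
my @by_split = split /(?=(?:.{5})+$)/, $string;

# The plain match works from the front and drops the short remainder
my @by_match = $string =~ /(.{5})/g;

print "split: @by_split\n";    # split: ab cdefg
print "match: @by_match\n";    # match: abcde
```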
-sauoq
"My two cents aren't worth a dime.";
| [reply] [d/l] [select] |
|
local $_ = "a(b)cd(e)f";
my @parts = /.{5}/g;
(Yes: of course this does not take care of the case when the given string has a length that is not a multiple of 5. But then I'm using unpack, just as Perl Mouse does!)
Update: fixed a split that I had inadvertently left in -- see struck-out code above. Thanks to sauoq's comment. | [reply] [d/l] [select] |
|
Yes: of course this does not take care of the case when the given string has a length that is not a multiple of 5.
If you want to catch that, you can change the regex to /.{1,5}/g.
perl -le 'print for "ABCDEFGHIJKLMNOPQRSTUVWXYZ" =~ /.{1,5}/g'
ABCDE
FGHIJ
KLMNO
PQRST
UVWXY
Z
Of course, depending on what it's used for, simply discarding it can be a better idea, and /.{5}/g works just fine for that. | [reply] [d/l] [select] |
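One more thing to watch for: by default . does not match a newline, so a string with embedded newlines gets cut at them and the newlines themselves are skipped. Adding the /s modifier makes . match any character. A sketch with a made-up input:

```perl
use strict;
use warnings;

my $text = "abc\ndefgh";

my @plain  = $text =~ /.{1,5}/g;     # chunks stop at the newline, which is dropped
my @with_s = $text =~ /.{1,5}/gs;    # newline counts as an ordinary character

print scalar(@plain),  " chunks without /s\n";
print scalar(@with_s), " chunks with /s\n";
```

Without /s the chunks are "abc" and "defgh"; with /s they are "abc\nd" and "efgh".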
|
my @parts = split /.{5}/g;
You probably meant to leave split out of that. I'm not sure because of your comment about a string with a length that isn't a multiple of 5 though... In any case, you were probably adding this at the same time I was adding (the correct version of) it as an afterthought to my own post. That's pretty quick as I added it within seconds... :-P
-sauoq
"My two cents aren't worth a dime.";
| [reply] [d/l] [select] |
Re: Split a string into items of constant length
by inman (Curate) on Oct 04, 2005 at 11:07 UTC
|
A simple regex should work fine. No need for anything complicated.
my $data = 'a(b)cd(e)f';
my @chunks = $data =~ /.{5}/g;
print "@chunks";
Change the value in the regex for a different number of chars. | [reply] [d/l] |
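If the chunk size is not known in advance, the number can be interpolated into the regex. A minimal sketch, where the sub name chunks and its argument check are my own additions; it uses {1,$n} so that a short remainder is kept, and /s so that newlines are not skipped:

```perl
use strict;
use warnings;

# Hypothetical helper: split $data into pieces of at most $n characters
sub chunks {
    my ($data, $n) = @_;
    die "chunk size must be a positive integer\n"
        unless defined $n && $n =~ /^[1-9]\d*$/;
    return $data =~ /.{1,$n}/gs;    # list of matches in list context
}

print join(' ', chunks('a(b)cd(e)fg(h)', 5)), "\n";    # a(b)c d(e)f g(h)
print join(' ', chunks('a(b)cd(e)fg(h)', 3)), "\n";    # a(b )cd (e) fg( h)
```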
Re: Split a string into items of constant length
by blazar (Canon) on Oct 04, 2005 at 10:22 UTC
|
do_something($_) for unpack 'A5A5', 'a(b)cd(e)f';
Update: I know that we should not really care about XP points, but since I see a -1 on top of this node, I'm astonished. What's wrong with the code I proposed?
I don't see why it shouldn't work:
$ perl -le 'print for unpack qw/A5A5 a(b)cd(e)f/'
a(b)c
d(e)f
I wonder if the person who downvoted this also had something intelligent to say...
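For strings longer than two items, the fixed template can be built with the repetition operator instead of being typed out. A sketch, assuming only complete 5-character items are wanted:

```perl
use strict;
use warnings;

my $string = 'a(b)cd(e)fg(h)i';
my $count  = int(length($string) / 5);    # number of complete items

# 'A5' x 3 yields the template 'A5A5A5'
my @parts = unpack('A5' x $count, $string);

print "$_\n" for @parts;
```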
| [reply] [d/l] [select] |