EDIT: added the third possibility to the benchmark (null out the 'vlan ' and then split)

RE-EDIT: grumble grumble markup grumble

Captures and the /g flag will help here. (Rewritten slightly to allow me to test the code easily.)

#!/opt/local/bin/perl
use strict;
use warnings;

my $file = shift @ARGV or die "usage: $0 vlan-file\n";
open my $vlan_in, '<', $file or die "Can't open '$file': $!\n";
my @vlans = <$vlan_in>;
foreach my $line (@vlans) {
  chomp $line;
  my @items = ($line =~ /(    # capture:
                          \d  # digit characters
                            + # one or more in a row
                             )/gx );   # as many as you can find
  # \d+ only ever captures digits, so the items need no further cleanup.
  foreach my $item (@items) {
    print "vlan $item\n";
  }
}

The regex is /x'ed for tutorial purposes; you'd almost certainly write it as /(\d+)/g in your program. Since we're using \d+, we'll match the longest possible string of digits in each case. Alternatively, we could use split():

#!/opt/local/bin/perl
use strict;
use warnings;

my $file = shift @ARGV or die "usage: $0 vlan-file\n";
open my $vlan_in, '<', $file or die "Can't open '$file': $!\n";
my @vlans = <$vlan_in>;
foreach my $line (@vlans) {
  chomp $line;

  # Separate the 'vlan' from the list (and throw it away).
  my (undef, $vlans) = split /\s+/, $line;

  # Skip lines that have no list after the keyword.
  next unless defined $vlans;

  # Break up the list into items.
  my @items = split /,/, $vlans;

  # Print your new output.
  foreach my $item (@items) {
    print "vlan $item\n";
  }
}

Timing this:

use Benchmark qw(:all);
my @lines = split /\n/, <<EOF;
vlan 107
vlan 121
vlan 122,127,129,137
vlan
EOF
pop @lines;
cmpthese(
  500_000,
  { 'split-split' => sub {
                     my @copy = @lines;
                     foreach my $line (@copy) {
                         my (undef, $vlans) = split /\s+/, $line;
                         my @items = split /,/, $vlans;
                      }
                 },
     '/g'    => sub {
                      my @copy = @lines;
                      foreach my $line (@copy) {
                         my @items = ($line =~ /(\d+)/g);
                      }
                },
    'sub-split' => sub {
                      my @copy = @lines;
                      foreach my $line (@copy) {
                         # Bind to $line; a bare s/vlan // would act on $_.
                         $line =~ s/vlan //;
                         my @items = split /,/, $line;
                      }
                   },
   }
);

shows that substitute then split is fastest. 100 thousand iterations:

                Rate split-split          /g   sub-split
split-split 104167/s          --         -8%        -22%
/g          113636/s          9%          --        -15%
sub-split   133333/s         28%         17%          --

500 thousand:

                Rate split-split          /g   sub-split
split-split 102459/s          --        -11%        -26%
/g          115741/s         13%          --        -16%
sub-split   138504/s         35%         20%          --

1 million:

                Rate split-split          /g   sub-split
split-split 100503/s          --        -12%        -28%
/g          114286/s         14%          --        -18%
sub-split   138889/s         38%         22%          --

5 million:

                Rate split-split          /g   sub-split
split-split 102480/s          --        -10%        -24%
/g          114495/s         12%          --        -15%
sub-split   134590/s         31%         18%          --

And that's all I feel like running. sub-split's lead appears to grow a little as the number of iterations increases; I think that's because the substitution allows the string to be "shrunk" in place without allocating any more memory. However, the lead starts falling off again at 5 million iterations; someone with more time than I have might want to investigate higher iteration counts.

Basically, substituting out the stuff you don't need and then splitting is fastest.
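
For completeness, here is a minimal sketch of the substitute-then-split variant as a standalone script, in the same spirit as the two programs above. The helper name expand_vlan_line and the sample input are my own inventions for illustration, and I'm assuming every line starts with the keyword 'vlan' followed by a comma-separated list with no spaces around the commas.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): expand one 'vlan a,b,c' line
# into the list 'vlan a', 'vlan b', 'vlan c'.
sub expand_vlan_line {
    my ($line) = @_;
    chomp $line;

    # Null out the keyword. Note the binding to $line: a bare s/vlan //
    # would operate on $_ instead.
    $line =~ s/^vlan\s*//;

    # Whatever is left is the comma-separated list; an empty string
    # splits to an empty list, so a bare 'vlan' line produces nothing.
    return map { "vlan $_" } split /,/, $line;
}

# Sample input (assumed, modelled on the benchmark data above).
my @input  = ('vlan 107', 'vlan 122,127,129,137', 'vlan');
my @output = map { expand_vlan_line($_) } @input;
print "$_\n" for @output;
```

Because the substitution is anchored with ^ and allows for a missing list, a line with only the keyword falls through without a warning, which the split-split version needs an explicit defined check to handle.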

In reply to Re: replace separator from array elements by pemungkah
in thread replace separator from array elements by nidhi
