Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I'm attempting to extract numbers from codes in this generalised format (always begin with MD:Z: but what follows is variable):

MD:Z:4C3C7C0T2^T9C44

Desired output (separate array elements):

4 3 7 0 2 9 44

As I need to maintain positional information and the length of each number (i.e. it must be clear that 44 = 44 and not the result of something like 4T4 after screening out letters), I've tried to use split:

my @test = split((/\D/g), $input);

This splits on any non-digit; however, it returns a set of leading empty array elements arising from the consecutive matches on 'MD:Z:'. Is there a way to suppress these, or a better way to get the desired result?

Re: Leading empty array elements after splitting
by Athanasius (Archbishop) on Aug 22, 2016 at 15:26 UTC

    If you use warnings, you’ll see that the /g modifier on the regex has no effect within a split. But why not keep the modifier, change \D to \d+, and dispense with split altogether?

    1:22 >perl -MData::Dump -wE "my $input = 'MD:Z:4C3C7C0T2^T9C44'; my @test = $input =~ /\d+/g; dd \@test;"
    [4, 3, 7, 0, 2, 9, 44]

    Hope that helps,

    Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

      "use warnings;" and "use strict;" are always on for me, yet no warning was given. Odd!

        What version of Perl are you using (perl -v)?

        On both 5.20.3 and 5.24.0, this code produces the following warning:

        perl -wMstrict -E 'my @test = split((/\D/g), "x4d77");'
        Use of /g modifier is meaningless in split at -e line 1.
Re: Leading empty array elements after splitting
by stevieb (Canon) on Aug 22, 2016 at 15:27 UTC

    I'd use a global regex to extract just the digits directly into an array:

    use warnings;
    use strict;

    my $string = 'MD:Z:4C3C7C0T2^T9C44';
    my @nums = $string =~ /\d+/g;

    print "$_\n" for @nums;

    Output:

    4
    3
    7
    0
    2
    9
    44
Re: Leading empty array elements after splitting
by TomDLux (Vicar) on Aug 23, 2016 at 16:01 UTC

    Extracting just the numbers works well, but since you describe a problem with getting blank stuff you don't want, you might as well know about grep ...

    22 11:56:12 tomdlux [ ~ ] $ p522 -E 'say for split /\D+/, "MD:Z:4C3C7C0T2^T9C44"'

    4
    3
    7
    0
    2
    9
    44
    23 11:56:19 tomdlux [ ~ ] $ p522 -E 'say for grep {length $_} split /\D+/, "MD:Z:4C3C7C0T2^T9C44"'
    4
    3
    7
    0
    2
    9
    44

    As Occam said: Entia non sunt multiplicanda praeter necessitatem ("entities must not be multiplied beyond necessity").
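
For comparison, here is a minimal sketch that puts the two approaches from this thread side by side: matching the digit runs directly with /\d+/g, and splitting on /\D+/ then filtering out the empty leading field with grep. The sub names md_numbers and md_numbers_split are made up for illustration and do not appear in any of the replies; the input string is the one from the original question.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption): return every run of digits
# in the order it appears. In list context, a /g match returns all matches.
sub md_numbers {
    my ($tag) = @_;
    my @nums = $tag =~ /\d+/g;
    return @nums;
}

# Hypothetical helper (name is an assumption): split on runs of non-digits.
# The leading 'MD:Z:' acts as a separator at the start of the string, which
# leaves one empty leading field, so grep drops zero-length elements.
sub md_numbers_split {
    my ($tag) = @_;
    my @nums = grep { length $_ } split /\D+/, $tag;
    return @nums;
}

my $input = 'MD:Z:4C3C7C0T2^T9C44';
print join(' ', md_numbers($input)), "\n";
print join(' ', md_numbers_split($input)), "\n";
```

Both subs keep each number whole and in its original order, so 44 stays 44. The direct match is probably the simpler choice when only the numbers are needed; the split version is mainly useful if you are already committed to split for other reasons.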