Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, I have the following output:
2005-04-01 root replica "ml_v_dialer" 2004-06-22 root replica "pu_v_dialer" 2006-02-11 ccvob01 replica "rd_v_dialer" "v_dialer replica for Redmond" 2003-11-25 root replica "v_dialer_drcalvin"

From this output I want to store only the replica names (ml_v_dialer, pu_v_dialer, rd_v_dialer, v_dialer_drcalvin) in one array. Can anyone tell me what regular expression to use to get the replica names into an array?

Replies are listed 'Best First'.
Re: how to parse to get only the replica names in an array
by citromatik (Curate) on Aug 08, 2007 at 09:45 UTC

    You can try something like:

        use strict;
        use warnings;

        my @replicas;
        while (<DATA>){
            next if (! /^\d+/);    # skip lines that do not start with a date
            # push only on a successful match, so a stale $1 is never stored
            push @replicas, $1 if /replica\s+"(.+)"/;
        }
        __DATA__
        2005-04-01 root replica "ml_v_dialer"
        2004-06-22 root replica "pu_v_dialer"
        2006-02-11 ccvob01 replica "rd_v_dialer"
        "v_dialer replica for Redmond"
        2003-11-25 root replica "v_dialer_drcalvin"

    Update: Another possibility (if the file to process is not too big):

        use strict;
        use warnings;

        {
            local $/;
            my @replica = <DATA> =~ /replica\s+"(.+)"/g;
        }

    citromatik

      The third match would return rd_v_dialer" "v_dialer replica for Redmond instead of the wanted rd_v_dialer. You need the non-greedy quantifier "?" so that the match stops at the nearest closing '"' (matching as little as possible). So it should be /replica\s+"(.+?)"/;.

      Open source softwares? Share and enjoy. Make profit from them if you can. Yet, share and enjoy!
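
To make the greedy versus non-greedy point concrete, here is a minimal sketch. It assumes the quoted description sits on the same line as the replica name, which the original post does not confirm; the variable names are illustrative only.

```perl
use strict;
use warnings;

# Assumed layout: the description shares a line with the replica name.
my $line = '2006-02-11 ccvob01 replica "rd_v_dialer" "v_dialer replica for Redmond"';

my ($greedy)  = $line =~ /replica\s+"(.+)"/;      # .+ runs to the last quote on the line
my ($lazy)    = $line =~ /replica\s+"(.+?)"/;     # .+? stops at the nearest closing quote
my ($negated) = $line =~ /replica\s+"([^"]+)"/;   # [^"]+ can never cross a quote

print "greedy:  $greedy\n";
print "lazy:    $lazy\n";
print "negated: $negated\n";
```

Under that assumption the greedy capture is `rd_v_dialer" "v_dialer replica for Redmond`, while the lazy and negated-class patterns both yield `rd_v_dialer`.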

        Hmmm, are you sure? The dot "." matches any character except "\n", so the match will not extend across newlines (see perlretut):

            use strict;
            use warnings;
            use Data::Dumper;

            {
                local $/;
                my @replica = <DATA> =~ /replica\s+"(.+)"/g;
                print Dumper \@replica;
            }
            __DATA__
            2005-04-01 root replica "ml_v_dialer"
            2004-06-22 root replica "pu_v_dialer"
            2006-02-11 ccvob01 replica "rd_v_dialer"
            "v_dialer replica for Redmond"
            2003-11-25 root replica "v_dialer_drcalvin"

        Outputs:

            $VAR1 = [
                      'ml_v_dialer',
                      'pu_v_dialer',
                      'rd_v_dialer',
                      'v_dialer_drcalvin'
                    ];

        And:

            use strict;
            use warnings;
            use Data::Dumper;

            my @replica;
            while (<DATA>){
                next if (! /^\d+/);    # skip lines that do not start with a date
                # push only on a successful match, so a stale $1 is never stored
                push @replica, $1 if /replica\s+"(.+)"/;
            }
            print Dumper \@replica;
            __DATA__
            2005-04-01 root replica "ml_v_dialer"
            2004-06-22 root replica "pu_v_dialer"
            2006-02-11 ccvob01 replica "rd_v_dialer"
            "v_dialer replica for Redmond"
            2003-11-25 root replica "v_dialer_drcalvin"

        Prints the same:

            $VAR1 = [
                      'ml_v_dialer',
                      'pu_v_dialer',
                      'rd_v_dialer',
                      'v_dialer_drcalvin'
                    ];

        citromatik
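
The two replies above disagree only because they read the input layout differently. The sketch below runs the greedy pattern over a slurped string in both layouts. Both layouts are assumptions, since the original post does not show where the line breaks fall, and `greedy_names` is a hypothetical helper introduced for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the greedy pattern to a whole slurped string.
sub greedy_names {
    my ($text) = @_;
    return $text =~ /replica\s+"(.+)"/g;
}

# Layout A (assumed): the description sits on its own line.
my $separate = <<'END';
2006-02-11 ccvob01 replica "rd_v_dialer"
"v_dialer replica for Redmond"
2003-11-25 root replica "v_dialer_drcalvin"
END

# Layout B (assumed): the description follows the name on the same line.
my $same_line = <<'END';
2006-02-11 ccvob01 replica "rd_v_dialer" "v_dialer replica for Redmond"
2003-11-25 root replica "v_dialer_drcalvin"
END

print join('|', greedy_names($separate)), "\n";
print join('|', greedy_names($same_line)), "\n";
```

In layout A the newline stops `.+` before it can reach the description, so the greedy pattern is harmless. In layout B it swallows the description as well. A negated class such as `[^"]+` or the lazy `.+?` gives the bare name in both layouts, which makes either a safer choice when the layout is uncertain.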

Re: how to parse to get only the replica names in an array
by GrandFather (Saint) on Aug 08, 2007 at 09:49 UTC

    On the face of it, you simply need to match the first double quote character and then capture all the following characters up to the next double quote. See perlretut and perlre for the regex documentation.

    If you have trouble putting a regex together, write something (anything, actually), then post your attempt in a follow-up question and we will help more.


    DWIM is Perl's answer to Gödel

Re: how to parse to get only the replica names in an array
by naikonta (Curate) on Aug 08, 2007 at 10:25 UTC

    This will do:

        $ cat replica
        my @dialers;
        while (<DATA>) {
            chomp;
            push @dialers, $1 if /replica\s+"([^"]+)/;
        }
        print $_, "\n" for @dialers;
        __DATA__
        2005-04-01 root replica "ml_v_dialer"
        2004-06-22 root replica "pu_v_dialer"
        2006-02-11 ccvob01 replica "rd_v_dialer" "v_dialer replica for Redmond"
        2003-11-25 root replica "v_dialer_drcalvin"
        $ perl replica
        ml_v_dialer
        pu_v_dialer
        rd_v_dialer
        v_dialer_drcalvin

    You can also use split to achieve the same result. Instead of

        push @dialers, $1 if /replica\s+"([^"]+)/;

    you can write

        (my $dialer = (split)[3]) =~ s/"//g; push @dialers, $dialer;
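
One caveat with the split variant: it takes the fourth whitespace-separated field of every line, so a description that sits on its own line would contribute its own fourth word. The sketch below adds a guard for that case. The layout with the description on a separate line is an assumption, as are the names `$text` and `@fields`, and it also assumes replica names contain no whitespace.

```perl
use strict;
use warnings;

# Assumed input: the description sits on its own line.
my $text = <<'END';
2005-04-01 root replica "ml_v_dialer"
2006-02-11 ccvob01 replica "rd_v_dialer"
"v_dialer replica for Redmond"
2003-11-25 root replica "v_dialer_drcalvin"
END

my @dialers;
for my $line (split /\n/, $text) {
    my @fields = split ' ', $line;
    # Accept only dated records whose third field is the word "replica".
    next unless @fields >= 4
        && $fields[0] =~ /^\d{4}-\d{2}-\d{2}$/
        && $fields[2] eq 'replica';
    (my $dialer = $fields[3]) =~ s/"//g;    # strip the surrounding quotes
    push @dialers, $dialer;
}
print "$_\n" for @dialers;
```

Without the guard, the description line would split into four fields and add `Redmond` to the list.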

    Open source softwares? Share and enjoy. Make profit from them if you can. Yet, share and enjoy!