xMiDgArDx has asked for the wisdom of the Perl Monks concerning the following question:

I have these 2 blocks that I need to parse:
#!perl
while(<DATA>){
    if ($_ =~ /^\s*(\d+?)\s+(\d+?)\s+?(.*?) \s+(.*?)\s+(.*)/) {
        print "================= IF 1 =================\n";
        print "SLOT: $1\n";
        print "PORTAS: $2\n";
        print "DESC: $3\n";
        print "Model: $4\n";
        print "Sw: $5\n";
        print "\n";
    }
    elsif($_ =~ /^\s*(\d+)\s+(.+)\s*((?:WS|76).*?)\s+(\w+)\s+(.*)\s+(\w+)$/) {
        print "================= ELSIF =================\n";
        print "<> Slot: $1\n";
        print "<> Desc: $2\n";
        print "<> Model: $3\n";
        print "<> Sw: $4\n";
        print "<> Hw: $5\n";
        print "<> Status: $6\n";
        print "\n";
    }
}
close(DATA);
__DATA__
Mod Ports Card Type                              Model              Serial No.
--- ----- -------------------------------------- ------------------ -----------
  1   24  CEF720 24 port 1000mb SFP              WS-X6724-SFP       SAL1434RA09
  3   20  7600 ES+T                              76-ES+T-20G        JAE14530455
  5    2  Route Switch Processor 720 (Active)    RSP720-3CXL-GE     JAE14330N9B
  6    2  Route Switch Processor 720 (Hot)       RSP720-3CXL-GE     JAE14330NA6
  7    4  CEF720 4 port 10-Gigabit Ethernet      WS-X6704-10GE      SAL1433QVJQ
  8    4  CEF720 4 port 10-Gigabit Ethernet      WS-X6704-10GE      SAL1433QVJW

Mod  Sub-Module                  Model              Serial       Hw     Status
---- --------------------------- ------------------ ----------- ------- -------
  1  Distributed Forwarding Card WS-F6700-DFC3CXL   SAL1434RLPY  1.6    Ok
  3  7600 ES+ DFC XL             7600-ES+3CXL       JAE14520N29  1.2    Ok
  3  7600 ES+T 20x1GE SFP        76-ES+T-20GQ       JAE145301XM  1.1    Ok
  5  Policy Feature Card 3       7600-PFC3CXL       JAE14330E6J  1.1    Ok
  5  C7600 MSFC4 Daughterboard   7600-MSFC4         JAE14320QBE  1.6    Ok
  6  Policy Feature Card 3       7600-PFC3CXL       JAE14330EAO  1.1    Ok
  6  C7600 MSFC4 Daughterboard   7600-MSFC4         JAE14320QA8  1.6    Ok
  7  Distributed Forwarding Card WS-F6700-DFC3CXL   SAL1433QHBR  1.6    Ok
  8  Distributed Forwarding Card WS-F6700-DFC3CXL   SAL1433QXF9  1.6    Ok
The first if should parse only the first block, and the elsif the second block. But I have a problem when the description starts with digits (\d+):

=> 3 20 7600 ES+T 76-ES+T-20G JAE14530455
=> 3 7600 ES+ DFC XL 7600-ES+3CXL JAE14520N29 1.2 Ok
=> 3 7600 ES+T 20x1GE SFP 76-ES+T-20GQ JAE145301XM 1.1 Ok

My problem is with these 3 lines: they are all matched by the first if and never reach the elsif. Does anybody know how I can do this? Thanks.

Replies are listed 'Best First'.
Re: Problem with regex ...
by johngg (Canon) on Dec 26, 2012 at 18:07 UTC

    Your data appears to be tabular in fixed-width columns so consider using unpack rather than regular expression captures. See pack for a description of how to construct a template to extract your data items.
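
    A minimal sketch of that approach for one row of the first table, assuming the column widths can be read off the dashed ruler (3, 5, 38, 18 and 11 characters, one space between columns). The helper name split_fixed and the hand-written template are illustrative only, and the sample row is rebuilt with sprintf on the assumption that the numbers are right-aligned in their columns.

```perl
#!perl
use strict;
use warnings;

# Split one fixed-width row with unpack and trim the left padding.
sub split_fixed {
    my ($template, $line) = @_;
    my @fields = unpack $template, $line;
    s/^\s+// for @fields;    # 'A' already drops trailing spaces
    return @fields;
}

# Widths 3, 5, 38 and 18 from the dashed ruler, each plus one separating
# space; 'A*' takes the rest of the line (the serial number).
my $template = 'A4 A6 A39 A19 A*';

# Sample row rebuilt with sprintf so the column positions are unambiguous.
my $line = sprintf '%3s %4s  %-38s %-18s %s',
    3, 20, '7600 ES+T', '76-ES+T-20G', 'JAE14530455';

print join('|', split_fixed($template, $line)), "\n";
# prints: 3|20|7600 ES+T|76-ES+T-20G|JAE14530455
```

    Because the split is by position, a description that starts with digits ("7600 ES+T") cannot be confused with the port count.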

    Cheers,

    JohnGG

      Hi JohnGG,

      Better not to spend your time repeating this; we already preached it!

      And the OP got brilliant answers with excellent code snippets. =)

      --> How we can do regex of this?

      @xMiDgArDx: please describe your problem more clearly, and use code tags there too. And in the future, please link to older discussions about the same topic to avoid wasted effort.

      Cheers Rolf

Re: Problem with regex ...
by LanX (Saint) on Dec 26, 2012 at 22:26 UTC
    If your problem is to distinguish between different blocks, then test for the delimiting "white line".

    my $block=0;
    while (<DATA>){
        if (!/^\s*$/) {
            print "$block: $_";
        } else {
            $block++;
        }
    }
    __DATA__
    Mod Ports Card Type                              Model              Serial No.
    --- ----- -------------------------------------- ------------------ -----------
      1   24  CEF720 24 port 1000mb SFP              WS-X6724-SFP       SAL1434RA09
      3   20  7600 ES+T                              76-ES+T-20G        JAE14530455
      4 ...

    Mod  Sub-Module                  Model              Serial       Hw     Status
    ---- --------------------------- ------------------ ----------- ------- -------
      1  Distributed Forwarding Card WS-F6700-DFC3CXL   SAL1434RLPY  1.6    Ok
      3  7600 ES+ DFC XL             7600-ES+3CXL       JAE14520N29  1.2    Ok
      5 ...

    Now you can delegate the rest to specific one-block solution(s).
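
    One way to prepare that delegation, sketched under the assumption that blocks are separated by blank (or whitespace-only) lines: collect each block's lines into its own array, so that block 0 can go to a module parser and block 1 to a sub-module parser. The name split_blocks is made up for this sketch, and the sample lines are abbreviated.

```perl
#!perl
use strict;
use warnings;

# Group non-blank lines into blocks; a blank line starts the next block.
sub split_blocks {
    my (@lines) = @_;
    my @blocks = ([]);
    for my $line (@lines) {
        if ($line =~ /^\s*$/) {
            push @blocks, [] if @{ $blocks[-1] };   # ignore repeated blank lines
        }
        else {
            push @{ $blocks[-1] }, $line;
        }
    }
    pop @blocks unless @{ $blocks[-1] };            # drop an empty trailing block
    return @blocks;
}

my @blocks = split_blocks(
    "Mod Ports Card Type\n",
    "  1   24  CEF720 24 port 1000mb SFP\n",
    "\n",
    "Mod  Sub-Module\n",
    "  1  Distributed Forwarding Card\n",
);

# Each block can now be handed to its own one-block parser.
printf "block %d has %d lines\n", $_, scalar @{ $blocks[$_] } for 0 .. $#blocks;
```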

    Cheers Rolf

Re: Problem with regex ...
by jandrew (Chaplain) on Dec 26, 2012 at 19:24 UTC

    At the risk of throwing good money after bad, and acknowledging that this could have the (your preferred afterlife paradigm here) golfed out of it, I submit the following column split using named captures for each column.

    Update: fixed the position split. Requires perl 5.10 or higher.
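
    The posted code is not reproduced in this thread. As a rough sketch only (not jandrew's code), a position-based split with named captures for the first table might look like the following. The capture names, the helper parse_module_line, and the column widths (taken from the dashed ruler) are assumptions of this sketch.

```perl
#!perl
use strict;
use warnings;
use 5.010;    # named captures and %+ need perl 5.10

# Fixed-width pattern for the first table: 3, 5, 38 and 18 characters,
# one whitespace character between columns, then the serial number.
my $module_re = qr/
    ^ (?<slot>   .{3}  ) \s
      (?<ports>  .{5}  ) \s
      (?<desc>   .{38} ) \s
      (?<model>  .{18} ) \s
      (?<serial> .+    ) $
/x;

# Return a hash reference of trimmed columns, or nothing if no match.
sub parse_module_line {
    my ($line) = @_;
    return unless $line =~ $module_re;
    my %row = %+;                       # copy the named captures
    s/^\s+|\s+$//g for values %row;     # trim padding on both sides
    return \%row;
}

# Sample row rebuilt with sprintf so the column positions are unambiguous.
my $row = parse_module_line(sprintf '%3s %4s  %-38s %-18s %s',
    3, 20, '7600 ES+T', '76-ES+T-20G', 'JAE14530455');
print "$row->{slot} / $row->{ports} / $row->{desc}\n";
# prints: 3 / 20 / 7600 ES+T
```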

      A slightly more extensible version with a complete dispatch array implementation matched to named capture regular expressions for the columns. (Use the same data file.)

      Update: removed unneeded variable declarations. Still requires perl 5.10 or higher.
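
      Again the code itself is not shown here, so the following is only a guess at the general shape: an array of pattern/handler pairs, where the first pattern that matches a line decides which handler runs. The widths come from the two dashed rulers; the capture names, the handlers and dispatch_line are hypothetical.

```perl
#!perl
use strict;
use warnings;
use 5.010;    # named captures and %+ need perl 5.10

# One entry per table. Requiring [ \d] in the slot column keeps the
# header and ruler lines from matching either pattern.
my @dispatch = (
    {   # Mod, Ports, Card Type, Model, Serial No.
        re => qr/^ (?<slot>[ \d]{3}) \s (?<ports>.{5}) \s (?<desc>.{38}) \s
                   (?<model>.{18}) \s (?<serial>\S+) \s* $/x,
        handler => sub {
            my ($r) = @_;
            return "module $r->{slot} ($r->{ports} ports): $r->{desc}";
        },
    },
    {   # Mod, Sub-Module, Model, Serial, Hw, Status
        re => qr/^ (?<slot>[ \d]{4}) \s (?<desc>.{27}) \s (?<model>.{18}) \s
                   (?<serial>.{11}) \s (?<hw>.{7}) \s (?<status>\S+) \s* $/x,
        handler => sub {
            my ($r) = @_;
            return "sub-module $r->{slot}: $r->{desc} (hw $r->{hw}, $r->{status})";
        },
    },
);

# Try each pattern in turn; return the handler's result for the first match.
sub dispatch_line {
    my ($line) = @_;
    for my $entry (@dispatch) {
        next unless $line =~ $entry->{re};
        my %row = %+;
        s/^\s+|\s+$//g for values %row;
        return $entry->{handler}->(\%row);
    }
    return;    # nothing matched
}

# Sample rows rebuilt with sprintf so the column positions are unambiguous.
print dispatch_line(sprintf '%3s %4s  %-38s %-18s %s',
    3, 20, '7600 ES+T', '76-ES+T-20G', 'JAE14530455'), "\n";
print dispatch_line(sprintf '%3s  %-27s %-18s %-11s  %-6s %s',
    3, '7600 ES+ DFC XL', '7600-ES+3CXL', 'JAE14520N29', '1.2', 'Ok'), "\n";
```

      Adding a third table would then mean adding one more entry to the array rather than another elsif branch.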

Re: Problem with regex ...
by Anonymous Monk on Dec 26, 2012 at 22:30 UTC
    Just swap the if block with the elsif block and everything works fine.
    However, the regular expressions are not very well written.
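
    To illustrate why the order matters, here is a sketch with two tightened patterns of my own (not the OP's): the six-column pattern insists on a version number such as "1.2" right before the status, so it is tried first. The looser five-column pattern would otherwise also accept a sub-module row whose description starts with digits. The function name classify and the hash keys are made up for this sketch.

```perl
#!perl
use strict;
use warnings;

sub classify {
    my ($line) = @_;

    # Six-column rows end in "<hw version> <status>", e.g. "1.2 Ok".
    if ($line =~ /^\s*(\d+)\s+(.+?)\s+(\S+)\s+(\S+)\s+(\d+\.\d+)\s+(\w+)\s*$/) {
        return { type => 'sub-module', slot => $1, desc => $2, model => $3,
                 serial => $4, hw => $5, status => $6 };
    }
    # Five-column rows: slot, ports, description, model, serial.
    elsif ($line =~ /^\s*(\d+)\s+(\d+)\s+(.+?)\s+(\S+)\s+(\S+)\s*$/) {
        return { type => 'module', slot => $1, ports => $2, desc => $3,
                 model => $4, serial => $5 };
    }
    return;    # header, ruler or blank line
}

for my $line (
    '3 20 7600 ES+T 76-ES+T-20G JAE14530455',
    '3 7600 ES+ DFC XL 7600-ES+3CXL JAE14520N29 1.2 Ok',
    '3 7600 ES+T 20x1GE SFP 76-ES+T-20GQ JAE145301XM 1.1 Ok',
) {
    my $row = classify($line) or next;
    print "$row->{type}: $row->{desc}\n";
}
```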

    You might want something like this:
    #!perl
    my @lens;
    while (<DATA>) {
        /\S/ || next;    # skip blank lines

        # Peek at the next line: a dashed ruler means $_ is a header row.
        my $pos   = tell(DATA);
        my $nline = <DATA>;
        if (defined $nline && $nline =~ /-/ && $nline =~ /^[-\s]+$/) {
            # Column widths come from the ruler; +1 covers the separating space.
            @lens = map { length() + 1 } split(' ', $nline);
            next;
        }
        else {
            seek DATA, $pos, 0;    # not a ruler: rewind so the line is read again
        }

        # Split the row by position and strip the left padding.
        my @values = map { s{^\s+}{}; $_ }
                     unpack(join('', map { "A${_}" } @lens), $_);
        if (@lens == 5) {
            print "================= IF 1 =================\n";
            print "SLOT: $values[0]\n";
            print "PORTAS: $values[1]\n";
            print "DESC: $values[2]\n";
            print "Model: $values[3]\n";
            print "Sw: $values[4]\n";
            print "\n";
        }
        elsif (@lens == 6) {
            print "================= ELSIF =================\n";
            print "<> Slot: $values[0]\n";
            print "<> Desc: $values[1]\n";
            print "<> Model: $values[2]\n";
            print "<> Sw: $values[3]\n";
            print "<> Hw: $values[4]\n";
            print "<> Status: $values[5]\n";
            print "\n";
        }
    }
    close(DATA);
    __DATA__
    Mod Ports Card Type                              Model              Serial No.
    --- ----- -------------------------------------- ------------------ -----------
      1   24  CEF720 24 port 1000mb SFP              WS-X6724-SFP       SAL1434RA09
      3   20  7600 ES+T                              76-ES+T-20G        JAE14530455
      5    2  Route Switch Processor 720 (Active)    RSP720-3CXL-GE     JAE14330N9B
      6    2  Route Switch Processor 720 (Hot)       RSP720-3CXL-GE     JAE14330NA6
      7    4  CEF720 4 port 10-Gigabit Ethernet      WS-X6704-10GE      SAL1433QVJQ
      8    4  CEF720 4 port 10-Gigabit Ethernet      WS-X6704-10GE      SAL1433QVJW

    Mod  Sub-Module                  Model              Serial       Hw     Status
    ---- --------------------------- ------------------ ----------- ------- -------
      1  Distributed Forwarding Card WS-F6700-DFC3CXL   SAL1434RLPY  1.6    Ok
      3  7600 ES+ DFC XL             7600-ES+3CXL       JAE14520N29  1.2    Ok
      3  7600 ES+T 20x1GE SFP        76-ES+T-20GQ       JAE145301XM  1.1    Ok
      5  Policy Feature Card 3       7600-PFC3CXL       JAE14330E6J  1.1    Ok
      5  C7600 MSFC4 Daughterboard   7600-MSFC4         JAE14320QBE  1.6    Ok
      6  Policy Feature Card 3       7600-PFC3CXL       JAE14330EAO  1.1    Ok
      6  C7600 MSFC4 Daughterboard   7600-MSFC4         JAE14320QA8  1.6    Ok
      7  Distributed Forwarding Card WS-F6700-DFC3CXL   SAL1433QHBR  1.6    Ok
      8  Distributed Forwarding Card WS-F6700-DFC3CXL   SAL1433QXF9  1.6    Ok