in reply to Re: Parsing a Variable Format String
in thread Parsing a Variable Format String

Thanks for the suggestions... I'll put it in the pot(!)

After some more walking and thinking, a woefully poor way of doing what I (ultimately) need might be:
    @opts = (
        "SS 21 PL 2#3 PV 51.3 CL #110 +0 RL 126' SA 106 DS 93",
        "SS 21 PL 2.3 PVa51.3 CT^ 110 +0 RL126, SA 106 DS 93",
        "SS 21 PL2#3# PV 51.3 CL #110 +0 RL 126' SA 106 DS 93",
        "SS 21 PL2.3# PV 51.3 CL #110 +0 RL 126' SA 106 DS 93"
    );

    foreach $buf (@opts) {
        # Column ruler, lined up with the string printed below it
        printf("                 1         2         3         4         5         6\n");
        printf("       0123456789012345678901234567890123456789012345678901234567890\n");
        printf("\$buf: >%s<\n", $buf);
        printf("\n");

        # Offset of each tag, searching onward from the previous tag
        $ssos = index($buf, "SS", 0);
        $plos = index($buf, "PL", $ssos);
        $pvos = index($buf, "PV", $plos);
        $ctos = index($buf, "CT", $pvos);
        if ($ctos < 0) { $ctos = index($buf, "CL", $pvos); }
        $rlos = index($buf, "RL", $ctos);
        $saos = index($buf, "SA", $rlos);
        $dsos = index($buf, "DS", $saos);

        printf("\$ssos = $ssos\n");
        printf("\$plos = $plos\n");
        printf("\$pvos = $pvos\n");
        printf("\$ctos = $ctos\n");
        printf("\$rlos = $rlos\n");
        printf("\$saos = $saos\n");
        printf("\$dsos = $dsos\n");
        printf("\n");

        # Each value is the text between one tag and the next
        $ssstr = substr($buf, $ssos+2, $plos - $ssos - 2);
        $plstr = substr($buf, $plos+2, $pvos - $plos - 2);
        $pvstr = substr($buf, $pvos+2, $ctos - $pvos - 2);
        $ctstr = substr($buf, $ctos+2, $rlos - $ctos - 2);
        $rlstr = substr($buf, $rlos+2, $saos - $rlos - 2);
        $sastr = substr($buf, $saos+2, $dsos - $saos - 2);
        $dsstr = substr($buf, $dsos+2);

        # Strip all whitespace from the extracted values
        $ssstr =~ s/\s+//g;
        $plstr =~ s/\s+//g;
        $pvstr =~ s/\s+//g;
        $ctstr =~ s/\s+//g;
        $rlstr =~ s/\s+//g;
        $sastr =~ s/\s+//g;
        $dsstr =~ s/\s+//g;

        printf("\$ssstr = >$ssstr<\n");
        printf("\$plstr = >$plstr<\n");
        printf("\$pvstr = >$pvstr<\n");
        printf("\$ctstr = >$ctstr<\n");
        printf("\$rlstr = >$rlstr<\n");
        printf("\$sastr = >$sastr<\n");
        printf("\$dsstr = >$dsstr<\n");
        printf("\n");
    } # another opt
...which gives a typical output:
    $ssstr = >21<
    $plstr = >2#3<
    $pvstr = >51.3<
    $ctstr = >#110+0<
    $rlstr = >126'<
    $sastr = >106<
    $dsstr = >93<
...but that's pretty dashed ugly, even if it does work.

Now, if I could replicate all that index/substr garbage with something more elegant...

Re^3: Parsing a Variable Format String
by jethro (Monsignor) on Jul 10, 2008 at 03:04 UTC
    The more elegant way is, as you already guessed, a regex.

    When you are looking for 'CT', the regex is /CT/. When you are looking for the first number after 'CT', the regex becomes /CT .*? (\d+)/x. The x at the end of the regex lets me insert spaces so that the regex is easier to read; they don't get matched. (Use \d+ rather than \d*: since \d* can match zero digits, the lazy .*? would stop immediately and leave the capture empty.) If you really need to match a space, you can put a backslash before it or use \s, which matches any whitespace character, tabs included.

    When this regex matches something, it returns true. In that case, whatever was matched between the first (and here only) pair of parens is now in $1. Further pairs of parens in the regex would be stored in $2, $3, $4 and so on.
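To make the $1 business concrete, here is a minimal sketch. The sub name first_number_after_ct is made up for illustration, and the sample string is cut from one of the lines in the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first run of digits after 'CT',
# or nothing at all if the pattern does not match.
sub first_number_after_ct {
    my ($buf) = @_;

    # /x lets the pattern be spaced out; those spaces match nothing.
    if ($buf =~ /CT .*? (\d+)/x) {
        return $1;    # text matched inside the first pair of parens
    }
    return;           # no match, so $1 must not be trusted
}

print first_number_after_ct("PVa51.3 CT^ 110 +0 RL126"), "\n";   # prints 110
```

Testing the match in an if before touching $1 matters: after a failed match, $1 still holds whatever an earlier successful match left in it.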

    The .*? matches anything (except a newline), but tries to match as few characters as possible.
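A quick way to see what the ? buys you is to run the same pattern with and without it. This is only a sketch on a made-up fragment of one input line:

```perl
use strict;
use warnings;

my $buf = "CT^ 110 +0 RL 126";

# A match in list context returns the captured groups.
my ($lazy)   = $buf =~ /CT .*? (\d+)/x;   # .*? gives up as early as it can
my ($greedy) = $buf =~ /CT .*  (\d+)/x;   # .* grabs as much as it can

print "lazy: $lazy, greedy: $greedy\n";   # prints "lazy: 110, greedy: 6"
```

The greedy version runs to the end of the string and then backs off just far enough for \d+ to match a single digit, which is why it ends up with only the final 6.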

    With slight variations of this regex you probably can substitute all your index thingies.

    You can even combine your regexes into one long regex by joining them with .*? in between.
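Here is a rough sketch of such a combined regex, written to give the same values as the index/substr version rather than just the numbers. The sub name parse_line and the hash keys are my own choices, and like the original code it assumes the tags always appear in this order.

```perl
use strict;
use warnings;

# Hypothetical helper: split one line into its tagged values.
# Returns a hash reference, or undef if the line does not fit the pattern.
sub parse_line {
    my ($buf) = @_;

    # Each (.*?) captures the text between one tag and the next;
    # C[TL] accepts either CT or CL, and the final (.*) takes the rest.
    my @vals = $buf =~ /
        SS    (.*?)
        PL    (.*?)
        PV    (.*?)
        C[TL] (.*?)
        RL    (.*?)
        SA    (.*?)
        DS    (.*)
    /x or return;

    s/\s+//g for @vals;    # strip all whitespace, as the original does

    my %field;
    @field{qw(SS PL PV CT RL SA DS)} = @vals;
    return \%field;
}

my $f = parse_line("SS 21 PL 2#3 PV 51.3 CL #110 +0 RL 126' SA 106 DS 93");
printf("%s = >%s<\n", $_, $f->{$_}) for qw(SS PL PV CT RL SA DS);
```

Returning a hash keyed by tag keeps the seven values together, so the caller does not need seven separate variables.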