punkish has asked for the wisdom of the Perl Monks concerning the following question:

regexp monks, I can't get my head around this one --

I have an arbitrary string like so

ABCD----[this] fgab [that] BFTE-- [other]
I want to convert this into
%hash = ( ABCD => 'this', fgab => 'that', BFTE => 'other', );

I say "arbitrary" because there are many such strings, and they are all different. The only pattern is that the characters within the square brackets are the values, and the characters to the left of the square brackets are the corresponding keys, except for the silly dashes or spaces in between the key and the value. In other words,

AB CD EF---- [foo]
would be
'AB CD EF' => 'foo',
To make life more interesting, the number of keys and values varies from string to string.

Many thanks in advance.

--
when small people start casting long shadows, it is time to go to bed

Re: converting an arbitrary string into a hash based on a pattern
by holli (Abbot) on Mar 04, 2005 at 04:23 UTC
    use strict;
    use Data::Dumper;

    my %hash;
    $_ = "ABCD----[this] fgab [that] BFTE-- [other] AB CD EF---- [foo]";
    while ( /\G\s*([\w ]+)[\s-]+\[(\w+)\]/g ) {
        $hash{$1} = $2;
    }
    print Dumper(\%hash);

    #$VAR1 = {
    #          'fgab ' => 'that',
    #          'ABCD' => 'this',
    #          'BFTE' => 'other',
    #          'AB CD EF' => 'foo'
    #        };


    holli, /regexed monk/

      Hi Holli,

      Could you explain what \G means?

        From "perldoc perlre"
        \G  Match only at pos() (e.g. at the end-of-match position of prior m//g)

        [snip]

        The "\G" assertion can be used to chain global matches (using "m//g"), as described in "Regexp Quote-Like Operators" in perlop. It is also useful when writing "lex"-like scanners, when you have several patterns that you want to match against consequent substrings of your string, see the previous reference. The actual location where "\G" will match can also be influenced by using "pos()" as an lvalue: see "pos" in perlfunc. Currently "\G" is only fully supported when anchored to the start of the pattern; while it is permitted to use it elsewhere, as in "/(?<=\G..)./g", some such uses ("/.\G/g", for example) currently cause problems, and it is recommended that you avoid such usage for now.
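To make the perlre excerpt concrete, here is a minimal sketch of how \G changes a global match. The input string and variable names are made up for illustration; they do not come from the thread.

```perl
use strict;
use warnings;

# Hypothetical input: three words, then text the scanner cannot handle.
my $text = "aa bb cc !! dd";
my @words;

# /g in scalar context remembers where the previous match ended (pos).
# \G forces the next attempt to begin exactly at that position.
while ( $text =~ /\G\s*(\w+)/g ) {
    push @words, $1;
}
print "with \\G: @words\n";

# Without \G the engine may skip over "!!" and keep matching further on.
my @loose = $text =~ /(\w+)/g;
print "without \\G: @loose\n";
```

With \G the loop stops at the first position where the pattern cannot match, so "dd" is never reached; without it, the regex engine simply scans ahead to the next word.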
Re: converting an arbitrary string into a hash based on a pattern
by gopalr (Priest) on Mar 04, 2005 at 04:43 UTC
    $str = 'ABCD----[this] fgab [that] BFTE-- [other] AB CD EF---- [foo]';
    $hash{$1} = $2 while ( $str =~ m#\s*([\w ]+)[\s-]+\[([^\]]+)\]#g );
    print "$_=$hash{$_}\n" for keys %hash;

    OUTPUT

    fgab=that
    ABCD=this
    BFTE=other
    AB CD EF=foo
Re: converting an arbitrary string into a hash based on a pattern
by punkish (Priest) on Mar 04, 2005 at 05:05 UTC
    Thanks, holli and gopalr. Both of you gave me a fantastic (if not perfect) start ;-). It turns out there are many variations in my strings, and neither solution found all the key-value pairs correctly, though gopalr's solution worked better on the data I threw at it. One miss was a series of square brackets adjacent to each other, as in
    FOO[this][that][other]
    which had to resolve into
    FOO => 'thisthatother'
    The other miss was that empty square brackets were not caught. In other words,
    BAR []
    should have become
    BAR => ''
    In any case, many thanks to the both of you for clearing my haze. I can take your suggestions and create one that will work for me.
    --
    when small people start casting long shadows, it is time to go to bed
      use strict;
      use Data::Dumper;

      my %hash;
      $_ = "ABCD----[this] fgab [that] BFTE-- [other] AB CD EF---- [foo] FOO[this][that][other] BAR []";
      while ( /\G\s*([\w ]+)[\s-]*\[([\w\[\]]*)\]/g ) {
          $hash{$1} = $2;
          $hash{$1} =~ s/\]\[//g;
      }
      print Dumper(\%hash);

      #$VAR1 = {
      #          'BAR ' => '',
      #          'fgab ' => 'that',
      #          'ABCD' => 'this',
      #          'BFTE' => 'other',
      #          'FOO' => 'thisthatother',
      #          'AB CD EF' => 'foo'
      #        };
      I just love regexes. Try to implement that in VB!


      holli, /regexed monk/
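The dumped output above keeps a trailing space on keys such as 'fgab ' and 'BAR ', whereas the original question asked for plain fgab => 'that'. Here is a minimal sketch of one way to tidy that up, assuming keys never contain square brackets, values never contain nested brackets, and surrounding whitespace and dashes are never part of a key. The name parse_pairs is a hypothetical helper, not code from the thread.

```perl
use strict;
use warnings;

# parse_pairs is a hypothetical helper name chosen for this sketch.
sub parse_pairs {
    my ($str) = @_;
    my %hash;

    # Key:   a run of characters that are not square brackets.
    # Value: one or more adjacent [...] groups, each of which may be empty.
    while ( $str =~ /([^\[\]]+)((?:\[[^\]]*\])+)/g ) {
        my ( $key, $value ) = ( $1, $2 );

        $key   =~ s/^\s+//;       # strip leading whitespace
        $key   =~ s/[\s-]+$//;    # strip trailing dashes and whitespace
        $value =~ tr/[]//d;       # delete brackets, joining adjacent groups

        next unless length $key;  # skip pairs whose key was only filler
        $hash{$key} = $value;
    }
    return %hash;
}

my %hash = parse_pairs(
    'ABCD----[this] fgab [that] AB CD EF---- [foo] FOO[this][that][other] BAR []'
);
printf "%-10s => '%s'\n", $_, $hash{$_} for sort keys %hash;
```

Capturing the whole run of bracket groups and deleting the brackets afterwards handles both FOO[this][that][other] (which becomes 'thisthatother') and BAR [] (which becomes the empty string) without a second pass over the input.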
Re: converting an arbitrary string into a hash based on a pattern
by ssk (Initiate) on Mar 04, 2005 at 09:54 UTC
    you could try out this...
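    use Data::Dumper;

    my %myhash;
    my $x = 'ABCD----[this] fgab [that] AB CD EF -- [foo]';
    # s///e is used here for its side effect of filling %myhash; it also rewrites $x.
    $x =~ s/\s*([^\[]+?)\s*\-*\s*\[([^\]]+)\]/$myhash{$1} = $2;/ge;
    print Dumper(\%myhash);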
    -sampath

    2005-03-05 Janitored by Arunbear - added code tags, as per Monastery guidelines

      Welcome to the monastery ssk.

      As this is your first post, you might find reading Perl Monks Approved HTML tags useful, especially the part about code-tags. If you follow the advice there your future writeups will look far better.
      For now, I considered your node to be janitored (to add code-tags).


      holli, /regexed monk/