Ideally, I want to use method 1 and create the regex on the fly from tags read from other files. As you can see, my attempt fails miserably (does not split where I want to split).
So I tried method 2 and made the regex manually. That one almost works, except that I end up with an extra (empty) value in $stuff2[0], which makes a subsequent %hash = @stuff2 break.
OK, the input is not that ugly, but I thought the solution would be easier. What am I missing?
    use 5.010;
    use strict;
    use warnings;

    $/ = undef;
    my $data = <DATA>;

    #method 1: build regex on the fly (read tags from files)
    say '---------- method 1 ----------';
    my @tags = qw( TAG1 TAG2 TAG3 TAG4 );
    my $tags_re = join "|", @tags;
    $tags_re = qr{ $tags_re };
    say $tags_re;
    my @stuff = split /($tags_re)=\s*/, $data;
    say "#$_#" for @stuff;

    # method 2: static regex
    say '---------- method 2 ----------';
    my @stuff2 = split /(TAG1|TAG2|TAG2|TAG4)=\s*/, $data;
    say "#$_#" for @stuff2;

    __DATA__
    TAG1= data
    TAG2= more data
    TAG3= even more data that
    sometimes has = and
    runs on to more than one line
    TAG4= still more

OUTPUT:

    ---------- method 1 ----------
    (?-xism: TAG1|TAG2|TAG3|TAG4 )
    #TAG1= data
    #
    #TAG2#
    #more data
    #
    #TAG3#
    #even more data that
    sometimes has = and
    runs on to more than one line
    TAG4= still more
    #
    ---------- method 2 ----------
    ##
    #TAG1#
    #data
    #
    #TAG2#
    #more data
    TAG3= even more data that
    sometimes has = and
    runs on to more than one line
    #
    #TAG4#
    #still more
    #
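For what it's worth, here is a sketch of one possible explanation and workaround, not a definitive fix. The printed regex shows the x flag is off (`(?-xism: ...)`), so the spaces inside `qr{ $tags_re }` are literal: the first alternative is " TAG1" and the last is "TAG4 ", and neither occurs in the data. Method 2 lists TAG2 twice and omits TAG3. The empty first element appears because split returns an empty leading field when the string begins with a positive-width match. In the sketch below, the inline sample data (including its line breaks), the `^` anchor with `/m`, and the use of `quotemeta` are assumptions added for illustration.

```perl
use strict;
use warnings;

# Sample input modeled on the __DATA__ section; the line breaks are assumed.
my $data = "TAG1= data\nTAG2= more data\n"
         . "TAG3= even more data that\nsometimes has = and\n"
         . "runs on to more than one line\nTAG4= still more\n";

my @tags = qw( TAG1 TAG2 TAG3 TAG4 );

# Join without padding spaces; quotemeta guards against regex
# metacharacters in tags that come from other files.
my $tags_re = join '|', map { quotemeta } @tags;
$tags_re = qr/$tags_re/;

# Assumption: a tag always starts a line, hence ^ with /m.
my @fields = split /^($tags_re)=\s*/m, $data;

# The string starts with a match, so split yields an empty leading field.
shift @fields if @fields && $fields[0] eq '';

my %hash = @fields;
chomp for values %hash;    # values are aliased, so this trims in place

print "$_ => $hash{$_}\n" for sort keys %hash;
```

Dropping the leading empty field leaves an even-length list, so the assignment to %hash pairs each tag with its data.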
In reply to problems splitting ugly input data by Anonymous Monk