dsayars has asked for the wisdom of the Perl Monks concerning the following question:
I have a simple script for extracting shape names from a Visio stencil in .vsx (XML) format. The problem is that a bug in Visio puts newlines in some of the name strings. If the names are clean, this works as a regex:
<Master ID='.*?' NameU='(.*?)'
"(.*?)" then extracts fine as $1.
However, since newlines are present, I have to OR with something that matches them:
while ($text=~/<Master ID='.*?' NameU='(.*?)' |<Master ID='.*?' NameU='(.*?)/sg)
This matches the names containing newlines, but apparently because the match goes over the line boundary, the $1 contains a null. Only an empty line in my output shows that the match was made.
Is there a way out of this catch-22 when you have to match something containing a newline and extract data from it?
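For readers who land on this question: two general regex facts are probably relevant here. Capture groups are numbered by the position of their opening parenthesis across the whole pattern, so the group in the second alternative is $2, not $1. And a lazy `(.*?)` at the very end of a pattern matches as little as possible, which is the empty string. A minimal sketch of one way around both, assuming the attribute values are single-quoted and never contain a single quote themselves. The sub name `extract_names`, the sample data, and the choice to replace an embedded line break with one space are illustrative assumptions, not something stated in the question:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the NameU value of every <Master> element,
# with embedded line breaks collapsed to a single space (an assumption;
# adjust if the breaks should simply be deleted).
sub extract_names {
    my ($text) = @_;
    my @names;
    # [^']* matches any character except a single quote, newlines included,
    # so neither the /s modifier nor an alternation is needed, and the
    # name is always captured in $1.
    while ($text =~ /<Master\s+ID='[^']*'\s+NameU='([^']*)'/g) {
        my $name = $1;
        $name =~ s/\s*[\r\n]+\s*/ /g;   # line break plus surrounding whitespace -> one space
        push @names, $name;
    }
    return @names;
}

# Made-up sample input; the second name contains a stray newline.
my $sample = "<Master ID='1' NameU='Router'/>\n"
           . "<Master ID='2' NameU='Core\nSwitch'/>\n";
print "$_\n" for extract_names($sample);
```

With the sample above this prints `Router` and `Core Switch` on separate lines. For anything beyond a quick extraction, a real XML parser is usually the more robust choice, since it does not depend on attribute order or quoting style.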
Replies are listed 'Best First'.
Re: Extract data from regex match where "." is newline?
by Eliya (Vicar) on Dec 15, 2011 at 21:22 UTC

Re: Extract data from regex match where "." is newline?
by ww (Archbishop) on Dec 16, 2011 at 02:05 UTC
by dsayars (Initiate) on Dec 17, 2011 at 06:10 UTC

Re: Extract data from regex match where "." is newline?
by muba (Priest) on Dec 16, 2011 at 01:58 UTC