nabeenj has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks

I'm new to PerlMonks and seeking your wisdom.

Here's my question...

I have a string such as this –

something in this string is a number. ${=NumberFormat "99999999"} is a number. And another ${=CharFormat "X*"} is ${=NumberFormat "9+"}.

I want to match everything after ${=CharFormat "X*"} up to the first occurrence of '${' that follows it. If the literal string '${' does not occur in the rest of the string, then match up to the end. So if, by mistake, the left curly bracket is missing from the last NumberFormat placeholder (i.e. if that placeholder has been written '$=NumberFormat "9+"}'), then match up to the end of the string.

Below is my test Perl script. It matches the string between the CharFormat and NumberFormat placeholders, but if I remove the left curly bracket from the NumberFormat placeholder, nothing matches.

my $string_exp = 'something in this string is a number. ${=NumberFormat "99999999"} is a number. And another ${=CharFormat "X*"} is ${=NumberFormat "9+"}.';
# remove the left curly bracket from the last NumberFormat placeholder in the above string
my $placeholder = '${=CharFormat "X*"}';
print "$string_exp\n";
print "$placeholder\n";
if ( $string_exp =~ m/\Q$placeholder\E(.*)?(\$\{)|\z/ ) {
    print "\"$1\"\n";
}
print "END\n";

I hope this makes some sense.

Thanks!

Re: Regex matching next occurrence or to the end of line.
by Eily (Monsignor) on Feb 04, 2015 at 16:54 UTC

    I may have misunderstood what you are asking for, but I think the solution to your problem is look-ahead assertions.

    m<
        (?:                 # A char that is:
            [^\$]           # Not a $
          |                 # Or
            \$ (?! {= )     # A $ not followed by {=
        )+                  # Many times
    >x;
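
    One way the look-ahead idea might be applied to the string from the question is sketched below. It is only an illustration under some assumptions: the helper name text_until_next_placeholder is hypothetical, the look-ahead checks for '{' alone (the question asks for the literal '${') rather than '{=', and the group is repeated with * so that an empty result is allowed.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns the text that follows
# $placeholder, stopping just before the next literal dollar-brace pair,
# or at the end of the string if there is none.
sub text_until_next_placeholder {
    my ( $string, $placeholder ) = @_;
    my ($text) = $string =~ m<
        \Q$placeholder\E      # the placeholder itself, quoted literally
        (                     # capture:
            (?:
                [^\$]         #   any character other than a dollar sign
              |
                \$ (?! \{ )   #   or a dollar sign not followed by a brace
            )*                #   zero or more times
        )
    >x;
    return $text;             # undef if the placeholder was not found
}

my $placeholder = '${=CharFormat "X*"}';
my $intact = 'And another ${=CharFormat "X*"} is ${=NumberFormat "9+"}.';
my $broken = 'And another ${=CharFormat "X*"} is $=NumberFormat "9+"}.';

print '"', text_until_next_placeholder( $intact, $placeholder ), "\"\n";
print '"', text_until_next_placeholder( $broken, $placeholder ), "\"\n";
```

    Because \Q...\E backslash-escapes the spaces inside the placeholder, they remain literal even under the /x modifier.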

      Many thanks for your response, Eily. I have not used look-ahead assertions before. Useful stuff, and it looks like what I need. I have also had a reply from a colleague at work with a slight modification to the regex I was using. This works:

      if ( $string_exp =~ m/\Q$placeholder\E(.*?)((\$\{)|\z)/ ) { print "\"$1\"\n"; }

      This matches up to the next occurrence of the literal ${, otherwise to the end of the string.

      Thanks again.
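
The difference between the two patterns comes down to precedence and greediness. In the original, `|` splits the whole pattern, so `\z` is an alternative to everything before it, and with the brace removed the match succeeds at the end of the string with `$1` undefined. The greedy `(.*)?` would also run to the last `${` rather than the next one. The sketch below illustrates the grouped, non-greedy form. The helper name `text_after_placeholder` is hypothetical, and the `/s` modifier is an added assumption so that the match can span multi-line strings.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns the text following
# $placeholder up to the next literal '${', or up to the end of the string
# if no further '${' occurs.
sub text_after_placeholder {
    my ( $string, $placeholder ) = @_;

    # (?: ... ) limits the alternation to "'${' or end of string", and the
    # non-greedy .*? stops at the first '${' instead of the last one.
    if ( $string =~ m/\Q$placeholder\E(.*?)(?:\$\{|\z)/s ) {
        return $1;
    }
    return;    # placeholder not found
}

my $placeholder = '${=CharFormat "X*"}';
my $intact = 'And another ${=CharFormat "X*"} is ${=NumberFormat "9+"}.';
my $broken = 'And another ${=CharFormat "X*"} is $=NumberFormat "9+"}.';

print '"', text_after_placeholder( $intact, $placeholder ), "\"\n";
print '"', text_after_placeholder( $broken, $placeholder ), "\"\n";
```

A non-capturing group is used here instead of the extra capturing parentheses in the colleague's version. Both behave the same for `$1`.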

Re: Regex matching next occurrence or to the end of line.
by hdb (Monsignor) on Feb 04, 2015 at 18:21 UTC

    I am not completely sure what you want to achieve, but split with capturing parentheses around the separator pattern, so that the separators are returned as well, could help here:

    use strict;
    use warnings;
    use Data::Dumper;
    my $string_exp = 'something in this string is a number. ${=NumberFormat "99999999"} is a number. And another ${=CharFormat "X*"} is ${=NumberFormat "9+"}.';
    my @pieces = split /(\$\{.*?\})/, $string_exp;
    print Dumper \@pieces;

    gives you

    $VAR1 = [
              'something in this string is a number. ',
              '${=NumberFormat "99999999"}',
              ' is a number. And another ',
              '${=CharFormat "X*"}',
              ' is ',
              '${=NumberFormat "9+"}',
              '.'
            ];

    and the following if the curly brace is missing:

    $VAR1 = [
              'something in this string is a number. ',
              '${=NumberFormat "99999999"}',
              ' is a number. And another ',
              '${=CharFormat "X*"}',
              ' is $=NumberFormat "9+"}.'
            ];
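
    Given pieces like these, the text that follows a particular placeholder is simply the next element of the list. A minimal sketch of that lookup follows. The helper name piece_after_placeholder is hypothetical, and it assumes the placeholder appears in the string exactly as written.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): split the string into literal
# text and '${...}' placeholders, then return the piece that immediately
# follows $placeholder.
sub piece_after_placeholder {
    my ( $string, $placeholder ) = @_;

    # The capturing parentheses make split return the separators too.
    my @pieces = split /(\$\{.*?\})/, $string;

    for my $i ( 0 .. $#pieces - 1 ) {
        return $pieces[ $i + 1 ] if $pieces[$i] eq $placeholder;
    }
    return;    # placeholder missing, or nothing follows it
}

my $placeholder = '${=CharFormat "X*"}';
my $intact = 'And another ${=CharFormat "X*"} is ${=NumberFormat "9+"}.';
my $broken = 'And another ${=CharFormat "X*"} is $=NumberFormat "9+"}.';

print '"', piece_after_placeholder( $intact, $placeholder ), "\"\n";
print '"', piece_after_placeholder( $broken, $placeholder ), "\"\n";
```

    Note that split drops trailing empty fields, so a placeholder at the very end of the string has no following piece and the helper returns nothing in that case.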

      Thank you for this. Really useful stuff which I am sure I'll be using as this work progresses.

      The script matches a SOAP response against an expected SOAP response XML file. The expected SOAP response can contain this sort of placeholder to match actual values (strings, numbers, dates) in the response that can be of variable length, and also values that may or may not be present, hence the need for the ${=CharFormat "X*"} placeholder. Much of the work is complete. Much appreciation and credit go to the work by G. Wade Johnson in HTTPTest.

      Thanks again.
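
As a rough illustration of the template-matching idea described above, an expected string containing placeholders could be turned into a single regex by splitting on the placeholders and quoting the literal pieces. Everything specific here is an assumption made for the sketch: the helper name `template_to_regex`, the `%format_to_regex` table, and in particular the meanings given to `9+` (one or more digits) and `X*` (zero or more characters). None of it is taken from HTTPTest or from the actual script.

```perl
use strict;
use warnings;

# Hypothetical mapping from placeholder contents to regex fragments.
# These meanings are guesses for illustration only.
my %format_to_regex = (
    'NumberFormat "9+"' => '\d+',    # assumed: one or more digits
    'CharFormat "X*"'   => '.*?',    # assumed: zero or more characters
);

# Hypothetical helper: build an anchored regex from a template string.
sub template_to_regex {
    my ($template) = @_;
    my $pattern = '';
    for my $piece ( split /(\$\{=.*?\})/, $template ) {
        if ( $piece =~ /\A\$\{=(.*)\}\z/ && exists $format_to_regex{$1} ) {
            $pattern .= "($format_to_regex{$1})";    # known placeholder
        }
        else {
            $pattern .= quotemeta $piece;            # literal text
        }
    }
    return qr/\A$pattern\z/s;
}

my $expected = 'id=${=NumberFormat "9+"} name=${=CharFormat "X*"};';
my $regex    = template_to_regex($expected);

for my $actual ( 'id=42 name=Bob;', 'id=42 name=;', 'id=x name=Bob;' ) {
    printf "%-18s %s\n", $actual, ( $actual =~ $regex ? 'matches' : 'does not match' );
}
```

Placeholders that are not in the table, including ones with a missing brace, are treated as literal text in this sketch.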