in reply to Regular expression

G'day pravakta,

Welcome to the Monastery.

"Can any body highlight the issue with my code?"

Firstly, there seems to be a mismatch between your stated intent (i.e. "print each number") and the example code that you're using to achieve this (that particular example is demonstrating the use of the '\G' assertion in regexes). I'll attempt to point out where I think you've taken ideas from the example and incorrectly (or, perhaps, just naïvely) used them in your code: that will, of course, just be guesswork.

In the examples below, I've used the alias perle. It saves a lot of typing and should make the one-liners easier for you to read. It's intended to catch many problems and also enable all the features of my current Perl version (5.26.0).

$ alias perle
alias perle='perl -Mstrict -Mwarnings -Mautodie=:all -E'

You're using a named capture; I can't see any need for this here. While you're learning, I'd certainly encourage you to try lots of different things; however, start off simply; get that working; then add a new concept; get that working; and so on. When you try to do too much at once, you can end up with a "can't see the wood for the trees" situation. Just to show that level of complexity is unnecessary here:

$ perle 'my $x = "1 12 123"; while ($x =~ /(?<w>\d+)/g) { say $+{w} }'
1
12
123
$ perle 'my $x = "1 12 123"; while ($x =~ /(\d+)/g) { say $1 }'
1
12
123

I won't use the named capture in any further examples. You may want to extend my examples, using named captures, for your own learning experience. I did notice that the main difference between your code and the example, in this area, was "?<weight>" vs. "[+-]?": if that's presenting you with some confusion, or other difficulty, that you can't resolve, just ask.

As AnomalousMonk thought, your anchors are causing problems. Compare these with the last example:

$ perle 'my $x = "1 12 123"; while ($x =~ /^(\d+)/g) { say $1 }'
1
$ perle 'my $x = "1 12 123"; while ($x =~ /(\d+)$/g) { say $1 }'
123
$ perle 'my $x = "1 12 123"; while ($x =~ /^(\d+)$/g) { say $1 }'
$

The '^' anchor (start of string) is the only one used in the example code: it has no '$' anchor (end of string). I suspect the 'g' modifier may be at the root of some confusion. Your code wants it for multiple matches, i.e. each of the numbers in the string. The example code wants it for a completely different reason: as its regex is anchored to the start of the string, it will only match once; it needs that modifier for the correct operation of the '\G' assertion later in the code.

I see you commented out this assignment: "#my $weight = $1;". It's important to understand that capture variables ('$1', '%+', etc.) are reset whenever a subsequent regex matches successfully. Unless you're using those values immediately after the regex, you should assign them to another variable for subsequent use. Compare these:

$ perle 'my $x = "123"; $x =~ /(1)/; say $1'
1
$ perle 'my $x = "123"; $x =~ /(1)/; say $1 if $x =~ /2/'
Use of uninitialized value $1 in say at -e line 1.
$ perle 'my $x = "123"; $x =~ /(1)/; my $y = $1; say $y'
1
$ perle 'my $x = "123"; $x =~ /(1)/; my $y = $1; say $y if $x =~ /2/'
1

In general, it's good practice to do that anyway: it's called defensive programming. Although your original code might use '$1' immediately after the regex, a later code change might introduce one or more intervening lines: with code more complex than the simple examples here, the effect of adding new code may not be immediately obvious, and, whenever it does become noticeable, the source of the bug may not be easy to find.
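
One way to take that copy in the same statement as the match is a list-context assignment. This is just a minimal sketch; the variable name '$first' is my own, chosen purely for illustration:

```perl
use strict;
use warnings;

my $x = "123";

# List-context assignment copies the capture straight into $first.
my ($first) = $x =~ /(1)/;

# This later successful match resets $1, but $first is unaffected.
if ($x =~ /2/) {
    print "$first\n";
}
```

The parentheses around '$first' matter: they put the match in list context, so the captures are returned rather than a true/false value.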

The "\s*" in the example code is to eat up any optional whitespace before the units ("kg." or "lbs."); the next regex can start to match those units immediately. It is not needed in your code because you're only capturing strings of digits; in fact, it doesn't matter whether the numbers are separated by spaces or something else:

$ perle 'my $x = "1 12 123"; while ($x =~ /(\d+)\s*/g) { say $1 }'
1
12
123
$ perle 'my $x = "1_12_123"; while ($x =~ /(\d+)\s*/g) { say $1 }'
1
12
123
$ perle 'my $x = "1_12_123"; while ($x =~ /(\d+)/g) { say $1 }'
1
12
123
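
If the aim really is just to "print each number", a list-context match with the 'g' modifier returns all the captures in one go, so there's no loop condition to get wrong. Treat this as a sketch rather than the one true way; '@numbers' is simply a name I've picked for the example:

```perl
use strict;
use warnings;

my $x = "1 12 123";

# In list context, a /g match returns every captured substring at once.
my @numbers = $x =~ /(\d+)/g;

print "$_\n" for @numbers;
```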

Your final line of code

print "Weight:\n" unless $x=~/Gkg\./g;

has a number of issues.

You have a prompt for "Weight:" but no value. You should reinstate the assignment to '$weight' and include it in the output. You can't rely on '$1' (or '$+{weight}') because the unless condition will be evaluated first and, as shown above, that contains a regex which, if it matches, will overwrite the existing capture values.

You've attempted to copy the "/\Gkg\./g" from the example ('G' should be '\G'); I suspect you did so without really understanding that piece of code.

I don't think you really want to conditionally print anyway; however, if you do, you'll need something completely different.

This isn't related to your code; however, I thought I'd just point out this potential gotcha. When a '\G' assertion is used in a loop condition, without a 'g' modifier, this type of infinite loop may result (which I've trapped here after a thousand iterations):

$ perle '
    my $x = "AB";
    my $c = 0;
    $x =~ /A/g;
    while ($x =~ /\GB/) {
        ++$c;
        last if $c > 1000;
    }
    say $c;
'
1001

Avoid that by using a 'g' modifier:

$ perle '
    my $x = "AB";
    my $c = 0;
    $x =~ /A/g;
    while ($x =~ /\GB/g) {
        ++$c;
        last if $c > 1000;
    }
    say $c;
'
1
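
The difference between those two runs comes down to pos(): '\G' matches at the current pos() of the string, and only a match with the 'g' modifier moves it. Here's a small sketch that records pos() at each step; the variable names are mine, for illustration only:

```perl
use strict;
use warnings;

my $x = "AB";

$x =~ /A/g;
my $after_a = pos($x);      # 1: just past the "A"

$x =~ /\GB/;                # no 'g' modifier: matches, but pos() is left alone
my $without_g = pos($x);    # still 1, so the same match would succeed again

$x =~ /\GB/g;               # with 'g': matches and advances pos()
my $with_g = pos($x);       # 2: the end of the string

print "$after_a $without_g $with_g\n";
```

This should print "1 1 2": without 'g' the position never advances, which is why the first loop never terminates on its own.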

— Ken

Re^2: Regular expression
by pravakta (Novice) on Oct 30, 2017 at 23:55 UTC

    surprised my last reply to you dont appear here. I have to draft again

      "surprised my last reply to you dont appear here."

      You've replied to other posts in this thread, so obviously you've worked out how to do it successfully (I'm aware that this thread was your first posting here).

      I checked all your posts: you didn't reply to another post by mistake.

      I don't see any anonymous posts that fit the bill. There's only been two of those in the last 24 hours.

      I've no idea what you did wrong. I'll await your redraft.

      — Ken

Re^2: Regular expression
by pravakta (Novice) on Oct 31, 2017 at 20:45 UTC

    Hi Ken,
    A big thanks to you for your detailed analysis and your patience in explaining things. You guessed right: I am relatively new to ‘serious perl’ learning and have been experimenting with the language. Now, coming back to the problem: I understood the points you made. You are right that the primary motive of my code snippet was to understand the 'g' modifier and the '\G' assertion. To better appreciate the effect of 'g' and '\G', I wrote some sample code as follows:

    my $x= '1 2 3kg 4 5 6 7 8 9 10Kg 11 12 13 kg 14 15';
    chomp $x;
    print "Values in variabe x are : \n$x\n";
    $x =~ m/(?<weight>\d+)\s*/g;
    my $weight = $1;
    print "Matched pattern is : $+{weight}\n";
    print "Values in variabe x are : \n$x\n";
    #print "Weights are : $1 Kg\n" while $x=~/G(\d+)\s*kg\s*/ig; #print -1
    #print "Weights are : $1 Kg\n" while $x=~/(\d+)\s*kg\s*/ig; #print -2
    print "Weights are : $1 Kg\n" while $x=~/(\d+)\s*kg\s*/i; #print -3

    My expected result is:
    Weights are 3kg
    Weights are 10kg
    Weights are 13kg
    I have used three print statements: #print-1/2/3. My observations:
    print-1-> I was expecting this to be the right statement for my output requirement, but enabling it doesn't seem to match anything. No print.
    print-2-> this one does the job. As I understand it, it's a kind of global match: with every iteration of the loop it starts looking in the string from the point where it last matched.
    print-3-> goes into an infinite loop, which I understand is due to the fact that every iteration of the loop starts looking from the start of the string and always finds 3kg there. So it only prints 3kg, in an infinite loop.
    Please add some insight on what is the importance of \G and what are some practice usage scenario of \G?

      Your test results:

      print-1
      In "/G(\d+)\s*kg\s*/ig" you're making exactly the same mistake as you did in your OP and which I pointed out in my initial response: "('G' should be '\G')". There is no 'G' to match!
      print-2
      In "/(\d+)\s*kg\s*/ig", you've used the 'g' modifier. It's repeatedly matching each number followed by the units (one after another). As you say: "this one does the job".
      print-3
      Your explanation of "/(\d+)\s*kg\s*/i" is correct.
      "Please add some insight on what is the importance of \G and what are some practice usage scenario of \G?"

      '\G' is not an assertion I use that much: I can't really give you an "I often find it useful for ..." type answer.

      There's more information in "perlre: Assertions"; as well as links to additional, related documentation.
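
      That said, '\G' only matches at the point where the previous 'g' match on that string left off (i.e. at pos($x)), so it acts as an anchor rather than a search. The sort of place it tends to turn up is a tokenising loop that walks a string piece by piece. What follows is only a minimal sketch using your string, not a recommendation for your task (your print-2 is simpler); '@weights' and '$number' are names I've made up for it:

```perl
use strict;
use warnings;

my $x = '1 2 3kg 4 5 6 7 8 9 10Kg 11 12 13 kg 14 15';
my @weights;

# Consume one number at a time, starting where the last match ended.
# The 'c' modifier keeps pos() in place when a match fails.
while ($x =~ /\G\s*(\d+)\s*/gc) {
    my $number = $1;

    # \G anchors this test to the spot right after the number just read.
    push @weights, $number if $x =~ /\Gkg/gci;
}

print "Weights are : $_ Kg\n" for @weights;
```

      Each regex picks up exactly where the previous one stopped, which is the behaviour '\G' exists to provide; a pattern starting with '\G' will not skip ahead to find a match further along the string.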

      — Ken

        Ohh, my bad. I replaced G with \G but there's still no change:
        print "Weights are : $1 Kg\n" while $x=~/\G(\d+)\s*kg\s*/ig; #print -1

        prints nothing. Still not sure what difference it was supposed to make. No issues if you don't have much experience to share with this. I will try reading about it more.
        Thanks for your help.