PerlMonks  

Re: Regex question

by BillKSmith (Monsignor)
on Sep 17, 2023 at 04:16 UTC ( [id://11154496] )


in reply to Regex question

If numbers, operators, and whitespace are as shown:
    use strict;
    use warnings;
    use Test::More tests => 1;

    my $sample_string = '240 x 240 x 2/3600';
    my $number = qr/1?\d{1,4}/;
    my $valid  = $sample_string =~ m{$number x $number x $number/$number};
    ok($valid, 'Sample string');

UPDATE: Add revised code incorporating comments below.

    use strict;
    use warnings;
    use Test::More tests => 6;

    my @valid_strings = (
        ['240 x 240 x 2/3600',   'Sample'],
        ['10000 x 240 x 2/3600', 'Max number'],
        ['1 x 240 x 2/3600',     'Min number'],
    );

    #my $number = qr/1?\d{1,4}/;
    my $number = qr/10000|[1-9]\d{0,3}/;
    my $valid  = qr{\A$number x $number x $number/$number\z};

    foreach my $case (@valid_strings) {
        like $case->[0], $valid, $case->[1];
    }

    my @invalid_strings = (
        ['240 * 240 x 2/3600',   'Missing operator'],
        ['10001 x 240 x 2/3600', 'number exceeds max'],
        ['0 x 240 x 2/3600',     'number less than minimum'],
    );

    foreach my $case (@invalid_strings) {
        unlike $case->[0], $valid, $case->[1];
    }

OUTPUT:

    1..6
    ok 1 - Sample
    ok 2 - Max number
    ok 3 - Min number
    ok 4 - Missing operator
    ok 5 - number exceeds max
    ok 6 - number less than minimum

Bill

Replies are listed 'Best First'.
Re^2: Regex question
by kcott (Archbishop) on Sep 17, 2023 at 09:51 UTC

    G'day Bill,

    I'd say you're on the right track creating a regex for $number and building up a more complex regex from there. Unfortunately, you're matching some things that you shouldn't.

        $ perl -E '
            say "$_: ", /1?\d{1,4}/ ? "Y" : "N"
                for qw{240 2 3600 1 10000 0 0000 19999 999999999};
        '
        240: Y
        2: Y
        3600: Y
        1: Y
        10000: Y
        0: Y
        0000: Y
        19999: Y
        999999999: Y

    I'd aim for a more stringent regex for $number.

        $ perl -E '
            say "$_: ", /(?<![0-9])(?:10000|[1-9][0-9]{0,3})(?![0-9])/ ? "Y" : "N"
                for qw{240 2 3600 1 10000 0 0000 19999 999999999};
        '
        240: Y
        2: Y
        3600: Y
        1: Y
        10000: Y
        0: N
        0000: N
        19999: N
        999999999: N

    The OP is somewhat unclear in that it shows an example with spaces then says spaces are removed. Stuff is also removed, whatever that refers to. There's not much we can do about that beyond requesting clarification.

    I'd also add ^ and $ (or equivalent) assertions to the final regex.

    — Ken

      Well, I would use \b,
      say "$_: ", /\b(10000|[1-9][0-9]{0,3})\b/ ? "Y" : "N" for qw{240 2 3600 1 10000 0 0000 19999 999999999};
      Same as your result:
          240: Y
          2: Y
          3600: Y
          1: Y
          10000: Y
          0: N
          0000: N
          19999: N
          999999999: N
      updated: made capturing group
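One case the list above does not exercise: \b is defined in terms of \w, and "x" is a word character, so when the spaces have been stripped (as the OP mentions) there is no word boundary between a digit and an adjacent "x". A minimal sketch of the difference; the sub names and the input string are made up for illustration:

```perl
use strict;
use warnings;

my $core = qr/10000|[1-9][0-9]{0,3}/;

# All numbers found when each one is delimited by \b.
sub numbers_with_b {
    my ($string) = @_;
    return $string =~ /\b(?:$core)\b/g;
}

# All numbers found when each one is delimited by digit lookarounds.
sub numbers_with_lookarounds {
    my ($string) = @_;
    return $string =~ /(?<![0-9])(?:$core)(?![0-9])/g;
}

# Hypothetical input with the spaces already stripped.
my $input = '240x240x2/3600';
print "\\b:          @{[ numbers_with_b($input) ]}\n";
print "lookarounds: @{[ numbers_with_lookarounds($input) ]}\n";
```

With \b only 3600 is found, because it is the only number with non-word characters (or the string end) on both sides; the lookaround version finds all four. When the numbers are separated by spaces, both versions behave the same.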
Re^2: Regex question
by Marshall (Canon) on Sep 17, 2023 at 07:13 UTC
    I would add \b boundaries at the front and rear; otherwise, for example, a trailing 2/365555555 is accepted as valid:

        m{\b$number x $number x $number/$number\b}

    Update: alternatively, {\A$number x $number x $number/$number\z} or {^$number x $number x $number/$number$} also looks fine to me.
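The three anchoring styles are not quite interchangeable. \b only stops extra word characters from touching the expression, ^...$ pins both ends but (without /m) $ still permits one trailing newline, and \A...\z permits nothing extra. A small sketch, using $number from the revised code above; the inputs and the matches() helper are made up for illustration:

```perl
use strict;
use warnings;

# $number is taken from the revised code above; the inputs are made up.
my $number = qr/10000|[1-9]\d{0,3}/;
my $expr   = qr{$number x $number x $number/$number};

my @styles = (
    [ '\b...\b' => qr{\b$expr\b} ],
    [ '^...$'   => qr{^$expr$}   ],
    [ '\A...\z' => qr{\A$expr\z} ],
);

my @inputs = (
    [ 'exact'            => "240 x 240 x 2/3600"          ],
    [ 'trailing newline' => "240 x 240 x 2/3600\n"        ],
    [ 'embedded in text' => "see 240 x 240 x 2/3600 here" ],
    [ 'extra digit'      => "240 x 240 x 2/36000"         ],
);

# Returns 'Y' if $string matches $re, else 'N'.
sub matches {
    my ($re, $string) = @_;
    return $string =~ $re ? 'Y' : 'N';
}

for my $input (@inputs) {
    my ($label, $string) = @$input;
    printf "%-17s", $label;
    printf " %s=%s", $_->[0], matches($_->[1], $string) for @styles;
    print "\n";
}
```

Which one is right depends on whether the expression is expected to be the whole string (possibly read from a line of input, newline included) or to be found inside longer text.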

    In addition, to allow further validation of the string, the numbers can be captured, either by making $number itself capturing or by putting parens around each $number in the longer regex.

        if ( my ($n1, $n2, $n3, $n4) =
                 $string =~ m{\A($number) x ($number) x ($number)/($number)\z}
             and $n1 == $n2 )
        {
            # valid format and square
        }
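A runnable sketch of that idea, assuming the "square" check means the first two numbers are equal. The parse_dims() and describe() helpers and the second and third test strings are made up for illustration; $number is the one from the revised code above:

```perl
use strict;
use warnings;

my $number = qr/10000|[1-9]\d{0,3}/;
my $valid  = qr{\A($number) x ($number) x ($number)/($number)\z};

# Hypothetical helper: returns the four captured numbers on a match,
# or an empty list when the string is not in the expected format.
sub parse_dims {
    my ($string) = @_;
    my @n = $string =~ $valid;
    return @n;
}

# Hypothetical helper: classifies a string using the captured numbers.
sub describe {
    my ($string) = @_;
    my @n = parse_dims($string);
    return 'invalid format' unless @n;
    return $n[0] == $n[1] ? 'valid, square' : 'valid, not square';
}

for my $string ('240 x 240 x 2/3600', '240 x 120 x 2/3600', '240 x 240 x 2/0') {
    print "$string: ", describe($string), "\n";
}
```

Keeping the captures in an array rather than four scalars makes the "no match" case easy to test, since a failed match in list context returns an empty list.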
Re^2: Regex question
by perlboy_emeritus (Scribe) on Sep 17, 2023 at 21:07 UTC

    What does this mean: 'I strip all spaces and stuff out of the string upfront.' ???

    Make whitespace optional with \s* and add a few more tests:

        my @valid_strings = (
            ['240 x 240 x 2/3600',   'Sample'],
            ['10000 x 240 x 2/3600', 'Max number'],
            ['1 x 240 x 2/3600',     'Min number'],
            ['120x240x2/3600',       'Number sans whitespace'],
            ['120x240x2 / 3600',     'Fraction with whitespace'],
        );

        my $number = qr/10000|[1-9]\d{0,3}/;
        my $valid  = qr{\b$number\s*x\s*$number\s*x\s*$number\s*/\s*$number\b};
        #my $valid = qr{\A$number\s*x\s*$number\s*x\s*$number\s*/\s*$number\z};
        #my $valid = qr{^$number\s*x\s*$number\s*x\s*$number\s*/\s*$number$};

        my @invalid_strings = (
            ['240 * 240 x 2/3600',    'Missing operator'],
            ['10001 x 240 x 2/3600',  'number exceeds max'],
            ['100 x 10001 x 2/3600',  'number exceeds max'],
            ['100 x 10001 x 2/10001', 'number exceeds max'],
            ['240x240x10001/3600',    'number exceeds max'],
            ['0 x 240 x 2/3600',      'number less than minimum'],
            ['240 x 240 x 2/0',       'Cannot divide by zero'],
            ['240 x 240 x 0/3600',    'Numerator cannot be zero'],
        );
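Once the \s* pieces are added the pattern gets hard to read. One option, not used in the thread, is the /x modifier, which lets the regex be laid out one number per line. A sketch using the \A...\z variant; the is_valid() helper is made up, and the test strings are taken from the lists above:

```perl
use strict;
use warnings;

my $number = qr/10000|[1-9]\d{0,3}/;

# With /x, whitespace inside the pattern is ignored, so the layout is free.
my $valid = qr{
    \A
    $number \s* x \s*
    $number \s* x \s*
    $number \s* / \s*
    $number
    \z
}x;

# Hypothetical helper: 1 if the string is a valid expression, else 0.
sub is_valid {
    my ($string) = @_;
    return $string =~ $valid ? 1 : 0;
}

for my $string ('120x240x2 / 3600', '240x240x10001/3600', '240 x 240 x 2/0') {
    printf "%-20s %s\n", $string, is_valid($string) ? 'valid' : 'invalid';
}
```

The literal "x" between the numbers is unaffected by /x; the modifier only changes how whitespace and "#" inside the pattern are treated.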