ogxela has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to extract page sizes from various PDF files, and I'm using the following regex to do so:

my ($x, $y) = m/^\/MediaBox \[ \S+ \S+ (\S+) (\S+) \]$/;

An example of a typical line that might be matched here is:

/MediaBox [ 0 0 614.39999 792 ]

where I'm trying to get the last two numbers into $1 and $2. But I keep getting uninitialized values instead.

Help please!

Update: Escaping the brackets doesn't change anything.

Thanks,
Alex

Re: Regex problem
by Zed_Lopez (Chaplain) on Nov 04, 2004 at 01:41 UTC

    Works for me. Your string, your code, my print statement:

    $_ = "/MediaBox [ 0 0 614.39999 792 ]";
    my ($x, $y) = m/^\/MediaBox \[ \S+ \S+ (\S+) (\S+) \]$/;
    print "$x $y\n";

    produces:

    614.39999 792

      Hey, that info actually helped. I had trailing whitespace in $_ that was causing my regex to fail. Removing the end-of-line anchor ($) made it work.

      --Alex
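
A possible alternative to dropping the anchor, sketched here under the assumption that the only problem is trailing whitespace (spaces, tabs, or a stray carriage return): keep `$` and allow optional whitespace in front of it with `\s*`. The helper name `media_box_size` is hypothetical and not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns the last two numbers
# of a /MediaBox line, or an empty list if the line does not match.
sub media_box_size {
    my ($line) = @_;
    # \s* before $ tolerates trailing spaces, tabs, \r, and the newline,
    # while the anchor still rejects any other trailing text.
    return $line =~ m/^\/MediaBox \[ \S+ \S+ (\S+) (\S+) \]\s*$/;
}

# Sample line with trailing spaces and a CRLF line ending.
my ($x, $y) = media_box_size("/MediaBox [ 0 0 614.39999 792 ]  \r\n");
print defined $x ? "$x $y\n" : "no match\n";
```

Keeping the anchor means a line with unexpected text after the closing bracket is still rejected, which removing `$` entirely would not guarantee.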

Re: Regex problem
by erix (Prior) on Nov 04, 2004 at 01:48 UTC
    keep it simple :)
    my $string = '/MediaBox [ 0 0 614.39999 792 ]';
    my ($x, $y);
    if ( $string =~ /^\/MediaBox \[ \S+ \S+ (\S+) (\S+) \]$/ ) {
        $x = $1;
        $y = $2;
        print "matched\n";
        print $x . "\n";
        print $y . "\n";
    }
    else {
        print "no match\n";
    }
Re: Regex problem
by injunjoel (Priest) on Nov 04, 2004 at 01:19 UTC
    Greetings all,
    Try escaping your brackets:
    my ($x, $y) = m/^\/MediaBox \[ \S+ \S+ (\S+) (\S+) \]$/;

    Just a thought,
    -InjunJoel

    "I do not feel obliged to believe that the same God who endowed us with sense, reason and intellect has intended us to forego their use." -Galileo
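
For context on why escaping matters, here is a small sketch (the variable names are illustrative, not from the thread). With unescaped brackets, Perl reads everything between `[` and `]` as a single character class, so the parentheses inside it are literal class members rather than capture groups, and the pattern cannot match the sample line.

```perl
use strict;
use warnings;

my $line = '/MediaBox [ 0 0 614.39999 792 ]';

# Unescaped: "[ ... ]" is one character class, so the pattern expects
# exactly one character after "/MediaBox " and then the end of the string.
my $unescaped_ok = ( $line =~ m/^\/MediaBox [ \S+ \S+ (\S+) (\S+) ]$/ ) ? 1 : 0;

# Escaped: the brackets are literal and the two groups capture as intended.
my $escaped_ok = ( $line =~ m/^\/MediaBox \[ \S+ \S+ (\S+) (\S+) \]$/ ) ? 1 : 0;

print "unescaped matched: $unescaped_ok\n";
print "escaped matched: $escaped_ok\n";
```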