Tricky has asked for the wisdom of the Perl Monks concerning the following question:

Hello again brethren. Having dealt with the file-writing problem with your help, I'm confronted with confusing error messages concerning my regexps to remove letter-spacing and word-spacing attributes from my in-document style sheet. The code's clunky, but it's just a little test of my budding regexps. What am I missing?! Could someone explain why this is occurring? Here are the error messages and the code:
Use of uninitialized value in pattern match (m//)... An undefined value was used as if it were already defined
#!/usr/bin/perl
# read letter and word spacing attributes.plx
# Program will read in an html file, scan for word and letter spacing attributes
# and print out entire doc.
# 1. No need for file variable yet: open (INFILE, "<".$htmlFile) or die("Can't read source file!\n");
use warnings;
use diagnostics;
use strict;

# Declare and initialise variables.
my $pattern1 =~ /(word)-(spacing):\s*[\d]+px/;   # word spacing regexp
my $pattern2 =~ /(letter)-(spacing):\s*[\d]+px/; # letter spacing regexp
my @htmlLines;

# Open HTML test file and read into array.
open INFILE, "E:\\Documents and Settings\\Richard Lamb\\My Documents\\HTML\\test2InDocCSS.html"
    or die "Sod! Can't open this file.\n";
@htmlLines = <INFILE>;
close (INFILE);

while(@htmlLines) {
    # check for word spacing attributes in array
    sub wordSpacing {
        foreach my $line (@htmlLines) {
            if(/$pattern1 && $pattern2/) {
                printHTML();
            }
            else {
                notFound();
            }
        }
    }
}

# prints the reformatted HTML doc
sub printHTML {
    for my $i (0..@htmlLines-1) {
        print $htmlLines[$i];
        print "Success!\n";
    }
}

# Prints error message
sub notFound {
    print "Letter and word spacing attributes note found in HTML file.\n";
}
Many thanks for your wisdom and time! Tricky

Replies are listed 'Best First'.
Re: Regexp conundrum
by Abigail-II (Bishop) on Aug 19, 2003 at 15:26 UTC
    Use of uninitialized value in pattern match (m//)... An undefined value was used as if it were already defined

    You know, it is helpful if you indicate where the warning occurs.

    my $pattern1 =~ /(word)-(spacing):\s*[\d]+px/;   # word spacing regexp
    my $pattern2 =~ /(letter)-(spacing):\s*[\d]+px/; # letter spacing regexp

    What's that supposed to do? You are matching against undefined variables.

    if(/$pattern1 && $pattern2/)

    You haven't defined $pattern1 and $pattern2 yet. But if you do, the code above most likely doesn't do what you think it does.
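To make the difference between binding and assignment concrete, here is a minimal sketch. It assumes the intent was to store a pattern for later use; the names `$unset`, `$pattern` and `@warnings` are made up for illustration and are not from the original script.

```perl
use strict;
use warnings;

# Collect warnings instead of printing them, so the effect is visible.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# Binding (=~) tries to match the still-undefined variable: this warns.
my $unset;
my $matched = $unset =~ /(word)-(spacing):\s*\d+px/;

# Assignment (=) stores the pattern text for later use: no warning.
my $pattern = '(word)-(spacing):\s*\d+px';
my $found   = 'word-spacing: 4px' =~ /$pattern/;

print "warnings seen: ", scalar(@warnings), "\n";
print "first warning: $warnings[0]" if @warnings;
print "match with assigned pattern: ", ($found ? "yes" : "no"), "\n";
```

The only change between the two halves is `=~` versus `=`, which is probably where the reported warning comes from.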

    Abigail

      How may I initialise the variables? A literal string within quotes? Much of the code I wrote didn't make much sense, so I've cut the unnecessary stuff out. Live and learn... Richard
        If the intention is to later interpolate them into a regexp, I'd go for a literal string. A qr construct works too, but that may be less efficient (it depends on how the interpolation happens).
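As a rough illustration of the two options mentioned above, the sketch below stores one pattern as a plain string and one as a `qr//` object, then interpolates both into a larger regexp. The variable names and the sample CSS are assumptions for the example, and the capturing parentheses from the original patterns are dropped to keep the output simple.

```perl
use strict;
use warnings;

# Option 1: a plain string, interpolated into a match later.
my $word_str = 'word-spacing:\s*\d+px';

# Option 2: a precompiled regex object built with qr//.
my $letter_re = qr/letter-spacing:\s*\d+px/;

my $css = 'p { word-spacing: 3px; letter-spacing: 1px; }';

# Both forms can be used on their own...
my $has_word   = $css =~ /$word_str/ ? 1 : 0;
my $has_letter = $css =~ $letter_re  ? 1 : 0;

# ...or combined into one larger pattern.
my @hits = $css =~ /($word_str|$letter_re)/g;

print "word: $has_word, letter: $has_letter\n";
print "hit: $_\n" for @hits;
```

Single quotes matter for the string form: in double quotes, `\s` and `\d` would lose their backslashes before the regexp engine ever saw them.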

        Abigail

Re: Regexp conundrum
by Limbic~Region (Chancellor) on Aug 19, 2003 at 15:27 UTC
    Tricky,
    You have most likely created an infinite loop:
    while(@htmlLines)
    This is probably not what you wanted. You probably want a foreach loop, which goes through the array once, or a loop that shifts the lines off one at a time. The reason is that the array is tested in scalar context, which for an array yields the number of elements it contains. Since nothing in your loop body removes elements, that number never reaches 0 and the loop runs forever.
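The two loop styles mentioned here can be sketched as follows. The sample data and counter names are invented for illustration; the point is only that `while (@array)` terminates when something inside the loop shrinks the array.

```perl
use strict;
use warnings;

my @html_lines = ("<p>one</p>\n", "<p>two</p>\n", "<p>three</p>\n");

# foreach visits each element once and leaves the array untouched.
my $seen_foreach = 0;
foreach my $line (@html_lines) {
    $seen_foreach++;
}

# while (@copy) only ends because shift removes an element on every pass.
my @copy = @html_lines;
my $seen_while = 0;
while (@copy) {
    my $line = shift @copy;
    $seen_while++;
}

print "foreach saw $seen_foreach lines, while/shift saw $seen_while\n";
print "lines left in copy: ", scalar(@copy), "\n";
```

The shifting version works on a copy here so that the original lines are still available afterwards.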

    Additionally - I don't think

    if(/$pattern1 && $pattern2/)
    is what you wanted either. You probably want something more like: if (/$pattern1/ && /$pattern2/) once you have defined $pattern1 and $pattern2 correctly.
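A small sketch of why the two forms differ, assuming the patterns are stored with `qr//`. The helper names `has_both` and `has_both_wrong` are hypothetical. Note also that the original loop declares `my $line`, so a bare `/.../` there tests `$_` rather than `$line`; binding explicitly with `$line =~` (or `$text =~` below) avoids that.

```perl
use strict;
use warnings;

my $word_re   = qr/word-spacing:\s*\d+px/;
my $letter_re = qr/letter-spacing:\s*\d+px/;

# Returns 1 only when both attributes appear in the given text.
sub has_both {
    my ($text) = @_;
    return ($text =~ $word_re && $text =~ $letter_re) ? 1 : 0;
}

# A single pattern containing '&&' looks for those characters literally,
# so it does not act as a logical AND.
sub has_both_wrong {
    my ($text) = @_;
    return ($text =~ /$word_re && $letter_re/) ? 1 : 0;
}

my $style = 'letter-spacing: 2px; word-spacing: 5px;';
print "separate matches: ", has_both($style), "\n";
print "single pattern:   ", has_both_wrong($style), "\n";
```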
    Sorry - HTH

    Cheers - L~R

      Cheers L-R, I'm working hard on joining the coding community - never thought it was going to be easy. Cheers for the advice folks! T
Re: Regexp conundrum
by batkins (Chaplain) on Aug 19, 2003 at 15:30 UTC
    Well, the syntax of lines 11 and 12 isn't quite right. If you're trying to save a regex to a variable for later use, then what you want is:
    my $pattern1 = qr/(word)-(spacing):\s*[\d]+px/;   # word spacing regexp
    my $pattern2 = qr/(letter)-(spacing):\s*[\d]+px/; # letter spacing regexp
    This will compile each regex to that variable.

    I don't really know why you have a sub inside a while loop, or why there's a for loop inside that sub. I would recommend just using one big for loop, like so:

    foreach my $line (@htmlLines) {
        if(/$pattern1 && $pattern2/) {
            printHTML();
        }
        else {
            notFound();
        }
    }
    You don't need the while loop (which wasn't doing what you wanted anyway - it would have looped forever), and you don't really need the extra sub anyway.

    The code in the if clause is also not likely to do what you want. Personally, I would recommend avoiding compiled regexes (like those on lines 11 and 12). So take out lines 11 and 12 and replace the if clause with this:

    if(/(word)-(spacing):\s*[\d]+px/ && /(letter)-(spacing):\s*[\d]+px/)
    The printHTML function is getting called on every iteration through the loop, which is probably unnecessary. The notFound sub is superfluous, but won't cause you any problems.
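One way to avoid printing on every iteration is to examine the whole file first and report once. The sketch below does that with `grep` in scalar context. It is only an illustration under the assumption that the goal is a single success-or-failure message; the sample lines stand in for the contents of the HTML file.

```perl
use strict;
use warnings;

my @html_lines = (
    "<style>\n",
    "p { word-spacing: 4px; }\n",
    "h1 { letter-spacing: 2px; }\n",
    "</style>\n",
);

# grep in scalar context returns how many lines matched.
my $word_count   = grep { /word-spacing:\s*\d+px/ }   @html_lines;
my $letter_count = grep { /letter-spacing:\s*\d+px/ } @html_lines;

# Decide once, after the whole file has been examined.
if ($word_count && $letter_count) {
    print "Success! Both attributes are in the document.\n";
    print @html_lines;
}
else {
    print "Letter and word spacing attributes not found in HTML file.\n";
}
```

Checking the file as a whole also covers the case where the two attributes sit on different lines, which a per-line test for both would miss.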

    To be brutally honest, this code doesn't make much sense. Could you explain exactly what you're trying to do? This way, we can help you get where you're trying to go.


    milkbone - perl/tk instant messaging - it's the only way to fly
      Hello monks, I'm scrambling around too much, and not thinking about what I'm doing; I'm a newbie to coding, and still finding my feet.
      1. My aim (for today) is to read in the HTML file on a filehandle, check for the presence of the word and letter spacing attributes and print out a 'success, patterns are in the document' message.
      2. Next is to remove the attributes with 'substitute', write the corrections to the HTML source file, and open the browser to see if it has worked.
      3. I'm going to remove the attributes from the styles (in-line styles at the moment), check the source file for the absence of them, and re-include them within the tags.
      I want to see how I can change the HTML using regexps, from a dyslexic/poor-vision user's point-of-view. Word and letter spacing, linearising hyperlinks and removing unnecessary emphasis of text (confusing for dyslexics - I should know, I'm one of them!) are some of the requirements I've identified. I've managed to extract image tags and re-write the changes, so I'm heading in the right direction. Question: I've not encountered the 'qr' modifiers before; what are they, and what do they do?
      Thanks for your patience, brethren...
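For step 2 of the plan above, removing the attributes with a substitution might look roughly like this. It is a sketch that assumes pixel values only, as in the patterns used throughout the thread; `strip_spacing` is a hypothetical helper name, and writing the result back to the file is left out.

```perl
use strict;
use warnings;

# Remove word-spacing and letter-spacing declarations from a piece of CSS.
# Assumes pixel values only, as in the patterns used in this thread.
sub strip_spacing {
    my ($css) = @_;
    $css =~ s/\b(?:word|letter)-spacing:\s*\d+px\s*;?\s*//g;
    return $css;
}

my $before = 'p { color: red; word-spacing: 4px; letter-spacing: 2px; }';
my $after  = strip_spacing($before);
print "$after\n";
```

The `/g` modifier makes the substitution apply to every occurrence, and `(?:...)` groups the alternation without capturing it.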