PerlMonks  

Yesterday, I had just finished reading Perl Best Practices and I felt electrified. So many things that you could have been doing and haven't bothered with for so long!

The book is so full of useful advice that I felt ashamed of every good piece that I could have found on my own and, out of sheer laziness, had not. Thus I decided that, from now on, I will put into practice all the advice that I liked (most of it, actually) and that was not already part of my day-to-day programming.

Then I updated some of my templates for subs, class creation, and so on, and I started the next project on my agenda fully armed with new knowledge. (Update: please notice that I said "next project". True to the if-it-works-don't-change-it principle, I leave my existing code as it is until it needs maintenance.)

One of the useful pieces of advice that struck me as simple and very easy to adopt was about regular expressions. The book says "always use the /x modifier" (and /m and /s as well, for reasons that I leave to the reader to find out in the book), so that your regexes are easier to read. No big deal, I thought. I occasionally use the /x modifier, so why not make a habit of it? So I set off on the new course of action, and I changed one of my templates for parsing a simple data file. Before, I used to write things like this:

#!/usr/bin/perl
use strict;
use warnings;

PARSE:
while (<DATA>) {
    chomp;
    next PARSE if /^\s*$/ ;   # skip blank lines
    next PARSE if /^\s*#/ ;   # skip comments
    # do something useful with the data
    print "<$_>\n";
}
__DATA__
some data here
# a comment
# another comment, followed by a blank line

more data
# another comment
final data

Since I use this kind of thing very often, I have a template for it in my editor. I updated it so it became:

#!/usr/bin/perl
use strict;
use warnings;

PARSE:
while ( my $line = <DATA> ) {
    chomp $line;
    next PARSE if $line =~ m{ ^ \s* $ }xsm;
    next PARSE if $line =~ m{ ^ \s* # }xsm;
    # do something useful with the data
    print "<$line>\n";
}
__DATA__
some data here
# a comment
# another comment, followed by a blank line

more data
# another comment
final data

It looks better, doesn't it?

Unfortunately, this code is not the same as the previous one. When I ran my program for the first time, I got an empty result set. No lines were parsed at all.

I spent several minutes scratching my head, until I realized that the idiom I had used countless times in the past was failing me now.

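The problem is that /^\s*#/ is not the same as /^\s*#/x. The /x modifier allows not only whitespace but also comments, so the "#" character is not a literal any more: it starts a comment that runs to the end of the line, which here is the end of the pattern. What is left is effectively /^\s*/, which matches every line, so every line was skipped!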
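To see the difference in isolation, here is a minimal sketch. The sample lines in @lines are made up for illustration and are not the data from the template above; the two patterns are the ones discussed in this post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample lines, not taken from the original template.
my @lines = ( 'some data here', '# a comment', '', '  # indented comment' );

# Without /x the '#' is a literal character: only comment lines match.
my @plain = grep { $_ =~ /^\s*#/ } @lines;

# With /x the unescaped '#' starts a comment, so the pattern is
# effectively just ^\s* and every line matches, even the empty one.
my @extended = grep { $_ =~ m{ ^ \s* # }xsm } @lines;

printf "without /x: %d of %d lines match\n", scalar @plain,    scalar @lines;
printf "with /x:    %d of %d lines match\n", scalar @extended, scalar @lines;
```

With these four lines, the first pattern should match only the two comment lines, while the /x version matches all four.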

Of course, I should have adopted yet another best practice, i.e. the one that says to put a character that needs escaping into a character class.   (*)

Now, m{ ^ \s* [#] }xms works as advertised, but I wonder if it was worth the trouble of deviating from the less readable but well-established m/^\s*#/.   (**)
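For comparison, a small sketch of the corrected loop. The @input list is a hypothetical stand-in for the DATA section, and the variable names are mine. It also tries the plain backslash escape (\#), which is not the spelling the book recommends but keeps the "#" literal under /x just the same.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input, standing in for the DATA section.
my @input = (
    'some data here', '# a comment', '',
    'more data', '  # another comment', 'final data',
);

my $blank          = qr{ ^ \s* $   }xms;
my $comment_class  = qr{ ^ \s* [#] }xms;   # '#' inside a character class
my $comment_escape = qr{ ^ \s* \#  }xms;   # '#' with a backslash escape

# The parsing loop from the template, using the character-class version.
my @kept;
for my $line (@input) {
    next if $line =~ $blank;
    next if $line =~ $comment_class;
    push @kept, $line;
}

# The same filter with the backslash-escaped version, for comparison.
my @kept_escape = grep { $_ !~ $blank && $_ !~ $comment_escape } @input;

print "<$_>\n" for @kept;
print "both spellings agree\n" if "@kept" eq "@kept_escape";
```

Both spellings should keep only the three data lines. The character class has the advantage of standing out visually in a pattern that is otherwise full of whitespace.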

Lesson learned: cargo cult coding is always a risk, even for experienced programmers. Think before refactoring!

(*) Provided that I realized first that there was a character to escape!

(**) Yes, of course it was. I just have to remember to connect fingers and brain before starting a coding session!

 _  _ _  _  
(_|| | |(_|><
 _|   

In reply to A refactoring trap by gmax
