Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hey monks, there are a couple of minor things I need a bit of help on if u can...

I know how to find a word on a line...

if ($line =~ /\b$word\b/){}

But how do I find out if a word is not on a line, I thought maybe prefix the statement with a '!'? There's clearly something obvious I'm missing here!

The second thing i would like to know is how do I remove the first char from a string in an array element?
I would have something like "10000001", but I want the first '1' to no longer exist so it will read "0000001". It must be the first char, as sometimes there may be letters there or special chars (e.g. -, +, $) that I don't want. I was thinking of using split, which will work, but I thought there must be some regular expression that is much more efficient (as u can guess, regular expressions are not my strong point!).

The third thing is i want to remove all chars up to the last ',' in an array element. For example, if I have:

"","","hello, my name is Bob, my cell number is, 1234567890",""

I want to remove ", 1234567890" in array element 2. It wont always be a number, this is just an example. Again I guess there's some clever bit of regex that solves this problem very quickly!

These are probably easy to solve for u perl gurus out there, but for us lowly apprentices it's not so easy! I must get to grips with these god damn regexes, can anyone recommend a decent book that can explain this sorta stuff for people as mentally challenged as myself?! (No books for dummies please, I don't need my mental capacity challenged by patronising yellow books with kiddie pics plastered over them! ;-))

Thanks peeps, Chad :-)

Re: Not find a word / remove first char in string / remove last value in list
by Corion (Patriarch) on Nov 18, 2003 at 11:33 UTC

    I think that perldoc perlre as a reference and perldoc perlretut as a tutorial for regular expressions work well. If you want a book that goes far beyond this, have a look at Mastering Regular Expressions, although I don't know how good it is at introducing regular expressions.

    Some pointers to get you started on your homework:

    1. A match can be naively negated in two ways:
      if ($item !~ /foo/) {};
      if (! $item =~ /foo/) {};
      Experiment with these to find the solution for the first exercise.
    2. For removing the first char of a string, perldoc -f substr might be of interest, but as you want to limit yourself to regular expressions, the substitution operator s/// will be of interest to you. perldoc -f split might also be a way to a solution, but you will also need perldoc -f join to complete that way.
    3. This is a hard problem to solve with regular expressions for someone new to REs. I would use one of the prefabricated modules for parsing CSV strings, but if you really want to go down the road of parsing CSV with regular expressions, approach the problem in the following fashion:
      1. read about negated character classes in perlre
      2. find a RE that matches a quoted string. A quoted string is a string that starts with a double quote, ends with a double quote and contains no double quote between those two.
      3. find a RE that matches a comma within a quoted string
      4. find out how to combine smaller REs into larger REs
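The steps in point 3 can be sketched roughly as follows. This is only an illustration, not a CSV parser: it assumes fields are always double-quoted and never contain an escaped or embedded double quote. The helper name quoted_fields is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return the contents of every double-quoted field.
# Assumes no escaped or embedded double quotes inside a field.
sub quoted_fields {
    my ($line) = @_;
    # A quoted string: opening quote, any run of non-quote characters
    # (a negated character class), closing quote.
    my $quoted = qr/"([^"]*)"/;
    my @fields = $line =~ /$quoted/g;   # list context + /g returns all captures
    return @fields;
}

my $line   = '"","","hello, my name is Bob, my cell number is, 1234567890",""';
my @fields = quoted_fields($line);

# Within one field, drop the last comma and everything after it.
$fields[2] =~ s/,[^,]*\z//;

print join('|', @fields), "\n";
```

Real CSV data with escaped quotes or embedded newlines is better left to one of the prefabricated modules.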

    Personally, I wouldn't approach many of these problems with regular expressions, as the time invested to craft such a special RE takes much too long for me, when a step-by-step solution is immediately available.
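One possible step-by-step version of the "drop the last comma and what follows" task, without any regular expression, might look like this. The helper name drop_after_last_comma is made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: cut a string at its last comma, using only
# rindex and substr.
sub drop_after_last_comma {
    my ($text) = @_;
    my $pos = rindex($text, ',');     # position of the last comma, or -1
    return $text if $pos < 0;         # no comma: nothing to remove
    return substr($text, 0, $pos);    # keep everything before that comma
}

print drop_after_last_comma('hello, my name is Bob, my cell number is, 1234567890'), "\n";
```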

    perl -MHTTP::Daemon -MHTTP::Response -MLWP::Simple -e ' ; # The
    $d = new HTTP::Daemon and fork and getprint $d->url and exit;#spider
    ($c = $d->accept())->get_request(); $c->send_response( new #in the
    HTTP::Response(200,$_,$_,qq(Just another Perl hacker\n))); ' # web
Re: Not find a word / remove first char in string / remove last value in list
by broquaint (Abbot) on Nov 18, 2003 at 11:27 UTC
    But how do I find out if a word is not on a line
    Just use the negated version of the regex match, e.g.
    shell> perl -le 'print "yep" if "foo bar" !~ /baz/'
    yep
    The second thing i would like to know is how do I remove the first char from a string in an array element?
    Using substr is a pretty sensible approach, e.g.
    shell> perl -le '$_="xfoo"; print; substr($_,0,1)=""; print'
    xfoo
    foo
    Or you could use a simple s///, e.g.
    shell> perl -le '$_="xfoo"; print; s/^.//; print'
    xfoo
    foo
    The third thing is i want to remove all chars up to the last ',' in an array element.
    You could use a naive regex for this, which would just match a comma and then everything that wasn't a comma to the end of the string:
    shell> perl -le '$_="foo,bar,baz"; print; s/,[^,]+$//; print'
    foo,bar,baz
    foo,bar
    But if you're parsing comma-delimited fields then the likes of Text::xSV is a much saner approach.

    The above answers can be applied to an array element just as easily as to a simple scalar variable, as they're both scalar values.
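As a small sketch of that last point, the same operations can be applied directly to individual array elements. The sample data and the helper name clean_fields are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the one-liners above to individual array
# elements. It works on a copy so the caller's array is untouched.
sub clean_fields {
    my @fields = @_;
    $fields[0] =~ s/^.//;            # drop the first character of element 0
    substr($fields[1], 0, 1) = '';   # same idea via substr on element 1
    $fields[2] =~ s/,[^,]+$//;       # drop the last comma-separated value
    return @fields;
}

# Invented sample data.
my @cleaned = clean_fields('10000001', '+42', 'foo,bar,baz');
print join(' | ', @cleaned), "\n";
```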

    HTH

    _________
    broquaint

Re: Not find a word / remove first char in string / remove last value in list
by Roger (Parson) on Nov 18, 2003 at 11:26 UTC
    But how do I find out if a word is not on a line, I thought maybe prefix the statement with a '!'?

    Yes, the syntax for a negative match is:
    if ($line !~ /\b$word\b/) { ... }
    how do I remove the first char from a string in an array element

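    You can use the substr function as an lvalue (assign an empty string rather than undef, which triggers an "uninitialized value" warning under warnings):
    substr($string, 0, 1) = '';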
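Besides the lvalue form, substr also accepts a fourth argument holding the replacement text, and then returns the part it replaced. A minimal sketch, with a made-up helper name chop_first:

```perl
use strict;
use warnings;

# Hypothetical helper: remove the first character and return both the
# removed character and the shortened string.
sub chop_first {
    my ($string) = @_;
    return ('', $string) if $string eq '';     # nothing to remove
    # Four-argument substr replaces the range in place and returns
    # the text that was there before.
    my $removed = substr($string, 0, 1, '');
    return ($removed, $string);
}

my ($first, $rest) = chop_first('10000001');
print "$first $rest\n";
```

This is handy when the removed character (a sign or currency symbol, say) is still needed afterwards.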
    I want to remove ", 1234567890" in array element 2. It wont always be a number

    Yes you can use a simple regex:
    my $str = '"hello, my name is Bob, my cell number is, 1234567890"';
    $str =~ s/,[^,]*"$/"/;
    print "$str\n";
    Now, the tricky bit is to split your records. You can either use a complicated split (see node 306640), or you can use the Text::CSV_XS module from CPAN, or tilly's Text::xSV for a pure Perl implementation.

Re: Not find a word / remove first char in string / remove last value in list
by Art_XIV (Hermit) on Nov 18, 2003 at 14:11 UTC

    Another (distinctly Perlish) way to see if a word is not on a line, which is often more to-the-point than negative conditions, is:

    unless ($item =~ /foo/) {};
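The negation styles that have come up in this thread can be put side by side as a rough sketch. The helper name lacks_word and the sample lines are invented, and the \Q...\E quoting of $word is an addition that the original question did not use.

```perl
use strict;
use warnings;

# Hypothetical helper: true when $word does not occur as a whole word.
# \Q...\E quotes any regex metacharacters in $word.
sub lacks_word {
    my ($line, $word) = @_;
    return $line !~ /\b\Q$word\E\b/ ? 1 : 0;
}

for my $line ('foo bar', 'foo baz') {
    # Equivalent spellings of "baz is not on this line":
    my $bind   = $line !~ /\bbaz\b/    ? 1 : 0;   # negated binding operator
    my $parens = !($line =~ /\bbaz\b/) ? 1 : 0;   # parenthesised match, then !
    my $unless = 0;
    unless ($line =~ /\bbaz\b/) { $unless = 1 }   # unless block

    # Note: without the parentheses, "! $line =~ /.../" parses as
    # "(!$line) =~ /.../", because unary ! binds tighter than =~.
    print "$line: $bind $parens $unless ", lacks_word($line, 'baz'), "\n";
}
```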

    The best way to come to terms w/ regular expressions is to just experiment & play with them until they start to 'click' for you. Once you start to get them, then you can re-read the docs w/ at least half a clue about what they're talking about. Then you can move on to the more advanced materials.

    Don't feel bad about having problems w/ regular expressions. There are thousands of experienced, high quality coders that regard them as weird, voodoo-like arcana. The fact that they're really a language-within-a-language can really throw coders.

    Hanlon's Razor - "Never attribute to malice that which can be adequately explained by stupidity"