frasco has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks, I have a small question concerning database fields and regexes. As far as I understand, in a flat file (e.g. a .txt file) the end of a line is marked by \n (at least on UNIX-like systems). How are strings terminated in a database field?
For instance, if a field holds a string like "*E2@x*PAP", I can match the pattern E2 (between * and @x), but how can I match the pattern PAP (between * and ...)? If I use "find everything between * and \n" it doesn't work. Moreover, I have thousands of these patterns (character sequences to be matched), so I suppose I need expressions like (.*?) or something like that.
p.s. I use MySQL (I don't know if it is relevant!)

Replies are listed 'Best First'.
Re: regex and database fields
by kyle (Abbot) on Jun 10, 2008 at 15:35 UTC

    If you're matching against some $string, \z matches the end of the string.

    my $field = '*E2@x*PAP';
    $field =~ /\*PAP\z/;

    You can also use $, which matches at the very end of the string or just before a newline that ends the string.

    my $nl   = "newline\n";
    my $nonl = "newline";
    $nl   =~ /line$/;
    $nonl =~ /line$/;

    Update: As Test::More:

    use Test::More 'tests' => 3;

    my $field = '*E2@x*PAP';
    ok( $field =~ /\*PAP\z/, '\\z matches at end of string' );

    my $nl   = "newline\n";
    my $nonl = "newline";
    ok( $nl   =~ /line$/, '$ matches newline at end' );
    ok( $nonl =~ /line$/, '$ matches end of string without newline' );
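
    The two anchors only differ when the value ends in a newline. A minimal sketch of that difference, assuming the field might have picked up a trailing "\n" somewhere along the way (the helper name anchors_matching and the sample values are made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical helper: list which end anchors let /\*PAP/ match
# at the end of the given field value.
sub anchors_matching {
    my ($field) = @_;
    my @hits;
    push @hits, '$'  if $field =~ /\*PAP$/;    # end of string, or just before a final newline
    push @hits, '\z' if $field =~ /\*PAP\z/;   # very end of string only
    return @hits;
}

# No trailing newline: both anchors match.
print join( ' ', anchors_matching('*E2@x*PAP') ), "\n";

# Trailing newline: only $ still matches.
print join( ' ', anchors_matching( '*E2@x*PAP' . "\n" ) ), "\n";
```

    So if a stray newline in the field should count as a mismatch, \z is the stricter choice; otherwise $ is more forgiving.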
Re: regex and database fields
by Narveson (Chaplain) on Jun 10, 2008 at 15:33 UTC

    To capture whatever comes after an asterisk:

    /[*](.*)/

    To capture whatever comes after the last asterisk in the field:

    /[*]([^*]*)$/

    There is no field termination character, or if there is in the implementation, it is hidden from you and you should not worry about it. To make sure your expression matches at the end of the field, just use the $ positional anchor (or \z).
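
    If there are many such tokens per field, a global match can collect them all in one pass. This is only a sketch, assuming a token never contains '*' or '@' itself (which holds for the sample string but may not hold for the real data); the sub name tokens_after_asterisk is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: return every token that follows an asterisk.
# Assumes a token runs until the next '*' or '@', or the end of the field.
sub tokens_after_asterisk {
    my ($field) = @_;
    # In list context, a /g match returns all captured substrings.
    return $field =~ /[*]([^*\@]+)/g;
}

my @tokens = tokens_after_asterisk('*E2@x*PAP');
print join( ', ', @tokens ), "\n";    # prints "E2, PAP"
```

    No end anchor is needed here, because the negated character class simply stops at the end of the string.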

      thank you
Re: regex and database fields
by toolic (Bishop) on Jun 10, 2008 at 15:48 UTC
    Here is one way:
    use strict;
    use warnings;

    my $str = '*E2@x*PAP';
    if ($str =~ /\*(.+?)\@x\*(.+?)$/) {
        print "1st: $1\n";
        print "2nd: $2\n";
    }

    __END__
    1st: E2
    2nd: PAP

    With an explanation brought to you by YAPE::Regex::Explain: