Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to parse an ICal Duration string. Here's the definition from the RFC:
Formal Definition: The value type is defined by the following notation:

    dur-value  = (["+"] / "-") "P" (dur-date / dur-time / dur-week)
    dur-date   = dur-day [dur-time]
    dur-time   = "T" (dur-hour / dur-minute / dur-second)
    dur-week   = 1*DIGIT "W"
    dur-hour   = 1*DIGIT "H" [dur-minute]
    dur-minute = 1*DIGIT "M" [dur-second]
    dur-second = 1*DIGIT "S"
    dur-day    = 1*DIGIT "D"

And here's the Perl regex I've been using, which has been failing:

    my @temp = $str =~ m{
        ([\+\-])?            (?# Sign)
        (P)                  (?# 'P' for period? This is our magic character)
        (?:
            (?:(\d+)Y)?      (?# Years)
            (?:(\d+)M)?      (?# Months)
            (?:(\d+)W)?      (?# Weeks)
            (?:(\d+)D)?      (?# Days)
        )?
        (?:T                 (?# Time prefix)
            (?:(\d+)H)?      (?# Hours)
            (?:(\d+)M)?      (?# Minutes)
            (?:(\d+)S)?      (?# Seconds)
        )?
    }x;

This fails even on simple duration strings (e.g. "P1D"). Thanks in advance for any suggestions.

Replies are listed 'Best First'.
Re: regex - parsing a string with many optional fields
by matija (Priest) on Feb 25, 2004 at 14:34 UTC
    Are you sure it's failing, and not just giving you what you asked for rather than what you expected?

    If I run the code you provided, with the string you provided ("P1D"), the match succeeds, and I get this in @temp:

      DB<1> x @temp
    0  undef
    1  'P'
    2  undef
    3  undef
    4  undef
    5  1
    6  undef
    7  undef
    8  undef

    To me that looks like what you should be getting, given the approach you've chosen: each field in the grammar always lands in the same array slot, whether or not it was present in the string.

    Could it be that you saw that first undef and didn't realize that the data was in the later fields?

    With the approach you're using, a failed match gives you an empty array, so test for that (for example, by checking whether $#temp is -1).
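
    As a rough sketch of the point above: the regex from the question can be kept as it is, with the capture slots given names and an empty result treated as "no match". The field names and the helper parse_duration_fields are made up for illustration and do not come from the thread or from any module.

```perl
use strict;
use warnings;

# Names for the nine capture slots, in order (illustrative names only).
my @FIELDS = qw(sign p years months weeks days hours minutes seconds);

# The regex from the question, unchanged apart from layout and comments.
my $DURATION_RE = qr{
    ([\+\-])?          # sign
    (P)                # mandatory P
    (?:
        (?:(\d+)Y)?    # years
        (?:(\d+)M)?    # months
        (?:(\d+)W)?    # weeks
        (?:(\d+)D)?    # days
    )?
    (?:T
        (?:(\d+)H)?    # hours
        (?:(\d+)M)?    # minutes
        (?:(\d+)S)?    # seconds
    )?
}x;

# Hypothetical helper: returns a hash ref holding only the fields that
# were present, or undef when the regex did not match at all.
sub parse_duration_fields {
    my ($str) = @_;
    my @temp = $str =~ $DURATION_RE;
    return unless @temp;    # an empty list means the match failed
    my %got;
    for my $i (0 .. $#FIELDS) {
        $got{ $FIELDS[$i] } = $temp[$i] if defined $temp[$i];
    }
    return \%got;
}

my $fields = parse_duration_fields('P1D');
print join(', ', map { "$_=$fields->{$_}" } sort keys %$fields), "\n";
# prints: days=1, p=P
```

    Note that the pattern is not anchored, so it will also match a 'P' that sits in the middle of some other string. Adding \A and \z would be one way to tighten that.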
Re: regex - parsing a string with many optional fields
by dragonchild (Archbishop) on Feb 25, 2004 at 14:31 UTC
    Have you looked at Date::ICal? I went to http://search.cpan.org, typed in "ical", and it was the first hit.

    ------
    We are the carpenters and bricklayers of the Information Age.

    Please remember that I'm crufty and crotchety. All opinions are purely mine and all code is untested, unless otherwise specified.

Re: regex - parsing a string with many optional fields
by Fletch (Bishop) on Feb 25, 2004 at 14:32 UTC

    You might also look at how Net::ICal does things.

Re: regex - parsing a string with many optional fields
by delirium (Chaplain) on Feb 25, 2004 at 14:26 UTC
    Well, it doesn't look like you are making any decisions based on what type of duration is being read, so why not simplify it, e.g.:

    my @temp = $str =~ /([PT]\d+[YMWDHMS])/g;
      Note that this will not actually finish the parsing. Once the string has been broken out into its component parts (which this will do), more work needs to be done in order to properly understand the string.
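
      One possible shape for that extra work, as a minimal sketch rather than a finished parser: first check the string against the RFC grammar quoted in the question, then pull out the number/unit pairs and add them up. The function name duration_to_seconds is hypothetical, and treating a day as 86400 seconds and a week as 7 days is a simplifying assumption. Also worth knowing: a pattern like /([PT]\d+[YMWDHMS])/g only captures a component that directly follows P or T, so for "P1DT2H3M" it would return "P1D" and "T2H" but skip "3M". The sketch below matches bare \d+ unit pairs instead.

```perl
use strict;
use warnings;

# Seconds per unit. The quoted grammar has no months or years, so 'M'
# can only mean minutes. Fixed-length days and weeks are an assumption.
my %SECONDS = (W => 604_800, D => 86_400, H => 3_600, M => 60, S => 1);

# Hypothetical helper: returns the signed duration in seconds, or undef
# if $str is not a well-formed dur-value.
sub duration_to_seconds {
    my ($str) = @_;

    # Step 1: validate the overall shape against the grammar.
    my $time = qr{ T (?: \d+H (?: \d+M (?: \d+S )? )? | \d+M (?: \d+S )? | \d+S ) }x;
    return unless $str =~ m{ \A ([+-]?) P (?: \d+W | \d+D (?:$time)? | $time ) \z }x;
    my $sign = $1 eq '-' ? -1 : 1;

    # Step 2: break the string into number/unit pairs and sum them.
    my $total = 0;
    while ($str =~ /(\d+)([WDHMS])/g) {
        $total += $1 * $SECONDS{$2};
    }
    return $sign * $total;
}

print duration_to_seconds('P1D'), "\n";         # 86400
print duration_to_seconds('PT1H30M'), "\n";     # 5400
print duration_to_seconds('-P2W'), "\n";        # -1209600
```

      Strings the grammar does not allow, such as "P1M" or "P1W2D", come back as undef, so the caller can tell a bad string apart from a zero-length duration.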
