cajun has asked for the wisdom of the Perl Monks concerning the following question:
The URLs I'm downloading the data from look like this:
http://www.domain.com/data/2005/sales/01012005.txt
http://www.domain.com/data/2005/sales-jan/01232005.txt
http://www.domain.com/data/2005/sales-local/01012005.txt
http://www.domain.com/data/2005/sales-outside-jan/01012005.txt
...
...
What I want to extract from this is:
sales
sales-jan
sales-local
sales-outside-jan
...
...
The regex I have come up with to extract the information is:
$dir = $1 if /\/(\w+(|-\w+|-\w+-\w+))\/\w+\.txt$/;
This regex appears to be working correctly. My question is: am I going about it the right way? Could I have shortened the regex somehow?
Thanks,
Mike
Update: Corrected typo in regex that GrandFather found.
Thanks GrandFather and ikegami for the suggestions. Yes, I should have used a different delimiter, the leaning toothpicks are confusing. I understand GrandFather's suggestion, but I'll have to study ikegami's suggestion a bit. Thanks!
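For anyone reading along, here is a minimal sketch of the delimiter change, not necessarily what GrandFather or ikegami posted. It uses m{...} so the slashes need no escaping, and it assumes a character class [\w-]+ is acceptable for the directory name. That is looser than my original alternation: it accepts any number of hyphenated parts. The helper name extract_dir is made up for illustration.

```perl
use strict;
use warnings;

# Sample URLs taken from the question above.
my @urls = (
    'http://www.domain.com/data/2005/sales/01012005.txt',
    'http://www.domain.com/data/2005/sales-jan/01232005.txt',
    'http://www.domain.com/data/2005/sales-local/01012005.txt',
    'http://www.domain.com/data/2005/sales-outside-jan/01012005.txt',
);

# extract_dir is a hypothetical helper name, not from any reply.
sub extract_dir {
    my ($url) = @_;
    # m{...} avoids the leaning toothpicks; [\w-]+ matches word
    # characters and hyphens (assumption: looser than the original).
    my ($dir) = $url =~ m{/([\w-]+)/\w+\.txt$};
    return $dir;    # undef when the URL does not match
}

my @dirs = map { extract_dir($_) } @urls;
print "$_\n" for @dirs;
```

If directory names must have at most two hyphenated suffixes, the original alternation (or a bounded group such as (?:-\w+){0,2}) would be the stricter choice.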
Update II: Thanks to all for the great responses / ideas. Thanks to davidrw & YuckFoo for their suggestions on the split. Frankly I hadn't even thought of that. I became so wrapped up in the regex to get the directory, I hadn't even thought about the filename yet. Clearly a case of not seeing the forest for the trees.
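A rough sketch of the split idea, as I understand it (the exact code davidrw and YuckFoo posted may differ): split the URL on slashes and take the last two pieces, which gives the directory and the filename in one step. The helper name dir_and_file is hypothetical.

```perl
use strict;
use warnings;

# dir_and_file is a hypothetical helper name used for illustration.
sub dir_and_file {
    my ($url) = @_;
    # Split on "/" and keep the last two path components.
    my @parts = split m{/}, $url;
    return @parts[-2, -1];
}

my ($dir, $file) = dir_and_file('http://www.domain.com/data/2005/sales-jan/01232005.txt');
print "$dir $file\n";    # prints "sales-jan 01232005.txt"
```

This assumes every URL ends in directory/filename; unlike the regex, it does not check that the filename ends in .txt.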
Replies are listed 'Best First'.

Re: Regex question
by ikegami (Patriarch) on Aug 19, 2005 at 01:58 UTC
by Roy Johnson (Monsignor) on Aug 19, 2005 at 03:12 UTC
by radiantmatrix (Parson) on Aug 19, 2005 at 16:07 UTC
by tlm (Prior) on Aug 19, 2005 at 17:45 UTC

Re: Regex question
by davidrw (Prior) on Aug 19, 2005 at 02:25 UTC

Re: Regex question
by YuckFoo (Abbot) on Aug 19, 2005 at 02:35 UTC

Re: Regex question
by GrandFather (Saint) on Aug 19, 2005 at 01:57 UTC