Re^2: splitting up a string based on character locations

Thanks, GrandFather. What would you do with this real example?

e:\logfiles\mylocationbase.log [3] Mon 13Mar06 11:00:25 - (000558) Sending file d:\data\18bn5489.dat

e:\logfiles\mylocationbase.log [3] Mon 13Mar06 11:00:31 - (000558) Sent file d:\data\18bn5489.dat successfully (169 Kb/sec - 1049600 bytes)

e:\logfiles\mylocationbase.log [3] Mon 13Mar06 11:00:32 - (000558) Sending file d:\data\18bn5489.sta

These repeat over and over.

From the first line I want to extract what is between e:\logfiles\ and .log (here "mylocationbase"). That part will change, but e:\logfiles\ and .log will always be there.

I also want to extract the Mon 13Mar06 11:00:25, then skip everything from the - to the \ after data.

And keep everything to the end.

On the second line, I want to skip anything that ends with a ).

On the third line, I want to skip anything that ends with .sta.

thanks,
Ad.

Re^3: splitting up a string based on character locations by GrandFather (Saint) on Dec 19, 2006 at 21:29 UTC
When you learn to format your messages so that they can be read I'll take a look at your "real" problem. See Writeup Formatting Tips and read about code tags for a start. You could revise your OP while you are at it. You may find PMEdit helps. You can get the current version from the CPAN scripts area: http://cpan.perl.org/scripts/.

DWIM is Perl's answer to Gödel
Re^4: splitting up a string based on character locations by batcater98 (Acolyte) on Dec 19, 2006 at 21:47 UTC
Sorry, I will try to follow the rules. My data follows, and I have also put in what I am trying to extract from each line.

First line: I want mylocationbase, I want Mon 13Mar06 11:00:25, and I want 18bn5489. All of this data will change, but the data around it will stay static as these lines repeat over and over.

Second line: I want to skip it because of the ) at the end.

Third line: I want to skip it because of the .sta at the end.

    e:\logfiles\mylocationbase.log [3] Mon 13Mar06 11:00:25 - (000558) Sending file d:\data\18bn5489.dat
    e:\logfiles\mylocationbase.log [3] Mon 13Mar06 11:00:31 - (000558) Sent file d:\data\18bn5489.dat successfully (169 Kb/sec - 1049600 bytes)
    e:\logfiles\mylocationbase.log [3] Mon 13Mar06 11:00:32 - (000558) Sending file d:\bnsf_data\18bn5489.sta
Re^5: splitting up a string based on character locations by GrandFather (Saint) on Dec 19, 2006 at 22:44 UTC
You really need to go read perlretut, perlre and perlreref.

Note that the code below relies on backtracking to get the first file name. The `.* \\` grabs as much as it can consistent with finding a match. On the first try it will grab everything up to the last \ on the line, then fail to match the `\[` later in the regex, so it backtracks to the penultimate \ (which fails too) and so on until it finds a suitable match for the entire expression. Note too that the negative look ahead assertion is not required.

    use warnings;
    use strict;

    while (<DATA>) {
        chomp;
        # Skip lines ending in ")" or ".sta" (optionally followed by ")")
        if (/(\) | \.sta \)?) $/x) {
            print "Skipping $_\n";
            next;
        }

        if (! m{
            .* \\(\w+)              # Get first file name
            [^\[]* \[ [^\]]* \] \s* # skip to date time string
            ([^-]*)                 # Everything up to the hyphen
            .* \\                   # skip to the last file name
            ([^.]*)                 # Grab file name excluding extension
            }x
            ) {
            print "No match: $_\n";
            next;
        }

        my ($first, $datetime, $last) = ($1, $2, $3);
        print "Extracted $first, $datetime, $last\n";
    }

    __DATA__
    e:\logfiles\mylocationbase.log [3] Mon 13Mar06 11:00:25 - (000558) Sending file d:\data\18bn5489.dat
    e:\logfiles\mylocationbase.log [3] Mon 13Mar06 11:00:31 - (000558) Sent file d:\data\18bn5489.dat successfully (169 Kb/sec 1049600 bytes)
    e:\logfiles\mylocationbase.log [3] Mon 13Mar06 11:00:32 - (000558) Sending file d:\bnsf_data\18bn5489.sta

Prints:

    Extracted mylocationbase, Mon 13Mar06 11:00:25 , 18bn5489
    Skipping e:\logfiles\mylocationbase.log [3] Mon 13Mar06 11:00:31 - (000558) Sent file d:\data\18bn5489.dat successfully (169 Kb/sec 1049600 bytes)
    Skipping e:\logfiles\mylocationbase.log [3] Mon 13Mar06 11:00:32 - (000558) Sending file d:\bnsf_data\18bn5489.sta
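The backtracking described in this reply can be seen in isolation with a small sketch. It is not part of the original post: the variable names are made up for illustration, and the sample line is the first data line from the thread joined onto one line. The same leading `.* \\(\w+)` captures a different file name depending on whether the rest of the pattern forces the greedy `.*` to give characters back.

```perl
use strict;
use warnings;

# Sample line from the thread's data, joined onto a single line.
my $line = 'e:\logfiles\mylocationbase.log [3] Mon 13Mar06 11:00:25 - (000558) Sending file d:\data\18bn5489.dat';

# Nothing follows the capture, so the greedy .* settles on the last backslash.
my ($unconstrained) = $line =~ m{ .* \\ (\w+) }x;

# Requiring a later "[" makes the match fail at the last backslash, so the
# engine backs off one backslash at a time until the whole pattern fits.
my ($constrained) = $line =~ m{ .* \\ (\w+) [^\[]* \[ }x;

print "unconstrained: $unconstrained\n";    # 18bn5489
print "constrained:   $constrained\n";      # mylocationbase
```

This is why the order of the pieces matters in the full regex: the bracketed `[3]` only appears after the first path, so it is what pins the first capture to the log file name.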
Re^5: splitting up a string based on character locations by batcater98 (Acolyte) on Dec 21, 2006 at 16:30 UTC
GrandFather, you have been a huge help in my getting to know regex. If I may, I'd like to ask one more question, as I am still learning the ins and outs of regex.

What you sent worked great. I need a slight modification, and I am not sure how to get it working. When a line ends in a ) I want to throw it out unless the same string contains .dat. If it does, I want to keep the file name (58bn5904 in the example below) and also keep the byte count (859216 bytes).

DATA:

    e:\logfiles\beardstownbase.log [3] Thu 22Jun06 08:07:20 - (006415) Sent file d:\data\58bn5904.sta successfully (1.05 Kb/sec - 61 bytes)
    e:\logfiles\beardstownbase.log [3] Thu 22Jun06 08:07:20 - (006415) Sending file d:\data\58bn5904.sta
    e:\logfiles\beardstownbase.log [3] Thu 22Jun06 08:07:19 - (006415) Sent file d:\data\58bn5904.dat successfully (25.0 Kb/sec - 859216 bytes)
    e:\logfiles\beardstownbase.log [3] Thu 22Jun06 08:06:46 - (006415) Sending file d:\data\58bn5904.dat

Output I am looking for:

    Skip:
    e:\logfiles\beardstownbase.log [3] Thu 22Jun06 08:07:20 - (006415) Sent file d:\data\58bn5904.sta successfully (1.05 Kb/sec - 61 bytes)
    Skip:
    e:\logfiles\beardstownbase.log [3] Thu 22Jun06 08:07:20 - (006415) Sending file d:\data\58bn5904.sta
    Extract beardstownbase, Thu 22Jun06 08:07:19, 58bn5904, 859216 bytes
    e:\logfiles\beardstownbase.log [3] Thu 22Jun06 08:07:19 - (006415) Sent file d:\data\58bn5904.dat successfully (25.0 Kb/sec - 859216 bytes)
    Skip:
    e:\logfiles\beardstownbase.log [3] Thu 22Jun06 08:06:46 - (006415) Sending file d:\data\58bn5904.dat

Again, thanks for all the help. I am learning!
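One possible way to approach the modification asked about above, offered as a sketch rather than an answer taken from the thread. The helper name `parse_sent_dat` is hypothetical. It assumes that a line to keep always has the form "...file.dat successfully (... N bytes)", as in the sample data, and that anything else should be skipped.

```perl
use strict;
use warnings;

# parse_sent_dat is a hypothetical helper name, not something from the thread.
# In list context it returns (base, datetime, file, bytes) for a completed
# .dat transfer and an empty list for every other line.
sub parse_sent_dat {
    my ($line) = @_;
    return $line =~ m{
        \\ (\w+) \.log \s*        # log base name, e.g. beardstownbase
        \[ [^\]]* \] \s*          # skip the bracketed number
        ([^-]*?) \s+ - \s         # date and time, up to the separating hyphen
        .* \\ ([^.\\]+) \.dat \s  # last file name, only if its extension is .dat
        .* \b (\d+) \s bytes \)   # byte count just before the closing parenthesis
        \s* \z
    }x;
}

# Sample lines from the post above, joined onto single lines.
my @lines = (
    'e:\logfiles\beardstownbase.log [3] Thu 22Jun06 08:07:20 - (006415) Sent file d:\data\58bn5904.sta successfully (1.05 Kb/sec - 61 bytes)',
    'e:\logfiles\beardstownbase.log [3] Thu 22Jun06 08:07:20 - (006415) Sending file d:\data\58bn5904.sta',
    'e:\logfiles\beardstownbase.log [3] Thu 22Jun06 08:07:19 - (006415) Sent file d:\data\58bn5904.dat successfully (25.0 Kb/sec - 859216 bytes)',
    'e:\logfiles\beardstownbase.log [3] Thu 22Jun06 08:06:46 - (006415) Sending file d:\data\58bn5904.dat',
);

for my $line (@lines) {
    if (my ($base, $when, $file, $bytes) = parse_sent_dat($line)) {
        print "Extract $base, $when, $file, $bytes bytes\n";
    }
    else {
        print "Skip: $line\n";
    }
}
```

Folding the skip rules into the single pattern means a line is kept only when every required piece is present. The non-greedy `([^-]*?)` followed by `\s+` also drops the trailing space that the earlier regex left after the date and time.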