in reply to Backtracking for substitutions
    $mydata =~ s/([\w\s]+)\s([\w\d]+)\s(0000.*)/$1,$2,$3/g;
    $mydata =~ s/(:\d{2})\s(0000)/$1,$2/g;

And if you're not easily overwhelmed by lots of documentation, here's the whole program with setup code, comments, and output:

    #!/usr/bin/perl
    # NODE34853.pl
    # Assumptions: There are potentially any number of parts of a user's name.
    # For example, "Bill Clinton" might be a user's name, but
    # "William Jefferson Clinton the Liar" might also be his name.
    # The username is next, which is always one word long. It might also have
    # numbers in it, such as Bill69.
    # The data is currently in a single scalar (as though you read it from a flat file).
    # It's not clear what granularity you want the data to have. I'm assuming
    # that you want the user's name, his username, and the individual chunks of login
    # data. Do you also want to split up the login data? Your post didn't say.
    #
    # Knowing what you want to do with this data afterwards would also help. If you want to
    # load this into a SQL database, then you'd probably want to do this a bit differently.
    # But, if your goal is just to comma-delimit the file so you can load it into
    # a spreadsheet, then this oughta do the trick.
    #
    # This solution is really just a two-line program with lots of comments and some
    # stuff to set up the environment and print the results.
    # I hope it helps.
    # --Mark
    #
    # This line just sets up the scalar variable you want to parse.
    # I'm assuming you have other methods of doing this (reading from
    # CSV, etc.)
    $mydata = <<ENDDATA;
    Bob Smith bsmith 00001234567 01/01/1986 00:00:00
    Mary Ann Doe mdoe 00001234568 01/01/1986 00:00:01 00001234563 01/01/1986 00:00:02 00001234563 01/01/1986 00:00:03
    Gilligan Q Smith gsmith 00001234569 01/01/1986 00:00:01 00001234569 01/01/1986 00:00:02
    ENDDATA

    # The purpose of this regex is just to split out the user's NAME,
    # USERNAME, and associated DATA. We're leaving the guts of the DATA alone for now.
    $mydata =~ s/([\w\s]+)\s([\w\d]+)\s(0000.*)/$1,$2,$3/g;

    #MyData temporarily looks like this:
    #Bob Smith,bsmith,00001234567 01/01/1986 00:00:00
    #Mary Ann Doe,mdoe,00001234568 01/01/1986 00:00:01 00001234563 01/01/1986 00:00:02 00001234563 01/01/1986 00:00:03
    #Gilligan Q Smith,gsmith,00001234569 01/01/1986 00:00:01 00001234569 01/01/1986 00:00:02

    # Now, let's split up the DATA parts by looking for the space between the :00 and 0000.
    # The 0000 is captured as $2 so that it is put back after the comma.
    $mydata =~ s/(:\d{2})\s(0000)/$1,$2/g;

    print "All done. MyData now looks like this\n$mydata\n\n";

    #Bob Smith,bsmith,00001234567 01/01/1986 00:00:00
    #Mary Ann Doe,mdoe,00001234568 01/01/1986 00:00:01,00001234563 01/01/1986 00:00:02,00001234563 01/01/1986 00:00:03
    #Gilligan Q Smith,gsmith,00001234569 01/01/1986 00:00:01,00001234569 01/01/1986 00:00:02

I hope this helps. Let us know. --Mark
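On the open question in the comments (whether the login data should be split up too), here is a minimal sketch of one way it might be done, parsing each line into a structure first and building the CSV from that. The helper names `parse_record` and `to_csv_line` are hypothetical and not part of the program above. It assumes one record per line, that every login id starts with 0000, and that dates and times look like the sample data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): parse one line into a
# hash holding the name, the username, and a list of login entries.
# Assumes one record per line and that every login id starts with "0000".
sub parse_record {
    my ($line) = @_;

    # Non-greedy name, then a one-word username, then the login data.
    my ($name, $user, $rest) = $line =~ /^\s*(.+?)\s+(\w+)\s+(0000.*)$/
        or return;

    my @logins;
    # Each login entry is expected to be "id date time".
    while ($rest =~ m!(0000\d+)\s+(\d{2}/\d{2}/\d{4})\s+(\d{2}:\d{2}:\d{2})!g) {
        push @logins, { id => $1, date => $2, time => $3 };
    }

    return { name => $name, username => $user, logins => \@logins };
}

# Hypothetical helper: flatten a parsed record into one comma-delimited
# line, with every login field in its own column.
sub to_csv_line {
    my ($rec) = @_;
    return join ',', $rec->{name}, $rec->{username},
        map { @{$_}{qw(id date time)} } @{ $rec->{logins} };
}

my $line = 'Mary Ann Doe mdoe 00001234568 01/01/1986 00:00:01 '
         . '00001234563 01/01/1986 00:00:02';
my $rec  = parse_record($line);
print to_csv_line($rec), "\n";
# prints: Mary Ann Doe,mdoe,00001234568,01/01/1986,00:00:01,00001234563,01/01/1986,00:00:02
```

Keeping the parsed hash around, rather than rewriting the string in place, would probably also make the SQL case easier, since each login entry is already a separate row-shaped chunk. Note that this does no CSV quoting, so a name containing a comma would need extra handling.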
(Ovid - Common regex error) RE: A two-liner for Backtracking for substitutions
by Ovid (Cardinal) on Oct 02, 2000 at 23:16 UTC
by markwild (Sexton) on Oct 03, 2000 at 02:44 UTC
by Ovid (Cardinal) on Oct 03, 2000 at 03:10 UTC
RE: A two-liner for Backtracking for substitutions
by markwild (Sexton) on Oct 02, 2000 at 23:05 UTC