cosmicv has asked for the wisdom of the Perl Monks concerning the following question:

Hi all, I've got one. Can I write a regexp to do this substitution? I want to take

Bob Q Smith bsmith 00001234567 5/1/00 12:00:00

and break it out into comma-delimited fields like

Bob Q Smith,bsmith,00001234567,5/1/00 12:00:00

Thanks for any assistance!

Replies are listed 'Best First'.
(Ovid) Re: Transpose some spaces to commas in a string
by Ovid (Cardinal) on Sep 29, 2000 at 20:43 UTC
    Yes, a regex could do that, but you probably don't want to use a regex for this. Text::CSV is a better solution overall. Your main obstacle, however, is figuring out how to identify the person's name, so you don't break the fields up incorrectly.

    Cheers,
    Ovid

    Join the Perlmonks Setiathome Group or just go to the link and check out our stats.

      That's a problem: some user names are Bob Q Smith, Jane Doe, Ralph Dilbet Johnson. The only way I see to do it is to detect the four "0000" in that serial number, then jump back past the previous word and replace that space with a comma. I have no clue how to do that. Any tips?
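
The idea described above (find the serial number by its leading "0000", then step back over the one word before it) can be sketched roughly as follows. This is only a sketch under assumptions: `commify_fields` is a hypothetical helper name, the serial is assumed to be all digits starting with "0000", the username is assumed to be a single word, and the usernames and serials for Jane and Ralph are made-up sample data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: anchor on the serial number (assumed to start with
# "0000" and to contain only digits) and on the single word just before it.
sub commify_fields {
    my ($line) = @_;
    # Replace the whitespace before the username, between username and
    # serial, and after the serial with commas.
    $line =~ s/\s+(\S+)\s+(0000\d+)\s+/,$1,$2,/
        or return;    # no serial number found: let the caller decide
    return $line;
}

for my $line (
    'Bob Q Smith bsmith 00001234567 5/1/00 12:00:00',
    'Jane Doe jdoe 00007654321 5/2/00 08:30:00',                   # made-up sample
    'Ralph Dilbet Johnson rjohnson 00001112223 5/3/00 17:45:10',   # made-up sample
) {
    my $csv = commify_fields($line);
    print defined $csv ? "$csv\n" : "no match: $line\n";
}
```

Because the name itself is never matched, it can be any number of words; the date and time are left untouched after the last comma.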
        Offhand, I'd say try the FAQ about putting commas into a number. If the variable mess is at the start of your string, reverse it.
        Bob Q Smith bsmith 00001234567 5/1/00 12:00:00
        becomes
        00:00:21 00/1/5 76543210000 htimsb htimS Q boB

        And a fairly simple regex can insert your commas. Re-reverse your result, and there you go.

        Update: here's my attempt that appears to do what you want:

        #!/usr/bin/perl -w
        use strict;
        my $test="Bob Q Smith bsmith 00001234567 5/1/00 12:00:00";
        $test = reverse $test;
        #documenting the simple but nasty looking regex is left as an exercise for the reader
        #("$1 $2" keeps the time and date together as one field, as in the requested output)
        $test=~s/([:\d]+)\s([\d\/]+)\s([\d]+)\s([\w]+)\s(.*)/$1 $2,$3,$4,$5/
            or die "Pattern didn't match: $test";
        $test = reverse $test;
        print $test;
        Untested:
        my $testvar = "Bob Q Smith bsmith 00001234567 5/1/00 12:00:00";
        $testvar =~ s/
                       \s      # a whitespace character (space, tab, etc)
                       (       # start capture group 1
                         \w+   # one or more word characters (e.g. bsmith)
                       )
                       \s      # a whitespace character
                       (       # start capture group 2
                         \d+   # one or more digits
                       )
                       \s      # a whitespace character
                     /,$1,$2,/x;
        If your data is relatively clean, I think that should work. It relies on you knowing that the first digits you encounter are going to be your serial number and immediately preceding them are some letters. From there, it's pretty straightforward. There are many different ways to write this regex and this may not be the best, but it's fairly clear.

        Cheers,
        Ovid

        Join the Perlmonks Setiathome Group or just go to the link and check out our stats.

        Then the first thing you need to do is write something to clean up the data. Maybe throw the cleaned data into an array, run the regex on that and then output those results to a file...

        Maybe if you could change the input so it comes as something like:

        Bob Q Smith 00001234567 bsmith 5/1/00 12:00:00

        It would make it much easier then to use regexes to catch the spaces that fall between number and letter fields. I think this goes under the "it would be easier to control the input if you can" department.
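
As a rough illustration of that suggestion, and assuming the input really could be reordered so the serial number sits between the name and the username, a single substitution keyed on the first all-digit word would be enough. The variable names here are just for the sketch, and note that the output keeps the reordered field order rather than the one originally asked for.

```perl
use strict;
use warnings;

# Assumed reordered input: name, serial number, username, date, time.
my $line = 'Bob Q Smith 00001234567 bsmith 5/1/00 12:00:00';

# The first all-digit word is taken to be the serial number, and the
# word right after it the username; the spaces around them become commas.
(my $csv = $line) =~ s/\s(\d+)\s(\w+)\s/,$1,$2,/;

print "$csv\n";    # prints: Bob Q Smith,00001234567,bsmith,5/1/00 12:00:00
```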

        -- I'm a solipsist, and so is everyone else. (think about it)

(tye)Re: Transpose some spaces to commas in a string
by tye (Sage) on Sep 29, 2000 at 21:53 UTC

    I think swiftone has the right idea. I'd do it like this:

    my $line= "Bob Q Smith bsmith 00001234567 5/1/00 12:00:00\n";
    my( $time, $date, $num, $user, @names )= reverse split ' ', $line;
    my $name= join " ", reverse @names;
    print "$name,$user,$num,$date $time\n";
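
For names of varying length, the same count-from-the-right idea could be wrapped up along these lines. This is a sketch, not the code above: `fields_from_line` is a hypothetical name, it uses `splice` in place of the double `reverse`, and the Jane and Ralph lines are made-up sample data.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the split idea: the last four
# whitespace-separated words are username, number, date and time;
# whatever is left over is the name, however many words it has.
sub fields_from_line {
    my ($line) = @_;
    my @words = split ' ', $line;       # also drops the trailing newline
    return if @words < 5;               # need at least a one-word name
    my ($user, $num, $date, $time) = splice @words, -4;
    return join ',', "@words", $user, $num, "$date $time";
}

for my $line (
    "Bob Q Smith bsmith 00001234567 5/1/00 12:00:00\n",
    "Jane Doe jdoe 00007654321 5/2/00 08:30:00\n",                   # made-up sample
    "Ralph Dilbet Johnson rjohnson 00001112223 5/3/00 17:45:10\n",   # made-up sample
) {
    my $csv = fields_from_line($line);
    print defined $csv ? "$csv\n" : "too few fields: $line";
}
```

Counting fields from the right avoids any guess about what a name looks like; it only assumes the four trailing fields never contain whitespace.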

            - tye (but my friends call me "Tye")
Re: Transpose some spaces to commas in a string
by jptxs (Curate) on Sep 29, 2000 at 20:51 UTC

    You need to give a little more info. Is every field guaranteed to be like the example you give? i.e. First MiddleI Last email-alias some_number date_in_given_format?

    If so, then you can do it easily (I'm assuming it's a file full of these):

    while ( <> ) {
        # name (First M Last), alias, 11-digit number, date and time
        s/^(\w+ \w \w+) (\w+) (\d{11}) (\d+\/\d+\/\d+ \d{2}:\d{2}:\d{2})$/$1,$2,$3,$4/;
        print;
    }

    this is untested, but should be rather close...

    for more on this stuff try:

    -- I'm a solipsist, and so is everyone else. (think about it)

Re: Transpose some spaces to commas in a string
by arturo (Vicar) on Sep 29, 2000 at 20:45 UTC

    It depends on how much variance you can expect in the input fields; if you can be *certain* that you have input of the form

    First MI Last userID IDNUMBER MM/DD/YY HH:MM:SS

    then it's not that hard.

    WARNING: The following is *very* quick n' dirty, and should only be used on highly sanitary input =)

    $idstring =~ s/(\w+ \w \w+) (\S+) (\d+) (\S+ \S+)/$1,$2,$3,$4/;
    Will do it.

    Philosophy can be made out of anything -- or less