in reply to Recommendations for breaking up string?

A dreadful program that works ... for those of you who think I can’t write (bad ...) Perl code:

use strict;
use warnings;

my $str = "QB Carson Palmer RB Chris Ivory RB Eddie Lacy WR A.J. Green WR John Brown WR Davante Adams TE Martellus Bennett FLEX Jeremy Hill DST Panthers";
my $key = undef;
my $result = {};

for my $wd (split(/\b(QB|RB|WR|TE|FLEX|DST)\b/,$str)) {
    # As writ, "split()" will return the key followed by the string
    # for that key. We abuse "defined($key)" to separate the two.
    # Probably awful way to do it but it works.

    # '$result' has a hash-bucket for each key and an array-of-strings
    # for the content of the bucket. Perl's "auto-vivification"
    # makes this easy, if cryptic, to write. (If a bucket does not yet
    # exist, it magically appears.)

    # The following statement is a further smelly hack.
    # (The first string returned is empty.)
    next if $wd eq "";

    print "wd $wd\n";
    if (defined($key)) {
        push @{$result->{$key}}, $wd;
        $key = undef;
    }
    else {
        $key = $wd;
    }
}

foreach my $k (keys %$result) {
    print "$k is: " . join(", ", @{$result->{$k}}) . "\n";
}

It does not update a database, of course, and you will also notice that it stacks any number of (say) RB entries into a list, instead of distinguishing between RB1 and RB2 (or assuming there are only two). Probably its only saving grace, if any, is the use of split with a capturing regular expression: because the position alternation sits inside parentheses, split returns the matched position names along with the text between them.
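If numbered slots (RB1, RB2, ...) are wanted instead of stacked lists, one possible approach is sketched below. It is not part of the program above: parse_lineup and number_slots are hypothetical helper names, and the sketch assumes that a position is numbered only when it occurs more than once and that surrounding whitespace should be trimmed from each name.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): split a lineup string
# into [position, player] pairs, keeping the order of appearance.
sub parse_lineup {
    my ($str) = @_;
    # The capturing group makes split() return the positions as well.
    my @fields = split /\b(QB|RB|WR|TE|FLEX|DST)\b/, $str;
    shift @fields;    # drop whatever precedes the first position (usually "")
    my @pairs;
    while (@fields) {
        my ($pos, $name) = splice @fields, 0, 2;
        $name = '' unless defined $name;    # position with nothing after it
        $name =~ s/^\s+|\s+$//g;            # trim surrounding whitespace
        push @pairs, [ $pos, $name ];
    }
    return @pairs;
}

# Hypothetical helper: label repeated positions RB1, RB2, ... and leave
# positions that occur once (QB, TE, ...) unnumbered. This is an assumption
# about the desired output, not something the original post specifies.
sub number_slots {
    my (@pairs) = @_;
    my (%total, %seen, %slot);
    $total{ $_->[0] }++ for @pairs;
    for my $pair (@pairs) {
        my ($pos, $name) = @$pair;
        my $n     = ++$seen{$pos};
        my $label = $total{$pos} > 1 ? "$pos$n" : $pos;
        $slot{$label} = $name;
    }
    return \%slot;
}

my $slots = number_slots(parse_lineup(
    'QB Carson Palmer RB Chris Ivory RB Eddie Lacy TE Martellus Bennett'));
print "$_ => $slots->{$_}\n" for sort keys %$slots;
```

Keeping the pairs in an ordered list until the last step is what makes the numbering follow the order of the input string; a hash of arrays alone would lose the ordering between different positions.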

Re^2: Recommendations for breaking up string?
by AnomalousMonk (Archbishop) on Sep 16, 2015 at 20:53 UTC

    An extractive variation. This approach assumes that a position ('QB', 'RB', etc.) cannot be confused with the second or any subsequent field of a player's name.

    c:\@Work\Perl>perl -wMstrict -MData::Dump -le
    "my $str = 'QB Carson Palmer RB Chris Ivory RB Eddie Lacy WR A. J. Green ' .
               'WR John Brown WR WR Grace WR Pele TE Billy Bob Bennett ' .
               'FLEX A.P. Hill DST Panthers';
     print qq{[[$str]]};
     ;;
     my $result = {};
     ;;
     my $position = qr{ \b (?: QB | RB | WR | TE | FLEX | DST) \b }xms;
     my $player   = qr{ \S+ (?: \s+ \S+)*? }xms;
     ;;
     while ($str =~ m{ ($position) \s+ ($player) (?= \s+ $position | \z) }xmsg) {
       ;;
       my ($posn, $name) = ($1, $2);
       ;;
       push @{$result->{$posn}}, $name;
       ;;
       }
     ;;
     dd $result;
    "
    [[QB Carson Palmer RB Chris Ivory RB Eddie Lacy WR A. J. Green WR John Brown WR WR Grace WR Pele TE Billy Bob Bennett FLEX A.P. Hill DST Panthers]]
    {
      DST => ["Panthers"],
      FLEX => ["A.P. Hill"],
      QB => ["Carson Palmer"],
      RB => ["Chris Ivory", "Eddie Lacy"],
      TE => ["Billy Bob Bennett"],
      WR => ["A. J. Green", "John Brown", "WR Grace", "Pele"],
    }

    The $player regex could be refined to make it more discriminative of human names, but that's always tricky.
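    To make the stated assumption concrete, here is the same pair of regexes wrapped in a script-file sub, as a sketch only. The name extract_lineup and the input 'QB Tom TE Brady' are made up for illustration. A position token in the first field of a name survives (as 'WR Grace' does above), but one in a later field ends the name early.

```perl
use strict;
use warnings;

# Same patterns as in the one-liner above.
my $position = qr{ \b (?: QB | RB | WR | TE | FLEX | DST) \b }xms;
my $player   = qr{ \S+ (?: \s+ \S+)*? }xms;

# Hypothetical wrapper (the name is an assumption, not from the post):
# returns a hashref mapping each position to an arrayref of player names.
sub extract_lineup {
    my ($str) = @_;
    my %result;
    # The lazy $player grows one field at a time until the lookahead sees
    # either another position or the end of the string.
    while ($str =~ m{ ($position) \s+ ($player) (?= \s+ $position | \z) }xmsg) {
        push @{ $result{$1} }, $2;
    }
    return \%result;
}

# A position token as the FIRST field of a name is kept in the name ...
my $ok = extract_lineup('WR WR Grace WR Pele');
print join(', ', @{ $ok->{WR} }), "\n";

# ... but as a LATER field it is taken for the start of the next entry
# (made-up input, purely to show the limitation).
my $bad = extract_lineup('QB Tom TE Brady');
print "$_: @{ $bad->{$_} }\n" for sort keys %$bad;
```

    In the second call 'Tom' and 'Brady' end up under QB and TE respectively, which is exactly the confusion the approach assumes cannot happen.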


    Give a man a fish:  <%-{-{-{-<