Re^4: Modifying a regex

Alright, here is a portion of the data set I am using, followed by the format I would like the output to end up in:

DATA SET

012345 NA13333 C C
012345 NA13334 F F
012345 NA13335 E F
012346 NA13333 U U
012346 NA13334 I I
012346 NA13335 Y O

IDEAL OUTCOME **note the spacing comes out weird, SORRY! There is a site number above every pair of letters.

SITES              012345        012346
NA13333  C        C            U         U  

SITES          012345        012346
NA13334  F         F         I           I  

SITES           012345      012346
NA13335  E        F         Y         O

***** The code I am using again is:

#!/usr/bin/perl

use strict;

my $inFile = 'fanca.txt';

open (IN, $inFile) or die "open $inFile: $!";

my %user;

while (my $line = <IN>) {
     next unless $line =~ m{^(\S+) (\d+) (.*)};
     my ($site, $userID, $data, $data2) = ($1, $2, $3, $4);
    $user{$userID}{$site} = $data, $data2;
}


close(IN) or die "close $inFile: $!";

my $outfile = "parsingoutput_for_fanca.txt";
open(REPORT, ">$outfile") or die "open >$outfile: $!";

foreach my $userID (sort {$a <=> $b} keys %user) {
    my %sites = %{$user{$userID}};

    my $line1 =  'SITES';
    my $line2 = "$userID";

    while (my ($site, $data, $data2) = each %sites) {
        $line1 .= ' ' x (length($line2)-length($line1));
        $line2 .= ' ' x (length($line1)-length($line2));

        #add on next site
        $line1 .= ' '. ' ' . $site;
        $line2 .= ' '. ' '. $data . ' ' . ' '. $data2;
    }

    print REPORT $line1 . "\n";
    print REPORT $line2 . "\n";
    print REPORT "\n";
}

close (REPORT) or die "close $outfile: $!";

Re^5: Modifying a regex
by grep (Monsignor) on Oct 27, 2006 at 21:00 UTC

use strict;
use warnings;

my @lines = (
'012345 NA13333 C C',
'012345 NA13334 F F',
'012345 NA13335 E F',
'012346 NA13333 U U',
'012346 NA13334 I I',
'012346 NA13335 Y O');

foreach my $line (@lines) {
    next unless $line =~ m{^(\S+) NA(\d+) (.*)};
    my ($site, $userID, $data) = ($1, $2, $3);
    print "SITE: $site   USER: $userID   DATA: $data\n";
}

SITE: 012345   USER: 13333   DATA: C C
SITE: 012345   USER: 13334   DATA: F F
SITE: 012345   USER: 13335   DATA: E F
SITE: 012346   USER: 13333   DATA: U U
SITE: 012346   USER: 13334   DATA: I I
SITE: 012346   USER: 13335   DATA: Y O
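Once each row splits cleanly like that, the layout asked for in the parent post could be built along these lines. This is only a sketch: the 10-character column width, the `$report` variable, and the choice to sort the sites are my assumptions, not something taken from the original code.

```perl
use strict;
use warnings;

my @lines = (
    '012345 NA13333 C C',
    '012345 NA13334 F F',
    '012345 NA13335 E F',
    '012346 NA13333 U U',
    '012346 NA13334 I I',
    '012346 NA13335 Y O',
);

# Collect the pair of letters per user and site.
my %user;
foreach my $line (@lines) {
    next unless $line =~ m{^(\S+) NA(\d+) (\S+) (\S+)};
    my ($site, $userID, $data, $data2) = ($1, $2, $3, $4);
    $user{$userID}{$site} = "$data $data2";
}

# Build one header line and one data line per user, padding every
# column to the same (assumed) width so the sites line up.
my $report = '';
foreach my $userID (sort { $a <=> $b } keys %user) {
    my @sites = sort keys %{ $user{$userID} };
    $report .= sprintf('%-10s', 'SITES');
    $report .= join('', map { sprintf('%-10s', $_) } @sites) . "\n";
    $report .= sprintf('%-10s', "NA$userID");
    $report .= join('', map { sprintf('%-10s', $user{$userID}{$_}) } @sites) . "\n\n";
}

print $report;
```

Padding with sprintf avoids the length arithmetic on `$line1` and `$line2`, and sorting the sites keeps the columns in a predictable order, which `each` does not guarantee.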

Some problems you have not addressed from my original post:

  next unless $line =~ m{^(\S+) (\d+) (.*)};
  my ($site, $userID, $data, $data2) = ($1, $2, $3, $4);
  # you have 3 capturing parens but you try to use $4
  # your 2 data columns get folded together in $3 because of your greedy .*

  $user{$userID}{$site} = $data, $data2;
  # $data2 is never stored: this parses as ($user{$userID}{$site} = $data), $data2;
  # I think you are trying to store an array ref, but that is not what
  # this does. Square brackets [ ] create an array ref.
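A minimal sketch of how those two points could be fixed, assuming the columns are always separated by whitespace and each data column is a single token (neither is confirmed beyond the six sample rows):

```perl
use strict;
use warnings;

my @lines = (
    '012345 NA13333 C C',
    '012346 NA13333 U U',
);

my %user;
foreach my $line (@lines) {
    # Four capture groups, one per column, so $4 is actually set.
    next unless $line =~ m{^(\S+)\s+NA(\d+)\s+(\S+)\s+(\S+)};
    my ($site, $userID, $data, $data2) = ($1, $2, $3, $4);

    # [ ] builds an anonymous array and returns a reference to it,
    # so both letters are kept under the same user and site.
    $user{$userID}{$site} = [ $data, $data2 ];
}

# Dereference the stored array ref to get both letters back.
print join(' ', @{ $user{'13333'}{'012345'} }), "\n";
```

Note that the site has to be quoted as a string when used as a literal key; a bare 012345 would be read as an octal number.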

Data::Dumper will show you what your data structure actually contains:

use Data::Dumper;
print Dumper \%user;
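As a rough illustration of why dumping the hash helps, the sketch below stores one entry with the comma form from the parent post and one with an anonymous array, then dumps both. The values are hand-picked from the sample data, and `$Data::Dumper::Sortkeys` is set only to make the key order stable.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;    # print hash keys in sorted order

my ($data, $data2) = ('E', 'F');
my %user;

# Assignment binds tighter than the comma, so only $data is stored.
# With warnings enabled, perl also complains about this line.
$user{13335}{'012345'} = $data, $data2;

# An anonymous array keeps both values.
$user{13335}{'012346'} = [ 'Y', 'O' ];

print Dumper \%user;
```

In the dump, the first site shows up as a plain string and the second as a two-element array, which makes the lost second column easy to spot.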

grep

One dead unjugged rabbit fish later
