identifying null fields in bar delimited records

jjohhn has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: identifying null fields in bar delimited records by holli (Abbot) on May 31, 2005 at 11:21 UTC
Something along the lines of this?
`my $line = ";field;;field;\t;field; ;field;field";
my @line = split ";", $line;
my @null = grep { defined $_ } map { $line[$_] =~ /^$/ ? $_ : undef } (0..$#line);
print "line contains ", scalar @null, " null fields at offset(s): @null";
#line contains 2 null fields at offset(s): 0 2`
Update: To fit in the OP's specified requirements:
`use strict;
while ( <DATA> ) {
    my @line = split /\|/, $_;
    my @null = grep { defined $_ } map { $line[$_] =~ /^$/ ? $_ : undef } (0..$#line);
    print "line contains ", scalar @null, " null fields at offset(s): @null\n";
}
#line contains 3 null fields at offset(s): 1 3 4
#line contains 2 null fields at offset(s): 2 4
#line contains 3 null fields at offset(s): 0 1 4
#line contains 3 null fields at offset(s): 1 2 4
__DATA__
first||third||
alpha|beta||delta|
||c|d|
one|||four|`
holli, /regexed monk/
Re^2: identifying null fields in bar delimited records by wfsp (Abbot) on May 31, 2005 at 12:07 UTC
holli++ Here's a humble for loop:
`#!/usr/bin/perl
use strict;
use warnings;
my $line = "a||c|\t| |d||e";
my @fields = split(/\|/, $line);
my @null;
for my $i (0..$#fields){
    push @null, $i unless $fields[$i];
}
print "null fields: ", scalar @null, "\n";
print "at field: ";
print ++$_, ", " for @null;
# output
# null fields: 2
# at field: 2, 7,`
Re^3: identifying null fields in bar delimited records by lupey (Monk) on May 31, 2005 at 12:38 UTC
holli and wfsp, great answers. Now, I'd like to make this snippet of code more general by turning the delimiter into a variable. But doing so doesn't work (I'm struggling to understand regular expressions).
`my $delim = '|';
my $line = "a||c|\t| |d||e";
my @fields = split(/\$delim/, $line);
# output
# null fields: 0
# at field:`
Re^4: identifying null fields in bar delimited records by wfsp (Abbot) on May 31, 2005 at 12:50 UTC
Re^4: identifying null fields in bar delimited records by holli (Abbot) on May 31, 2005 at 12:53 UTC
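A minimal sketch of one way to put the delimiter in a variable (not necessarily what was suggested in this thread). In `/\$delim/` the backslash escapes the `$`, so the pattern matches the literal text `$delim` and nothing is split. Interpolating the variable inside `\Q...\E` quotes any regex metacharacters, so `'|'` is treated as a literal bar. The helper name `split_fields` is made up for illustration, and the `-1` limit is an added choice so that trailing empty fields are kept.

```perl
use strict;
use warnings;

# split_fields() is a hypothetical helper, not code from the thread.
# It splits $line on a literal delimiter string held in $delim.
sub split_fields {
    my ($line, $delim) = @_;
    # \Q...\E quotes regex metacharacters, so '|' means a literal bar.
    # The -1 limit keeps trailing empty fields, which split drops by default.
    return split /\Q$delim\E/, $line, -1;
}

my $delim  = '|';
my $line   = "a||c|\t| |d||e";
my @fields = split_fields($line, $delim);

# Offsets of fields that are exactly the empty string.
my @null = grep { $fields[$_] eq '' } 0 .. $#fields;

print "null fields: ", scalar @null, "\n";
print "at offset(s): @null\n";
```

With the sample line this reports the fields at offsets 1 and 6. Testing with `eq ''` rather than truthiness means a field containing `0` is not counted as null.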
Re^3: identifying null fields in bar delimited records by jjohhn (Scribe) on May 31, 2005 at 14:04 UTC
I posted as anon by mistake, which means I can't edit my post to remove the superfluous line containing the variable "line".
Re^2: identifying null fields in bar delimited records by Anonymous Monk on May 31, 2005 at 14:01 UTC
I ended up doing this in awk; it could probably be translated to Perl by someone smarter than me. For some reason the filename is printed AFTER the results instead of before, as I would have expected, but otherwise this seems to work.
`BEGIN{FS="|";}
{
    line= substr($0,0,length($0)-1); #peel trailing bar
    n=split($0,a);
    # split() numbers fields from 1; a[n] is the empty field after the trailing bar
    for (i=1;i<n;i++){
        line=line a[i];
        if(a[i] ==""){nulls[i]++};
    }
}
END{
    print FILENAME;
    for (i in nulls)
        print "\t field " i": " nulls[i] " nulls";
}`
Re: identifying null fields in bar delimited records by tchatzi (Acolyte) on May 31, 2005 at 11:05 UTC
Would you mind showing us a sample of your files? ``The wise man doesn't give the right answers, he poses the right questions.'' TIMTOWTDI
Re^2: identifying null fields in bar delimited records by jjohhn (Scribe) on May 31, 2005 at 11:41 UTC
They look like this:
first||third||
alpha|beta||delta|
||c|d|
one|||four|
I'd like to print something like:
field1->1
field2->2
field3->2
field4->1
Re^3: identifying null fields in bar delimited records by wfsp (Abbot) on May 31, 2005 at 14:33 UTC
Build a hash:
`#!/usr/bin/perl
use strict;
use warnings;
my %null;
while (<DATA>){
    my @fields = split(/\|/);
    for my $i (0..$#fields){
        $null{"field$i"}++ unless $fields[$i];
    }
}
for my $key (sort keys %null){
    print "$key->$null{$key}\n";
}
__DATA__
first||third||
alpha|beta||delta|
||c|d|
one|||four|`
output:
`field0->1
field1->3
field2->2
field3->1`
The field numbers start at zero, but the second field also has 3 compared to 2 in your desired output. The third line of your data starts with two bars; isn't that 2 null fields?
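A variation on the same idea, offered as a sketch under stated assumptions rather than a definitive answer: it numbers fields from 1 as in the desired output, chomps each record, and assumes the trailing bar is a record terminator rather than the start of a fifth field. The `-1` limit stops `split` from silently dropping trailing empty fields. The name `count_nulls` is hypothetical.

```perl
use strict;
use warnings;

# count_nulls() is a hypothetical helper: given a list of records, it returns
# a hash mapping 1-based field number to the number of records in which that
# field is the empty string.
sub count_nulls {
    my @records = @_;
    my %null;
    for my $record (@records) {
        chomp(my $line = $record);
        $line =~ s/\|\z//;                   # assumption: trailing bar ends the record
        my @fields = split /\|/, $line, -1;  # -1 keeps trailing empty fields
        for my $i (0 .. $#fields) {
            $null{ $i + 1 }++ if $fields[$i] eq '';
        }
    }
    return %null;
}

my @sample = (
    "first||third||\n",
    "alpha|beta||delta|\n",
    "||c|d|\n",
    "one|||four|\n",
);
my %null = count_nulls(@sample);
printf "field%d->%d\n", $_, $null{$_} for sort { $a <=> $b } keys %null;
```

On the sample data this prints field1->1, field2->3, field3->2 and field4->1, which agrees with the hash version above apart from the numbering.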
Re^4: identifying null fields in bar delimited records by graff (Chancellor) on Jun 01, 2005 at 04:39 UTC
Re^3: identifying null fields in bar delimited records by holli (Abbot) on May 31, 2005 at 12:00 UTC
How does "first||third||" fit together with "field1->1"? There are 3 null fields in that line. holli, /regexed monk/
Re: identifying null fields in bar delimited records by ghenry (Vicar) on May 31, 2005 at 11:04 UTC
Update: Misinterpreted the OP's last comments. ~~I think you should consider looking for tabs and whitespace on each line too.~~ Maybe I'm wrong, but if you could provide some sample data, then I'm sure we can help. Walking the road to enlightenment... I found a penguin and a camel on the way..... Fancy a yourname@perl.me.uk? Just ask!!!