in reply to identifying null fields in bar delimited records

Something along the lines of
my $line = ";field;;field;\t;field; ;field;field";
my @line = split ";", $line;
my @null = grep { defined $_ }
           map  { $line[$_] =~ /^$/ ? $_ : undef } (0..$#line);
print "line contains ", scalar @null, " null fields at offset(s): @null";
#line contains 2 null fields at offset(s): 0 2

?

Update: To fit in the OP's specified requirements:

use strict;

while ( <DATA> ) {
    my @line = split /\|/, $_;
    my @null = grep { defined $_ }
               map  { $line[$_] =~ /^$/ ? $_ : undef } (0..$#line);
    print "line contains ", scalar @null, " null fields at offset(s): @null\n";
}

#line contains 3 null fields at offset(s): 1 3 4
#line contains 2 null fields at offset(s): 2 4
#line contains 3 null fields at offset(s): 0 1 4
#line contains 3 null fields at offset(s): 1 2 4

__DATA__
first||third||
alpha|beta||delta|
||c|d|
one|||four|
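One caveat with the loop above: the trailing newline is itself reported as a null field, because /^$/ also matches a lone "\n", and without a negative limit split drops trailing empty fields. A minimal sketch of one way around that, assuming the record should be chomped first and that trailing empty fields should count as nulls; the helper name null_offsets is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: return the offsets of the empty fields in one record.
sub null_offsets {
    my ($record) = @_;
    chomp $record;                          # drop the newline so it is not a field
    my @fields = split /\|/, $record, -1;   # -1 keeps trailing empty fields
    return grep { $fields[$_] eq '' } 0 .. $#fields;
}

my @offsets = null_offsets("first||third||\n");
print "null fields at offset(s): @offsets\n";   # null fields at offset(s): 1 3 4
```

Here offset 4 is the genuinely empty field after the last bar, not the newline.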


holli, /regexed monk/

Re^2: identifying null fields in bar delimited records
by wfsp (Abbot) on May 31, 2005 at 12:07 UTC
    holli++

    Here's a humble for loop:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $line = "a||c|\t| |d||e";
    my @fields = split(/\|/, $line);
    my @null;
    for my $i (0..$#fields){
        # note: this truth test also treats a field holding "0" as null
        push @null, $i unless $fields[$i];
    }
    print "null fields: ", scalar @null, "\n";
    print "at field: ";
    print ++$_, ", " for @null;

    # output
    # null fields: 2
    # at field: 2, 7,
      holli and wfsp, great answers. Now, I'd like to make this snippet of code more general by turning the delimiter into a variable. But doing so doesn't work (I'm struggling to understand regular expressions).
      my $delim = '|';
      my $line = "a||c|\t| |d||e";
      my @fields = split(/\\$delim/, $line);

      # output
      # null fields: 0
      # at field:
        my $delim = qr/\|/;
        my $line = "a||c|\t| |d||e";
        my @fields = split($delim, $line);

        The qr operator quotes and compiles its STRING as a regular expression.
        split splits on a pattern (regex).
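As for why the earlier attempt fails: /\\$delim/ interpolates to the pattern \\|, which means "a literal backslash, or the empty string", so split breaks the line into single characters and no field is ever empty. If the delimiter arrives as a plain string rather than a compiled pattern, \Q...\E (or quotemeta) escapes any metacharacters before split sees them. A small sketch, assuming the delimiter is meant literally; split_on is a hypothetical helper name:

```perl
use strict;
use warnings;

# Hypothetical helper: split $line on a literal delimiter string.
sub split_on {
    my ($delim, $line) = @_;
    # \Q...\E escapes regex metacharacters such as | so they match literally.
    return split /\Q$delim\E/, $line;
}

my @fields = split_on('|', "a||c|\t| |d||e");
print scalar(@fields), " fields\n";   # prints "8 fields"
```

The same call works unchanged for ';' or '.', since the escaping happens inside the helper rather than in the caller's delimiter string.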

        update:

        Added explanation.

        I would put the escaping backslash into $delim, so you are truly independent. Besides that your code works fine and splits the string as intended.
        my ($delim, $line, @fields);

        $delim = '\|';
        $line = "a|c||b";
        @fields = split(/$delim/, $line);
        print join ("*", @fields), "\n";

        $delim = ';';
        $line = "a;c;;b";
        @fields = split(/$delim/, $line);
        print join ("*", @fields), "\n";

        #a*c**b
        #a*c**b
        What else did you expect?


        holli, /regexed monk/
      I posted as anon by mistake, which means I can't edit my post to remove the superfluous line containing the variable "line".
Re^2: identifying null fields in bar delimited records
by Anonymous Monk on May 31, 2005 at 14:01 UTC
    I ended up doing this in awk; it could probably be translated to perl by someone smarter than me. For some reason the filename is printed AFTER the results instead of before as I would have expected, but otherwise this seems to work.
    BEGIN { FS = "|" }
    {
        line = substr($0, 1, length($0) - 1);   # peel trailing bar (substr is 1-based)
        n = split(line, a);                     # split() fills a[1]..a[n]
        for (i = 1; i <= n; i++) {
            line = line a[i];
            if (a[i] == "") { nulls[i]++ }
        }
    }
    END {
        print FILENAME;
        for (i in nulls)
            print "\t field " i ": " nulls[i] " nulls";
    }
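For comparison, a rough Perl rendering of the same idea: count, per field position, how many records have an empty field there. This is only a sketch under assumptions: each record ends with one trailing bar, fields are numbered from 1 as in awk, and count_nulls is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: tally empty fields per 1-based field position.
sub count_nulls {
    my @records = @_;
    my %nulls;
    for my $record (@records) {
        chomp(my $line = $record);
        $line =~ s/\|\z//;                    # peel the trailing bar
        my @fields = split /\|/, $line, -1;   # -1 keeps trailing empty fields
        for my $i (0 .. $#fields) {
            $nulls{ $i + 1 }++ if $fields[$i] eq '';
        }
    }
    return %nulls;
}

my %nulls = count_nulls("first||third||\n", "alpha|beta||delta|\n");
print "\t field $_: $nulls{$_} nulls\n" for sort { $a <=> $b } keys %nulls;
```

Sorting the keys numerically gives a stable report order, which a plain hash walk (like awk's for-in loop) does not guarantee.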