in reply to delimited files

Making a few assumptions about the nature of your "delimited" files:

  • The delimiter char should appear in every line.
  • The delimiter char should appear the same number of times in every line.

    It should be possible to analyse the file to determine the delimiter. Something like this may work (untested):

    #! perl -slw
    use strict;

    my %charsByLine;        ## char => list of line numbers containing it
    my %freqCharsPerLine;   ## line number => char => count on that line
    my %chars;              ## surviving candidate chars

    while( <> ) {
        chomp;
        for( split '', $_ ) {
            $chars{ $_ } = 1;
            ## Record each line only once per char, however often the char
            ## repeats on it, so the "every line" test below counts lines
            push @{ $charsByLine{ $_ } }, $.
                unless $freqCharsPerLine{ $. }{ $_ }++;
        }
    }
    my $last = $.;

    ## Eliminate chars that do not appear in every line
    @{ $charsByLine{ $_ } } != $last and delete $chars{ $_ } for keys %chars;

    ## Eliminate chars where they appear a different number of times per line
    for my $char ( keys %chars ) {
        my $previousCount = $freqCharsPerLine{ 1 }{ $char };
        for my $line ( 2 .. $last ) {
            if( $freqCharsPerLine{ $line }{ $char } != $previousCount ) {
                delete $chars{ $char };
                last;
            }
        }
    }

    if( keys %chars == 1 ) {
        print "The delimiter for this file is: ", keys %chars;
    }
    elsif( keys %chars ) {
        print "Candidate delimiters for this file are: ", join ' ', keys %chars;
    }
    else {
        print "Unable to determine a likely candidate for this file!";
    }

    Of course, if the files have header lines, or contain quoted items that can contain the delimiter char, then the above assumptions would need to be modified to account for that. But if there is any consistency in the format of the files, it should be possible to derive a heuristic that would detect the right character in most cases, and flag any anomalies for manual inspection/determination.
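    As a rough illustration of how those two cases might be handled, here is a minimal sketch. It assumes the number of header lines is known in advance, and that quoted items use double quotes with "" as an escaped quote. The names candidate_delimiters and skip_header are made up for this example, and it is a starting point rather than a finished detector.

```perl
use strict;
use warnings;

## Hypothetical helper: returns the chars that occur the same (non-zero)
## number of times on every data line.
##   skip_header => number of leading lines to ignore (assumed known)
## Double-quoted items are removed before counting, assuming "" is the
## escape for a quote inside a quoted item.
sub candidate_delimiters {
    my( $lines, %opt ) = @_;
    chomp( my @data = @$lines );
    splice @data, 0, $opt{ skip_header } || 0;

    my %count;          ## char => count seen on the first data line
    my $first = 1;
    for my $line ( @data ) {
        ( my $bare = $line ) =~ s/"(?:[^"]|"")*"//g;   ## drop quoted items
        my %seen;
        $seen{ $_ }++ for split //, $bare;
        if( $first ) {
            %count = %seen;
            $first = 0;
            next;
        }
        ## Keep only chars whose count matches the first data line
        for my $char ( keys %count ) {
            delete $count{ $char }
                unless defined $seen{ $char }
                   and $seen{ $char } == $count{ $char };
        }
    }
    return sort keys %count;
}

## Made-up sample data: one header line, commas inside quoted items
my @lines = (
    'Report generated by some tool',
    'alice,"Smith, Jr.",42',
    'bob,"Jones",7',
    'carol,"O""Neil, Sr.",19',
);
my @found = candidate_delimiters( \@lines, skip_header => 1 );
print "Candidates: @found\n";   ## prints "Candidates: ,"
```

    Stripping the quoted items first means a delimiter char inside quotes no longer upsets the per-line counts. Files that use a different quoting style, or an unknown number of header lines, would need their own variation and probably a manual look.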

