http://qs1969.pair.com?node_id=686546


in reply to how to identify a fixed width file

Why don't you try running through the file counting up the number of times a line of a given length is seen...
my %line_count_by_length;
while (<DATA>) {
    chomp;    # don't count the line terminator towards the length
    my $line_length = length($_);
    $line_count_by_length{$line_length}++;
}
If one line length (or a small handful of them) accounts for a big percentage of the total line count, you could make a guess that the file is fixed width. Perhaps also weight the guess by how many different line lengths are represented in the file, compared to how many you might expect given the file's length?
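As a rough sketch of that guess (not a tested detector): the helper names line_length_profile and looks_fixed_width, and the 95% cutoff, are my own assumptions for illustration, so tune them to your data. It reports what share of the lines have the most common length and how many distinct lengths turned up.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (share of lines that have the most common
# length, number of distinct line lengths seen) for an open filehandle.
sub line_length_profile {
    my ($fh) = @_;
    my %count_by_length;
    my $total = 0;
    while ( my $line = <$fh> ) {
        $line =~ s/\r?\n\z//;    # ignore LF or CRLF line terminators
        $count_by_length{ length $line }++;
        $total++;
    }
    return ( 0, 0 ) unless $total;    # empty file: nothing to judge

    # Largest count first
    my ($top) = sort { $b <=> $a } values %count_by_length;
    return ( $top / $total, scalar keys %count_by_length );
}

# Assumed cutoff: guess "fixed width" when at least 95% of the lines
# share a single length. The threshold is an arbitrary starting point.
sub looks_fixed_width {
    my ( $fh, $threshold ) = @_;
    $threshold = 0.95 unless defined $threshold;
    my ($share) = line_length_profile($fh);
    return $share >= $threshold ? 1 : 0;
}
```

Call looks_fixed_width($fh) on a freshly opened filehandle, optionally passing a second argument to loosen or tighten the cutoff. A file with several record types would need the shares of the top few lengths added together rather than just the largest one.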
---
my name's not Keith, and I'm not reasonable.

Re^2: how to identify a fixed width file
by ftumsh (Scribe) on May 14, 2008 at 15:48 UTC

    My initial stab was a count of record lengths, which was fine until files with differing record lengths cropped up.

    I think bringing that back, together with some analysis of the counts and tachyon/mortitz' text OR, should go a long way towards solving this.