in reply to Unpack/format

I took jlongino's suggestion and reworked it a little bit. I moved the padding into a subroutine and made it so it only creates enough extra whitespace to satisfy the format condition.
$c = "test test test";
$c = correct_length($c);
$fmt = "A4 x1 A4 x1 A4 x1 A4";
@c = unpack($fmt, $c);
print join("*", @c), "\n";

$d = "test test test ";
$d = correct_length($d);
$fmt = "A4 x1 A4 x1 A4 x1 A4";
@d = unpack($fmt, $d);
print join("*", @d), "\n";

sub correct_length {
    my ($string) = shift;
    # pad with spaces up to 15 characters, the minimum the format needs
    $string .= ' ' x (15 - length($string)) if length($string) < 15;
    return($string);
}
This is assuming all your widths are the same; otherwise you would want to either pass the required width as an argument to the sub (not a good idea if you use the size in multiple places) or create a hash of item names and their corresponding lengths and pass the item_name. I think this would give it a little more durability. Something like this:
my %formats = (
    "item_name"  => 15,   # where 15 is the length
    "item_name2" => 20,   # etc.
);

# code removed for demonstration

$c = correct_length($c, 'item_name');

sub correct_length {
    my ($string, $item) = @_;
    my $width = $formats{$item};
    # pad with spaces up to the width registered for this item
    $string .= ' ' x ($width - length($string)) if length($string) < $width;
    return($string);
}
Then the subroutine above could be passed the item_name and the string so the only place you would have to modify length would be in the formats hash, but I don't know if your data is that complex/varied.
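
For what it's worth, here is a minimal sketch of the same idea using sprintf to do the padding. The name pad_to is made up for illustration, and the widths are just the example values from the hash above, not anything from your real specs:

```perl
use strict;
use warnings;

# Hypothetical widths, keyed by item name (illustrative values only).
my %formats = (
    item_name  => 15,
    item_name2 => 20,
);

# pad_to is a made-up helper: left-justify $string in a field of the
# width registered for $item. "%-*s" pads with spaces but never
# truncates, so strings that are already long enough pass through as-is.
sub pad_to {
    my ($string, $item) = @_;
    die "unknown item '$item'\n" unless exists $formats{$item};
    return sprintf('%-*s', $formats{$item}, $string);
}

my @fields = unpack('A4 x1 A4 x1 A4 x1 A4', pad_to('test test test', 'item_name'));
print join('*', @fields), "\n";
```

The die on an unknown item name is my own addition; without it a typo in the name would silently pad to width 0.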

Re: Unpack/format
by Reverend Phil (Pilgrim) on Feb 13, 2002 at 18:06 UTC
    Thanks for the ideas =)
    Perhaps I should be a little more detailed though as to why I would rather not hard-code the format lengths.
    I'm taking a slew of files in from an FTP site, and their file names are the only indication of their formatting.
    Let's assume the file "2001-123456789-W2.txt"
    unless ($file =~ /^(\d{4})-\d{9}-([^\.]+)/) {
        log_data("$file does not conform to file naming standards");
        next;
    }
    $lower_format = lc("$1$2");
    This allows me to throw the file contents into a hash, keyed to "2001w2". Later on, I'm going to take the array of records matched to each hash key, and break them up based on their format. I'm disgustingly doing this via an eval, in a function which was passed the hash-key as $format:
    eval "\$current_format = \$fmt_$format";
    @line = unpack($current_format, $data);
    Earlier in the script, I've defined $fmt_2001w2 as a format string. There are numerous format strings of various shapes and sizes here.
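
One way to get the same lookup without the eval would be to keep the format strings in a hash keyed by the same "2001w2"-style names. This is only a sketch under assumptions: %fmt_for and split_record are invented names, and the templates below are placeholders, not the real formats:

```perl
use strict;
use warnings;

# Hypothetical table of unpack templates, keyed the same way as the
# record hash ("2001w2" etc.). The template strings are placeholders.
my %fmt_for = (
    '2001w2'  => 'a4 x1 a4 x1 a4',
    '2001w2c' => 'a9 x1 a20',
);

# split_record is a made-up name: look up the template registered for
# $format and unpack one record with it, dying on an unknown format.
sub split_record {
    my ($format, $data) = @_;
    my $template = $fmt_for{$format}
        or die "no format string defined for '$format'\n";
    return unpack($template, $data);
}

my @line = split_record('2001w2', '2001 abcd wxyz');
print join('*', @line), "\n";
```

An unknown key then fails loudly instead of unpacking with an empty format.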

    I could take the expected string sizes from our specs, and hardcode the values in here, checking the format and then the string lengths as I go along. But when the customer decides that we need to add this and move that, I have to (a) adjust the $fmt_2001w2 variable, and (b) change this hard-coded length value.

    I am lazy, and wish to do this in one place, not two. I don't want to miss one ;)

    Yes, padding the end of the line will work. Yes, hard-coding the lengths of the formats will work. Thanks both of you for helping =) Now, though, it's no longer about making this work (or, more specifically, not letting this die); it's about how I can make sure such a thing doesn't happen in the general case. If that involves calculating the length from the format string, what might be the quickest or keenest way to do so? =)

    -=rev=-
      If your format string doesn't use any terribly fancy template characters (i.e. things like "w"), and yours, which are all "a" and "x", fit the bill, you can do
      $expected_length = length(pack($fmt))
      This returns 14 for the format string of "a4 x1 a4 x1 a4".
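
Putting that together with the padding idea from earlier in the thread, a rough sketch might look like the following. pad_for_format is a made-up name, and this assumes fixed-width templates only ("a", "A" and "x" with explicit counts):

```perl
use strict;
use warnings;

# pad_for_format is a made-up helper: pad $data with spaces up to the
# length that pack() produces for $fmt, so a later unpack() cannot run
# an "x" past the end of the string. Data that is already long enough
# is returned unchanged.
sub pad_for_format {
    my ($fmt, $data) = @_;
    my $expected = length(pack($fmt));
    my $missing  = $expected - length($data);
    $data .= ' ' x $missing if $missing > 0;
    return $data;
}

my $fmt = 'A4 x1 A4 x1 A4 x1 A4';
my @rec = unpack($fmt, pad_for_format($fmt, 'test test test'));
print join('*', @rec), "\n";
```

That way the format string stays the only place where the record width is defined.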
      Hope this helps...

      -JAS
        HA!
        Can't see the trees for the forest! While TIMTOWTDI, this is the one I'm looking for. I should kick myself for not seeing that! In fact.. hang on a moment..



        OW!


        Thanks a ton =)
        -=rev=-