hmbscully has asked for the wisdom of the Perl Monks concerning the following question:

I have an obnoxious online form that gathers up to ~150 user-supplied values. I need to process these values and build a fixed-length record from them. I've been thinking of half a dozen ways to deal with the data, so I thought I'd ask some wise monks to add to the list of ways to handle them.

I've not worked with creating fixed-length records before, but I realize that sprintf() is going to be my friend. What I'm trying to figure out is a decently efficient way to handle the fact that many of the fields of the record may be submitted empty, which is fine, but I have to make sure they're holding the right number of spaces in the final record.

Should I declare all the variables up front something like
my $zip_1 = "            "; #12 spaces etc? and then have the variable values updated when I get the values from the form if there is a submitted value for that form?

I have several groups of variables that all need to be the same lengths. 35 is a popular length. After I read in the data from the form, say
$addressee_1 = param('addressee1'); and so on and so forth... can I assign all the variables that need to be 35 characters in length to an array and then foreach the whole array with a $foo = sprintf("%-35s", $foo);? Or are those variable changes local to the array? What I mean, is can I reference those variables later just as $addressee_1 and get the variable with the change, or do I have to reference them as part of the array (my array syntax sucks and I can't think of it at the moment) to get the padded/truncated value?

Vague enough? I welcome any questions to help me clarify my thinking. Thanks!

----------------------------------------------------------------
I learn more and more about less and less until eventually I know everything about nothing.

Re: Best way(s) to process form data into fixed-length values?
by Zaxo (Archbishop) on Jul 25, 2006 at 20:17 UTC

    IMO, the most convenient way to handle fixed-width records is with pack to write and unpack to read. Pack is usable either with arrays or lists of variables.

        my @data = qw/foo bar bazaquuxinator/;
        my $format = q(A3 A10 A3 A35);
        my $line = pack $format, @data;
        print $line, $/;
        $line = pack $format, @data[0,1], 'quux', $data[2];
        print $line, $/;

    That prints,

    foobar       baz                                   
    foobar       quubazaquuxinator                     
    $
    
    Unpacking with the same format string splits run-together items at the correct places. Notice that too-long data is truncated to the field width. The first printed line was constructed with not enough data for the format, so pack put 35 blanks at the end.
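
A minimal sketch of the round trip described above, assuming the same format string as in the example. The values are the ones already used there; nothing else is implied about the real record layout.

```perl
use strict;
use warnings;

# Same format as above: "A" is a space-padded text field of the given width.
my $format = 'A3 A10 A3 A35';

# Build a record, then split it back into fields with the same format.
my $line   = pack $format, 'foo', 'bar', 'quux', 'bazaquuxinator';
my @fields = unpack $format, $line;

# On unpack, "A" strips the trailing padding, so 'bar' comes back as 'bar'.
# 'quux' was truncated to 3 characters when packed and stays truncated.
print length($line), "\n";          # total record width: 3 + 10 + 3 + 35
print join('|', @fields), "\n";
```

Because the widths live in a single format string, the same string can serve for both writing and reading the record.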

    After Compline,
    Zaxo

Re: Best way(s) to process form data into fixed-length values?
by GrandFather (Saint) on Jul 25, 2006 at 19:23 UTC

    There is no need to init variables with padded strings. Consider:

        my $zip_1 = '';
        #...
        printf ">%12.12s<\n", $zip_1;
        printf ">%12.12s<\n", '123456789012';

    Prints:

        >            <
        >123456789012<

    To manage multiple successive entries that ought to be the same length:

        my @lines;
        push @lines, 'Address line 1';
        push @lines, '';    # Second line of address skipped
        push @lines, 'no zip stupid form';
        push @lines, 'My beloved country';
        push @lines, 'The %35.35s syntax will truncate this is a long line.';
        printf "%-35.35s\n", $_ for @lines;

    Prints:

        Address line 1

        no zip stupid form
        My beloved country
        The %35.35s syntax will truncate th

    But why fixed length records rather than CSV, XML or using a database?


    DWIM is Perl's answer to Gödel

      First off, fixed length is the requirement of the client, I have no say in that. yay me.
      I don't have multiple successive entries that need to be the same length. The fields are of various lengths scattered throughout the file layout. My badly-formed original question was looking to see if I could

          push @lines35char, $value1;
          push @lines35char, $value13;
          push @lines35char, $value22;
          sprintf "%-35.35s", $_ for @lines35char;
      and then have those padded values show up when I build the line, like so:
      $fl_line = join '', $filetype, $batch, $serial, $sys_time, $value1, $t_lastname, $t_firstname, $t_mi, $t_mi, $ssn, $ssn_cor, $dob, $t_address, $t_city, $t_state, $t_zip, $value13, $t_country, $code_1, $date_1, $type_1, $priority_1, $addressee_1, $institution_1,$value22;
      or do I have to reference them as they are related to the array?
      Thanks!
      I learn more and more about less and less until eventually I know everything about nothing.
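
On the aliasing question itself, here is a small sketch of how Perl behaves. The variable names and values are made up for illustration. Pushing values onto an array copies them, so padding the array elements leaves the named variables untouched; looping over the variables directly aliases $_ to each one, so assigning to $_ changes them in place.

```perl
use strict;
use warnings;

# Hypothetical form values; in the real script these would come from param().
my $addressee_1   = 'Jane Doe';
my $institution_1 = '';

# Copying into an array and padding the copies leaves the originals alone.
my @copies = ($addressee_1, $institution_1);
$_ = sprintf('%-35.35s', $_) for @copies;
print length($addressee_1), "\n";   # still the unpadded length

# Looping over the variables themselves aliases $_ to each one,
# so assigning to $_ changes the named variable in place.
$_ = sprintf('%-35.35s', $_) for ($addressee_1, $institution_1);
print length($addressee_1), "\n";   # now 35
```

Note that sprintf on its own only returns the padded string; the result has to be assigned somewhere to have any effect.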

        You could toss the various values into arrays. Something that may help is:

            use warnings;
            use strict;

            my @fields;
            my $fieldValue = 'Grand';           # Given name field contents
            push @fields, [$fieldValue, 10];    # 10 character wide field
            $fieldValue = 'Father';             # Surname
            push @fields, [$fieldValue, 10];    # 10 character wide field
            push @fields, ['007 Bond Street', 15];
            push @fields, ['Wibble Wobble', 15];
            push @fields, ['', 15];
            push @fields, ['Erewhon', 15];
            #...
            my $record = join '', map {sprintf "%-*.*s", $_->[1], $_->[1], $_->[0]} @fields;
            print $record;

        Prints:

            Grand     Father    007 Bond StreetWibble Wobble                 Erewhon

        which ties the field size to the field contents at the point where the contents are added to the field list then uses the field size to control field width in a sprintf later on.


        DWIM is Perl's answer to Gödel
        Why would you go to the trouble of pre-padding them? I'm pretty sure you'd be better off performance-wise to just call sprintf once (if that's any sort of concern).

        Assuming an example where odd fields are 5 chars long and evens 15:

            my ($var1, $var2, $var3, $var4, $var5);   # hopefully you'll be using some meaningful names instead
            my $format = "%5s%15s%5s%15s%5s\n";       # you could embed spaces in the format specifier too if needed
            # check/process inputs here
            #....
            my $outputline = sprintf $format, $var1, $var2, $var3, $var4, $var5;

        See the sprintf documentation (pg 797 of Programming Perl 3rd ed.) for other formats if you need to deal with formatting numerics.

        You can break the format statement up or use repeat specifications too so you don't lose count:

            my $format = "%5s%5s%5s%5s%5s"."%5s%5s%5s%5s%5s"."%5s%5s%5s%5s%5s".
                         "%5s%5s%5s%5s%5s"."%5s%5s%5s%5s%5s"."%5s%5s%5s%5s%5s".
                         "%5s%5s%5s%5s%5s"."%5s%5s%5s%5s%5s";
            my $otherformat = "%5s" x 40;
        (Though the former is likely to be unwieldy when the client changes the content three times a day)
Re: Best way(s) to process form data into fixed-length values?
by ptum (Priest) on Jul 25, 2006 at 19:14 UTC

    I think you are correct in pursuing sprintf as one good way to skin this cat. But I must admit, I personally prefer working with delimited records to fixed-length records any day of the week. Are you sure you can't do this with a delimiter, rather than fixed-length?

    If you decide you must use a fixed-length strategy, you might try declaring a hash with attribute names and expected string length (or other formatting instructions). Then you could pass each form input to a standard write_me() routine that would look up the appropriate formatting instruction and write it out to your file (or whatever).
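
One way the hash-plus-routine idea could look, as a rough sketch only. The field names and widths in %width_for are invented for illustration, and this write_me() returns the formatted string rather than writing to a file, so the caller decides where it goes.

```perl
use strict;
use warnings;

# Hypothetical field names and widths; the real ones come from the client's layout.
my %width_for = (
    addressee => 35,
    city      => 25,
    state     => 2,
    zip       => 12,
);

# Look up the width for a field and return the value padded or truncated to it.
# An undefined value (field left empty on the form) becomes all spaces.
sub write_me {
    my ($name, $value) = @_;
    my $width = $width_for{$name};
    die "No width defined for field '$name'\n" unless defined $width;
    $value = '' unless defined $value;
    return sprintf '%-*.*s', $width, $width, $value;
}

# Field order is given explicitly here, since a hash does not keep its own order.
my $record = join '', map { write_me(@$_) }
    [ addressee => 'Jane Doe' ],
    [ city      => undef ],
    [ state     => 'Ohio' ],
    [ zip       => '43210' ];

print ">$record<\n";
```

The record above is 74 characters wide (35 + 25 + 2 + 12), and 'Ohio' is cut down to 'Oh' to fit its two-character field.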

    I'm sure there are lots of TIMTOWTDI variations. :)


    No good deed goes unpunished. -- (attributed to) Oscar Wilde
      The fixed-length issue is a requirement from the client. They're defining the file format, I just have to deal with it. I too prefer delimited. I personally love |.
      Thanks for the suggestions.

      I learn more and more about less and less until eventually I know everything about nothing.
Re: Best way(s) to process form data into fixed-length values?
by zentara (Cardinal) on Jul 25, 2006 at 19:19 UTC
Re: Best way(s) to process form data into fixed-length values?
by metaperl (Curate) on Jul 25, 2006 at 18:57 UTC
Re: Best way(s) to process form data into fixed-length values?
by blokhead (Monsignor) on Jul 25, 2006 at 20:12 UTC
    Using a fixed-width text module sounds a little overkill. Instead, I would simply encapsulate one of the solutions from this thread into a small "pad_string" sub.

    Also be sure to consider whether a field's data should be truncated if it exceeds the maximum width. The thread I referenced has snippets of both forms: pad-or-truncate, and just pad.
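
A small "pad_string" sub along those lines might look like the sketch below. It is not the snippet from the referenced thread; the name, argument order and the $truncate flag are assumptions made for illustration.

```perl
use strict;
use warnings;

# pad_string($value, $width, $truncate): left-justify $value in $width columns.
# With a true $truncate the result is always exactly $width characters;
# otherwise over-long values pass through at full length.
sub pad_string {
    my ($value, $width, $truncate) = @_;
    $value = '' unless defined $value;
    return $truncate
        ? sprintf('%-*.*s', $width, $width, $value)
        : sprintf('%-*s',   $width,         $value);
}

print '>', pad_string('abc',      5, 1), "<\n";   # padded to 5
print '>', pad_string('abcdefgh', 5, 1), "<\n";   # truncated to 5
print '>', pad_string('abcdefgh', 5, 0), "<\n";   # left at full length
```

For a fixed-length record the truncating form is usually the safer default, since one over-long value would otherwise shift every field after it.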

    blokhead

Re: Best way(s) to process form data into fixed-length values?
by hmbscully (Scribe) on Jul 26, 2006 at 20:57 UTC
    Many thanks to all... so far I've got this lovely chunk o'code:
        # format for fixed-length line:
        my $filebegin_format = "%-1.1s%-8.8s%-6.6s%-17.17s%-25.25s%-16.16s%-1.1s%-9.9s%-9.9s%-8.8s%-35.35s%-25.25s%-2.2s%-12.12s%-3.3s"; #file layout for characters 1-177
        my $request_format = "%-4.4s%-4.4s%-1.1s%-1.1s%-35.35s%-35.35s%-35.35s%-25.25s%-2.2s%-12.12s%-3.3s%-4.4s"; #file layout for requests 1-8
        my $request9_format = "%-6.6s%-4.4s%-4.4s%-1.1s%-1.1s%-1.1s%-35.35s%-35.35s%-35.35s%-25.25s%-2.2s%-12.12s%-3.3s%-4.4s"; #file layout for request 9
        my $fileend_format = "%-5.5s%-1.1s%-16.16s%-4.4s%-25.25s%-16.16s%-9.9s%-1.1s%-40.40s%-14.14s%-2.2s%-9.9s%-3.3s%-3.3s%-10.10s%-40.40s%-22.22s%-1.1s"; #file layout for characters 1630-1850
        my $format = $filebegin_format . ($request_format x 8) . $request9_format . $fileend_format;

        # build fixed-length line:
        $fl_line = sprintf $format,
            $filetype, $batch, $serial, $sys_time, $t_lastname, $t_firstname, $t_mi, $t_mi, $ssn, $ssn_cor, $dob, $t_address, $t_city, $t_state, $t_zip, $t_country,
            $code_1, $date_1, $type_1, $priority_1, $addressee_1, $institution_1, $address_1, $city_1, $state_1, $zip_1, $country_1, $filler,
            $code_2, $date_2, $type_2, $priority_2, $addressee_2, $institution_2, $address_2, $city_2, $state_2, $zip_2, $country_2, $filler,
            $code_3, $date_3, $type_3, $priority_3, $addressee_3, $institution_3, $address_3, $city_3, $state_3, $zip_3, $country_3, $filler,
            $code_4, $date_4, $type_4, $priority_4, $addressee_4, $institution_4, $address_4, $city_4, $state_4, $zip_4, $country_4, $filler,
            $code_5, $date_5, $type_5, $priority_5, $addressee_5, $institution_5, $address_5, $city_5, $state_5, $zip_5, $country_5, $filler,
            $code_6, $date_6, $type_6, $priority_6, $addressee_6, $institution_6, $address_6, $city_6, $state_6, $zip_6, $country_6, $filler,
            $code_7, $date_7, $type_7, $priority_7, $addressee_7, $institution_7, $address_7, $city_7, $state_7, $zip_7, $country_7, $filler,
            $code_8, $date_8, $type_8, $priority_8, $addressee_8, $institution_8, $address_8, $city_8, $state_8, $zip_8, $country_8, $filler,
            $code_9, $date_9, $type_9, $exception_9, $priority_9, $addressee_9, $institution_9, $address_9, $city_9, $state_9, $zip_9, $country_9, $filler,
            $bill_total, $cc_type, $cc_number, $cc_exp, $cc_name, $cur_lastname, $cur_firstname, $cur_mi, $cur_address, $cur_city, $cur_state, $cur_zip9, $cur_pcode, $cur_country, $phone, $email, $filler, $reject;
        my $asr_line = uc($fl_line); #make the line uppercase
    Which so far is giving me just what I need. And it should be fairly simple to update when the client changes their mind again. Thanks to all!
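
If the layout does keep changing, one possible variation is to keep the widths as a plain list and generate the format string from it, so the total can be checked against the layout. This is only a sketch: format_for() is a made-up helper name, and the widths are copied from the first format string above.

```perl
use strict;
use warnings;

# Widths copied from the first format string above (characters 1-177).
my @filebegin_widths = (1, 8, 6, 17, 25, 16, 1, 9, 9, 8, 35, 25, 2, 12, 3);

# Turn a list of widths into a pad-and-truncate sprintf format,
# e.g. 8 becomes "%-8.8s".
sub format_for {
    return join '', map { sprintf '%%-%d.%ds', $_, $_ } @_;
}

my $filebegin_format = format_for(@filebegin_widths);

# Sanity check: the widths should add up to the span the layout promises.
my $total = 0;
$total += $_ for @filebegin_widths;
print "$total\n";
print "$filebegin_format\n";
```

Comparing the number of widths against the number of values passed to sprintf is another cheap check that the fields still line up after a layout change.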

    I learn more and more about less and less until eventually I know everything about nothing.
Re: Best way(s) to process form data into fixed-length values?
by duckyd (Hermit) on Jul 25, 2006 at 19:27 UTC
    You might consider using perl formats. See perldoc perlform