Data munging - in, out and something in between

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Re: Data munging - in, out and something in between by GrandFather (Saint) on Apr 23, 2007 at 22:24 UTC
Show us the code. In particular, show us a short program that demonstrates the problem, along with what you are getting and what you expected to get.

DWIM is Perl's answer to Gödel
Re: Data munging - in, out and something in between by ikegami (Patriarch) on Apr 24, 2007 at 02:56 UTC
Whether Perl converts the number to a floating point number internally doesn't matter. The number will automatically get converted back into an integer. Decimals will be lost, of course.

`$p = 'ABCD';`
`$n = unpack('N', $p);`
`print("$n\n");  # 1094861636`
`$n *= 1/3;`
`print("$n\n");  # 364953878.666667`
`$p = pack('N', $n);`
`$n = unpack('N', $p);`
`print("$n\n");  # 364953878`

The two most likely causes in my mind are 1) that you're trying to store a number too big for the field into which it's being packed, or 2) that you didn't `binmode` the file from which you are reading or to which you are writing.

We can best help you if you show us a bit of code that reproduces the problem. Be sure to specify what output you are getting and what output you are expecting.
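The first suspected cause (a number too big for the field) can be checked before packing. Here is a minimal sketch, assuming a helper called `pack_u32_be`; that name and the error message are mine, not anything from the thread. It drops the fractional part explicitly, the way the example above shows `pack` doing implicitly, and refuses values outside 0..0xFFFFFFFF.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): pack a number as a 32-bit
# big-endian unsigned integer, dying if it cannot fit in the field.
sub pack_u32_be {
    my ($value) = @_;
    my $int = int($value);    # drop the fractional part explicitly
    die "Value $value does not fit in an unsigned 32-bit field\n"
        if $int < 0 || $int > 0xFFFFFFFF;
    return pack('N', $int);
}

my $n = unpack('N', 'ABCD') / 3;             # 364953878.666...
print unpack('N', pack_u32_be($n)), "\n";    # prints 364953878

# A value that has outgrown the field is reported instead of silently packed.
my $ok = eval { pack_u32_be(2**32); 1 };
print "too big: $@" unless $ok;
```

The same check works for a `'V'` field by swapping the template; only the byte order differs, not the 32-bit range.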
Re: Data munging - in, out and something in between by rodion (Chaplain) on Apr 24, 2007 at 02:58 UTC
You may find your problem just by putting together a snippet of code for us to look at, as GrandFather suggests, but if you don't find it, this forum will then have something to work with. As an additional way to characterize the problem, you might try the following command lines:

`perl -e "print(pack('II',0x085c,0x5c080000));" > temp.pk`
`perl -e "print(pack('NN',map($_*2,unpack('II',<>))));" temp.pk > temp2.pk`
`perl -e "printf('%8x:%8x',unpack('NN',<>));" temp2.pk`

produces --> `10b8:b8100000` (as expected)

They look like they do what you describe, but the numbers come out as expected, at least on my Windows box. If the numbers come out right on your system, then take a look at what you're doing that's different from the examples, or show us and someone here will certainly take a look at it.
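The same round trip can also be kept in one script, which avoids differences in shell quoting between systems. This is only a sketch: the helper names `write_bytes`, `read_bytes` and `round_trip` are my own, the temporary files come from File::Temp rather than fixed names, and every handle is put into `binmode` so no newline translation can occur.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write raw bytes to a file; binmode keeps the bytes untouched.
sub write_bytes {
    my ($path, $bytes) = @_;
    open(my $fh, '>', $path) or die "Cannot write $path: $!";
    binmode($fh);
    print {$fh} $bytes;
    close($fh) or die "Cannot close $path: $!";
}

# Read a whole file back as raw bytes.
sub read_bytes {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot read $path: $!";
    binmode($fh);
    local $/;                 # slurp the whole file
    my $bytes = <$fh>;
    close($fh);
    return $bytes;
}

# Pack two native integers, double them, repack big-endian, and format.
sub round_trip {
    my ($fh1, $first)  = tempfile(UNLINK => 1);
    my ($fh2, $second) = tempfile(UNLINK => 1);
    close($fh1);
    close($fh2);
    write_bytes($first, pack('II', 0x085c, 0x5c080000));
    write_bytes($second,
        pack('NN', map { $_ * 2 } unpack('II', read_bytes($first))));
    return sprintf('%8x:%8x', unpack('NN', read_bytes($second)));
}

print round_trip(), "\n";    # prints "    10b8:b8100000" (%8x pads to 8 columns)
```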
Re: Data munging - in, out and something in between by Anonymous Monk on Apr 24, 2007 at 08:05 UTC
Sorry for the lack of code in my original question; it is not so easy to demonstrate in a code snippet, and impossible to show the data, but here goes...

`open(IN, " < $infile" );`
`my $si=8192; # the number of entries is known`
`my %matrix;`
`while(my $c<$si) {`
`    read(IN, my $bin, 4);`
`    $data = unpack('N', $bin);`
`    my $d = Unshuffle($c);`
`    $matrix{$d} += $data;`
`    $c++;`
`}`
`close( IN );`
`open(OUT, " >$outfile" );`
`foreach my $d (sort {$a <=> $b} (keys %matrix)) {`
`    print (OUT pack('V', $matrix{$d}));`
`}`
`close( OUT );`

The subroutine Unshuffle() simply calculates a new position in the output matrix for each entry. The above works until I try to scale the data, e.g.

`$matrix{$d} += $data/3;`

In which case the output is scrambled. BTW, null operations like:

`$matrix{$d} += $data/1;`
`$matrix{$d} += $data+0;`

still work. This, I suspect, explains the fact that the script also fails if more than one of the data points added to make `$matrix{$d}` is non-zero. I guess that my number is now too long for the "V" template in pack. Yes, the template must be 'V', otherwise a downstream program spits the dummy.

Thanks again
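One possible reason that scaling changes the outcome, offered as an assumption rather than a diagnosis: different values pack to different bytes, and a handle without `binmode` on Windows alters some of them. On output 0x0A is expanded to 0x0D 0x0A, and on input 0x1A is treated as end of file. The sketch below uses a made-up helper, `risky_bytes`, and made-up sample values to show how a scaled number can contain such a byte when the unscaled one does not.

```perl
use strict;
use warnings;

# Hypothetical diagnostic (not from the thread): count the bytes in a
# packed record that a text-mode handle on Windows may alter, namely
# 0x0A (newline) and 0x1A (Ctrl-Z).
sub risky_bytes {
    my ($bytes) = @_;
    my $count = () = $bytes =~ /[\x0A\x1A]/g;
    return $count;
}

# Illustrative values only: 30 packs as 1e000000 (no risky bytes),
# while 30/3 = 10 packs as 0a000000 (one risky byte).
for my $v (30, 30 / 3) {
    my $packed = pack('V', $v);
    printf "%s -> %s, risky bytes: %d\n",
        $v, unpack('H8', $packed), risky_bytes($packed);
}
```

A quick check on a real run is the output file size: with 8192 entries of 4 bytes each it should be exactly 32768 bytes, and a larger file would suggest that newline translation took place.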
Re^2: Data munging - in, out and something in between by rodion (Chaplain) on Apr 24, 2007 at 10:25 UTC
I tried your code on my machine, with a few additions/modifications to get it running, all of them marked with "# " below. I didn't see any problem with the output. The only really significant modifications were to take the "my" away from the "$c" in the while loop, since it was re-initializing $c with each pass through the loop, and to add binmode(IN), which actually doesn't make any difference with the data I'm using.

Try this modified code on your machine and see if you get appropriate output. It only processes the first two numbers, so it should be easy to see what's going on and play with it.

`use warnings;                             #`
`use strict;                               #`
`my $data;                                 #`
`my $c = 0;                                #`
`my $outfile = 'tempo.pk';                 #`
`my $infile = shift;                       #`
`open(IN, " < $infile" );`
`binmode(IN);                              # prevents newline translation`
`my $si=2; # the number of entries is known`
`my %matrix;`
`while($c<$si) {                           # removed my`
`    read(IN, my $bin, 4);`
`    $data = unpack('N', $bin);`
`    my $d = Unshuffle($c);`
`    print "$c -> $d ($data)\n";           #`
`    $matrix{$d} += $data/3;               #`
`    $c++;`
`}`
`close( IN );`
`open(OUT, " >$outfile" );`
`foreach my $d (sort {$a <=> $b} (keys %matrix)) {`
`    print "$outfile $d ($matrix{$d})\n";  #`
`    print (OUT pack('V', $matrix{$d}));`
`}`
`close( OUT );`
`open(IN, " <$outfile" );                  #`
`while( read(IN, my $bin, 4) ) {           #`
`    $data = unpack('V', $bin);            #`
`    print "$data:";                       #`
`}                                         #`
`sub Unshuffle {                           # interchange the first two positions`
`    my $val = shift;`
`    return 1 if ($val == 0);`
`    return 0 if ($val == 1);`
`    return $val;`
`}`
`# output was`
`# 0 -> 1 (2140)`
`# 1 -> 0 (1544028160)`
`# tempo.pk 0 (514676053.333333)`
`# tempo.pk 1 (713.333333333333)`
`# 514676053:713:`
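For experimenting without files at all, the read, scale, reorder and pack steps can be written as a function over byte strings. This is a sketch under assumptions: `rescale` is a name I made up, `unshuffle` is only a stand-in that swaps the first two positions as in the test program above, and the explicit `int` simply makes visible the truncation that `pack('V', ...)` performs anyway.

```perl
use strict;
use warnings;

# Stand-in for the poster's Unshuffle(): swaps the first two positions.
sub unshuffle {
    my ($pos) = @_;
    return $pos == 0 ? 1 : $pos == 1 ? 0 : $pos;
}

# Read big-endian 32-bit values from $in_bytes, scale them, accumulate
# them by new position, and return them packed little-endian in
# position order.
sub rescale {
    my ($in_bytes, $scale) = @_;
    my %matrix;
    my $c = 0;                          # declared once, outside the loop
    for my $data (unpack('N*', $in_bytes)) {
        $matrix{ unshuffle($c) } += $data * $scale;
        $c++;
    }
    return pack('V*',
        map { int($matrix{$_}) } sort { $a <=> $b } keys %matrix);
}

my $out = rescale(pack('NN', 2140, 1544028160), 1 / 3);
print join(':', unpack('V*', $out)), "\n";    # prints 514676053:713
```

If this produces the expected numbers but the file-based script does not, the difference most likely lies in how the files are opened, so it would be worth calling `binmode` on both the input and the output handle.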