dorian has asked for the wisdom of the Perl Monks concerning the following question:

The code below is an adaptation of several codes I've read here in the Monastery. I've tried to use PDL to improve the speed of the graphics, but I have several problems: 1. I have no clear the way to make a pdl variable with string of binary numbers. 2. It's very slow.

If somebody has an idea about how to improve it, I would be very pleased. You can email me to get a seg y file to make tests Thanks in advance, Dorian

use vars qw( $VERSION ); ($VERSION) = sprintf("%d.%02d", q$Revision: 1.3 $ =~ /(\d+)\.(\d+)/); use Tk; use Tk::Zinc; use PDL; #use strict; my $defaultfont = '-adobe-helvetica-bold-r-normal--*-120-*-*-*-*-*-*'; my $mw = MainWindow->new(); my $zinc = $mw->Zinc(-width => 900, -height => 700, -font => "10x20", # usually fonts are sets in resources # but for this example it is set in the + code! -borderwidth => 3, -relief => 'sunken',-backcolor=>'white +' )->pack; $i=0; #DECLARACION DE VARIABLES $b=pdl[1..1000]; $array2=pdl[[1],[1..1000]]; $file='D:\perl\download\prueba.sgy'; $[ = 1; open (IN,$file); binmode(IN); read (IN,$buf,3200); if (read (IN,$buf,400)){ $dsr = substr ($buf,17,2) ; $sr = unpack ( "n" , $dsr); $dsample = substr ($buf,21,2) ; $sample = unpack ( "n" , $dsample); $dcode= substr ($buf,25,2) ; $code = unpack ( "n" , $dcode); if ($code>4) { $code=1; } $btr=4; if ($code==3){$btr=2;} } $pertrace=($sample*$btr)+240; $bt=($pertrace-240)/4; $t=0; $m=0; $trace_value=pdl[1..10000]; $b=pdl[[1],[1..10000]]; #open (data,'>d:\perl\beta\dumpnum4.txt'); while (read (IN,$buf,$pertrace)) { $[ = 1; for ($i=0;$i<$bt;$i++) { $bit_string=unpack("B*",substr($buf,241+$i*4,4)); set $trace_value,$i,(conv_bit_string_2_ibm32float($bit_string)); } for ($i=0;$i<$bt;$i++) { $[ = 0; $array2[0][2*$i+1]=($i/4)+20; #eje de tiempo $array2[0][2*$i]=$trace_value->at($i)+30+$t*4; #valor de la ampl +itud # $m++; } $b=$zinc->add('curve',1,@array2,-filled=>1,-fillrule=>'negative'); $t++; #cuenta la cantidad de trazas # $m=0; #reinicia los valores del tiempo. last if ($t>1); #esta instruccion no sera necesaria cuando se c +uenten todas las trazas } close IN; #close data; # $b=$zinc->add('curve',1,@array2,-filled=>1,-fillrule=>'negative') +; MainLoop; sub conv_bit_string_2_ibm32float { $aux=$_[0]; $aux = shift; @aux2=$aux; $first_digit = substr($aux, 0, 1); $sign_bit = (-1)**$first_digit; $bin_exponent = substr($aux, 1, 7); $exponent = bin2dec($bin_exponent); $bin_fraction = substr($aux, 8, 24); @bit_chars = unpack("A1" x length($bin_fraction), $bin_fraction); $place_holder = -1; $fraction = 0; foreach $bit ( @bit_chars ) { $fraction += $bit * (2 ** $place_holder); $place_holder += -1; } $ibm_float = (($sign_bit) * (16 ** ($exponent - 64))*($fraction))/3000 +; return $ibm_float; } sub bin2dec { return unpack("N", pack("B32", substr("0" x 32 . shift, -32))); }

Replies are listed 'Best First'.
Re: graphic a segy file
by BrowserUk (Patriarch) on May 14, 2005 at 13:11 UTC

    Unless TkZinc is expecting you to pass data in PDL form (which from a quick scan of the docs, it isn't), your use of PDL here is doing nothing to speed up your program. In fact, it is probably slowing it down by forcing the conversion of the float values you are setting into your PDL arrays into PDL form which will then be converted back when TkZinc accesses them.

    Ie. If you use ordinary perl arrays inplace of the PDL arrays, your code will probably run (slightly) faster.

    Most of the time in your script is being consumed by the routine conv_bit_string_2_ibm32float(). Whilst this routine does correctly (kind of) convert IBM32 floats to IEEE native form, it is not very well coded.

    • First, you have some sort of scaling (/ 3000) going on inside the sub, which may be correct for your use of the data, but is definitly not a part of the IBM32float to IEEE float conversion.
      $ibm_float = (( $sign_bit )*( 16**( $exponent - 64 ) ) * $fraction + )/3000;

      (That had me stumped for a while! :)

      If you need to scale the data, this should be done outside the conversion routine.

    • Conversely, for your conversion routine to work, the IBM32float value will need to be unpacked to a bitstring every time the routine is called.

      Doing this outside the routine makes no sense. It should be a part of the sub.

    • There are a few artifacts kicking around inside the sub that make things more confusing than they need be. Eg.
      $aux = $_[0]; $aux = shift;
    • You aren't using strict, and you are using global vars, which are slower than lexical (my) vars! That may be enough to encourage your to use lexicals, even if their other benefits do not attract you.

    Putting that together with a few other changes that should improve the performance, I came up with this which does the same job somewhat more quickly:

    sub conv_bit_string_2_ibm32float { my( $first_digit, $bin_exponent, $bin_fraction ) = unpack 'A1A7A24', $_[ 0 ]; my $sign_bit = ( -1 ) ** $first_digit; my $exponent = bin2dec( $bin_exponent ); my $place_holder = -1; my $fraction = 0; foreach my $bit ( split '', $bin_fraction ) { $fraction += $bit * ( 2 ** $place_holder ); $place_holder += -1; } my $ibm_float = ( ($sign_bit ) * ( 16 ** ( $exponent - 64 ) ) * $f +raction ); return $ibm_float; } sub bin2dec { return unpack( "N", pack( "B32", substr( "0" x 32 . shift, -32 )) +); }

    Using normal Perl arrays and the above routine may improve your performance somewhat, but doing the conversion this way is always going to be slow.

    If you are plotting seismic data (?) and dealing with large volumes of it, then you would be well advised to pre-convert the IBM32 floats in your SEGY files in bulk before plotting your your data. Pre-conversion would reduce your plotting time immensly.

    There is almost certainly a utility available to convert ibm32 segy format files to IEEE float format segy files.

    A google turned up tito (from the GU toolkit) which appears to do the conversion, but I didn't look further at it to see if it was public domain. There are probably others around also.

    However, if you cannot locate a conversion utility, it would be fairly simple to code the above conversion routine using Inline::C such that you pass it a raw string that you read from the file and a count, and it would return you a perl array of native floats suitable for passing directly to TkZinc. That would run a couple of orders of magnitude more quickly than the pure perl conversion routine above.

    It would still make sense to perform the conversion as a batch process rather than whilst plotting.

    Speak up if you would like some help with that. There are many people here quite adept at writing Inline::C routines.

    Hope that clarifies a few things.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    The "good enough" maybe good enough for the now, and perfection maybe unobtainable, but that should not preclude us from striving for perfection, when time, circumstance or desire allow.
Re: graphic a segy file
by dynamo (Chaplain) on May 13, 2005 at 23:09 UTC
    Yowza!

    Ok, look. Normally, when seekers of perl wisdom forget to use <code> tags, I try to read the post and answer anyway. But this time you're just going to have to update your post before I can even try.

    Some <p> tags in the text part wouldn't hurt too much either.

    UPDATE: thanks for fixing that. unfortunately my PDL experience is too limited to actually help you, though.

      We editors usually notice such things fairly quickly, but there's no point chastising the original person, because we can fix it within a few minutes anyway. And, I usually pass along a fairly directed note to the original poster anyway.

      -- Randal L. Schwartz, Perl hacker
      Be sure to read my standard disclaimer if this is a reply.

        I wasn't aware that editors go around actually correcting posts without original poster interaction. Thanks for the heads-up.
Re: graphic a segy file
by zentara (Cardinal) on May 14, 2005 at 12:14 UTC
    What is a segy file? Can you explain in simple terms what you are trying to accomplish with this script?

    I'm not really a human, but I play one on earth. flash japh
      A segy is a format to handle seismic informatrion, according to the SEG standards (www.seg.org). I want to display a seismic section, which information comes in this format.