
The reference that GrandFather gave indicates that .dcd files are broken into records thus:

  <length0><record0><length0>
  <length1><record1><length1>
  ...

each record being bracketed by its length. Quick inspection of your file indicates that the lengths are 32-bit little-endian integers. Each length counts the record part only, so record0, together with its two length words, occupies the first length0 + 8 bytes of the file.
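
To make the framing concrete, here is a minimal sketch that splits an in-memory buffer into records. It assumes the layout described above (32-bit little-endian length before and after each record). The name split_records and the two synthetic records are made up for illustration and are not part of the reader at the end of this post.

```perl
use strict ;
use warnings ;

# split_records: hypothetical helper, for illustration only.
# Walks a buffer of <length><record><length> frames and returns the records.
sub split_records {
  my ($buf) = @_ ;
  my @recs ;
  my $off = 0 ;
  while ($off < length($buf)) {
    die "truncated length at offset $off\n" if length($buf) - $off < 4 ;
    my $len = unpack("V", substr($buf, $off, 4)) ;        # leading length
    die "truncated record at offset $off\n" if length($buf) - $off < $len + 8 ;
    my $rec  = substr($buf, $off + 4, $len) ;
    my $tail = unpack("V", substr($buf, $off + 4 + $len, 4)) ;  # trailing length
    die "length mismatch at offset $off: $len != $tail\n" if $tail != $len ;
    push @recs, $rec ;
    $off += $len + 8 ;                                    # record + 2 length words
  }
  return @recs ;
}

# Build two synthetic records framed the same way (not real .dcd content).
my $buf = '' ;
for my $payload ("CORD", "hello!") {
  $buf .= pack("V", length($payload)) . $payload . pack("V", length($payload)) ;
}

my @recs = split_records($buf) ;
for my $n (0 .. $#recs) {
  printf "%d: %d bytes '%s'\n", $n, length($recs[$n]), $recs[$n] ;
}
```

Checking the trailing length against the leading one is a cheap way to notice that you have lost sync with the record boundaries.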

The code at the end of this post is a quick and dirty .dcd reader.

Having established the record structure, the real problem appears to be that you need to know the format of each record in order to be able to unpick it.

I had a quick go. First I extracted only the characters \x20-\x7E, replacing runs of anything else by '~~' and singletons by '~'. Where there were numbers, they appeared to be 32-bit, so I also took each record in 4-byte groups and rendered any group that was not 4 characters in \x20-\x7E as an unsigned integer and, if plausible, as a float. Stretches of characters that were skipped are shown as #999, where 999 is the number of bytes.
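
As a rough sketch of how a single 4-byte group is treated: render_group is a hypothetical name used only here, and the "plausible float" test is the same crude one the reader uses (the top byte is neither 0x00 nor 0xFF). It needs a Perl that supports the `<` modifier in unpack (5.10 or later).

```perl
use strict ;
use warnings ;

# render_group: hypothetical helper, for illustration only.
# Takes exactly 4 bytes; returns the text if all 4 are printable ASCII,
# otherwise the little-endian unsigned integer and, if plausible, the float.
sub render_group {
  my ($group) = @_ ;
  return "'$group'" if $group =~ m/\A[\x20-\x7E]{4}\z/ ;
  my $i   = unpack("V", $group) ;
  my $out = sprintf("0x%X", $i) ;
  if (($i > 0x00FFFFFF) && ($i < 0xFF000000)) {
    $out .= sprintf("(%.6g)", unpack("f<", $group)) ;
  }
  return $out ;
}

print render_group("CORD"), "\n" ;                  # 'CORD'
print render_group(pack("V", 0x40568000)), "\n" ;   # 0x40568000(3.35156)
print render_group(pack("V", 0xC8)), "\n" ;         # 0xC8
```

The float test is only a heuristic: small integers are left as integers, but any larger bit pattern gets a float rendering whether or not it was meant as one.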

The result was a bit disappointing:

     0: 'CORDe~~8~N~~X~P~~'=~~'
        #4 0x365 0x14E1438(3.78507e-38) 0xC8 0x150BA58(3.83373e-38)
        0x0 0x0 0x0 0x0 0x0 0x3D2790E3(0.0409097) 0x1 0x0 0x0 0x0
        0x0 0x0 0x0 0x0 0x0 0x18
     1: '~~REMARKS FILENAME=output/restart6_out.dcd CREATED BY NAMD      '
        '                  REMARKS DATE: 06/07/07 CREATED BY USER: fadoul'
        'o                                 '
        0x2 #160
     2: '$~~'
        0xE824
     3: 'k+~~_@~~V@~3~!-~R@~~V@~~V@~~C~FO@'
        0xF2CA2B6B(-8.00876e+30) 0x405FE6F2(3.49847) 0x0 0x40568000(3.35156)
        0x21FF33A1(1.72931e-18) 0x40529C2D(3.29078) 0x0 0x40568000(3.35156)
        0x0 0x40568000(3.35156) 0x43199BB3(153.608) 0x404F4683(3.23868)
failed Record length 237712 > rest of file 86216 (@0x14C of file 'skata3')

Such is life. Record 4 is big. It appears to hold 59,428 32-bit numbers (237,712 / 4), at least going by the first 100 or so. I note that Record 2 contains the value 59,428 (0xE824).
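
The arithmetic is easy to check. The two numbers below are copied from the output above; the variable names are mine. Note also that 237,712 is below the reader's MAX_RECORD of 256 * 1024 = 262,144, so the failure looks like the file simply ending early rather than the record being too large for the buffer.

```perl
use strict ;
use warnings ;

my $record4_len = 237712 ;        # length reported in the failure message
my $record2_val = 0xE824 ;        # the single value found in Record 2
my $max_record  = 256 * 1024 ;    # same value as MAX_RECORD in the reader

my $count = $record4_len / 4 ;    # number of 32-bit values in Record 4
printf "%d bytes / 4 = %d values; Record 2 holds %d\n",
       $record4_len, $count, $record2_val ;
print "counts match\n"              if $count == $record2_val ;
print "record fits in MAX_RECORD\n" if $record4_len <= $max_record ;
```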

use strict ;
use warnings ;

use constant MAX_RECORD => 256 * 1024 ;

my $DCD = dcd_open("skata3") ;

my $data = '' ;
my $rc   = 0 ;
my $rec  = 0 ;
while ($rc = dcd_read($DCD, $data)) {
  (my $chars = $data) =~ s/[\x00-\x1F\x7F-\xFF]{2,}/~~/g ;
  $chars =~ s/[\x00-\x1F\x7F-\xFF]/~/g ;
  printf "%6d: '%s'\n", $rec++, join("'\n        '", $chars =~ m/(.{1,64})/g) ;
  numbers($data) ;
} ; 

if (!defined($rc)) { print "failed $@\n" ; } ;

sub numbers {
  my ($data) = @_ ;

  my $prev = 0 ;
  my $off  = 0 ;
  my $have = length($data) ;

  my $line = '' ;

  while ($have >= 4) {
    if (substr($data, $off, 4) !~ m/[\x20-\x7E]{4}/) {
      if (!$line) { $line = ' ' x 7 ; } ;
      my $i = unpack("\@${off}V", $data) ;
      if ($prev != $off) { $line .= " #". ($off - $prev) ; } ;
      $line .= sprintf(" 0x%X", $i) ;
      if (($i > 0x00FFFFFF) && ($i < 0xFF000000)) {
        my $f = unpack("\@${off}f<", $data) ;
        $line .= sprintf("(%.6g)", $f) ;
      } ;
      $prev = $off + 4 ;
      if (length($line) > 64) { print $line, "\n" ; $line = '' ; } ;
    } ;
    $off  += 4 ;
    $have -= 4 ;
  } ;

  if ($prev != $off) {
    if (!$line) { $line = ' ' x 7 ; } ;
    $line .= " #". ($off - $prev) ;
  } ;

  if ($line) { print $line, "\n" ; } ;
} ;

#=========================================================================================
# dcd_open: open given 'dcd' file and prepare to read records.
#
# Requires: $name   -- name of file to open
#
# Returns:  $DCD    -- dcd file "object"

sub dcd_open {
  my ($name) = @_ ;

  open my $FH, '<:raw', $name   or die "could not open $name: $!" ;

  my $f_offset = 0 ;
  my $b_offset = 0 ;
  my $buffer   = '' ;
  my $eof_met  = 0 ;

  my $DCD = [] ;
  @$DCD = ($FH, \$buffer, $f_offset, $b_offset, $eof_met, $name) ;

  return $DCD ;
} ;

#=========================================================================================
# dcd_read: read record from given 'dcd' file.
#
# Requires: $DCD    -- dcd file "object"
#           $rec    -- where to read record into  -- UPDATED IN PLACE
#
# Returns:  > 0     -- OK, record length + 1
#           = 0     -- OK, eof
#         undef     -- failed -- see $@

sub dcd_read {
  my ($DCD, undef) = @_ ;

  my ($FH, $r_buffer, $f_offset, $b_offset, $eof_met, $name) = @$DCD ;

  my $have = length($$r_buffer) - $b_offset ;
  if (($have < (MAX_RECORD + 8)) && !$eof_met) {
    substr($$r_buffer, 0, $b_offset) = '' ;
    my $rc = read $FH, $$r_buffer, (MAX_RECORD * 2) - $have, $have ;
    if (!defined($rc)) { $@ = "failed while reading $!" ; goto FAILED ; } ;
    $eof_met  = ($rc == 0) ;
    $have     = length($$r_buffer) ;
    $b_offset = 0 ;
  } ;

  if ($have < 4) {
    if ($have == 0) { return 0 ; } ;       # eof
    $@ = "Attempt to read when only $have bytes available" ;
    goto FAILED ;
  } ;

  my $len = unpack("\@${b_offset}V", $$r_buffer) ;
  if (($len + 8) > $have) {
    if ($len > MAX_RECORD) { $@ = "Record length $len > MAX_RECORD(".MAX_RECORD.")" ; }
                      else { $@ = "Record length $len > rest of file $have" ;
                             if (!$eof_met) { $@ .= " **BUT NOT AT EOF**" ; } ;       } ;
    goto FAILED ;
  } ;

  my $nel ;
  ($_[1], $nel) = unpack("\@${b_offset}V/a*V", $$r_buffer) ;

  if ($nel != $len) {
    $@ = "Start record length $len != end record length $nel" ;
    goto FAILED ;
  } ;

  $b_offset += $len + 8 ;
  $f_offset += $len + 8 ;

  @$DCD = ($FH, $r_buffer, $f_offset, $b_offset, $eof_met, $name) ;

  return $len+1 ;

FAILED:
  $@ .= sprintf(" (\@0x%X of file '%s')", $f_offset, $name) ;
  return undef ;
} ;

In reply to Re: binary to ascii convertion by gone2015
in thread binary to ascii convertion by pytheas
