Or is it a bad style to pass a file handle to a constructor?

No, it's fine, as long as you're using lexical filehandles (open my $fh ...). It only gets difficult if any code that is reading from the filehandle either needs to look back at something that was already read from the file, or needs to look ahead further into the file into a section that is supposed to be parsed by another piece of code - in cases like that, it's usually more appropriate to use an approach similar to what I showed above.

Because of this I would prefer to have the parse logic in each class.

Sure, that's fine too. Here's one quick example*:

use warnings; use 5.012; # for "package BLOCK" syntax BEGIN { $INC{'MyUtils.pm'}=__FILE__; # emulate "use" package MyUtils { use Exporter 'import'; our @EXPORT_OK = qw/ binread /; sub binread { # helper function for read+unpack my ($fh, $bytes, $templ) = @_; read($fh, my $data, $bytes) == $bytes or die "failed to read $bytes bytes"; return unpack($templ, $data); } } } package Packet { use Moo; use MyUtils qw/binread/; has type => ( is => 'ro' ); has length => ( is => 'ro' ); my %packet_types = ( 0x01 => 'MainHeader', 0x02 => 'ExtHeader', 0x13 => 'Section' ); sub parse { my ($class, $fh) = @_; my ($type, $length) = binread($fh, 2, "CC"); die "Unknown packet type $type" unless $packet_types{$type}; return $packet_types{$type} ->new( type=>$type, length=>$length )->parse($fh); } } package MainHeader { use Moo; use MyUtils qw/binread/; extends 'Packet'; has foo => ( is => 'rwp' ); sub parse { my ($self, $fh) = @_; my ($foo) = binread($fh, $self->length, 'Z*'); $self->_set_foo($foo); return $self; } } package ExtHeader { use Moo; use MyUtils qw/binread/; extends 'Packet'; has bar => ( is => 'rwp' ); sub parse { my ($self, $fh) = @_; my ($bar) = binread($fh, $self->length, 'N'); $self->_set_bar($bar); return $self; } } package Section { use Moo; use MyUtils qw/binread/; extends 'Packet'; has quz => ( is => 'rwp' ); sub parse { my ($self, $fh) = @_; my ($quz) = binread($fh, $self->length, 'n'); $self->_set_quz($quz); return $self; } } sub parsefile { my $file = shift; open my $fh, '<:raw', $file or die $!; my @pkts; while (!eof($fh)) { push @pkts, Packet->parse($fh); } close $fh; return @pkts; } my $binfile = "\x01\x06Hello\0" ."\x02\x04\0\x12\x34\x56" ."\x13\x02\xBE\xEF"; my @packets = parsefile(\$binfile); use Data::Dump; dd @packets; __END__ ( bless({ foo => "Hello", length => 6, type => 1 }, "MainHeader"), bless({ bar => 1193046, length => 4, type => 2 }, "ExtHeader"), bless({ length => 2, quz => 48879, type => 19 }, "Section"), )
But would be interesting what would you recommend if I had to compute a checksum over all sections although I would like to have the parse logic in each class instead of one central place

Well, if by that you mean you want to checksum the entire file, then probably the above sub parsefile is a good place, perhaps devising a way to keep track of the bytes already read or computing the checksum while reading - like for example, an object that wraps the filehandle and exposes the binread method I showed to read from the file could calculate the checksum as the file is read piece by piece. But in my experience it's more common to see checksums on a per-packet basis, in which case, in the above code, Packet::parse could take over the checksum reading and checking.

But I want to learn how to solve my tasks with a better design and in a better way.

* There are a whole bunch of possible variations on the above code. For example, I could've used Moo's features like BUILDARGS to have the constructor do the parsing, instead of a separate sub parse (although the former solution makes it a little more tricky to create packets in code that haven't been parsed from a file). Or, I could have structured the classes differently: If this was like a network protocol and each "packet" has a header, then it would make sense for the class Packet to have a header field that is populated with a corresponding class, instead of having the *Header classes be subclasses of Packet. Or I could've defined a role that requires each class to have a parse method. And so on.

So in general, the usual software design principles apply: reduce repetition, design your OO "isa" and "has" relationships in a sensible manner, make judicious use of factory methods, and so on. If you feel like something is getting too difficult, then it's best to step back and see if there might be some architecture changes that would help the situation, instead of plowing on, because the more code you write, the more reluctant you'll be to make larger architecture changes.


In reply to Re^3: Best way to a parse a big binary file by haukex
in thread Best way to a parse a big binary file by Dirk80

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.