PerlMonks  

Waste? I don't see waste.

The thing I found most vexing was that mocking up objects which contained arbitrary binary data was brain-bending and time-consuming. Let's say I want to write a deserialization method. We'll follow TDD and write a failing test first.

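    package Foo;
    sub get_data { $_[0]->{data} }

    package main;
    use strict;
    use warnings;
    use Test::More tests => 1;
    use Bar;

    # Three BER-length-prefixed strings, then a 16-byte sentinel.
    # Lengths are written "\x03": "\x3foo" would parse as "\x3f" plus "oo".
    my $blackbox = bless {
        data => "\x03foo\x03bar\x03bazasdfasdfasdfasdf",
    }, 'Foo';
    my $object = Bar->new;
    my $deserialized = [ $object->deserialize( $blackbox ) ];
    is_deeply($deserialized, [ qw(foo bar baz) ],
        "Deserializer decodes correctly");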

FYI, in order to write that, I had to go look up BER compressed integers and see how the byte-level algorithm worked. Let's hope I got it right.
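For anyone who wants to check the byte-level details, here is a minimal sketch of BER compressed integer encoding. The `ber_encode` helper is my own illustration, not part of any library; it is compared against Perl's built-in `pack 'w'`.

```perl
use strict;
use warnings;

# Hand-rolled BER compressed integer: base-128 digits, most significant
# first, with the high bit set on every byte except the last.
# (ber_encode is a hypothetical helper written for this illustration.)
sub ber_encode {
    my ($n) = @_;
    my @bytes = ( $n & 0x7f );                    # last byte: high bit clear
    while ( $n >>= 7 ) {
        unshift @bytes, 0x80 | ( $n & 0x7f );     # earlier bytes: high bit set
    }
    return pack 'C*', @bytes;
}

for my $n ( 3, 127, 128, 300 ) {
    my $by_hand = unpack 'H*', ber_encode($n);
    my $by_pack = unpack 'H*', pack( 'w', $n );
    printf "%3d => %s (pack 'w' gives %s)\n", $n, $by_hand, $by_pack;
}
```

Values below 128 fit in a single byte, so a length of 3 is just the byte 0x03; 128 is the first value that needs two bytes (0x81 0x00).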

Here's the actual code, now that I am allowed to type it.

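    use Carp qw(confess);

    sub deserialize {
        my ($self, $blackbox) = @_;
        # Capture up to the 16-byte random sentinel, or to end of data.
        # A match is used rather than s///, because the return value of
        # get_data is not an lvalue and cannot be modified in place.
        my ($record) = $blackbox->get_data
            =~ m/ (.*?) (?: \Q$self->{record_separator}\E | \z ) /xs
            or confess("no match");
        return unpack("(w/a)*", $record);
    }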
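To see what the `(w/a)*` template does without the surrounding classes, here is a standalone sketch for this toy format only. The variable names and the use of `pack` to build the input are my assumptions for illustration; the sentinel string stands in for `$self->{record_separator}`.

```perl
use strict;
use warnings;

# Hypothetical 16-byte sentinel standing in for $self->{record_separator}.
my $separator = 'asdfasdfasdfasdf';

# Build one record: each field is a BER length followed by that many bytes.
my $record = pack '(w/a*)*', qw(foo bar baz);
my $data   = $record . $separator;

# Take everything up to the sentinel, then let unpack walk the fields:
# "w" reads a BER length, "/a" reads that many bytes, "(...)*" repeats.
my ($captured) = $data =~ m/ (.*?) (?: \Q$separator\E | \z ) /xs
    or die "no match";
my @fields = unpack '(w/a)*', $captured;

print join( ',', @fields ), "\n";    # prints "foo,bar,baz"
```

This only works because the toy record is simple enough to regenerate with `pack`; it says nothing about data that depends on the rest of the index.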

Now, even if I risk displeasing the gods of TDD and cheat by typing the code I'm actually going to use before writing my test, it's still a pain to generate this intermediate data. And if I decide to experiment with another algorithm, I wind up throwing away that hard-won mock data, as it's rare that it transmutes easily. Ironically, tests like these are tightly coupled to the code they test, which makes them brittle and difficult to adapt or reuse.

You're rewriting the library from scratch, period.

Credit where it's due: Plucene was originally written over a year ago, as a port of Lucene 1.3. The problem is this:

    # time to index 1000 documents:
    Plucene 1.25           276 secs
    Kinosearch 0.021        88 secs
    Kinosearch 0.03_02      35 secs
    Java Lucene             13 secs

I'm now working on a port of the current version of Lucene (essentially 1.9, not yet officially released), leveraging what I learned by reinventing the wheel with Kinosearch.

The same problems of dealing with arbitrary binary data arise, though since this is a port and not an alpha, I won't have to continually rewrite tests as I would have had to (if I'd followed TDD) when I was writing Kinosearch. Perhaps you can suggest an alternative technique for creating the mock objects? You can't algorithmically generate this data; even if you could live with large copy and paste ops, too many dependencies are involved to pull it off.

In addition, you can probably end up with several new distros to add to CPAN that aren't directly usable solely for reverse indexing.

That's where Sort::External came from.

Best,

--
Marvin Humphrey
Rectangular Research ― http://www.rectangular.com

In reply to Re^6: Documenting non-public OO components by creamygoodness
in thread Documenting non-public OO components by creamygoodness
