lomSpace:

You first need to chop things up into related concepts. For example, the first thing I notice is that chunk you helpfully marked with =cut lines. There's a bit there you could make into its own subroutine[1]:

    sub convert_raw_to_fasta {
        my $InFileName      = shift;
        my $OutFileNameBase = shift or die "Missing argument(s)!";

        # Read the raw sequence (first line only)
        open my $hu, '<', $InFileName or die $!;
        my $seq = <$hu>;
        close $hu or die $!;

        # Write it back out as FASTA, using the base name as the header
        open $hu, '>', $OutFileNameBase . ".fa" or die $!;
        print $hu ">$OutFileNameBase\n$seq";
        close $hu or die $!;
    }

Then you could simplify the remaining subroutines:

    sub blast_parse {
        my ($maid, $maid_dir) = @_;
        my $url_hu = "http://hu_seq/";
        my $hu = get($url_hu . $maid);
        my $ltvec_small = $maid_dir . $maid . "Ltvec_small.fa";
        convert_raw_to_fasta($hu, $maid);

        # syntax
        # bl2seq -p blastn -i nucleotide1 -j nucleotide2 -F F -D 1
        my $command = "bl2seq -p blastn -i $ltvec_small -j $hu_fa -F F -D +1";
        print $command, "\n";
        open OUTPUT, '>', "$maid_dir\\" . $maid . "_bl2seq.out";
        STDOUT->fdopen(\*OUTPUT, 'w');
        system($command);
        bl2seq_parse();
    }

    sub blast_hd_parse {
        my ($maid, $maid_dir) = @_;
        my $url_hd = "http://hd_seq/";
        my $hd = get($url_hd . $maid);
        my $ltvec_small = $maid_dir . $maid . "Ltvec_small.fa";
        convert_raw_to_fasta($hd, $maid);

        my $command = "bl2seq -p blastn -i $ltvec_small -j $hd_fa -F F -D +1";
        print $command, "\n";
        open OUTPUT, '>', "$maid_dir\\" . $maid . "_bl2seq.out";
        STDOUT->fdopen(\*OUTPUT, 'w');
        system($command);
        bl2seq_parse();
    }

Then you might notice that the ends of each function are similar as well. The different bits are the command string to execute and the name of the output file. Factor out those bits into arguments, and you can create another sub:

    sub process_and_parse {
        my $command     = shift;
        my $output_file = shift or die "Missing argument(s)!";

        print $command, "\n";
        open OUTPUT, '>', $output_file or die $!;
        STDOUT->fdopen(\*OUTPUT, 'w');
        system($command);
        bl2seq_parse();
    }

So your functions would then reduce to:

    sub blast_parse {
        my ($maid, $maid_dir) = @_;
        my $url_hu = "http://hu_seq/";
        my $hu = get($url_hu . $maid);
        my $ltvec_small = $maid_dir . $maid . "Ltvec_small.fa";
        convert_raw_to_fasta($hu, $maid);
        process_and_parse(
            "bl2seq -p blastn -i $ltvec_small -j $hu_fa -F F -D 1",
            "$maid_dir/" . $maid . "_bl2seq.out"
        );
    }

    sub blast_hd_parse {
        my ($maid, $maid_dir) = @_;
        my $url_hd = "http://hd_seq/";
        my $hd = get($url_hd . $maid);
        my $ltvec_small = $maid_dir . $maid . "Ltvec_small.fa";
        convert_raw_to_fasta($hd, $maid);
        process_and_parse(
            "bl2seq -p blastn -i $ltvec_small -j $hd_fa -F F -D 1",
            "$maid_dir/" . $maid . "_bl2seq.out"
        );
    }

You'd continue in this manner, as needed. Along the way, you'd remove unneeded bits of code and variable declarations, etc. Then, if you wanted to compress them into a single function, you'd find that again there are some bits that are different, and you could turn those differences into arguments and pull the code together.
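As a rough sketch of that last step, here is one way the two routines might collapse into one. The name blast_parse_from and the variable $target_fa are made up for illustration, and $target_fa assumes the FASTA file is the "$maid.fa" that convert_raw_to_fasta writes. The three helper subs are replaced by logging stubs so the example runs without a network connection or bl2seq installed.

```perl
use strict;
use warnings;

# Stubs so the sketch runs standalone. In the real script, get() would come
# from LWP::Simple and the other two are the helper subs from earlier.
our @LOG;
sub get                  { my ($url) = @_;        push @LOG, "get $url";          return "ACGT\n" }
sub convert_raw_to_fasta { my ($raw, $base) = @_; push @LOG, "fasta $base.fa";    return 1 }
sub process_and_parse    { my ($cmd, $out) = @_;  push @LOG, "run [$cmd] > $out"; return 1 }

# One routine replaces blast_parse and blast_hd_parse: the only real
# difference between them was the URL, so it becomes an argument.
sub blast_parse_from {
    my ($url_base, $maid, $maid_dir) = @_;
    defined $maid_dir or die "Missing argument(s)!";

    my $raw = get($url_base . $maid);
    defined $raw or die "Can't fetch $url_base$maid";

    my $ltvec_small = $maid_dir . $maid . "Ltvec_small.fa";
    my $target_fa   = $maid . ".fa";    # assumed: the file convert_raw_to_fasta writes

    convert_raw_to_fasta($raw, $maid);
    return process_and_parse(
        "bl2seq -p blastn -i $ltvec_small -j $target_fa -F F -D 1",
        "$maid_dir/" . $maid . "_bl2seq.out",
    );
}

# The two old entry points shrink to one-line wrappers.
sub blast_parse    { return blast_parse_from("http://hu_seq/", @_) }
sub blast_hd_parse { return blast_parse_from("http://hd_seq/", @_) }

blast_parse("m1", "dir/");
blast_hd_parse("m1", "dir/");
print "$_\n" for @LOG;
```

Keeping the old names as thin wrappers means existing callers don't have to change while you decide whether the wrappers are still worth having.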

When you're done factoring some of the bits out, sometimes you'll find that you really want to compose your system differently. Don't allow the current subroutine boundaries to constrain your thinking. Sometimes by chopping things up a bit differently, you'll wind up removing a *lot* of code and gaining features.

In fact, that's usually when I know that I understand the business process. I start thinking about things, factor out a little code here, reuse it there and there. I generalize it a little and replace several subroutines with one. At the beginning of the process, you solve each problem as given, and you're afraid to take any liberties because you don't know the impacts on other items. Then you learn more about the system and know where to generalize. Once I have that "aha!" and start removing code while improving it, I know I'm near the end of the road.

1. I noticed that the code isn't real, functional code. So I took liberties in cleaning up a bit, making no attempt at fixing any of the code. Instead, I just added a little error handling and such. But if it were real code, the process would be similar.

2. Always (and I mean always) check the result of function calls where appropriate. (open, system, get, and bl2seq_parse all come to mind.)
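To make note 2 a bit more concrete, here's a minimal sketch of checking system and open. The helpers run_or_die and read_first_line are hypothetical names, not part of your script, and the demonstration runs the current perl binary ($^X) instead of bl2seq so it doesn't depend on anything being installed.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command and die with a useful message on failure.
sub run_or_die {
    my @command = @_;
    my $status  = system(@command);
    if ($status == -1) {
        die "Can't start '@command': $!";
    }
    elsif ($? & 127) {
        die sprintf "'%s' died with signal %d", "@command", $? & 127;
    }
    elsif ($? >> 8) {
        die sprintf "'%s' exited with status %d", "@command", $? >> 8;
    }
    return 1;
}

# open: check the return value, and put the file name and $! in the message.
sub read_first_line {
    my ($file) = @_;
    open my $fh, '<', $file or die "Can't open '$file' for reading: $!";
    my $line = <$fh>;
    close $fh or die "Can't close '$file': $!";
    return $line;
}

# A command that succeeds, then one that fails and is caught with eval.
run_or_die($^X, '-e', 'exit 0');
my $ok = eval { run_or_die($^X, '-e', 'exit 3'); 1 };
print "failure caught: $@" unless $ok;
```

The same idea applies to get (it returns undef on failure) and to your own bl2seq_parse, if you have it return a true value on success.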

Generally, I like it when my code looks like an outline in structure. Each subroutine calls other subroutines that each do a small, simple task. You keep decomposing things until you get to something that's just trivial to implement. Something like:

    # main task
    initialize_frobnitz();
    generate_zanzibar();
    show_results();

    sub initialize_frobnitz {
        my $frobber = allocate_frobnitz(1);
        send_to_frobber($frobber, 'INIT')
            or die "Can't initialize frobber!";
        send_to_frobber($frobber, 'configuration value 1')
            or die "Can't configure frobber!";
    }

    sub send_to_frobber {
        my $serial_port = ... etc ...

...roboticus

You can tell when I'm not terribly busy at work ... I tend to make longer, more rambling posts. Ah, well!


In reply to Re: how can I combine into one method by roboticus
in thread how can I combine into one method by lomSpace
