My first post and experience here with Perl prompted me to look for better solutions to what I do with Perl. First post: http://perlmonks.org/index.pl?node_id=1137275.

I am trying to modify this solution for another project of mine. I can get all the individual parts to work, but they won't come together. The idea is that I have a large file (again) and am trying to extract text (again), this time from a single line that starts with a certain delimiter (the previous solution pulled text from different lines). I can get each column to extract to my CSV individually, but I can't get them all to extract at the same time. Here is what I have, modified from the previous solution:

use strict;
use warnings;

use MCE::Loop;
use MCE::Candy;

## Input and output files defined here, as well as what we are
## searching for in the input file
my $input_file   = shift || 'InputFile.OH1';
my $output_file  = shift || 'OutputFile.csv';
my $match_string = "+ ";

open my $ofh, ">", $output_file
   or die "cannot open '$output_file' for writing: $!\n";

## This writes a header row that GIS sees as the column heading
print $ofh "HEC1_ID,Q100_Base,TTP,Area\n";

MCE::Loop::init {
   use_slurpio => 1, chunk_size => 1, max_workers => 4,
   gather => MCE::Candy::out_iter_fh($ofh),
   RS => "\n${match_string}",
};

## Below, each worker receives one record at a time
## Output order is preserved via MCE::Candy::out_iter_fh

## EXAMPLE INPUT FILE
## Line 1##+ BPI30 1319. 13.50 477. 147. 49. 4.64
## Line 2##
## Line 3## ROUTED TO
## Line 4##+ RPI30 1220. 13.75 475. 147. 49. 4.64
## Line 5##
## Line 6## HYDROGRAPH AT
## Line 7##+ BPI31 765. 12.42 102. 26. 9. .73
## Line 8##
## Line 9## 2 COMBINED AT
## Line 10##+ CPI31 1242. 13.75 571. 172. 58. 5.37

mce_loop_f {
   my ( $mce, $chunk_ref, $chunk_id ) = @_;

   ## Skip initial record containing header lines including *** ***
   if ( $chunk_id == 1 && $$chunk_ref !~ /^${match_string}/ ) {
      ## Gathering here is necessary when preserving output order,
      ## to let the manager process know chunk_id 1 has completed.
      MCE->gather( $chunk_id, "" );
      MCE->next;
   }

   ## Each record begins with "+ "
   my ( $k1, $k2, $k3, $k4 ) = ( "", "", "", "" );

   open my $ifh, "<", $chunk_ref;
   while ( <$ifh> ) {
      $k1 = $1 and next if $. == 1 && /^\S\s+(\S+)/;
      $k2 = $1 and next if $. == 1 && /^\S\s+\S+\s+(\S+)/;
      $k3 = $1 and next if $. == 1 && /^\S\s+\S+\s+\S+\s+(\S+)/;
      $k4 = $1 and last if $. == 1 && /(\S+)\s*$/;
   }
   close $ifh;

   ## Gather values.
   ## This outputs everything to the output file in the format below.
   MCE->gather( $chunk_id, "$k1,$k2,$k3,$k4\n" );

} $input_file;

As an example, from Line 1 I am trying to extract BPI30, 1319, 13.50, and 4.64; then RPI30, 1220, 13.75, and 4.64 from Line 4; and so on. The lines always begin with "+ " followed by spaces. Each of $k1-$k4 extracts the correct data into the right column on its own, but they won't come together into one row. Any help is appreciated, and I hope it is a silly oversight on my part. Thanks.
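A likely culprit is the `and next` after each of the first three matches: on line 1 of the record, the $k1 regex matches, `next` jumps to the next input line, and the $k2-$k4 regexes are never tried against line 1. One way around it is a single combined regex that captures all four columns at once. A minimal sketch (the sample line below is hypothetical, spaced like the example input above; untested against the real HEC-1 output):

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical first line of a record, laid out like the example input.
my $line = "+     BPI30   1319.   13.50   477.   147.   49.   4.64";

# One regex captures the first three columns and the last column in a
# single pass: (?:\s+\S+)* soaks up the middle columns, and backtracking
# leaves the final \S+ for the last capture.
my ( $k1, $k2, $k3, $k4 ) =
    $line =~ /^\S\s+(\S+)\s+(\S+)\s+(\S+)(?:\s+\S+)*\s+(\S+)\s*$/;

print "$k1,$k2,$k3,$k4\n";   # BPI30,1319.,13.50,4.64
```

Alternatively, dropping the `next` from the first three statements in the original loop would let all four regexes run against line 1 before `last` fires.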


In reply to Extract string to file by oryan
