in reply to Re^2: Textfile to csv with a small twist
in thread Textfile to csv with a small twist

Wow! I appreciate all of the input on my problem. InfiniteSilence's output is the closest to what I'm looking for (I haven't looked at the output from all of the examples yet); however, seeing the varied replies, I see I didn't explain myself clearly enough. I am basically looking for output like his/hers, except without the commas and with the CRLFs still in there. I believe I can modify the code provided to suit my needs, but what do I know? My Perl knowledge only fills a matchbook :( Bentov

Re^4: Textfile to csv with a small twist
by jZed (Prior) on Aug 25, 2005 at 19:43 UTC
    I am still not understanding what output you want. If your data is this:
    H1:
    T1.1
    T1.2
    H2:
    T2.1
    T2.2
    
    Do you want this:
    H1,H2
    "T1.1\nT1.2\n","T2.1\nT2.2\n"
    
    Or this:
    H1,H2
    "T1.1\n","T2.1\n"
    "T1.2\n","T2.2\n"
    
    Or something else? And another ambiguity: you haven't mentioned whether headings can repeat in your input data (for example more than one section labeled heading1).
      Closer to the first one, what I'm exactly looking for is
      H1,T1.1 CRLF T1.2 crlf,H2,T2.1 CRLF T2.2 CRLF
      While the headings are supposed to be the same in all of my files, i.e. they will always appear in the same order and all of the headings will appear, I know they aren't. So I'm figuring that I'll need to have the heading as part of the record; that way, if for example I have something like
      H1:
      T1.1
      T1.2
      H3:
      T3.1
      T3.2
      T3.3
      I can generate something like:
      H1,T1.1 CRLF T1.2 CRLF,H3,T3.1 CRLF T3.2 CRLF T3.3 CRLF
      and from that generate...
      H1,T1.1 CRLF T1.2 CRLF,H2,,H3,T3.1 CRLF T3.2 CRLF T3.3 CRLF
      Since the H2 heading didn't appear in the data, I will have to force it in in the final file. I planned this to be a two-step process. I hope that explains things better. Bentov
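The two-step process described above could be sketched roughly as follows. This is only an illustration under assumptions that are not in the thread: the master list of headings (`@expected`) and the helper names `parse_sections` and `fill_missing` are hypothetical, and a heading line is assumed to be a single word followed by a colon. It uses core Perl only and stops short of writing the CSV itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical master list of headings, in the order they should appear.
my @expected = qw(H1 H2 H3);

# Step 1: collect the text found under each heading that actually occurs.
sub parse_sections {
    my @lines = @_;
    my (%text_for, $heading);
    for my $line (@lines) {
        if ($line =~ /^(\w+):\s*$/) {      # a heading line such as "H1:"
            $heading = $1;
            $text_for{$heading} = '';
        }
        elsif (defined $heading) {         # body text, line ending kept
            $text_for{$heading} .= $line;
        }
    }
    return %text_for;
}

# Step 2: build heading/text pairs for every expected heading,
# with an empty string where the heading was missing from the input.
sub fill_missing {
    my ($text_for, @headings) = @_;
    my @record;
    for my $h (@headings) {
        push @record, $h, (exists $text_for->{$h} ? $text_for->{$h} : '');
    }
    return @record;
}

my @input    = ("H1:\n", "T1.1\n", "T1.2\n", "H3:\n", "T3.1\n", "T3.2\n", "T3.3\n");
my %text_for = parse_sections(@input);
my @record   = fill_missing(\%text_for, @expected);
```

Here `@record` holds six fields, with `H2` followed by an empty field. Because the text fields contain line breaks, they would still need proper quoting when written out, which is a job for a CSV module rather than a plain `join`.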
        You say you want this
        H1,T1.1 CRLF T1.2 crlf,H2,T2.1 CRLF T2.2 CRLF
        
        How will you ever retrieve the data from that? If instead you have this:
        H1,"T1.1 CRLF T1.2 CRLF"
        H2,"T2.1 CRLF T2.2 CRLF"
        
        You will then be able to retrieve the text for any heading, for example if you print out the text for H2, it will look like this:
           T2.1
           T2.2
        
        Is that what you want? If so, here's how I would do it:
        #!/usr/bin/perl -w
        use strict;
        use IO::Scalar;   # not needed if input and output are from files
        use Text::CSV_XS;

        my $csv = Text::CSV_XS->new( {binary=>1} );
        my @input_data = <DATA>;

        # turn the input data into a CSV string with records and fields
        #
        my $csv_str = text_to_csv( @input_data );

        # as a test, turn the CSV string back into a text string
        #
        my $output_data = csv_to_text( $csv_str );

        # check that the text string created from the CSV is the same
        # as the original
        #
        print "ok!\n" if $output_data eq join '', @input_data;

        # build one CSV row (heading, text) per section of the input
        sub text_to_csv {
            my (@new_row,$new_text,$output_csv,$heading);
            for my $line (@_) {
                if ($line =~ /^(\w+:)$/) {
                    $heading = $1;
                    if (@new_row) {
                        $output_csv .= make_row(@new_row,$new_text);
                    }
                    @new_row  = ($heading);
                    $new_text = '';
                }
                else {
                    $new_text .= $line;
                }
            }
            $output_csv .= make_row(@new_row,$new_text);   # flush the last section
            return $output_csv;
        }

        # read the CSV back and rebuild the original text
        sub csv_to_text {
            my($input_str)=@_;
            my $output_str = '';
            my $fh = IO::Scalar->new(\$input_str);
            while (my $cols = $csv->getline($fh)) {
                last unless @$cols;
                $output_str .= sprintf "%s\n%s", @$cols;
            }
            return $output_str;
        }

        # combine a list of fields into one CSV line
        sub make_row {
            my $success = $csv->combine(@_);
            die "Couldn't combine '@_'\n" unless $success;
            return $csv->string . "\n";
        }
        __DATA__
        H1:
        T1.1
        T1.2
        H2:
        T2.1
        T2.2
        Note that only the text_to_csv() sub is needed to do what you asked; the other subs are there as tests and illustrations.
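To illustrate the point about retrieving the text for any one heading, here is a minimal sketch. It assumes the CSV has two columns per row (heading, text) as produced by text_to_csv() above, with headings stored including their trailing colon. The helper name `text_for_heading` is hypothetical, and an in-memory filehandle is used here in place of IO::Scalar.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV_XS;

# Return the text stored under $wanted, or undef if no row has that heading.
# Assumes each CSV row is (heading, text).
sub text_for_heading {
    my ($csv_str, $wanted) = @_;
    my $csv = Text::CSV_XS->new({ binary => 1 })
        or die "Cannot create CSV parser\n";
    # read the CSV from a string via an in-memory filehandle
    open my $fh, '<', \$csv_str or die "Cannot read CSV string: $!\n";
    while (my $cols = $csv->getline($fh)) {
        next unless @$cols >= 2;           # skip blank or short rows
        return $cols->[1] if $cols->[0] eq $wanted;
    }
    return;
}

# Sample CSV in the two-column layout, with line breaks inside quoted fields.
my $csv_str = qq{H1:,"T1.1\nT1.2\n"\nH2:,"T2.1\nT2.2\n"\n};
my $h2_text = text_for_heading($csv_str, 'H2:');
print $h2_text if defined $h2_text;
```

The `binary => 1` option matters here because the text fields contain embedded line breaks.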