in reply to Multiple Line Data from text file to SQL

My reading of your sample data would suggest you have the format:

(Unique integer). (Question)"\n" A)(Answer A)"\n" B)(Answer B)"\n" C)(Answer C)"\n" D)(Answer D)"\n"

A similar question was asked yesterday. Likely the easiest solution is to assume that a line starting with a letter and a close parenthesis is the start of the next answer. You might find regular expressions useful to write that test. I suspect the bigger challenge will be telling the end of D) apart from the id number of the next question. If you haven't thought about it yet, a hash of hashes is probably the best structure for storing the data you read.
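As a rough illustration of the line-by-line approach and the hash of hashes, here is a minimal sketch. It assumes the format guessed above; the sample questions, the sub name parse_questions, and the 'question' key are all made up for the example and do not come from your data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample in the assumed format; real data would come from a file.
my @lines = (
    '1. What is the capital of France?',
    'A) Berlin',
    'B) Paris',
    'C) Rome',
    'D) Madrid',
    '2. Which keyword declares a lexical',
    'variable in Perl?',
    'A) our',
    'B) local',
    'C) my',
    'D) var',
);

# parse_questions is an illustrative name, not something from the thread.
# Returns a reference to a hash of hashes: { id => { question, A, B, C, D } }.
sub parse_questions {
    my @input = @_;
    my (%questions, $id, $field);
    for my $line (@input) {
        next unless $line =~ /\S/;                  # skip blank lines
        if ($line =~ /^\s*(\d+)\.\s*(.*)$/) {       # "12. text" starts a new question
            ($id, $field) = ($1, 'question');
            $questions{$id}{$field} = $2;
        }
        elsif (defined $id && $line =~ /^\s*([A-D])\)\s*(.*)$/) {
            $field = $1;                            # "B) text" starts the next answer
            $questions{$id}{$field} = $2;
        }
        elsif (defined $id) {                       # anything else continues the current field
            (my $text = $line) =~ s/^\s+|\s+$//g;
            $questions{$id}{$field} .= " $text";
        }
    }
    return \%questions;
}

my $q = parse_questions(@lines);
for my $qid (sort { $a <=> $b } keys %$q) {
    print "$qid: $q->{$qid}{question}\n";
    print "  $_) $q->{$qid}{$_}\n" for qw(A B C D);
}
```

Because a new question is recognised only by digits followed by a period at the start of a line, the end of D) is found implicitly. The sketch would misfire if an answer wraps onto a line that itself begins with a number and a period, so check your data for that case.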

Good luck, and please post code if you run into development headaches.

Re^2: Multiple choice from text file to SQL
by pie2re (Initiate) on Mar 03, 2009 at 20:43 UTC
    Hi,

    Thanks for your advice and sorry for posting something already asked. The post you referenced in your reply was very helpful. But some darkness remains in my DBA's brain.

    Here is the code I've done:

    #!/usr/bin/perl
    $file = "data.txt";
    open (INPUT, "< $file");
    undef $/;
    my $string = <INPUT>;
    $string =~ s/\n//g;
    while ($string =~ /A\)(.*?)B\)/g) {
        print "<a>".$1."</a>\n";
    }
    close (INPUT);

    This prints out all the occurrences of the A) answers, as expected.

    Thanks again for your precious knowledge.

    P.

      Why limit your code to one match at a time? As long as you are capturing things, you can use 1 regex to extract all the details from a question at once. If your questions are separated by a blank line (i.e. end marked by "\n\n"), you could try something like:

      #!/usr/bin/perl
      use strict;
      use warnings;

      my $file = "data.txt";
      open (INPUT, "<", $file) or die "Cannot open $file: $!";
      undef $/;    # slurp mode: read the whole file as one string
      my $string = <INPUT>;

      # One match per question: id, question text, then answers A) to D),
      # ending at a blank line or at the end of the data.
      while ($string =~ /(\d+)\.\n(.+?)A\)(.+?)B\)(.+?)C\)(.+?)D\)(.+?)(\n\n|$)/sg) {
          my @captures = ($1, $2, $3, $4, $5, $6);
          foreach (@captures) {
              s/\n//g;    # strip embedded newlines from each field
          }
          print "<QN>$captures[0]</QN>\n";
          print "<QQ>$captures[1]</QQ>\n";
          print "<Aa>$captures[2]</Aa>\n";
          print "<Ab>$captures[3]</Ab>\n";
          print "<Ac>$captures[4]</Ac>\n";
          print "<Ad>$captures[5]</Ad>\n";
      }
      close (INPUT);

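      Some side notes: you should put use strict; and use warnings; at the top of your scripts, since they'll help prevent typos from causing you incredible misery. You should also use 3-argument open in place of the 2-argument form you used. With the 2-argument form, Perl parses the mode out of the same string as the file name, so a name that starts with > or ends with | can overwrite a file or run a command; the 3-argument form keeps the mode and the name separate.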
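
      To make the side notes concrete, here is a small sketch of the 3-argument open with a lexical filehandle and an error check. The helper name slurp_file is made up for this example; data.txt is the file name from your script.

      ```perl
      #!/usr/bin/perl
      use strict;
      use warnings;

      # slurp_file is an illustrative helper name, not something from the thread.
      sub slurp_file {
          my ($path) = @_;
          # 3-argument open: the mode '<' is separate from the file name, so
          # characters such as '>' or '|' in $path are never read as a mode.
          open(my $fh, '<', $path) or die "Cannot open '$path': $!";
          local $/;    # slurp mode, restored automatically when the sub returns
          my $content = <$fh>;
          close($fh);
          return $content;
      }

      # Only read data.txt if it is actually present.
      my $file = 'data.txt';
      print length(slurp_file($file)), " characters read from $file\n" if -e $file;
      ```

      Using local $/ inside the sub, rather than undef $/ at file scope, keeps the change to the input record separator from leaking into any later reads in the same script.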