Re: Is there any regex limitation?

This CSV file is complex enough that I think using the DBI in conjunction with the DBD::CSV module is certainly worth considering. This enables the use of standard SQL statements to extract, update, create information in the .CSV file. Later if you want to move the data to a real DB, the same code will work. For that next step, I recommend SQLite as a great DB that provides about 90% of what a "real DB" provides. I show some code below.

A few notes on the code... There are many ways to write the connect statement. I show one way and also some of the defaults on purpose. The file I made with the OP's data is DataBase.CSV - that file is a verbatim copy of the posted data. With DBD::CSV normally each file is a table so I set default extension to be .csv so that I didn't have to type that in the SQL statements. RaiseError=1 is very common and with that you don't have a bunch of "or die" statements in the code. Of course there is no username or password for a .csv file so those are undef in the connect statement.

I show how to make a simple query to print a couple of values. The DB returns a NULL string "" if there is no data for that column.

I got confused about which columns actually had data or not, so that was a good opportunity to show another type of query where I printed out the column name and value for every column that had data.

Anyway this approach does require a lot of upfront learning, but I've found it to be a very good way of dealing with CSV files. The classic book: Programming the Perl DBI" by Alligator Descartes and Tim Bunce is highly recommended.

Update:
I added an example of updating the .csv file to the previous code. Also corrected a mistake, csv_eol => "\n", the \n was in single quotes before which caused problems with the update operation! Normally, I would not even have these kind of default parameters in the code, but I wanted to demonstrate that there are many parameters that can be tweaked and show a few of them.

I do not claim that the DBI approach executes faster than the CVS_XS parser, indeed DBD::CSV uses it. The advantage of this DBI approach is that it is more flexible than other approaches. And when things grow in size or complexity, the DBI/SQL code can be directly used in a real database where it will definitely run much faster than with CVS_XS code.

The big disadvantage of the approach shown is that some knowledge of SQL and the Perl DBI is needed. The total scope of SQL and the Perl DBI is huge, but for jobs like this only about 1/2 dozen functions are needed. So learn what you need and learn more when needed.

In the code below, I use "pod" directives (like "=cut") to embed the output within the Perl code itself. I hope that this does not confuse anybody. I do it this way so that you have a runnable set of code that includes the expected output.

#!/usr/bin/perl -w
use strict;

use DBI;   ## need to have DBD::CSV installed for this script!

my $dbh = DBI->connect ("DBI:CSV:", undef, undef,  # no uname, pwd
    {
        f_dir        => './',   # default current directory
        f_ext        => ".csv", # DB files end in .csv
        csv_eol      => "\n",   # default (used for writing)
        csv_sep_char => ',',    # default of course is comma!
        RaiseError   => 1,      # errors are fatal (with description)
        PrintError   => 0,      # default is 0 not needed if RaiseErro
+r=1
                                #   non-zero prints error and continue
+s
    }
) or die "Cannot connect: " . $DBI::errstr;

my $st_basic_actions = $dbh->prepare("SELECT action,userID,Place
                        FROM DataBase
                        ORDER BY userID");
$st_basic_actions->execute();

printf "%-10s %-10s %-10s\n", 'Action','userID','Place';  
while (my ($action,$userID,$Place) = $st_basic_actions->fetchrow)
{
   $Place ||= 'unknown';  # use 'unknown' if null string, ''
   printf "%-10s %-10s %-10s\n", $action,$userID,$Place
}
=outputs  ################
Action     userID     Place     
G          CHXAS      unknown   
X          LLXEAS     Wah Woo, Section A
X          LLXEAS2    Wah Woo, Section A
=cut      ################

my $st = $dbh->prepare ("SELECT * FROM DataBase");
$st->execute();

while (my $href = $st->fetchrow_hashref())
{
   print "****\n";
   foreach my $key (sort keys %$href)
   {
      print "$key=$href->{$key}\n" if $href->{$key}; 
   }
}
=outputs ###############
****
action=X
dept2=DSA
first_name=Lamshi
footer=Welcome to Coo Duk, ltd.
nameid=LLXEAS2
oxcode=A
pc=HKA
place=Wah Woo, Section A
section=CWW
state=Important
status=NAW
subject=No vell
surname=Coo
userid=LLXEAS2
****
action=X
dept2=DSA
first_name=Lamshi
footer=Welcome to Coo Duk, ltd.
nameid=LLXEAS
oxcode=A
pc=HKA
place=Wah Woo, Section A
section=CWW
state=Important
status=NAW
subject=No vell
surname=Coo
userid=LLXEAS
****
action=G
second_mail_alias=sam.mo@hotmail.com
userid=CHXAS
=cut

# **CODE UPDATE**: adding an "update" to show modification is possible

$st = $dbh->prepare ("UPDATE DataBase 
                      SET action='Done' 
                      WHERE userID='LLXEAS2'");
my $rows_modified = $st->execute();
print "\nRows Modified from X to Done = $rows_modified\n";

## now re-running previous query, can re-use a prepared statement
## handle 

$st_basic_actions->execute();

printf "%-10s %-10s %-10s\n", 'Action','userID','Place';  
while (my ($action,$userID,$Place) = $st_basic_actions->fetchrow)
{
   $Place ||= 'unknown';  # use 'unknown' if null string, ''
   printf "%-10s %-10s %-10s\n", $action,$userID,$Place
}

##########################
=this new section of code prints:

Rows Modified from X to Done = 1
Action     userID     Place     
G          CHXAS      unknown   
X          LLXEAS     Wah Woo, Section A
Done       LLXEAS2    Wah Woo, Section A

=cut
############################
[download]

Comment on Re: Is there any regex limitation? Download Code