in reply to Re^7: String Search
in thread String Search

Marshall,

I have an idea for this. Here is the structure:

1> We can declare a structure in C matching the layout of the data.

2> Then create a linked list of that structure.

3> Map the data onto the structure, allocating memory dynamically for each structure.

4> Search the data and extract it from leaf to root.

Please tell me whether I am right or wrong.

Regards,

Replies are listed 'Best First'.
Re^9: String Search
by Marshall (Canon) on Sep 02, 2009 at 14:42 UTC
    You seem determined to use this text dump from the DB to make a CSV file for re-import. I still recommend other approaches, but here are some more thoughts for you:

    Your thinking appears to be way too complex for the job at hand! You are making a "one-off" thing. Usually the objective is just to get this one-off thing done and out of your hair. Think simple and take advantage of the details of this specific situation. Don't worry about "general purpose". I wouldn't worry about "elegant" or "fast", although simple approaches are often very fast. And to me, "straightforward" is its own kind of elegance!

    As far as creating a complex structure in either C or Perl goes, this appears to be overkill. You are going towards a "flat" one-line-per-record format. The variable names that you want are unique between "sections" (i.e. if you know the variable name, then you know what kind of sub-section it came from, and the vars look like they can only appear once per call record). Take advantage of that! Your code doesn't appear to have any need to understand the multi-level nature of the input data.

    Nothing says that you can't do this in multiple scripts or steps. This is often a good way, as it eases the debug process. If the code isn't "optimally efficient", don't worry about it! The idea is to set up a series of "filters" that progressively work towards your goal.

    So as a "first parsing step", I would do something like the code below. This makes an intermediate file that has all of the "var : value" things in each call record in a "flat" format. Fiddle with the regex until you have what you need at this step.

    Then write code such that for each call record, you initialize a hash table with the default values for each var that will go into the output line. Then for each var line in the file's CDR record, if that name tag exists in the hash, override it with the value from the file. At the end of the record, print the CSV line. A record starts with something that matches CME20CP6.CallDataRecord and ends with a blank line. Nothing is wrong with adding a blank line manually to the end of the intermediate file to make the termination condition easy.
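    In outline, that record-by-record step could look something like this (the tag names, defaults, and sample record below are made up for illustration; the real regexes have to match your actual dump format):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical tag names -- substitute the real vars from your dump.
my @csv_order = qw(callType chargingNumber duration);
my %defaults  = map { $_ => 'NA' } @csv_order;

# Stand-in for the intermediate file; note the blank line that
# terminates the record.
my @lines = (
    'CME20CP6.CallDataRecord',
    '  callType : MOC',
    '  duration : 120',
    '',
);

my %curr_record;
my @csv_lines;
for my $line (@lines) {
    if ($line =~ /CME20CP6\.CallDataRecord/) {        # record starts
        %curr_record = %defaults;                     # reset to defaults
    }
    elsif ($line =~ /^\s*(\w+)\s*:\s*(.+?)\s*$/) {    # a "var : value" line
        $curr_record{$1} = $2 if exists $curr_record{$1};
    }
    elsif ($line =~ /^\s*$/ and %curr_record) {       # blank line ends record
        push @csv_lines, join ',', map { $curr_record{$_} } @csv_order;
        %curr_record = ();
    }
}
print "$_\n" for @csv_lines;
```

    Vars that never appear in a record keep their default ('NA' here), so every CSV line comes out with the same number of fields.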

    Update: Just an example of how to implement the above strategy. @csv_order is the var names in the order that they should appear in the CSV. Now if you need, say, these "ChargingNumbers", I would make up a new name for that and "squish" them into one value in the intermediate file format, the way you want it to appear in the output CSV file. Anyway, these 2 scripts will run in just a few seconds even for a million records.

      Dear Marshall,

      Thanks a lot for your code. It works fine, but it has been very complex to process.

      Can you please help me with two things? 1. How can I run a loop that takes the data between one CME20CP6.CallDataRecord.uMTSGSMPLMNCallDataRecord and the next CME20CP6.CallDataRecord.uMTSGSMPLMNCallDataRecord, without the braces, and stores it in an array? After collecting all the data I have to convert it to decimal or binary according to the CDR logic. 2. What is the fastest way to insert a row into a table using DBI?

        Look at sub dump_csv_line() at Re^9: String Search (it's in the second code posting there). That sub, with one line in it, dumps the data as a CSV line. If you wanted a simple array, then my @array = map { $curr_record{$_} } @csv_order; would do it. I put that single line of CSV dump code in a sub all by itself to try to make its function very clear. I guess this wasn't clear enough! The code really didn't need join or map; it could have been like this:

        # another way to dump values in record
        foreach my $tag (@csv_order) {
            print "Tag = $tag value = $curr_record{$tag}\n";
        }

        It looks like you have two types of data, as per my example run: "DWLCCN6",'6CBFD7'H.... A string, or what I guess is a binary hex value. I would assume that if your DB can export the data in that way, it can import that format also! I wouldn't think that your Perl program has to do any data conversion at all!

        So for your questions:
        1. Code I gave you already processes records. That "dump subroutine" gets called once per record.
        2. "fast" is irrelevant... it's not going to make any difference considering the number of days already spent on this!
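        For what it's worth, if you do end up inserting rows from Perl, the usual DBI pattern is to prepare one statement with placeholders and execute it once per row, with AutoCommit off and a single commit at the end. A minimal sketch (the table and column names are invented; DBD::SQLite with an in-memory database is used here only to make it self-contained):

```perl
use strict;
use warnings;
use DBI;

# Connect with AutoCommit off so all inserts go in one transaction.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
                       { RaiseError => 1, AutoCommit => 0 });
$dbh->do('CREATE TABLE cdr (call_type TEXT, duration INTEGER)');

# Prepare once; placeholders avoid re-parsing the SQL for every row.
my $sth = $dbh->prepare('INSERT INTO cdr (call_type, duration) VALUES (?, ?)');
for my $row ( ['MOC', 120], ['MTC', 45] ) {
    $sth->execute(@$row);
}
$dbh->commit;    # one commit at the end, not one per row

my ($count) = $dbh->selectrow_array('SELECT COUNT(*) FROM cdr');
print "$count rows inserted\n";
```

        Placeholders also quote the values for you, so things like the 'H strings can't break the SQL.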

        I still think your best bet is to take the DB guy out for a nice lunch and have him/her write a simple DB merge thing for you. Then you are done!

Re^9: String Search
by kallol.chakra (Initiate) on Sep 12, 2009 at 07:18 UTC

    Hi,

    Can you give any help on this?

    Regards