edwardt_tril has asked for the wisdom of the Perl Monks concerning the following question:

Hi, please help! I'm really new to Perl and regexps.
I need to do string (call it field) replacement on a line.
The lines come from application logs, read from a source data file,
and I need to manipulate them.
A line may need a single replacement or several.
All fields are delimited by commas. The number of
commas is fixed, but a field can be empty (see the sample
data below). Any non-empty field can be assigned a new value.

My approach:
1. open the file
2. read the data file line by line, storing the old line in a temp variable
3. split the line on commas and store each field in a global variable
4. replace the fields in the temp line <<<< problem >>>>
So far I can get to step 3, but things get messy in step 4.

I am using $newline =~ s/$oldField/$newFieldContent/g; to do the replacement.
I put this into a subroutine: replaceLog($oldLine,$oldField,$newFieldContent)
I end up with a huge if-else statement in which every branch
does essentially the same thing,
and I can't replace multiple fields with it.
E.g. my huge if-else statement:
if ($field1 ne '' ){
replaceLog($oldline,$FIELD1,$NEWFIELD1);
} else ( ..repeat of above. only change the last 2 arguments.. )
So I have these problems:
1. The huge if-else is horrible, and the code has to grow
linearly with every new field that is added.
2. It replaces one field at a time. I cannot replace the
contents of field1, field2 and field3 this time,
field2 the next time, and field3, field8 and field50 the third time. Really inflexible.
3. When a field has hi-ASCII/DBCS content, the
replacement doesn't seem to work. Instead I get all the
character codes back.
4. When the content contains \ or \\, the replacement chokes.
Real sample data follows. Note that it contains ~, \ and \\,
that each field is delimited by a comma,
and that empty fields show up as ,,
You can copy and paste the lines below into Notepad with word wrap turned off.
230A18121606,492000,49,30170101,something here-RP6RQL,SYSTEM,,The present default action is to 0x22block0x22 communications.,0,0,167100,0,16777216,"",369033216,,497101,RAQUINO3,497300,497201,0,0,0,0,0,0,,0,167151,0,3017,C:\Program Files\something here\Anywhere\awhost32.exe,{230E5C08-649A-495C-A29C-DE43A460577F},,,,MSHOME,,8.6.0.80,,,,,,,,,,,,,,,,0,,,0,something here-RP6RQL
230A18121627,7,3,8,something here-RP6RQL,SYSTEM,,,,,,,16777216,"New virus definition file loaded. Version: 70831p.",0,,0,,,,,0,,,,,,,,,,RAQUINO3,{230E5C08-649A-495C-A29C-DE43A460577F},,,,MSHOME,,10.0.0.359,,,,,,,,,,,,,,,,0,,,,something here-RP6RQL
230A18170B03,12,4,8,something here-RP6RQL,administrator,,,,,,,16777216,"Changed value 'HKLM\SOFTWARE\Intel\LANDesk\VirusProtect6\CurrentVersion\PatternManager\LockUpdatePatternScheduling' from '1' to '0'",0,,0,,,,,0,,,,,,,,,,RAQUINO3,{230E5C08-649A-495C-A29C-DE43A460577F},,(IP)-127.0.0.1,,MSHOME,,10.0.0.359,,,,,,,,,,,,,,,,0,,,,something here-RP6RQL
230A18170E10,16,3,7,something here-RP6RQL,administrator,,,,,,,16777216,"Manual LiveUpdate failed to download Virus Definitions.",0,,0,,,,,0,,,,,,,,,,RAQUINO3,{230E5C08-649A-495C-A29C-DE43A460577F},,(IP)-127.0.0.1,,MSHOME,,10.0.0.359,,,,,,,,,,,,,,,,0,,,,something here-RP6RQL
230A19161A38,45,4,14,something here-RP6RQL,administrator,,,,,,,65536,"C:\Program Files\Common Files\something here Shared\SPBBC\SPBBCSvc.exe",0,,0,301 2656 D:\\DOCUME~1\\ALLUSE~1\\APPLIC~1\\AOLDOW~1\\TRITON~2.3\\setup.exe 12 2364 C:\\Program Files\\Common Files\\something here Shared\\SPBBC\\SPBBCSvc.exe C:\\Program Files\\Common Files\\something here Shared\\12345\\12345.exe 0 1,,,,0,,,,,,,,,,,{2D7BD59B-DF46-41BB-B2AA-E893B8D41370},,(IP)-10.0.0.10,,MSHOME,,,,,,,,,,,,,,,,,,0,,,,something here-RP6RQL
230A19161A38,45,4,14,something here-RP6RQL,administrator,,,,,,,65536,"C:\PROGRA~1\SYMANT~2\SYMANT~2\VPTray.exe",0,,0,301 2656 D:\\DOCUME~1\\ALLUSE~1\\APPLIC~1\\AOLDOW~1\\TRITON~2.3\\setup.exe 10 2388 C:\\PROGRA~1\\SYMANT~2\\SYMANT~2\\VPTray.exe C:\\PROGRA~1\\SYMANT~2\\SYMANT~2\\VPTray.exe 0 1,,,,0,,,,,,,,,,,{2D7BD59B-DF46-41BB-B2AA-E893B8D41370},,(IP)-10.0.0.10,,MSHOME,,,,,,,,,,,,,,,,,,0,,,,something here-RP6RQL
230A1A16271E,492000,49,30170101,something here-RP6RQL,SYSTEM,,Rule 0x22Implicit block rule0x22 blocked (something here-RP6RQL(10.0.0.10)0x2c1040).,0,0,167103,16777343,16777216,"",68158480,,497101,RAQUINO3,497301,497201,0,167772170,0,0,0,0,,0,167151,0,3017,N/A,{2D7BD59B-DF46-41BB-B2AA-E893B8D41370},,(IP)-10.0.0.10,,MSHOME,,8.6.0.80,,,,,,,,,,,,,,,,0,,,0,something here-RP6RQL

Re: how to so this string replacement elegantly
by Zaxo (Archbishop) on Dec 01, 2005 at 08:02 UTC

    It's not too clear just what kind of transformations you want to do, but the data looks like the fields are different enough to each take completely different treatment.

    How about making an array of subroutines, each handling the transformations for the field with the corresponding index?

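    my @xforms = (
        sub { $_[0] = $_[0] || 1; },          # field 0: default to 1 when empty or false
        sub { $_[0]; },                        # field 1: dummy, no transform
        sub {1},                               # field 2: no-op
        sub { $_[0] =~ s/foo/bar/; $_[0]; },   # field 3: edit in place via the @_ alias
        sub {1}, sub {1}, sub {1}, sub {1},    # fields 4 .. 7: no-ops
    );

    while (<DATA>) {
        my @tmp = split ',';
        for (0 .. $#tmp) {
            next unless $xforms[$_];           # no transform registered for this field
            $xforms[$_]->($tmp[$_]);           # $_[0] aliases $tmp[$_], so edits stick
        }
        print join ',', @tmp;
    }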
    So long as you don't use code tags, I can't tell where your data lines end or guess what you might want to do to them.

    After Compline,
    Zaxo
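
A variation on Zaxo's table-of-subs idea, as a minimal sketch rather than a finished solution: the table is keyed by field index, so any subset of fields can be rewritten in one pass and adding a field means adding one table entry. The name rewrite_line, the indexes and the replacement values are all hypothetical. The sketch splits on every comma, so it assumes no quoted field contains a comma.

```perl
use strict;
use warnings;

# Rewrite selected fields of one comma-delimited log line.
# $transforms maps a zero-based field index to a code ref that
# receives the old value and returns the new one.
sub rewrite_line {
    my ($line, $transforms) = @_;
    chomp $line;
    my @fields = split /,/, $line, -1;    # -1 keeps trailing empty fields
    for my $i (sort { $a <=> $b } keys %$transforms) {
        next if $i > $#fields;            # line has no such field
        next if $fields[$i] eq '';        # empty fields stay empty
        $fields[$i] = $transforms->{$i}->($fields[$i]);
    }
    return join ',', @fields;
}

# Hypothetical replacement table; indexes and new values are made up.
my %transform_for = (
    1 => sub { $_[0] + 1 },
    2 => sub { 'never used here' },   # field 2 is empty below, so it is skipped
    3 => sub { lc $_[0] },
);

print rewrite_line('230A18121606,492000,,SYSTEM,,', \%transform_for), "\n";
# prints: 230A18121606,492001,,system,,
```

Because each field is assigned by index, the old content is never used as a pattern, so backslashes in the data are harmless. If you keep the s/// approach instead, writing the pattern as \Q$oldField\E stops backslashes and other metacharacters in the old value from being read as regex syntax.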

Re: how to so this string replacement elegantly
by serf (Chaplain) on Dec 01, 2005 at 07:59 UTC
    Hi edwardt_tril,

    are you able to be more specific about how you want to modify the lines please?

    It's good that you've provided your approach, but it's quite possible that if everyone were looking at the problem from a bit further back then we would see a different way of approaching the challenge which would not run into the same problems you have stopped on.

    Your aim I'm sure is to create some functionality, not to get a specific piece of code to work.

    Having said that, could you possibly provide a bit more of your code if that will help us to understand what you're trying to achieve?

    You haven't given enough code yet for us to see where it's breaking specifically :o)

Re: how to so this string replacement elegantly
by tphyahoo (Vicar) on Dec 01, 2005 at 12:20 UTC
    1. I think you're likely to get better help if you ask the question more clearly. Check out How (Not) To Ask A Question, think about it, and maybe clean up your post. If your post is old by then and doesn't get any answers after the cleanup, you could post a new question with a short summary of the original and a link back to it. (Updates don't show up in Newest Nodes, but they do show up in Recently Active Threads, so updating alone might get you attention and help if you are patient.)

    2. Looking at your data, it seems to me that you have CSV. A regex is maybe not the best option here. For my own CSV parsing I personally use Text::xSV, so I would have a look at that as well.

    Hope this helps, and good luck.
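
To illustrate why tphyahoo's CSV remark matters: a plain split on commas breaks as soon as a quoted field contains a comma. The sketch below is a hand-rolled, quote-aware splitter in core Perl only. It does not use Text::xSV or reproduce its API, and a real CSV module will handle more cases. The name split_csv_fields and the sample line are made up for illustration.

```perl
use strict;
use warnings;

# Split one CSV line into raw fields. A field is either a double-quoted
# string (with "" standing for a literal quote) or a run of characters
# containing neither comma nor quote. Quotes are kept in the returned
# text so that join(',', @fields) reproduces the original line.
sub split_csv_fields {
    my ($line) = @_;
    chomp $line;
    my $rest = "$line,";    # sentinel comma: every field now ends with ','
    my @fields;
    while ($rest =~ /\G("(?:[^"]|"")*"|[^,"]*),/gc) {
        push @fields, $1;
    }
    # With /c, pos() survives the failed final match; stopping short of
    # the end of the string means the input could not be parsed.
    die "cannot parse line: $line\n"
        unless defined pos($rest) && pos($rest) == length $rest;
    return @fields;
}

my @fields = split_csv_fields('16777216,"Version: 1, build 2",0,,');
print scalar(@fields), " fields; second is $fields[1]\n";
# prints: 5 fields; second is "Version: 1, build 2"
```

Fields come back verbatim, so they can be replaced by index and joined with commas again without disturbing the quoting of the untouched ones.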