
I tried. It still gives me no matches. I also tried adding another "THE_END" line at the top of my file, and I tried "THE_END\n", "\nTHE_END", "\rTHE_END" and "THE_END\r".

I wonder whether the regexes, the way they are written, are suitable for this $/ approach. They look like this:

(?^:^UT A19(?:7(?:0G990800007|6CQ89200006)|8(?:0JW32900007|2PN88100001 +)|90DD63700001))

Each one only addresses "UT" and the number code, nothing afterwards. So if everything from "UT xxxxx" to "THE_END" is treated as one line, maybe they don't match?

Re^9: Write to multiple files according to multiple regex
by BillKSmith (Monsignor) on Jul 22, 2015 at 18:02 UTC

    You probably have two problems which interact. First, you must parse the input file into blocks. Second, you must process each block with the regular expressions from the other files.

    Let's address just the parsing problem. My code (with $/ = "THE_END\n") correctly parses your sample data. If that sample is accurate, the code will parse the real data too. If there is a problem, as you have guessed, it almost certainly has to do with whitespace at the end of the block.
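
    For readers following along, here is a minimal sketch of what reading by block with $/ looks like. It is not the code from the earlier post in this thread; the sample records and the in-memory filehandle are assumptions made so that the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical sample data: two blocks, each terminated by a THE_END line.
my $sample = "UT A1970G990800007\nPY 1970\nTHE_END\n"
           . "UT A1980JW32900007\nPY 1980\nTHE_END\n";

# Read from an in-memory filehandle so the sketch needs no file on disk.
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";

my @blocks;
{
    local $/ = "THE_END\n";    # one read now returns one whole block
    while ( my $block = <$fh> ) {
        chomp $block;          # chomp strips the current $/, i.e. "THE_END\n"
        push @blocks, $block;
    }
}
close $fh;

printf "%d blocks read\n", scalar @blocks;
print "first line of block 1: ", ( split /\n/, $blocks[0] )[0], "\n";
```

    Because $/ is compared byte for byte, a trailing space or a "\r" after THE_END in the real file would keep the separator from matching, and several blocks would then arrive glued together as one.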

    Use a text editor on your copy of the sample data file. Verify that between the last data digit in one block and the first data digit of the next block we find "\nTHE_END\n" (and absolutely nothing else!). Do the same for live data. Check several pairs of blocks just to be sure. Let me know what happened.
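
    If the editor does not show invisible characters, a few lines of Perl can make them visible instead. This is only a sketch: show_invisible is a hypothetical helper, and the boundary string is made-up data with a stray "\r" and a trailing space of the kind that would break the separator.

```perl
use strict;
use warnings;

# Printable stand-ins for the characters that usually hide in a block boundary.
my %visible = ( "\r" => '\r', "\n" => '\n', "\t" => '\t', ' ' => '<SP>' );

# Hypothetical helper: return a copy of the text with whitespace spelled out.
sub show_invisible {
    my ($text) = @_;
    $text =~ s/([\r\n\t ])/$visible{$1}/g;
    return $text;
}

# Made-up boundary between two blocks: note the "\r" and the space after THE_END.
my $boundary = "1970\r\nTHE_END \nUT";

print show_invisible($boundary), "\n";
# prints: 1970\r\nTHE_END<SP>\nUT
```

    Anything other than a bare \nTHE_END\n between the two blocks shows up immediately in that output.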

    If that check passes, you are almost certainly parsing correctly, but have a problem with the processing. Again, the problem very likely has to do with whitespace. I really cannot offer any more help without having a real regex and a data block that you expect to match it.

    Bill

      Thanks, this pointed me in the right direction. The exact problem was that my regexes for the start of the block began with a beginning-of-line anchor "^". That runs correctly on the sample data I gave you. In my original data, however, there are sometimes additional lines between "THE_END" and the next "UT ..." line, so after splitting the input at "THE_END" the next block does not start ("^") with "UT". I assembled my regexes without the ^ and now it seems to work fine.

      I will also post an update in my original question and sum up all the key elements.
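
      A small sketch of the effect described above, using a made-up block with one extra line before the "UT" line. Without the /m modifier, ^ matches only at the very start of the string, so the anchored pattern fails on such a block. The /m variant is shown as a possible alternative to dropping the anchor, not as what was actually used here.

```perl
use strict;
use warnings;

# Hypothetical block: an extra line sits between the separator and the "UT" line.
my $block = "some extra line\nUT A1970G990800007\nPY 1970\n";

my $anchored   = qr/^UT A1970G990800007/;     # ^ means start of the whole string
my $multiline  = qr/^UT A1970G990800007/m;    # with /m, ^ also matches after a newline
my $unanchored = qr/UT A1970G990800007/;      # no anchor at all

for my $case ( [ anchored   => $anchored ],
               [ multiline  => $multiline ],
               [ unanchored => $unanchored ] ) {
    my ( $name, $re ) = @$case;
    printf "%-10s %s\n", $name, ( $block =~ $re ? 'match' : 'no match' );
}
# prints:
# anchored   no match
# multiline  match
# unanchored match
```

      The /m form still requires "UT" to begin a line, which may be preferable if the same text could also occur in the middle of another line.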

      cheers