in reply to Re^5: Parsing .txt into arrays
in thread Parsing .txt into arrays

I'm sorry, I didn't realize you had already sent the solution in the earlier comment. I was working on something else based on your code: I want to print lines only from the point where I find the keyword until the table starts. By modifying your previous code I was able to perform my required operation on the output, but I'd like to optimize it by printing only the lines from the keyword onward (including the line with the keyword). Your code prints all the lines above the table, but I need only those after the keyword. Thank you.
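
The request above (keep only the lines from the keyword line up to the start of the table) can be sketched with a single flag. This is a minimal illustration, not the code from the earlier reply: the helper name lines_after_keyword, the sample lines, and the rule that a table row is any line starting with '|' are all assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines from the first one matching
# $keyword (inclusive) up to, but not including, the first table row
# that follows it.
# Assumption: a table row is any line whose first non-blank character is '|'.
sub lines_after_keyword {
    my ($keyword, @lines) = @_;
    my $seen = 0;    # becomes true once the keyword line has been found
    my @kept;
    for my $line (@lines) {
        last if $seen && $line =~ /^\s*\|/;              # table starts: stop
        $seen = 1 if !$seen && $line =~ /\Q$keyword\E/;  # keyword found
        push @kept, $line if $seen;
    }
    return @kept;
}

# Invented sample input, only for demonstration.
my @sample = (
    "some preamble",
    "2017 Position log :Fp379",
    "place: cal",
    "time: 23:01:45",
    "| 0| 1| 302832|",
);
print "$_\n" for lines_after_keyword('Position log', @sample);
```

Lines before the keyword are skipped because nothing is kept until the flag is set, and the loop stops at the first table row seen after that point.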

Re^7: Parsing .txt into arrays
by Marshall (Canon) on Jun 07, 2017 at 07:35 UTC
      Hi Marshall, I was hoping not to seek your help unless it was absolutely necessary. I figured out how to transpose and replicate values for empty columns, but when I put this to the test I ran into a few practical issues with the code you modified for me:
      2017 Position log :Fp379 place: cal time: 23:01:45
      | | |Pos |value | |bulk|lot| prev| newest|
      |# |Locker|(dfg) |(no) |nul|val |Id | val |val |
      -----------------------------------------------------------
      | 0| 1| 302832| -11.88| 1| 0|Pri| 16| 0|
      | 1| 9| 302836| 11.88| 9| 0|Pri| 10| 0|
      | 2| 1| 302832| -11.88| 5| 3|Pri| 14| 4|
      | 3| 3| 302833| 11.88| 1| 0|sec| 12| 0|
      | 4| 6| 302837| -11.88| 1| 0|Pri| 16| 3|
      language data: time= 24hrs
      |no.| name | languages | proficiency | time taken|
      |_ _| _ _ _| _ _ _ _ _ |_ _ _ _ _ _ _| _ _ _ _ _ |
      |1 | eliz | English | good | 24 hrs |
      |2 | susan| Spanish | good | 13 hrs |
      |3 | danny| Italian | decent | 21 hrs |
      _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
      |no.| name | age | place | year |
      |_ _|_ _ _ _|_ _ _ | _ _ _ | _ _ |
      |1 | sue |33 | NY | 2015 |
      |2 | mark |28 | cal | 2106 |
      2017 Review log :Gt149 place: NY time: 13:31:15
      _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
      |no.| name |level | dist | year |
      |_ _|_ _ _ _|_ _ _ | _ _ _ | _ _ |
      |1 | sue |96 | Gl | 2015 |
      |2 | mark |67 | Yt | 2106 |
      The problem is that there isn't just one table under a header; there are multiple tables (of different sizes) under each header, as shown. By observation I found that a page (header + tables) ends only when the next line contains the year (2017). The Position log here has 3 tables, each with its own header, and it ends only when we see the next "2017" (Review log). PS: a page has at most 4 tables.
        Ok, you are transforming one data representation into another, a very common task for Perl. In your data above, I see that there are 2 collections of data, "2017 Position log :Fp379" which contains 3 tables and "2017 Review log :Gt149" which contains one table. What format does the data need to be in for whatever program consumes the output of your program? In other words, what happens to your output? Where does it go and what does that thing that it goes to do with it?

        There are many ways to express the idea that N tables belong to a single "record". Ultimately what you generate will need to be parsed and understood by something else. Can you explain more?

        Update: I think that the subroutine finish_current_table(), which decides which tables to "keep", would need to be modified, perhaps with a state flag variable that indicates that we are within some 2017 year record. You keep talking about "pages". If you mean that these "pages" are separated by a form feed (\f), that could potentially simplify the parsing situation: we could read an entire page at a time, then decide whether or not to keep the tables on that page. I personally don't like code or formats that depend upon "invisible" characters like \t or \f, but this could potentially help to simplify the code. I am unsure. In any event, your code appears to be intended to transform a human-understandable thing into a computer-understandable thing. More detail about what this "computer-understandable thing" is would be appropriate.
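
        To make the state-flag idea concrete, here is a rough sketch that groups every table under the "2017 ..." header that precedes it. It stands apart from finish_current_table(), whose body is not shown here. The sub name group_tables and the shortened sample data are hypothetical, and the line rules are assumptions read off the example above: a record starts on a line beginning with a 4-digit year, table rows start with '|', a line of only hyphens sits inside a table, and any other line (a caption or a "_ _ _" rule) ends the current table.

```perl
use strict;
use warnings;

# Hypothetical sketch: collect tables per "year" record.
# $record is the state: it is set while we are inside a 2017 record.
sub group_tables {
    my @lines = @_;
    my (@records, $record, $table);
    for my $line (@lines) {
        next if $line =~ /^\s*-+\s*$/;             # hyphen rule inside a table
        if ($line =~ /^\s*\d{4}\s/) {              # new record header
            $record = { header => $line, tables => [] };
            push @records, $record;
            undef $table;
        }
        elsif ($record && $line =~ /^\s*\|/) {     # a row of a table
            unless ($table) {                      # first row: open a new table
                $table = [];
                push @{ $record->{tables} }, $table;
            }
            push @$table, $line;
        }
        else {                                     # caption, "_ _ _" rule, blank
            undef $table;
        }
    }
    return @records;
}

# Shortened, invented sample with the same shape as the data above.
my @sample = split /\n/, <<'END';
2017 Position log :Fp379 place: cal time: 23:01:45
|# |Locker|
-----------
| 0|     1|
language data: time= 24hrs
|no.| name |
|1  | eliz |
_ _ _ _ _ _
|no.| name |
|1  | sue  |
2017 Review log :Gt149 place: NY time: 13:31:15
_ _ _ _ _ _
|no.| name |
|1  | sue  |
END

for my $rec (group_tables(@sample)) {
    printf "%s: %d table(s)\n", $rec->{header}, scalar @{ $rec->{tables} };
}
```

        On this sample the loop reports 3 tables for the Position log record and 1 for the Review log record. Each record is a hash with the header line and an array of tables, so whatever consumes the output can decide per record which tables to keep.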

        Update: with your extra example DATA: