
Hi Marshall, thanks for the reply. Yes, I need exactly what your code produces, but with a few changes. First, let me explain what I'm doing and what output I expect. I have a text file in the format I uploaded (dummy data, but the same layout); the real data comes from a real-time experiment that I want to analyze with plots. I want Perl to remove the delimiters, extract particular pages, and write them to a text file, which I will then parse in another language (MATLAB) to extract the values I need.

Back to the modification I need: the code prints all the tables (awesome), but I need only a specific page and, if possible, each table in a separate array (explanation below).

TABLE: place and year data: 67
Record_Start: 3
Record_End: 10
COLUMNS: no.,name,age,place,year
1,sue,33,NY,2015
2,mark,28,cal,2106

TABLE: work and language :65
Record_Start: 12
Record_End: 19
COLUMNS: no.,name,languages,proficiency,time taken
1,eliz,English,good,24 hrs
2,susan,Spanish,good,13 hrs
3,danny,Italian,decent,21 hrs

TABLE: Position log
Record_Start: 20
Record_End: 30
COLUMNS: #,Locker,Pos (dfg),value (no),nul,bulk val,lot Id,prev val,newest val
0,1,302832,-11.88,1,0,Pri,16,0
1,9,302836,11.88,9,0,Pri,10,0
2,1,302832,-11.88,5,3,Pri,14,4
3,3,302833,11.88,1,0,sec,12,0
4,6,302837,-11.88,1,0,Pri,16,3

TABLE: 2017 Position log :Fp379
place: cal
time: 23:01:45
Record_Start: 31
Record_End: 44
COLUMNS: #,Locker,Pos (dfg),value (no),nul,bulk val,lot Id,prev val,newest val
0,1,302832,-11.88,1,0,Pri,16,0
1,9,302836,11.88,9,0,Pri,10,0
2,1,302832,-11.88,5,3,Pri,14,4
3,3,302833,11.88,1,0,sec,12,0
4,6,302837,-11.88,1,0,Pri,16,3

TABLE: Position log table 1349F.63
time  10:23:66
sequence = 39
range = 6678
Record_Start: 46
Record_End: 67
COLUMNS: #,Locker,Pos (dfg),value (no),nul,bulk val,lot Id,prev val,newest val
0,1,302832,-11.88,1,0,Pri,16,0
5,,,,,,,,
6,,,,,,,,
7,,,,,,,,
1,9,302836,11.88,9,0,Pri,10,0
2,1,302832,-11.88,5,3,Pri,14,4
5,,,,,,,,
6,,,,,,,,
7,,,,,,,,
3,3,302833,11.88,1,0,sec,12,0
4,6,302837,-11.88,1,0,Pri,16,3

TABLE: 2017 Position log :Fp379
place: cal
time: 23:01:45
Record_Start: 69
Record_End: 82
COLUMNS: #,Locker,Pos (dfg),value (no),nul,bulk val,lot Id,prev val,newest val
0,1,302832,-11.88,1,0,Pri,16,0
1,9,302836,11.88,9,0,Pri,10,0
2,1,302832,-11.88,5,3,Pri,14,4
3,3,302833,11.88,1,0,sec,12,0
4,6,302837,-11.88,1,0,Pri,16,3

TABLE: language data:
time= 24hrs
Record_Start: 83
Record_End: 90
COLUMNS: no.,name,languages,proficiency,time taken
1,eliz,English,good,24 hrs
2,susan,Spanish,good,13 hrs
3,danny,Italian,decent,21 hrs

TABLE: Record_Start: 91
Record_End: 95
COLUMNS: no.,name,age,place,year
1,sue,33,NY,2015
2,mark,28,cal,2106

TABLE: 2017 Review log :Gt149
place: NY
time: 13:31:15
Record_Start: 96
Record_End: 104
COLUMNS: no.,name,level,dist,year
1,sue,96,Gl,2015
2,mark,67,Yt,2106

TABLE: Record_Start: 105
Record_End: 111
COLUMNS: no.,name,age,place,year
1,sue,33,NY,2015
2,mark,28,cal,2106

TABLE: work and language :65
Record_Start: 113
Record_End: 119
COLUMNS: no.,name,languages,proficiency,time taken
1,eliz,English,good,24 hrs
2,susan,Spanish,good,13 hrs
3,danny,Italian,decent,21 hrs

Above is what your code produces, but what I need is only the following, given the keyword Fp379:


TABLE: 2017 Position log :Fp379
place: cal
time: 23:01:45
Record_Start: 69
Record_End: 82
COLUMNS: #,Locker,Pos (dfg),value (no),nul,bulk val,lot Id,prev val,newest val
0,1,302832,-11.88,1,0,Pri,16,0
1,9,302836,11.88,9,0,Pri,10,0
2,1,302832,-11.88,5,3,Pri,14,4
3,3,302833,11.88,1,0,sec,12,0
4,6,302837,-11.88,1,0,Pri,16,3

TABLE: language data:
time= 24hrs
Record_Start: 83
Record_End: 90
COLUMNS: no.,name,languages,proficiency,time taken
1,eliz,English,good,24 hrs
2,susan,Spanish,good,13 hrs
3,danny,Italian,decent,21 hrs

TABLE: Record_Start: 91
Record_End: 95
COLUMNS: no.,name,age,place,year
1,sue,33,NY,2015
2,mark,28,cal,2106

Moreover, it would make my parsing easier if each of these tables went into its own array, since I'm also converting the header fields into extra columns of the array, e.g. a column named "record start" in which every row of a given table has the same value.
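To make the "header as extra columns" idea concrete, here is a rough sketch of what I have in mind, not working code from my script. The name `table_to_rows` is made up, and I'm assuming that each table arrives as a list of chomped lines, that only Record_Start and Record_End need copying, and that the COLUMNS line is unbroken in the real file.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the lines of one "TABLE:" block into a list of
# column names and an array of rows. Record_Start and Record_End are
# appended to every row as two extra columns.
# Assumption: lines are already chomped and COLUMNS fits on one line.
sub table_to_rows {
    my @lines = @_;
    my (%header, @columns, @rows);

    for my $line (@lines) {
        if ($line =~ /^COLUMNS:\s*(.*)/) {
            @columns = split /,/, $1;
        }
        elsif (@columns) {
            # Past the COLUMNS line: every non-blank line is a data row.
            next unless $line =~ /\S/;
            # Limit -1 keeps trailing empty fields such as "5,,,,,,,,".
            push @rows, [ split /,/, $line, -1 ];
        }
        elsif ($line =~ /\b(Record_Start|Record_End):\s*(\d+)/) {
            # Also matches the "TABLE: Record_Start: 91" form.
            $header{$1} = $2;
        }
    }

    my @extra = ('Record_Start', 'Record_End');
    push @columns, @extra;
    for my $row (@rows) {
        push @$row, map { $header{$_} // '' } @extra;
    }
    return (\@columns, \@rows);
}

# Small demonstration on one of the dummy tables.
my ($demo_cols, $demo_rows) = table_to_rows(
    'TABLE: language data:',
    'time= 24hrs',
    'Record_Start: 83',
    'Record_End: 90',
    'COLUMNS: no.,name,languages,proficiency,time taken',
    '1,eliz,English,good,24 hrs',
);
print join(',', @$demo_cols), "\n";
print join(',', @$_), "\n" for @$demo_rows;
```

Each table would get its own pair of array references this way, so I could write them out one after another for MATLAB.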

Sadly, there is no form feed separating the pages. There is a double blank line between pages, but the same separator also appears between some tables, so I think the end of a page can only be found by seeing the year in the next line (the start of the next page).

One more important thing: the program takes a long time to produce output given the size of the file (almost 1.5 GB). Wouldn't it speed things up if we read only the required pages? It is taking me more than 5 minutes to parse a single extension. I'd be very grateful if you could suggest a way to reduce the time.

All I want is to look for a keyword and then print all the headers and tables (as you are doing now) until a year appears in the next line.
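Roughly, I imagine something like the sketch below. It is only my guess at the logic, with a made-up name `extract_pages`, and it assumes that a page always starts at a "TABLE:" line whose title begins with a four-digit year and that the keyword sits on that same line. It reads one line at a time, so the whole 1.5 GB file never has to be held in memory; whether that is actually faster than the current code I have not measured.

```perl
use strict;
use warnings;

# Hypothetical helper: copy every page whose first line contains $keyword
# from filehandle $in to filehandle $out.
# Assumption: a page starts at a "TABLE:" line whose title begins with a
# four-digit year, e.g. "TABLE: 2017 Position log :Fp379".
sub extract_pages {
    my ($in, $out, $keyword) = @_;
    my $printing = 0;

    while (my $line = <$in>) {
        if ($line =~ /^TABLE:\s*\d{4}\b/) {
            # New page: switch printing on only if the keyword is present.
            $printing = index($line, $keyword) >= 0;
        }
        print {$out} $line if $printing;
    }
    return;
}

# Usage sketch (file names are placeholders, not my real files):
# open my $in,  '<', 'input.txt' or die "input.txt: $!";
# open my $out, '>', 'pages.txt' or die "pages.txt: $!";
# extract_pages($in, $out, 'Fp379');
```

Note that this prints every page matching the keyword, not just one of them, and tables without a year in the title (such as "TABLE: Record_Start: 91") stay attached to the page before them.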

In reply to Re^10: Parsing .txt into arrays by Fshah
in thread Parsing .txt into arrays by Fshah
