OK. I am a newbie and I have been tasked with parsing data from a file. I have two versions of the script, and neither seems to work. I am trying to copy the last numeric value from every station that has PM2.5 data, but only from the set (there are always two for each variable) that has 24 hourly averages, not the one with 13 averaging periods. I have included the complete file that I have to parse, along with both scripts (labeled code1 and code 2 below). Please help!!

This is what the output file should look like:

PM2.5 21-9-2010 22:4:49 (dd-mm-yyyy hrs:min:sec)
KA5 4
OV20 10
DH1 2
PA16 8
MV17 0
HL11 3
KN12 17
PC 4
KH19 0
SI2 8
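The header line in the sample output could be built roughly as follows. This is only a sketch: it assumes the timestamp is the current local time, printed without zero padding as in the sample, and `format_header` is a made-up helper name that appears in neither script.

```perl
use strict;
use warnings;

# Build a header such as "PM2.5 21-9-2010 22:4:49 (dd-mm-yyyy hrs:min:sec)".
# format_header() is a hypothetical helper, not part of code1 or code 2.
sub format_header {
    my ($variable, $epoch) = @_;
    my ($sec, $min, $hour, $mday, $mon, $year) = localtime($epoch);
    # localtime returns a 0-based month and the year counted from 1900
    return sprintf "%s %d-%d-%d %d:%d:%d (dd-mm-yyyy hrs:min:sec)",
        $variable, $mday, $mon + 1, $year + 1900, $hour, $min, $sec;
}

print format_header('PM2.5', time), "\n";
```

Note that the month needs `+ 1` and the year needs `+ 1900`; using the raw `localtime` fields would give the wrong date.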

This is what the input file looks like: BEGIN_FILE FORMAT_VERSION,2 AGENCY,HI1 FILENAME,090913.HI1 DATA_VERSION,201009091310 TZONE,HST,10 BEGIN_GROUP VARIABLE,CO DATA_TYPE,POINT MEASUREMENT_TYPE,SAMPLE CHARACTERISTIC,OBSERVED START_DTG,201009080000 END_DTG,201009082359 INTERVAL,60 START_REF,0 NUMSTEPS,24 AVG_TIME,60 UNITS,PPM STATIONS,2 BEGIN_DATA KA5,150030010,0.2,0.2,0.2,0.2,0.2,0.2,-999,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2 KA5,150030010,G,G,G,G,G,G,B,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G DH1,150031001,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.6,0.6,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0. 5,0.5,0.5 DH1,150031001,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G END_DATA END_GROUP BEGIN_GROUP VARIABLE,CO DATA_TYPE,POINT MEASUREMENT_TYPE,SAMPLE CHARACTERISTIC,OBSERVED START_DTG,201009090000 END_DTG,201009091309 INTERVAL,60 START_REF,0 NUMSTEPS,13 AVG_TIME,60 UNITS,PPM STATIONS,2 BEGIN_DATA KA5,150030010,0.2,0.2,0.2,0.2,0.2,0.2,-999,0.3,0.2,0.2,0.2,0.2,0.2 KA5,150030010,G,G,G,G,G,G,B,G,G,G,G,G,G DH1,150031001,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5 DH1,150031001,G,G,G,G,G,G,G,G,G,G,G,G,G END_DATA END_GROUP BEGIN_GROUP VARIABLE,NO2 DATA_TYPE,POINT MEASUREMENT_TYPE,SAMPLE CHARACTERISTIC,OBSERVED START_DTG,201009080000 END_DTG,201009082359 INTERVAL,60 START_REF,0 NUMSTEPS,24 AVG_TIME,60 UNITS,PPM STATIONS,2 BEGIN_DATA KA5,150030010,0.001,0,0,0,0.001,0.004,-999,0.004,0.005,0.003,0.003,0.002,0.002,0.002,0.002,0.002,0.001,0.002,0.001,0.002,0.002,0.002,0.001, 0 KA5,150030010,G,G,G,G,G,G,B,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G WB6,150030011,0,0,0,0,0,0,-999,0,0.002,0.002,0,0.001,0,0,0,0,0,0,0,0,0,0,0,0 WB6,150030011,G,G,G,G,G,G,B,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G END_DATA END_GROUP BEGIN_GROUP VARIABLE,NO2 DATA_TYPE,POINT MEASUREMENT_TYPE,SAMPLE CHARACTERISTIC,OBSERVED START_DTG,201009090000 END_DTG,201009091309 INTERVAL,60 START_REF,0 NUMSTEPS,13 AVG_TIME,60 UNITS,PPM STATIONS,2 BEGIN_DATA 
KA5,150030010,0,0,0.001,0.001,0.003,0.011,-999,0.009,0.004,0.002,0.002,0.002,0.003 KA5,150030010,G,G,G,G,G,G,B,G,G,G,G,G,G WB6,150030011,0,0,0,0,0,0.002,-999,0.005,0.002,0,0.001,0,0 WB6,150030011,G,G,G,G,G,G,B,G,G,G,G,G,G END_DATA END_GROUP BEGIN_GROUP VARIABLE,OZONE DATA_TYPE,POINT MEASUREMENT_TYPE,SAMPLE CHARACTERISTIC,OBSERVED START_DTG,201009080000 END_DTG,201009082359 INTERVAL,60 START_REF,0 NUMSTEPS,24 AVG_TIME,60 UNITS,PPM STATIONS,1 BEGIN_DATA SI2,150031004,0.013,0.014,0.013,0.013,0.013,0.01,0.009,0.007,0.011,-999,0.021,0.019,0.019,0.018,0.019,0.017,0.018,0.018,0.016,0.009,0.014,0.017,0.017,0.017 SI2,150031004,G,G,G,G,G,G,G,G,G,K,G,G,G,G,G,G,G,G,G,G,G,G,G,G END_DATA END_GROUP BEGIN_GROUP VARIABLE,OZONE DATA_TYPE,POINT MEASUREMENT_TYPE,SAMPLE CHARACTERISTIC,OBSERVED START_DTG,201009090000 END_DTG,201009091309 INTERVAL,60 START_REF,0 NUMSTEPS,13 AVG_TIME,60 UNITS,PPM STATIONS,1 BEGIN_DATA SI2,150031004,0.016,0.017,0.016,0.01,0.014,0.011,0.006,0.009,0.017,0.018,0.02,0.022,0.02 SI2,150031004,G,G,G,G,G,G,G,G,G,G,G,G,G END_DATA END_GROUP BEGIN_GROUP VARIABLE,PM10 DATA_TYPE,POINT MEASUREMENT_TYPE,SAMPLE CHARACTERISTIC,OBSERVED START_DTG,201009080000 END_DTG,201009082359 INTERVAL,60 START_REF,0 NUMSTEPS,24 AVG_TIME,60 UNITS,UG/M3 STATIONS,4 BEGIN_DATA KA5,150030010,3,5,9,7,4,9,11,24,26,28,22,20,13,18,13,18,11,9,7,3,1,2,5,6 KA5,150030010,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G WB6,150030011,3,2,2,6,10,7,3,5,16,9,7,9,16,14,11,8,7,6,7,6,5,4,5,4 WB6,150030011,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G DH1,150031001,10,11,10,7,8,7,8,6,5,5,-999,-999,6,8,8,8,9,9,9,16,10,8,7,7 DH1,150031001,G,G,G,G,G,G,G,G,G,G,B,B,G,G,G,G,G,G,G,G,G,G,G,G PC,150032004,19,10,18,12,8,7,12,24,11,12,12,8,18,10,9,8,8,10,11,11,12,13,15,13 PC,150032004,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G END_DATA END_GROUP BEGIN_GROUP VARIABLE,PM10 DATA_TYPE,POINT MEASUREMENT_TYPE,SAMPLE CHARACTERISTIC,OBSERVED START_DTG,201009090000 END_DTG,201009091309 INTERVAL,60 START_REF,0 NUMSTEPS,13 
AVG_TIME,60 UNITS,UG/M3 STATIONS,4 BEGIN_DATA KA5,150030010,7,9,11,9,7,8,22,30,14,26,20,19,15 KA5,150030010,G,G,G,G,G,G,G,G,G,G,G,G,G WB6,150030011,3,8,5,3,6,6,8,10,18,11,9,9,9 WB6,150030011,G,G,G,G,G,G,G,G,G,G,G,G,G DH1,150031001,9,8,5,6,5,5,8,9,8,7,4,-999,5 DH1,150031001,G,G,G,G,G,G,G,G,G,G,G,B,G PC,150032004,12,12,11,11,17,20,13,20,10,8,9,7,-999 PC,150032004,G,G,G,G,G,G,G,G,G,G,G,G,M END_DATA END_GROUP BEGIN_GROUP VARIABLE,PM2.5 DATA_TYPE,POINT MEASUREMENT_TYPE,SAMPLE CHARACTERISTIC,OBSERVED START_DTG,201009080000 END_DTG,201009082359 INTERVAL,60 START_REF,0 NUMSTEPS,24 AVG_TIME,60 UNITS,UG/M3 STATIONS,10 BEGIN_DATA KA5,150030010,0,0,0,0,2,1,0,0,0,3,1,0,1,1,0,5,3,2,3,0,-999,0,0,4 KA5,150030010,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,B,G,G,G OV20,150012020,17,17,18,11,11,6,6,16,9,8,10,13,11,8,7,5,6,6,3,5,6,4,9,10 OV20,150012020,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G DH1,150031001,5,5,3,1,1,4,4,2,2,3,2,1,3,4,3,2,2,5,4,4,5,4,3,2 DH1,150031001,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G PA16,150012016,8,11,7,6,10,8,6,4,5,6,6,5,3,6,6,3,4,4,6,8,6,6,8,8 PA16,150012016,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G MV17,150012017,1,5,3,0,2,1,1,1,4,6,6,4,2,2,4,3,2,1,1,2,3,2,0,0 MV17,150012017,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G HL11,150011006,4,2,-1,0,2,1,2,4,3,2,2,0,-1,0,4,4,2,1,1,3,5,5,2,3 HL11,150011006,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G KN12,150011012,8,10,7,7,9,11,11,7,4,7,8,6,5,5,6,5,7,10,18,15,13,19,20,17 KN12,150011012,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G PC,150032004,2,3,2,0,1,0,0,2,3,4,4,1,1,2,0,0,0,0,2,3,4,3,5,4 PC,150032004,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G KH19,150090006,1,0,0,4,7,4,1,0,0,1,1,1,3,2,6,11,18,5,3,2,2,0,0,0 KH19,150090006,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G SI2,150031004,7,5,4,6,7,6,6,6,6,6,6,6,6,10,8,5,8,9,8,10,13,9,8,8 SI2,150031004,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G END_DATA END_GROUP BEGIN_GROUP VARIABLE,PM2.5 DATA_TYPE,POINT MEASUREMENT_TYPE,SAMPLE 
CHARACTERISTIC,OBSERVED START_DTG,201009090000 END_DTG,201009091309 INTERVAL,60 START_REF,0 NUMSTEPS,13 AVG_TIME,60 UNITS,UG/M3 STATIONS,10 BEGIN_DATA KA5,150030010,6,2,1,5,1,1,4,1,4,6,3,3,2 KA5,150030010,G,G,G,G,G,G,G,G,G,G,G,G,G OV20,150012020,6,5,11,13,11,11,12,16,17,13,17,21,18 OV20,150012020,G,G,G,G,G,G,G,G,G,G,G,G,G DH1,150031001,5,8,7,2,0,1,3,6,6,4,-999,3,2 DH1,150031001,G,G,G,G,G,G,G,G,G,G,B,G,G PA16,150012016,8,22,19,14,13,15,13,12,11,6,2,3,4 PA16,150012016,G,G,G,G,G,G,G,G,G,G,G,G,G MV17,150012017,3,4,4,4,3,2,0,0,1,3,3,2,3 MV17,150012017,G,G,G,G,G,G,G,G,G,G,G,G,G HL11,150011006,3,1,2,1,0,0,1,2,3,2,1,1,-1 HL11,150011006,G,G,G,G,G,G,G,G,G,G,G,G,G KN12,150011012,12,11,11,10,10,9,10,11,8,8,9,10,11 KN12,150011012,G,G,G,G,G,G,G,G,G,G,G,G,G PC,150032004,1,0,2,3,2,1,4,7,6,2,3,6,-999 PC,150032004,G,G,G,G,G,G,G,G,G,G,G,G,M KH19,150090006,2,0,0,0,1,4,0,1,4,1,0,0,-999 KH19,150090006,G,G,G,G,G,G,G,G,G,G,G,G,M SI2,150031004,6,6,8,8,7,5,7,8,9,6,5,8,8 SI2,150031004,G,G,G,G,G,G,G,G,G,G,G,G,G END_DATA END_GROUP BEGIN_GROUP VARIABLE,SO2 DATA_TYPE,POINT MEASUREMENT_TYPE,SAMPLE CHARACTERISTIC,OBSERVED START_DTG,201009080000 END_DTG,201009082359 INTERVAL,60 START_REF,0 NUMSTEPS,24 AVG_TIME,60 UNITS,PPM STATIONS,9 BEGIN_DATA KA5,150030010,0,0,0,0,0,0,-999,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0,0,0,0,0,0 KA5,150030010,G,G,G,G,G,G,B,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G WB6,150030011,0,0,0,0,0,0,-999,0.001,0.001,0.002,0.001,0.001,0.001,0,0,0,0,0,0,0,0,0,0,0 WB6,150030011,G,G,G,G,G,G,M,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G OV20,150012020,0,0.001,0.001,-0.001,-0.001,-0.002,-0.001,0.002,0.006,0,0.003,0.008,0.001,-0.001,-0.001,-0.001,-0.001,-0.002,-0.002,-0.002,-0.001,0,-0.001,-0.001 OV20,150012020,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G DH1,150031001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0. 
001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001 DH1,150031001,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G PA16,150012016,0.087,0.036,0.079,0.13,0.105,0.081,0.102,0.069,0.087,-999,0.007,0.004,0.002,0.001,0.001,0.001,0.001,0.003,0.006,-999,0.013,0.011,0.016,0.053 PA16,150012016,G,G,G,G,G,G,G,G,G,K,G,G,G,G,G,G,G,G,G,B,G,G,G,G MV17,150012017,0.002,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-999,0.004,0.001 MV17,150012017,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,B,G,G HL11,150011006,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0 .001,0.001,0.001,0.001,0.002,-999,0.005,0.002,0.002,0.002 HL11,150011006,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,B,G,G,G,G KN12,150011012,0,0,0,0,0,0,0.001,0.001,0.001,0.001,0.002,0.001,0.001,0.001,0.001,0.001,0.003,0.003,0 .002,0.003,0.004,0.004,0.003,0.002 KN12,150011012,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G PE10,150012010,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0 .001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001 PE10,150012010,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G END_DATA END_GROUP BEGIN_GROUP VARIABLE,SO2 DATA_TYPE,POINT MEASUREMENT_TYPE,SAMPLE CHARACTERISTIC,OBSERVED START_DTG,201009090000 END_DTG,201009091309 INTERVAL,60 START_REF,0 NUMSTEPS,13 AVG_TIME,60 UNITS,PPM STATIONS,9 BEGIN_DATA KA5,150030010,0,0,0,0,0,0,-999,0.002,0.001,0.001,0.001,0.001,0.001 KA5,150030010,G,G,G,G,G,G,B,G,G,G,G,G,G WB6,150030011,0,0,0,0,0,0,-999,0.002,0.001,0.001,0.001,0.001,0.001 WB6,150030011,G,G,G,G,G,G,M,G,G,G,G,G,G OV20,150012020,-0.001,0.055,0.007,0,-0.001,-0.001,-0.001,-0.001,-0.001,0.001,0.013,0.007,0.004 OV20,150012020,G,G,G,G,G,G,G,G,G,G,G,G,G DH1,150031001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001 DH1,150031001,G,G,G,G,G,G,G,G,G,G,G,G,G PA16,150012016,0.16,0.352,0.37,0.328,0.308,0.265,0.224,0.175,0.051,0.008,0.006,0.003,0.002 PA16,150012016,G,G,G,G,G,G,G,G,G,G,G,G,G 
MV17,150012017,0,0,0,0,0,0,0,0,0,0,0,0,0 MV17,150012017,G,G,G,G,G,G,G,G,G,G,G,G,G HL11,150011006,0.002,0.001,0.001,0.001,0.001,0.001,0.001,0.002,0.002,0.002,0.002,0.002,0.002 HL11,150011006,G,G,G,G,G,G,G,G,G,G,G,G,G KN12,150011012,0.001,0.001,0.001,0.001,0,0,0,0.001,0.001,0.001,0.002,0.003,0.003 KN12,150011012,G,G,G,G,G,G,G,G,G,G,G,G,G PE10,150012010,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001 PE10,150012010,G,G,G,G,G,G,G,G,G,G,G,G,G END_DATA END_GROUP BEGIN_GROUP VARIABLE,WD DATA_TYPE,POINT MEASUREMENT_TYPE,SAMPLE CHARACTERISTIC,OBSERVED START_DTG,201009080000 END_DTG,201009082359 INTERVAL,60 START_REF,0 NUMSTEPS,24 AVG_TIME,60 UNITS,DEGREES STATIONS,11 BEGIN_DATA KA5,150030010,58,66,66,47,43,46,59,44,64,66,66,69,71,78,71,64,62,61,71,66,85,53,49,45 KA5,150030010,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G WB6,150030011,63,69,67,65,59,56,63,65,75,81,84,81,70,68,65,68,66,71,56,63,62,66,64,52 WB6,150030011,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G OV20,150012020,336,8,343,26,19,28,22,267,229,224,249,256,264,288,239,187,125,207,338,350,34,65,19,36 OV20,150012020,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G DH1,150031001,93,-999,-999,-999,-999,-999,62,84,56,48,57,54,54,54,56,54,56,57,61,76,64,68,66,74 DH1,150031001,G,K,K,K,K,K,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G PA16,150012016,325,331,323,303,300,308,294,229,173,-999,106,106,117,113,128,133,180,257,306,267,287,296,328,311 PA16,150012016,G,G,G,G,G,G,G,G,G,K,G,G,G,G,G,G,G,G,G,G,G,G,G,G MV17,150012017,246,249,257,289,263,249,264,275,357,29,37,39,49,36,51,52,52,32,12,274,268,262,259,256 MV17,150012017,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G HL11,150011006,264,260,264,250,271,271,257,279,351,76,68,67,73,70,68,78,60,36,311,284,282,280,274,26 5 HL11,150011006,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G KN12,150011012,78,95,93,107,104,96,104,122,153,168,189,198,205,232,256,260,239,99,71,64,50,46,78,103 KN12,150011012,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G 
PE10,150012010,322,318,314,314,319,316,315,319,332,358,4,10,22,26,25,32,24,17,358,344,335,329,331,32 8 PE10,150012010,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G PC,150032004,37,0,338,-999,338,355,335,334,63,55,58,50,50,56,47,54,56,49,51,49,51,51,42,41 PC,150032004,G,G,G,K,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G SI2,150031004,100,100,23,18,36,38,23,22,16,21,29,32,32,27,28,25,20,26,35,45,42,36,21,24 SI2,150031004,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G END_DATA END_GROUP BEGIN_GROUP VARIABLE,WD DATA_TYPE,POINT MEASUREMENT_TYPE,SAMPLE CHARACTERISTIC,OBSERVED START_DTG,201009090000 END_DTG,201009091309 INTERVAL,60 START_REF,0 NUMSTEPS,13 AVG_TIME,60 UNITS,DEGREES STATIONS,11 BEGIN_DATA KA5,150030010,37,62,65,61,-999,-999,40,49,62,73,37,65,47 KA5,150030010,G,G,G,G,K,K,G,G,G,G,G,G,G WB6,150030011,57,54,57,50,-999,-999,65,59,83,81,79,78,79 WB6,150030011,G,G,G,G,K,K,G,G,G,G,G,G,G OV20,150012020,26,19,27,24,15,23,15,314,273,228,232,269,288 OV20,150012020,G,G,G,G,G,G,G,G,G,G,G,G,G DH1,150031001,64,79,109,-999,112,114,83,94,117,99,84,66,67 DH1,150031001,G,G,G,K,G,G,G,G,G,G,G,G,G PA16,150012016,311,315,316,316,318,319,306,316,40,99,97,89,93 PA16,150012016,G,G,G,G,G,G,G,G,G,G,G,G,G MV17,150012017,256,248,278,273,255,240,277,20,26,348,96,26,67 MV17,150012017,G,G,G,G,G,G,G,G,G,G,G,G,G HL11,150011006,261,264,265,258,269,268,271,335,83,265,222,76,49 HL11,150011006,G,G,G,G,G,G,G,G,G,G,G,G,G KN12,150011012,-999,100,107,-999,77,89,82,-999,225,226,227,215,239 KN12,150011012,K,G,G,K,G,G,G,K,G,G,G,G,G PE10,150012010,324,329,328,315,314,315,316,330,329,356,38,81,97 PE10,150012010,G,G,G,G,G,G,G,G,G,G,G,G,G PC,150032004,46,67,74,60,-999,-999,52,55,76,69,64,57,-999 PC,150032004,G,G,G,G,K,K,G,G,G,G,G,G,M SI2,150031004,20,51,92,-999,94,107,76,73,77,75,52,34,48 SI2,150031004,G,G,G,K,G,G,G,G,G,G,G,G,G END_DATA END_GROUP BEGIN_GROUP VARIABLE,WS DATA_TYPE,POINT MEASUREMENT_TYPE,SAMPLE CHARACTERISTIC,OBSERVED START_DTG,201009080000 END_DTG,201009082359 INTERVAL,60 START_REF,0 
NUMSTEPS,24 AVG_TIME,60 UNITS,M/H STATIONS,11 BEGIN_DATA KA5,150030010,3.4,3.4,3,3.3,3.9,3.4,2.9,4,5.4,6.2,6.3,6.2,6.5,7.3,6.4,6.5,6.3,5.7,5.5,3.4,2,2.5,3.4, 3.9 KA5,150030010,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G WB6,150030011,7,6.6,6.3,5.2,4.9,2.8,4.4,5.3,9.3,10.7,10.5,10.6,9.6,9.7,8.2,8.9,9.3,8.8,6.6,5.1,5,5.9 ,7.4,5.2 WB6,150030011,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G OV20,150012020,1.6,1.9,1.2,3,4.7,4.3,3,1.4,2.8,2.7,3.6,7.2,5.7,5.7,3.4,1.7,4.7,0.7,1.9,1.9,0.8,1.5,4 .3,3.3 OV20,150012020,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G DH1,150031001,1.2,-999,-999,-999,-999,-999,1,1.2,2.1,2.3,3.7,4.9,6.1,6.8,7.1,6.8,5.8,5.7,4.5,2.2,4.2,2.9,3.5,2.2 DH1,150031001,G,K,K,K,K,K,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G PA16,150012016,4.9,4.4,3.1,4.3,4.7,4.6,3,1.4,0.9,-999,10.2,9.2,7.3,7,6.6,4.8,2.8,3.5,2.7,1.2,3.8,4.3,5.7,4.1 PA16,150012016,G,G,G,G,G,G,G,G,G,K,G,G,G,G,G,G,G,G,G,G,G,G,G,G MV17,150012017,2.7,1.7,3.4,1.4,3.7,2,3.6,3,3.2,3.2,3.6,3.9,4.5,4.5,4.8,5.3,4.1,3,1.6,0.9,1.3,1.3,2,2 .9 MV17,150012017,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G HL11,150011006,3.7,3.5,3.2,2.7,4.7,3.4,3.2,2.7,2,3.3,3.5,4.3,6,4.1,5,5.5,2.7,1.8,1.7,2.9,3.9,3.9,3.8 ,3.4 HL11,150011006,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G KN12,150011012,2.9,4.2,5,4.8,4.7,5.9,5.7,4.2,5.1,5.3,6,6.1,5.6,5.4,4.5,2.9,0.9,1.5,2.6,2.2,2.2,2.2,2 .4,1.7 KN12,150011012,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G PE10,150012010,2.9,3.1,3.4,3.8,3.5,4,4,4.8,4.8,4.5,4.6,4.9,4.9,4.9,4.8,4.2,4.8,4.1,2.5,2.3,3.1,4.6,4 .5,5.5 PE10,150012010,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G PC,150032004,2.1,2,2.1,-999,2.6,2.4,2.1,2,4,6.3,6,6.3,7.9,6.5,7.4,7.6,7,7,5.4,5.3,5.3,3.4,4.1,3.2 PC,150032004,G,G,G,K,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G SI2,150031004,2.2,2,1.5,2.4,1.5,1.5,2.4,2.3,2.9,3.4,4.8,5.4,6.6,6.9,7.4,7.6,8,7.2,5.6,2.7,5.4,4.9,3. 
9,3.5 SI2,150031004,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G END_DATA END_GROUP BEGIN_GROUP VARIABLE,WS DATA_TYPE,POINT MEASUREMENT_TYPE,SAMPLE CHARACTERISTIC,OBSERVED START_DTG,201009090000 END_DTG,201009091309 INTERVAL,60 START_REF,0 NUMSTEPS,13 AVG_TIME,60 UNITS,M/H STATIONS,11 BEGIN_DATA KA5,150030010,3.9,2.3,1.7,1.6,-999,-999,1.4,2.7,4.2,5,5.5,5.5,6.1 KA5,150030010,G,G,G,G,K,K,G,G,G,G,G,G,G WB6,150030011,5.3,3.4,2.7,2.3,-999,-999,1.2,3.2,7,8.4,8.3,7.7,8.4 WB6,150030011,G,G,G,G,K,K,G,G,G,G,G,G,G OV20,150012020,5.5,3.1,3.4,4.6,4.5,3.9,4.8,3.1,3.4,3.2,4.9,5,4.9 OV20,150012020,G,G,G,G,G,G,G,G,G,G,G,G,G DH1,150031001,4.4,2.4,2,-999,1.8,2.5,2.4,2.1,3,2.9,3,4.4,4.2 DH1,150031001,G,G,G,K,G,G,G,G,G,G,G,G,G PA16,150012016,4.5,5.4,5.3,5.1,4.7,5.2,3.6,1.9,1.3,6.8,5,3.9,3.7 PA16,150012016,G,G,G,G,G,G,G,G,G,G,G,G,G MV17,150012017,2.6,2.8,1.8,1.7,2.7,1.9,1.2,1.6,2.3,2.9,2.7,1.9,2.6 MV17,150012017,G,G,G,G,G,G,G,G,G,G,G,G,G HL11,150011006,3.7,3.7,3.7,3.8,2.8,2.6,2.5,1.5,2.6,1.5,1,1.6,1.9 HL11,150011006,G,G,G,G,G,G,G,G,G,G,G,G,G KN12,150011012,-999,2.1,2.6,-999,3.4,3.7,3.8,-999,2,3.2,3.6,4.1,3.9 KN12,150011012,K,G,G,K,G,G,G,K,G,G,G,G,G PE10,150012010,5,3.7,3.6,3.8,4.6,5.1,4,3.5,2.9,2.5,2.4,3.3,3.1 PE10,150012010,G,G,G,G,G,G,G,G,G,G,G,G,G PC,150032004,3.6,3.5,3.3,2,-999,-999,2.6,2.8,4.1,4.9,5.2,6.6,-999 PC,150032004,G,G,G,G,K,K,G,G,G,G,G,G,M SI2,150031004,4.9,1.9,2,-999,2.1,3,4,4.8,5.8,7.3,5.9,5.6,5.2 SI2,150031004,G,G,G,K,G,G,G,G,G,G,G,G,G END_DATA END_GROUP END_FILE

code1

#!/usr/local/bin/perl -w
# getting the source code from the file
#####################################################################
open(IN,"/home/uila3/rhuff/doh/090809.txt") or die "cannot open file1 for reading\n";
open(OUT,">/home/uila3/rhuff/doh/test.txt") or die "cannot open file2 for writing\n";
my $start = 0;
my $count=0;
while(<IN>) {
    chomp;
    $count++;
    next if( /^\s*$/ ); #skip empty lines
    if( /^END_DATA\s*$/ ) #end if this word found
    {
        $start = 0;
        next;
    }
    if( /VARIABLE,\s*(.*)$/ ) {
        my ($sec,$min,$hour,$day,$month,$yr19,@rest) = localtime(time);
        print OUT $_ ;
        next;
    }
    if( ($start==0) && ( /^BEGIN_DATA\s*$/ ) ) #starts with this word only
    {
        $start = 1;
        next;
    }
    print OUT $1," ",$2,"\n" if( (($start==1)) && ( /^([^,]*),.*,\s*([0-9.-]*)$/ ) );
}
close(OUT);
close(IN);
print "No. of lines parsed $count";
In this case, the command line prints "No. of lines parsed 463uila%" and the output file looks like this:

VARIABLE,CO VARIABLE,CO VARIABLE,NO2 VARIABLE,NO2 VARIABLE,OZONE VARIABLE,OZONE VARIABLE,PM10 VARIABLE,PM10 VARIABLE,PM2.5 VARIABLE,PM2.5 VARIABLE,SO2 VARIABLE,SO2 VARIABLE,WD VARIABLE,WD VARIABLE,WS VARIABLE,WS

code 2
# getting the source code from the file
my $target_data;
{
    local $/ = "VARIABLE,PM2.5\n";
    open my $INFILE, '<', '/home/uila3/rhuff/doh/2010090913.txt'
        or die "Couldn't open /home/uila3/rhuff/doh/2010090913.txt: $!";
    my $discard = <$INFILE>;
    $target_data = <$INFILE>;
    close $INFILE;
}
print $target_data;
print '*' x 20;
for my $line (split /\n/, $target_data) {
    if ($line =~ m{ \A ( \p{Uppercase}{2} \d+ ) , .* , (\d+) }xms ) {
        print "$1 $2";
    }
}
This was the output: "********************uila% ". Nothing else was printed.
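For comparison, here is a minimal sketch of one way the extraction could be done. It rests on several assumptions: that each record sits on its own line in the real file, that the quality-flag rows (G, B, K, M) can be told apart from value rows by a non-numeric last field, and that the file may use DOS-style (CRLF) line endings. The CRLF part is only a guess, though it would be consistent with both symptoms above: a trailing carriage return would stop the final `$` in code1's data regex from matching, and would stop `$/ = "VARIABLE,PM2.5\n"` from ever being found in code 2. The name `extract_last_values` is made up for illustration.

```perl
use strict;
use warnings;

# Return [station, last value] pairs for one variable, taken only from the
# group whose NUMSTEPS matches. extract_last_values() is a hypothetical name.
sub extract_last_values {
    my ($fh, $want_var, $want_steps) = @_;
    my ($var, $steps, $in_data) = ('', 0, 0);
    my @pairs;
    while (my $line = <$fh>) {
        $line =~ s/\r?\n\z//;    # accept both Unix and DOS line endings
        next if $line =~ /^\s*$/;
        if    ($line =~ /^VARIABLE,\s*(\S+)/) { $var     = $1 }
        elsif ($line =~ /^NUMSTEPS,\s*(\d+)/) { $steps   = $1 }
        elsif ($line =~ /^BEGIN_DATA\s*$/)    { $in_data = 1 }
        elsif ($line =~ /^END_DATA\s*$/)      { $in_data = 0 }
        elsif ($in_data && $var eq $want_var && $steps == $want_steps) {
            my @fields = split /,/, $line;
            # keep value rows only; flag rows end in a letter and are skipped
            next unless @fields > 2 && $fields[-1] =~ /^-?\d+(?:\.\d+)?$/;
            push @pairs, [ $fields[0], $fields[-1] ];
        }
    }
    return @pairs;
}

# Demo on a few rows copied from the posted file (two stations only).
my $sample = <<"END_SAMPLE";
VARIABLE,PM2.5
NUMSTEPS,24
BEGIN_DATA
KA5,150030010,0,0,0,0,2,1,0,0,0,3,1,0,1,1,0,5,3,2,3,0,-999,0,0,4
KA5,150030010,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,B,G,G,G
PC,150032004,2,3,2,0,1,0,0,2,3,4,4,1,1,2,0,0,0,0,2,3,4,3,5,4
PC,150032004,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G,G
END_DATA
VARIABLE,PM2.5
NUMSTEPS,13
BEGIN_DATA
KA5,150030010,6,2,1,5,1,1,4,1,4,6,3,3,2
KA5,150030010,G,G,G,G,G,G,G,G,G,G,G,G,G
END_DATA
END_SAMPLE

open my $fh, '<', \$sample or die "cannot open in-memory sample: $!";
print "$_->[0] $_->[1]\n" for extract_last_values($fh, 'PM2.5', 24);
close $fh;
# prints "KA5 4" and "PC 4", one per line
```

The sketch tracks the current VARIABLE and NUMSTEPS values as it reads, so only rows inside the matching group are collected. A last value of -999 (which seems to mark missing data in the sample) would be passed through unchanged.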

20100928 Janitored by Corion: Added readmore tag


In reply to Data Parsing help for newbie HELP ME!! by Anonymous Monk
