in reply to Gettin certain lines of text from a text file!

How do you eat an elephant?
One bite at a time

You need to break this problem down into manageable subsections. Maybe like this:

This is the heart of what programming is, breaking a problem down to manageable sub-problems. Of course there are those who can do this faster, "take larger bites", see an inspired way to grease the elephant and swallow it in one go (stop with the metaphors already). However baby steps first ;)

Re^2: Gettin certain lines of text from a text file!
by Foxyirish1987 (Novice) on Sep 15, 2009 at 12:59 UTC

    OK, here is the first part of the text file. I need to extract the date from this:

    From: g027778@msx.stp.guidant.com
    Sent: Thursday, September 03, 2009 08:33
    To: Steidl, Charles (STP); Gubbins, John (CLN); Horan, Laurence (CLN); Duppong, Lawrence (STP); Donovan, Leighton (CLN); Deasy, Luke (CLN); Walmsley, Mark (STP); Smith, Noel (CLN); Keating, Patrick (CLN); Doyle, PJ (CLN); Heineman, Scott (STP); Maher, Stephen (CLN); Williams, Thomas (STP)(Test Eng); Nguyen, Thuan (STP)
    Subject: ** Summary of yield for Product: Insignia__Hyb_E1
    **** Yield Report ****
    **** Product Name : Insignia__Hyb_E1_581394
    **** Software Product : 581394-201-A
    **** Test Level : Hyb_E1
    **** Part Number : 401666-312
    **** Model Number : h666
    **** RMI DataBase : prod
    **** Analysis Date Range : 09/02/2009 00:00:00 to 09/02/2009 23:59:59
    **** Tested First/Last : 09/02/2009 02:06:04 to 09/02/2009 19:20:16

    After "Analysis Date Range" I need to take out the 09/02/2009. This date will change every day, so searching for it will not work. I'm trying to use an array to find "Analysis Date Range" and then take the next 'x' characters and output them to the Excel file. This would be the first step. What I'm coming up with is that I can find the analysis date range, but my array etc. is not working. And yes, I was dropped royally in it: given this 2 weeks ago and I'm finishing work next week. Only an intern here for the summer, an electrical engineering undergraduate, and we don't do this sort of stuff often/ever. Thanks a mil for all the help though. And I have it broken down; above is the first part!

      So, when you're looking for the date range, you can find the "Analysis Date Range", but then how do you decide what is the date coming after that?

      I would say that you're looking for:

      1. First, the string "Analysis Date Range : " with an arbitrary amount of whitespace around the ':' (/Analysis Date Range\s*:\s*/)
      2. Then, at least one digit, which we want to remember (/([0-9]+)/)
      3. Then, a slash (which needs a backslash to escape it) followed by at least one digit, and record the digits but not the slash (/\/([0-9]+)/)
      4. Then, another slash followed by at least one digit, and again record the digits not the slash (/\/([0-9]+)/)
      5. And the rest doesn't matter
      Combining that all into one:

      # i modifier to make it case insensitive, just in case.
      if ($line =~ /Analysis Date Range\s*:\s*([0-9]+)\/([0-9]+)\/([0-9]+)/i) {
          print "I found an Analysis date with month/day $1, $2 in year $3\n";
      } else {
          print "This line did not have an analysis date that I recognize in it\n";
      }

      Another way to go, would be to grab larger chunks and then split them later:

      1. "Analysis Date Range : "
      2. Then a pile of digits-and-slashes (start date) ([0-9\/]+)
      3. Then whitespace and digits-and-colons (start time) \s([0-9:]+)
      4. Then more whitespace and some digits-and-slashes (end date)
      5. Finally, whitespace then digits-and-colons (end time)
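
The chunk-based steps above could be sketched roughly as follows. This is only a sketch under a few assumptions: `parse_date_range` is a made-up helper name, the literal "to" between the two timestamps is taken from the sample report, and treating the first number as the month is a guess that the thread does not settle.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (start_date, start_time, end_date, end_time),
# or an empty list when the line holds no recognizable date range.
sub parse_date_range {
    my ($line) = @_;
    return $line =~ m{
        Analysis \s+ Date \s+ Range \s* : \s*
        ([0-9/]+) \s+ ([0-9:]+)     # start date, start time
        \s+ to \s+                  # separator seen in the sample report
        ([0-9/]+) \s+ ([0-9:]+)     # end date, end time
    }xi;
}

my $line = '**** Analysis Date Range : 09/02/2009 00:00:00 to 09/02/2009 23:59:59';

if (my ($start_date, $start_time, $end_date, $end_time) = parse_date_range($line)) {
    # Split the larger chunk afterwards; month-first order is an assumption.
    my ($month, $day, $year) = split m{/}, $start_date;
    print "Start: $start_date at $start_time (month $month, day $day, year $year)\n";
    print "End:   $end_date at $end_time\n";
}
else {
    print "No analysis date range found on this line\n";
}
```

Grabbing whole chunks keeps the pattern short, and the later `split` is the only place that would need to change if the dates turn out to be day-first.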

      PS: Is that date string "month day year" or "day month year"?

        Thanks a million, that is really helping. Had to modify some part of it for Strawberry Perl, no idea why, but it's working alright. The next part should be something similar, so I'll work away with that for the time being. Thanks :-) P.S. (this is for ZEN): already said it to the manager and expectations are lowered, but it'd be nice to get it done, no? :-)

      So you need to extract a date once which appears after Sent and keep this value and use it in combination with the Analysis Date Range for each report?
      open(my $fh, '<', 'datafile') or die "Cannot open datafile: $!";
      my $sent_date = '';
      while (my $line = <$fh>) {
          # Could be more precise, but you're only expecting one instance
          # of Sent at the start of a line?
          if ($line =~ /^Sent: (.*)$/) {
              $sent_date = $1;    # keep this value for the reports that follow
          }
          elsif ($line =~ /^\*{4}\s+Analysis Date Range\s+:\s+(\d{2}\/\d{2}\/\d{4})/) {
              add_to_spreadsheet($sent_date, $1);    # only when a range was found
          }
      }
      close($fh);
      So, half way there ;) Now you need to implement add_to_spreadsheet(); Answer the questions and use the module suggested initially, it's easy enough to use once you know the steps you intend to take.
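
As a stopgap while working out the spreadsheet module, `add_to_spreadsheet()` could be stubbed out like this. It is a sketch, not the suggested approach: it uses only core Perl and writes CSV (which Excel can open) instead of a real spreadsheet, and `csv_row` and the `report.csv` file name are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: format one CSV row, quoting every field and
# doubling any embedded double quotes.
sub csv_row {
    my @fields = @_;
    my @quoted;
    for my $field (@fields) {
        my $f = defined $field ? $field : '';
        $f =~ s/"/""/g;
        push @quoted, qq{"$f"};
    }
    return join(',', @quoted) . "\n";
}

my $csv_file = 'report.csv';    # assumed output name, not from the thread
open(my $out, '>', $csv_file) or die "Cannot open $csv_file: $!";

# Stand-in with the same call shape as in the loop: one row per report.
sub add_to_spreadsheet {
    my ($sent_date, $date_range) = @_;
    print {$out} csv_row($sent_date, $date_range);
    return;
}

add_to_spreadsheet('Thursday, September 03, 2009 08:33', '09/02/2009');
close($out) or die "Cannot close $csv_file: $!";
```

Quoting every field matters here because the Sent date itself contains commas. Swapping in a real spreadsheet module later should only require changing the body of `add_to_spreadsheet()`.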

        Thanks a mil again. Both work and I'm using both for different things at the moment. Ye are a great help, thanks. It was all the /~.{{ etc. that were killing me. Spot on!