JoeJaz has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I was hoping for some help with parsing out information from a very large string variable so that I might place html tags around different portions of the string. Here goes:

I have a variable that contains both spaces and newline characters. Within an HTML table, I would like to create a new row for each "\n" character in the string and a new column for every space in the string.

My code looks like this so far, but I know that it is incorrect.
#!/usr/bin/perl
use strict;

my $input='unixhelp pts/5 10.3.4.54 Fri Jun 13 11:08 still logged in
unixhelp pts/5 10.3.4.00 Fri Jun 13 11:05 - 11:05 (00:00)
unixhelp pts/5 10.3.4.00 Fri Jun 13 11:04 - 11:05 (00:00)
unixhelp pts/5 10.3.4.00 Fri Jun 13 11:00 - 11:01 (00:00)
unixhelp pts/5 10.3.4.00 Fri Jun 13 10:49 - 10:52 (00:03)
unixhelp pts/8 10.3.4.00 Fri Jun 13 10:46 - 10:46 (00:00)
unixhelp pts/3 10.3.4.00 Fri Jun 13 10:12 - 10:12 (00:00)
unixhelp pts/3 10.3.4.00 Fri Jun 13 10:10 - 10:10 (00:00)
unixhelp pts/3 10.3.4.00 Fri Jun 13 10:09 - 10:09 (00:00)
unixhelp pts/3 10.3.4.00 Fri Jun 13 10:07 - 10:08 (00:00)';

print "Content-type: text/html\n\n";
print "<HTML><BODY>\n";
print " <TABLE>\n";
print " <TR>\n";
foreach ($input) {
    print "<TD>";
    print;
    print "\n</TD>"
}
print " </TR>\n";
print " </TABLE>\n";
print "<BODY><HTML>\n";

This code only places a table around the entire variable and does not break it up into pieces. I am still very new to Perl and am not sure of all of the things that you can do with a for loop. If anyone knows how to separate a variable into pieces using newline and space as delimiters, I would be very appreciative if you would send me some suggestions or resources. Thank you very much for your time in reading this.

Replies are listed 'Best First'.
Re: Parsing Variables
by halley (Prior) on Jun 13, 2003 at 16:55 UTC
    What you're looking for is along these lines. In Perl, there are many alternatives to everything, but this is one approach that a newcomer such as yourself may be able to use as you learn.
    my @rows = split /\n/, $input;
    foreach my $row (@rows) {
        print "<TR>\n";
        my @columns = split / /, $row;
        foreach my $column (@columns) {
            print "<TD>$column</TD>\n";
        }
        print "</TR>\n";
    }
    Now, personally, I think your table will turn out kind of odd... you're making three cells just for the date, and the "still logged in" will be lined up funny against the logout timespans. The best fix would be to parse the rows with a regular expression instead of split, but this is a start.
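As a rough sketch of the regular-expression approach mentioned above: the helper name parse_row and the five-cell layout (user, tty, host, date, everything else) are assumptions made for illustration, based only on the sample lines in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: split one line of the sample data into five cells.
# Assumes the layout seen in the question: user, tty, host, a three-word
# date, then everything else (login/logout times or "still logged in").
sub parse_row {
    my ($row) = @_;
    if ($row =~ /^(\S+)\s+(\S+)\s+(\S+)\s+(\w{3}\s+\w{3}\s+\d+)\s+(.*)$/) {
        return ($1, $2, $3, $4, $5);
    }
    return;    # empty list when the line does not match
}

my @cells = parse_row('unixhelp pts/5 10.3.4.54 Fri Jun 13 11:08 still logged in');
print '<TR>', join('', map { "<TD>$_</TD>" } @cells), "</TR>\n";
```

This keeps the date in one cell and leaves the trailing text ("still logged in" or the logout span) together in the last cell. Lines that do not fit the assumed layout return an empty list and would need their own handling.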

    Update: Added 'my' keywords to match your 'use strict'. Glad to see newcomers learning discipline.

    --
    [ e d @ h a l l e y . c c ]

      with input lines of the form:
      unixhelp pts/3 10.3.4.00 Fri Jun 13 10:10 - 10:10 (00:00)

      you could include a limit as the third argument to split, like so:

      my @columns = split / /, $row, 8;

      This tells split to produce no more than eight fields, so the message at the end is not split and appears in one td cell.
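A small check of how the limit argument behaves may help here. This sketch assumes the fields are separated by single spaces, as in the sample shown; with that layout the trailing message starts at the eighth field, so a limit of 8 is what keeps it in one piece.

```perl
use strict;
use warnings;

my $row = 'unixhelp pts/5 10.3.4.54 Fri Jun 13 11:08 still logged in';

# The third argument caps the number of fields; the last field keeps
# whatever is left of the string, spaces included.
my @columns = split / /, $row, 8;

print scalar(@columns), " fields, last one: '$columns[-1]'\n";
```

If the real data pads columns with several spaces, the count would differ, and splitting on /\s+/ instead of / / may be the safer choice.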

      --tidiness is the memory loss of environmental mnemonics

Re: Parsing Variables
by Coplan (Pilgrim) on Jun 13, 2003 at 17:01 UTC
    I really hate to do this to you. I know I always got frustrated when I got this type of answer. But I personally learned much better when I had to do the specific research myself. Clipping code won't help you. Building it, fixing it and figuring out why your code doesn't work will help you.

    So the first thing you want to do is learn about regexps (regular expressions). It's kind of a weird concept, but once you learn how to look at your source file, it grows easier and easier.

    Now, your data is a variable. I'm assuming you'll eventually want to read this from a file? Correct? I have a recommendation. If you're going to read the data from a file, I highly recommend you set your script up to do that already. It may affect the behavior of your regexp in the long run. Best to get the environment into place and code around it rather than to force the environment to fit your code.
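For what reading line by line might look like, here is a minimal sketch. It uses an in-memory filehandle over made-up sample data so that it runs on its own; the variable names and the shortened sample lines are assumptions, and a real script would open an actual file path instead.

```perl
use strict;
use warnings;

# Stand-in for a real file: an in-memory filehandle over sample data.
# With a real file this would be: open my $fh, '<', $path or die ...
my $data = "unixhelp pts/5 10.3.4.54\nunixhelp pts/8 10.3.4.00\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my $html = '';
while (my $line = <$fh>) {
    chomp $line;                     # drop the trailing newline
    my @cells = split / /, $line;    # one cell per space-separated field
    $html .= '<TR>' . join('', map { "<TD>$_</TD>" } @cells) . "</TR>\n";
}
close $fh;

print $html;
```

Reading one line at a time means the newline handling is done by the loop itself, so only the spaces within each line are left to deal with.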

    Anyhow, regexps work by find and/or find/replace methods. You'll search for the elements that you want. Actually, in this case, you'll be searching for the things that break up the elements you want. I could be wrong, but it looks to me that your list is separated by line breaks and space breaks. So you'll want to read all the data from your variable, and search through it looking for spaces and line breaks. You'll of course want to replace each with the appropriate html code. Don't forget to start each line with a new line and cell.

    Since you're new to perl, I'd figure out how to replace these spaces and elements one at a time. You'll want to iterate through your big variable stream more than once. This is not common practice for later down the road, as it is a bit slow, but it'll help you to understand what is/isn't working if something breaks. Then figure out how to optimize it by combining all the search/replaces at once.
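One possible shape for the one-step-at-a-time approach described above, using a tiny made-up input rather than the real data. This is only a sketch of the idea; the variable names are illustrative.

```perl
use strict;
use warnings;

my $input = "a b c\nd e f";    # tiny stand-in for the real data

my $html = $input;
$html =~ s/ /<\/TD><TD>/g;               # pass 1: each space ends one cell, starts the next
$html =~ s/\n/<\/TD><\/TR>\n<TR><TD>/g;  # pass 2: each newline ends one row, starts the next
$html = "<TR><TD>$html</TD></TR>\n";     # open the first row and cell, close the last

print $html;
```

Printing $html after each pass is an easy way to see what a given substitution did before moving on to the next one.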

    --Coplan

      It looks to be the output of a "last" command. There is a Perl interface to utmp and utmpx called User::Utmp. It has been a while since I have used it, but it used to work well.

      -Waswas
Re: Parsing Variables
by jcpunk (Friar) on Jun 13, 2003 at 16:47 UTC
    I believe split will be of some use to you, the exact syntax eludes me at this moment but a tutorial can be found here
Re: Parsing Variables
by naChoZ (Curate) on Jun 13, 2003 at 16:58 UTC
    Something like this:

    #!/usr/bin/perl -w
    use strict;

    my $input='unixhelp pts/5 10.3.4.54 Fri Jun 13 11:08 still logged in
    unixhelp pts/5 10.3.4.00 Fri Jun 13 11:05 - 11:05 (00:00)
    unixhelp pts/5 10.3.4.00 Fri Jun 13 11:04 - 11:05 (00:00)
    unixhelp pts/5 10.3.4.00 Fri Jun 13 11:00 - 11:01 (00:00)
    unixhelp pts/5 10.3.4.00 Fri Jun 13 10:49 - 10:52 (00:03)
    unixhelp pts/8 10.3.4.00 Fri Jun 13 10:46 - 10:46 (00:00)
    unixhelp pts/3 10.3.4.00 Fri Jun 13 10:12 - 10:12 (00:00)
    unixhelp pts/3 10.3.4.00 Fri Jun 13 10:10 - 10:10 (00:00)
    unixhelp pts/3 10.3.4.00 Fri Jun 13 10:09 - 10:09 (00:00)
    unixhelp pts/3 10.3.4.00 Fri Jun 13 10:07 - 10:08 (00:00)';

    # 'my' is required here because of 'use strict'
    my @inputlines = split('\n', $input);

    print "Content-type: text/html\n\n";
    print "<HTML><BODY>\n";
    print " <TABLE>\n";
    print " <TR>\n";
    foreach (@inputlines) {
        print "<TD> $_ \n</TD>";
    }
    print " </TR>\n";
    print " </TABLE>\n";
    print "</BODY></HTML>\n";

    untested of course...

    ~~
    naChoZ

      You could also condense the separate print lines by using FORMAT commands (although it is a bit long and somewhat involved) or you could do:

      # qq{} rather than qq//, since the text itself contains "/"
      print qq{Content-type: text/html

      <html><head></head>
      <body>
      <table>
      <tr>
      };
      foreach (@inputlines) {
          print "<td>$_\n</td>";
      }
      print qq{
      </tr>
      </table>
      </body></html>
      };
      Of course there is always more than one way to do it :-)

      "Ex libris un peut de tout"

Re: Parsing Variables
by fglock (Vicar) on Jun 13, 2003 at 16:58 UTC

    You could simply substitute "space"s by "td"s. You can start with this:

    $input =~ s/ /<\/td><td>/g;
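Extending that substitution to cover newlines in the same pass could look something like the sketch below. The %markup lookup table and the tiny input are made up for illustration; the opening and closing tags for the first and last cells still have to be added by hand.

```perl
use strict;
use warnings;

my $input = "a b c\nd e f";    # tiny stand-in for the real data

# Map each delimiter to the markup that replaces it.
my %markup = (
    ' '  => '</td><td>',
    "\n" => "</td></tr>\n<tr><td>",
);

# Copy $input, then replace every space or newline in the copy.
(my $html = $input) =~ s/([ \n])/$markup{$1}/g;
$html = "<tr><td>$html</td></tr>\n";

print $html;
```

Working on a copy leaves $input untouched, which can be handy if the raw text is needed again later.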

    Also, this looks like a case for using <code> tags, instead of a table.