Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

open(MAN, "/usr/bin/man ssh |") || die "Could not fork(): $!\n";
while (<MAN>) {
    # tried with modifiers: m,s,i none worked.
    print "found it\n" if (/DESCRIPTION/i);
}
I posted a while back about grepping a certain portion of a man page, and none of the examples worked when I put them to use. Even something as simple as
system("man ssh >sshf");
open(MANP, "<sshf") or die "Error:$!\n";
while (<MANP>) {
    print "got it\n" if /DESCRIPTION/;
}
doesn't work. Is the man page in some kind of format that Perl can't read with a regexp? I have been fiddling with this for a while, and I simply can't figure out a way to get a section of the man page from a piped open.

The __DATA__ approach worked, but it doesn't fit the situation I need it for.

All I want is to get the text of the NAME, SYNOPSIS, and DESCRIPTION sections, each in its own variable.

I tried using groff, but without success, and I'd rather not use groff unless it is the only way. My directories aren't set up in a way that would benefit from its use, unless I searched all my directories for the argument to man and then ran groff on the result. It would make my day if only this would work:

open(MAN, "/usr/bin/man ssh |") or die "$!\n";
while (<MAN>) {
    print "work!\n" if /description/ig;
}
-- side note --
open(MAN, "/usr/bin/man ssh |") or die "$!\n";
while (<MAN>) {
    print "work!\n" if /ssh/g;
}
Things like this work, but the most important parts, the headings where I need to start the match,
(NAME)->(SYNOPSIS)->(DESCRIPTION)
don't match.

An example of what I am trying to do so I am clear to all :)
---
%man ssh
SSH(1)                      System Reference Manual                     SSH(1)

NAME
     ssh - OpenSSH secure shell client (remote login program)

SYNOPSIS
     ssh -l login_name  user@hostname command

     ssh -afgknqstvxACNPTX1246 -c cipher_spec -e escape_char -i
         identity_file -l login_name -m mac_spec -o option -p port
         -L port:host:hostport -R port:host:hostport 
         user@hostname command

DESCRIPTION
     ssh (Secure Shell) is a program for logging into a remote machine and for
     executing commands on a remote machine.  It is intended to replace rlogin
     and rsh, and provide secure encrypted communications between two untrust­
     ed hosts over an insecure network.  X11 connections and arbitrary TCP/IP
     ports can also be forwarded over the secure channel.

     ssh connects and logs into the specified hostname. The user must prove
     his/her identity to the remote machine using one of several methods de­
     pending on the protocol version used:

   SSH protocol version 1

     First, if the machine the user logs in from is listed in /etc/hosts.equiv
     or /etc/ssh/shosts.equiv on the remote machine, and the user names are
     the same on both sides, the user is immediately permitted to log in.
     Second, if .rhosts or .shosts exists in the user's home directory on the
     remote machine and contains a line containing the name of the client ma­
     chine and the name of the user on that machine, the user is permitted to
     log in.  This form of authentication alone is normally not allowed by the
     server because it is not secure.

---- I edited the rest out ----
I want to have everything in the NAME block in $name, up to the \n\n:
$name = q(ssh - OpenSSH secure shell client (remote login program));
and everything in the SYNOPSIS block in $synopsis, until we reach \n\n:
$synopsis = q(ssh [-l login_name] [hostname | user@hostname] [command]);
and lastly everything in the DESCRIPTION block in $description, until we again reach \n\n:
$description = q(

ssh (Secure Shell) is a program for logging into a remote machine and for
executing commands on a remote machine.  It is intended to replace rlogin
and rsh, and provide secure encrypted communications between two untrust­
ed hosts over an insecure network.  X11 connections and arbitrary TCP/IP
ports can also be forwarded over the secure channel.
);
EOF

Replies are listed 'Best First'.
Re (tilly) 1: Parsing (l)unix man pages
by tilly (Archbishop) on Aug 21, 2001 at 05:06 UTC
    Your problem is the control characters. Try the following:
    my $cmd = "/usr/bin/man ssh";
    open(MAN, "$cmd |") or die "Cannot run command '$cmd': $!";
    while (<MAN>) {
        s/\010.//g;
        print "Worked!\n" if /ssh/;
    }
    The \010 is the octal escape sequence for character 8 (backspace, shown as ^H). You can double-check the presence of character 8 with:
    man ssh |perl -ne 'print "$&:", ord($&), "\n" while /./g' | less
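To make the backspace handling concrete, here is a minimal sketch that runs on hard-coded sample strings rather than on real man output. It assumes the usual nroff overstrike conventions: bold is a character, a backspace, then the same character again, and underline is an underscore, a backspace, then the character. The helper name strip_overstrike is made up for this illustration. Note that it deletes the character before each backspace; the s/\010.//g form above deletes the character after it, which is fine for bold headings such as DESCRIPTION but would leave the underscores behind in underlined text.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): remove nroff-style
# overstrike sequences by deleting each backspace together with the
# character that precedes it.
sub strip_overstrike {
    my ($text) = @_;
    # Repeat until no "character + backspace" pair is left.
    1 while $text =~ s/[^\010]\010//g;
    return $text;
}

my $bold      = "D\010DE\010ES\010SC\010C";   # "DESC" in bold
my $underline = "_\010s_\010s_\010h";         # "ssh" underlined

print strip_overstrike($bold), "\n";
print strip_overstrike($underline), "\n";
```

Inside the read loop, the same substitution could be applied to $_ before the match is tested.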
      Thank you all :)
Re: Parsing (l)unix man pages
by Cirollo (Friar) on Aug 21, 2001 at 00:54 UTC
    perl -e 'open(FILE, "/usr/bin/man ssh | "); while(<FILE>) { print if / +SYNOPSIS/ }'
    Works great on Solaris. Maybe the fancy formatting (bold fonts, etc.) of the man page on your system is what's getting you hung up? Weird terminal control characters that man emits for font decoration might be making your match fail.
      Man usually just prints the text when it sees that the output is not going to a terminal...

      T I M T O W T D I
        
        perl -e 'open(FILE, "/usr/bin/man ssh | "); while(<FILE>) { print if /SYNOPSIS/ }'
        (testing)%perl -e 'open(FILE, "/usr/bin/man ssh | "); while(<FILE>) { print if /SYNOPSIS/ }'
        (testing)%

        Well, see, that doesn't work for me. I'm using HyperTerminal on WinNT; I don't know whether that is the cause. I just get no output from the above command.
Re: Parsing (l)unix man pages
by Cine (Friar) on Aug 21, 2001 at 00:52 UTC
    perl -e 'open(F,"man ssh|")||die;print while(<F>);'
    Works fine for me, prints the man page just as it should...
    What exactly is it that doesn't work?

    T I M T O W T D I
      If you send the man output to a file, you will see that the lines look something like:
      D^HDE^HES^HSC^HCR^HRI^HIP^HPT^HTI^HIO^HON^HN
      You can either use a Perl script to take them out or, as we lazy/smart types do, use:
      open(F,"man ssh| col -b |")
      People on old legacy Unix boxes with crappy pagers have been using col -b for years to dump plain text to a file, which can then be viewed in a text editor.
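Once the text is clean, pulling NAME, SYNOPSIS, and DESCRIPTION into their own variables is mostly a matter of noticing the headings. The following is only a sketch under stated assumptions: the input is already plain text (for example, after col -b), section headings are all-caps lines starting in column 0, and the helper name man_sections is invented for this example. It works on a shortened copy of the ssh page quoted in the question so that it runs without calling man.

```perl
use strict;
use warnings;

# Hypothetical helper: split plain man-page text into a hash of
# sections keyed by heading. Assumes headings are all-caps lines
# that start in column 0 (NAME, SYNOPSIS, SEE ALSO, ...).
sub man_sections {
    my ($text) = @_;
    my (%section, $current);
    for my $line (split /\n/, $text) {
        if ($line =~ /^([A-Z][A-Z ]*[A-Z])\s*$/) {
            $current = $1;                  # a new section starts here
            $section{$current} = '';
        }
        elsif (defined $current) {
            $line =~ s/^\s+//;              # drop man's indentation
            $section{$current} .= "$line\n";
        }
    }
    s/\n+\z// for values %section;          # trim trailing blank lines
    return %section;
}

# Shortened sample taken from the page quoted in the question.
my $sample = <<'END';
SSH(1)                      System Reference Manual                     SSH(1)

NAME
     ssh - OpenSSH secure shell client (remote login program)

SYNOPSIS
     ssh -l login_name  user@hostname command

DESCRIPTION
     ssh (Secure Shell) is a program for logging into a remote machine and for
     executing commands on a remote machine.

     ssh connects and logs into the specified hostname.
END

my %man = man_sections($sample);

# Keep only the first paragraph, i.e. everything up to the first \n\n.
my ($description) = split /\n\n/, $man{DESCRIPTION};

print "NAME: $man{NAME}\n";
print "SYNOPSIS: $man{SYNOPSIS}\n";
print "DESCRIPTION: $description\n";
```

To run this against the real page, the sample could be replaced by the slurped output of a piped open such as open(my $fh, '-|', 'man ssh | col -b'), assuming col is installed on the system.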