lollipop7081 has asked for the wisdom of the Perl Monks concerning the following question:
I am VERY new to Perl. I'm in an undergrad Software Tools class, and my professor assigned Perl just to see how well we'd do (with only basic instruction on Perl).
I DO NOT WANT SOMEONE TO DO MY HOMEWORK FOR ME!!!
The assignment is that I need to format a text file where each section can be treated as a chapter and the headings should be centered.
He wants a table of contents.
Also, each section should be broken up into separate pages and each page numbered with roman numerals that match the table of contents.
And finally, he wants an Index whose keywords are any word that appears in the question and the answer and doesn't appear more than 10 times.
Here's a part of the text he provided:
SECTION 35 - BUILDING VIM FROM SOURCE
35.1. How do I build Vim from the sources on a Unix system?
For a Unix system, follow these steps to build Vim from the sources:
- Download the source and run-time files archive (vim-##.tar.bz2) from the
ftp://ftp.vim.org/pub/vim/unix directory.
- Extract the archive using the bzip2 and tar utilities using the command:
$ bunzip2 -c <filename> | tar -xf -
- Run the 'make' command to configure and build Vim with the default
configuration.
- Run 'make install' command to install Vim in the default directory.
To enable/disable various Vim features, before running the 'make' command
you can run the 'configure' command with different flags to include/exclude
the various Vim features. To list all the available options for the
'configure' command, use:
$ configure -help
He wants to make sure that the characters per line and lines per page can be set in the script. So to do that I've done the following:
#!/usr/bin/perl
#
# Reformat the project3Data.txt file to make it more accessible
use strict;
use warnings;

my $file = '/u/home/jhart2/perl/project3Data.txt';    # Name the file
open(INFO, '<', $file) or die "Cannot open $file: $!\n";    # Open the file for input

my $NoOfCharsPerLine = 80;
my $NoOfLinesPerPage = 100;
my $RightMargin = 7;
my $header = 3;
my $footer = 5;
my @argument;

# Read command line arguments
@argument = split(/=/, $ARGV[0]);
if ($argument[1] =~ /\D/)
{
    print "Nonnumeric option value \"$argument[1]\" --script aborted\n";
    exit;
}
if ($argument[0] eq "--chars")
{
    $NoOfCharsPerLine = $argument[1];
}
elsif ($argument[0] eq "--lines")
{
    $NoOfLinesPerPage = $argument[1];
}
else
{
    print "unrecognized command line option \"$argument[0]\" --script aborted\n";
    exit;
}
close(INFO);    # Close the file
print "$NoOfCharsPerLine\n";
print "$NoOfLinesPerPage\n";
So, because I'm so new, I'm thinking that maybe I just don't know what to look for. I've used my textbook, Google, and forums like this one, but I'm still at a loss. All I need is for someone to point me in the right direction to get farther than testing the chars/line and the lines/page. Thanks so much!
Edited by planetscape - added readmore tags
Re: Formatting Text
by ruzam (Curate) on May 12, 2006 at 18:13 UTC
You may want to consider looping through @ARGV so you can get more than one input argument.
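A minimal sketch of such a loop, assuming the same --chars=N and --lines=N syntax as the original script. The parse_args name and the %opt hash are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: handle every "--name=value" argument, not just the first.
sub parse_args {
    my @args = @_;
    my %opt = (chars => 80, lines => 100);    # defaults from the original script
    for my $arg (@args) {
        # Accept only --chars=<digits> or --lines=<digits>; anything else aborts.
        my ($key, $value) = $arg =~ /^--(chars|lines)=(\d+)$/
            or die "bad option \"$arg\" (expected --chars=N or --lines=N)\n";
        $opt{$key} = $value;
    }
    return %opt;
}

my %opt = parse_args(@ARGV);
print "$opt{chars}\n";
print "$opt{lines}\n";
```

Keeping the settings in one hash means a later option only needs another alternative in the pattern and another default.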
Showing you how to do it would be far easier than pointing you in the right direction, but here are my thoughts...
You can't build a table of contents until you know what page the sections are on.
You can't build an index to key words until you know what page the keywords are on.
And finally you don't know what page you're on until you squeeze the words into the given line length and squeeze the lines into the given lines per page and account for new sections.
So off the top, you're going to have to read the entire file before you can write the contents, so you should be thinking about how you're going to store the file data in variables.
I would think about reformatting each line as you read it to fit the chars/line restriction.
I would think about keeping a running page counter as you read through the file.
I would think about counting words and keeping a list of page numbers for each word.
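One way to picture that bookkeeping, as a rough sketch rather than a solution. It assumes the lines have already been wrapped to the chosen width, and that any line starting with SECTION opens a new chapter on a fresh page. The names paginate, %toc, %index and %count are hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: walk already-wrapped lines, keep a running page
# counter, and record where each section heading and each word lands.
sub paginate {
    my ($lines_per_page, @lines) = @_;
    my ($page, $line_on_page) = (1, 0);
    my (%toc, %index, %count);
    for my $line (@lines) {
        my $is_section = $line =~ /^SECTION\s/;
        # Break the page when a new section starts on a non-empty page,
        # or when the current page is already full.
        if (($is_section && $line_on_page > 0) || $line_on_page >= $lines_per_page) {
            $page++;
            $line_on_page = 0;
        }
        $line_on_page++;
        $toc{$line} = $page if $is_section;
        for my $word (lc($line) =~ /\b[a-z]+\b/g) {
            $count{$word}++;             # total occurrences, for the "10 times" rule
            $index{$word}{$page} = 1;    # a hash of pages avoids duplicates
        }
    }
    return (\%toc, \%index, \%count);
}

my @lines = (
    'SECTION 1 - FIRST',  'alpha beta', 'gamma alpha', 'delta',
    'SECTION 2 - SECOND', 'alpha',
);
my ($toc, $index, $count) = paginate(2, @lines);
print "$_ starts on page $toc->{$_}\n" for sort keys %$toc;
print "alpha appears on pages ",
    join(', ', sort { $a <=> $b } keys %{ $index->{alpha} }), "\n";
```

Because everything is collected before anything is printed, the table of contents and the index can be written once the last line has been processed.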
Re: Formatting Text
by Zaxo (Archbishop) on May 12, 2006 at 18:45 UTC
ruzam++ has given you some very good advice about how to organize your algorithm.
You don't say how much of perl you are expected to know or use. CPAN has lots of modules which would help, up to nearly solving the whole thing. I'll mention a couple which would help with discrete pieces of the problem.
You start by parsing an option from the command line. The Getopt::Long module provides the standard way of doing that. It works for multiple options and preserves input filenames listed on the command line. That is probably where you should get the name of the file to crunch. Getopt::Long is a base module distributed with perl, so if you are allowed any modules at all, it should be acceptable.
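A short sketch of what that could look like, assuming the option names --chars and --lines from the original script. The parse_options wrapper is hypothetical and exists only so the example is easy to try; GetOptionsFromArray is the array-based variant of GetOptions exported on request by Getopt::Long:

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical wrapper: parse --chars and --lines, leave file names behind.
sub parse_options {
    my @args = @_;
    my %opt = (chars => 80, lines => 100);    # defaults from the original script
    GetOptionsFromArray(
        \@args,
        'chars=i' => \$opt{chars},            # "=i" requires an integer value
        'lines=i' => \$opt{lines},
    ) or die "usage: $0 [--chars=N] [--lines=N] file\n";
    return (\%opt, @args);                    # whatever is left is not an option
}

my ($opt, @files) = parse_options(@ARGV);
print "$opt->{chars} chars per line, $opt->{lines} lines per page\n";
print "input file: $_\n" for @files;
```

With this in place both `--lines=60` and `--lines 60` are accepted, and the input file name can simply be the remaining argument.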
The roman numeral requirement for paging is an entertaining subproblem which your instructor may wish you to solve for yourself. There is the Math::Roman module to do it for you, however. You can increment a Math::Roman numeric variable and stringification will produce the roman representation. Math::Roman is not a base module, but is available from CPAN.
Good luck, and have fun. This seems like an enjoyable exercise to learn with.
| [reply] |
|
If the instructor is giving an assignment to write code in Perl, I would expect them to be aware of all Perl code that is commonly known to be available on the internet (i.e. CPAN). Assigning a problem where a known solution is posted on CPAN seems absurd to me. A better approach in my mind would be to say: study the Math::Roman module that can be found on CPAN and write a program that shows me you know how it works.
If a person enjoys reinventing the wheel, they can do it on their own time, and more power to them. Making that a classroom assignment would be offensive to me.
Re: Formatting Text
by kwaping (Priest) on May 12, 2006 at 18:40 UTC
Re: Formatting Text
by TedPride (Priest) on May 12, 2006 at 19:47 UTC
Here's a simple function for Roman numerals:
use strict;
use warnings;

for (1..100) {
    print roman($_), "\n";
}

BEGIN {
    my %roman = ( 1 => 'I', 4 => 'IV', 5 => 'V', 9 => 'IX', 10 => 'X',
                  40 => 'XL', 50 => 'L', 90 => 'XC', 100 => 'C', 400 => 'CD',
                  500 => 'D', 900 => 'CM', 1000 => 'M' );
    my @roman = sort { $b <=> $a } keys %roman;

    sub roman {
        my ($n, $r) = $_[0];
        for (@roman) {
            next if $_ > $n;
            $r .= $roman{$_} x ($n / $_);
            $n = $n % $_;
        }
        return $r;
    }
}
I discovered a relatively simple explanation for the algorithm here.
Re: Formatting Text
by TedPride (Priest) on May 12, 2006 at 20:10 UTC
use strict;
use warnings;

my $text = join '', <DATA>;
print wordwrap($text, 45);

sub wordwrap {
    my ($text, $width, $result) = @_;
    for (split /\n/, $text) {
        while (length($_) > $width) {
            # Break at the last whitespace that fits; if a single word is
            # longer than $width, hard-break it so the loop always advances.
            s/^(.{1,$width})\s+// or s/^(.{$width})//;
            $result .= "$1\n";
        }
        $result .= "$_\n";
    }
    return $result;
}

__DATA__
When in the Course of human events, it becomes necessary for one people to dissolve the political bands which have connected them with another, and to assume among the powers of the earth, the separate and equal station to which the Laws of Nature and of Nature's God entitle them, a decent respect to the opinions of mankind requires that they should declare the causes which impel them to the separation.
Re: Formatting Text
by moklevat (Priest) on May 12, 2006 at 21:18 UTC
Since you are just looking for a pointer in the right direction, for formatting text I have found the Perl6::Form module to be very effective.
Re: Formatting Text
by SamCG (Hermit) on May 12, 2006 at 18:46 UTC
I'd add that you also need to know what constitutes a section/chapter title, both for centering and building your TOC.
I'd suggest regular expressions for this -- look at what's common on all the chapters of interest. Regular expressions will also likely be helpful in other parts of your task . . .
And, you should look at printf/sprintf.
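To make those two hints concrete, here is a small sketch. The patterns are guesses based only on the sample text in the question ("SECTION 35 - ..." and "35.1. ..."), and the classify and toc_line names are hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: decide what kind of line this is.
sub classify {
    my ($line) = @_;
    return ('section',  $1, $2) if $line =~ /^SECTION\s+(\d+)\s+-\s+(.+)$/;
    return ('question', $1, $2) if $line =~ /^(\d+\.\d+)\.\s+(.+)$/;
    return ('text');
}

# Hypothetical helper: one table-of-contents line made of the title,
# dot leaders, and the page label, exactly $width characters wide
# (assumes the title and page label fit within $width).
sub toc_line {
    my ($title, $page, $width) = @_;
    my $dots = '.' x ($width - length($title) - length($page) - 2);
    return sprintf '%s %s %s', $title, $dots, $page;
}

my @kind = classify('SECTION 35 - BUILDING VIM FROM SOURCE');
print "$kind[0] $kind[1]: $kind[2]\n";
print toc_line($kind[2], 'xii', 60), "\n";

# printf pads for you: %-20s left-justifies, %5s right-justifies.
printf "%-20s|%5s|\n", 'left-justified', 'right';
```

The captured section number and title are exactly the pieces a table of contents needs, so the same match can feed both the centering step and the TOC.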
Good luck!
-----------------
s''limp';@p=split '!','n!h!p!';s,m,s,;$s=y;$c=slice @p1;so brutally;d;$n=reverse;$c=$s**$#p;print(''.$c^chop($n))while($c/=$#p)>=1;
Re: Formatting Text
by TedPride (Priest) on May 12, 2006 at 20:19 UTC
And centering text (assuming the headings are smaller than the width you want - if not, you may want to word wrap first and center each line):
use strict;
use warnings;

my $heading = "NIMBLE CATS FALL SAFELY";
my $width = 80;
print '-' x $width, "\n";
print center($heading, $width), "\n";
print '-' x $width, "\n";

sub center {
    my ($heading, $width) = @_;
    return ' ' x (($width - length($heading)) / 2) . $heading;
}
Re: Formatting Text
by Trix606 (Monk) on May 12, 2006 at 19:05 UTC
I would find a copy of Learning Perl somewhere (library, book store, a friend's bookshelf). It has clear explanations of exactly what you need to do this assignment. Regexes for picking out the different lines of text you need to handle, Formats for printing your new file the way you want. Learning Perl will get you up to speed on what you need in no time. (Rhyme unintentional but I'll take it.)