For the time being, you can peruse Word and (crude) HTML versions of the 10 tutorial chapters of my book. I've reached the 2/3 milestone for the manuscript, and should have the entire work done and being edited again in a month. Hopefully by then it will appear on Manning's page of upcoming titles.

I welcome almost any comments on the information and exercises. Be warned that this site may close down at any time, or at least become password protected. Also, be aware that the content is in a state of flux, and stuff may be added (like some missing summaries and exercises) or removed at any point. Therefore, please refrain from commenting "you wrote 'chapter XXX' -- is that a typo?" or "there are no exercises for chapter N." Rather, let me know if I overlook something you would have expected to read, or whether I get too technical early on, or whether I dumb it down too much. Helpful stuff like that.

Also, real-world examples of the usefulness of look-ahead and look-behind would be appreciated (and credited appropriately).

_____________________________________________________
Jeff[japhy]Pinyan: Perl, regex, and perl hacker.
s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

Re: Regular Expressions in Perl: tutorial section (10 chapters) completed
by enoch (Chaplain) on Aug 30, 2001 at 23:20 UTC
           The only time I can remember using a look-ahead in work code was when I was writing a script that would peer through our CVS tree every Tuesday and print out all of the source code that had been modified in the last week. (This forced us to do code reviews on Tuesday because we would walk into the office and see the heaping pile of code.)
           At any rate, we only wanted to print out certain files in our CVS (we keep everything there). We did not want to print .txt or .html or other files like that. So, to accomplish this, I did the following:
    opendir PROJ, "./$currentProject" or die "Could not open CVS dir $currentProject: $!\n";
    my @files = grep /((\w|_)+(?=\.(pl|php|inc|js)$))/, readdir PROJ;
    closedir PROJ;
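    Inside grep, only the success or failure of the match matters, so the look-ahead there selects the same names an ordinary trailing match would. It starts to matter once the matched text is captured. The following is a minimal sketch of that case, not the original script: the sub name base_names and the sample file names are made up for illustration, and the pattern is anchored at the start, which the original is not.

```perl
use strict;
use warnings;

# Hypothetical helper: return the base names of files with a wanted extension.
# The look-ahead (?=...) requires the extension to follow the name but does
# not consume it, so $1 holds the name without the extension.
sub base_names {
    my @bases;
    for my $name (@_) {
        push @bases, $1 if $name =~ /^(\w+)(?=\.(?:pl|php|inc|js)$)/;
    }
    return @bases;
}

# Made-up file names standing in for what readdir would return.
my @names = qw(report.pl index.php notes.txt lib.inc app.js page.html README);
print "$_\n" for base_names(@names);
```

    With these sample names, only report, index, lib, and app are printed; the .txt and .html files and the extensionless README are skipped.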
          I look forward to your book.

    Jeremy
Re: Regular Expressions in Perl: tutorial section (10 chapters) completed
by toma (Vicar) on Aug 31, 2001 at 09:10 UTC
    I would like to see something on the role of regular expressions in taint checking.

    I liked the section explaining the security implications of embedding perl code in regular expressions.

    It would be helpful to have a full section on validating program input. Perl is a great tool for validation, and it would be helpful to show techniques for doing it correctly.
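    As a rough illustration of how regexes tie into both points: under taint mode (perl -T), text captured by a successful match is normally treated as untainted, so a strict, fully anchored pattern can serve as the validator. The sketch below is only one way to do it. The sub name clean_username and the 1 to 16 word-character rule are assumptions made for the example, not anything taken from the book.

```perl
use strict;
use warnings;

# Hypothetical validator: accept 1 to 16 word characters and nothing else.
# \A and \z anchor the whole string, so a trailing newline is rejected too.
# Under perl -T the captured $1 would normally be untainted, which is why
# the pattern should be no looser than the later use of the value allows.
sub clean_username {
    my ($input) = @_;
    return $input =~ /\A(\w{1,16})\z/ ? $1 : undef;
}

for my $candidate ('japhy', 'rm -rf /', '') {
    my $clean = clean_username($candidate);
    printf "%-12s => %s\n", "'$candidate'", defined $clean ? 'accepted' : 'rejected';
}
```

    A pattern like /(.*)/ would also "untaint" its input, which defeats the purpose; the value of the technique lies entirely in how restrictive the pattern is.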

    On a lighter note, one example uses the string "36, 22, 74, hike!" It would be even better to use a more realistic signal, as explained here.

    Thanks for the preview of the book! I'll be sure to buy a copy when it comes out.

    It should work perfectly the first time! - toma

Re: Regular Expressions in Perl: tutorial section (10 chapters) completed
by SuperCruncher (Pilgrim) on Aug 31, 2001 at 02:22 UTC
    This might not be a welcome comment but I feel it may be useful. I'm surprised you're using MS Word to write your manuscript. I consider Word to be a great program, but it is not a real, professional document-producing system. It is great for small, quick documents but not for books.

    I think LaTeX would be a better choice. It really produces amazing results, and it is used for real books, e.g. Applied Operating System Concepts. LaTeX documents can be easily converted to PostScript, PDF and HTML. I have written 100+ page documents with LaTeX, and the output is truly amazing.

    As you're two-thirds of the way through your book, I don't suppose this is of much help to you now... but all would-be book authors take note! I'm attempting to write a book at the minute (nothing to do with Perl though) and LaTeX is what I'll be using.

      The problem is, even with many (not all) publishers who do books about free software, there are so many in-house macros and templates that work with Word, it's hard not to force authors to use it. Of course, my book's written in DocBook Lite, using vim, but I'm weird that way, and have a couple of macros.

      I knew a small poetry publisher that couldn't find any printing houses willing to do small runs in anything other than Word format. There were a lot of wasted cycles at that company, and I almost took a job there to convert things to a more useful format. Long story.

      XML's nice for this sort of thing, because XSL can translate it into just about any other format.

      Let's be honest here. Much as it pains me to say this, the world runs on Microsoft Office; call it the lingua franca of information interchange.

      Perhaps Word isn't the best tool for the job, but it's usually good enough and you can bet the farm that whatever software the other party is using can import it, largely intact.

      Sure, LaTeX could be the better choice. So were BetaMax, OS/9 (vs. Apple DOS & CP/M), and arguably, OS/2 (IBM's version, that is). But so it goes...

      Just spouting off,

      dmm

      Just call me the Anti-Gates ...
      
        The popularity of LaTeX varies by field. In math and physics it rules, and will continue to do so. In other subjects it doesn't.

        However, where you want to use TeX or LaTeX is for any documentation which you want to archive. Word is simply a horrible choice for this. While it is a good bet that people today can read Word, it is a bad bet that in 15 years someone who finds a Word document will know what to do with it. However, if you take a TeX document that was written 15 years ago, it isn't that hard for a person to read it in a text editor. Furthermore, on the operating system of your choice it is possible to take the document and print it, with the only difference from the original being caused by the limitations of your output format. (The output of TeX is specified down to the visible wavelength of light.) In fact, the odds are extremely high that the program you would use 15 years from now is exactly the same as the one that you would use today, no matter what operating system you are on. (TeX is the only widely used program that I know of whose development has stopped. What people develop are macro packages to use on top of the basic program. But it has not been touched since March of 1995.)

        There is no other document format which can make equivalent claims. For instance PostScript cannot readily be read as text, and the output is not even guaranteed to remain the same from one printer to another. Microsoft has problems correctly importing documents written 5 years ago. There are many minor variations on groff out there, and the toolset is not widely used outside of Unix.

        It is to be hoped that tools like LyX will get more people into TeX. But no matter what happens, there are niches it dominates today, and it shows no sign of losing them for a very long time to come.

        The idea that LaTeX belongs in the same set of superior technology that nobody uses, as suggested by your examples, would come as shocking news to the many publishers that wouldn't take anything but! As long as your information exchange is nothing more than what can be done in simple typesetting terms, I suppose you could get by with Word, but as soon as you try to do anything outside of the narrow boundaries that the Redmond developers conceived of, you are seriously out of luck. Even with the advances made since TeX was invented, Word just isn't in the same class.

        hsm