A couple of comments. First, don't call awk from Perl. Perl can do everything that awk can do, most of the time more efficiently and more simply.
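To illustrate the point, here is a small sketch of how an awk one-liner such as awk '{ print $2 }' might be written in pure Perl. The sub name second_field and the sample line are made up for the example; they are not from your post.

```perl
use strict;
use warnings;

# Rough equivalent of awk '{ print $2 }': return the second
# whitespace-separated field of a line (hypothetical helper name).
sub second_field {
    my ($line) = @_;
    # split ' ' behaves like awk's default field splitting:
    # leading whitespace is skipped, runs of whitespace separate fields.
    my @fields = split ' ', $line;
    return $fields[1];
}

print second_field("alpha beta gamma"), "\n";
```

Note that Perl arrays start at index 0, so awk's $2 is $fields[1] here.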

If you need to traverse a directory tree recursively, then File::Find is the module to use. But if you have a single directory, or a bunch of directories that are not nested inside one another, then I would rather use the glob function on each such directory; it is simpler to use for a beginner.
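A minimal sketch of both approaches, side by side. The sub names files_in_dirs and files_in_tree are my own invention for the example; glob and File::Find's find are the real core functions.

```perl
use strict;
use warnings;
use File::Find;

# Plain files directly inside each listed directory (no recursion).
# Caveat: glob splits its pattern on whitespace, so this simple form
# assumes the directory names contain no spaces.
sub files_in_dirs {
    my @dirs = @_;
    my @files;
    for my $dir (@dirs) {
        push @files, grep { -f } glob("$dir/*");
    }
    return @files;
}

# All plain files anywhere below a top directory (recursive).
sub files_in_tree {
    my ($top) = @_;
    my @files;
    # find() calls the sub once per entry; $File::Find::name holds the full path.
    find(sub { push @files, $File::Find::name if -f }, $top);
    return @files;
}

# Usage: perl list_files.pl dir1 dir2 ...
print "$_\n" for files_in_dirs(@ARGV);
```

If the directories are unrelated, the first sub is all you need; the second one only pays off when there are subdirectories to walk.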

Given what you intend to do, I would recommend against slurping the files. Just read them one by one, each one line by line, and apply a split or a regex to each line.
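Something along these lines, as a sketch only. Since I have not seen your data, the rule here is pure assumption: I pretend a valid line has exactly three comma-separated fields, the last one an integer. The sub names line_is_valid and check_file are hypothetical too.

```perl
use strict;
use warnings;

# ASSUMED rule, for illustration only: exactly three comma-separated
# fields, and the third field is a non-negative integer.
sub line_is_valid {
    my ($line) = @_;
    my @fields = split /,/, $line, -1;   # -1 keeps trailing empty fields
    return @fields == 3 && $fields[2] =~ /^\d+$/;
}

# Read one file line by line and return a list of "file:line_no: text"
# entries for every line that breaks the rule.
sub check_file {
    my ($path) = @_;
    my @bad;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    while (my $line = <$fh>) {
        chomp $line;
        # $. is the current line number of the file being read
        push @bad, "$path:$.: $line" unless line_is_valid($line);
    }
    close $fh;
    return @bad;
}

# Usage: perl check.pl file1 file2 ...
print "$_\n" for map { check_file($_) } @ARGV;
```

Only one line is held in memory at a time, which is the whole point of not slurping when there are many files.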

Then, you have to think about your output, about which you said very little. I would imagine, from what you said about the number of files, that you probably only want to output the lines (with their file names) that don't satisfy your rules. Printing to the screen might be sufficient, but you may want to consider printing to a file (or several files).
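As a rough sketch of that choice, a small sub can send the report lines either to the screen or to a file. The name write_report and the sample report line are invented for the example.

```perl
use strict;
use warnings;

# Write report lines to $outfile if one is given, otherwise to the screen.
# Returns the number of lines written (hypothetical helper).
sub write_report {
    my ($lines, $outfile) = @_;
    my $out;
    if (defined $outfile) {
        open $out, '>', $outfile or die "Cannot write $outfile: $!";
    }
    else {
        $out = \*STDOUT;
    }
    print {$out} "$_\n" for @$lines;
    close $out if defined $outfile;   # never close STDOUT here
    return scalar @$lines;
}

# Made-up report line, printed to the screen because no file name is passed.
write_report(['data/file1.txt:3: bad line']);
```

Passing a second argument, for instance write_report(\@bad, 'report.txt'), would put the same lines in a file instead.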

Yes, it would be useful if you provided a sample of the data format, even with bogus content.

Finally, I would recommend that you start by writing some code and showing it here. I am sure you will get guidance from experienced monks and learn a lot in the process. You obviously don't really know where to start, so start with something simpler than what you need. Break it down into smaller tasks that are easier to master.

For example, you might start with a program that reads all the files of a directory and just prints their contents to the screen (use a dummy directory with just 2 or 3 files for a start). Once this works, you can add more of the functionality that you need, such as handling several directories instead of just one, filtering the data that you want to display, and writing to a file instead of displaying on the screen.
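That first step could look something like this. It is only a starting sketch; the sub name print_dir is made up, and it assumes the directory name contains no spaces (glob splits on whitespace).

```perl
use strict;
use warnings;

# Print the contents of every plain file in one directory to the screen.
# Returns the number of files printed (hypothetical helper).
sub print_dir {
    my ($dir) = @_;
    my $count = 0;
    for my $file (sort glob("$dir/*")) {
        next unless -f $file;              # skip subdirectories and the like
        my $fh;
        unless (open $fh, '<', $file) {
            warn "Cannot open $file: $!";  # complain, but keep going
            next;
        }
        print "== $file ==\n";
        while (my $line = <$fh>) {
            print $line;
        }
        close $fh;
        $count++;
    }
    return $count;
}

# Usage: perl print_dir.pl some_dummy_dir
print_dir($ARGV[0]) if @ARGV;
```

Once this runs on your dummy directory, each later step (more directories, a filter on the lines, a file for the output) is a small change to one part of it.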

You are on the verge of undertaking a great journey. You just have to dare to start. Go ahead and do it; nothing bad can happen. I sincerely hope that you will enjoy it as much as I did when I started writing programs about 35 years ago, and that you will soon share my passion for it.


In reply to Re: Perl beginner here, needs a shove in the right direction. by Laurent_R
in thread Perl beginner here, needs a shove in the right direction. by rfromp
