in reply to Re^2: Sort alphabetically from file
in thread Sort alphabetically from file

If you are new to Perl, you might like diagnostics, which won't throw more errors but will give you messages that are (hopefully) more informative. So your script(s) should start with
use strict; use warnings; use diagnostics;

Re^4: Sort alphabetically from file
by edujs7 (Novice) on Jun 15, 2019 at 10:24 UTC
    noted. Thank you very much for your support.

      Also, if you're on Windows like me, I use open($file, '<', shift) or die "$!"; and then immediately binmode($file);.

      Commands or parameters or filenames added to the command line when calling the script get put into an array called @ARGV and when you call shift it increments $ARGV[0] to $ARGV[1] to $ARGV[2] and so on for each shift used.

      So, if you ran C:\path\to\script\perl my_script.pl file_1.txt outfile.txt, then you could use shift a second time to open() the output file (use three-arg open) and, instead of printing to the console window, write the output to $outFile.

      use strict;
      use warnings;

      my %hash;
      open(my $inFile,  '<', shift) or die "$!";
      open(my $outFile, '>', shift) or die "$!";
      binmode($inFile);
      binmode($outFile);

      # Read line by line; skip lines that do not match instead of stopping at them.
      while (my $line = <$inFile>) {
          next unless $line =~ /(\d)\s+(\d)\s+(\d)\s+(\w+)/;
          push @{$hash{$4}}, $1, $2, $3;
      }
      print $outFile "@{$hash{$_}}[0..2] $_\n" for sort keys %hash;
      Usage: C:\path\to\script\perl my_script.pl inFile.txt outFile.txt

      Also, please note that the output separates the columns with a single space, so any extra spaces (or tabs) between columns in the input are not preserved. As long as that does not corrupt your data set it should be fine. It actually may save you some hard drive space. :)

      EDITED: fixed typo in matching patterns, thanks haukex

      EDITED: changed and made obvious that the individual needs to make absolutely certain that this does not corrupt anything in their data set.

      EDITED: had to add a new paragraph so my second EDIT looked ok.

        when you call shift it increments $ARGV[0] to $ARGV[1] to $ARGV[2] and so on for each shift used.

        No, shift removes the first element of @ARGV on each call, returning the element it removed.
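        A minimal sketch of that behaviour, in case it helps. The file names are hypothetical, and @ARGV is assigned by hand here only to stand in for a real command line:

```perl
use strict;
use warnings;

# Stand-in for running: perl my_script.pl inFile.txt outFile.txt
# (hypothetical names; a real script would not assign to @ARGV itself)
@ARGV = ('inFile.txt', 'outFile.txt');

my $in  = shift;    # outside a sub, shift works on @ARGV by default
my $out = shift;    # each call removes and returns the current first element

print "in=$in out=$out left=", scalar(@ARGV), "\n";
# prints: in=inFile.txt out=outFile.txt left=0
```

        Nothing is incremented: after the two calls @ARGV is simply empty.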

        /(\d)\s*(\d)\s*(\d)\s*(\w*)/

        Note that this will also match a line as simple as "123", or really anything that has three consecutive digits, since that's the only thing this regex requires. I would strongly recommend using \s+, \d+, and \w+, and anchoring the regex to the beginning and end of the string with ^ and $ respectively.
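        Here is a rough sketch of what such a stricter pattern could look like. The sub name parse_line and the sample lines are made up for illustration, and allowing optional whitespace at the start and end of the line is my assumption, not something the data format is known to need:

```perl
use strict;
use warnings;

# Returns (n1, n2, n3, word) for a well-formed line, or an empty list otherwise.
# parse_line is a hypothetical helper name used only for this example.
sub parse_line {
    my ($line) = @_;
    return $line =~ /^\s*(\d+)\s+(\d+)\s+(\d+)\s+(\w+)\s*$/;
}

for my $line ("1 2 3 apple\n", "123\n", "10  20\t30 pear\r\n") {
    my @fields = parse_line($line);
    print @fields ? "match: @fields\n" : "no match\n";
}
# prints:
# match: 1 2 3 apple
# no match
# match: 10 20 30 pear
```

        With the anchors and the + quantifiers in place, "123" no longer matches, while multi-digit numbers and mixed whitespace still do.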

        As long as that does not corrupt your data set it should be fine (and i am sure it is fine)

        Sorry, but how can you be sure? Some file formats require \t as a column separator.
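        If the file does turn out to be tab-separated, one possible approach is to split and re-join on the tab itself so the separator survives. This is only a sketch with made-up sample rows, and it assumes the sort key is the last column:

```perl
use strict;
use warnings;

# Hypothetical sample rows; the columns are separated by tabs.
my @rows = ("3\t4\t5\tpear", "1\t2\t3\tapple");

# Sort by the last column. Splitting and joining on "\t" means the
# output uses the same separator as the input.
my @sorted = map  { join("\t", @$_) }
             sort { $a->[-1] cmp $b->[-1] }
             map  { [ split /\t/ ] }
             @rows;

print "$_\n" for @sorted;
```

        The same idea works for any fixed separator; the point is only that the code has to know what the separator is rather than assume a single space.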

        Update: Expanded the last quote and highlighted the part I was reacting to.