slugger415 has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks,

I am trying to parse a text file that has many lines beginning with # (pound character). When I read the file and print the lines out, it seems to ignore all of those lines. For instance, here are the contents of my history.txt file:

#EXTM3U
#EXTINF:547,Excerpt from the Binti
Excerpt from the Binti novella series by Nnedi Okorafor.mp3

When I read this file and print each line, I only see:

Excerpt from the Binti novella series by Nnedi Okorafor.mp3

The lines beginning with # are ignored; here's my code:

#! /usr/bin/perl
use strict;
my $history = shift;
open my $fh, "<:encoding(utf8)", "$history" or die "$history: $!";
while (my $line=<$fh>){
    print $line;
}
close($fh);
exit;

I'm on Windows 10 using Strawberry Perl version 30, subversion 1 (v5.30.1). Am I doing something wrong? Thanks for any advice.

Re: Reading lines beginning with pound # ignored
by kcott (Archbishop) on Jun 18, 2023 at 05:35 UTC

    G'day slugger415,

    Check the line endings of the file.

    I suspect this is happening:

    $ perl -e '
        my @lines = ("1\r", "12\r", "123\r");
        print for @lines;
    '
    123

    While you're expecting this:

    $ perl -e '
        my @lines = ("1\n", "12\n", "123\n");
        print for @lines;
    '
    1
    12
    123

    — Ken
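One way to check the line endings, as suggested above, is to read the file in raw mode and count each kind of terminator. This is only a minimal sketch: the helper name count_line_endings is made up for illustration, and it assumes the file is small enough to slurp into memory.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count each kind of line terminator in raw file data.
sub count_line_endings {
    my ($data) = @_;
    my %count = (CRLF => 0, CR => 0, LF => 0);
    # Try CRLF first so that its CR and LF are not counted separately.
    while ($data =~ /(\r\n|\r|\n)/g) {
        my $kind = $1 eq "\r\n" ? 'CRLF' : $1 eq "\r" ? 'CR' : 'LF';
        $count{$kind}++;
    }
    return \%count;
}

# When a file name is given, read it with the :raw layer so that
# no I/O layer translates CRLF to LF before we look at the bytes.
if (@ARGV) {
    my $file = shift;
    open my $fh, '<:raw', $file or die "$file: $!";
    my $data = do { local $/; <$fh> };    # slurp the whole file
    close $fh;
    my $count = count_line_endings($data);
    printf "%-4s %d\n", $_, $count->{$_} for qw(CRLF CR LF);
}
```

If the CR count is non-zero while CRLF and LF are both zero, the file uses bare carriage returns, and a terminal will overprint each "line" on top of the previous one.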

Re: Reading lines beginning with pound # ignored
by tybalt89 (Monsignor) on Jun 18, 2023 at 07:42 UTC

    Try this:

    #! /usr/bin/perl
    use strict;
    my $history = shift;
    open my $fh, "<:encoding(utf8)", "$history" or die "$history: $!";
    while (my $line=<$fh>){
        $line =~ tr/\r/\n/; # NOTE convert carriage returns to new lines
        print $line;
    }
    close($fh);
    exit;

    and see what you get.

    EDIT: apply transliteration to proper variable.

      Your transliteration operates on $_ which is uninitialised. You'd get a warning with "use warnings;". You want:

      $line =~ tr/\r/\n/;

      — Ken

      Awesome! transliteration did the trick, thank you!

Re: Reading lines beginning with pound # ignored
by Bod (Parson) on Jun 18, 2023 at 17:19 UTC
    has many lines beginning with # (pound character)

    I suppose this is a reminder/warning to be aware of geography...

    To me here in the UK, a pound character is £ and the one you mentioned is a hash character #

    I had assumed that everyone, everywhere, calls them the same thing. Perhaps because social media has hashtags that start with # and the HTML entity for what I call a 'pound' is &pound;

      I was confused too and checked Wikipedia, apparently # derives from an older symbol for some pound unit to measure mass.

      > ... pound sign. The symbol has historically been used for a wide range of purposes including the designation of an ordinal number and as a ligatured abbreviation for pounds avoirdupois – having been derived from the now-rare ℔

      For me a pound is 500g = weight of half a liter water at sea level; with a liter = 1000 cm³ (yes the metric system is sooo boring ;)

      The pound sterling OTOH is much lighter nowadays ;)

      Cheers Rolf
      (addicted to the 𐍀𐌴𐍂𐌻 Programming Language :)
      Wikisyntax for the Monastery

        Yes, I remember when I was a child, some people would abbreviate the German word for pound (weight) - "Pfund" - like this, which seems to be derived from ℔ (or "\N{L B BAR SYMBOL}") and has a very rough resemblance to "#".

      In America, the telephones have

      1 2 3 4 5 6 7 8 9 * 0 #
      and every voice menu I've ever heard has called it "press pound sign for...". Aside from that, most usage outside of phones or business calls it "number sign" like in the stylized "#1" ("number one") written on first-place ribbons or something.

      When I got started on Unix I thought "hash" was programmer slang, like "bang" is for the exclamation point. I picked it up and started using it out of a desire to join programmer culture. Now I've just learned it was a British-ism :-)

        every voice menu I've ever heard has called it "press pound sign for..."

        In Germany, the phone key # is usually called "Raute". And it has driven me nuts for decades, because a Raute is a rhombus. But people have misnamed it for decades, and so it became an official name. It has several other, better names, see https://de.wikipedia.org/wiki/Doppelkreuz_(Schriftzeichen).

        Alexander

        --
        Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

        In America, the telephones have...

        I had to get up the dial pad on my mobile to check 😃
        But it's the same here in the UK - at least it is on my Chinese-manufactured mobile!

        When I got started on Unix I thought "hash" was programmer slang, like "bang" is for the exclamation point. I picked it up and started using it out of a desire to join programmer culture. Now I've just learned it was a British-ism :-)

        ...and one we've exported to the world through the hashtag but ironically thanks to Chris Messina, an American blogger...

        The official name for the exclamation mark or bang character is a pling. I learnt that from a school friend many, many years ago and very few people seem to ever use the term besides me and (presumably) said friend...

Re: Reading lines beginning with pound # ignored
by BillKSmith (Monsignor) on Jun 23, 2023 at 15:52 UTC
    I know that I am a week late and slightly off topic. You probably want to read your file line-by-line rather than slurping the whole thing. Use $INPUT_RECORD_SEPARATOR.
    use strict;
    use warnings;
    use autodie;
    use English;

    my $mac_file =
        '#EXTM3U' . "\N{CARRIAGE RETURN}"
        . '#EXTINF:547,Excerpt from the Binti' . "\N{CARRIAGE RETURN}"
        . 'Excerpt from the Binti novella series by Nnedi Okorafor.mp3' . "\N{CARRIAGE RETURN}";

    open my $MAC_FH, '<', \$mac_file;
    my @lines;
    while (my $line = do { local $INPUT_RECORD_SEPARATOR = "\r"; <$MAC_FH> }) {
        #$line =~ tr/\r/\n/;
        push @lines, $line;
    }
    close $MAC_FH;

    print "Line0: ", $lines[0], "\n";
    print "Line1: ", $lines[1], "\n";
    print "Line2: ", $lines[2], "\n";
    Bill
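If the line-ending style is not known in advance, a different approach from the record-separator one above is to slurp the text and split on any of the three common terminators. The following is just a sketch under that assumption: split_any_eol is a hypothetical helper name, and the sample data mirrors the CR-terminated playlist from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: split text into lines on CRLF, a lone CR, or a lone LF.
# The terminators themselves are dropped from the returned lines.
sub split_any_eol {
    my ($text) = @_;
    # CRLF is listed first so it is consumed as a single terminator.
    return split /\r\n|\r|\n/, $text;
}

# Sample data: the playlist from the question with bare carriage returns.
my $mac_text = "#EXTM3U\r"
             . "#EXTINF:547,Excerpt from the Binti\r"
             . "Excerpt from the Binti novella series by Nnedi Okorafor.mp3\r";

my @lines = split_any_eol($mac_text);
printf "Line%d: %s\n", $_, $lines[$_] for 0 .. $#lines;
```

Note that split discards trailing empty fields, so blank lines at the very end of the text are dropped, while blank lines in the middle come back as empty strings. This trades the memory cost of slurping for not having to know the terminator up front.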
Re: Reading lines beginning with pound # ignored
by Anonymous Monk on Jun 23, 2023 at 15:18 UTC

    # also could be a Sharp sign in music (1/2 step up from whatever music letter A through G precedes it), at least from this American's viewpoint

Re: Reading lines beginning with pound # ignored
by Anonymous Monk on Jun 18, 2023 at 00:56 UTC
    Make sure you are reading the correct file.