Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I have a script which dumps a table. When I pipe it to the same script, I use IO::Select and can_read() to check for input on STDIN. My script works fine when I run it through the debugger, but sadly can_read() fails to detect anything when I don't. I tried writing a test script to see if I can reproduce the issue, but unfortunately the test script works fine either way.

Interestingly, when I put a print statement (marked HERE) in my script, the can_read() seems to work.

Any ideas ? Thank you very much for your time !!!

#!/usr/bin/perl5.26.1
use strict;
use IO::Select;
use Text::ASCIITable;

$|=1;

my $t = Text::ASCIITable->new({ headingText => 'Basket' });
$t->setCols('Id','Name','Price');
$t->addRow(1,'Dummy product 1',24.4);
$t->addRow(2,'Dummy product 2',21.2);
$t->addRow(3,'Dummy product 3',12.3);
$t->addRowLine();
$t->addRow('','Total',57.9);
print STDOUT $t ."\n";

### print "checking pipe\n"; #<==== HERE

my $s = IO::Select->new();
$s->add(\*STDIN);
if ($s->can_read(.5)) {
    print STDOUT "reading from pipe data\n";
}
#
# TEST RUN
#

# perl test.pl
.------------------------------.
|            Basket            |
+----+-----------------+-------+
| Id | Name            | Price |
+----+-----------------+-------+
|  1 | Dummy product 1 |  24.4 |
|  2 | Dummy product 2 |  21.2 |
|  3 | Dummy product 3 |  12.3 |
+----+-----------------+-------+
|    | Total           |  57.9 |
'----+-----------------+-------'

# perl test.pl | perl test.pl
.------------------------------.
|            Basket            |
+----+-----------------+-------+
| Id | Name            | Price |
+----+-----------------+-------+
|  1 | Dummy product 1 |  24.4 |
|  2 | Dummy product 2 |  21.2 |
|  3 | Dummy product 3 |  12.3 |
+----+-----------------+-------+
|    | Total           |  57.9 |
'----+-----------------+-------'
reading from pipe data

Replies are listed 'Best First'.
Re: IO::Select woes
by hippo (Archbishop) on May 04, 2023 at 17:38 UTC
    I tried writing a test script to see if I can reproduce the issue, but unfortunately the script works fine either way.

    This is the value of an SSCCE - it shows that there is something else amiss in your production script which is causing the problems. Your next task is to identify what else it might be in the production script and to incorporate that into your test script. Keep doing that until you can reproduce the problem.

    Interestingly, when I put a print statement (marked HERE) in my script, the can_read() seems to work.

    Given your use of IO::Select it could well be that you've done something unexpected with STDOUT and the currently selected handle in the code you have not shown us. Perhaps that is worthy of your attention?

    Note that in your original post you have used the phrase "the script" and "my script" to refer to both your production script and your test script at various places. This just adds to the confusion. Try to be as specific and unambiguous as possible.


    🦛

      Thank you for taking the time to decipher, hippo! After posting it and reading haj's response, I realized my post was a bit confusing, so I apologize. I guess the point I want to emphasize is that my production script works fine through the debugger. This has happened to me before, and it has always stemmed from libraries that get loaded during a debug session but not in production. I'll keep looking for differences between my prod & test script...
Re: IO::Select woes
by bliako (Abbot) on May 04, 2023 at 17:40 UTC

    I think the can_read()'s timeout, which is the same for both executions in the pipeline, is the culprit. Perhaps you could add a switch for the user to tell the script that it is either in read or write mode?

    Edit: what I mean is that both scripts in the pipeline upon execution wait for at most 0.5s for piped input. That stops the 1st script from outputting anything to be passed to the 2nd script. They are run one after the other with a tiny time difference. Will this difference be enough for the 1st script to output and the 2nd to still be blocked in the can_read()? To see that, clone your test.pl and run the 1st with smaller timeout than the 2nd. Re: debugger, I can not explain what you experience. But I am sure that in this particular case, loading different libraries is not relevant.

    I can't resist offering this halfway measure (don't take it seriously hehe): use a random timeout (can_read(rand)). In this way your pipeline will work correctly half the time, on average (given only 2 items in the pipeline). Correction edit: can_read(0.5+rand)

    If you are a die-hard "do-what-i-mean" perl hacker and insist on not using CLI switches, then here is a tip: in the pipeline perl test.pl | perl test.pl, the pid of the 1st script (left, from where I see it) is usually lower than the pid of the 2nd script (right), though the OS does not guarantee it. (print "my pid is $$\n";)

    1min Edit: Oh, I just remembered that in linux there is the -t test to check if your STDIN is associated with a terminal (your 1st script in the pipeline is, the second is associated with the output of the 1st script). And yes, just checked that in Perl you can do: if( -t STDIN ){ print "$$: not piped\n" } else { print "$$: piped\n" }.
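A minimal sketch of that -t check, wrapped in a hypothetical helper (handle_mode is an illustrative name, not something from the thread). One caveat is built into the wording: -t only says whether the handle is a terminal, so a false result means "not a terminal" (a pipe, a file, /dev/null), not specifically "piped".

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): classify a filehandle.
# -t is true only when the handle is attached to a terminal; anything
# else (pipe, file, /dev/null) is reported as "redirected".
sub handle_mode {
    my ($fh) = @_;
    return (-t $fh) ? 'terminal' : 'redirected';
}

# Diagnostics go to STDERR via warn so they never end up in the pipeline.
warn "[$$] STDIN is ",  handle_mode(\*STDIN),  "\n";
warn "[$$] STDOUT is ", handle_mode(\*STDOUT), "\n";
```

Run on its own and then as both halves of a pipeline, each process should report a different combination for its two handles.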

    And by the way, IO::Select's doc also states how to use the handle(s) which can_read() returns as ready for reading, and those are the handles you should use (instead of using raw "STDIN").
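As a rough illustration of reading from the handles that can_read() hands back: the sketch below uses an in-process pipe instead of STDIN so that it runs without any piping. The helper name read_ready and the 4096-byte chunk size are assumptions made for the example; sysread is used because it bypasses Perl's buffering.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use IO::Select;

# Hypothetical helper: wait up to $timeout seconds, then collect whatever
# is available on the handles that can_read() reports as ready.
sub read_ready {
    my ($select, $timeout) = @_;
    my $data = '';
    for my $fh ($select->can_read($timeout)) {
        # sysread bypasses Perl's buffering, so it agrees with select()
        my $n = sysread($fh, my $buf, 4096);
        die "sysread failed: $!" unless defined $n;
        $data .= $buf if $n > 0;    # $n == 0 would mean EOF
    }
    return $data;
}

# Demo on an in-process pipe so the sketch runs without any piping.
pipe(my $reader, my $writer) or die "pipe failed: $!";
syswrite($writer, "here is my table\n");

my $select = IO::Select->new($reader);
print read_ready($select, 0.5);
```

In the real script the set would presumably be built with IO::Select->new(\*STDIN) instead of the demo pipe.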

    bw, bliako

      I can't resist offering this halfway measure (don't take it seriously hehe): use a random timeout (can_read(rand)).

      Random (or rather "slightly randomized") timeouts and/or sleeps can be quite useful when you have a lot of cyclic process forked from the same parent process to spread the workload more evenly over time (as long as you remember to call srand after forking).

      It's also extremely useful in finding bugs, because it helps simulate the messiness of production environment (compared to the relatively clean one-process-at-a-time developer systems).
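The srand-after-fork remark can be sketched as follows. This is only an illustration: jittered_delay and the delay values are made up for the example, and the parent calls rand() once on purpose so that the generator is already seeded when the children are forked.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: $base seconds plus up to $spread seconds of jitter.
sub jittered_delay {
    my ($base, $spread) = @_;
    return $base + rand($spread);
}

rand();    # the parent has already used the generator, so it is seeded

for my $worker (1 .. 3) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    next if $pid;    # parent: go on to fork the next worker

    # Child: reseed, otherwise every child inherits the parent's
    # generator state and computes the same "random" delay.
    srand();
    my $delay = jittered_delay(0.1, 0.3);
    select(undef, undef, undef, $delay);    # sub-second sleep
    warn sprintf("[%d] worker %d slept %.3fs\n", $$, $worker, $delay);
    exit 0;
}

1 while wait() != -1;    # reap all children
```

Commenting out the srand() line should make the three workers report identical delays, which is the failure mode the reminder is about.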

      Thank you thank you !! It took me a day to absorb your post, but it is absolutely genius! Your comments are very insightful... and I believe you are absolutely correct about the randomization. It seems to work now, but as you say, only about half the time !! I don't understand what's going on here at all!

      ## first run with added randomization
      pid 46234 NOT piped
      pid 46234 checking pipe
      pid 46235 piped
      pid 46235 checking pipe
      pid 46235 cannot read

      ## 2nd run with added randomization
      pid 46327 NOT piped
      pid 46327 checking pipe
      pid 46328 piped
      pid 46328 checking pipe
      pid 46327 cannot read
      reading from pipe data   <= IT CAN READ
      pid 46328 can read
Re: IO::Select woes
by haukex (Archbishop) on May 06, 2023 at 10:15 UTC

    I assume this is a continuation of checking for piped input data? I am wondering why you are using IO::Select at all - it seems to me like IPC may be overcomplicating things a lot. Reading the anonymous posts in that and this thread (it would be easier to follow if you were to create an account), I have to say: please take a moment to take a step back and explain the big picture to us - what are you trying to accomplish? Are you:

    1. Simply trying to get one script to consume the output of a second script? Then all you need is print and <>:
      $ cat producer.pl
      #!/usr/bin/env perl
      use warnings;
      use strict;
      print "Pretend this is your table.\n";

      $ cat consumer.pl
      #!/usr/bin/env perl
      use warnings;
      use strict;
      while ( my $line = <> ) {
          chomp($line);
          print "I received: <<$line>>\n";
      }

      $ ./producer.pl | ./consumer.pl
      I received: <<Pretend this is your table.>>
      $ ./producer.pl >table.txt
      $ ./consumer.pl table.txt
      I received: <<Pretend this is your table.>>
      In addition, you really shouldn't be using ASCII-formatted tables as your data exchange format; use something like JSON to pass data between processes and then only format the table when actually displaying it to the user.
    2. Do you have a long-running script that is once in a while calling another script to produce data (like your table)?
      1. If both are Perl scripts that you have control over, then turn the data-producing script into a Perl module that the main script can use.
      2. Otherwise, if you must call an external program, see my node Calling External Commands More Safely; if you scroll down, it has a bunch of example code.
    3. Do you have two long-running scripts running side by side that need to talk to each other? Then I recommend sockets, and then IO::Select may be appropriate (though I personally find it too low-level).
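For option 1 above, the suggestion to exchange JSON rather than a pre-formatted table might look roughly like this. It is a single-file sketch: the producer and consumer halves are shown together, the field names (id, name, price) are assumptions taken from the table columns, and JSON::PP is used only because it ships with Perl.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use JSON::PP;    # core module since Perl 5.14

my $json = JSON::PP->new->canonical;    # canonical: stable key order

# Producer side: the rows as data, one JSON document per line.
my @rows = (
    { id => 1, name => 'Dummy product 1', price => 24.4 },
    { id => 2, name => 'Dummy product 2', price => 21.2 },
    { id => 3, name => 'Dummy product 3', price => 12.3 },
);
my $line = $json->encode(\@rows);    # a producer would: print "$line\n"

# Consumer side: decode the line (it would come from <>), then format
# it for display only at the very end.
my $decoded = $json->decode($line);
printf "%2d  %-15s  %5.1f\n", @{$_}{qw(id name price)} for @$decoded;
```

The consumer could just as well feed the decoded rows into Text::ASCIITable; the point is that formatting happens last.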
Re: IO::Select woes
by haj (Vicar) on May 04, 2023 at 15:26 UTC
    You should explain in more detail what you expect. I am not surprised by the outcome: Your "piped" version reports reading from pipe data which indicates that can_read returned something with a true value. But then, you don't actually read from STDIN.
      Hi Haj, thank you for taking the time to review my issue and my sincere apologies for not being clear. The test script is the expected outcome and works as shown when reading from the pipe. My actual script ONLY enters the can_read() if clause when I am debugging. When my test script dumps the table to STDOUT, can_read() checks STDIN (SHOWN), and once input is detected, I plan to read the data within the if clause (NOT SHOWN).
Re: IO::Select woes
by ikegami (Patriarch) on May 06, 2023 at 06:12 UTC

    Are you using read, readline (aka <>) or eof? Those aren't compatible with select. You need to use sysread.
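A small sketch of why the buffered functions and select don't mix, assuming default PerlIO buffering and using an in-process pipe in place of STDIN: readline pulls more than one line into Perl's own buffer, while select only looks at what is left in the OS-level pipe.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use IO::Select;

pipe(my $reader, my $writer) or die "pipe failed: $!";
syswrite($writer, "line 1\nline 2\n");    # two lines, one write

my $select = IO::Select->new($reader);

# Buffered read: readline typically pulls everything available into
# Perl's own buffer and hands back only the first line.
my $first = <$reader>;

# select() looks at the OS-level pipe, which is now empty, so it reports
# nothing ready even though "line 2" is still sitting in Perl's buffer.
my @ready = $select->can_read(0);
warn "after readline: ", scalar(@ready), " handle(s) ready\n";

my $second = <$reader>;    # still succeeds, served from the buffer
warn "but readline still returns: $second";
```

With sysread there is no hidden buffer, so what select reports and what the next read returns stay in step.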

      > Are you using read, readline (aka <>) or eof? Those aren't compatible with select. You need to use sysread.

      Good point. Reminded me of this old node, an example of a complete event-driven Perl server using IO::Select. From its source code:

      # -----------------------------------------------------------------
      # Perl network programming notes
      # ------------------------------
      # Reading can be done with:
      #  1) <$sock>                     - blocks till new line
      #  2) read($sock, $buf, $len)     - blocks till $len bytes received
      #                                   (like W Richard Stevens readn function in C)
      #  3) sysread($sock, $buf, $len)  - may return less than asked for
      #                                   (like read()/recv() in C)
      # -------------------------------------------------------
      ...
      # This one blocks until newline received.
      # $data = <$client>;

      # This one blocks till $len bytes received (like Stevens readn function in C)
      # my $hdr;
      # my $hdrlen = read( $client, $hdr, $SYSLOG_HDR_LEN );

      # This one may return less than asked for (like read()/recv() in C)
      my $msglen = sysread( $client, $data, $SYSLOG_MAX_MSG_LEN );

      Given the commenting out above, I presumably tried all three at the time before settling on sysread. :)

Re: IO::Select woes [Prime with NULL]
by kcott (Archbishop) on May 06, 2023 at 01:40 UTC

    If you send a NULL ("\0") to STDOUT, nothing happens with a standalone script, but it will be recognised as STDIN by a script to which it is piped.

    I believe the following script (no_woes.pl) resolves all of the various issues that I've seen in this thread.

    #!/usr/bin/env perl

    use strict;
    use warnings;

    use constant READ_TIMEOUT => 0.5;

    use Text::ASCIITable;
    use IO::Select;

    if (-t \*STDIN) {
        warn "[$$] STDIN: TTY\n";
    }
    else {
        warn "[$$] STDIN: PIPE\n";
    }

    if (-t \*STDOUT) {
        warn "[$$] STDOUT: TTY\n";
    }
    else {
        warn "[$$] STDOUT: PIPE\n";
    }

    $| = 1;

    print "\0" if -t \*STDIN;

    my $io_select = IO::Select->new(\*STDIN);

    if ($io_select->can_read(READ_TIMEOUT)) {
        print for <STDIN>;
    }
    else {
        print get_table();
    }

    sub get_table {
        my $t = Text::ASCIITable::->new({ headingText => 'Basket' });

        $t->setCols('Id', 'Name', 'Price');
        $t->addRow(1, 'Dummy product 1', 24.4);
        $t->addRow(2, 'Dummy product 2', 21.2);
        $t->addRow(3, 'Dummy product 3', 12.3);
        $t->addRowLine();
        $t->addRow('', 'Total', 57.9);

        return $t;
    }

    Standalone output:

    $ ./no_woes.pl
    [1564] STDIN: TTY
    [1564] STDOUT: TTY
    .------------------------------.
    |            Basket            |
    +----+-----------------+-------+
    | Id | Name            | Price |
    +----+-----------------+-------+
    |  1 | Dummy product 1 |  24.4 |
    |  2 | Dummy product 2 |  21.2 |
    |  3 | Dummy product 3 |  12.3 |
    +----+-----------------+-------+
    |    | Total           |  57.9 |
    '----+-----------------+-------'

    Piped output:

    $ ./no_woes.pl | ./no_woes.pl
    [1566] STDIN: PIPE
    [1566] STDOUT: TTY
    [1565] STDIN: TTY
    [1565] STDOUT: PIPE
    .------------------------------.
    |            Basket            |
    +----+-----------------+-------+
    | Id | Name            | Price |
    +----+-----------------+-------+
    |  1 | Dummy product 1 |  24.4 |
    |  2 | Dummy product 2 |  21.2 |
    |  3 | Dummy product 3 |  12.3 |
    +----+-----------------+-------+
    |    | Total           |  57.9 |
    '----+-----------------+-------'

    Multiple pipes:

    $ ./no_woes.pl | ./no_woes.pl | ./no_woes.pl
    [1568] STDIN: PIPE
    [1568] STDOUT: PIPE
    [1567] STDIN: TTY
    [1567] STDOUT: PIPE
    [1569] STDIN: PIPE
    [1569] STDOUT: TTY
    .------------------------------.
    |            Basket            |
    +----+-----------------+-------+
    | Id | Name            | Price |
    +----+-----------------+-------+
    |  1 | Dummy product 1 |  24.4 |
    |  2 | Dummy product 2 |  21.2 |
    |  3 | Dummy product 3 |  12.3 |
    +----+-----------------+-------+
    |    | Total           |  57.9 |
    '----+-----------------+-------'

    $ ./no_woes.pl | ./no_woes.pl | ./no_woes.pl | ./no_woes.pl
    [1571] STDIN: PIPE
    [1571] STDOUT: PIPE
    [1572] STDIN: PIPE
    [1572] STDOUT: PIPE
    [1573] STDIN: PIPE
    [1573] STDOUT: TTY
    [1570] STDIN: TTY
    [1570] STDOUT: PIPE
    .------------------------------.
    |            Basket            |
    +----+-----------------+-------+
    | Id | Name            | Price |
    +----+-----------------+-------+
    |  1 | Dummy product 1 |  24.4 |
    |  2 | Dummy product 2 |  21.2 |
    |  3 | Dummy product 3 |  12.3 |
    +----+-----------------+-------+
    |    | Total           |  57.9 |
    '----+-----------------+-------'

    That was run on Cygwin: "perl 5, version 36, subversion 0 (v5.36.0) built for cygwin-thread-multi".

    AM comment:

    "Note: If you are on windows this will never work as select is only implemented for sockets, and not STDIN."

    That same code works on Win10 (Strawberry Perl): "perl 5, version 26, subversion 3 (v5.26.3) built for MSWin32-x64-multi-thread". The output is very similar, so I've put it in a spoiler to avoid cluttering the thread.

    — Ken

Re: IO::Select woes
by Anonymous Monk on May 05, 2023 at 16:19 UTC

    Hi again Monks,

    I have duplicated the issue. I just simply wrapped the STDIN check in a function. When I run this very simple script through the pipe, can_read() doesn't detect anything on STDIN. When I run it through the debugger, it works fine. When I uncomment the print statement in checkPipe, it triggers something and works fine without the debugger! What am I possibly doing wrong here? I am losing my sanity! Could it possibly be the timeout as Bliako stated? I tried changing it, but it doesn't seem to change anything.

    #
    # TEST RUN
    #

    # perl test2.pl | perl test2.pl
    ## PRINTS NOTHING !!!

    # perl test2.pl | perl -d test2.pl
    reading from pipe data
    .------------------------------.
    |            Basket            |
    +----+-----------------+-------+
    | Id | Name            | Price |
    +----+-----------------+-------+
    |  1 | Dummy product 1 |  24.4 |
    |  2 | Dummy product 2 |  21.2 |
    |  3 | Dummy product 3 |  12.3 |
    +----+-----------------+-------+
    |    | Total           |  57.9 |
    '----+-----------------+-------'

    ## test2.pl
    #
    #!/usr/bin/perl5.26.1
    use strict;
    use Text::ASCIITable;
    use IO::Select;
    use strict;
    use warnings;

    $|=1;

    checkPipe();

    sub checkPipe {
        #print "checking pipe\n"; #<==== UNCOMMENT IT WORKS!!
        my $s = IO::Select->new();
        $s->add(\*STDIN);
        if ($s->can_read(.5)) {
            print STDOUT "reading from pipe data\n";
            dumpTable();
        }
    }

    sub dumpTable {
        my $t = Text::ASCIITable->new({ headingText => 'Basket' });
        $t->setCols('Id','Name','Price');
        $t->addRow(1,'Dummy product 1',24.4);
        $t->addRow(2,'Dummy product 2',21.2);
        $t->addRow(3,'Dummy product 3',12.3);
        $t->addRowLine();
        $t->addRow('','Total',57.9);
        print STDOUT $t ."\n";
    }

      I think there is a confusion here between what the first process is doing and what the second process is doing.

      I think your intention was that the first process prints the table, and the second process reads it; however both are running the same code, so in the normal course of events the first process sees nothing to read, so does not dump its table; since it outputs nothing, the second process also sees nothing to read, and also does not dump a table.

      When you add print "checking pipe\n";, the first process outputs that, sees nothing to read, and exits. The second process now does see something to read - the diagnostic emitted by the first process - so it now calls dumpTable().

      The first thing I would suggest to reduce confusion is to use "warn" rather than "print" for your diagnostics, so that they show up on STDERR and not on the pipeline; it may also be useful to show the process id ($$) in those diagnostics, so that you can distinguish the two processes.

      I'm not familiar with the debugger, but since it will try to use STDIN to get console input it is certainly likely to confuse the issue.

      I tested this with the code below:

      % cat t0
      #!/usr/bin/perl
      use strict;
      use warnings;
      use IO::Select;
      $|=1;

      warn "pid $$ checking pipe\n";
      my $s = IO::Select->new();
      $s->add(\*STDIN);
      if ($s->can_read(.5)) {
          warn "pid $$ can read\n";
          print "here is my table\n";
      } else {
          warn "pid $$ cannot read\n";
      }
      % ./t0 | ./t0
      pid 24811 checking pipe
      pid 24810 checking pipe
      pid 24811 cannot read
      pid 24810 cannot read
      % echo "go" | ./t0 | ./t0
      pid 26452 checking pipe
      pid 26452 can read
      pid 26453 checking pipe
      pid 26453 can read
      here is my table
      %

      Update: remove unneeded '(a)'

        Thank you hv ! This is so super helpful. I put the following code at the top of my test script, included the diagnostic process warnings, and now the test script works fine. Also, it makes perfect sense that the debugger interacts with STDIN. Unfortunately, I am still stuck with my original problem, so I will continue to make the test script look like my production script. Thank you !

        if (-t STDIN) { dumpTable(); }

      This is a race condition between your processes.

      The process left of the pipe has no STDIN defined, so it runs into the else path no matter what. And then... it terminates, closing its STDOUT during the process.

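      The process right of the pipe tries to read. Whether it finds anything depends on timing: data written by the "left" process, or the EOF produced when the left process exits and closes the pipe, both make STDIN readable, but only if they arrive before the right process's can_read() times out. That's why you can make it work in the debugger (which delays the right process) or with extra print statements (which put data into the pipe right away). In the plain run both processes start almost simultaneously with the same timeout, so the right process can give up before the left one has written or closed anything, and then it doesn't see anything on STDIN.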
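The point that a pipe at EOF still counts as "readable" can be checked in isolation. A minimal sketch, assuming a POSIX-like system and using an in-process pipe in place of the two scripts:

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use IO::Select;

pipe(my $reader, my $writer) or die "pipe failed: $!";
my $select = IO::Select->new($reader);

# Writer still open, nothing written: nothing to read yet.
my @before = $select->can_read(0.1);

# Writer gone without writing anything, like a process that exits silently.
close($writer);
my @after = $select->can_read(0.1);

# The handle is "ready", but the read returns 0 bytes: that is EOF.
my $n = sysread($reader, my $buf, 4096);
warn sprintf("ready before close: %d, after close: %d, bytes read: %d\n",
    scalar(@before), scalar(@after), $n);
```

So a true result from can_read() only means that a read will not block; whether there is actual data has to be decided from what the read returns.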

Re: IO::Select woes
by Anonymous Monk on May 05, 2023 at 07:26 UTC
    Note: If you are on windows this will never work as select is only implemented for sockets, and not STDIN.
Re: IO::Select woes
by Anonymous Monk on May 15, 2023 at 05:59 UTC

    Not sure the other answers completely explained what is wrong, but: when you pipe them, both are executed and start at the 'same time', with one's output piped to the other's input. Thus the second test.pl can start before the first has even output anything.

    The fix is to execute test.pl while capturing its output. Then when that has finished, execute again with the captured output.

    perl test.pl > temp_file perl test.pl < temp_file
    Or some terminals you may be able to do this in one line:
    perl test.pl > temp_file && perl test.pl < temp_file

      perl test.pl > temp_file perl test.pl < temp_file will work only with a semicolon (;) separating the two commands: perl test.pl > temp_file ; perl test.pl < temp_file

      Edit: oh I have just noticed that haukex has added this as a comment in a moderation of the above node. So, I will add another: Or some terminals -> Or some shells.