PerlMonks  

Is a here-doc a good way to retrieve data file contents for testing?

by Lotus1 (Vicar)
on Jan 17, 2021 at 01:35 UTC ( [id://11127014]=perlquestion )

Lotus1 has asked for the wisdom of the Perl Monks concerning the following question:

While reading perlop, I found a suggestion to use chomp() to get rid of the extra newline at the end of my here-doc. For testing purposes I would like my test function to return exactly what the file contains. This is the first time I have noticed or cared about this, so I thought I would share what I learned.

I'm working on my first module to upload to CPAN and I plan to put this function in a module in 't/lib'. Is there a nicer way to handle retrieving the data? I could just make the function slurp the JSON file and return it. I've been looking at other modules but haven't found a good example yet. Thanks.

use warnings;
use strict;
use Test::More tests => 1;

print "-------\n";
print json_q();
print "-------\n";
print json_here();
print "-------\n";

is(json_q(), json_here(), "should be the same");

sub json_q {
q(      "Type": 0,
      "Width": 504,
      "X": 18,
      "Y": 18
    }
  ]
});
}

sub json_here {
chomp(my $json = <<'END_JSON');
      "Type": 0,
      "Width": 504,
      "X": 18,
      "Y": 18
    }
  ]
}
END_JSON
$json
}

The output looks like:

1..1
-------
      "Type": 0,
      "Width": 504,
      "X": 18,
      "Y": 18
    }
  ]
}-------
      "Type": 0,
      "Width": 504,
      "X": 18,
      "Y": 18
    }
  ]
}-------
ok 1 - should be the same
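The slurp alternative mentioned in the question could look roughly like the sketch below. The helper name slurp_raw is hypothetical, and the throwaway temp file only stands in for a real fixture (which might live somewhere like t/data/, also an assumption). Reading with the :raw layer returns the bytes exactly as stored, including the absence of a final newline.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return the exact contents of a fixture file.
sub slurp_raw {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    local $/;                # no record separator: read the whole file at once
    my $content = <$fh>;
    close $fh;
    return $content;
}

# Demonstration with a temporary file whose last line has no newline.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
print {$out} qq(  "Y": 18\n}\n]\n});
close $out;

my $json = slurp_raw($path);
print "trailing newline: ", ($json =~ /\n\z/ ? "yes" : "no"), "\n";   # prints "trailing newline: no"
```

With this approach no chomp is needed, because nothing adds a newline that the file does not contain.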

Replies are listed 'Best First'.
Re: Is a here-doc a good way to retrieve data file contents for testing?
by haukex (Archbishop) on Jan 17, 2021 at 10:03 UTC
    For testing purposes I would like my test function to return exactly what the file contains.

    For tests, I think either of what you showed is ok, plus the variations shown by the others. Here are my comments on various ways to get multiline data, in particular data for tests:

    • Here docs are probably one of the most common ways to get multiline strings. One minor potential caveat is that line endings in here docs are always \n's. If you want to explicitly test different line endings, then I recommend qq strings instead and explicit backslash escapes (\n, \r, etc.) instead of literal line breaks. Also, if indented here docs are desired, properly unindenting them in Perl before 5.26 requires a bit of extra code.

    • Multiline strings work, although PBP recommends against them (ProhibitImplicitNewlines).

    • __DATA__ is another alternative. One potential caveat to remember here is that DATA is the same filehandle that is used to read the Perl source file, so for example if you've got use utf8;, that remains enabled when reading from __DATA__ (unless changed with binmode). Plus, if you want to get multiple multiline strings from __DATA__, you'll have to split them yourself or use a module for that (e.g. Inline::Files, though that's a fairly powerful module).

    • Of course, you can always use actual files. Personally I often generate them using functions like this, and as mentioned above, using qq{} strings and backslash escapes so that I have full control over what the files will look like.

    • Though I wouldn't recommend this for embedded test data, sometimes there may be instances where you want to test something that's part of the POD documentation. For example, as part of my author tests I often test code snippets from my POD like this (I've even written a fairly complex parsing module, though I now think that's only worth it if there's a lot of code in the POD).
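Two of the points above, sketched under assumptions: explicit escapes in qq{} strings to control line endings, and manual unindenting of a here-doc for Perl versions before 5.26. The helper name unindent is hypothetical, and it only strips the indentation found on the first line, which is a simplification rather than a general solution.

```perl
use strict;
use warnings;

# Explicit escapes make the line endings part of the test data itself.
my $unix = qq{line one\nline two\n};
my $dos  = qq{line one\r\nline two\r\n};

# Hypothetical helper for Perl before 5.26 (newer Perls offer <<~ instead):
# remove the first line's leading whitespace from the start of every line.
sub unindent {
    my ($text) = @_;
    my ($indent) = $text =~ /\A([ \t]*)/;
    $text =~ s/^\Q$indent\E//mg;
    return $text;
}

# The terminator still has to sit at the start of its line here.
my $doc = unindent(<<'END_TEXT');
    Hello
      World
END_TEXT

print $doc;
```

Here $doc ends up with "Hello" at column 0 and "World" keeping its two extra spaces of relative indentation.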

    Also note that if you need filehandles, you can use in-memory opens, which are relatively performant at splitting strings into lines.

    open my $fh, '<', \<<'EOF' or die $!;
    Hello
    World
    EOF
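A minimal sketch of reading from such an in-memory handle line by line, assuming the data sits in a scalar ($data is just an illustrative name) rather than in a here-doc:

```perl
use strict;
use warnings;

my $data = "Hello\nWorld";    # note: no newline after the last line

# Opening a reference to a scalar gives a read-only in-memory filehandle.
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @lines;
while ( my $line = <$fh> ) {
    chomp $line;              # harmless on the last line, which has no newline
    push @lines, $line;
}
close $fh;

print scalar(@lines), " lines: @lines\n";    # prints "2 lines: Hello World"
```

This lets code that expects a filehandle be tested without creating files on disk.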
      Also note that if you need filehandles[...]

      I have wanted to do this before but didn't know about this feature. Thank you!

Re: Is a here-doc a good way to retrieve data file contents for testing?
by tybalt89 (Monsignor) on Jan 17, 2021 at 01:58 UTC

    Maybe cleaner, or maybe not ...

    #!/usr/bin/perl

    use strict;
    use warnings;
    use Test::More tests => 1;

    print "-------\n";
    print json_q();
    print "-------\n";
    print json_here();
    print "-------\n";

    is(json_q(), json_here(), "should be the same");

    sub json_q {
    q(      "Type": 0,
          "Width": 504,
          "X": 18,
          "Y": 18
        }
      ]
    });
    }

    sub json_here { <<'END_JSON' =~ s/\n\z//r }
          "Type": 0,
          "Width": 504,
          "X": 18,
          "Y": 18
        }
      ]
    }
    END_JSON
      sub json_here { <<'END_JSON' =~ s/\n\z//r }

      Although I'd probably have done something similar, I just wanted to point out that s///r requires Perl 5.14+, which may be a limitation if this is a module that is intended to be compatible with a broad range of Perl versions. sub json_here { chomp( my $json = <<'END_JSON'); $json } works fine too.

      tybalt89, you are one of the most unorthodox Perl hackers I've seen so far and I mean it as a compliment. :)

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      Wikisyntax for the Monastery

      UPDATE

      WOW, this even allows interpolating variables from the function scope in the "outside" HERE-DOC ...

      use strict;
      use warnings;

      print "-------\n";
      print json_here(42,666);
      print "-------\n";

      sub chompit {
          my $str = shift;
          chomp $str;
          return $str;
      }

      sub json_here { my $x=shift; chompit << "__JSON__" }
            "Type": 0,
            "Width": 504,
            "X": $x,
            "Y": $_[0]
          }
        ]
      }
      __JSON__

      C:/Perl_524/bin\perl.exe -w d:/tmp/pm/chomp_here.pl
      -------
            "Type": 0,
            "Width": 504,
            "X": 42,
            "Y": 666
          }
        ]
      }-------

        Because it's a lexer thing. To the parser it's just a string *inside* the body of the sub.
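A small sketch of what this means in practice: the body of a here-doc begins on the line after the one containing the << token, wherever that token appears in the statement. The names @parts and greeting are only illustrative.

```perl
use strict;
use warnings;

# Two here-docs announced on one line; their bodies follow in order.
my @parts = (<<'FIRST', <<'SECOND');
alpha
FIRST
beta
SECOND

# The sub body closes on the same line as the << token, yet the text
# below still belongs to the here-doc and sees the sub's lexical $name.
sub greeting { my ($name) = @_; return <<"END_TEXT" }
Hello, $name!
END_TEXT

print $parts[0], $parts[1], greeting('World');
```

The interpolation works because the string is compiled in the scope where the << token appears, not where its text physically sits in the file.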

Re: Is a here-doc a good way to retrieve data file contents for testing? (updated)
by LanX (Saint) on Jan 17, 2021 at 01:51 UTC
    That's how I'd do it.

    Purely for aesthetic reasons.

    use strict;
    use warnings;

    print "-------\n";
    print json_here();
    print "-------\n";

    sub chompit {
        my $str = shift;
        chomp $str;
        return $str;
    }

    sub json_here {
        chompit << '__JSON__' ;
          "Type": 0,
          "Width": 504,
          "X": 18,
          "Y": 18
        }
      ]
    }
    __JSON__
    }

    update

    changed END_JSON to __JSON__

    • I like naming the HERE-ID after the format, that's a plus and can help the IDE with formatting
    • But I don't wanna read END_ at the start
    • Surrounding underscores are a (my?) visual anchor to recognize a here doc

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery

      I like your suggestion to not use 'END' at the start of the here-doc. Showing the format is a great idea. This made me wonder what would happen if I used '__DATA__'. It actually worked!

Re: Is a here-doc a good way to retrieve data file contents for testing?
by BillKSmith (Monsignor) on Jan 17, 2021 at 20:01 UTC
    The 'file' which you are simulating almost certainly has a newline at the end of every (including the last) line. The 'chomp' to remove the newline from the last line of your test file serves as a reminder that the same function is needed in the real code.
    Bill
      The 'file' which you are simulating almost certainly has a newline at the end of every (including the last) line.

      In fact it doesn't. At least you only said 'almost' certainly. LOL. The json file is part of an archive that is exported from my phone. I wrote a text extractor to get all my memos from these archive files. The json file it extracts has no newline on the last line hence this discussion.

Node Type: perlquestion [id://11127014]
Approved by LanX