Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl-Sensitive Sunglasses
 
PerlMonks  

Please review documentation of my AI::Embedding module

by Bod (Parson)
on Jun 02, 2023 at 21:26 UTC ( [id://11152618]=perlmeditation: print w/replies, xml ) Need Help??

Could you please take a look at the documentation for my new module and let me know if it makes sense? I always find that I am too close to the module and know what everything is supposed to do. In short, I have the Curse of Knowledge!

Here is the documentation

Why is it you only find typos after publishing?
The second raw_embedding method should read test_embedding in both the heading and the sample code. I've corrected this error now.

Thank you greatly for helping me get this right...

Edit:

Changed title from "RFC - Documentation Review" to "Please review documentation of my AI::Embedding module" as considered by erzuuli

  • Comment on Please review documentation of my AI::Embedding module

Replies are listed 'Best First'.
Re: RFC - Documentation Review
by hv (Prior) on Jun 02, 2023 at 22:21 UTC

    Typos: compatator; tyhe; chargable (should be chargeable [1]); "will random" => "will be random"; ACKNOWLEDGEMENTS section misses a trailing full stop.

    Interface: returning the HTTP::Tiny response object on failure of various methods means you always need to call success (or error) to know how to look at the response. If I were using this, I'd rather it returned something more easily testable such as undef - the HTTP::Tiny response could be included in the error() return value, or be provided by a separate http_error method.

    Interface: it seems strange to have the comparator be built in to the object. To me it would make more sense to have the comparator method return a subref (and remove it as a new option).

    Text: "It requires the 'key' parameter." is not needed, the following sentence already says this (and it is reiterated with the individual parameter listing). Use of the word "homogeneous" is odd: I'm not sure in what way the string form is more "homogeneous" than the array form. The text generally seems to assume that the only use for this interface is to rank search matches, I'm not sure how appropriate that is. More generally I'm unsure about calling the class "Embedding": if I correctly understand what I've read, an instance of this class is not an embedding but an embedder - it is something capable of providing embeddings.

    Hope this helps. :)

    [1] I use the phrase "city centre; cat cut cot" to remind me that in almost all English words, a 'c' or 'g' is hard when followed by 'i' or 'e', but soft when followed by 'a', 'u' or 'o'. Thus "chargable" can't be right - it would have to be pronounced with a hard 'g', so it gets an extra 'e' inserted to make it soft. Similarly "manageable" and "enforceable"; but "reproducible" is fine.

      Thank you hv for your valued input.

      Typos: compatator; tyhe; chargable (should be chargeable 1); "will random" => "will be random"; ACKNOWLEDGEMENTS section misses a trailing full stop.

      All corrected. I use Grammarly for all the writing I do. But it doesn't work in my text editor so doesn't correct errors in POD. Perhaps I need to copy POD into something Grammarly does check before uploading it.

      returning the HTTP::Tiny response object on failure of various methods means...

      Good point!
      It was probably laziness on my part which could do with revisiting. It's on the ToDo List.

      it seems strange to have the comparator be built in to the object.

      I feel the term 'comparator' is unclear. But I cannot think of a better one!
      When the compare method is called with two parameters, there is some processing of both to convert them into hashrefs. If one is feeding the same parameter to compare repeatedly many times, this processing can add up. So the comparator method does the conversion just once and stores the hashref to be compared to the single parameter fed to compare.

      If you can suggest a better method name, that would be great.

      Use of the word "homogeneous" is odd

      I mean that one would not be interested in the discreet values of the array, only the array as a whole. Because the whole array needs to be stored as a whole and not as parts, it makes sense (to me at least) to have it as a "homogeneous" string of values. This is easy to store in a database.

      Hope this helps. :)

      Tremendously thank you :)

        I feel the term 'comparator' is unclear. But I cannot think of a better one!

        I have no problem with the name, only with the interface - it doesn't make sense (as far as I can see) to embed the comparator within the object. Apart from anything else that makes it harder to have multiple comparators, for no obvious benefit.

        I'm imagining a simple curry like:

        sub comparator { my($self, $embed) = @_; return sub { $self->compare($embed, @_); }; }

        .. and documentation like:

        comparator

        my $comparator = $embedding->comparator($csv_embedding1); ... my $comparison = $comparator->($csv_embedding2);

        Returns a subroutine reference that can be used for repeated compare calls for the given vector against different secondary vectors, that returns the same type of result as compare.

        Update: I forgot to include the extra work for comparator, it should probably look more like:

        sub comparator { my($self, $embed) = @_; my $vector1 = $self->_make_vector($embed); return sub { my($embed2) = @_; my $vector2 = $self->_make_vector($embed2); return $self->_compare_vector($vector1, $vector2); }; }

        .. where _compare_vector would be factored out of the last 9 lines of compare.

      Use of the word "homogeneous" is odd

      I've been doing some weekend reading on PyTorch - not because I want to use it but because I want a high-level understanding of what it does and what it is used for...

      In the Wikipedia article, they too use the word "homogeneous" to describe the multi-dimensional arrays.

      "PyTorch defines a class called Tensor (torch.Tensor) to store and operate on homogeneous multidimensional rectangular arrays of numbers."

        I'm not familiar with PyTorch so this is purely a guess.

        In the examples, all of the array elements are floating point numbers: that's probably what homogeneous means in this context.

        By contrast, arrays with a mix of floats, integers, strings, objects, and so on, would be heterogeneous.

        — Ken

Re: RFC - Documentation Review
by kcott (Archbishop) on Jun 03, 2023 at 00:51 UTC

    G'day Bod,

    I always start my POD with:

    =encoding utf8

    It's part of my template for module code so I don't have to do anything manually to get this. See "perlpod: =encoding encodingname" if you're unfamiliar with this POD command.

    "... under the same terms as the Perl 5 programming language system itself."

    This looks strained (for want of a better word). I'd remove the word "system"; and consider whether I even want "itself".

    "Why is it you only find typos after publishing?"

    Checking your own work is not the best idea because your brain already knows what you meant and that's what it tends to see. However, if you're like me, you don't have the benefit of a proof-reader on hand. I generally try to read POD in different contexts:

    • viewing the POD in the source code using my editor
    • `perldoc lib/Some/Module.pm` while coding
    • `perldoc Some::Module` after make install
    • `man Some.Module` after make install

    Each of those will likely provide a slightly different view; typos will tend to be more prominent. Changing the width of your screen may also give a different view; I can never be bothered to go to that much effort.

    I see you have 't/pod.t' with Test::Pod which should pick up typos in POD commands (e.g. =begun); and 't/pod-coverage.t' with Test::Pod::Coverage which should pick up typos in subroutine names. However, these are set up as "Author tests" and, from "Test::More fails...", you don't seem to be running these:

    t/manifest.t ...... skipped: Author tests not required for installatio +n t/pod-coverage.t .. skipped: Author tests not required for installatio +n t/pod.t ........... skipped: Author tests not required for installatio +n

    Those are messages for users installing your module; you should do `export RELEASE_TESTING=1` before your `make test` (or similar; e.g. I see gmake test in that post).

    In addition, there are other Test::Pod::* modules which you may find helpful. I don't use any of these, so I'll make no recommendations, but I see some for checking spelling, links, and so on. Search for these from MetaCPAN.

    — Ken

      >> "... under the same terms as the Perl 5 programming language system itself."
       
      This looks strained (for want of a better word). I'd remove the word "system"; and consider whether I even want "itself".

      Good point...
      I wasn't sure if changing the wording of licence agreements was a good idea as I have no idea how much "legal rigour" they have been subjected to. I didn't write that but rather chose it from the options available within Module::Starter

      I see you have 't/pod.t' with Test::Pod which should pick up typos

      I noticed those as well but haven't touched them as I don't understand them and note that publishing the module works without altering them. Again, they were added by Module::Starter.

      First, I want to get the module published, available and usable...then I will (hopefully) get a better understanding of how I can improve the details and the system for creating and publishing modules so that future ones will be better.

      Seems like the POD tests need to be looked at first :)

        "... haven't touched them as I don't understand them and note that publishing the module works without altering them."

        You should run all of your tests and ensure they pass before even thinking about publishing. If you don't understand them, read the documentation: Test::Pod & Test::Pod::Coverage. If you don't understand the documentation, ask us — you would be fully aware that we're more than happy to help.

        I downloaded the tarball from the "Here is the documentation" link, https://metacpan.org/release/BOD/AI-Embedding-0.1_2/view/lib/AI/Embedding.pm, that you provided in your OP: "https://cpan.metacpan.org/authors/id/B/BO/BOD/AI-Embedding-0.1_2.tar.gz". I set up a directory for testing, then unpacked the tarball, giving:

        ken@titan ~/tmp/pm_11152618_rfc_doco/AI-Embedding-0.1_2 $ ls -al total 23 drwxr-xr-x 1 ken None 0 Jun 2 22:14 . drwxr-xr-x 1 ken None 0 Jun 3 09:14 .. -rw-r--r-- 1 ken None 425 Jun 2 21:54 Changes drwxr-xr-x 1 ken None 0 Jun 2 22:15 lib -rw-r--r-- 1 ken None 1819 Jun 2 21:50 Makefile.PL -rw-r--r-- 1 ken None 304 Jun 2 22:14 MANIFEST -rw-r--r-- 1 ken None 1169 Jun 2 22:14 META.json -rw-r--r-- 1 ken None 675 Jun 2 22:14 META.yml -rw-r--r-- 1 ken None 1353 May 30 20:54 README drwxr-xr-x 1 ken None 0 Jun 2 22:14 t

        Those are the files you should have in front of you (albeit in a different parent directory). Here's what you should do:

        ken@titan ~/tmp/pm_11152618_rfc_doco/AI-Embedding-0.1_2 $ perl Makefile.PL Checking if your kit is complete... Looks good Generating a Unix-style Makefile Writing Makefile for AI::Embedding Writing MYMETA.yml and MYMETA.json ken@titan ~/tmp/pm_11152618_rfc_doco/AI-Embedding-0.1_2 $ make cp lib/AI/Embedding.pm blib/lib/AI/Embedding.pm Manifying 1 pod document ken@titan ~/tmp/pm_11152618_rfc_doco/AI-Embedding-0.1_2 $ export RELEASE_TESTING=1 ken@titan ~/tmp/pm_11152618_rfc_doco/AI-Embedding-0.1_2 $ make test PERL_DL_NONLAZY=1 "/home/ken/perl5/perlbrew/perls/perl-5.36.0/bin/perl +.exe" "-MExtUtils::Command::MM" "-MTest::Harness" "-e" "undef *Test:: +Harness::Switches; test_harness(0, 'blib/lib', 'blib/arch')" t/*.t t/00-load.t ....... Use of uninitialized value in concatenation (.) or + string at /home/ken/tmp/pm_11152618_rfc_doco/AI-Embedding-0.1_2/blib +/lib/AI/Embedding.pm line 59. t/00-load.t ....... 1/1 # Testing AI::Embedding 0.12, Perl 5.036000, / +home/ken/perl5/perlbrew/perls/perl-5.36.0/bin/perl t/00-load.t ....... ok t/manifest.t ...... skipped: Test::CheckManifest 0.9 required t/pod-coverage.t .. Use of uninitialized value in concatenation (.) or + string at /home/ken/tmp/pm_11152618_rfc_doco/AI-Embedding-0.1_2/blib +/lib/AI/Embedding.pm line 59. t/pod-coverage.t .. 1/1 # Failed test 'Pod coverage on AI::Embedding' # at /home/ken/perl5/perlbrew/perls/perl-5.36.0/lib/site_perl/5.36.0 +/Test/Pod/Coverage.pm line 133. # Coverage for AI::Embedding is 77.8%, with 2 naked subroutines: # test # test_embedding # Looks like you failed 1 test of 1. t/pod-coverage.t .. Dubious, test returned 1 (wstat 256, 0x100) Failed 1/1 subtests t/pod.t ........... ok Test Summary Report ------------------- t/pod-coverage.t (Wstat: 256 (exited 1) Tests: 1 Failed: 1) Failed test: 1 Non-zero exit status: 1 Files=4, Tests=3, 1 wallclock secs ( 0.05 usr 0.01 sys + 0.36 cusr + 0.39 csys = 0.81 CPU) Result: FAIL Failed 1/4 test programs. 1/3 subtests failed. make: *** [Makefile:857: test_dynamic] Error 255

        I'm running Perl v5.36.0; I pretty sure that has nothing to do with any of your test failures. Here's points regarding those problems:

        • t/00-load.t: Fix the code such that you don't get that warning. `make test` might not care about that; but you should.
        • t/manifest.t: I didn't have Test::CheckManifest installed so the test is skipped. I'll get back to that; see below.
        • t/pod-coverage.t:
          • When you fix t/00-load.t, you'll have fixed the same warning here.
          • You have two subroutines in your module, test & test_embedding, which are not documented in your POD: add appropriate documentation.

        Test::CheckManifest — I've had no end of problems with this module. Just to demonstrate, I installed it and reran `make install`:

        ken@titan ~/tmp/pm_11152618_rfc_doco/AI-Embedding-0.1_2 $ make test PERL_DL_NONLAZY=1 "/home/ken/perl5/perlbrew/perls/perl-5.36.0/bin/perl +.exe" "-MExtUtils::Command::MM" "-MTest::Harness" "-e" "undef *Test:: +Harness::Switches; test_harness(0, 'blib/lib', 'blib/arch')" t/*.t t/00-load.t ....... Use of uninitialized value in concatenation (.) or + string at /home/ken/tmp/pm_11152618_rfc_doco/AI-Embedding-0.1_2/blib +/lib/AI/Embedding.pm line 59. t/00-load.t ....... 1/1 # Testing AI::Embedding 0.12, Perl 5.036000, / +home/ken/perl5/perlbrew/perls/perl-5.36.0/bin/perl t/00-load.t ....... ok t/manifest.t ...... Bailout called. Further testing stopped: Cannot +find a MANIFEST. Please check! t/manifest.t ...... Dubious, test returned 255 (wstat 65280, 0xff00) Failed 1/1 subtests FAILED--Further testing stopped: Cannot find a MANIFEST. Please check! make: *** [Makefile:857: test_dynamic] Error 255

        I gave up on this module some years ago and instead use what it suggests in "Test::CheckManifest - REPLACE THIS MODULE". I use more descriptive test names:

        is_deeply([manicheck()], [], 'Check files in "MANIFEST" exist.'); is_deeply([filecheck()], [], 'Check for files not in "MANIFEST" or "MA +NIFEST.SKIP".');
        "Again, they were added by Module::Starter."

        You are under no obligation to use that directly. I use Module::Starter::PBP as a plugin. Here's my config:

        $ cat ~/.module-starter/config author: Ken Cotterill email: kcott@cpan.org builder: ExtUtils::MakeMaker plugins: Module::Starter::PBP template_dir: /home/ken/.module-starter/P536

        I have multiple directories under /home/ken/.module-starter/ for different Perl versions and different types of modules; P536 is for Perl v5.36.0 templates.

        Depending on what you decide after receiving advice on "How Critical is Critic?", this plugin may be useful in that respect. I don't use it for that purpose. What I like is the ability to customise the templates; I'm no longer locked in to what Module::Starter provides. Consider the hoops you had to jump through when dealing with "CPAN Ratings..."; you could have just made a text change to your own template; I acknowledge that was a good learning exercise with respect to GitHub.

        — Ken

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlmeditation [id://11152618]
Approved by kcott
Front-paged by kcott
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others sharing their wisdom with the Monastery: (2)
As of 2024-04-20 15:59 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found