Please review documentation of my AI::Embedding module

Replies are listed 'Best First'.
Re: RFC - Documentation Review by hv (Prior) on Jun 02, 2023 at 22:21 UTC
Typos: compatator; tyhe; chargable (should be chargeable [1]); "will random" => "will be random"; ACKNOWLEDGEMENTS section misses a trailing full stop. Interface: returning the HTTP::Tiny response object on failure of various methods means you always need to call `success` (or `error`) to know how to look at the response. If I were using this, I'd rather it returned something more easily testable such as undef - the HTTP::Tiny response could be included in the `error()` return value, or be provided by a separate `http_error` method. Interface: it seems strange to have the comparator be built in to the object. To me it would make more sense to have the `comparator` method return a subref (and remove it as a `new` option). Text: "It requires the 'key' parameter." is not needed, the following sentence already says this (and it is reiterated with the individual parameter listing). Use of the word "homogeneous" is odd: I'm not sure in what way the string form is more "homogeneous" than the array form. The text generally seems to assume that the only use for this interface is to rank search matches, I'm not sure how appropriate that is. More generally I'm unsure about calling the class "Embedding": if I correctly understand what I've read, an instance of this class is not an embedding but an embedder - it is something capable of providing embeddings. Hope this helps. :) [1] I use the phrase "city centre; cat cut cot" to remind me that in almost all English words, a 'c' or 'g' is hard when followed by 'i' or 'e', but soft when followed by 'a', 'u' or 'o'. Thus "chargable" can't be right - it would have to be pronounced with a hard 'g', so it gets an extra 'e' inserted to make it soft. Similarly "manageable" and "enforceable"; but "reproducible" is fine.	[reply] [d/l] [select]
Re^2: RFC - Documentation Review by Bod (Parson) on Jun 02, 2023 at 23:05 UTC
Thank you hv for your valued input. Typos: compatator; tyhe; chargable (should be chargeable 1); "will random" => "will be random"; ACKNOWLEDGEMENTS section misses a trailing full stop. All corrected. I use Grammarly for all the writing I do. But it doesn't work in my text editor so doesn't correct errors in POD. Perhaps I need to copy POD into something Grammarly does check before uploading it. returning the HTTP::Tiny response object on failure of various methods means... Good point! It was probably laziness on my part which could do with revisiting. It's on the ToDo List. it seems strange to have the comparator be built in to the object. I feel the term 'comparator' is unclear. But I cannot think of a better one! When the `compare` method is called with two parameters, there is some processing of both to convert them into hashrefs. If one is feeding the same parameter to `compare` repeatedly many times, this processing can add up. So the `comparator` method does the conversion just once and stores the hashref to be compared to the single parameter fed to `compare`. If you can suggest a better method name, that would be great. Use of the word "homogeneous" is odd I mean that one would not be interested in the discreet values of the array, only the array as a whole. Because the whole array needs to be stored as a whole and not as parts, it makes sense (to me at least) to have it as a "homogeneous" string of values. This is easy to store in a database. Hope this helps. :) Tremendously thank you :)	[reply] [d/l] [select]
Re^3: RFC - Documentation Review by hv (Prior) on Jun 03, 2023 at 01:38 UTC
I feel the term 'comparator' is unclear. But I cannot think of a better one! I have no problem with the name, only with the interface - it doesn't make sense (as far as I can see) to embed the comparator within the object. Apart from anything else that makes it harder to have multiple comparators, for no obvious benefit. I'm imagining a simple curry like: `sub comparator { my($self, $embed) = @_; return sub { $self->compare($embed, @_); }; }` [download] .. and documentation like: comparator `my $comparator = $embedding->comparator($csv_embedding1); ... my $comparison = $comparator->($csv_embedding2);` [download] Returns a subroutine reference that can be used for repeated `compare` calls for the given vector against different secondary vectors, that returns the same type of result as `compare`. Update: I forgot to include the extra work for comparator, it should probably look more like: `sub comparator { my($self, $embed) = @_; my $vector1 = $self->_make_vector($embed); return sub { my($embed2) = @_; my $vector2 = $self->_make_vector($embed2); return $self->_compare_vector($vector1, $vector2); }; }` [download] .. where `_compare_vector` would be factored out of the last 9 lines of `compare`.	[reply] [d/l] [select]
Re^4: RFC - Documentation Review by Bod (Parson) on Jun 03, 2023 at 11:31 UTC
Re^5: RFC - Documentation Review by hv (Prior) on Jun 03, 2023 at 14:12 UTC
Re^5: RFC - Documentation Review by afoken (Chancellor) on Jun 03, 2023 at 12:23 UTC
Homogeneous (was: Re^2: RFC - Documentation Review) by Bod (Parson) on Jun 11, 2023 at 13:12 UTC
Use of the word "homogeneous" is odd I've been doing some weekend reading on PyTorch - not because I want to use it but because I want a high-level understanding of what it does and what it is used for... In the Wikipedia article, they too use the word "homogeneous" to describe the multi-dimensional arrays. *"PyTorch defines a class called Tensor (torch.Tensor) to store and operate on homogeneous multidimensional rectangular arrays of numbers."*	[reply]
Re: Homogeneous (was: Re^2: RFC - Documentation Review) by kcott (Archbishop) on Jun 11, 2023 at 14:37 UTC
I'm not familiar with PyTorch so this is purely a guess. In the examples, all of the array elements are floating point numbers: that's probably what homogeneous means in this context. By contrast, arrays with a mix of floats, integers, strings, objects, and so on, would be heterogeneous. — Ken	[reply]
Re: RFC - Documentation Review by kcott (Archbishop) on Jun 03, 2023 at 00:51 UTC
G'day Bod, I always start my POD with: `=encoding utf8` [download] It's part of my template for module code so I don't have to do anything manually to get this. See "perlpod: =encoding encodingname" if you're unfamiliar with this POD command. "... under the same terms as the Perl 5 programming language system itself." This looks strained (for want of a better word). I'd remove the word "system"; and consider whether I even want "itself". "Why is it you only find typos after publishing?" Checking your own work is not the best idea because your brain already knows what you meant and that's what it tends to see. However, if you're like me, you don't have the benefit of a proof-reader on hand. I generally try to read POD in different contexts: viewing the POD in the source code using my editor ``perldoc lib/Some/Module.pm`` while coding ``perldoc Some::Module`` after `make install` ``man Some.Module`` after `make install` Each of those will likely provide a slightly different view; typos will tend to be more prominent. Changing the width of your screen may also give a different view; I can never be bothered to go to that much effort. I see you have '`t/pod.t`' with `Test::Pod` which should pick up typos in POD commands (e.g. `=begun`); and '`t/pod-coverage.t`' with `Test::Pod::Coverage` which should pick up typos in subroutine names. However, these are set up as "Author tests" and, from "Test::More fails...", you don't seem to be running these: `t/manifest.t ...... skipped: Author tests not required for installatio +n t/pod-coverage.t .. skipped: Author tests not required for installatio +n t/pod.t ........... skipped: Author tests not required for installatio +n` [download] Those are messages for users installing your module; you should do ``export RELEASE_TESTING=1`` before your ``make test`` (or similar; e.g. I see `gmake test` in that post). In addition, there are other `Test::Pod::*` modules which you may find helpful. I don't use any of these, so I'll make no recommendations, but I see some for checking spelling, links, and so on. Search for these from MetaCPAN. — Ken	[reply] [d/l] [select]
Re^2: RFC - Documentation Review by Bod (Parson) on Jun 03, 2023 at 11:23 UTC
>> "... under the same terms as the Perl 5 programming language system itself." This looks strained (for want of a better word). I'd remove the word "system"; and consider whether I even want "itself". Good point... I wasn't sure if changing the wording of licence agreements was a good idea as I have no idea how much "legal rigour" they have been subjected to. I didn't write that but rather chose it from the options available within Module::Starter I see you have '`t/pod.t`' with `Test::Pod` which should pick up typos I noticed those as well but haven't touched them as I don't understand them and note that publishing the module works without altering them. Again, they were added by Module::Starter. First, I want to get the module published, available and usable...then I will (hopefully) get a better understanding of how I can improve the details and the system for creating and publishing modules so that future ones will be better. Seems like the POD tests need to be looked at first :)	[reply] [d/l] [select]
Re^3: RFC - Documentation Review by kcott (Archbishop) on Jun 03, 2023 at 19:08 UTC
"... haven't touched them as I don't understand them and note that publishing the module works without altering them." You should run all of your tests and ensure they pass before even thinking about publishing. If you don't understand them, read the documentation: Test::Pod & Test::Pod::Coverage. If you don't understand the documentation, ask us — you would be fully aware that we're more than happy to help. I downloaded the tarball from the "Here is the documentation" link, https://metacpan.org/release/BOD/AI-Embedding-0.1_2/view/lib/AI/Embedding.pm, that you provided in your OP: "https://cpan.metacpan.org/authors/id/B/BO/BOD/AI-Embedding-0.1_2.tar.gz". I set up a directory for testing, then unpacked the tarball, giving: ken@titan ~/tmp/pm_11152618_rfc_doco/AI-Embedding-0.1_2 $ ls -al total 23 drwxr-xr-x 1 ken None 0 Jun 2 22:14 . drwxr-xr-x 1 ken None 0 Jun 3 09:14 .. -rw-r--r-- 1 ken None 425 Jun 2 21:54 Changes drwxr-xr-x 1 ken None 0 Jun 2 22:15 lib -rw-r--r-- 1 ken None 1819 Jun 2 21:50 Makefile.PL -rw-r--r-- 1 ken None 304 Jun 2 22:14 MANIFEST -rw-r--r-- 1 ken None 1169 Jun 2 22:14 META.json -rw-r--r-- 1 ken None 675 Jun 2 22:14 META.yml -rw-r--r-- 1 ken None 1353 May 30 20:54 README drwxr-xr-x 1 ken None 0 Jun 2 22:14 t [download] Those are the files you should have in front of you (albeit in a different parent directory). Here's what you should do: ken@titan ~/tmp/pm_11152618_rfc_doco/AI-Embedding-0.1_2 $ perl Makefile.PL Checking if your kit is complete... Looks good Generating a Unix-style Makefile Writing Makefile for AI::Embedding Writing MYMETA.yml and MYMETA.json ken@titan ~/tmp/pm_11152618_rfc_doco/AI-Embedding-0.1_2 $ make cp lib/AI/Embedding.pm blib/lib/AI/Embedding.pm Manifying 1 pod document ken@titan ~/tmp/pm_11152618_rfc_doco/AI-Embedding-0.1_2 $ export RELEASE_TESTING=1 ken@titan ~/tmp/pm_11152618_rfc_doco/AI-Embedding-0.1_2 $ make test PERL_DL_NONLAZY=1 "/home/ken/perl5/perlbrew/perls/perl-5.36.0/bin/perl +.exe" "-MExtUtils::Command::MM" "-MTest::Harness" "-e" "undef Test:: +Harness::Switches; test_harness(0, 'blib/lib', 'blib/arch')" t/.t t/00-load.t ....... Use of uninitialized value in concatenation (.) or + string at /home/ken/tmp/pm_11152618_rfc_doco/AI-Embedding-0.1_2/blib +/lib/AI/Embedding.pm line 59. t/00-load.t ....... 1/1 # Testing AI::Embedding 0.12, Perl 5.036000, / +home/ken/perl5/perlbrew/perls/perl-5.36.0/bin/perl t/00-load.t ....... ok t/manifest.t ...... skipped: Test::CheckManifest 0.9 required t/pod-coverage.t .. Use of uninitialized value in concatenation (.) or + string at /home/ken/tmp/pm_11152618_rfc_doco/AI-Embedding-0.1_2/blib +/lib/AI/Embedding.pm line 59. t/pod-coverage.t .. 1/1 # Failed test 'Pod coverage on AI::Embedding' # at /home/ken/perl5/perlbrew/perls/perl-5.36.0/lib/site_perl/5.36.0 +/Test/Pod/Coverage.pm line 133. # Coverage for AI::Embedding is 77.8%, with 2 naked subroutines: # test # test_embedding # Looks like you failed 1 test of 1. t/pod-coverage.t .. Dubious, test returned 1 (wstat 256, 0x100) Failed 1/1 subtests t/pod.t ........... ok Test Summary Report ------------------- t/pod-coverage.t (Wstat: 256 (exited 1) Tests: 1 Failed: 1) Failed test: 1 Non-zero exit status: 1 Files=4, Tests=3, 1 wallclock secs ( 0.05 usr 0.01 sys + 0.36 cusr + 0.39 csys = 0.81 CPU) Result: FAIL Failed 1/4 test programs. 1/3 subtests failed. make: * [Makefile:857: test_dynamic] Error 255 [download] I'm running Perl `v5.36.0`; I pretty sure that has nothing to do with any of your test failures. Here's points regarding those problems: `t/00-load.t`: Fix the code such that you don't get that warning. ``make test`` might not care about that; but you should. `t/manifest.t`: I didn't have `Test::CheckManifest` installed so the test is skipped. I'll get back to that; see below. `t/pod-coverage.t`: When you fix `t/00-load.t`, you'll have fixed the same warning here. You have two subroutines in your module, `test` & `test_embedding`, which are not documented in your POD: add appropriate documentation. `Test::CheckManifest`** — I've had no end of problems with this module. Just to demonstrate, I installed it and reran ``make install``: ken@titan ~/tmp/pm_11152618_rfc_doco/AI-Embedding-0.1_2 $ make test PERL_DL_NONLAZY=1 "/home/ken/perl5/perlbrew/perls/perl-5.36.0/bin/perl +.exe" "-MExtUtils::Command::MM" "-MTest::Harness" "-e" "undef Test:: +Harness::Switches; test_harness(0, 'blib/lib', 'blib/arch')" t/.t t/00-load.t ....... Use of uninitialized value in concatenation (.) or + string at /home/ken/tmp/pm_11152618_rfc_doco/AI-Embedding-0.1_2/blib +/lib/AI/Embedding.pm line 59. t/00-load.t ....... 1/1 # Testing AI::Embedding 0.12, Perl 5.036000, / +home/ken/perl5/perlbrew/perls/perl-5.36.0/bin/perl t/00-load.t ....... ok t/manifest.t ...... Bailout called. Further testing stopped: Cannot +find a MANIFEST. Please check! t/manifest.t ...... Dubious, test returned 255 (wstat 65280, 0xff00) Failed 1/1 subtests FAILED--Further testing stopped: Cannot find a MANIFEST. Please check! make: *** [Makefile:857: test_dynamic] Error 255 [download] I gave up on this module some years ago and instead use what it suggests in "Test::CheckManifest - REPLACE THIS MODULE". I use more descriptive test names: `is_deeply([manicheck()], [], 'Check files in "MANIFEST" exist.'); is_deeply([filecheck()], [], 'Check for files not in "MANIFEST" or "MA +NIFEST.SKIP".');` [download] "Again, they were added by Module::Starter." You are under no obligation to use that directly. I use Module::Starter::PBP as a plugin. Here's my `config`: `$ cat ~/.module-starter/config author: Ken Cotterill email: kcott@cpan.org builder: ExtUtils::MakeMaker plugins: Module::Starter::PBP template_dir: /home/ken/.module-starter/P536` [download] I have multiple directories under `/home/ken/.module-starter/` for different Perl versions and different types of modules; `P536` is for Perl `v5.36.0` templates. Depending on what you decide after receiving advice on "How Critical is Critic?", this plugin may be useful in that respect. I don't use it for that purpose. What I like is the ability to customise the templates; I'm no longer locked in to what `Module::Starter` provides. Consider the hoops you had to jump through when dealing with "CPAN Ratings..."; you could have just made a text change to your own template; I acknowledge that was a good learning exercise with respect to GitHub. — Ken	[reply] [d/l] [select]
Re^4: RFC - Documentation Review by Bod (Parson) on Jun 04, 2023 at 22:10 UTC
Re^5: RFC - Documentation Review by kcott (Archbishop) on Jun 05, 2023 at 00:37 UTC
Some notes below your chosen depth have not been shown here
Re^4: RFC - Documentation Review by Bod (Parson) on Jun 04, 2023 at 22:20 UTC
Re^5: RFC - Documentation Review by kcott (Archbishop) on Jun 05, 2023 at 00:45 UTC
Some notes below your chosen depth have not been shown here