Aldebaran has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

I'm doing what most Americans are: not watching tv and blasting 80's music so that I can't hear. I've had to develop some software tools related to downloading social media posts to follow a guy who has become a nemesis of sorts, a cat named Tim Ballard. This guy is problematic and disgusting in many different ways. He puts out this mega-wave of bogus stuff about his own supposed heroism when he's a demonstrable affinity fraud who claims to be this Great Liberator of child sex slaves (zero proof). So the topic is supergross on top of that, and not something I want on my machine. So a big part of this software is conditioning a USB target drive to download and watermark it with a time and source stamp for eventual use by a small village of female complainants whom he defames and threatens in these social media posts.

The software is a disheveled pile of scripts at this point, and I was hoping to bring it to perl to be able to dress it up with the trappings of a distro, where I could then do more testing as I go for other types of these social media posts. I will say that I did give Leon the 11 bucks for a month of twitter prime to be able to access the API and grok, and was never able to do the former, while the latter was available for one week, but then I could only get grok2 and if I subscribed to prime, which is a maddening bad-logic game. As far as I can tell, Leon has just transformed that company into his version of Truth Social. (I actually bought stock in DJT so that I could be part of shareholders' lawsuit if the bottom should drop out while an officer thereof makes false representations.)

Let's get to the software. I have a handful of bash scripts that are kind of working, and since there are failures, my victories come on the heels of meticulous logging. I like to wrap bash scripts so that I can pin down the failure point. Here, the script is trying to push this to GitLab. Here's the wrapper:

#!/usr/bin/perl
# /media/fritz/175F-DC61/2022.bak/2.22.bak
use v5.030;
use Time::Piece;
use Log::Log4perl;
use IPC::System::Simple qw/systemx capturex/;
use utf8;

my ($argv1) = @ARGV;
if (not defined $argv1) {
    die "Need argv1\n";
}

my $t = localtime;
my $jd = $t->julian_day;
my $log_conf4 = '/home/fritz/Documents/perlmonks/conf_files/4.conf';
Log::Log4perl::init($log_conf4);    #info
my $logger = Log::Log4perl->get_logger();
$logger->info("Time is $t");
$logger->info("Julian day is $jd");
$logger->info("$0");

# A plain string; qw() in scalar assignment only worked by accident
my $ini_path = '/home/fritz/Documents/perlmonks/conf_files/1.тайный.txt';
my $ref_config = get_тайный($ini_path);
#$DB::single = 1;
my %h = %$ref_config;

# Pass the arguments as a list rather than one space-joined string.
# capturex returns only the command's STDOUT; STDERR goes to the terminal, not the log.
my $capture = capturex "bash", "6.git.sh", @ARGV;
$logger->info("capx: $capture");

sub get_тайный {
    use Config::Tiny;
    use Data::Dump;
    my %h;                   #creating here and exporting reference to caller
    my $ini_path = shift;    #caller provides inipath
    my $sub_hash1 = "gitlab";
    my $Config = Config::Tiny->new;
    $Config = Config::Tiny->read( $ini_path, 'utf8' );

    # -> is optional between brackets
    $h{email} = $Config->{$sub_hash1}{'email'};
    $h{name}  = $Config->{$sub_hash1}{'name'};
    $h{pass}  = $Config->{$sub_hash1}{'pass'};
    my $sub_hash2 = "bash";
    $h{bash_root} = $Config->{$sub_hash2}{'bash_root'};
    my $ref_config = \%h;
    #dd $ref_config;
    $DB::single = 1;
    return ($ref_config);
}
__END__

This has sat on mothballs since the last time that it worked. I managed to lock myself out of my git account by cutting the cord with my iphone, having initiated 2FA haphazardly (no recovery codes). It was all I could do to tell my siblings that I was leaving my iphone, so I didn't think of gitlab. Anyways, I've decided I don't need 2FA for this, and have a new account that I think is configured correctly, but my terminal doesn't want to agree with me. It's wrapping this bash script that's me trying to tap my way through git push/pull/punch...

#!/bin/bash

# Create README.md with the current directory information
pwd > README.md

# Initialize a new Git repository (or reinitialize if it already exists)
git init

# Hardcode Git user information
git config --global user.email "redacted@redacted"
git config --global user.name "@redacted"

# Add a list of files to README.md
ls >> README.md

# Add files to the staging area
git add README.md
git add *.pl *.sh *.txt *.py

# Set up the remote repository URL (using SSH)
if ! git remote | grep -q origin; then
    git remote add origin git@gitlab.com:aardvark/hja.git
else
    echo "Remote 'origin' already exists."
fi

# Create a verbose, descriptive commit message based on staged changes
who=$(printenv USER)
now=$(date)
file_list=$(git diff --cached --name-only | tr '\n' ', ' | sed 's/, $//')

# Commit all staged changes
git commit -m "On $now, $who added changes in files: $file_list" || { echo "Failed to commit changes"; exit 1; }

# Push changes to the main branch
git branch -M main
git push -u origin main

echo "Script completed successfully."

I get good partial results:

2024/11/05 19:36:08 INFO Time is Tue Nov 5 19:36:07 2024
2024/11/05 19:36:08 INFO Julian day is 2460620.60841435
2024/11/05 19:36:08 INFO 4.wrap.pl
2024/11/05 19:36:13 INFO capx: Reinitialized existing Git repository in /home/fritz/Desktop/dl2/repack/.git/
Remote 'origin' already exists.
[main 8f1999b] On Tue 05 Nov 2024 07:36:08 PM MST, fritz added changes in files: 4.wrap.pl,
 1 file changed, 1 insertion(+), 1 deletion(-)
Script completed successfully.

Q1) How do I get this onto gitlab?
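One thing that may be hiding the answer: IPC::System::Simple's capturex returns only the command's STDOUT, while git push reports its progress and its errors on STDERR, and the bash script echoes "Script completed successfully." whether or not the push worked. A minimal sketch, using only core Perl, of a helper that captures both streams plus the exit code so the push result lands in the log. The name run_merged is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IPC::Open3 qw(open3);

# Hypothetical helper: run a command in list form (no shell involved) and
# return its STDOUT and STDERR merged into one string, plus its exit code.
sub run_merged {
    my @cmd = @_;

    # A false third argument tells open3 to send the child's STDERR
    # to the same handle as its STDOUT.
    my $pid = open3( my $child_in, my $child_out, undef, @cmd );
    close $child_in;

    my $output = do { local $/; <$child_out> };
    $output = '' unless defined $output;

    waitpid( $pid, 0 );
    return ( $output, $? >> 8 );
}

# Only runs when a command is given on the command line, e.g.
#   perl run_merged.pl git push -u origin main
if (@ARGV) {
    my ( $output, $exit ) = run_merged(@ARGV);
    print "exit code: $exit\n$output";
}
```

Running it in the repository with `git remote -v`, then `ssh -T git@gitlab.com`, then `git push -u origin main` should show whether the path in the remote URL matches the new account's namespace and whether GitLab recognizes the SSH key on this machine. Both are guesses at the cause, not a diagnosis.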

Why re-invent the download wheel? I needed a lot more forensics on the video than I was gonna get from somebody else's dl app. This is typical output; winnowing and filtering the metadata is also critical to not getting buried on STDOUT. I'm particularly pleased that this gives me a machine-state. I endeavor to leave every step like this as a way for the envelope to persist.

This is typical .json output that accompanies the dl.

{
    "download_path": "/media/fritz/9766-DD0B/2024-11-03/",
    "cookie_path": null,
    "url": "https://www.youtube.com/watch?v=YDP2tR6MXNA",
    "font": "Arial Bold",
    "font_size": 64,
    "username_color": "yellow",
    "date_color": "cyan",
    "timestamp_color": "red",
    "username_position": [ "left", "top" ],
    "date_position": [ "left", "bottom" ],
    "timestamp_position": [ "right", "bottom" ],
    "overall_start": 2,
    "caption_top": "15%",
    "caption_bottom": "75%",
    "line_width": "8%",
    "hor_offset": "4%",
    "cap_length": 5,
    "max_number": 60,
    "max_char_width": 65,
    "next_line": 1.7,
    "pause_between_para": 2,
    "source_path": "/app/data/4.source.txt",
    "shadow": {
        "color": "black",
        "offset": 5,
        "opacity": 0.6
    },
    "video_title": "Tim_Ballard_20241028_4_watermarked_captioned",
    "video_date": "20241029",
    "uploader": "Perl Gonzalez",
    "duration": 151,
    "width": 1920,
    "height": 1080,
    "id": "YDP2tR6MXNA",
    "ext": "mkv",
    "resolution": "1920x1080",
    "fps": 24,
    "tbr": 2183.114,
    "protocol": "https+https",
    "vcodec": "avc1.640028",
    "vbr": 2065.741,
    "acodec": "opus",
    "abr": 117.373,
    "asr": 48000,
    "original_filename": "/media/fritz/9766-DD0B/2024-11-03/Perl_Gonzalez_20241029.mkv",
    "to_process": "/media/fritz/9766-DD0B/2024-11-03/Perl_Gonzalez_20241029.mkv"
}

One thing I have to remark on is that I'm not using a cookie to make this happen, which is surprising, because it still happens. (OT: The python library used is yt_dlp for anyone who wants to enlighten me.)

Central to this is a Dockerfile:

# Start with the base image (Ubuntu in this case)
FROM ubuntu:22.04

# Update apt repository and install necessary dependencies including ffmpeg
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
    ffmpeg \
    python3 \
    python3-pip && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Set the working directory
WORKDIR /app

# Copy all files into the Docker image
COPY . /app

# Install Python dependencies
RUN pip3 install --upgrade pip && pip3 install -r requirements.txt --root-user-action=ignore

RUN ls -la * /app

This has become simpler as the bash wrapper gets fancier. It's evolved away the CMD at the bottom; this is how docker gets invoked from the bash wrapper:

# Check if the Docker image exists
if [[ "$(docker images -q my_dl:latest 2> /dev/null)" == "" ]]; then
    echo "Docker image 'my_dl:latest' does not exist. Building the image..."
    docker build -t my_dl .
else
    echo "Docker image 'my_dl:latest' found."
fi

# Proceed with Docker run command using the validated paths
echo "Running Docker with the specified volume and config path..."
docker run --rm \
    -v "$(pwd)":/app \
    -v "$Config_Path":/app/data/8.config.json \
    -v "$usb_mount_point":"$usb_mount_point" \
    my_dl python3 4.caller.py "$1" | tee -a "$LOGFILE"

The -v is for volumes, and it's a clever machine who figured this part out rather than me.
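If perl ends up invoking docker directly, the same -v mappings can be built as a plain argument list and handed to system in list form, so paths with spaces need no extra quoting. A minimal sketch: docker_run_args is a hypothetical helper, and the variables in the usage comment are placeholders standing in for the values the bash wrapper uses.

```perl
use strict;
use warnings;

# Hypothetical helper: build the argument list for "docker run".
# Each volume is a [host_path, container_path] pair.
sub docker_run_args {
    my (%opt) = @_;
    my @args = ( 'run', '--rm' );
    for my $volume ( @{ $opt{volumes} || [] } ) {
        my ( $host, $container ) = @$volume;
        push @args, '-v', "$host:$container";
    }
    push @args, $opt{image}, @{ $opt{command} || [] };
    return @args;
}

# Usage (not executed here; $cwd, $config_path and $url are placeholders):
#   my @args = docker_run_args(
#       volumes => [ [ $cwd, '/app' ], [ $config_path, '/app/data/8.config.json' ] ],
#       image   => 'my_dl',
#       command => [ 'python3', '4.caller.py', $url ],
#   );
#   system( 'docker', @args ) == 0 or die "docker run failed: $?";
```

Keeping the list builder separate from the system call means the argument list itself can be checked in a test without docker being installed.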

What do I want to do with this? I want perl to pull the wagon from dl to watermark to postprocess. Watermark example of Mr. Disgusting. I've got the units written but not combined, but I want it to be with the structure of the distro, where there's a place to put a bunch of oddball scripts and keep data segregated. Glad for any suggestions.

With the help of a superintelligent chatbot, I can wade through the python, but what I really miss in assembling programming units is the perl debugger. What I'm finding is that I get a lot of functionality out of the docker debugger, but I haven't found a python equivalent. It could well exist. In some sense, I want to bring this back to my own turf. And it occurs to me that if I'm going to embellish what's going on with the Dockerfile, then I want to do it with perl because it's native. Again, accepting anybody's experience in doing so.

Alright, well I'm glad I did something other than watch the boobtube and fret tonight. Hope this finds you all well.

Gruss aus Amiland,

"nauseous optimism"

Replies are listed 'Best First'.
Re: Turning a handful of scripts into a distro
by Aldebaran (Curate) on Dec 06, 2024 at 09:04 UTC
    Q1) How do I get this onto gitlab?

    I used perl instead and got the same story.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Git::Repository;
    use File::Spec;

    # Hard-coded values for GitLab access
    my $repo_path   = '/home/fritz/Desktop/dl2/repack';    # Path to your local git repository
    my $commit_msg  = 'Automated commit from Perl script';
    my $branch      = 'main';
    my $git_user    = '@redacted';
    my $git_email   = 'redacted@gmail.com';
    my $remote_name = 'origin';
    my $remote_url  = 'git@gitlab.com:aardvark/hja.git';   # Update with your GitLab repo URL

    # Set up Git repository object
    my $repo = Git::Repository->new( work_tree => $repo_path );

    # Set user configuration
    $repo->run( config => 'user.name',  $git_user );
    $repo->run( config => 'user.email', $git_email );

    # Add all changes to the staging area
    $repo->run( add => '.' );

    # Commit changes
    $repo->run( commit => '-m', $commit_msg );

    # Check if the remote exists, add if not
    my $remotes = $repo->run( 'remote' );
    if ($remotes !~ /$remote_name/) {
        $repo->run( remote => 'add', $remote_name, $remote_url );
    }

    # Push changes to the remote repository
    $repo->run( push => $remote_name, $branch );

    print "Changes pushed to GitLab successfully.\n";

    The message is always that the remote url doesn't exist, or I don't have access to it. I'm all but sure that I'm missing a setting at the gitlab site somewhere. (Frustrated.)

    I want perl to pull the wagon from dl to watermark to postprocess.

    I made a lot of progress with making this conform to what we might call a distro. I fired up Module Starter and created Acme::Frobnitz, where this is the manifest:

    # Project Root
    Dockerfile
    ignore.txt
    Makefile.PL
    MANIFEST
    README
    requirements.txt

    # Scripts and Configuration
    bin/2.start_docker.sh
    bin/4.caller.py
    bin/4.wrap.pl
    bin/7.wrap.sh
    conf/app_config.json

    # Libraries
    lib/Acme/Frobnitz.pm
    lib/python_utils/downloader5.py
    lib/python_utils/utilities1.py

    # Test Files
    t/00-load.t
    t/manifest.t
    t/pod-coverage.t
    t/pod.t
    t/download.t

    # Extended Tests
    xt/boilerplate.t

    Got a cool illustration:

    Acme-Frobnitz/
    ├── bin/
    │   ├── 7.wrap.sh
    │   └── ... (other scripts)
    ├── lib/
    │   ├── Acme/
    │   │   └── Frobnitz.pm
    ├── t/
    │   ├── download.t
    ├── Makefile.PL
    ├── ...
    
    

    Simple driver: 1.driver.pl:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use FindBin;
    use lib "$FindBin::Bin/../lib";
    use Acme::Frobnitz;

    # Parse command-line arguments
    my $hyperlink = shift @ARGV or die "Usage: $0 <hyperlink>\n";

    # Call the download method
    print "Starting download process...\n";
    my $file = Acme::Frobnitz->download($hyperlink);
    print "File downloaded and processed: $file\n";

    Typical output, we dip into python quickly, avert your gaze if necessary:

    (new-env) fritz@laptop:~/Desktop/Acme-Frobnitz/bin$ perl 1.driver.pl https://www.youtube.com/shorts/CFqehDVY_zQ
    Starting download process...
    2024-12-06 05:19:13,823 - __main__ - INFO - Current Python Path: ['/app/bin', '/app/lib/python_utils', '/app/lib', '/usr/local/lib/python310.zip', '/usr/local/lib/python3.10', '/usr/local/lib/python3.10/lib-dynload', '/usr/local/lib/python3.10/site-packages', '/app/lib/python_utils']
    2024-12-06 05:19:17,284 - __main__ - INFO - Attempting to create directory: /media/fritz/BALLARDT1/2024-12-06/
    2024-12-06 05:19:17,285 - __main__ - INFO - Directory created or already exists: /media/fritz/BALLARDT1/2024-12-06/
    2024-12-06 05:19:17,285 - __main__ - INFO - Entering function: mask_metadata
    2024-12-06 05:19:17,286 - downloader5 - INFO - Masking metadata
    2024-12-06 05:19:17,286 - downloader5 - INFO - Received parameters for metadata
    [youtube] CFqehDVY_zQ: Downloading m3u8 information
    [youtube] Extracting URL: https://www.youtube.com/shorts/CFqehDVY_zQ
    [youtube] CFqehDVY_zQ: Downloading webpage
    [youtube] CFqehDVY_zQ: Downloading ios player API JSON
    [youtube] CFqehDVY_zQ: Downloading mweb player API JSON
    [youtube] CFqehDVY_zQ: Downloading m3u8 information
    [info] CFqehDVY_zQ: Downloading 1 format(s): 399+251
    [download] Destination: /media/fritz/BALLARDT1/2024-12-06/Mormon_Stories_Podcast_20241204.f399.mp4
    [download] 100% of 9.39MiB in 00:00:03 at 2.72MiB/s
    [download] Destination: /media/fritz/BALLARDT1/2024-12-06/Mormon_Stories_Podcast_20241204.f251.webm
    [download] 100% of 788.73KiB in 00:00:00 at 1.64MiB/s
    (new-env) fritz@laptop:~/Desktop/Acme-Frobnitz/bin$

    But, the whole reason to bring it to the world of perl was to do testing, and this is where it starts getting fun, IMO.

    It truly was fun to write the test. Sure I failed the whole thing, but it's pretty to me:

    #!/usr/bin/perl
    use strict;
    use warnings;

    use Test::More tests => 3;
    use File::Path qw(remove_tree);
    use File::Spec;
    use FindBin;
    use lib "$FindBin::Bin/../lib";
    use Acme::Frobnitz;

    # Setup
    my $output_dir = "downloads";
    remove_tree($output_dir) if -d $output_dir;    # Clean up any previous downloads

    # Test 1: Ensure the method exists
    can_ok('Acme::Frobnitz', 'download');

    # Test 2: Attempt to download a YouTube video
    my $youtube_link = 'https://www.youtube.com/shorts/CFqehDVY_zQ';
    eval {
        Acme::Frobnitz->download($youtube_link);
    };
    ok(!$@, "YouTube video download did not throw an error");

    # Test 3: Verify output directory and files
    ok(-d $output_dir, "Output directory exists");
    my @files = glob(File::Spec->catfile($output_dir, '*'));
    ok(@files > 0, "Downloaded files exist in the output directory");

    # Teardown
    remove_tree($output_dir);    # Clean up after test

    done_testing();

    (new-env) fritz@laptop:~/Desktop/Acme-Frobnitz$ prove -lv t/download.t
    t/download.t ..
    1..3
    ok 1 - Acme::Frobnitz->can('download')
    bash: 7.wrap.sh: No such file or directory
    not ok 2 - YouTube video download did not throw an error
    #   Failed test 'YouTube video download did not throw an error'
    #   at t/download.t line 24.
    not ok 3 - Output directory exists
    #   Failed test 'Output directory exists'
    #   at t/download.t line 27.
    not ok 4 - Downloaded files exist in the output directory
    #   Failed test 'Downloaded files exist in the output directory'
    #   at t/download.t line 29.
    not ok 5 - planned to run 3 but done_testing() expects 4
    #   Failed test 'planned to run 3 but done_testing() expects 4'
    #   at t/download.t line 34.
    # Looks like you planned 3 tests but ran 5.
    # Looks like you failed 4 tests of 5 run.
    Dubious, test returned 4 (wstat 1024, 0x400)
    Failed 2/3 subtests

    Let's see where I am now:

    (new-env) fritz@laptop:~/Desktop/Acme-Frobnitz$ prove -lv t/download.t
    t/download.t ..
    not ok 1 - 7.wrap.sh exists and is executable
    #   Failed test '7.wrap.sh exists and is executable'
    #   at t/download.t line 13.
    ok 2 - Acme::Frobnitz->can('download')
    bash: 7.wrap.sh: No such file or directory
    not ok 3 - YouTube video download did not throw an error
    #   Failed test 'YouTube video download did not throw an error'
    #   at t/download.t line 23.
    ok 4 - Output directory exists
    ok 5 - Downloaded files exist in the output directory
    1..5
    # Looks like you failed 2 tests of 5.
    Dubious, test returned 2 (wstat 512, 0x200)
    Failed 2/5 subtests
    ...
    (new-env) fritz@laptop:~/Desktop/Acme-Frobnitz$ ls -l bin/7.wrap.sh
    -rw-r--r-- 1 fritz fritz 1933 Dec 5 21:55 bin/7.wrap.sh
    (new-env) fritz@laptop:~/Desktop/Acme-Frobnitz$ chmod +x bin/7.wrap.sh
    (new-env) fritz@laptop:~/Desktop/Acme-Frobnitz$ ls -l bin/7.wrap.sh
    -rwxr-xr-x 1 fritz fritz 1933 Dec 5 21:55 bin/7.wrap.sh
    (new-env) fritz@laptop:~/Desktop/Acme-Frobnitz$

    I think that might help me...

    (new-env) fritz@laptop:~/Desktop/Acme-Frobnitz$ prove -lv t/download.t
    t/download.t ..
    ok 1 - 7.wrap.sh exists and is executable
    ok 2 - Acme::Frobnitz->can('download')
    ...

    Alright, one more test passed. At the durch Khyulo/Musk Goetterdaemmerung of a brazenly-hacked American election, this is my idea of fun. Using perl to call python makes sense to me, because I am that guy who wants a return value. And the progress I make is proportional to my astute logging. You've got to mask out the metadata from these things or get buried by them.
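The `bash: 7.wrap.sh: No such file or directory` line in the runs above suggests the script is being looked up relative to the current directory, which is the distro root under prove but bin/ when the driver is run from there. A minimal sketch of resolving it relative to the module file instead. It assumes Frobnitz.pm currently calls bash with the bare name 7.wrap.sh (the module source isn't shown, so that is a guess) and that the layout matches the manifest; wrapper_path is a hypothetical helper.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Spec;

# Hypothetical helper: given the path to lib/Acme/Frobnitz.pm, return the
# path to bin/7.wrap.sh in the same distro tree.
sub wrapper_path {
    my ($module_file) = @_;
    my $abs = File::Spec->rel2abs($module_file);

    # lib/Acme/Frobnitz.pm -> lib/Acme -> lib -> distro root
    my $root = dirname( dirname( dirname($abs) ) );
    return File::Spec->catfile( $root, 'bin', '7.wrap.sh' );
}

# Inside Frobnitz.pm this could be called as wrapper_path(__FILE__), and the
# result passed to bash instead of the bare script name.
print wrapper_path('/home/fritz/Desktop/Acme-Frobnitz/lib/Acme/Frobnitz.pm'), "\n";
```

This only holds when running from the source tree. An installed distro would need a different arrangement, for example File::ShareDir.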

    Question: where should I put this code if this is meant to be distro-ey? I find myself wanting to add it to Frobnitz.pm, but that can't be right. (Nobody else does that, right?)

    use Log::Log4perl;

    # Initialize Log4perl
    # (absolute path: the file appender hands the name to open, which does not expand ~)
    Log::Log4perl->init(\<<'END');
    log4perl.logger = INFO, FileAppender
    log4perl.appender.FileAppender = Log::Log4perl::Appender::File
    log4perl.appender.FileAppender.filename = /home/fritz/Desktop/Acme-Frobnitz/logs/acme-frobnitz.log
    log4perl.appender.FileAppender.layout = Log::Log4perl::Layout::PatternLayout
    log4perl.appender.FileAppender.layout.ConversionPattern = %d [%p] %m%n
    END

    # Create logger
    my $logger = Log::Log4perl->get_logger();

    # Log messages
    $logger->info("This is an info message.");
    $logger->error("This is an error message.");
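One common arrangement, offered as a suggestion rather than the rule, is for the module to only ask for a logger with get_logger and for the script in bin/ to decide where the log goes. A minimal sketch, assuming a hypothetical helper package named Acme::Frobnitz::Logger and an absolute log path supplied by the caller:

```perl
use strict;
use warnings;

package Acme::Frobnitz::Logger;    # hypothetical name, not part of the distro above

# Build the Log4perl configuration text for a given log file.
sub config_text {
    my ($logfile) = @_;
    return <<"END";
log4perl.rootLogger = INFO, FileAppender
log4perl.appender.FileAppender = Log::Log4perl::Appender::File
log4perl.appender.FileAppender.filename = $logfile
log4perl.appender.FileAppender.layout = Log::Log4perl::Layout::PatternLayout
log4perl.appender.FileAppender.layout.ConversionPattern = %d [%p] %m%n
END
}

# Initialize logging once; meant to be called from the script in bin/.
sub init {
    my ($logfile) = @_;
    require Log::Log4perl;    # loaded lazily so config_text works without it
    my $conf = config_text($logfile);
    Log::Log4perl->init_once( \$conf );
    return Log::Log4perl->get_logger();
}

package main;

# Show the generated configuration; init() is not called here.
print Acme::Frobnitz::Logger::config_text('/tmp/acme-frobnitz.log');
```

With this split, Frobnitz.pm itself would contain nothing more than a get_logger call, and tests can point the log at a temporary file.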

    Also, what tests would you add here? I've asked before and gotten several responses, but now I can actually implement them. I also wonder whether this gets read in Berlin still. It well might not. Almost everything that Americans use to warn of cybercrimes whilst determining the fate of the free world has been flared up to dead silence. Look for sources on substack.
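As one possible answer to the testing question: tests that don't touch the network can check the .json that accompanies each download. A minimal sketch using only core modules. missing_keys is a hypothetical helper, and the list of required keys is an assumption picked from the sample output earlier in the thread. It declares no fixed plan and relies on done_testing alone, since the first prove run above shows `tests => 3` colliding with a fourth test.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);
use Test::More;    # no "tests => N"; done_testing() sets the plan

# Hypothetical helper: return the required keys that are absent or null
# in a JSON metadata document.
sub missing_keys {
    my ( $json_text, @required ) = @_;
    my $meta = decode_json($json_text);
    return grep { !defined $meta->{$_} } @required;
}

# Stand-in for the file the downloader writes; values from the sample above.
my $sample = <<'END';
{ "url": "https://www.youtube.com/watch?v=YDP2tR6MXNA", "id": "YDP2tR6MXNA", "ext": "mkv", "duration": 151, "cookie_path": null }
END

my @required = qw(url id ext duration);
is_deeply( [ missing_keys( $sample, @required ) ], [], 'required metadata keys are present' );
is_deeply( [ missing_keys( $sample, 'to_process' ) ], ['to_process'], 'an absent key is reported' );
is_deeply( [ missing_keys( $sample, 'cookie_path' ) ], ['cookie_path'], 'a null value counts as missing' );

done_testing();
```

The network-dependent download test could then live in xt/ so that a plain prove run stays fast and offline.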

    Keep yourself healthy; it helps me to write perl. Gruss aus yehacktem Amiland....