I've been making a list of AI successes, though I don't usually post them here. Here are the two most recent, both of which I gave to Claude.AI, and both of which succeeded:
I'd like a javascript component which plays a list of URLs (audio files) from a web server. This could be almost as simple as opening the URL and activating the browser-based player, but I also want to chain to the next song as each song completes. This is for internal use, so there don't need to be any permissions or fancy byte-range loading, unless those features are trivial to add. In other words, I'd be happy if I simply load up a bunch of browser-native players for each song, then trigger them to play one after another. But a fancy system of loading byte ranges into the player until a song file is exhausted and then loading ranges from the next file is also a workable solution. I currently have files in .flac and .mp3, but I can transcode to whatever is most convenient and compatible for the player. This player will go onto a standalone page, probably just jQuery with a simple mobile-friendly play/pause button. If you're pretty sure about the best approach, then just go ahead and write it, but otherwise please explain my options and pros/cons.
The code it generated was a 100% success. I edited it to list my songs, named it index.html, and put it into a docker container alongside my song files: a complete music player off my home server for a mere 1-2 hours of effort.
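For reference, the chaining approach I asked for boils down to listening for the HTML5 audio "ended" event and advancing to the next URL. This is not the code Claude generated, just a minimal sketch of the idea; the file names are hypothetical placeholders:

```javascript
// Playlist of audio URLs served from the same box (placeholder names).
const playlist = ["song1.mp3", "song2.mp3", "song3.mp3"];

// Pure helper: index of the next track, or -1 when the list is done.
function nextIndex(i, list) {
  return i + 1 < list.length ? i + 1 : -1;
}

// Browser-only wiring, guarded so the helper above stays testable in Node:
// one Audio element, retargeted to the next URL each time a song ends.
if (typeof Audio !== "undefined") {
  let current = 0;
  const player = new Audio(playlist[current]);
  player.addEventListener("ended", () => {
    current = nextIndex(current, playlist);
    if (current !== -1) {
      player.src = playlist[current];
      player.play();
    }
  });
  player.play();
}
```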
Another similar project that was 99% successful was:
I'd like a simple Perl Mojolicious::Lite-based webapp that serves a page displaying the uptime and running status from "docker inspect ark", and an action that can run "docker stop ark" or "docker start ark" and corresponding buttons on the page. The page should poll the status every 30 seconds, but at 1 second intervals after requesting a change (start/stop) until the change completes.
(followup)
to pair with this, can you write a small program in C called ark-control-docker which takes exactly one parameter "inspect", "stop" or "start", and execs "docker","inspect","ark", and so on? This way I can give it set-gid with group docker and not give generic docker permissions to the webapp. Or, if you have a better idea for privilege isolation, let me know.
(followup)
One more change to the web script - as it waits for the start event, it should read the log file looking for a line like [2025.05.20-20.20.13:889][ 0]Server: "(name)" has successfully started! The log file is currently located at [REDACTED_PATH]
(followup)
one more catch, you need to make sure the timestamp on the log file is newer than the timestamp of when the server started, or else it reads the old log file before the process has begun writing the new one
(if you're curious, this was so a friend could start/stop a server for the video game ARK while VPN'd into my network, but without needing to give them a login to the server or permission to run docker, which is equivalent to root)
The one mistake it made in the code was to declare a Mojo route using ":action" as the parameter name, which is a reserved parameter name in Mojo. I renamed it to ":verb" and all the rest of the code worked. You can also see that it didn't think to check the date on the log file at first... but then neither did I! If it had predicted that, it would probably be a more capable programmer than I am.
Those two were smashing successes - I wrote almost no code at all and got something useful that I wouldn't have had time to write otherwise.
This one was maybe 85% successful:
I'd like a script in Perl that performs a binary search of ZFS snapshots. It should accept a dataset name, and list out available snapshots for that dataset, then prompt for the min and max timestamp, unmount the original dataset, then repeatedly create writable datasets from a snapshot (according to binary search) at the original mount point, then prompt me for whether that is a good or bad snapshot. When it determines the final good snapshot, it should give me the option to roll back the dataset to that snapshot, and remount the dataset normally. Along the way, it should discard the temporary writable datasets it created. My dataset snapshots look like:
$ zfs list -t snapshot [REDACTED_VOLUME_NAME]
It messed up on this one by not knowing that ZFS clones auto-mount at a path matching their own name. That's no worse than I would have done, though, because I'm only moderately familiar with ZFS, and part of why I asked was to see what sort of commands it would generate for this workflow. After running into that one problem, I was able to fix the commands by hand to specify the mount point at clone time (zfs clone -o mountpoint=...), and got it working. The user interface was rather nice, and I didn't have to write a single line of it.
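The core of the snapshot bisection, abstracted away from ZFS, is an ordinary binary search for the last "good" element in an ordered list; in the real script the isGood callback is the interactive clone-mount-and-ask step. A sketch with hypothetical names:

```javascript
// snapshots: ordered oldest to newest; everything up to some point is
// assumed "good", everything after it "bad". isGood(snapshot) stands in
// for cloning the snapshot, mounting it, and asking the user.
function lastGoodSnapshot(snapshots, isGood) {
  let lo = 0, hi = snapshots.length - 1, best = -1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (isGood(snapshots[mid])) {
      best = mid;    // good: the boundary is here or later
      lo = mid + 1;
    } else {
      hi = mid - 1;  // bad: look earlier
    }
  }
  return best === -1 ? null : snapshots[best];
}
```

With n snapshots this asks only about log2(n) of them, which is the whole point of bisecting instead of checking each snapshot in order.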
I'll conclude with a very recent advancement: ChatGPT introduced Codex, a system that links up with your GitHub account and spends actual minutes reading and understanding your code, and running the unit tests in something like a docker container that you set up for it. I was able to get all my perl dependencies installed, and ChatGPT wrote a significant portion of my unit tests for Crypt::SecretBuffer. This effort wasn't nearly as big a win as I was hoping for; I started Crypt::SecretBuffer trying to see if I could get AI to write the whole thing complete with Win32 compatibility (which I didn't even know exactly how it would work). That ended up being my downfall - I asked it to do things that weren't possible on Win32, and it wrote a bunch of code that could never work. But again, it's not hard to find examples of what fails. The interesting part is that things got notably better once I introduced my AGENTS.md. Once AI works more like a developer who understands the codebase, knows what the goals are, and spends actual time thinking about the problem and running the tests, it's going to be a million times better than blurting out a pile of code as an off-the-cuff response to a prompt, which is most of what people have experienced with AI so far.
This is a coming revolution, like when the Internet first appeared. It's going to change everything. It's better to follow closely on the leading edge so you don't get buried under the wave.
Update
If there's one conclusion I'd draw from this about the current state of AI, it's that AI does far better when you choose an intelligent implementation and ask it to write the bothersome details than when you ask it to design the implementation for you. But AI can also explain all the details of the technology you're about to use, so you can sit around asking it to compare technologies and describe design limitations, then make a decision about what to write, then ask it to write it for you. A smart, technical-minded human is still *currently* required to get good results. But there's no telling whether AI will eventually be capable of that central decision role!
In reply to Re: AI in the workplace
by NERDVANA
in thread AI in the workplace
by talexb