Re: Is there a simple way to archive/download all of PerlMonks?
by jdporter (Paladin) on Apr 30, 2024 at 22:28 UTC
|
| [reply] |
Re: Is there a simple way to archive/download all of PerlMonks?
by LanX (Saint) on Apr 28, 2024 at 07:27 UTC
|
| [reply] |
Re: Is there a simple way to archive/download all of PerlMonks?
by Anonymous Monk on Apr 28, 2024 at 22:07 UTC
|
Probably better to (cr)use the Wayback Machine for this purpose. They already have an archived copy, after all.
| [reply] |
|
"They already have an archived copy, after all."
They don't. I checked several thread URLs: most didn't exist, and those that did had snapshots that were years out of date. Besides, it doesn't meet the criterion of the question: an 'offline' copy.
| [reply] |
|
> thread URLs: most didn't exist, and those that did had snapshots that were years out of date.
Did you check all 6+ domains? :)
I wonder how likely it is for an old thread to be out of date. Do you expect many monks to update what they wrote around 9/11?
Edit (answering myself):
Hmm, wait: besides editing, we do indeed have necroposts resurrecting old threads.
True, a backup service would need to check RAT (Recently Active Threads) or Newest Nodes regularly. (Or restrict itself to mirroring single posts only.)
And editing isn't recorded anywhere... :/
| [reply] |
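The point about necroposts can be illustrated with a rough sketch of an incremental mirror: on each run it looks at a list of the newest nodes and works out which threads need to be fetched again. This is only an illustration under stated assumptions. The `(node_id, root_id)` listing format and the function name `threads_to_refresh` are hypothetical, and it assumes node IDs increase over time. As noted above, edits to existing nodes would not be detected this way.

```python
from typing import Iterable, Set, Tuple


def threads_to_refresh(
    newest_nodes: Iterable[Tuple[int, int]], last_seen_id: int
) -> Tuple[Set[int], int]:
    """Return (root IDs of threads with new posts, new high-water mark).

    Each entry is (node_id, root_id). Hypothetical format: a real mirror
    would have to obtain these pairs from the site's newest-nodes listing.
    Assumes node IDs grow monotonically, so anything above last_seen_id
    is a post the mirror has not seen yet.
    """
    stale_roots: Set[int] = set()
    high_water = last_seen_id
    for node_id, root_id in newest_nodes:
        if node_id > last_seen_id:
            # A new reply, even to a very old thread, marks the whole
            # thread as needing a fresh copy.
            stale_roots.add(root_id)
            high_water = max(high_water, node_id)
    return stale_roots, high_water
```

The returned high-water mark would be stored between runs, so the next run only considers nodes posted since the previous one.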
Re: Is there a simple way to archive/download all of PerlMonks?
by afoken (Chancellor) on Apr 30, 2024 at 14:07 UTC
|
Wikipedia once had a way to download a big tar(?) archive of their articles, IIRC in wiki syntax. I don't know if this function is still active.
Having a cron job export the source of all nodes readable to Anonymous Monk every day, week or month to an archive might be an idea. That job could perhaps run on a dedicated server, perhaps on a snapshot of the live database, so it would not put extra load on the normal servers.
Alexander
--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
| [reply] |
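The periodic export suggested above could look roughly like the following sketch, which packs node sources into a single compressed tar file. It is a minimal illustration, not how PerlMonks works: the function name `write_node_archive`, the `nodes/<id>.txt` layout, and the idea that the job receives `(node_id, source_text)` pairs are all assumptions. Reading those pairs from a database snapshot is left out because the thread does not describe the schema.

```python
import io
import tarfile
import time
from typing import Iterable, Tuple


def write_node_archive(nodes: Iterable[Tuple[int, str]], archive_path: str) -> int:
    """Write each (node_id, source_text) pair into a gzip-compressed tar file.

    Returns the number of nodes written. The nodes/<id>.txt naming scheme
    is a hypothetical choice made for this sketch.
    """
    count = 0
    with tarfile.open(archive_path, "w:gz") as tar:
        for node_id, source in nodes:
            data = source.encode("utf-8")
            info = tarfile.TarInfo(name=f"nodes/{node_id}.txt")
            info.size = len(data)          # required so the member is not empty
            info.mtime = int(time.time())  # stamp with the export time
            tar.addfile(info, io.BytesIO(data))
            count += 1
    return count
```

A cron entry on the dedicated server could run a small script that calls this function daily, weekly or monthly and publishes the resulting file for download.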
Re: Is there a simple way to archive/download all of PerlMonks?
by Anonymous Monk on Apr 28, 2024 at 00:28 UTC
|
| [reply] |
Re: Is there a simple way to archive/download all of PerlMonks?
by nikosv (Deacon) on Apr 28, 2024 at 09:10 UTC
|
Actually, that could be useful for training or fine-tuning a local LLM on the collective PerlMonks threads, so you could ask it free-form, ChatGPT-like questions.
| [reply] |
|
I think otherwise: compared to SO, it would be a merciful treatment;-)
| [reply] |
|
That is only the case if the PerlMonks text was swept up in the training data. It's probably in our best interest to make PerlMonks more downloadable so that this body of information is available to LLM tools. People might actually decide whether or not to use Perl for a task based on how well ChatGPT can answer questions about it. Lately I've been asking it a bunch of questions about Vue3, and it amazes me how useful the answers are (as a search engine; it still doesn't write accurate code).
| [reply] |
|
Re: Is there a simple way to archive/download all of PerlMonks?
by harangzsolt33 (Deacon) on Apr 27, 2024 at 16:46 UTC
|
Now, this is a great question!!! I would love to download it or get it somehow if it were possible. Even pure text format would be fine. It doesn't have to be in HTML, although that would be nicer.
I think it's just a matter of time before someone says, "Why don't you just scrape it?" Lol :D
| [reply] |