I wrote Speech::Recognition::Vosk, but it certainly needs some love.
I am also looking at whisper.cpp, which should be easier to build, but I haven't put it to use yet. I plan to start with Inline::C,
simply including the header file, and then trying to port the stream.cpp example file into an API.
Text-to-speech seems to be a very convoluted setup in every implementation I've looked at. There is mimic-3, but I haven't found a library that doesn't have lots and lots of build prerequisites.