Thoughts on Solus Linux, systemd timers vs cron, and the state of TTS on Linux.

2019-12-07-1

I've been trying out Solus Linux, a rolling distro that tries to stay very modern, as a potential replacement for Ubuntu. It has first-class MATE support and a modern 5.x kernel, and it looks very nice. But this is my first hard intro to systemd, and, though I know it's not Solus's fault, all the things I dislike about systemd are making me dislike Solus. With cron a scheduled job is one line and done. With systemd timers you create two ~15 line config files, and even then they aren't auto-detected: you have to run another long system command to reload the config. This was an unpleasant discovery and hopefully not a harbinger of things to come. Just in case, I've also set up a Devuan VM so I can build the same software flows there as I implement them on systemd. I'll have to learn all this systemd stuff sometime anyway.
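To make the comparison concrete, here is a minimal sketch of the same daily job both ways. The job name say-hello, the script path /usr/local/bin/say-hello.sh and the 09:00 schedule are all made up for illustration, and the script only writes the two unit files into a scratch directory instead of installing them.

```shell
#!/bin/sh
# Sketch: one daily job as a cron line and as a systemd service + timer pair.
# "say-hello" and /usr/local/bin/say-hello.sh are hypothetical names.
set -eu

# Write to a scratch directory by default; set UNIT_DIR=/etc/systemd/system
# (as root) to install for real.
UNIT_DIR="${UNIT_DIR:-$(mktemp -d)}"

# cron version, for comparison (a single line added via `crontab -e`):
#   0 9 * * * /usr/local/bin/say-hello.sh

# systemd version, file 1 of 2: what to run.
cat > "$UNIT_DIR/say-hello.service" <<'EOF'
[Unit]
Description=Hypothetical example job

[Service]
Type=oneshot
ExecStart=/usr/local/bin/say-hello.sh
EOF

# systemd version, file 2 of 2: when to run it.
cat > "$UNIT_DIR/say-hello.timer" <<'EOF'
[Unit]
Description=Run say-hello.service daily at 09:00

[Timer]
OnCalendar=*-*-* 09:00:00
Persistent=true

[Install]
WantedBy=timers.target
EOF

echo "Wrote units to $UNIT_DIR"

# Once the files are in /etc/systemd/system, the extra steps are:
#   systemctl daemon-reload
#   systemctl enable --now say-hello.timer
```

These are stripped down, so they come out shorter than the ~15 lines each I ended up with, but the shape is the same: two files plus a reload, versus one crontab line.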

One of the first and most important things I need set up on a daily driver OS is proper text to speech. In the old days this meant Festival 1.96 with the CMU Arctic dataset HTS voices. This produces a monotone but clearly understood voice, and does it quickly. Unfortunately Festival 2.0 does not support the Nitech CMU Arctic HTS voices, so distros that only shipped 2.0 (building Festival from source is a nightmare) were always ruled out. But today while playing around in Solus I learned about mimic. It's apparently a fork of Flite (Festival Lite), but it supports all the CMU HTS voices I've come to know and love over the last decade while improving many other features. With mimic you can set speech speed and pitch on the fly, instead of having to configure Festival to hand every utterance to an aplay call for manual speed-up and pitch shift. Bye-bye,

(Parameter.set 'Audio_Command "aplay -q -c 1 -t raw -f s16 -r $(($SR*105/100)) $FILE")

Hello,

--setf duration_stretch=0.8 --setf int_f0_target_mean=145
...or something like that.
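For the record, here is a rough sketch of what the two approaches amount to. The 16000 Hz sample rate and the test sentence are assumptions for illustration, and the mimic call only runs if mimic happens to be installed. I'm assuming mimic keeps Flite's -t option for passing text on the command line.

```shell
#!/bin/sh
# Old approach: Festival fills in $SR with the voice's sample rate and aplay
# is told to play the raw samples 5% faster than they were synthesized.
SR=16000                        # assumed sample rate in Hz
PLAY_RATE=$((SR * 105 / 100))   # integer math: speeds up speech, raises pitch too
echo "aplay would be given rate: $PLAY_RATE"

# New approach: ask the synthesizer for the speed and pitch directly.
if command -v mimic >/dev/null 2>&1; then
    # duration_stretch below 1 shortens durations, so speech gets faster;
    # int_f0_target_mean sets the target mean pitch in Hz.
    mimic -t "Hello from mimic" \
        --setf duration_stretch=0.8 \
        --setf int_f0_target_mean=145
fi
```

The aplay trick can't change speed without also changing pitch, since both come from the same resampling. The --setf route keeps them as separate knobs.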

While I was talking about mimic on IRC, another guy recommended I check out RHVoice, which apparently includes a US English voice but has no voice samples online. I'll try compiling that later. In the far future there are also all those machine learning based "tacotron" text to speech methods, which apparently will be the core of mimic2 and Mozilla's text to speech core.

Edit: Nope.

I was wrong. It turns out mimic doesn't have a flitevox-compatible version of the nitech_us_slt_arctic_hts voice. It has an HTS voice built from the same CMU Arctic dataset, but made by people at CMU. This voice is not equal in quality to the Nitech one. Mimic 1 is not a replacement for Festival 1.96.

But!

It turns out that Debian did fix the syntax of the Nitech voices for Festival 2.0 compatibility and put some of them in the default repos! So Debian 10 does have some of the Nitech voices, but not the one for the CMU *slt* dataset. But on the Debian bug management list there is a guide on how to do it yourself. Thanks Debian! It's worth noting that Devuan, which I was also investigating, doesn't have the CMU HTS voices for Festival in its repos at all. So both Solus and Devuan are out of the running. Debian 10 it is.