It doesn't need to be completely machine generated, for a start. You've heard of sampling, right?
It is trivial to sample a voice and shift its pitch from treble down to bass, for example.
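To show how little is involved, here is a minimal sketch of the crudest version of that trick: resampling a clip so it plays back an octave lower. It assumes a pure sine tone as a stand-in for a recorded voice, and the names `make_tone`, `shift_pitch` and `estimate_freq` are made up for the illustration, not taken from any audio library. Plain resampling also stretches the clip, so a real tool would pair it with time-stretching.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed value for this sketch)

def make_tone(freq_hz, seconds, rate=SAMPLE_RATE):
    """Stand-in for a recorded voice sample: a pure sine tone."""
    n = int(seconds * rate)
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]

def shift_pitch(samples, factor):
    """Naive pitch shift by resampling with linear interpolation.

    factor < 1 lowers the pitch (and lengthens the clip);
    factor > 1 raises it (and shortens the clip).
    """
    out_len = int((len(samples) - 1) / factor) + 1
    out = []
    for i in range(out_len):
        pos = i * factor                      # fractional read position in the source
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)    # clamp at the last sample
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out

def estimate_freq(samples, rate=SAMPLE_RATE):
    """Rough frequency estimate: count upward zero crossings per second."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    return crossings * rate / len(samples)

treble = make_tone(440, 1.0)       # the "sampled voice"
bass = shift_pitch(treble, 0.5)    # read the source at half speed: one octave down

print(round(estimate_freq(treble)), round(estimate_freq(bass)))
```

The two printed estimates should come out roughly an octave apart, and the shifted clip is about twice as long as the original, which is the price of doing it the lazy way.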
What I'm advocating is not as simple, for sure, but it certainly isn't as difficult as you are making out.
I maintain that the main reason VO is such a cumbersome process is the nature of the production and the procedure the developers follow by habit: they simply bolt a post-coding VO phase onto the overall production.
Rethinking the development methodology is the main way to drag what is still essentially a cottage-industry process into the automation age. (I know, I did a thesis on it.)