This post is about Ravensword Legacy, a premium mobile game we are working on together with Crescent Moon Games, which is currently in development.
This week I had to revamp the awesome characters another team member had already made so that they could talk. After some research, I found a Unity plugin called LipSync Pro that lets you add keyframes to Audio Clips so that the talking character moves their mouth accordingly. It also supports other blendshapes, like blinking or yawning, and even some preset expressions such as angry and happy, so you can assign an expression to each of the character's lines.
The core of spoken languages
For this kind of work, game developers usually group phonemes together. For example, the "k" in "key" sounds the same as the "c" in "car", so only one phoneme is needed for that sound. The same goes for "m", "b" and "p", which share the same closed-lips mouth shape, and so on.
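As a rough illustration of that grouping idea, here is a minimal Python sketch. The group names and their members are hypothetical and chosen only to mirror the examples above; they are not the phoneme set that LipSync Pro or the game actually uses.

```python
# Hypothetical groups, for illustration only: each key is one mouth shape,
# each value lists the sounds that are allowed to share it.
PHONEME_GROUPS = {
    "K": ["k", "c"],         # "k" as in "key", hard "c" as in "car"
    "MBP": ["m", "b", "p"],  # lips pressed together
}


def build_lookup(groups):
    """Invert {group: [sounds]} into {sound: group} for quick lookups."""
    lookup = {}
    for group, sounds in groups.items():
        for sound in sounds:
            lookup[sound] = group
    return lookup


SOUND_TO_GROUP = build_lookup(PHONEME_GROUPS)


def group_for(sound):
    """Return the group a sound belongs to, or None if it is not listed."""
    return SOUND_TO_GROUP.get(sound.lower())
```

With a table like this, `group_for("k")` and `group_for("c")` resolve to the same group, so only one mouth shape has to be modelled for both.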
Adapting to the new needs
I proceeded to modify the models: I opened their mouths and added the inside of the mouth (commonly called the mouthbag), the tongue and the teeth. I also had to modify the textures so that the teeth, tongue and mouthbag were textured.
After this, I duplicated the resting pose three times and modified each copy for the A, E/I and O phonemes. I stopped at three because the game is low poly and has pixel post processing and a limited colour palette (sometimes even as low as 8 bits!), so too much fidelity and/or fluidity would make it look uncanny.
Each head was then exported as a single mesh with 4 blendshapes, using the modified mouth poses as the targets for those blendshapes.
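For readers who have not used blendshapes before, the sketch below shows the usual linear idea behind them: every target stores where the vertices should end up, and a weight decides how far the resting mesh moves towards it. The function name, the tiny three-value "mesh" and the numbers are all made up for illustration, and the weights are normalised to the 0..1 range here.

```python
def blend(base, targets, weights):
    """Linear blendshape mix: base + sum(weight * (target - base))."""
    result = list(base)
    for name, weight in weights.items():
        target = targets[name]
        for i, (b, t) in enumerate(zip(base, target)):
            result[i] += weight * (t - b)
    return result


# Illustrative data only: three coordinates standing in for a whole mesh.
REST = [0.0, 0.0, 0.0]
TARGETS = {
    "A": [0.0, -1.0, 0.0],  # jaw dropped
    "O": [0.5, -0.5, 0.0],  # lips rounded
}
```

Calling `blend(REST, TARGETS, {"A": 1.0})` returns the "A" pose itself, while an empty weight dictionary leaves the resting pose untouched.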
Setup of the system
I created a LipSync info .asset file from the line's Audio Clip via LipSync's Clip Editor (shortcut Ctrl + Alt + A) and started adding the phonemes that matched what the line was saying. Having only 3 phonemes really sped up this process; otherwise it would have been too tedious. Once that was done, I saved the LipSync info .asset file in the same folder as my Audio Clip.
Each of these black markers means that the mouth will change to the specified phoneme at the specified time. Once the markers were in place, I went back to the prefab of the character head, added the LipSync script and assigned the head mesh as the main mesh and the teeth as the secondary mesh. This means that the head blendshapes will drive the teeth ones too. I also set this character's Audio Output as the origin of the line's sound by dropping it into the slot.
I then specified which blendshape was assigned to each phoneme, so that LipSync knew which blendshape to change every time the time slider passed over a phoneme marker.
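To make the marker idea more concrete, here is a small Python sketch of how a timeline of phoneme markers could be turned into blendshape weights. This is not LipSync Pro's code or API: the marker times, the blendshape names and both functions are assumptions made up for the example, and the sketch simply snaps to the active shape, whereas a real plugin would likely blend smoothly between shapes.

```python
from bisect import bisect_right

# Hypothetical phoneme -> blendshape names ("Rest" means no shape is active).
PHONEME_TO_BLENDSHAPE = {"A": "mouth_A", "EI": "mouth_EI", "O": "mouth_O"}

# Hypothetical markers: (time in seconds, phoneme), sorted by time.
MARKERS = [(0.0, "Rest"), (0.25, "A"), (0.5, "O"), (0.8, "EI"), (1.1, "Rest")]


def active_phoneme(markers, t):
    """Return the phoneme of the last marker at or before time t."""
    times = [time for time, _ in markers]
    index = bisect_right(times, t) - 1
    if index < 0:
        return "Rest"  # before the first marker the mouth stays at rest
    return markers[index][1]


def blendshape_weights(markers, t):
    """Give the active phoneme's blendshape weight 1.0 and the others 0.0."""
    active = PHONEME_TO_BLENDSHAPE.get(active_phoneme(markers, t))
    return {
        shape: (1.0 if shape == active else 0.0)
        for shape in PHONEME_TO_BLENDSHAPE.values()
    }
```

For example, `blendshape_weights(MARKERS, 0.6)` turns on only `mouth_O`, because the last marker before 0.6 seconds is the "O" one at 0.5 seconds.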
And so this is the end result! It was a very fun experiment and I'll probably end up using this method again in the future for personal projects.
Please be aware that the audio clip was a test one, used only to make sure the plugin worked. It is not intended to be used in the final product, since it's a dubbed line from another game.
If this was helpful to you in any way, please consider sharing it with your gamedev friends. We really appreciate your support!
Alejandro Bielsa is a junior 3D artist working at Polygonal Mind's in-house team.
Passionate about videogames, avid tutorial drinker and cat lover.