If that is the case, then you need Live2D versions of your "speaking" animations; you can't just change the mouth. Live2D animations are moving images, so the mouth isn't in a fixed position the way it is on a static sprite. There's no way to change just the mouth on a moving image like that (at least, none that I know of).
If you have an "idle" animation, then you also need an "idle talking" animation. If you have a different animation for a different type of pose, such as an animated "surprised" pose, then you need both the "surprised" version and a "surprised talking" version. For each animated Live2D pose you have, you will need a matching talking animation created in Live2D.
Once you have your "talking" versions created in Live2D, you can have a script use character callbacks to detect when audio is playing: show the "talking" versions of the Live2D animations while audio plays, then swap back to the normal stance versions when the audio stops.
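The core of that swap logic is small. Here is a minimal pure-Python sketch of it; in Ren'Py this would live inside a character callback, and the function and naming convention below (`select_animation`, a `"<pose>_talking"` suffix) are illustrative assumptions, not part of any engine API.

```python
def select_animation(pose: str, audio_playing: bool) -> str:
    """Pick which Live2D motion to show for the current pose.

    Assumes each animated pose ("idle", "surprised", ...) has a
    matching "<pose>_talking" motion authored in Live2D.
    """
    return pose + "_talking" if audio_playing else pose


# Usage: while the voice line plays, show the talking variant;
# once audio stops, fall back to the normal stance.
print(select_animation("surprised", True))   # surprised_talking
print(select_animation("surprised", False))  # surprised
```

A script would call something like this every time the audio-playing state changes, swapping the displayed Live2D motion accordingly.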
An additional consideration: the swap between the "talking" and "non-talking" animations might be a frame snap (jumping from frame X of the "talking" version to frame 1 of the "non-talking" version). If so, the animations should be authored so the transition reads well. For example, a character could gesture with their hands as they talk, then lower their hands as they finish; that makes the transition look more natural and organic, and less like a sudden robotic frame snap.
At least, this is one method.
A better method is to have many different talking animations that continue without ending, each tailored to the duration of its respective dialogue line so the mouth stops when the audio stops (timed by the animator, not by a script). For an example of this kind of method, watch a playthrough of Potionomics. With this approach, the frame snap only occurs when a new line of dialogue starts and the animation changes, not when the audio ends.