Audio must be in English, since our training datasets are English-only. Ensure the vocals are clear; background music is acceptable. The development of portrait image animation ...