| ✅ DO ✅ | ❌ DON’T ❌ |
|---|---|
| Keep the camera stable, still, and at the actor’s level. | Don’t zoom, move the camera, or use tilted angles. |
| Record only one actor per video. | Avoid videos where multiple humans are in the foreground. |
| Use fitted clothing so the model can clearly track the actor’s limbs. | Don’t use loose clothing that hides or obscures the actor’s limbs. |
| Make sure the actor is standing on the ground in the first frame of the video. | Don’t start the video with the actor jumping, mid-air, or in an unclear stance. |

Follow the rules above for the best motion capture results. Our model tracks full-body motion (including the head and fingers), but not facial expressions or objects.
Example of an ideal video:
Example of an ideal acting video and image reference pair:


Our recommendation: To ensure the character is in the right position, use Nano Banana with the first frame of the acting video as a reference.
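
If you need to pull the first frame out of the acting video yourself, one possible approach is to call ffmpeg from a short Python script. This is only a sketch: it assumes ffmpeg is installed and on your PATH, and the function names and file names (`first_frame_command`, `extract_first_frame`, `acting_video.mp4`, `first_frame.png`) are hypothetical, not part of our product.

```python
import subprocess
from pathlib import Path


def first_frame_command(video_path: str, image_path: str) -> list[str]:
    """Build an ffmpeg command that saves the first frame of a video as an image."""
    return [
        "ffmpeg",
        "-y",               # overwrite the output file if it already exists
        "-i", video_path,   # input: the acting video
        "-frames:v", "1",   # stop after writing a single video frame
        image_path,         # output image; format is inferred from the extension
    ]


def extract_first_frame(video_path: str, image_path: str) -> Path:
    """Run ffmpeg (assumed to be installed) and return the saved image path."""
    subprocess.run(first_frame_command(video_path, image_path), check=True)
    return Path(image_path)


# Example call with hypothetical file names:
# extract_first_frame("acting_video.mp4", "first_frame.png")
```

The saved image can then be supplied as the reference when creating the character image, so the character's starting pose matches the actor's.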
To get the best results, use your prompt to clarify anything the model can’t infer directly from the image or acting video.