
Day 92 of 100 Days Agentic Engineer Challenge: Step-by-Step Guide to Consistent AI Actor Generation

3 min read · Apr 5, 2025

There are two types of AI videos: shorts made of 6 to 10 scenes of about 5 seconds each, and movies with AI actors and lipsync, where a single scene can run for a few minutes. The first type is easy to produce, but the second is a challenge that requires many steps and tools. Below I explain how to do it and what to use.

Step-by-Step Guide to Consistent AI Actor Generation

Here is a step-by-step process inspired by the Imagine Art Films course on Skool.

1. Create a Master Image
🔗 Flux Realism

  • Generate a highly realistic and original image of your character using an AI tool.
  • Make sure the image is high-quality and accurately captures the character’s essence.

2. Upscale the Original Image
🔗 Real-ESRGAN

  • Use an upscaling tool like Real-ESRGAN on Replicate to enhance resolution.
  • Avoid face enhancement features to preserve the original look.

3. Create Character Poses
🔗 Consistent Character

  • Generate multiple images (preferably 1024x1024) of the character in varied poses, angles, and styles.
  • Make sure the outputs are realistic and diverse to build a richer dataset.

Optional Tools:

4. Prepare the Image Dataset

  • Name each image sequentially (e.g., 1.png, 2.png, etc.).
  • Organize the images in a dedicated folder for easy access.
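The naming step above can be scripted. Below is a minimal sketch, assuming the generated images sit in one folder. The function name `number_images` and the folder names are hypothetical. It copies the files into a dedicated dataset folder as 1.png, 2.png, and so on, keeps each file's original extension, and leaves the source files untouched.

```python
import shutil
from pathlib import Path

# Extensions treated as images (an assumption; adjust to your files).
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def number_images(src_dir: str, dst_dir: str) -> list[Path]:
    """Copy images from src_dir into dst_dir as 1.<ext>, 2.<ext>, ...

    Files are numbered in order of their original names.
    Non-image files are ignored.
    """
    src = Path(src_dir)
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)

    images = sorted(
        p for p in src.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )

    copied = []
    for index, image in enumerate(images, start=1):
        target = dst / f"{index}{image.suffix.lower()}"
        shutil.copy2(image, target)  # copy, so the originals stay intact
        copied.append(target)
    return copied
```

For example, `number_images("raw_poses", "dataset")` would fill a `dataset` folder with the numbered copies. Copying instead of renaming in place avoids name collisions if some files are already called 1.png or 2.png.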

5. Generate Captions (Optional)

  • 🔗 Single Image Captioning
  • 🔗 Batch Captioning (requires OpenAI API key)
  • Create AI-generated captions for each image to add descriptive context.
  • Save each caption in a .txt file named to match its image (e.g., 1.txt for 1.png).
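The file-naming rule for captions can be sketched as follows. This is only an illustration: the captions themselves would come from whichever captioning tool you use, and the helper names `write_caption` and `images_missing_captions` are hypothetical. The first helper saves a caption next to its image under the matching name, and the second lists images that still have no caption file.

```python
from pathlib import Path

# Extensions treated as images (an assumption; adjust to your files).
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def write_caption(image_path: Path, caption: str) -> Path:
    """Save the caption as <image name>.txt next to the image (1.png -> 1.txt)."""
    caption_path = image_path.with_suffix(".txt")
    caption_path.write_text(caption.strip() + "\n", encoding="utf-8")
    return caption_path


def images_missing_captions(dataset_dir: Path) -> list[str]:
    """Return the names of images that have no matching .txt file."""
    return sorted(
        p.name
        for p in dataset_dir.iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES
        and not p.with_suffix(".txt").exists()
    )
```

Running `images_missing_captions(Path("dataset"))` before zipping is a quick way to spot images that were skipped during captioning.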

6. Compile the Final Dataset

  • Zip all image and caption files into a single archive.
  • Name the zip file appropriately for easy identification.
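Zipping can be done with Python's standard `zipfile` module. The sketch below assumes all images and captions sit directly in one dataset folder. The function name `zip_dataset` and the example archive name are hypothetical. Files are stored at the top level of the archive, without the parent folder.

```python
import zipfile
from pathlib import Path


def zip_dataset(dataset_dir: str, archive_path: str) -> list[str]:
    """Pack every file in dataset_dir into a flat zip archive.

    Returns the file names that were added, in sorted order.
    """
    root = Path(dataset_dir)
    files = sorted(p for p in root.iterdir() if p.is_file())

    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in files:
            # arcname drops the folder prefix so the archive has no nesting
            archive.write(path, arcname=path.name)

    return [p.name for p in files]
```

For example, `zip_dataset("dataset", "actor_anna_v1.zip")` would produce an archive whose name identifies the character and the dataset version.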

7. Train the LoRA Model
🔗 Flux LoRA Trainer

  • Upload your zip archive to a training platform like Replicate.
  • Set parameters such as training steps (recommended: 2000–3000 for better results).
  • Make sure the model visibility is set to private if needed.
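The training parameters can be collected in a small helper before they are sent to the platform. This is only a sketch: the input names `input_images` and `steps` are assumptions about the trainer and should be checked against its page, and the function names are hypothetical. The default of 2500 steps is simply the middle of the recommended range.

```python
def in_recommended_range(steps: int) -> bool:
    """True if steps falls in the 2000-3000 range recommended above."""
    return 2000 <= steps <= 3000


def build_training_input(archive_url: str, steps: int = 2500) -> dict:
    """Build the input dictionary for a LoRA training run.

    The key names are assumptions, not a confirmed trainer schema.
    """
    if not archive_url.lower().endswith(".zip"):
        raise ValueError("the dataset should be a .zip archive")
    if steps <= 0:
        raise ValueError("steps must be a positive integer")
    return {
        "input_images": archive_url,  # assumed name for the dataset input
        "steps": steps,               # assumed name for the step count
    }
```

With the Replicate Python client, a dictionary like this would typically be passed as the `input` argument of a training call, together with the trainer version and a destination model. Model visibility is set on the destination model, not in this dictionary.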

8. Test and Refine the Model

  • Generate sample images using your new LoRA model.
  • Evaluate the results and adjust prompts or parameters as needed to fine-tune output quality.

A Few Improvements:

  • For ultra-realistic images, you can explore other models available on Civitai or Hugging Face. You can also try Flux 1.1 Pro Ultra in Raw mode.
  • Additional upscaling options include Tensorpix and Topaz Labs.
  • For training, you can also use Black Forest Labs’ Flux Pro Finetuning service.

This process is focused solely on generating consistent images with digital actors — there’s no video, no audio, and no lipsync involved. It requires multiple steps, and for each new actor, the process must be repeated.

Before starting, you’ll need a script to define the scenario for generating the images.

I’m planning to automate this entire workflow and integrate it into my AI video platform: Pixonaut.art.

Damian Dąbrowski

Written by Damian Dąbrowski

Hi, I’m Damian, an Electrical Power Engineer who loves building AI-powered apps.
