Reference Images Done Right for Stable Results

Reference images should define what the model inherits, not overconstrain it. Clarity beats quantity.

Figure: Portrait scene with structured environment and depth.

Give each reference a single job

Unstable runs often happen when one image is expected to lock face identity, outfit, composition, and mood at the same time.

Decide whether each reference is for identity, spatial layout, or material atmosphere. One clear goal per image leaves the model with fewer competing cues to reconcile.
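As a rough illustration of the one-job rule, the sketch below tags each reference with exactly one role before anything is sent to a model. The `Role` and `Reference` names, the `describe` helper, and the file names are all hypothetical and not tied to any particular tool.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    # The three jobs named in the text above.
    IDENTITY = "identity"
    LAYOUT = "spatial layout"
    ATMOSPHERE = "material atmosphere"


@dataclass(frozen=True)
class Reference:
    path: str   # hypothetical file name
    role: Role  # a single field, so each image can hold only one job


def describe(refs):
    """Return one plain-language line per reference stating its job."""
    return [f"{ref.path}: use for {ref.role.value} only" for ref in refs]


refs = [
    Reference("face.png", Role.IDENTITY),
    Reference("room.png", Role.LAYOUT),
]
for line in describe(refs):
    print(line)
```

Because `role` is a single value rather than a list, an image cannot quietly be assigned identity, outfit, and mood all at once.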

Reduce conflicting signals

More references do not always improve stability. If lighting and camera distance conflict across references, the model receives mixed directions.

Use one or two images shot under similar conditions whenever possible. Signal consistency matters more than count.
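One way to catch conflicts before a run is to note the lighting and camera distance of each reference by hand and compare them. This is a minimal sketch under that assumption: the dictionary keys, labels, and the `conflicting_attributes` helper are illustrative, and nothing here inspects the images themselves.

```python
def conflicting_attributes(refs, keys=("lighting", "camera_distance")):
    """Return the attribute names on which the references disagree."""
    conflicts = []
    for key in keys:
        # Collect the distinct hand-written labels for this attribute.
        values = {ref[key] for ref in refs}
        if len(values) > 1:
            conflicts.append(key)
    return conflicts


# Hypothetical, manually annotated references.
refs = [
    {"path": "a.png", "lighting": "soft window light", "camera_distance": "close-up"},
    {"path": "b.png", "lighting": "soft window light", "camera_distance": "full body"},
]
print(conflicting_attributes(refs))
```

A non-empty result suggests dropping or replacing one of the images rather than adding a third.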

State inheritance rules explicitly

References are input context, not instructions. Prompt text should explicitly say what to keep and what to change.
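The keep-and-change rule can be made routine by assembling the prompt from two explicit lists. The sketch below is one possible phrasing, not a required format: the `inheritance_prompt` helper and its wording are assumptions, and the right phrasing depends on the model being used.

```python
def inheritance_prompt(subject, keep, change):
    """Build prompt text that states what to keep and what to change."""
    keep_text = ", ".join(keep)
    change_text = ", ".join(change)
    return (
        f"{subject}. "
        f"Keep from the reference: {keep_text}. "
        f"Change: {change_text}."
    )


# Hypothetical example values.
prompt = inheritance_prompt(
    "Portrait in a structured interior",
    keep=["face identity"],
    change=["outfit", "lighting"],
)
print(prompt)
```

Writing both lists out forces the decision about what the reference is for before the run, instead of leaving the model to infer it.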

When visual and textual constraints agree, first-pass quality improves and later extension to video becomes easier.