
DPO (Direct Preference Optimization) LoRA for XL and 1.5 - OpenRail++ - SDXL - V1.0

Example Images

A female warrior clad in silver armor stands in a forest holding a glowing sword and a blue shield with red emblem.
Macro shot of an extraterrestrial creature with iridescent blue and green feathers, large expressive eyes, and glowing bioluminescent highlights, perched on a red alien plant.
Young woman with blonde pixie haircut sitting on a red armchair wearing a school uniform with a red tie in a living room setting with plants and red curtains.
A detailed Neo-Byzantine style circular mosaic featuring ruby, sapphire, amethyst, and gold elements in an intricate floral and fractal pattern with silver leaves.
A sharp mountain peak silhouetted against a fiery orange sunset sky, reflected in a clear lake with visible rocks beneath the surface.
Colorful cute robot character with multiple arms.
A mountain temple surrounded by misty peaks and calm waters.

Recommended Prompts

RAW photo, a close-up picture of a cat, a close-up picture of a dog, orange eyes, blue eyes, reflection in its eyes

Recommended Parameters

Sampler: DPM2
Steps: 25
CFG scale: 5


What is DPO?

DPO stands for Direct Preference Optimization, a process in which a diffusion model is fine-tuned on pairs of images ranked by human preference. Meihua Dang et al. trained Stable Diffusion 1.5 and Stable Diffusion XL with this method on the Pick-a-Pic v2 dataset, which can be found at https://huggingface.co/datasets/yuvalkirstain/pickapic_v2; the method is described in their paper at https://huggingface.co/papers/2311.12908.
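At its core, the training objective rewards the model for denoising the human-preferred image of each pair better than the rejected one, relative to a frozen reference model. Below is a minimal sketch of that objective; the beta value and the per-timestep weighting from the paper are folded into a single constant, so treat it as an illustration rather than the authors' training code.

```python
import torch.nn.functional as F

def diffusion_dpo_loss(err_w, err_l, ref_err_w, ref_err_l, beta=5000.0):
    """Simplified Diffusion-DPO objective (illustrative sketch only).

    err_w, err_l         -- denoising MSE of the model being trained on the
                            preferred (w) and rejected (l) image of a pair
    ref_err_w, ref_err_l -- the same errors under the frozen reference model
    """
    # How much better (lower error) the trained model fits each image,
    # compared with the frozen reference model
    margin_w = err_w - ref_err_w
    margin_l = err_l - ref_err_l
    # Reward fitting the preferred image better than the rejected one
    return -F.logsigmoid(-beta * (margin_w - margin_l)).mean()
```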

What does it Do?

The DPO-tuned models have been observed to produce higher-quality images than their untuned counterparts, with noticeably better adherence to the prompt. These LoRA can bring that prompt adherence to other fine-tuned Stable Diffusion models.
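Here is a minimal sketch of applying the XL LoRA with the diffusers library, using the recommended sampler, step count, and CFG scale from above. The LoRA filename is an assumption; point it at the file you actually downloaded. In diffusers, the DPM2 sampler corresponds to KDPM2DiscreteScheduler.

```python
import torch
from diffusers import StableDiffusionXLPipeline, KDPM2DiscreteScheduler

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

# DPM2 sampler, as recommended above
pipe.scheduler = KDPM2DiscreteScheduler.from_config(pipe.scheduler.config)

# Hypothetical filename -- substitute the LoRA file you downloaded
pipe.load_lora_weights(".", weight_name="sdxl-dpo-lora.safetensors")

image = pipe(
    "RAW photo, a close-up picture of a cat, orange eyes, reflection in its eyes",
    num_inference_steps=25,  # recommended steps
    guidance_scale=5.0,      # recommended cfg
).images[0]
image.save("dpo_example.png")
```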

Who Trained This?

These LoRA are based on the work of Meihua Dang (https://huggingface.co/mhdang), published at https://huggingface.co/mhdang/dpo-sdxl-text2image-v1 and https://huggingface.co/mhdang/dpo-sd1.5-text2image-v1 and licensed under OpenRail++.

How were these LoRA Made?

They were created with Kohya SS by extracting them from other OpenRail++-licensed checkpoints on CivitAI and HuggingFace; a sketch of the underlying extraction idea follows the list below.

1.5: https://civitai.com/models/240850/sd15-direct-preference-optimization-dpo extracted from https://huggingface.co/fp16-guy/Stable-Diffusion-v1-5_fp16_cleaned/blob/main/sd_1.5.safetensors.

XL: https://civitai.com/models/238319/sd-xl-dpo-finetune-direct-preference-optimization extracted from https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/sd_xl_base_1.0_0.9vae.safetensors.
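Kohya's extraction tooling works, roughly, by taking the difference between the tuned and base checkpoint weights and factorizing it into low-rank LoRA matrices via SVD. The following is a sketch of that idea for a single weight matrix, not the actual Kohya SS code:

```python
import torch

def extract_lora_from_delta(w_tuned, w_base, rank=8):
    """Factor a weight difference into LoRA up/down matrices via truncated SVD.
    A sketch of the idea behind Kohya's LoRA extraction, not the real tool."""
    delta = (w_tuned - w_base).float()
    U, S, Vh = torch.linalg.svd(delta, full_matrices=False)
    lora_up = U[:, :rank] * S[:rank]   # (out_dim, rank), singular values folded in
    lora_down = Vh[:rank, :]           # (rank, in_dim)
    return lora_up, lora_down
```

Applying the LoRA then reconstructs an approximation of the tuned weights: w ≈ w_base + lora_up @ lora_down.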

These are also hosted on HuggingFace at https://huggingface.co/benjamin-paine/sd-dpo-offsets/.


Model Details

Model type: LORA
Base model: SDXL 1.0
Model version: SDXL - V1.0
Model hash: c100ec5708

