
DPO (Direct Preference Optimization) LoRA for XL and 1.5 - OpenRail++ - SDXL - V1.0

Example Images

A female warrior clad in silver armor stands in a forest holding a glowing sword and a blue shield with red emblem.
Macro shot of an extraterrestrial creature with iridescent blue and green feathers, large expressive eyes, and glowing bioluminescent highlights, perched on a red alien plant.
Young woman with blonde pixie haircut sitting on a red armchair wearing a school uniform with a red tie in a living room setting with plants and red curtains.
A detailed Neo-Byzantine style circular mosaic featuring ruby, sapphire, amethyst, and gold elements in an intricate floral and fractal pattern with silver leaves.
A sharp mountain peak silhouetted against a fiery orange sunset sky, reflected in a clear lake with visible rocks beneath the surface.
Colorful cute robot character with multiple arms.
A mountain temple surrounded by misty peaks and calm waters.

Recommended Prompts

RAW photo, a close-up picture of a cat, a close-up picture of a dog, orange eyes, blue eyes, reflection in its eyes

Recommended Parameters

Sampler: DPM2
Steps: 25
CFG scale: 5


What is DPO?

DPO stands for Direct Preference Optimization, a process in which a diffusion model is fine-tuned on pairs of images ranked by human preference. Meihua Dang et al. trained Stable Diffusion 1.5 and Stable Diffusion XL with this method on the Pick-a-Pic v2 dataset, which can be found at https://huggingface.co/datasets/yuvalkirstain/pickapic_v2; the method is described in their paper at https://huggingface.co/papers/2311.12908.
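At its core, the training objective rewards the model for denoising the human-preferred image of each pair better than the rejected one, relative to a frozen reference model. Below is a minimal sketch of that objective; the beta value and the per-timestep weighting from the paper are folded into a single constant, so treat it as an illustration rather than the authors' training code.

```python
import torch.nn.functional as F

def diffusion_dpo_loss(err_w, err_l, ref_err_w, ref_err_l, beta=5000.0):
    """Simplified Diffusion-DPO objective (illustrative sketch only).

    err_w, err_l         -- denoising MSE of the model being trained on the
                            preferred (w) and rejected (l) image of a pair
    ref_err_w, ref_err_l -- the same errors under the frozen reference model
    """
    # How much better (lower error) the trained model fits each image,
    # compared with the frozen reference model
    margin_w = err_w - ref_err_w
    margin_l = err_l - ref_err_l
    # Reward fitting the preferred image better than the rejected one
    return -F.logsigmoid(-beta * (margin_w - margin_l)).mean()
```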

What does it Do?

The DPO-tuned models have been observed to produce higher-quality images than their untuned counterparts, with noticeably better adherence to the prompt. These LoRA can bring that prompt adherence to other fine-tuned Stable Diffusion models.
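Here is a minimal sketch of applying the XL LoRA with the diffusers library, using the recommended sampler, step count, and CFG scale from above. The LoRA filename is an assumption; point it at the file you actually downloaded. In diffusers, the DPM2 sampler corresponds to KDPM2DiscreteScheduler.

```python
import torch
from diffusers import StableDiffusionXLPipeline, KDPM2DiscreteScheduler

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

# DPM2 sampler, as recommended above
pipe.scheduler = KDPM2DiscreteScheduler.from_config(pipe.scheduler.config)

# Hypothetical filename -- substitute the LoRA file you downloaded
pipe.load_lora_weights(".", weight_name="sdxl-dpo-lora.safetensors")

image = pipe(
    "RAW photo, a close-up picture of a cat, orange eyes, reflection in its eyes",
    num_inference_steps=25,  # recommended steps
    guidance_scale=5.0,      # recommended cfg
).images[0]
image.save("dpo_example.png")
```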

Who Trained This?

These LoRA are based on the work of Meihua Dang (https://huggingface.co/mhdang), published at https://huggingface.co/mhdang/dpo-sdxl-text2image-v1 and https://huggingface.co/mhdang/dpo-sd1.5-text2image-v1 and licensed under OpenRail++.

How were these LoRA Made?

They were created with Kohya SS by extracting them from other OpenRail++-licensed checkpoints on CivitAI and HuggingFace; a sketch of the underlying extraction idea follows the list below.

1.5: https://civitai.com/models/240850/sd15-direct-preference-optimization-dpo extracted from https://huggingface.co/fp16-guy/Stable-Diffusion-v1-5_fp16_cleaned/blob/main/sd_1.5.safetensors.

XL: https://civitai.com/models/238319/sd-xl-dpo-finetune-direct-preference-optimization extracted from https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/sd_xl_base_1.0_0.9vae.safetensors.
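Kohya's extraction tooling works, roughly, by taking the difference between the tuned and base checkpoint weights and factorizing it into low-rank LoRA matrices via SVD. The following is a sketch of that idea for a single weight matrix, not the actual Kohya SS code:

```python
import torch

def extract_lora_from_delta(w_tuned, w_base, rank=8):
    """Factor a weight difference into LoRA up/down matrices via truncated SVD.
    A sketch of the idea behind Kohya's LoRA extraction, not the real tool."""
    delta = (w_tuned - w_base).float()
    U, S, Vh = torch.linalg.svd(delta, full_matrices=False)
    lora_up = U[:, :rank] * S[:rank]   # (out_dim, rank), singular values folded in
    lora_down = Vh[:rank, :]           # (rank, in_dim)
    return lora_up, lora_down
```

Applying the LoRA then reconstructs an approximation of the tuned weights: w ≈ w_base + lora_up @ lora_down.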

These are also hosted on HuggingFace at https://huggingface.co/benjamin-paine/sd-dpo-offsets/.


Model Details

Model type: LORA
Base model: SDXL 1.0
Model version: SDXL - V1.0
Model hash: c100ec5708

