AI image generated with Stable Diffusion: a cute anime girl with blonde hair, blue glasses, and light blue clothes, wearing a black ribbon in her hair.
Anime-style image, made with Stable Diffusion, of a cute blue-haired girl wearing a white wedding dress, veil, and tiara.
An anime girl with long dark blue hair, wearing a school uniform with a red ribbon, standing against a bright blue background with white flowers. AI-generated image using Stable Diffusion.
A smiling anime girl with long black hair and star ornaments, wearing a white school uniform. AI-generated image using Stable Diffusion.
AI-generated image, made with Stable Diffusion, of a magical anime girl with long blue hair, blue eyes, and a detailed outfit.
A cute anime girl with long dark hair and a big smile, making a peace sign. AI-generated image using Stable Diffusion.
A young anime-style girl with long dark hair in a white sleeveless dress, smiling and making a V peace sign, with glowing blue flowers and a night sky in the background. AI-generated image using Stable Diffusion.
A happy anime girl with dark blue hair and a red ribbon, smiling joyfully against a colorful background with flowers. Generated with Stable Diffusion.
An anime girl with flowing dark hair and a white dress, surrounded by glowing blue flowers. Made with Stable Diffusion.
A cute anime girl with long black hair and a ribbon, winking with her index finger to her lips. AI-generated using Stable Diffusion.
AI-generated image, made with Stable Diffusion, of an anime girl with lavender hair, a red jacket, and a black beret, sitting in a wheelchair.
AI-generated anime-style image, made with Stable Diffusion, of a demon girl boxing with a punching bag.

Recommended prompts

score_9,score_8_up,score_7_up

score_9,score_8_up

Recommended negative prompts

score_4,score_3,score_2,worst quality, bad hands, bad feet

score_3,score_2,ugly,bad feet

Recommended parameters

samplers

Euler a

steps

22 - 30

cfg

7

clip skip

2

resolution

848x1072, 840x1112, 952x1192, 936x1192, 872x1184, 848x1216, 824x1160

other models

T-ponynai3(5.5-4)(v3-0.5) (89e7c7518c)
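The recommended parameters above can be collected in one place for scripting. The sketch below is illustrative only: the `GenerationSettings` class, its field names, and the `closest_resolution` helper are assumptions made for this example and do not belong to any particular UI or library. Only the values come from the list above, and the default of 26 steps is simply the midpoint of the recommended 22 - 30 range.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Resolutions listed in the recommended parameters, as (width, height).
RECOMMENDED_RESOLUTIONS: List[Tuple[int, int]] = [
    (848, 1072), (840, 1112), (952, 1192), (936, 1192),
    (872, 1184), (848, 1216), (824, 1160),
]


@dataclass(frozen=True)
class GenerationSettings:
    """Hypothetical container for the recommended values."""
    sampler: str = "Euler a"
    steps: int = 26          # assumption: midpoint of the recommended 22 - 30
    cfg_scale: float = 7.0
    clip_skip: int = 2
    width: int = 848
    height: int = 1072

    def __post_init__(self) -> None:
        # Reject step counts outside the range recommended on this page.
        if not 22 <= self.steps <= 30:
            raise ValueError("steps outside the recommended 22 - 30 range")


def closest_resolution(aspect_ratio: float) -> Tuple[int, int]:
    """Return the listed resolution whose width/height ratio is nearest."""
    return min(RECOMMENDED_RESOLUTIONS,
               key=lambda wh: abs(wh[0] / wh[1] - aspect_ratio))


if __name__ == "__main__":
    width, height = closest_resolution(0.7)
    print(GenerationSettings(width=width, height=height))
```

Every listed resolution is a multiple of 8 on both sides, so picking from the list avoids having to round sizes by hand.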

Recommended hires parameters

upscaler

R-ESRGAN 4x+ Anime6B

upscale

1.6 - 1.7

steps

10

denoising strength

0.3
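To see what output size the recommended upscale factor gives, the arithmetic can be sketched as below. The `hires_target` helper and the `HIRES_SETTINGS` dictionary keys are hypothetical names for this example, and snapping each side to a multiple of 8 is an assumption, not something this page specifies.

```python
from typing import Tuple

# Values from the recommended hires parameters; the key names are illustrative.
HIRES_SETTINGS = {
    "upscaler": "R-ESRGAN 4x+ Anime6B",
    "steps": 10,
    "denoising_strength": 0.3,
}


def hires_target(width: int, height: int, scale: float = 1.6,
                 multiple: int = 8) -> Tuple[int, int]:
    """Scale a base resolution and snap each side to a multiple of `multiple`."""
    if not 1.6 <= scale <= 1.7:
        raise ValueError("scale outside the recommended 1.6 - 1.7 range")

    def snap(side: int) -> int:
        # Round to the nearest multiple (assumption: sizes should be multiples of 8).
        return int(round(side * scale / multiple)) * multiple

    return snap(width), snap(height)


if __name__ == "__main__":
    print(hires_target(848, 1072))        # base 848x1072 at 1.6x
    print(hires_target(848, 1072, 1.7))   # base 848x1072 at 1.7x
```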

Tips

For best results, use hires fix at a moderate resolution.

Try style_3 or style_4 to improve eye detail.

Version highlights

This version is an optimization of 5.1. It improves eye detail, the plausibility of feet, sensitivity to prompts, and the plausibility of overlapping limbs. The handling of light and shadow still does not fully meet my expectations. In my tests, using style_4 made the picture darker, which may serve as a short-term solution. Direct training did not work well, so I did some additional training, which cost me some time. If you run into further problems, please let me know in the comments!

Creator sponsor

[Unverified] Tonade is the creator of the T-ponynai3 model; the ID on the C site is Tonade. | Afdian (afdian.net)

This is my Afdian sponsorship channel. If you like the model and can spare it, your support is welcome! Please do not overextend yourself; even a small contribution is appreciated. I will keep working to improve the model!

929721518 is the number of my small QQ group. If you have questions about tpony, feel free to join and ask. Please mention that you came from the C site.

The model already includes a VAE, so there is no need to add a separate one.

The best generation strategy is to use hires fix at a moderate resolution rather than outputting directly at high resolution.

(This model may be available on both Tusi and Tensor; using it on Tusi is recommended. Let me know if you run into any problems.)

v5 adds four new styles. The prompt words style_1 through style_4 can be used to fine-tune image details (in theory, at least; in practice the effect is rather unpredictable or weak).

This model fully supports LoRAs trained with ponyv6 as the base model. LoRAs for ani3 and sdxl1.0 also work to some extent.

Image inpainting tests based on v4.1 (a part that was overlooked in earlier versions).

Pony is god; its compatibility is excellent. This model supports both ani and pony LoRAs.

The required prefix quality words are the same as for ponydiffusion:

positive: (score_9,score_8_up,score_7_up,score_6_up,score_5_up,score_4_up)

or (score_9,score_8_up,score_7_up)

Negative words can also be added:

negative: (score_4,score_3,score_2,score_1)

Common nai-style negative words can also be added, for example:

negative: worst quality, bad hands, bad feet
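The prefix and negative words above can be assembled programmatically. This is a minimal sketch: the `build_prompts` function and its parameters are hypothetical, the tag `1girl, blue hair` is only a placeholder subject, and joining with `", "` is an assumption about how the prompt is written, not a requirement of the model.

```python
from typing import Tuple

# Quality-word strings copied from the text above.
POSITIVE_PREFIX_FULL = "score_9,score_8_up,score_7_up,score_6_up,score_5_up,score_4_up"
POSITIVE_PREFIX_SHORT = "score_9,score_8_up,score_7_up"
NEGATIVE_SCORE = "score_4,score_3,score_2,score_1"
NEGATIVE_NAI = "worst quality, bad hands, bad feet"


def build_prompts(subject: str, short_prefix: bool = True,
                  nai_negatives: bool = True) -> Tuple[str, str]:
    """Return (positive, negative) prompts with the quality words prepended."""
    prefix = POSITIVE_PREFIX_SHORT if short_prefix else POSITIVE_PREFIX_FULL
    positive = f"{prefix}, {subject}"
    negative = NEGATIVE_SCORE
    if nai_negatives:
        # Optionally append the common nai-style negative words.
        negative = f"{negative}, {NEGATIVE_NAI}"
    return positive, negative


if __name__ == "__main__":
    pos, neg = build_prompts("1girl, blue hair")  # placeholder subject
    print(pos)
    print(neg)
```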

I hope you like it ᕕ(◠ڼ◠)ᕗ Based on nai3 and ponyv6.

Training notes: v1 used 94 images, v2 used 119, v3 used 348, and v3.5 used 474, all generated with nai3. A LoRA trained on these images was merged into the base model to fine-tune it. All artist tags supported by ponyv6 are supported, but no additional artist tags from nai3. Using two or more artist tags may cause the background to collapse. Genshin Impact characters can currently be generated; other characters are untested. The model has not been tested much, but it reproduces the nai3 art style well. The base model is a merge of T-anime-xl, ponyv6, and ani3, and it has not been released yet.

The graphics card I used is my own 3090. v1, v2, v3, and v3.5 were trained for 7, 12, 35, and 47 hours respectively.

v1

An interesting attempt.

v2

Building on v1, I slightly enlarged the training set and spent about 30 hours on trial and error with the parameters. The trained art style was still somewhat overfitted, showing artifacts such as double navels and messy hair.

v3

Limbs in v3 are better than in v2. With a better understanding of footfocus, v3 can generate feet with greater visual impact and more difficult perspectives. Hair in v3 also looks less AI-generated than in v2: the v2 training set was too small, so its hair could be slightly overfitted. The occasional double navels seen in v2 are gone as well. Overall, a training set three times the size of v2's and a larger dim parameter make the art style fit more naturally, and v3 performs much better than v2 on long prompts.

v3.5

This version is less strict about quality words: you can generate images without using pony's aesthetic-score quality words at all. In testing, images occasionally came out as meaningless color blocks. If that happens, replace the aesthetic-score quality words with the quality words commonly used with 1.5 models, for example replacing score_1, score_2 with worst quality. I added about 150 more training images to balance and enrich the art style, and reduced the initial slope of the learning curve. This makes the model less overfitted, so it works with more LoRAs and more whimsical prompts. Overall, this is a freer version than v3. Its rendering of male characters is much stronger, and under some prompts the colors and style are less gaudy and greasy.
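The substitution described above (swapping low aesthetic-score tags for a generic quality word) can be sketched as a small helper. The `swap_score_tags` function and the `SCORE_TO_GENERIC` mapping are hypothetical; the mapping contains only the single example the text gives, and splitting on commas is an assumption about the prompt format.

```python
# Mapping taken from the one example in the text; extend as needed.
SCORE_TO_GENERIC = {
    "score_1": "worst quality",
    "score_2": "worst quality",
}


def swap_score_tags(negative_prompt: str) -> str:
    """Replace mapped score tags and drop duplicates, keeping first-seen order."""
    result = []
    for raw_tag in negative_prompt.split(","):
        tag = raw_tag.strip()
        if not tag:
            continue  # skip empty entries left by trailing commas
        replacement = SCORE_TO_GENERIC.get(tag, tag)
        if replacement not in result:
            result.append(replacement)
    return ", ".join(result)


if __name__ == "__main__":
    print(swap_score_tags("score_2, score_1, bad hands"))
```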

v4

This version was trained on 798 images for 90 hours on a 3090 graphics card. Compared with v3.5, composition and the depiction of certain parts are more accurate under certain prompts, with fewer problems such as finger ghosting and overlapping body parts. My main training target was medium-length and slightly shorter prompts; nobody likes having to write a long string of prompts to get a high-quality image, right? With pony's aesthetic-score quality prompts removed, image quality is significantly better than in v3.5, and results tend to look flat rather than three-dimensional, closer to the classic anime style. Testing how the number of images affects ponyv6 fine-tuning is nearly complete. The next step is to work on the training tags: I will try to add more adjustable prompts within pony's limited training material (for example aesthetic scores; the current training logic still uses mainstream quality words to override pony's aesthetic-score quality words). I will also keep adding suitable new training material, such as scenes and more feet (v4's foot material seems a bit scarce).

v4.1

First, I apologize to all users for releasing a new version so soon, which is a real test of your computer's memory and network speed. O_O

This new version is a limb-debugging version of v4. Limb behavior in v4 was hard to control, and over the past few days the rate of well-formed hands fell short of my testing expectations. So my friend 木猫猫猫 and I adjusted and improved v4 until the limbs in v4.1 met my expectations. I will release several XY plots that clearly show the improvement of v4.1 over v4 under the same parameters.

v5

This version uses less training material. After the failure of v4, I started another project to test an idea with a small memory footprint: training four LoRAs in different art styles for T-ponynai3. The original models were of course also uploaded to Civitai. After testing how well they fit, I trained these four art styles into T-ponynai3-v5 as additives. Surprisingly, line quality in v5 improved a great deal, probably because the material I trained on was very delicate. I tagged the four styles with the prompt words style_1 through style_4. Unfortunately, for some reason the styles did not separate, or did so only weakly, and instead blended well into the original art style. So although the goal of supporting multiple art styles was not achieved, the texture of the original nai3 art style was lifted a level. Perhaps the next version can take this further. (I really enjoy playing games, and not being able to play on my computer whenever I train is hard on me.)

Summary of some issues with v5.

1. LoRA compatibility, limbs, and blurred eyes. The LoRA compatibility problem comes from using too high a final weight in this training, which can cause overfitting in some cases. This optimized version lowers that weight, so the limb collapse rate and compatibility with some LoRAs should be better. I ran several comparison images using style LoRAs trained on v4.1 for reference. The blurred eyes are probably caused by style_1: the eyes in its source material are blurry. Using style_3 or style_4 can improve this.

2. Overexposure with volumetric light. I did not encounter this during testing. The likely cause is the noise offset training parameter, which makes the model more sensitive to light-related prompt words, so a light prompt at the same weight gives a brighter result. I suggest not using parentheses and numbers to raise the weight. Because sdxl is sensitive to prompt words, you can also try repeating the same prompt word several times to avoid extreme results. The parameter itself is there to fix results turning yellow when only a few prompt words are used. I ran several comparison images for reference.

3. Reduced model complexity. In theory and in practice, v5 should be a cleaner and more diverse model than the previous version, and with some prompts it should render more accurately. I ran several comparison images here as well. This training set avoids overly complex material because I believe overly complex images tend to cause overfitting, which inevitably loses some detail.

Purpose: I want a model that differs noticeably from the previous version rather than one that is almost identical to it. Your feedback is a great opportunity for trial and error, since on my own I have little room for it. In the next version I will try to increase the amount of material for each art style so that the styles blend well yet remain separable. Switching art styles with specific prompts may require some new training techniques. Thank you for your feedback!
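The advice in item 2 about not raising weights with parentheses and numbers can be applied to an existing prompt with a small filter. This sketch assumes the A1111-style emphasis syntax, where `(text:1.3)` sets a numeric weight and plain `(text)` adds emphasis. The `strip_emphasis` name is hypothetical, and escaped parentheses such as `\(` are not handled.

```python
import re

# Matches "(some text:1.3)" and captures the text before the colon.
_WEIGHTED = re.compile(r"\(([^()]*?):\s*[0-9]*\.?[0-9]+\)")
# Matches an innermost "(some text)" group with no nested parentheses.
_PLAIN = re.compile(r"\(([^()]*)\)")


def strip_emphasis(prompt: str) -> str:
    """Remove numeric weights and emphasis parentheses, keeping the words."""
    previous = None
    # Repeat until nothing changes so nested groups like "((text))" unwrap fully.
    while previous != prompt:
        previous = prompt
        prompt = _WEIGHTED.sub(r"\1", prompt)
        prompt = _PLAIN.sub(r"\1", prompt)
    return prompt


if __name__ == "__main__":
    print(strip_emphasis("(volumetric lighting:1.4), ((sunbeam)), 1girl"))
```

The loop is there because each pass removes only the innermost pair of parentheses.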

Model details

Model type

Checkpoint

Base model

Pony

Model version

v5.5

Model hash

89e7c7518c
