Tponynai3 - v55

김지훈 (Kim Ji-hoon)

5/23/2025

1:38:39 AM

Tips

Use hires fix at a moderate resolution for the best results.

Try style_3 or style_4 to improve eye detail.

Version highlights

This version is an optimization of 5.1. It improves eye detail, the plausibility of feet, prompt sensitivity, and the plausibility of overlapping limbs. The handling of light and shadow still falls short of my expectations, though. In my tests, style_4 made the picture darker, which may serve as a short-term workaround. Direct training did not work well, so I ran some additional training, which cost me some time. If you run into further problems, please be sure to let me know in the comments!

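Creator sponsor

[Unverified] Tonade is the creator of the T-ponynai3 model; the C site ID is Tonade. | Afdian (afdian.net)

This is my Afdian sponsorship channel. If you like the model and can spare it, please consider supporting me! Do not overstretch yourself; even small contributions are appreciated. I will keep working to improve the model!

929721518 is my small QQ group number. If you have questions about tpony, join and ask. Please be sure to mention that you came from the C site.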

This model already includes a built-in VAE, so there is no need to add a separate VAE.

The best generation strategy is to use hires fix at a moderate resolution rather than generating directly at high resolution.


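T-ponynai3-v5 (weight-adjusted version) | Stable Diffusion checkpoint | Tusi tusi.cn (tusiart.com): online generation link on tusiart (the Chinese version of Tensor)

(This model may be hosted on both Tusi and Tensor at the same time, so using it on Tusi is recommended. Let me know if you run into any problems.)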

v5 adds four new styles, style_1 through style_4, which can be used to fine-tune image details (in theory; in practice the effect is somewhat unpredictable or weak).

This model fully supports LoRAs trained with ponyv6 as the base model, and LoRAs for ani3 and sdxl1.0 also work to some extent.

Image inpainting tests based on v4.1 (a part that was previously overlooked)

Pony is a god, with excellent compatibility. This model supports LoRAs for both ani and pony.

The required quality prefix tags are the same as for ponydiffusion.

positive: (score_9,score_8_up,score_7_up,score_6_up,score_5_up,score_4_up)

or (score_9,score_8_up,score_7_up)

Negative tags can also be added:

negative: (score_4,score_3,score_2,score_1),

Common nai-style negative tags can be added as well, for example:

negative: worst quality, bad hands, bad feet
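The tag lists above can be assembled programmatically. The following is a minimal sketch: the tag strings come from this page, while the helper name `build_prompts`, its parameters, and the example subject tags are hypothetical.

```python
# Quality tags copied from the lists above.
PONY_POSITIVE_FULL = ["score_9", "score_8_up", "score_7_up",
                      "score_6_up", "score_5_up", "score_4_up"]
PONY_POSITIVE_SHORT = ["score_9", "score_8_up", "score_7_up"]
PONY_NEGATIVE = ["score_4", "score_3", "score_2", "score_1"]
NAI_NEGATIVE = ["worst quality", "bad hands", "bad feet"]


def build_prompts(subject_tags, short_prefix=False, extra_negative=()):
    """Return (positive, negative) prompt strings.

    The score tags are wrapped in parentheses exactly as written on
    this page; everything else is joined with ", ".
    """
    prefix = PONY_POSITIVE_SHORT if short_prefix else PONY_POSITIVE_FULL
    positive_scores = "(" + ",".join(prefix) + ")"
    negative_scores = "(" + ",".join(PONY_NEGATIVE) + ")"
    positive = ", ".join([positive_scores, *subject_tags])
    negative = ", ".join([negative_scores, *NAI_NEGATIVE, *extra_negative])
    return positive, negative


# Example subject tags are placeholders, not recommendations.
positive, negative = build_prompts(["1girl", "solo"], short_prefix=True)
print(positive)  # (score_9,score_8_up,score_7_up), 1girl, solo
print(negative)  # (score_4,score_3,score_2,score_1), worst quality, bad hands, bad feet
```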

I hope you like it ᕕ(◠ڼ◠)ᕗ Based on nai3 and ponyv6.

Training notes: I fine-tuned the base model by merging in LoRAs trained on nai3-generated images: 94 images for v1, 119 for v2, 348 for v3, and 474 for v3.5. The model supports every artist tag that ponyv6 supports, but adds no extra artist tags from nai3. Using two or more artist tags may cause the background to collapse. Genshin Impact characters can currently be generated; other characters are untested. I have not tested this model extensively, but it reproduces the nai3 art style well. The base model is a merge of T-anime-xl, ponyv6, and ani3 and has not been released yet.

The graphics card I used is my personal 3090. v1, v2, v3, and v3.5 were trained for 7, 12, 35, and 47 hours, respectively.

v1

An interesting attempt.

v2

Building on v1, I slightly enlarged the training set and spent about 30 hours on parameter trial and error, but the trained art style still showed some overfitting, such as double navels and messy hair.

v3

v3 renders limbs better than v2. Thanks to a better understanding of footfocus, v3 can generate feet with greater visual impact and more difficult perspectives. v3's hair also looks less AI-generated than v2's: v2's training set was too small, so the hair could be slightly overfitted. The double navels that occasionally appeared in v2 are gone as well. Overall, a training set three times the size of v2's and a larger dim parameter make the art style fit more naturally, and v3 performs much better than v2 on long prompts.

v3.5

In this version the requirements for quality tags are less strict: you can generate images without using pony's aesthetic-score quality tags at all. In testing, images occasionally came out as meaningless color blocks. If that happens, replace the aesthetic-score quality tags with the quality words commonly used with 1.5 models; for example, replace score_1, score_2 with worst quality. I added about 150 more training images to balance and enrich the art style, and lowered the initial slope of the learning curve, which makes the model less overfitted and lets it adapt to more LoRAs and whimsical prompts. Overall, this version is freer than v3, renders male characters much better, and under some prompts its colors and art style are less garish and greasy.
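The substitution suggested for v3.5 (dropping pony score tags in favor of general quality words) can be sketched as a small helper. The function name `swap_score_tags` and the regular expression are assumptions for illustration; the sketch only handles score tags that appear as individual comma-separated entries, optionally wrapped in one shared pair of parentheses.

```python
import re

# Matches pony aesthetic-score tags such as score_1 or score_8_up.
SCORE_TAG = re.compile(r"score_\d+(_up)?")


def swap_score_tags(prompt, replacement):
    """Drop pony score tags from a prompt and prepend replacement words."""
    kept = []
    for raw in prompt.split(","):
        tag = raw.strip()
        if not tag:
            continue
        # Ignore surrounding parentheses when testing for a score tag.
        if SCORE_TAG.fullmatch(tag.strip("()")):
            continue
        kept.append(tag)
    # Put the replacement words first, skipping any already present.
    merged = [word for word in replacement if word not in kept] + kept
    return ", ".join(merged)


negative = swap_score_tags("(score_4,score_3,score_2,score_1), bad hands",
                           ["worst quality"])
print(negative)  # worst quality, bad hands
```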

v4

This version was trained on 798 images for 90 hours on a 3090 graphics card. Compared with v3.5, composition and the depiction of certain body parts are more accurate under certain prompts; for example, finger ghosting and overlapping body parts are improved. My main training target was medium-length and slightly shorter prompts, since nobody likes having to write a long string of prompts just to get a high-quality image, right? With pony's aesthetic-score quality prompts removed, image quality is significantly better than in v3.5, and the results lean flat rather than three-dimensional, closer to a classic anime style. My testing of how the number of images affects ponyv6 fine-tuning is nearly complete. The next step is to start from the training tags and try to add more adjustable prompts within pony's limited training material (such as adding aesthetic scores; the current training logic still uses mainstream quality words to override pony's aesthetic-score quality words). I will also keep adding suitable new training material, such as scenes and more feet (v4's foot material seems a bit scarce).

v4.1

First, I would like to apologize to all users for releasing a new version so soon after the last one, which puts a real strain on your computer's memory and network speed. O_O

This new version is a limb-debugging version of v4. Limb results in v4 were very hard to control, and the rate of well-formed hands fell short of my expectations in testing over the past few days. So my friend 木猫猫猫 and I made some adjustments and improvements to v4, which brought the limbs in v4.1 up to my expectations. I will release several xy comparison grids that clearly show how v4.1 improves on v4 under the same parameters.

v5

This version uses less training material. After v4's failure, I started another project to test my idea in a way that uses less memory: training four LoRAs in different art styles adapted to T-ponynai3. Of course, the original model was also uploaded to Civitai. After testing how well they fit, I trained these four art styles into T-ponynai3-v5 as additives. Surprisingly, the line quality of v5 improved greatly, probably because the training material was very delicate. I marked the four art styles with the prompt words style_1 through style_4. Unfortunately, for some reason the four styles were not separated, or only weakly so, and instead blended well into the original art style. Although this did not achieve the goal of supporting multiple art styles, it did raise the texture of the original nai3 art style to a higher level. Perhaps the next version can take this further. (I really enjoy playing games, and not being able to play computer games whenever I am training is too hard for me.)

Summary of some issues with v5

1. LoRA compatibility, limbs, and blurred eyes. The LoRA compatibility problem comes from using too high a final weight in this training run, which can cause overfitting in some cases. This optimized version lowers that weight, so the limb collapse rate and compatibility with some LoRAs should be better. I have run several comparison images using style LoRAs trained on v4.1 for reference. The blurred eyes are most likely due to style_1: the eyes in the original material used to train it were blurry. Using style_3 or style_4 can improve this.

2. Overexposure with volumetric light. I did not run into this during testing. The likely cause is that I used the noise offset training parameter, which raised the model's sensitivity to light-related prompt words, so light prompts at the same weight give brighter results. I suggest not using parentheses or numbers to increase the weight. Because sdxl is sensitive to prompt words, you can also try repeating the same prompt word several times to avoid extreme results. I used this parameter to fix the problem of yellowish results when only a few prompt words are given. I have run several comparison images for reference.

3. Reduced model complexity. Both in theory and in practice, v5 should be a cleaner and more diverse model than previous versions, and with the help of some prompts it should render more accurately. I ran several comparison images for this as well. This training set did not use overly complex material, because I believe overly complex images tend to cause overfitting, which inevitably leads to some loss of detail.

Purpose: I want a model that differs clearly from the previous version rather than releasing one that is almost identical. Everyone's feedback is a great opportunity for trial and error, and on my own I have very little budget for trial and error. In the next version I will try to increase the amount of material for the different art styles so that the styles blend well yet remain separable. Switching art styles with specific prompts may require some new training techniques. Thank you for your feedback!
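For the exposure issue in item 2, the advice to avoid parenthesis-and-number weighting and to repeat words instead can be sketched as a prompt rewrite. This assumes the `(tag:1.4)` emphasis syntax used by common Stable Diffusion web front ends; the helper name `soften_emphasis` and the example tags are hypothetical, and plain parentheses without a number are left untouched.

```python
import re

# Matches explicit numeric emphasis such as "(volumetric lighting:1.4)".
WEIGHTED = re.compile(r"\(\s*([^():]+?)\s*:\s*[\d.]+\s*\)")


def soften_emphasis(prompt, repeat=1):
    """Replace "(tag:weight)" with the bare tag, repeated `repeat` times."""
    def expand(match):
        tag = match.group(1)
        return ", ".join([tag] * repeat)

    return WEIGHTED.sub(expand, prompt)


original = "1girl, (volumetric lighting:1.4), forest"
print(soften_emphasis(original))
# 1girl, volumetric lighting, forest
print(soften_emphasis(original, repeat=2))
# 1girl, volumetric lighting, volumetric lighting, forest
```

Whether repetition or a plain unweighted tag works better will depend on the prompt, so treat `repeat` as something to experiment with.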