Stable Diffusion: What Are the Real Uses?

An Arabic translation of this article is available at the end.

Believe it or not, AI is one of the fastest-growing branches of technology. So we are back to introduce another AI tool to you.

Arnold Schwarzenegger, Movie: The Expendables 2. I'm Back!

Stable Diffusion

In 2022, Stability AI, working with researchers from the CompVis group at LMU Munich and Runway, released Stable Diffusion, a deep learning model that turns text into images. Beyond text-to-image generation, it also supports image-to-image generation, in which an existing picture is modified according to a prompt. Stable Diffusion has the potential to reshape AI image generation, since it lets people create realistic-looking images from text without any prior knowledge of programming or image editing.

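Like Google's Imagen, which also relies on a frozen text encoder, Stable Diffusion uses a frozen CLIP ViT-L/14 text encoder to condition the model on text prompts. "Frozen" means the encoder's weights are not updated while the image model is trained.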

Stable Diffusion generates a picture through an iterative "diffusion" process. It starts from an initial, noisy state and refines the picture step by step until it closely matches the specified textual description.
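To make the "start from noise, refine step by step" idea concrete, here is a toy sketch in plain Python. It is not Stable Diffusion's code: the "image" is a short list of numbers, the noise schedule `alpha_bar` is made up for illustration, and `oracle_noise` is a hypothetical stand-in for the trained neural network (it cheats by looking at the target). The update inside the loop is a deterministic, DDIM-style step.

```python
import math
import random

def alpha_bar(t, num_steps):
    """Share of signal kept at step t: 1.0 at t=0 (clean), 0.01 at t=num_steps (mostly noise)."""
    return 1.0 - 0.99 * t / num_steps

def oracle_noise(x_t, target, t, num_steps):
    """Hypothetical stand-in for the trained denoiser: the noise separating x_t from the target."""
    a = alpha_bar(t, num_steps)
    return [(x - math.sqrt(a) * y) / math.sqrt(1.0 - a) for x, y in zip(x_t, target)]

def denoise(target, num_steps=50, seed=0):
    """Run the reverse process from pure noise down to a clean sample."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # initial, noisy state
    history = [x]
    for t in range(num_steps, 0, -1):
        a_t = alpha_bar(t, num_steps)
        a_prev = alpha_bar(t - 1, num_steps)
        eps = oracle_noise(x, target, t, num_steps)
        # Estimate the clean picture from the current noisy one...
        x0_est = [(xi - math.sqrt(1.0 - a_t) * e) / math.sqrt(a_t) for xi, e in zip(x, eps)]
        # ...then move to the slightly less noisy level t-1.
        x = [math.sqrt(a_prev) * x0 + math.sqrt(1.0 - a_prev) * e for x0, e in zip(x0_est, eps)]
        history.append(x)
    return x, history

if __name__ == "__main__":
    final, _ = denoise([0.2, -0.5, 0.9, 0.0])
    print(final)
```

In the real model the noise estimate comes from a neural network guided by the text prompt, so the result is a new picture rather than a copy of a known target.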

Stable Diffusion image-to-image generation: make him younger with long hair

System Requirements

Windows 10/11 OS
Nvidia RTX GPU with at least 12 GB of VRAM
25 GB of local disk space

Note: A GPU with more memory will be able to generate larger images without requiring upscaling. The model can still run even on 8 GB of VRAM, but you will be limited to 256x256 resolution.
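As a rough illustration of why resolution matters so much, the sketch below computes the size of the compressed "latent" image the model works on and the pixel count relative to 512x512. It assumes the factor-8 downsampling and 4 latent channels of the Stable Diffusion v1 models, which this post does not state, and the function names are hypothetical. The pixel ratio is only a proxy for workload, not a VRAM measurement.

```python
def latent_shape(width, height, downsample=8, channels=4):
    """Shape (channels, rows, cols) of the latent for a given output size.

    Assumes the v1-style autoencoder: 8x smaller per side, 4 channels.
    """
    if width % downsample or height % downsample:
        raise ValueError("width and height must be multiples of %d" % downsample)
    return (channels, height // downsample, width // downsample)

def relative_pixels(width, height, base=(512, 512)):
    """Pixel count relative to a base resolution (a rough proxy for workload)."""
    return (width * height) / (base[0] * base[1])

if __name__ == "__main__":
    for size in (256, 512, 1024):
        print(size, latent_shape(size, size), relative_pixels(size, size))
```

Doubling the side length quadruples the number of pixels, which is why larger images need noticeably more memory.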

You can use Stable Diffusion locally on your PC or through the Stability.ai website. Running it on your PC, however, gives you many more options for customizing the output image.

You can watch a tutorial on installing Stable Diffusion at the end of this post.

What are the differences between Stable Diffusion and Dall-E2?

There are many other AI tools that generate images from text, such as DALL-E 2 and Midjourney.

On January 5, 2021, OpenAI unveiled the original DALL-E, an AI system that can generate images based on written descriptions. To interpret natural language inputs and produce related visuals, it employs a 12-billion-parameter version of the GPT-3 transformer model. Its successor, DALL-E 2, followed in April 2022.

Each of these two tools has its own advantages and disadvantages, so we highlight a few of them here.

Free & Open Source

One significant distinction is that Stable Diffusion is free, whereas DALL-E 2 is not. Another benefit of Stable Diffusion is that it is open source.

Open source also means that the code behind Stable Diffusion is available for public scrutiny and review, which can help to ensure the accuracy and reliability of the platform.

Generating Power

Both programs are incredibly powerful, but Stable Diffusion tends to create imagery that is more artistic and beautiful, whilst DALL-E2 sometimes seems more straightforward.

The outcomes vary between landscapes, people, works of art, animals, and other subjects such as robots or futuristic vehicles, so a lot depends on the type of image you are creating. One of the best ways to improve your results is to pay attention to modifier words in the prompt, such as "highly-detailed," "smooth," or a word describing the texture you would like the image to have.
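If you experiment with many prompts, a small helper can keep the subject and the modifier words separate. The sketch below is only an illustration: `build_prompt` is a hypothetical name, and joining terms with commas is a common prompt-writing convention rather than a requirement of Stable Diffusion.

```python
def build_prompt(subject, modifiers=()):
    """Join a subject and modifier words into one comma-separated prompt.

    Blank entries and case-insensitive duplicates are dropped.
    """
    parts = [subject.strip()]
    seen = {parts[0].lower()}
    for modifier in modifiers:
        cleaned = modifier.strip()
        if cleaned and cleaned.lower() not in seen:
            parts.append(cleaned)
            seen.add(cleaned.lower())
    return ", ".join(parts)

if __name__ == "__main__":
    print(build_prompt("landscape, sunset", ["highly-detailed", "smooth"]))
```

Keeping modifiers in a list makes it easy to try the same subject with different styles and compare the results.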

Left: Stable Diffusion; right: DALL-E 2. Prompt: landscape, sunset

Resolution

Stable Diffusion is the more flexible of the two when it comes to output size, although we believe that each of these AI technologies presents an intriguing opportunity to experiment with image design and development. DALL-E 2 produces images at a maximum resolution of 1024 × 1024, whereas Stable Diffusion, when run locally, can generate images in different sizes and aspect ratios, limited mainly by your GPU memory. That makes it a good choice if you need graphics tailored for marketing, gaming, or other fields.

Facial Images

Because DALL-E 2 appears to cover a wider range of subjects, it may be better equipped to produce images of real people (such as celebrities or historical figures).

Left: Stable Diffusion; right: DALL-E 2. Prompt: Roger Federer playing tennis

What Are The Real Uses?

Apart from the joy of creating images through a careful selection of words, here is the main question of this blog: "What are the real uses of Stable Diffusion AI?"

Video Games

Creating portraits for an Age of Empires 3 Definitive Edition game mod
Tileable textures
Tile finalization

Product & Architecture Design

One of the intriguing features of this AI is its sketch-to-image and image-to-image generation, which turns a rough drawing or an existing picture into a finished image. Architects and product designers would benefit greatly from this.
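In image-to-image tools, a "strength" setting usually controls how far the result may drift from the sketch or photo you start with. The snippet below sketches one common convention, in which strength decides how many of the denoising steps are actually run. The function name is hypothetical and the formula is an assumption about typical implementations, not something taken from this post.

```python
def img2img_steps(strength, num_steps=50):
    """Number of denoising steps run for a given strength in [0, 1].

    Assumed convention: strength 0 keeps the input untouched,
    strength 1 ignores it and runs the full schedule.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    return min(int(num_steps * strength), num_steps)

if __name__ == "__main__":
    for s in (0.25, 0.5, 0.75, 1.0):
        print(s, img2img_steps(s))
```

For design work, lower values tend to preserve the layout of a sketch, while higher values give the model more freedom to reinterpret it.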

Marketing

Because you keep full rights to the images you generate with Stable Diffusion, you can confidently use them in your advertising campaigns. With the right system, model, and prompts in place, this can save significant project time.

Security

For use on social media and other image-based platforms, recognizable individuals or locations can be removed from photos or replaced, making the images anonymous. This process is known as anonymization, and its purpose is to protect the privacy of those individuals or places while still allowing people to view or interact with the image.
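One way such anonymization is often done is inpainting: a mask marks the region to regenerate (for example a face), and the model fills it with new content. The post does not describe a specific workflow, so the sketch below only shows the mask-building step, with a hypothetical `box_mask` helper and a bounding box given as (left, top, right, bottom) in pixels.

```python
def box_mask(width, height, box):
    """Binary mask as a list of rows: 1 inside the box (to regenerate), 0 elsewhere.

    The box is (left, top, right, bottom); right and bottom are exclusive,
    and the box is clipped to the image bounds.
    """
    left, top, right, bottom = box
    left, right = max(0, left), min(width, right)
    top, bottom = max(0, top), min(height, bottom)
    return [
        [1 if left <= x < right and top <= y < bottom else 0 for x in range(width)]
        for y in range(height)
    ]

if __name__ == "__main__":
    for row in box_mask(6, 4, (1, 1, 4, 3)):
        print(row)
```

The mask would then be passed, together with the photo and a prompt, to an inpainting-capable model.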

Sciences

Diffusion models can be used to create synthetic ("fake") MRI datasets. In this approach, a latent diffusion model is used to generate brain images.

In the articles below, you can read more about the use of diffusion models in the fields of science and neurology.

Brain Imaging Generation with Latent Diffusion Models

Deep neural networks have brought remarkable breakthroughs in medical image analysis. However, due to their data-hungry nature, the modest dataset sizes in medical imaging projects might be hindering their full potential. Generating synthetic data provides a promising alternative, allowing to comple…

arXiv.org, Walter H. L. Pinaya

LDM 100k Dataset

AI-generated high-resolution Brain MRI imaging data comprising of 100k subjects, with associated information such as age, sex, and brain size normalised by head size (surrogate of atrophy). The data was generated using a 3D Latent Diffusion Model. The model was trained on the Cambridge-1 Super Compu…

Academic Torrents

How to install Stable Diffusion

Here is the video for installing and using Stable Diffusion:

الترجمه

(Stable Diffusion)الانتشار المستقر: ما هي الاستخدامات الحقيقية؟

ما هو الانتشار المستقر؟ دعونا نرى ما هي الاستخدامات الحقيقية؟

صدق أو لا تصدق ، يعد الذكاء الاصطناعي أحد الفروع الرئيسية المتنامية للتكنولوجيا. لذا عدنا لتقديم برنامج ذكاء اصطناعي آخر لك.

(Stable Diffusion)الانتشار المستقر

في عام 2022 ، أنشأ Stability.ai نموذجًا للتعلم العميق يسمى Stable Diffusion يمكنه تحويل الكلمات إلى صور. بالإضافة إلى تحويل النص إلى صورة ، يمكن لهذا الذكاء الاصطناعي أيضًا تحويل الصورة إلى صورة. يمتلك الانتشار المستقر القدرة على إحداث ثورة في عالم الذكاء الاصطناعي ، مما يمكّن الناس من إنشاء صور واقعية المظهر من النص دون أي معرفة مسبقة بالبرمجة أو تحرير الصور.

كما هو الحال مع Imagen من Google ، يستخدم Stable Diffusion أداة تشفير نصية ثابتة CLIP ViT-L / 14 لتكييف النموذج في مطالبات النص.

يقسم Stable Diffusion عملية إنشاء الصورة في وقت التشغيل إلى عملية "انتشار". تأخذ هذه الطريقة صورة من حالتها الأولية الصاخبة وتنقحها حتى تتطابق بشكل وثيق مع الوصف النصي المحدد.

متطلبات النظام

نظام التشغيل Windows 10/11
Nvidia GPU RTX بسعة 12 جيجابايت على الأقل من VRAM
25 جيجابايت من مساحة القرص المحلي

ملاحظة: ستتمكن وحدة GPU التي تحتوي على ذاكرة أكبر من إنشاء صور أكبر دون الحاجة إلى ترقية. لا يزال بإمكان الطراز العمل حتى على 8 جيجابايت من VRAM ، لكنك ستقتصر على دقة 256 × 256.

يمكنك استخدام Stable Diffusion على جهاز الكمبيوتر الخاص بك وكذلك موقع Stability.ai. ومع ذلك ، هناك العديد من الخيارات على جهاز الكمبيوتر لتخصيص صورة الإخراج.

يمكنك مشاهدة برنامج تعليمي لتثبيت Stable Diffusion في نهاية هذا المنشور.

ما هي الاختلافات بين Stable Diffusion و Dall-E2؟

هناك العديد من أنظمة الذكاء الاصطناعي الأخرى التي تنشئ صورًا من النص مثل Dall-E2 أو Midjourney.

في 5 يناير 2021 ، كشفت OpenAI النقاب عن Dall-E2 ، وهو نظام ذكاء اصطناعي يمكنه إنشاء صور بناءً على الأوصاف المكتوبة. لفك تشفير مدخلات اللغة البشرية وإنشاء المرئيات ذات الصلة ، يستخدم Dall-E2 نسخة تدريب 12 مليار متغير من نموذج محول GPT-3.

كل من هذين الذكاء الاصطناعي له مميزاته وعيوبه. لذلك ، نشیر الی بعضها.

مجاني ومفتوح المصدر(open source)

الانتشار المستقر مجاني ، بينما Dall-E2 ليس كذلك ، وهو تمييز مهم. المصدر المفتوح هو فائدة أخرى للانتشار المستقر.

يعني المصدر المفتوح أيضًا أن الكود وراء Stable Diffusion متاح للتدقيق والمراجعة العامة ، مما يساعد على ضمان دقة وموثوقية النظام الأساسي.

القدرة على إنشاء الصور

كلا البرنامجين قويان بشكل لا يصدق ، لكن Stable Diffusion يميل إلى إنشاء صور أكثر فنية وجميلة ، بينما يبدو DALL-E2 أحيانًا أكثر وضوحًا.

تختلف النتائج بين المناظر الطبيعية والأشخاص والأعمال الفنية والحيوانات وغيرها من الرسائل النصية مثل الروبوتات أو المركبات المستقبلية ، لذلك يعتمد الكثير على نوع الرسم الذي تقوم بإنشائه. يُعد الانتباه إلى كلمات التعليمات ، مثل «highly-detailed»، أو «smooth»، أو زيادة نسبة أبعاد الصور، من أفضل الطرق لتحسين الرسم.

دقة الصورة

على الرغم من انتصار Stable Diffusion عندما يتعلق الأمر بالصور عالية الدقة ، إلا أننا نعتقد أن كل تقنية من تقنيات الذكاء الاصطناعي هذه تقدم فرصة رائعة لتجربة تصميم الصور وتطويرها. مقارنة بـ DALL-E2 ، الذي تبلغ حده الأقصى للدقة 1024 × 1024 ، يمكن لـ Stable Diffusion إنشاء صور بنسب أبعاد مختلفة.

صور الوجه

نظرًا لأن Dall-E2 لديه نطاق أوسع ويمكّنك من إنشاء صور لأفراد حقيقيين ، فقد يكون مجهزًا بشكل أفضل لإنتاج صور لأشخاص حقيقيين (مثل المشاهير أو الشخصيات التاريخية).

ما هي الاستخدامات الحقيقية؟

بصرف النظر عن متعة إنشاء بعض الصور مع الاختيار الدقيق للكلمات ، فإليك السؤال الرئيسي لهذه المدونة "ما هي الاستخدامات الحقيقية لـ Stable Diffusion AI"؟

ألعاب الفيديو

إنشاء صور للعبة Age of Empires 3 Definitive edition game mod
تصميم الصور بأسلوب التبليط أو التایل

تصميم المنتج و المعماري

إحدى الميزات في هذا الذكاء الاصطناعي هي نماذج الرسم الاولیة إلى صورة ومن صورة إلى صورة. سيستفيد المهندسون المعماريون ومصممو المنتجات بشكل كبير من هذا.

تسويق

في Stable Diffusion ، نظرًا لأن لديك الحق الكامل في الصورة التي تم إنشاؤها ، يمكنك بثقة استخدامها في حملاتك الإعلانية. مع وجود النظام الصحيح والذكاء الاصطناعي والمطالبات في مكانها الصحيح ، قد يؤدي ذلك إلى توفير كبير في وقت المشروع.

حماية

لاستخدامها على وسائل التواصل الاجتماعي وغيرها من الأنظمة الأساسية القائمة على الصور ، يمكن إزالة الصور التي تحتوي على أفراد أو مواقع يمكن التعرف عليهم منها وجعلها مجهولة المصدر. تُعرف هذه العملية بإخفاء الهوية ، والغرض منها هو حماية خصوصية هؤلاء الأفراد أو الأماكن مع السماح للأشخاص بمشاهدة الصورة أو التفاعل معها.

علوم

يمكن استخدام نموذج الانتشار لمجموعة بيانات التصوير بالرنين المغناطيسي المزيفة(Fake MRI Dataset). إنها عملية يتم فيها استخدام نمذجة الانتشار الكامن لتوليد تصوير الدماغ.

fake MRI dataset

في المقالات أدناه ، يمكنك قراءة المزيد حول استخدام نماذج الانتشار في مجال العلوم وعلم الأعصاب.

مقالة الاولی

مقالة الثانیة