Thanks to Apple engineers, you can now run Stable Diffusion on Apple Silicon using Core ML!

This Apple repository provides conversion scripts and inference code based on 🧨 Diffusers, and we love it! To make things as easy as possible for you, we converted the weights ourselves and placed the Core ML versions of the models on the Hugging Face Hub.

This post will guide you through using the converted weights.

Available Checkpoints

The checkpoints that have already been converted and are ready to use are those for the following models:

  • Stable Diffusion v1.4
  • Stable Diffusion v1.5
  • Stable Diffusion v2 base

Core ML supports all the compute units available in your device: CPU, GPU, and Apple's Neural Engine (NE). It is also possible for Core ML to run different portions of the model on different devices to maximize performance.

There are several variants of each model, and they may yield different performance depending on the hardware you use. We recommend you try them out and stick with the one that works best for your system. Read on for details.

Notes on Performance

There are several variants per model:

  • "Original" attention vs. "split_einsum". These are two alternative implementations of the critical attention blocks. split_einsum, previously introduced by Apple, is compatible with all compute units (CPU, GPU, and Apple's Neural Engine). original, on the other hand, is only compatible with CPU and GPU. Nevertheless, original can be faster than split_einsum on some devices, so do check it out!
  • "ML Packages" vs. "Compiled" models. The former is suitable for Python inference, while the compiled versions are required for Swift code. The compiled models in the Hub split the large UNet model weights into several files for compatibility with iOS and iPadOS devices. This corresponds to the --chunk-unet conversion option.

At the time of this writing, we got the best results on a MacBook Pro (M1 Max, 32 GPU cores, 64 GB) using the following combination:

  • original attention.
  • all compute units (see next section for details).
  • macOS Ventura 13.1 Beta 4 (22C5059b).

With these, it took 18s to generate one image with the Core ML version of Stable Diffusion v1.4 🤯.

⚠️ Note

Several improvements to Core ML have been introduced in the beta version of macOS Ventura 13.1, and they are required by Apple’s implementation. You may get black images –and much slower times– if you use the current release version of macOS Ventura (13.0.1). If you can’t or won’t install the beta, please wait until macOS Ventura 13.1 is officially released.

Each model repo is organized in a tree structure that provides these different variants:

```
coreml-stable-diffusion-v1-4
├── README.md
├── original
│   ├── compiled
│   └── packages
└── split_einsum
    ├── compiled
    └── packages
```

You can download and use the variant you need as shown below.

Core ML Inference in Python

Prerequisites

```
pip install huggingface_hub
pip install git+https://github.com/apple/ml-stable-diffusion
```

Download the Model Checkpoints

To run inference in Python, you have to use one of the versions stored in the packages folders, because the compiled ones are only compatible with Swift. You may choose whether you want to use the original or split_einsum attention styles.
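If you are not sure which variants a repo offers, you can list its files before downloading. This is a minimal sketch using list_repo_files from huggingface_hub (installed in the Prerequisites step; assumes a reasonably recent version of the library), pointed at the v1.4 repo used throughout this post:

```python
from huggingface_hub import list_repo_files

repo_id = "apple/coreml-stable-diffusion-v1-4"

# Collect the top-level variant folders, e.g. original/packages, split_einsum/compiled
files = list_repo_files(repo_id)
variants = sorted({"/".join(f.split("/")[:2]) for f in files if f.count("/") >= 1})
print(variants)
```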

This is how you’d download the original attention variant from the Hub:

```python
from huggingface_hub import snapshot_download
from huggingface_hub.file_download import repo_folder_name
from pathlib import Path
import shutil

repo_id = "apple/coreml-stable-diffusion-v1-4"
variant = "original/packages"

def download_model(repo_id, variant, output_dir):
    destination = Path(output_dir) / (repo_id.split("/")[-1] + "_" + variant.replace("/", "_"))
    if destination.exists():
        raise Exception(f"Model already exists at {destination}")
    # Download only the files belonging to the requested variant
    downloaded = snapshot_download(repo_id, allow_patterns=f"{variant}/*", cache_dir=output_dir)
    downloaded_bundle = Path(downloaded) / variant
    shutil.copytree(downloaded_bundle, destination)
    # Remove the intermediate cache folder created by snapshot_download
    cache_folder = Path(output_dir) / repo_folder_name(repo_id=repo_id, repo_type="model")
    shutil.rmtree(cache_folder)
    return destination

model_path = download_model(repo_id, variant, output_dir="./models")
print(f"Model downloaded at {model_path}")
```

The code above will place the downloaded model snapshot inside the directory you specify (models, in this case).
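As a quick sanity check, you can list the contents of the destination folder. The path below assumes the default output_dir="./models" used in the snippet above:

```python
from pathlib import Path

model_path = Path("models/coreml-stable-diffusion-v1-4_original_packages")
# The packages variant should contain one .mlpackage bundle per sub-model
print(sorted(p.name for p in model_path.iterdir()))
```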

Inference

Once you have downloaded a snapshot of the model, the easiest way to run inference would be to use Apple’s Python script.

```
python -m python_coreml_stable_diffusion.pipeline --prompt "a photo of an astronaut riding a horse on mars" -i models/coreml-stable-diffusion-v1-4_original_packages -o </path/to/output/image> --compute-unit ALL --seed 93
```

The path passed with -i should point to the checkpoint you downloaded in the step above, and --compute-unit indicates the hardware you want to allow for inference. It must be one of the following options: ALL, CPU_AND_GPU, CPU_ONLY, CPU_AND_NE. You may also provide an optional output path with -o, and a seed for reproducibility.

The inference script assumes the original version of the Stable Diffusion model, stored in the Hub as CompVis/stable-diffusion-v1-4. If you use another model, you have to specify its Hub id in the inference command-line, using the --model-version option. This works both for models already supported, and for custom models you trained or fine-tuned yourself.

For Stable Diffusion 1.5 (Hub id: runwayml/stable-diffusion-v1-5):

```
python -m python_coreml_stable_diffusion.pipeline --prompt "a photo of an astronaut riding a horse on mars" --compute-unit ALL -o output --seed 93 -i models/coreml-stable-diffusion-v1-5_original_packages --model-version runwayml/stable-diffusion-v1-5
```

For Stable Diffusion 2 base (Hub id: stabilityai/stable-diffusion-2-base):

```
python -m python_coreml_stable_diffusion.pipeline --prompt "a photo of an astronaut riding a horse on mars" --compute-unit ALL -o output --seed 93 -i models/coreml-stable-diffusion-2-base_original_packages --model-version stabilityai/stable-diffusion-2-base
```
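If you have not downloaded those checkpoints yet, you can reuse the download_model helper from the previous section; the resulting folder names match the -i paths in the commands above. The repo ids below are assumptions based on the same naming pattern as the v1.4 repo, so check the Hub to confirm they exist:

```python
# Reuses download_model() as defined in "Download the Model Checkpoints" above
for repo_id in [
    "apple/coreml-stable-diffusion-v1-5",    # assumed repo id for Stable Diffusion 1.5
    "apple/coreml-stable-diffusion-2-base",  # assumed repo id for Stable Diffusion 2 base
]:
    download_model(repo_id, variant="original/packages", output_dir="./models")
```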

Core ML Inference in Swift

Running inference in Swift is slightly faster than in Python, because the models are already compiled in the mlmodelc format. This will be noticeable on app startup when the model is loaded, but shouldn’t be noticeable if you run several generations afterwards.

Download

To run inference in Swift on your Mac, you need one of the compiled checkpoint versions. We recommend you download them locally using Python code similar to the one we showed above, but using one of the compiled variants:

```python
from huggingface_hub import snapshot_download
from huggingface_hub.file_download import repo_folder_name
from pathlib import Path
import shutil

repo_id = "apple/coreml-stable-diffusion-v1-4"
variant = "original/compiled"

def download_model(repo_id, variant, output_dir):
    destination = Path(output_dir) / (repo_id.split("/")[-1] + "_" + variant.replace("/", "_"))
    if destination.exists():
        raise Exception(f"Model already exists at {destination}")
    # Download only the files belonging to the requested variant
    downloaded = snapshot_download(repo_id, allow_patterns=f"{variant}/*", cache_dir=output_dir)
    downloaded_bundle = Path(downloaded) / variant
    shutil.copytree(downloaded_bundle, destination)
    # Remove the intermediate cache folder created by snapshot_download
    cache_folder = Path(output_dir) / repo_folder_name(repo_id=repo_id, repo_type="model")
    shutil.rmtree(cache_folder)
    return destination

model_path = download_model(repo_id, variant, output_dir="./models")
print(f"Model downloaded at {model_path}")
```

Inference

To run inference, please clone Apple’s repo:

```
git clone https://github.com/apple/ml-stable-diffusion
cd ml-stable-diffusion
```

And then use Apple’s command-line tool using Swift Package Manager’s facilities:

```
swift run StableDiffusionSample --resource-path models/coreml-stable-diffusion-v1-4_original_compiled --compute-units all "a photo of an astronaut riding a horse on mars"
```

You have to specify in --resource-path one of the checkpoints downloaded in the previous step, so please make sure it contains compiled Core ML bundles with the extension .mlmodelc. The --compute-units has to be one of these values: all, cpuOnly, cpuAndGPU, cpuAndNeuralEngine.
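One way to double-check from Python, assuming the ./models output directory used in the download snippet above (the bundle names in the comment are examples; the exact set may vary per checkpoint):

```python
from pathlib import Path

model_path = Path("models/coreml-stable-diffusion-v1-4_original_compiled")
# Expect .mlmodelc bundles for the sub-models, e.g. TextEncoder.mlmodelc, Unet.mlmodelc
print([p.name for p in model_path.iterdir() if p.suffix == ".mlmodelc"])
```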

For more details, please refer to the instructions in Apple’s repo.

Bring Your Own Model

If you have created your own models compatible with Stable Diffusion (for example, if you used Dreambooth, Textual Inversion or fine-tuning), then you have to convert the models yourself. Fortunately, Apple provides a conversion script that allows you to do so.

For this task, we recommend you follow these instructions.
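For reference, this is roughly what the conversion invocation looks like in Apple's repo at the time of writing; treat the exact flags as an assumption and check Apple's README for the authoritative list:

```
# Sketch of Apple's torch2coreml conversion script; placeholders are yours to fill in
python -m python_coreml_stable_diffusion.torch2coreml \
    --model-version <your-hub-model-id> \
    --convert-unet --convert-text-encoder --convert-vae-decoder --convert-safety-checker \
    -o <output-directory>
```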

Next Steps

We are very excited about the opportunities this brings, and we can't wait to see what the community will create from here. Some potential ideas are:

  • Native, high-quality apps for Mac, iPhone, and iPad.
  • Bring additional schedulers to Swift, for even faster inference.
  • Additional pipelines and tasks.
  • Explore quantization techniques and further optimizations.

We can't wait to see what you create!