Image generation
Text-to-image and image-to-image generation using Stable Diffusion.
Overview
Image generation uses qvac-ext-stable-diffusion.cpp as the inference engine. Load a supported model using modelType: "diffusion". Then, provide a text prompt describing the image to generate.
For image-to-image, also pass init_image (a Uint8Array of PNG bytes) — the model transforms the input guided by the prompt instead of starting from noise.
diffusion() returns one or more PNG images as Uint8Array buffers. Use progressStream to track generation progress step-by-step.
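As a quick sketch of that call shape (assuming a model has already been loaded and its modelId is in scope; the prompt is only illustrative):
import { diffusion } from "@qvac/sdk";
// progressStream yields one event per denoising step.
const { progressStream, outputs } = diffusion({ modelId, prompt: "a lighthouse at dawn" });
for await (const { step, totalSteps } of progressStream) {
  console.log(`step ${step}/${totalSteps}`);
}
// outputs resolves to an array of PNG-encoded Uint8Array buffers.
const images = await outputs;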
Functions
Use the following sequence of function calls:
- loadModel(): load a diffusion model and get back a modelId.
- diffusion(): generate one or more images from a prompt.
- unloadModel(): release the model when you are done.
For how to use each function, see SDK — API reference.
Models
Supported model families and their file layouts:
- SD1.x, SD2.x: single all-in-one *.gguf file. No companion files needed.
- SDXL, SD3: may require separate CLIP/T5 text encoder files (clipLModelSrc, clipGModelSrc, t5XxlModelSrc) in modelConfig, depending on the model variant.
- FLUX.2-klein: split layout. Diffusion model *.gguf + LLM text encoder *.gguf (via llmModelSrc) + VAE *.safetensors (via vaeModelSrc).
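For the SDXL/SD3 layout, a loadModel() call might look like the following sketch; the file paths are placeholders, and which encoder options a given variant actually needs depends on that variant:
import { loadModel } from "@qvac/sdk";
// Placeholder paths: substitute the text encoder files your SDXL/SD3 variant ships with.
const modelId = await loadModel({
  modelSrc: "./sd3_medium.gguf",
  modelType: "diffusion",
  modelConfig: {
    clipLModelSrc: "./clip_l.gguf",
    clipGModelSrc: "./clip_g.gguf",
    t5XxlModelSrc: "./t5xxl.gguf",
  },
});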
For models available as constants, see SDK — Models.
Examples
Stable Diffusion
The following script shows a minimal text-to-image generation example using a single all-in-one SD 2.1 model:
import { loadModel, unloadModel, diffusion, SD_V2_1_1B_Q8_0 } from "@qvac/sdk";
import fs from "fs";
// Minimal diffusion example — single GGUF model, no companion files needed.
// Works with SD 1.x / 2.x all-in-one models.
const modelSrc = process.argv[2] || SD_V2_1_1B_Q8_0;
const prompt = process.argv[3] || "a photo of a cat sitting on a windowsill";
const modelId = await loadModel({
modelSrc,
modelType: "diffusion",
modelConfig: { prediction: "v" },
});
const { outputs } = diffusion({ modelId, prompt });
const buffers = await outputs;
fs.writeFileSync("output.png", buffers[0]);
console.log("Saved: output.png");
await unloadModel({ modelId, clearStorage: false });
process.exit(0);
FLUX.2-klein
The following script shows text-to-image generation using FLUX.2-klein with its split-layout model (separate diffusion model, LLM text encoder, and VAE):
import { loadModel, unloadModel, diffusion, FLUX_2_KLEIN_4B_Q4_0, FLUX_2_KLEIN_4B_VAE, QWEN3_4B_Q4_K_M } from "@qvac/sdk";
import fs from "fs";
import path from "path";
// FLUX.2 [klein] uses a split-layout: separate diffusion model + LLM text encoder + VAE
const diffusionModelSrc = process.argv[2] || FLUX_2_KLEIN_4B_Q4_0;
const llmModelSrc = process.argv[3] || QWEN3_4B_Q4_K_M;
const vaeModelSrc = process.argv[4] || FLUX_2_KLEIN_4B_VAE;
const prompt = process.argv[5] || "a futuristic city at sunset, photorealistic";
const outputDir = process.argv[6] || ".";
console.log("Loading FLUX.2 [klein] split-layout model...");
const modelId = await loadModel({
modelSrc: diffusionModelSrc,
modelType: "diffusion",
modelConfig: {
device: "gpu",
threads: 4,
llmModelSrc,
vaeModelSrc,
},
onProgress: (p) => console.log(`Loading: ${p.percentage.toFixed(1)}%`),
});
console.log(`Model loaded: ${modelId}`);
console.log(`\nGenerating: "${prompt}"`);
const { progressStream, outputs, stats } = diffusion({
modelId,
prompt,
width: 512,
height: 512,
steps: 20,
guidance: 3.5,
cfg_scale: 1,
seed: -1,
});
for await (const { step, totalSteps } of progressStream) {
process.stdout.write(`\rStep ${step}/${totalSteps}`);
}
console.log();
const buffers = await outputs;
for (let i = 0; i < buffers.length; i++) {
const outputPath = path.join(outputDir, `flux2_${i}.png`);
fs.writeFileSync(outputPath, buffers[i]);
console.log(`Saved: ${outputPath}`);
}
console.log("\nStats:", await stats);
await unloadModel({ modelId, clearStorage: false });
console.log("Done.");
process.exit(0);
Image-to-image
Pass init_image to transform an existing image guided by a text prompt. Behavior depends on the model family:
- SD / SDXL / SD3: SDEdit-style. Use strength to control how much the source is preserved (0 = keep source, 1 = ignore source).
- FLUX.2: in-context conditioning. Requires prediction: "flux2_flow" in modelConfig at loadModel() time; strength is ignored on this path (see the sketch below).
The following script loads an SD 2.1 model and transforms an input image using strength: 0.5:
import { loadModel, unloadModel, diffusion, SD_V2_1_1B_Q8_0 } from "@qvac/sdk";
import fs from "fs";
import path from "path";
// img2img example — transforms an input image guided by a text prompt.
const inputPath = process.argv[2];
const prompt = process.argv[3] || "oil painting style, vibrant colors";
const outputDir = process.argv[4] || ".";
const modelSrc = process.argv[5] || SD_V2_1_1B_Q8_0;
if (!inputPath) {
console.error("❌ Error: input image path is required");
console.error("Usage: bun run bare:example dist/examples/diffusion-img2img.js <inputImage> [prompt] [outputDir] [modelSrc]");
process.exit(1);
}
try {
console.log("Loading diffusion model...");
const modelId = await loadModel({
modelSrc,
modelType: "diffusion",
});
console.log(`Model loaded: ${modelId}`);
const init_image = new Uint8Array(fs.readFileSync(inputPath));
console.log(`\nTransforming "${inputPath}" with prompt: "${prompt}"`);
const { progressStream, outputs, stats } = diffusion({
modelId,
prompt,
init_image,
strength: 0.5,
steps: 30,
seed: -1,
});
for await (const { step, totalSteps } of progressStream) {
process.stdout.write(`\rStep ${step}/${totalSteps}`);
}
console.log();
const buffers = await outputs;
for (let i = 0; i < buffers.length; i++) {
const outputPath = path.join(outputDir, `img2img_${i}.png`);
fs.writeFileSync(outputPath, buffers[i]);
console.log(`Saved: ${outputPath}`);
}
console.log("\nStats:", await stats);
await unloadModel({ modelId, clearStorage: false });
console.log("Done.");
process.exit(0);
}
catch (error) {
console.error("❌ Error:", error);
process.exit(1);
}
Tip: all examples throughout this documentation are self-contained and runnable. For instructions on how to run them, see SDK quickstart.