5.3.1. Creating Visual Outputs with Generative Models
💡 First Principle: To generate an image, you deploy an image-generation model and send a descriptive text prompt; the model returns a new image. The skill is prompt specificity — naming the subject, style, mood, and composition — because the model creates from your description, not from a stored library of pictures.
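To make prompt specificity concrete, here is a minimal sketch that assembles a prompt from the four elements named above. The `ImagePrompt` class and its `render` method are hypothetical helpers written for this illustration; they are not part of Foundry or any SDK.

```python
from dataclasses import dataclass


@dataclass
class ImagePrompt:
    """Hypothetical helper: holds the four elements of a specific prompt."""
    subject: str
    style: str = ""
    mood: str = ""
    composition: str = ""

    def render(self) -> str:
        # Join only the elements that were actually supplied.
        parts = [self.subject, self.style, self.mood, self.composition]
        return ", ".join(p.strip() for p in parts if p.strip())


# A vague prompt leaves style, mood, and composition for the model to guess.
vague = ImagePrompt(subject="a nice picture")

# A specific prompt pins down each element.
specific = ImagePrompt(
    subject="a coffee cup",
    style="minimalist line drawing",
    mood="calm",
    composition="centered on a white background",
)

print(vague.render())
print(specific.render())
```

The vague prompt renders as a single short phrase, while the specific one carries four constraints the model can act on. Every element left blank is a decision handed to the model.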
In Foundry, this follows the familiar three-step shape: deploy an image-generation model, send a prompt describing the desired image, and receive the generated result (typically as image data or a URL). Use cases include marketing visuals, concept art, and placeholder imagery. The same responsible-AI considerations apply: generated images can reflect bias in the training data, and you should be transparent about which content is AI-generated.
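The prompt-and-receive half of that flow can be sketched without calling any service. The example below assumes a request body with `prompt`, `size`, and `n` fields and a response whose `data` list holds either a `b64_json` or a `url` entry per image. That shape is common among image-generation APIs, but it is an assumption here: check the documentation for the model you deploy. The function names and the sample response are invented for illustration.

```python
import base64


def build_request(prompt: str, size: str = "1024x1024", n: int = 1) -> dict:
    """Build a request body. Field names are assumed, not confirmed for Foundry."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return {"prompt": prompt, "size": size, "n": n}


def extract_images(response: dict) -> list:
    """Return (kind, value) pairs: decoded bytes or a URL for each image."""
    results = []
    for item in response.get("data", []):
        if "b64_json" in item:
            # Image data returned inline as base64 text.
            results.append(("bytes", base64.b64decode(item["b64_json"])))
        elif "url" in item:
            # Image hosted by the service and returned as a link.
            results.append(("url", item["url"]))
    return results


# Hand-written stand-in for a service response; not real model output.
sample_response = {
    "data": [
        {"b64_json": base64.b64encode(b"fake-image-bytes").decode("ascii")},
        {"url": "https://example.invalid/generated.png"},
    ]
}

request_body = build_request("a minimalist line drawing of a coffee cup")
images = extract_images(sample_response)
print(request_body)
print([kind for kind, _ in images])
```

Handling both result kinds in one place keeps the calling code the same whether the service returns inline image data or a link.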
⚠️ Exam Trap: Image generation creates new images from text; it does not analyze or edit an image you provide. Analyzing an existing image is multimodal understanding, and editing one is a separate capability. A scenario asking to "create a logo concept from this description" is generation; "describe this existing logo" is understanding.
Reflection Question: Why does a vague prompt like "make a nice picture" produce unpredictable results, while "a minimalist line drawing of a coffee cup on a white background" works better? Connect this to how generative models create content.