
One of the biggest frustrations of text-to-image generation AI models is that they feel like a black box. We know they were trained on images pulled from the web, but which ones? As an artist or photographer, an obvious question is whether your work was used to train the AI model, but this is surprisingly hard to answer.

Sometimes, the data isn't available at all: OpenAI has said it's trained DALL-E 2 on hundreds of millions of captioned images, but hasn't released the proprietary data.

By contrast, the team behind Stable Diffusion have been very transparent about how their model is trained.
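That transparency makes the question at least partially answerable: Stable Diffusion was trained on subsets of the public LAION dataset, whose image URLs and captions are distributed as metadata files. As a rough sketch (the file name, column names, and domain below are placeholders; you'd need to download an actual LAION metadata shard first), checking for your own work can be as simple as filtering those URLs:

```python
# A minimal sketch: scan one LAION metadata shard for images hosted
# on your own domain. Assumes pandas + pyarrow are installed and a
# parquet shard has been downloaded; names below are illustrative.
import pandas as pd

shard = pd.read_parquet("laion2B-en-metadata-part-00000.parquet")

# Each row pairs a source image URL with the caption used in training.
mine = shard[shard["URL"].str.contains("myportfolio.example", na=False)]
print(mine[["URL", "TEXT"]].to_string(index=False))
```

No such option exists for DALL-E 2, since its training data was never released.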
