Mapping the black box. Visual investigation of a diffusion model’s latent space
Authors
Oliva, O.
Publication date
2024
Type of student thesis
Master
Course of study
Master of Arts FHNW in Digital Communication Environments
Type
11 - Student thesis
Publisher / Publishing institution
Hochschule für Gestaltung und Kunst Basel FHNW
Place of publication / Event location
Basel
Abstract
Although translating text into images has long been part of human visual expression, generative AI models add a new way to perform this translation from textual prompts. This new possibility makes the models’ internal logic and decision-making processes a central concern.
The research explores the Midjourney v6.0-mediated translation from text to images through three types of experiments, with a particular focus on the correlation between generated images and specific prompt variations. The proposed methods prove to be a successful strategy to investigate the model’s latent space and decision-making processes, and the analysis of the generated image series reveals intriguing insights about the AI’s ‘black box’ structure and its internal latent representations.
Keywords
Artificial intelligence, hallucinations, image generation, prompting, model
Language
English
Created during FHNW affiliation
Yes
Citation
Oliva, O. (2024). Mapping the black box. Visual investigation of a diffusion model’s latent space [Hochschule für Gestaltung und Kunst Basel FHNW]. https://irf.fhnw.ch/handle/11654/50381