Extracting Zappa from GPT — and now it runs independently
In recent months, we at Studio PB have been experimenting extensively with AI. Initially, much like many others: in a chat. You build your own GPT, give it instructions, teach it how to respond, and gradually it improves. That actually worked surprisingly well. But there came a point where we wanted something different. I (Peet) wanted to use that same intelligence not only in a chat, but also to earn something from it; after all the hours of exploration invested, the credits have to be earned back somewhere... but yes, it all lives only inside your own GPT...
And that's where I discovered something interesting. Training a GPT in a chat is one thing. Achieving the same quality via an API in your own software is a completely different story. Wow, how many times did I fall flat on my face thinking I could get the same intelligence through an API (an external connection to your GPT)... well, not really...
Why a GPT in chat often seems smarter
When you use a GPT in a chat, a lot happens behind the scenes.
It remembers context. It understands previous conversations. It corrects itself during the conversation. And sometimes it automatically fills in missing information.
(Yes, we get it, Zappa, but Peet is watching along!)
As a result, it feels as if the AI intuitively understands what you (I mean you, Peet?) mean.
But once you use the same AI via an API, much of that disappears.
Then it simply goes like this:
you send a question → the AI gives one answer → done
No conversation. No context. No adjustments. (So says the AI itself, but I, Peet, know better... really, sorry...)
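To make that concrete: this is roughly what a single, stateless API call looks like. A minimal sketch, assuming the OpenAI Python SDK; the model name and prompts are placeholders, not our actual setup.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# One question in, one answer out. The API remembers nothing:
# every request starts from scratch unless you resend the history yourself.
response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[
        {"role": "system", "content": "You are Zappa, a visual analysis assistant."},
        {"role": "user", "content": "Describe this scene for me."},
    ],
)
print(response.choices[0].message.content)
```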
And that's exactly where we encountered problems. ('We' being the AI and me, Peet. Sorry, I interrupted this blog.)
Our first version worked… but didn't feel like Zappa/Peet
Within Studio PB, Zappa has meanwhile acquired a special meaning.
It's not just the name of a tool or an AI engine.
For us, Zappa has actually become a sort of shorthand for a style.
When we say internally:
“this doesn't feel like Zappa”
we usually mean that something doesn't yet have that combination of:
strong composition
clear photography
character in the image
and a certain coolness
Zappa has thus slowly become the face of how we view images.
It represents a new phase of what photography can be:
not just a camera and a photographer, but a combination of vision, design, AI, and image direction.
You could say that Zappa within PB has become a sort of 2.0 package of everything related to photography and image creation.
From composition and lighting to models, styling, and generative image technology.
So when our first prompt engine worked, but still didn't feel quite right, the conclusion was actually very simple:
It worked technically.
But it didn't feel like Zappa.
The breakthrough: we asked the question differently
The real change ultimately didn't come from a new technique.
But from a different way of asking questions.
Instead of telling the AI:
“Describe this image”
we went to:
“Translate only what is visually demonstrable.”
That seems like a small difference, but it changed everything.
Zappa stopped interpreting and started observing.
So no longer:
“A stylish woman in a beautiful outfit.”
But for example:
camera position
body posture
contact points
clothing texture
light
environment
We actually started treating the image as if a photographer were describing it.
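To give an idea of how small that change is on paper: a hedged sketch of the two instructions side by side (the wording here is illustrative, not our production prompt):

```python
# Before: an open invitation to interpret.
PROMPT_V1 = "Describe this image."

# After: an instruction to observe, not to interpret.
PROMPT_V2 = (
    "Translate only what is visually demonstrable. "
    "Report camera position, body posture, contact points, "
    "clothing texture, light, and environment. "
    "Do not judge, do not embellish, do not guess intent."
)
```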
Structure proved to be the secret
The second step was structure.
Instead of one long description, each prompt received a fixed structure:
camera
pose
contact points
clothing
environment
light
That might seem small, but generative AI works much better with structured information.
Zappa literally understands how the scene is constructed.
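As an illustration, such a fixed structure can be as simple as a template that every analysis has to fill in. A sketch with made-up example values, not our production template:

```python
# Illustrative template: every analysis fills the same six slots,
# so the generative model always receives the scene in the same order.
ZAPPA_TEMPLATE = """\
camera: {camera}
pose: {pose}
contact points: {contact_points}
clothing: {clothing}
environment: {environment}
light: {light}
"""

prompt = ZAPPA_TEMPLATE.format(
    camera="low angle, 35mm, slightly below eye level",
    pose="seated, leaning forward, elbows on knees",
    contact_points="hands clasped, feet flat on concrete",
    clothing="matte black wool coat, coarse texture",
    environment="empty parking garage, wet floor",
    light="single hard side light, deep shadows",
)
```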
What we ultimately built
What started as a prompt tool now feels like something else.
The Zappa Engine is actually a sort of visual translation machine.
The system does this:
image
↓
visual analysis
↓
photographic description
↓
generative prompt
Nice explanation, right? Bam, AI gets this (I thought).
Zappa doesn't try to be creative. (But HELLO!?! You've been training for weeks!!? #pancake)
It tries to see as precisely as possible what is actually in an image.
And that turns out to be exactly what AI needs to create good images. And onward...
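Stitched together, that pipeline comes down to two stateless calls in a row: a vision call that produces the structured photographic description, and a text call that turns it into a generative prompt. Again a hedged sketch assuming the OpenAI Python SDK; model names and instructions are placeholders:

```python
from openai import OpenAI

client = OpenAI()

# Illustrative observe-only instruction (see the sketch earlier in this post).
ANALYSE = (
    "Translate only what is visually demonstrable: "
    "camera, pose, contact points, clothing, environment, light."
)

def zappa_engine(image_url: str) -> str:
    # Step 1: image -> visual analysis -> photographic description.
    analysis = client.chat.completions.create(
        model="gpt-4o",  # placeholder vision-capable model
        messages=[
            {"role": "system", "content": ANALYSE},
            {"role": "user", "content": [
                {"type": "text", "text": "Analyse this image."},
                {"type": "image_url", "image_url": {"url": image_url}},
            ]},
        ],
    ).choices[0].message.content

    # Step 2: photographic description -> generative prompt.
    return client.chat.completions.create(
        model="gpt-4o",  # placeholder
        messages=[
            {
                "role": "system",
                "content": "Rewrite this photographic description as a single "
                           "generative image prompt. Keep only what is stated; "
                           "add nothing.",
            },
            {"role": "user", "content": analysis},
        ],
    ).choices[0].message.content
```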
And then something unexpected happened
Perhaps the most beautiful moment of this whole process was when everything suddenly came together.
Months of testing.
(But hello, really... just... it seems so simple, mate!)
Of trying.
Of rewriting prompts.
And then suddenly:
the API worked
the prompts were correct
the descriptions were better than in the chat
and Zappa ran in our own engine
The funny thing is that the last step — actually building the tool — ultimately only took an afternoon.
Sometimes technology feels a bit like magic.
But usually, it's just:
keep trying
ask the right question
and keep improving.
What we learned from this
The most important lesson from this entire project is actually surprisingly simple.
AI doesn't get better by pushing harder.
AI gets better by asking smarter questions.
And sometimes that means you have to take a step back and rethink what you're actually trying to achieve.
In our case, it turned out not to be:
“let AI describe an image”.
But:
“let Zappa see as a photographer sees.”
And then it suddenly started working.
But hey, I still have a lot to do to develop this further, so I'm not there yet. (It's an AI-written blog, but overseen by Peet.)
#ONWARD
PS: Click the link below and ask yourself: why a password? REQUEST ACCESS and I'll give you a password to look along. Regards, Peet