
My UI Started Building Itself: The Hard Lessons of Moving Beyond the AI Chatbox
Why streaming text is just the beginning and how I transitioned to a Generative UI architecture using Vercel’s tool-calling capabilities.
The chat window is the peak of AI interaction. At least, that’s what every SaaS landing page from the last year wants you to believe. But honestly? Most of the time, the chat interface is just a lazy placeholder for a better UI that we haven't built yet.
If I ask an AI to help me manage my budget, I don't want a paragraph describing my spending habits in poetic detail. I want a bar chart. I want a "Pay Bill" button that actually works. I want the interface to adapt to the context of my request, not just spit out strings of tokens that I have to parse with my human eyeballs.
I spent the last few months moving away from the "Text-In, Text-Out" paradigm and into the world of Generative UI using the Vercel AI SDK. It’s been a ride of high highs and "why is this component rendering three times?" lows.
The "Wall of Text" Problem
We’ve all seen it. You prompt an LLM, and it starts streaming. You watch the little cursor dance across the screen, generating Markdown tables that look okay on desktop but break your mobile layout completely.
The breakthrough for me was realizing that LLMs shouldn't just talk to users; they should orchestrate components.
Instead of the model saying, "Here is your weather report for London," and then listing temperatures, the model should trigger a WeatherCard component. This sounds simple, but the plumbing behind it—mapping a model's "intent" to a React component in real-time—is where things get spicy.
Shifting to Tool Calling
The secret sauce is "Tool Calling" (or function calling). Instead of just generating a string, the model decides it needs to call a specific function. With Vercel's ai SDK, we can intercept that decision and return a React component instead of just raw data.
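For context, here is roughly where I started: the standard text-only useChat hook. This is a sketch from memory, and the import path is the AI SDK 3.x one (newer versions pull useChat from '@ai-sdk/react').

// chat.tsx (Client Component) — the "before" picture
'use client';

import { useChat } from 'ai/react';

export default function TextOnlyChat() {
  const { messages, input, handleInputChange, handleSubmit } = useChat();

  return (
    <form onSubmit={handleSubmit}>
      {/* Everything the model produces is a string; my eyeballs do the parsing. */}
      {messages.map((m) => (
        <p key={m.id}>
          {m.role}: {m.content}
        </p>
      ))}
      <input value={input} onChange={handleInputChange} />
    </form>
  );
}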
Here is what a simplified version of my transition looked like. We moved from useChat (text-only) to a setup using streamUI.
// actions.tsx (Server Action)
'use server';

import { streamUI } from 'ai/rsc'; // newer SDK versions export this from '@ai-sdk/rsc'
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';
import { StockChart } from '@/components/StockChart';
import { LoadingSpinner } from '@/components/Loading';
import { fetchStockData } from '@/lib/stocks'; // your own data-fetching helper (path illustrative)

export async function submitUserQuery(input: string) {
  const result = await streamUI({
    model: openai('gpt-4o'),
    prompt: input,
    // Fallback renderer for plain-text responses
    text: ({ content }) => <div>{content}</div>,
    tools: {
      get_stock_price: {
        description: 'Get the current stock price and trend',
        parameters: z.object({
          symbol: z.string().describe('The stock ticker symbol'),
        }),
        // This is the magic part
        generate: async function* ({ symbol }) {
          yield <LoadingSpinner />; // Show this while fetching data
          const data = await fetchStockData(symbol);
          return <StockChart ticker={symbol} price={data.price} />;
        },
      },
    },
  });

  return result.value;
}

In this flow, the LLM isn't just guessing what a chart looks like. It identifies the user's intent to see a stock price, extracts the ticker symbol, and hands it off to my actual, typed, battle-tested React components.
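On the client, the consuming side can stay surprisingly small. Here's a minimal sketch of how the wiring can look: it calls the server action directly and appends whatever React node comes back. (The SDK's createAI / useUIState setup is the more complete version of this; the component below is illustrative, not my exact code.)

// page.tsx (Client Component) — a minimal sketch of the consuming side
'use client';

import { useState, type FormEvent, type ReactNode } from 'react';
import { submitUserQuery } from './actions';

export default function Chat() {
  const [messages, setMessages] = useState<ReactNode[]>([]);
  const [input, setInput] = useState('');

  async function handleSubmit(e: FormEvent<HTMLFormElement>) {
    e.preventDefault();
    // The server action returns a streamable React node, not a string.
    const ui = await submitUserQuery(input);
    setMessages((prev) => [...prev, ui]);
    setInput('');
  }

  return (
    <div>
      {/* Index keys are fine here — until lesson #2 below bites you. */}
      {messages.map((ui, i) => (
        <div key={i}>{ui}</div>
      ))}
      <form onSubmit={handleSubmit}>
        <input value={input} onChange={(e) => setInput(e.target.value)} />
      </form>
    </div>
  );
}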
The Hard Lessons (The stuff they don't tell you)
It wasn't all sunshine and rainbows. Once your UI starts building itself, you run into problems that a standard "Chatbot" never has to deal with.
1. The "Zod" Rigidity
The model is only as good as your schema. If your Zod schema for a tool is too vague, the LLM will hallucinate props. I once had a component crash because the LLM decided a price field should be a string like "$150" instead of the number my component expected.
Lesson: Be aggressively specific in your .describe() calls within your Zod schema. Tell the model exactly what format you need.
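For illustration, this is the kind of tightening that fixes it. The field names here are made up; the pattern is the point.

import { z } from 'zod';

// Before: vague — the model happily improvises the format.
const vagueParams = z.object({
  price: z.string().describe('The price'),
});

// After: aggressively specific — type, unit, an example, and a counter-example.
const strictParams = z.object({
  price: z
    .number()
    .describe('Numeric value only, in USD, e.g. 150.25. Never a formatted string like "$150".'),
  period: z.enum(['1D', '5D', '1M']).describe('The chart window to render'),
});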
2. State Management is a Nightmare
When the AI generates a component, where does the state live? If that StockChart has an internal toggle for "1D vs 5D" views, and the user switches it, that's fine. But if the user then sends another message, the entire UI stream usually refreshes or adds a new message. Keeping the "old" generated components synced with the "new" ones requires a very careful approach to your message history array.
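The shape that tends to work is append-only, with stable ids so React preserves each component's internal state across new messages. This is a sketch; the type name is mine, not the SDK's.

import type { ReactNode } from 'react';

// Each entry keeps the generated component as-is; earlier entries are never rebuilt.
type UIMessage = {
  id: string;           // stable key so React keeps internal state (e.g. the 1D vs 5D toggle)
  role: 'user' | 'assistant';
  display: ReactNode;   // plain text wrapped in a <div>, or a generated component
};

// Append only. Rebuilding earlier entries resets their local state.
function appendMessage(history: UIMessage[], role: UIMessage['role'], display: ReactNode): UIMessage[] {
  return [...history, { id: crypto.randomUUID(), role, display }];
}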
3. The "Flash of Boring Content"
Streaming text is fast. Rendering a complex component that needs to fetch its own data is slow. I had to get used to writing "Skeleton" states for my generative components. If you don't yield a loading state in your generate function, the UI just sits there frozen while the LLM waits for your API call to finish. It feels broken.
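Concretely, inside the tool's generate function from earlier, yield something immediately and only return once the real data is in. Roughly like this, where StockChartSkeleton stands in for whatever placeholder your component needs:

generate: async function* ({ symbol }) {
  // 1. Paint something immediately so the stream never looks frozen.
  yield <StockChartSkeleton ticker={symbol} />;

  // 2. Do the slow work. You can yield updated skeletons between steps if there are several.
  const data = await fetchStockData(symbol);

  // 3. Only return once the real component has everything it needs.
  return <StockChart ticker={symbol} price={data.price} />;
},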
Why this is worth the headache
Even with the hurdles, the result feels like magic.
When a user says "I spent too much on coffee this week," and the AI responds by instantly rendering an interactive spend-tracker with a "Set Limit" button, the friction of the web disappears. We’re moving away from users having to learn where our buttons are, and toward the interface meeting the user where their brain is.
Generative UI turns the LLM from a "smart-aleck writer" into a "dynamic UI architect." It’s messy, it’s still early, and the debugging tools are barely there—but I’m never going back to just a text box.

