
My UI Started Building Itself: The Hard Lessons of Moving Beyond the AI Chatbox
Why streaming text is just the beginning and how I transitioned to a Generative UI architecture using Vercel’s tool-calling capabilities.
The chat window is the peak of AI interaction. At least, that’s what every SaaS landing page from the last year wants you to believe. But honestly? Most of the time, the chat interface is just a lazy placeholder for a better UI that we haven't built yet.
If I ask an AI to help me manage my budget, I don't want a paragraph describing my spending habits in poetic detail. I want a bar chart. I want a "Pay Bill" button that actually works. I want the interface to adapt to the context of my request, not just spit out strings of tokens that I have to parse with my human eyeballs.
I spent the last few months moving away from the "Text-In, Text-Out" paradigm and into the world of Generative UI using the Vercel AI SDK. It’s been a ride of high highs and "why is this component rendering three times?" lows.
The "Wall of Text" Problem
We’ve all seen it. You prompt an LLM, and it starts streaming. You watch the little cursor dance across the screen, generating Markdown tables that look okay on desktop but break your mobile layout completely.
The breakthrough for me was realizing that LLMs shouldn't just talk to users; they should orchestrate components.
Instead of the model saying, "Here is your weather report for London," and then listing temperatures, the model should trigger a WeatherCard component. This sounds simple, but the plumbing behind it—mapping a model's "intent" to a React component in real-time—is where things get spicy.
Shifting to Tool Calling
The secret sauce is "Tool Calling" (or function calling). Instead of just generating a string, the model decides it needs to call a specific function. With Vercel's ai SDK, we can intercept that decision and return a React component instead of just raw data.
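For context, here is roughly where I started: the standard text-only useChat hook. This is a sketch from memory, and the import path is the AI SDK 3.x one (newer versions pull useChat from '@ai-sdk/react').

// chat.tsx (Client Component) — the "before" picture
'use client';

import { useChat } from 'ai/react';

export default function TextOnlyChat() {
  const { messages, input, handleInputChange, handleSubmit } = useChat();

  return (
    <form onSubmit={handleSubmit}>
      {/* Everything the model produces is a string; my eyeballs do the parsing. */}
      {messages.map((m) => (
        <p key={m.id}>
          {m.role}: {m.content}
        </p>
      ))}
      <input value={input} onChange={handleInputChange} />
    </form>
  );
}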
Here is what a simplified version of my transition looked like. We moved from useChat (text-only) to a setup using streamUI.
// actions.tsx (Server Action)
'use server';

import { streamUI } from 'ai/rsc'; // newer SDK versions export this from '@ai-sdk/rsc'
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';
import { StockChart } from '@/components/StockChart';
import { LoadingSpinner } from '@/components/Loading';
import { fetchStockData } from '@/lib/stocks'; // your own data-fetching helper (path illustrative)

export async function submitUserQuery(input: string) {
  const result = await streamUI({
    model: openai('gpt-4o'),
    prompt: input,
    // Fallback renderer for plain-text responses
    text: ({ content }) => <div>{content}</div>,
    tools: {
      get_stock_price: {
        description: 'Get the current stock price and trend',
        parameters: z.object({
          symbol: z.string().describe('The stock ticker symbol'),
        }),
        // This is the magic part
        generate: async function* ({ symbol }) {
          yield <LoadingSpinner />; // Show this while fetching data
          const data = await fetchStockData(symbol);
          return <StockChart ticker={symbol} price={data.price} />;
        },
      },
    },
  });

  return result.value;
}

In this flow, the LLM isn't just guessing what a chart looks like. It identifies the user's intent to see a stock price, extracts the ticker symbol, and hands it off to my actual, typed, battle-tested React components.
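On the client, the consuming side can stay surprisingly small. Here's a minimal sketch of how the wiring can look: it calls the server action directly and appends whatever React node comes back. (The SDK's createAI / useUIState setup is the more complete version of this; the component below is illustrative, not my exact code.)

// page.tsx (Client Component) — a minimal sketch of the consuming side
'use client';

import { useState, type FormEvent, type ReactNode } from 'react';
import { submitUserQuery } from './actions';

export default function Chat() {
  const [messages, setMessages] = useState<ReactNode[]>([]);
  const [input, setInput] = useState('');

  async function handleSubmit(e: FormEvent<HTMLFormElement>) {
    e.preventDefault();
    // The server action returns a streamable React node, not a string.
    const ui = await submitUserQuery(input);
    setMessages((prev) => [...prev, ui]);
    setInput('');
  }

  return (
    <div>
      {/* Index keys are fine here — until lesson #2 below bites you. */}
      {messages.map((ui, i) => (
        <div key={i}>{ui}</div>
      ))}
      <form onSubmit={handleSubmit}>
        <input value={input} onChange={(e) => setInput(e.target.value)} />
      </form>
    </div>
  );
}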
The Hard Lessons (The stuff they don't tell you)
It wasn't all sunshine and rainbows. Once your UI starts building itself, you run into problems that a standard "Chatbot" never has to deal with.
1. The "Zod" Rigidity
The model is only as good as your schema. If your Zod schema for a tool is too vague, the LLM will hallucinate props. I once had a component crash because the LLM decided a price field should be a string like "$150" instead of the number my component expected.
Lesson: Be aggressively specific in your .describe() calls within your Zod schema. Tell the model exactly what format you need.
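For illustration, this is the kind of tightening that fixes it. The field names here are made up; the pattern is the point.

import { z } from 'zod';

// Before: vague — the model happily improvises the format.
const vagueParams = z.object({
  price: z.string().describe('The price'),
});

// After: aggressively specific — type, unit, an example, and a counter-example.
const strictParams = z.object({
  price: z
    .number()
    .describe('Numeric value only, in USD, e.g. 150.25. Never a formatted string like "$150".'),
  period: z.enum(['1D', '5D', '1M']).describe('The chart window to render'),
});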
2. State Management is a Nightmare
When the AI generates a component, where does the state live? If that StockChart has an internal toggle for "1D vs 5D" views, and the user switches it, that's fine. But if the user then sends another message, the entire UI stream usually refreshes or adds a new message. Keeping the "old" generated components synced with the "new" ones requires a very careful approach to your message history array.
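The shape that tends to work is append-only, with stable ids so React preserves each component's internal state across new messages. This is a sketch; the type name is mine, not the SDK's.

import type { ReactNode } from 'react';

// Each entry keeps the generated component as-is; earlier entries are never rebuilt.
type UIMessage = {
  id: string;           // stable key so React keeps internal state (e.g. the 1D vs 5D toggle)
  role: 'user' | 'assistant';
  display: ReactNode;   // plain text wrapped in a <div>, or a generated component
};

// Append only. Rebuilding earlier entries resets their local state.
function appendMessage(history: UIMessage[], role: UIMessage['role'], display: ReactNode): UIMessage[] {
  return [...history, { id: crypto.randomUUID(), role, display }];
}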
3. The "Flash of Boring Content"
Streaming text is fast. Rendering a complex component that needs to fetch its own data is slow. I had to get used to writing "Skeleton" states for my generative components. If you don't yield a loading state in your generate function, the UI just sits there frozen while the LLM waits for your API call to finish. It feels broken.
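Concretely, inside the tool's generate function from earlier, yield something immediately and only return once the real data is in. Roughly like this, where StockChartSkeleton stands in for whatever placeholder your component needs:

generate: async function* ({ symbol }) {
  // 1. Paint something immediately so the stream never looks frozen.
  yield <StockChartSkeleton ticker={symbol} />;

  // 2. Do the slow work. You can yield updated skeletons between steps if there are several.
  const data = await fetchStockData(symbol);

  // 3. Only return once the real component has everything it needs.
  return <StockChart ticker={symbol} price={data.price} />;
},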
Why this is worth the headache
Even with the hurdles, the result feels like magic.
When a user says "I spent too much on coffee this week," and the AI responds by instantly rendering an interactive spend-tracker with a "Set Limit" button, the friction of the web disappears. We’re moving away from users having to learn where our buttons are, and toward the interface meeting the user where their brain is.
Generative UI turns the LLM from a "smart-aleck writer" into a "dynamic UI architect." It’s messy, it’s still early, and the debugging tools are barely there—but I’m never going back to just a text box.

