
How I Finally Tamed the Chaos of Streaming AI Responses in React
Stop wrestling with messy state hooks and learn how I moved from clunky API calls to a fluid, generative chat experience using the latest streaming patterns.
Most React developers are treating AI streams like simple fetch requests, and that’s why their UI feels like it's vibrating apart. If you’ve ever tried to pipe a GPT-4 response into a standard useState hook using a basic fetch call, you’ve likely seen the "jank": the screen flickers, the scroll position jumps like a caffeinated squirrel, and your browser's main thread starts screaming for mercy.
I spent three weeks trying to build a "simple" chat interface before I realized I was fighting the wrong battle. I wasn't building a data-fetching app; I was building a high-frequency state synchronization engine. Here is how I moved past the spaghetti code and finally got those buttery-smooth generative responses.
The "Naive" Way (Or: Why My First Draft Failed)
Initially, I thought: *It's just a stream. I'll just append every new chunk to a string.*
It looked something like this mess:
// Don't do this. Seriously.
const [message, setMessage] = useState("");

const handleChat = async () => {
  const response = await fetch('/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ prompt }),
  });
  if (!response.body) return;

  const reader = response.body.getReader();
  const decoder = new TextDecoder();

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    const chunk = decoder.decode(value, { stream: true });
    // This triggers a re-render for EVERY single chunk that arrives.
    setMessage((prev) => prev + chunk);
  }
};

This works for about five seconds. Then you realize that updating a massive string in state 60 times a second while React tries to re-render a complex Markdown component is a recipe for a 12-frames-per-second nightmare.
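If you're determined to keep the manual loop, the least you can do is batch. Here's a rough sketch of the idea: buffer chunks in a ref and flush to state at most once per animation frame. (The useBufferedText hook is my own invention for illustration, not a library API.)

import { useRef, useState } from 'react';

// Hypothetical helper: collects chunks and commits them to state
// at most once per animation frame instead of once per chunk.
function useBufferedText() {
  const [text, setText] = useState('');
  const bufferRef = useRef('');
  const frameRef = useRef(null);

  const push = (chunk) => {
    bufferRef.current += chunk;
    // Schedule at most one flush per frame.
    if (frameRef.current === null) {
      frameRef.current = requestAnimationFrame(() => {
        frameRef.current = null;
        const pending = bufferRef.current;
        bufferRef.current = '';
        setText((prev) => prev + pending);
      });
    }
  };

  return { text, push };
}

Inside the while loop you'd call push(chunk) instead of setMessage, and a whole frame's worth of chunks collapses into a single state update. It helps, but you're still hand-rolling cancellation, errors, and message history.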
Moving to a Stream-First Architecture
The breakthrough happened when I stopped trying to reinvent the wheel and looked at how the Vercel AI SDK handles it. They decoupled the *network stream* from the *UI state*.
Instead of manually managing readers and decoders, you use hooks that are optimized for high-frequency updates. If you aren't using useChat yet, you're working ten times harder than you need to.
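For context, useChat just POSTs the conversation to an endpoint (by default /api/chat) and consumes whatever streams back. Here's a minimal sketch of that route using the SDK's streamText helper with the OpenAI provider; the exact helper names shift between SDK versions, so treat this as the shape, not gospel:

// app/api/chat/route.js
import { openai } from '@ai-sdk/openai';
import { streamText } from 'ai';

export async function POST(req) {
  const { messages } = await req.json();

  // Stream tokens from the model straight back to the client.
  const result = streamText({
    model: openai('gpt-4o'),
    messages,
  });

  return result.toDataStreamResponse();
}

With the server side streaming, the client collapses to almost nothing.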
The Refined Pattern
Here’s the setup that finally made everything click. It handles the loading states, the message history, and the streaming chunks without me having to write a single while(true) loop.
import { useChat } from 'ai/react';

export default function ChatComponent() {
  const { messages, input, handleInputChange, handleSubmit, isLoading } = useChat();

  return (
    <div className="flex flex-col w-full max-w-md py-24 mx-auto stretch">
      {messages.map(m => (
        <div key={m.id} className={`whitespace-pre-wrap ${m.role === 'user' ? 'text-blue-600' : 'text-gray-800'}`}>
          <strong>{m.role === 'user' ? 'User: ' : 'AI: '}</strong>
          {m.content}
        </div>
      ))}
      <form onSubmit={handleSubmit}>
        <input
          className="fixed bottom-0 w-full max-w-md p-2 mb-8 border border-gray-300 rounded shadow-xl"
          value={input}
          placeholder="Say something..."
          onChange={handleInputChange}
        />
      </form>
    </div>
  );
}

Why this works
1. Throttled Rendering: The hook doesn't necessarily trigger a full React fiber reconciliation on every single byte. It's optimized to batch updates.
2. Request Cancellation: If the user gets bored and clicks "Stop" or navigates away, the AbortController is handled internally; the hook even hands you a stop function (see the sketch after this list). No more ghost streams running in the background.
3. Chat History: It manages the messages array automatically, appending the user's message and then "growing" the assistant's message in real-time.
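That second point deserves a concrete example. Because useChat also returns stop and isLoading, a cancel button is a couple of lines (sketch, trimmed to the essentials):

import { useChat } from 'ai/react';

export default function ChatWithStop() {
  const { messages, input, handleInputChange, handleSubmit, isLoading, stop } = useChat();

  return (
    <div>
      {messages.map(m => (
        <p key={m.id}>{m.content}</p>
      ))}
      {/* Only rendered while a response is streaming in. */}
      {isLoading && (
        <button type="button" onClick={stop}>
          Stop generating
        </button>
      )}
      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} placeholder="Say something..." />
      </form>
    </div>
  );
}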
The Final Boss: Auto-Scrolling
Even with a smooth stream, the UX is trash if the user has to manually scroll down as the AI writes a novel. But "Scroll to bottom" is deceptively tricky in React. If you scroll on every render, the user can’t scroll *up* to read previous messages without the UI snapping them back down like a magnet.
I solved this by using a "User-Intent" check.
const scrollRef = useRef(null);
const [isAtBottom, setIsAtBottom] = useState(true);

const handleScroll = () => {
  const element = scrollRef.current;
  if (!element) return;
  // Check if the user is within 50px of the bottom.
  const bottom = element.scrollHeight - element.scrollTop <= element.clientHeight + 50;
  setIsAtBottom(bottom);
};

useEffect(() => {
  if (isAtBottom && scrollRef.current) {
    scrollRef.current.scrollTo({
      top: scrollRef.current.scrollHeight,
      behavior: 'smooth',
    });
  }
}, [messages, isAtBottom]);

By tracking whether the user is *already* at the bottom, we only auto-scroll if they haven't manually scrolled up to check a previous fact. It’s a small detail, but it’s the difference between a tool that feels "pro" and one that feels like a hackathon project.
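For completeness, both the ref and the listener go on the element that actually scrolls (not the window), something like:

<div ref={scrollRef} onScroll={handleScroll} className="h-96 overflow-y-auto">
  {messages.map(m => (
    <div key={m.id}>{m.content}</div>
  ))}
</div>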
Markdown and Performance
A quick warning: Parsing Markdown (via something like react-markdown) on every stream chunk is expensive. If your AI responses are long, you’ll notice the fans on your laptop spinning up.
The Fix: Wrap your Markdown component in React.memo or use a lightweight parser. Don't let your syntax highlighter re-calculate the entire code block theme every time a new semicolon is streamed in.
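Here's the memo version in practice, a small sketch assuming react-markdown. Finished messages keep a stable content prop, so they skip the re-parse entirely; only the message that's still growing pays the cost:

import { memo } from 'react';
import ReactMarkdown from 'react-markdown';

// Re-renders only when `content` changes, so every *finished* message
// is parsed once and then left alone while new chunks stream in below it.
const MarkdownMessage = memo(function MarkdownMessage({ content }) {
  return <ReactMarkdown>{content}</ReactMarkdown>;
});

export default MarkdownMessage;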
Wrapping Up
Streaming isn't just about showing text fast; it's about managing asynchronous state without blocking the user. By moving away from manual fetch loops and embracing hooks like useChat, you stop fighting the browser and start working with it.
Stop wrestling with the chunks. Let the libraries handle the stream, and focus on making the UI actually look good. Your users (and your CPU) will thank you.

