The Textbox Should Not Be the Future of AI
ChatGPT Changed More Than AI
When ChatGPT came out in 2022, it was a magical moment. Write anything into the textbox, and you’d get back a half-decent answer. The world was taken aback, and rightly so. Over the next few years, we got to see many more apps spring up around LLMs with a similar form factor: a prominent textbox as input, and a chat-like interface. What was once the signature interface of ChatGPT slowly became the industry standard for interacting with language models.
As things stand today, LLMs seem to have found utility in almost every major software product, and they are being integrated at a lightning pace. In most cases, they absolutely should be. The technology is indeed that versatile. But along the way, something else happened too: the integration of LLMs became synonymous with the introduction of a chat-like interface.
The Textbox Became the Default
Almost every product that I see is shipping the same prominent textbox as an input, the one that OpenAI popularized with ChatGPT. This right here is what I can’t wrap my mind around. In the rush to keep up with the AI hype cycle, every product is being slapped with that same chat input. What began as one brilliant interface for one kind of product is now being treated as the universal interface for all products.
We spent years iterating on UI and UX to get to sliders, buttons, panels, toggles, menus, shortcuts, and so many other input components that made software faster and easier to use. Now suddenly, the textbox is being treated as the answer to everything. And that feels like a mistake.
Why That’s a Problem
Honestly, the textbox as an input seems like a downgrade. In a lot of cases, what used to take five clicks now takes 200 keystrokes. That is alarming.
Input is essentially a way to determine user intent. It answers the question: what is the user trying to do? Once that intent is clear, the software can process it and return an output. The problem is that typing is not always the best way to express intent. Sometimes it is. Sometimes it’s not.
If I already know I want a date range, a comparison view, a filtered dashboard, or a sorted list, why should I type all of that from scratch every time? Why should language be the only doorway into the system?
When Typing Is Worse Than Clicking
“Give me a report for last month and then compare it with the month before that” is clear intent, but it is also very time-consuming to type. I would rather press a few buttons to get this report, and pick the comparison from a dropdown. That would be a couple of clicks, and it would convey the same information to the system.
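To make that comparison concrete, here is a minimal Python sketch. It assumes a hypothetical `ReportIntent` structure with made-up field names and values (`period`, `compare_to`, `last_month`, `previous_month`); none of these come from a real product. The point is only that a typed request and two clicks can carry the same intent to the system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportIntent:
    """Hypothetical shape of what the system needs to know."""
    period: str       # e.g. "last_month"
    compare_to: str   # e.g. "previous_month"


# Path 1: the user types the request into a textbox.
typed_request = (
    "Give me a report for last month and then compare it "
    "with the month before that"
)

# Path 2: the user picks values from two controls.
# Each entry stands for one click in an imagined UI.
clicks = [("period", "last_month"), ("compare_to", "previous_month")]
clicked_intent = ReportIntent(**dict(clicks))

print(f"Typed:   {len(typed_request)} keystrokes")
print(f"Clicked: {len(clicks)} clicks -> {clicked_intent}")
```

The typed path would still need to be parsed into something like `ReportIntent` before the software could act on it, whereas the clicks produce it directly.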
This is where traditional UI still wins. Buttons are fast. Dropdowns are clear. Sliders are intuitive. Panels make options visible. Good interfaces reduce effort. Typing everything manually often does the opposite.
Not every task needs a conversation. Some tasks just need a clean control panel.
Calculator, Excel, and ChatGPT
I understand that language models fundamentally work with language, so it is logical to take words as an input. Textboxes are hence the default choice; voice comes in next. But I’d like to believe that this cannot be the final form of how we interact with them.
Language models, as they’re being used today, are not very different from any other software at their core. They take an input, do something with it, and give you a result. Your calculator does that. So does Microsoft Excel. And ChatGPT is no different.
Input → processing → output.
What differs across the three is the complexity that can be handled at each stage. A calculator can take in just the numbers and operations displayed on its UI, in a specific sequence. Get the sequence wrong, and it won’t work. Excel can take in far more kinds of input than a calculator. It not only takes everything on your keyboard, it also takes in stuff that isn’t on it. Even here, some level of sequencing is required, but it’s much more flexible.
ChatGPT, at the very end of that spectrum, can probably take your grandma in as input, if you tried hard enough. There’s almost no wrong sequence here that can break it. The same is true for their processing and output capabilities as well. But greater flexibility does not automatically mean one textbox is the best interface for every use case.
The Future of AI Interfaces
Now, I understand that language models can do much more than what static software does. And that is why not everything can be made into a button or dropdown. But I do believe that there has to be a middle ground between the two extremes of that user experience.
Maybe it is generative UI. Maybe it is voice. Maybe it is interfaces that adapt in real time based on what you’re trying to do. Maybe it is software that starts with buttons and panels, then opens into conversation only when needed.
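As one rough illustration of that last possibility, here is a small Python sketch of software that uses a direct control when one exists and opens into conversation only otherwise. Everything in it is an assumption made up for the example: the `BUTTON_ACTIONS` table, the `handle` function, and `ask_model`, which is a placeholder rather than a real model API.

```python
from typing import Callable, Optional

# Hypothetical actions a product might expose as buttons or menu items.
BUTTON_ACTIONS: dict[str, Callable[[], str]] = {
    "export_csv": lambda: "exported report as CSV",
    "sort_by_date": lambda: "sorted list by date",
}


def ask_model(prompt: str) -> str:
    """Placeholder for a language model call; not a real API."""
    return f"(model response to: {prompt!r})"


def handle(action: Optional[str] = None, prompt: Optional[str] = None) -> str:
    """Use a direct control when one exists; fall back to conversation."""
    if action in BUTTON_ACTIONS:
        return BUTTON_ACTIONS[action]()
    if prompt:
        return ask_model(prompt)
    raise ValueError("need either a known action or a prompt")


print(handle(action="export_csv"))
print(handle(prompt="Why did signups dip in week 3?"))
```

This is only one way the middle ground could look; the routing rule here is deliberately simple.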
I don’t know. I’m trying my best to imagine that future, as someone who loves product and building. I am sure other much more capable people are trying as well.
We’re Still Early
These are some exciting times in our history. What we are seeing right now may simply be version one. The textbox may be the beginning, not the destination.
We haven’t reinvented software yet. We’ve only added a textbox to it.
And I’m just waiting to see how things turn out.