Why Language Models Struggle to Design What They Have Never Really Encountered

There is a basic problem with asking a language model to design a user interface, and it is deeper than whether the model can write decent HTML or organize a page well. It usually can. It can produce clean layouts, sensible hierarchies, and functional code. It can even imitate taste when given enough direction. The real limitation sits underneath all of that: the model’s understanding of design is built from the accumulated record of things that have already been made, and most of what has already been made follows a small number of familiar patterns.

That is simply how the system learns. A language model is trained on recurrence. It absorbs structures by seeing them over and over until they become statistically dominant. If it has encountered endless landing pages with a hero at the top, supporting sections in the middle, and a footer at the bottom, then that architecture begins to feel less like one solution among many and more like the natural shape of a website itself. So when someone asks it for something “new,” what it often delivers is not a genuinely new structure, but a polished variation on the same old one.

Music offers a useful comparison here. Popular music has relied on a stable architecture for a very long time: intro, verse, chorus, verse, chorus, bridge, chorus, outro. It works because it is legible and it gives listeners a rhythm of return. They know where the emphasis lands, where the hook lives, where the emotional payoff is supposed to come. A model trained mostly on that material would learn, very quickly, that this is what a song is.

But that logic does not hold across all genres. In deathcore, death metal, and parts of metalcore, songs often do not move in such tidy units. A passage may swell without announcing itself as a chorus. A breakdown may not function like a bridge at all, but like a tear in the song’s internal logic. Sections blur together. Energy is redistributed in ways that would sound wrong in a pop structure but feel completely natural in a heavier one. The song is not failing to meet the template; the template was never the point.

A model trained on dominant patterns has a tendency to drag those patterns everywhere. The tendency is easy to spot, because the model keeps trying to restore the expected form. It wants to place a chorus where one does not belong because, in its learned understanding, that is where the song is supposed to open up.

The same thing happens in web design. The standard hero-content-footer layout did not emerge by accident. It solved a real problem. Businesses wanted to explain who they were, what they offered, and what the visitor should do next, usually in a matter of seconds. A prominent hero establishes identity, branding, and purpose. Supporting sections elaborate for those who are interested. A footer gathers the rest. For marketing sites, product pages, portfolios, and SaaS companies, that structure is efficient and easy to navigate. It is not hard to see why it became the dominant grammar of the web.

The trouble is that efficiency is not the same thing as originality, and clarity is not the same thing as experience.

When a language model designs an interface, it tends to optimize for the things most easily recognized as successful. The layout is coherent. The navigation is visible. The hierarchy is readable. The call to action is hard to miss. None of that is bad. In fact, most of it is necessary. But memorable design usually begins where that checklist stops. The sites people remember are not always the ones that communicated most directly. Often they are the ones that created a mood, or established a rhythm, or held attention through surprise. They did something riskier than merely functioning well.

Some interfaces feel cinematic. Others feel playful, eerie, calm, dense, or strangely intimate. Some delay comprehension on purpose, not because they are poorly made, but because disorientation is part of the experience they are trying to create. A model can imitate surface features associated with those effects, but imitation is not the same thing as intention. It can reproduce visual signals of “interesting design” without grasping why those signals mattered in the first place.

That gap becomes especially obvious when you try to guide the model with better rules. I have been working on a skill meant to push models toward more thoughtful interface design. It does help a little. The outputs become more refined. Typography is handled with more care. But the deeper structure usually remains intact, and it is hard to dislodge because it is embedded deep in the model's training data. I do not believe the data for other kinds of structure is non-existent. Quite the opposite: it is there, but it needs to be elicited.

So the problem is not that the model cannot make attractive things. It can. The problem is that it keeps returning to the same architectural assumptions, even when the styling becomes more sophisticated. What changes is usually the finish, not the underlying idea.

That is where idiosyncratic design differs. What people call artisanal design is not just handmade-looking work, or work with rougher edges, or work that seems personal in some vague aesthetic sense. More often, it is work shaped by a strong point of view. The designer is trying to make a specific experience happen, and is willing to discard conventions that get in the way. Sometimes that means breaking usability norms. Sometimes it means making a page less efficient in order to make it more affecting. Sometimes it means accepting that part of the audience will dislike it.

That sort of design requires judgment in a deeper sense than models display. It requires preference that is not merely inferred from precedent. It requires choosing a path not because it resembles what has already been validated, but because it feels truer to the thing one is trying to make. It also requires tolerance for failure. A human designer can decide that a risky idea is worth pursuing even if it might confuse people, alienate clients, or simply fall flat. A language model has no equivalent stake in being wrong for the right reasons. It is built to approximate what is most likely to fit.

There is another complication, and it may become more serious over time. As more interfaces are generated with the help of models, those interfaces become part of the broader visual environment future models will learn from. The result is a feedback loop. Template-driven design is no longer just the inherited background of the web; increasingly, it is being reproduced and reinforced by systems already predisposed to favor it. With each cycle, the roughness gets sanded down. The variance narrows. The same structures return with slightly better execution and slightly less friction, but rarely with any real departure from the form.

That does not mean nothing new can ever emerge from these systems. It just means that novelty will, more often than not, be shallow unless a human intervenes at the level of concept rather than decoration. If you ask a model to be experimental, it will often interpret that as a request for more motion, stranger colors, sharper type contrasts, or a more dramatic layout treatment layered over the same familiar skeleton. It can embellish the template. It has a much harder time abandoning it.

And that may be the real boundary here. Design innovation is not just the production of unusual outputs. It comes from knowing what kind of experience ought to exist, even before there is a recognized structure for delivering it. It starts with an intuition about feeling, pacing, atmosphere, or tension, and only then works toward form. A model can remix known solutions. It can even do so elegantly. But remixing is different from originating a form around a vision.

None of this makes language models useless for design. Far from it. Most design problems are not avant-garde problems. Most businesses do, in fact, need clarity, hierarchy, and a page structure people already understand. For those tasks, the template exists for a reason, and the model is often quite good at using it. The mistake is expecting it to produce the rare kind of work that alters your sense of what a website can be.

That kind of work still tends to come from a person with an obsession, or a sensibility, or a stubborn and perhaps irrational commitment to making something that does not quite fit the existing mold. It comes from someone willing to begin with the experience they want to create rather than the structure they know will probably work. A model can assist that person. It can accelerate drafts, generate variants, and clean up execution. What it cannot supply is the underlying conviction.