The root cause is that web browsers were designed for document delivery but are used for building application UIs. Browsers don’t offer a standard set of common application UI components, so every team builds its own, leading to inconsistency and half-baked implementations.
In contrast, developers building a native app can draw on a standard set of OS-provided UI widgets.
I think the root cause goes deeper than that, and has to do with economic incentives. Up through the 90s, the predominant business model was: you sell a product, people use it to get work done, and if they're happy they tell their friends and you sell more product. Starting in the 00s, the business model became: you give a service away for free, get people hooked, make them so dependent on it that they can't look away, and then either jack up prices to extort as much money from them as possible, sell advertisements so that other people can do the same, or sell their personal data so that other people can target them with sales pitches. Actually getting any work done became secondary to making the transaction happen. This applies just as much to enterprise software as consumer software, because the purchaser of enterprise software is usually some IT department, purchasing department, or executive who doesn't have to actually use the software, and who will probably move on to the next company before the consequences of a useless purchasing decision become visible.
We are reaping the consequences of that now, where lots of transactions are happening that don't actually make anyone happy or productive.
But you can see how that would filter down into UI design. When your incentive is to make people happy and productive, you spend time studying how people actually use the product, and then optimize that so they can use the product more efficiently. When your incentive is to turn people into mindless consumers who keep coming back for more ads, you spend time studying what sort of content holds the user's attention, and then optimize that so you can work as many ads into the stream as possible without them turning away. When your incentive is to sell enterprise software, you spend time studying what sales pitches will get the budget-holder to open their company's wallet, and then optimize the sales funnel at the expense of actual product usability. Even if your users hate you, they don't get to decide whether they keep using you.
Developers always had the flexibility to create custom UI elements, colors, etc., even in native apps (albeit not as easily as with CSS). Even in SPAs, most UI elements follow more or less the same style or pattern (Bootstrap, Tailwind, etc.). It's the overall UI design itself that's not user-friendly in enterprise/business apps (excessive padding, comically large UI elements, etc.).
Wouldn't say it's the root cause, but it is a major cause. I have some experience developing desktop applications using Visual C++ / MFC in the early 2000s. I still prefer that development experience to modern React/Redux SPA development.
OS widget libraries aren't always big enough to solve all problems. On the web, there are many frameworks that provide widgets for typical use cases.
But even if you have a library with hundreds of widgets, you can still make a terrible UX if you don't understand good design, and many programmers don't.
In my experience, most designers don't know what UX is. They think their job is to make it look pretty. If it takes 3x more clicks to do the same thing as before, so be it.
There's more consistency in UIs on the web than on the desktop, IMO.
People have less power on the web, so it has more limitations, even if it lacks a number of consistent UI components baked in. Desktop apps are notorious for getting fancy. Even simple control apps for random headphones/keyboards/music gear/etc. all want to reinvent the settings page and make it 'sleek' instead of usable.