Devpod: Remote development environment at Uber (uber.com)
291 points by phgn on Dec 18, 2022 | 230 comments


Back in the early 2000s, Sun ("the network is the computer") had a similar solution that worked seamlessly for most of their software org-- the Sun Ray. https://en.wikipedia.org/wiki/Sun_Ray

It was a network terminal. Your files and entire session were on the server. Your “local” terminal consisted only of a network interface and enough compute power to display your session. The way they had it set up was that you could insert your Sun employee ID – the same card used to get into the building – into a slot in the terminal. That authenticated you to the server and displayed your session instantly. Want to show a colleague something you’re working on? Just put your ID into their Sun Ray and show them exactly what you were doing. That was cool! It was a frictionless way to demo and collaborate.


I worked on a Sun Ray. I and many colleagues absolutely hated it.

These tiny machines were just way too slow to handle even the tiny amount of work they had to do. Also, everyone knows that X over network is just not made for modern applications ("modern" in the year 2000!). I worked with Matlab, and had a lot of fun trying to rotate 3D plots with a few thousand points. It was just unbearable.

Then of course the "single point of failure" thing. Network problems? No one can work. Main server has a drive failure? No one can work. Main server needs an upgrade? No one can work.

The Sun Rays had super-poor USB support. My ergonomic keyboard had no auto-repeat when connected to these things, absolutely impossible to fix or even debug. Then of course there was Sun software: although they invented Java, their JVM was leaking like a sieve and everything Java had to be restarted regularly. The Sun coreutils were just very limited compared to the GNU counterparts. We complained endlessly, and in the end, IT budged and we all got our dedicated Linux machines.


How long did the networked machines experiment last at Sun before you switched to Linux PCs?

I remember that famous Larry Ellison speech in the mid 90s about how thin client/networked applications will be the future. It’s apparently what helped make him famous because no one cared about enterprise DBs in the tech media:

https://tedium.co/2018/04/12/larry-ellison-network-computer-...


No, I did not work at Sun, I did my work on a Sun Ray computer at university. Sorry if that wasn't clear.


> everyone knows that X over network is just not made for modern applications ("modern" in the year 2000!). I worked with Matlab

No. I don't think everyone knows that at all. At my university we used X terminals connected to a central Sun SPARCcenter 2000 over ethernet and it was fine. We used MATLAB, Maple V, SAS, etc etc.


What's old is new again. How soon until we realize the X window system actually had some good ideas again and start running desktop apps on cloud servers for remote work?


It had some good ideas, but that was the extent of it. Every iteration of the solution since then has agreed that streaming all application UI is always going to be janky and needless. It's a lot better to host the data on the server side and let the client handle visual rendering by itself.


These days lots of people stream games and CAD sessions over the internet using software like parsec. It works quite well and is often better than the UI running locally on an underpowered GPU.


Sure but most regular consumer or business applications don't really need that level of graphical power. Rendering a menu or button or blurb of text locally from some layout language is always going to be more performant than streaming raw pixels from a server.


It would be, if everyone agreed on a toolkit. As it is, at least X11, AFAIK Wayland, and Windows/RDP ended up just throwing pixels over the wire because every program renders text/menus/whatever differently.


I wonder what the world would look like if we decoupled what something is supposed to be and what data it's supposed to have, from how it actually looks and acts. Say, some "lowest common denominator" that works reasonably across all platforms.

For example, I'd say that I need:

  A dropdown for a single option, with options: A, B, C
And let the device itself decide how to display it using its native GUI toolkit. Then just send that specification over the wire, instead of needlessly wasting bandwidth on lots of pixels.

Actually, I think I'm just describing an analogue to HTML with the equivalent of CSS provided by the platform, but for native desktop toolkits (hopefully without the complexity of a browser engine).
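
A rough sketch of what such a wire format might look like, in plain Python (all the widget names and fields here are made up for illustration):

  import json

  # Hypothetical declarative spec: the server describes *what* it needs,
  # the client maps each node to whatever native widget toolkit it has.
  form_spec = {
      "type": "form",
      "title": "Export options",
      "children": [
          {"type": "dropdown", "id": "format", "label": "Format",
           "options": ["A", "B", "C"], "selected": "A"},
          {"type": "button", "id": "ok", "label": "Export"},
      ],
  }

  payload = json.dumps(form_spec).encode("utf-8")
  print(len(payload), "bytes on the wire")   # a few hundred bytes, not pixels

  # The client only sends events/values back, e.g.:
  event = {"widget": "format", "event": "changed", "value": "B"}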


RDP mostly does not do that. It sends higher-level commands like text and geometry placement. That's why it performs 10x better than the competition.


Actually it does do that, since about 2010: https://www.anandtech.com/show/3972/nvidia-gtc-2010-wrapup/3

RDP doesn't perform even 0x better than the competition, like parsec.


I heard Stadia was actually pretty good, especially for a game that wasn't super sensitive to fast reactions like RDR2.


The tech was by most accounts groundbreaking. The problem was with everything else – business model, pricing, games, customer trust.


It wasn't really groundbreaking tech. It's been possible for years, see online. It's just that the business model sucks. We did some prototypes in a large company I worked for and it was easy to tell that if you could put servers in local data centers with fat pipes it would work. We had demos using EC2 back in 2013. Then we ran the numbers and were like why, especially with free-to-play becoming more and more of a thing, so we canned it.

Nvidia had a successful product as well before Stadia. Hell, I remember reading "use an AWS GPU instance as a gaming machine" blog posts before Stadia.


I don't think Stadia itself was groundbreaking. Xbox Cloud worked as well as Stadia.

But a few yrs in the big players in the cloud gaming industry figured out some critical latency issues around the controllers and optimizing delivery of the video feed. To the point where lag was almost non-existent on a good fiber connection. The datacenter stuff was obviously the core innovation though.

I still prefer to download my Xbox games to my series s/x but I spent a year playing Xbox cloud games exclusively and I could 100% see that being the default for a big part of the casual audience.


Stadia guy here. There were hiccups, but for the price+convenience, it was awesome.


I have streamed my work session with NoMachine and I have used VS Code remote. NoMachine is really good, but I would still prefer to have a beefy enough machine.


I'm using Steam Remote Play over local LAN for gaming, works pretty well.


I used to run Firefox on my Debian server with X tunneling in high school sometimes. It wasn't that bad then and surely it's even better with today's internet.


Firefox is a particularly bad experience over X forwarding these days.

Turns out lots of software depends on at least minimal hardware acceleration.


>start running desktop apps on cloud servers for remote work

Hopefully never.


I bet on LAN X11 forwarding was pretty sweet, but when I tried it over Wi-Fi/tethering it was pretty janky (and before someone suggests xpra, that wasn't buttery smooth either). But I do like the idea of what X11 forwarding is, better than VNC where the whole desktop is shared. I read that on Windows RDP can "stream" only the window, whereas on Linux the implementation is full desktop only? (please correct me if I'm wrong, because I would love to tunnel apps from servers).


Honestly, I think xpra is probably the best open solution you will find. It does have (too) many knobs to configure, so it takes time to try out different codecs and settings to find a sweet spot using their session info toolset. And the mailing list is very helpful. For example, if bandwidth is plentiful, you can use straight RGB encoding and it performs well. Turning on audio can have an effect on your latency as well.


IIRC, part of the problem is that X11 forwarding works best when the software uses raw X primitives (someone can correct my terminology if I'm wrong), because then X11 can just send draw commands. Neither GTK nor Qt uses the X11 primitives, so instead the server has to send entire pixel buffers, which is naturally a lot slower.


> "stream" only the window

I mean, that is a thing xpra can do. Or, x11vnc can do it with some hoop-jumping.


X maintainers are promoting a new platform that doesn’t provide for remote hardware rendering. Ironically, shipping Javascript to a web browser (local to the display) seems to be the way ahead for performance.


We had SunRay at a municipality here in Sweden where I worked at the IT department. I thought it was really cool, and I had built our own "session routing" script that could connect terminals to different servers based on the smartcard ID. Terminals in the schools connected to a school server where students could log in without smartcards, but if a teacher inserted their card it would connect them to the "admin" server or to a Windows VM.

I had my own server in the DC that I could then connect to from my desk using a multi-monitor SunRay terminal. At home I had a SunRay connecting in to the office over VPN. I could move between terminals by just inserting my smartcard in whichever terminal I was at. There was even a company making a SunRay laptop called Gobi that I tried using, but a regular laptop with the software client was a much better experience.


I helped deploy the previous version of that, the JavaStation, to everyone in MPK. They did not take away the existing computers, and the JavaStations mostly went unused.


The thin-terminal possibilities that Java was working on solving were exciting back in the day.


I tried to set this up at a small tech company pre-Kubernetes and there were a lot of challenges...

* low utilization, so you need to auto-shutdown or auto-scale down systems but sometimes jobs did need to run overnight or on the weekend so you needed an interface/user training to avoid upsetting users/killing their jobs.

* local hardware is cheap and powerful - employees are already issued really powerful laptops and some teams just went out and bought their own really, really powerful workstations. It is hard for the 'cloud' to compete with this.

* bin-packing workloads was hard, Kubernetes probably solves this better.

* fast-customization was hard, we had docker but it was hard to train users to update/fork the dockerfiles to keep the environment reproducible. A lot of users were more scientists than engineers and so weren't great at using version control.

* persistence / shared-storage IOPS are expensive - shared storage is a nice-to-have and a number of teams made a lot of use of it; it also made it easy to migrate users around, but it's expensive. Local disks were also very painful/slow/buggy to detach/reattach (maybe this is better now).

* latency - we needed instances close to our team and at the time the metro clusters near us were like second class with low capacity and limited features.

* specialty hardware like GPUs

* multiple paradigms - we also needed long-running dev/staging environments, Spark clusters, and other software, some managed or licensed to run in a specific way, and it was hard to get these all managed the same way, clustered on the same nodes, without introducing other issues - but without this, costs would spiral. Again, now that Kubernetes is the de facto cluster manager this might be easier now.


Yeah, I can't really get past how powerful the machines on our desks are compared to what hosting providers charge for 12 months of utilization (I get that people in SV change jobs every 6 months, but still).

I guess the thing that makes this make sense for Uber is the ginormous repo? This probably is a good "big ball of mud" solution to a bazillion tiny projects, each with varying degrees of documentation making it impossible to run.

One day we as an industry will figure out how to make our dev setups work and run well in a multitude of environments (hopefully without that solution involving "we are pinning to a specific Ubuntu docker image"...)


> Yeah, I can't really get past how powerful the machines on our desks are compared to what hosting providers charge for 12 months of utilization

I'm not so sure. Here's a quick comparison:

Desktop:

Dell XPS: 12th Gen Intel Core i5-12400, Windows 11 Home, Intel UHD Graphics 730, 8 GB DDR5, 256 GB SSD

$669.99

https://www.dell.com/en-us/shop/desktop-computers/xps-deskto...

Cloud:

Linode 8 GB: 8 GB RAM, 4 CPUs, 160 GB storage

$0.06 / Hr

Suppose I work every working day of the year for 10 hours (an overestimate). I would pay 250 * 10 * $0.06 = $150 for the Linode. So the Dell desktop takes 4.5 years to pay for itself.

Granted this is a fairly rough comparison, but I don't think it's obvious that physical machines are more cost efficient than cloud machines.
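
For what it's worth, here's the arithmetic as a quick Python sketch, using the prices quoted above (adjust the assumptions for your own case):

  # Break-even of buying the desktop vs. paying hourly for the cloud box.
  desktop_price = 669.99        # Dell XPS quote above, USD
  cloud_rate = 0.06             # Linode 8 GB, USD per hour
  working_days = 250
  hours_per_day = 10            # deliberately generous

  yearly_cloud = working_days * hours_per_day * cloud_rate
  print(f"cloud cost per year: ${yearly_cloud:.2f}")                   # $150.00
  print(f"break-even after {desktop_price / yearly_cloud:.1f} years")  # ~4.5 years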


First off these specs are... poor. My 2012 spare laptop that I keep for nostalgia has more memory and storage.

You can't physically connect a local screen and a keyboard to a remote Linode instance, so the $0.06/h is on top of the $669.99 (which in your example doesn't include screen, input, etc).

So in that exact setup you end up with a sub-par local computer that connects to a sub-par remote computer (and creates all kinds of headaches associated with remotely accessing compute resources); where the remote computer can't do a single thing that a ($669.99+$150) $820 local computer wouldn't do much better.

All to save you how much cash? How many hours' worth of $ in engineer salary? $2000-4000 will get you a really powerful M1/M2 Mac, and if you need more than 128GB of RAM then you can indeed spin up an instance - and instances with >128GB of memory are nowhere near $0.06/h.

Optimize for developer productivity, that is your bottleneck. Uber's problem is less of "our local machines are bad and/or expensive", more of "our setup is so convoluted, and we have so many devs, that we need to apply economies of scale to tackle the problem".


> $2000-4000 will get you a really powerful M1/M2 Mac

If we take this as the basis for comparison, then the same dollars will buy 1-2 (working hours only) years of Linode 96 GB with 20 cores, so it seems like cloud services indeed lose competitiveness at the high end.


Not a great comparison. The i5-12400 has 6 physical P-cores. The Linode instance you mention has 4 vCPUs, which typically correspond to logical cores and are subject to noisy neighbors. If your workload needs those CPUs, a better comparison would be against the “Dedicated 32 GB” plan, which has 16 dedicated vCPUs. Those instances are 3x your quoted price.


> If your workload needs those CPUs, a better comparison would be against the “Dedicated 32 GB” plan

I'm not sure what you mean. The "Dedicated 32 GB" plan is 6x my quoted price. Why wouldn't the comparison be to the midpoint of the "Dedicated 8 GB" and "Dedicated 16 GB" plans (which have 4 and 8 CPUs respectively)? That would be around $0.14 per hour, about 2.5x my quoted price.

Anyway, it seems highly unlikely that local development, which uses CPU very burstily, needs a dedicated plan.


When your app is really 50 microservices, or one really big monolith, there's no great way to run it locally. And if you have one single job that takes up more RAM than you have, or requires tons of parallel CPU, the laptop is a bottleneck. And heaven forbid your corporate overlords saddle you with horrible virus-scanning crap or mandatory HTTP proxies with custom certs...

A lot of this will be solved if the people making our OSes ever decide to build a real distributed OS (either SSI or something more like P9). The OS is supposed to make programs easier. Why we keep trying to reinvent the wheel inside every application, and tie them together with duct tape, I don't know.


Let's face it: monorepos are just huge monoliths. If you want modular software, you WILL have to pay the price of fragmentation.

Someone needs to think about the boundaries between different parts of a system. If those boundaries are defined by functions, classes, packages or services, it doesn't matter that much. Yes, it is always a pain.

Shipping the entire forest of modules in a single repo seems like a good compromise. It stops being a good compromise when you start creating a complicated network of soft dependencies between those modules. And when you have everything in a single repository, it's hard NOT to do that.

By the way, there's nothing wrong with monoliths. They're just better if you design them as such. There's nothing wrong with microservices either. It shouldn't be a surprise, though, that neither of them is actually a silver bullet.

I understand why these solutions were created. I see it as an ephemeral thing. Once someone figures out the tooling and practices to automate all of that painful management locally, developers will always prefer that. Then when it's fast and easy, we'll break it once again (like we did with classes, packages, containers, etc).


Monorepos and monoliths are orthogonal

Monorepos are a tooling nightmare. Monorepos are a bandwidth black hole.

Monorepos make some things possible that are just not possible or very very hard when you have multiple independent repos that are built and tested independently.

Namely,

a) they allow you to test the effects of a change on all the components that depend on you, before you merge your change. This reduces the noise caused by regressions (API or behaviour) introduced in one component that is depended on by many consumers.

b) they are a practical way to ensure that all components have up-to-date internal dependencies: by placing the burden of API and behaviour breakage on the author of the change, you don't end up with hundreds of teams each struggling to keep up with dependencies that keep breaking their builds when updated, and consequently hating the teams that release those changes.

None of these things is a big deal unless you're a huge company with hundreds of teams.

I think in theory there could be some tooling and workflow that could provide all or most of the benefits of monorepos, without the downsides.

Until then, monorepos are likely going to be a bad choice for small companies.


I agree with the other commenter - in my view, a monorepo is the _best_ choice for a small company. I guess this depends on what tooling is available for your language / ecosystem of choice though. In my experience of TypeScript and Java with monorepos, you definitely need to know how to configure the tooling properly (which is certainly "overhead"), but it massively reduces the maintenance cost and increases the consistency of your tooling config. Spreading out over loads of repos means you need to share artefacts, which means package managers and package manager hosts, and a whole suite of release CI/CD which gets out of sync almost immediately.

It's also getting a lot better, gradle works amazingly well for a monorepo even with dozens of developers committing to it every day with shared caching, nx/turborepo/others are making the story for front-end/TS much better too.


Can you explain how a monorepo for microservices is a tooling nightmare and bandwidth sink?

In my experience, a monorepo significantly makes things easier and saves time everyday for all developers involved when compared with multi repo.


The biggest issue is that every CI provider speaks with "repo=project" language, so your way forward with monorepos is using stuff like Bazel.

Bazel is extremely cool! But you end up with, like, a handful of people on the team who can write Bazel and everyone else cargo-culting their way through it.

There are a lot of JS "monorepo management" tools but they all seem concerned about the release phase for a lot of libraries that really should just be one library instead of 300 npm packages or whatever.


> Bazel is extremely cool! But you end up with, like, a handful of people on the team who can write Bazel and everyone else cargo-culting their way through it.

It's not that difficult and docs are outstanding. It can and it will be worse with your own Bash-isms that likely won't have any docs at all.


I think Bazel is doing a lot of good stuff... but I think it suffers a bit from the same thing as Angular, where it makes a lot of its own terminology.

There is another aspect where if you are using a language like Python or JS then you have to kind of swim upstream to get an existing project onto Bazel. Far from impossible but if you just look at the default Bazel stuff without pulling in third-party libs it's pretty tedious to get a project with a good amount of dependencies working.


Why would you not pull in third-party libs? I presume you use loads of them in the actual codebase; how is this different?


Thank you.

> repo=project

This has gotten better lately. At least, good enough for small-medium scale projects.


>Can you explain how a monorepo for microservices is a tooling nightmare and bandwidth sink?

bandwidth:

I meant "bandwidth" as literal bandwidth. When your codebase becomes huge, your VCS repo size becomes enormous and it becomes harder and harder to keep a full checkout on all the development machines, especially if they are over the WAN (e.g. at home on your laptop).

This has fuelled solutions like sparse checkouts (like MS VFS for Git, now Scalar) and remote development (like TFA, but also Google's Cider and srcfs, etc.).

tooling:

naïve monorepo tooling (which I've seen in various companies I worked for) simply performs a full build of the whole monorepo for each CI execution. At first this is just fine, since you can parallelize builds and call it a day; but after a while the builds just don't scale anymore; flaky tests become an increasing frustration, etc.

The tooling that can help scale large monorepos does exist, but requires buy in and comes with its own learning curve and tradeoffs. One well known such tool is bazel (https://bazel.build) and bazel remote builds and remote caches. These tools are hard to set up, although folks at https://www.buildbuddy.io/ can help smaller startups by offering a managed service.

(Again, I'm talking about really large monorepos. A monorepo which includes a dozen or so modules, for which you can easily perform a full re-build on a single CI worker and on your laptop is not the kind of repo that creates tooling nightmares.)


Thank you. Monorepos of _that_ size are definitely far more complex and come with the issues you mention. Although, I feel like you'll only hit that wall once you are _very_ large as a company, and at that point you'll also have the resources to climb it.

For small-medium scale, monorepo has been a blessing after dealing with multi-repo systems for years.


I'm a fan of monorepos personally.

That said, I cannot pretend I don't see the problems with suboptimal tooling working in a medium size codebase. "big" and "medium" and "small" are quite subjective things.

At $work we have a monorepo whose git repo grew to 1GB in size and where the CI turn-around is so high, and the glitches so often, that it often takes hours or even days to land some code to prod. Developers instinctively react by making bigger and bigger changes because the very thought of going through PR/review/CI/merge cycle once again terrifies them. It's all compounded by a security policy that forces a code review approval every time the source code changed, including when you have to apply fixes to build failures induced by a component you don't own.

All of these things can and should be fixed. But this is work, and it is not urgent work, so it's not often done at the same pace as other stuff. This induces fatigue in the team and, as these things usually go, people tend to blame the easiest thing to blame: the monorepo.

That's why I try to phrase the problem to be a tooling problem and not a monorepo problem. Clearly if you don't have a problem with your monorepo, you either already have good tooling, or you don't need good tooling.


You don't need a monorepo to have integration tests. You can very well have a CI that clones a bunch of stuff and builds and tests them all together.

The issue that monorepo is solving is regarding write operations: FOO depends on BAR, which depends on BAZ. If you need to change BAZ to develop what you want on FOO and they're all on multiple repos, you'd have to pull request your way from the bottom to the top of the dependency graph. This is what causes the friction that monorepos avoid, and this is the hard part of such workflow to automate with multiple repos.


That's exactly how I understand the benefits of monorepo and it seems like a terrible idea.

You might spend one week building the next version of your component and 3 months updating all of the dependents. Then do it again. And again. You are working at a fraction of your productivity. I wonder if that's why Google needs thousands of engineers.

Whereas I just updated stripe from 2.x.x to 5.x.x in one of the projects I'm working on, because new version has features that I needed. I never wasted time updating to other versions until I had a need to.

It also limits your ability to break backwards compatibility in new versions, because we all of course are coming up with great designs right from the beginning.

I get the security and performance benefits of keeping all dependencies up to date, but man, the time sink and limitations seem so not worth it.


> you start creating a complicated network of soft dependencies between those modules. And when you have everything in a single repository, it's hard NOT to do that.

That is what visibility [1] in Bazel solves. You can't import other people's code unless they say you can by making their code visible to yours.

[1] https://bazel.build/concepts/visibility


We are moving to such cloud environment, and it makes me sick.

Maybe you need to dockerise Mongo, MySQL and 5 other dependencies - I can get this, but I don't get why the rest of the code should still be running in the cloud. Python, Rails, Node? Why? Developers should be able to run one shell command to install Node.

Dev experience excuses are just excuses for a bad setup. So, fix your setup, please.

Not being able to run the project natively is a big red flag for me, and when I move jobs it will be my first question.


> Why?

Three reasons from my perspective:

1) There are no setup steps. You just open your editor of choice and everything is set up for you. All the build tools, linters, specific versions of software $XYZ, etc.

2) Large VM (16 core, 96GB of RAM in my case) speeds builds and tests up dramatically.

3) Zero productivity lost if your laptop breaks. Just grab a new one from IT and you're up and running exactly where you left off with zero effort.

> Not being able to run the project natively

What do you mean by this? It's just running on a remote server rather than your laptop.


I don't see #1 happening. There is no way anyone has my personally customized development setup in some VM ready to use. More likely it is going to be one common setup that they want everyone to use.


Why are you speaking as if this is some weird future state that doesn't exist and has lots of unknown downsides? This is the current state in some companies. I work in this flow daily and really appreciate the value it delivers.


I haven't experienced it yet, only read about it in various places. So it only exists as a future possibility for me at this point and as speculation about how the various places I've worked or know well for other reasons would use it.


> my personally customized development setup

Why would your personally customized development setup even need to be on the remote host? You pop up your IDE, connect to the remote host via SSH and you are set.


Also, most "personally customized" environments can be copied around with one config file or a few dotfiles at this point.
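
E.g. a minimal sketch of that approach in Python (the repo path and file names are just placeholders):

  import os
  from pathlib import Path

  # Symlink dotfiles from a checked-out repo into $HOME so a fresh
  # machine or devpod picks up your usual config in one step.
  dotfiles_repo = Path.home() / "dotfiles"            # e.g. a git clone
  files = [".vimrc", ".gitconfig", ".tmux.conf"]      # whatever you carry around

  for name in files:
      src = dotfiles_repo / name
      dst = Path.home() / name
      if dst.is_symlink() or dst.exists():
          dst.unlink()                                # replace any stale copy
      os.symlink(src, dst)
      print(f"linked {dst} -> {src}")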


My IDE is designed to run on the remote system and be accessed via SSH, so it needs its configs synced over or packaged up in some way. It is a bit more work to set up, but pays dividends in many ways.


For things like github codespaces, you can bring your own dotfiles to company codespaces. https://docs.github.com/en/codespaces/customizing-your-codes...


Ever seen Google Cloud Shell? Set up your own local install of software and configuration; it persists on disk.


It's not for you, it's for every new hire. You can always build on top of the base image, but it sets a nice lower bound and gets people started fast.


Problem is, people who put together these sorts of systems tend to think theirs is best and everyone should use it. For example, I once interviewed at a place where every developer got an identical Mac with the dev environment already set up, and you couldn't use something else because they thought it better if everyone had the same setup to enable easier pairing. I could really see something similar happening with these systems... seems like the kind of setup that would have appealed to that company I interviewed at.


Setting up a working dev environment can take days to a week+ in my experience. Onboarding can take a while, so getting a VM image or whatever that works out of the box that you can build on is a really good thing.

It's great too because if your own environment breaks, you can compare to a working one to fix it.

Just because a company might abuse it doesn't mean we should avoid it.


I have no problem with the idea of having a pre-existing image to kick off your dev setup. Just more that, human nature being what it is, it is unlikely to stop there. It will vary from company to company of course and I think my current company would adopt it in a good way but I also can think of one or two previous employers who would politicize and abuse the idea.

I.e., it's no slam dunk. Like any other tool of this sort, it will be used for good and ill.


Yeah, as ops gets more complex I don't want to tell developers they need to install xyz with a hundred specific versions. Just grab the latest app image.


For work-related stuff: I disagree. We're not even at Uber's scale at my company and I hate having to manage the ever-changing set of dependencies that I have no control over.


The thing is, it doesn’t have to be ever-changing, and creating a reproducible working setup locally can be achieved by appropriate setup scripts.


One of the reasons we developed devpods was because our setup scripts were unmanageable.

Instructions to new employees would say things like "run this thing, scroll up past dozens of pages of stdout noise and manually deal w/ the errors buried therein by looking up relevant FAQs in some doc somewhere"

The scripts would touch every technology imaginable, from brew to npm to arc (phabricator's cli) to proprietary tools and no single person understands how setup scripts work in their entirety.

One exercise we'd get new employees to run through was to get them to brainstorm about how some system ought to work. The lesson was that just about any idea they could come up with would have already been tried (and failed).

I'm told that devpods aren't even the first time we tried cloud dev envs. Presumably lots of lessons were learned from previous attempts at improving dev envs.


> run this thing, scroll up past dozens of pages of stdout noise and manually deal w/ the errors buried therein by looking up relevant FAQs in some doc somewhere

Okay, but if you can define/script your environment enough to run in a pod, couldn't you just run that locally? You already have to solve the manual steps either way...


Re: if a computer can run scripts in a pod, can’t you just run them locally?

It’s a construct of the order of operations.

One core issue with local is the variety of OSes and local build tools that would fundamentally mess with the centralized scripts. Getting company-wide setup scripts to work on top of existing laptop config was a continuous challenge. Hence, having a consistent baseline (OS flavor, system-level packages), on top of which the company-wide "setup script" is applied, followed by developer customizations, seems to work great.

Central teams can manage the first couple of steps and individual user-specific configuration can be managed much better in a decentralized manner.


It's the same convo as docker: can't one just use setup scripts for reproducible envs? In theory yes, but in practice, theory and practice aren't the same thing.

It's easier to apply best practices to a greenfield project written in a modern language (hence devpods) than trying to comb through a decade+ of tech debt written in bash.


Long before devpods (circa 2014) we had boxer, which was a command line tool that abstracted away the nitty gritty details of setting up a dev environment by orchestrating calls to AWS, vagrant, puppet, rsync, etc for you. However this was also a time when there were only a handful of services you would need to run, and because remote editing tools at the time weren’t as nice as they are now, it only worked really well if your preferred editor was vim/emacs so you didn’t have to deal with the (occasionally buggy) remote file syncing.

The devpod flow is a lot smoother. I had my laptop replaced recently and was up and running again in an amount of time that felt like cheating.


Setup scripts that mutate an operating system are fragile. Break something? Unless you understand how the scripts work (which will become stupidly complex at a scale like Uber’s) you will have to reinstall your machine. Or you will have to staff a ton of support to help users when they do break their configuration.

Spinning up a VM with an image containing all the development tools is a much smoother experience most of the time. The only reason why I don’t use it where I work is because I use vim and network adds too much latency for me.


That's a false dichotomy. Just because you don't have a VM doesn't mean that the alternative is a build environment and setup that mutates an operating system.


Using a VM is fine. One can use it locally.


In practice setup scripts are brittle. New person joins the team and their script fails because it turns out everyone else’s dev environment was only working due to something left behind by some older script. Hotfix requires checking out an old branch but now I need to run an old script but my setup is from the future - the old script doesn’t know what needs to be un-done to get the system to a consistent state. And what about data? My old data has gone, the data I have is from the future. Never mind simple stuff like the script author assuming node is in /usr/bin but that’s not true for me because I use nvm.


We’ve had good luck using nix for this. Same dependencies for the local dev environment as for the built containers. Same config. All deterministic. And not just major deps like programming languages, but sed, bash, grep, and all the other shell tools, so no more worrying about people running scripts on mac’s ancient version of bash.


I'm sorry, but this is the task of proper configuration management. Yes, don't depend on local stuff that isn't configuration-managed. Don't have a workflow where you check out an old branch in a newer environment. Of course you need a way to establish the older environment in that case. I'm assuming that Devpod does a similar thing on the server side. My point is, the ability to reproduce a working setup doesn't imply a requirement of having to work remotely.


> I'm assuming that Devpod does a similar thing on the server side

That's why they use devpods. You move the entire configuration to the cloud. There is absolutely no reason why it should run in your local environment.


But now you're saying "we shouldn't change dependencies". PyPy turns out to work nicer than CPython? Now you have a rollout project for your team. Swapping out some libraries? Another rollout project.

Now, it's great if you can avoid complex setups in general cuz complex is harder no matter what! But if you're starting from a complex setup, having easy ways to roll out changes is an important step in actually doing the simplification work to get to where you want to be!


Easy in theory, but always breaks down for a non-trivial project with >5 developers on it.


> it doesn’t have to be ever-changing

It IS ever-changing. One library among the zillions of local dependencies that you need to build something changes, and you have to go through dependency hell.

If the entire software world valued backwards compatibility and vigilantly guarded it, that wouldn't be a problem. But in the package hell that we are living in today, every other day a package update brings some incompatibility or breaking change for this or that other thing.


I'm 100% the opposite side of this argument. Running your entire stack locally is a silly trend that cost us a decade of productivity. In the early to late 2000's the remote-dev approach was very common. It wasn't "push a button and you have a dev instance!" easy but it yielded similar results.


I'm slowly getting to the point of "I want my computers to be a thin client around some config files". Treat your computer like cattle, rather than a pet, etc etc.


Same. I'm considering using a flatpak for this as it seems to have an interesting feature set to build a pre-setup development environment on top of. Just want my shell setup, tooling configs and a few other things bundled up in an easy to use fashion.

Currently I just have a git repo with my setup mostly in it (sans executables) with a way to get it going on a new machine. It works but is rather hackish and requires a bit of work to keep in sync.


[Obligatory Nix stanning here]


oblig. advice to wipe / on every boot: https://grahamc.com/blog/erase-your-darlings


Great idea! Also, restore from backup on every boot. The trouble is, I never reboot.


Unless your dev environment literally took one decade to set-up, then it didn't take "a decade of productivity" ;)

For people who knew how to run vms or even chroots, this was not a big issue


It's one thing to "develop" in vim over a slow ssh session or VNC. The latency makes that very annoying.

But we're at the point where we can still have a lot of analysis on the machine, and offload the slow analysis and building onto a more powerful computer, so that remote development is faster even with latency. And also getting to the point where internet connectivity is really fast so latency is low.

Like with most systems, the important part of remote development is that it's done well. And it seems like most employees are comfortable with Uber's setup.

Although, I do have to say that most companies really shouldn't use remote development; only if they have some excuse, like being Uber-sized, do the benefits outweigh the costs. I've done remote work at uni, and their servers and integrations aren't nearly as good, so it's a chore; it's faster to use Mutagen and just develop locally, then build/deploy remotely.


For me it's about the hardware. I do molecular simulations and I have a machine with 96 cores, 512GB of RAM, and 4 GPUs in it. I wouldn't want this noisy machine in my house, so using VSCode with a remote server allows for everything I need to do: interactive debugging just like it's local, automatic SSH so the terminal feels like it's local, and only a slight (few hundred millisecond) delay on saving.

It really feels like it’s local, I enjoy it


Also, It just passes the problem down the line. If your dependencies are too messy to manage locally as a dev then they're gonna be an ops nightmare too. Just commit to maintaining a flake.nix and use whatever computer you want.


It's a lot easier to manage dev envs if everyone is literally using cloned EC2 instances that can be remotely fixed/managed by devexp teams. These types of projects are a big step forward IMO.


That's definitely one way to do it. I'm a founder at a remote dev infra startup (usenimbus.com) and this was the model we started with, because it was just so simple. But we quickly learned there's no one-size-fits-all solution, so we expanded into supporting containers, Terraform, etc.


> why the rest of the code should still be running in the cloud. Python, Rails, Node? Why? Developers should be able to run one shell command to install Node.

Because making local environments exactly identical to the actual production environment is nigh on impossible. There will still be minor differences. And a crap ton of work will go into however that local development environment is maintained.

If engineers are maintaining it themselves, each of them will literally waste time maintaining the local dependencies needed for the local environment - frequently hitting blockers because the package management hell we live in these days breaks one thing or another. If you have 100 engineers, as an example, your organization will lose 100 man-hours each month to such local development environment issues.

If you go the route of having infra or dev experience teams etc. maintain them, then that team will be spending its effort keeping the scripts, and whatever else is used to maintain the local environments on the engineers' computers, up to date and working.

Instead, that infra team can just stand up dev versions of their infra/cluster/whatever, give the engineers access to that environment through a VPN, etc., and voila - you instantly remove a lot of those lost man-hours.

Moreover, you will never encounter totally unexpected bugs or performance problems arising from unforeseen incompatibilities between the engineers' local environments and the actual prod environment.

> So, fix your setup

Life is not long enough for hundreds of engineers to spend their time fixing totally unnecessary package management conflicts created by the utterly insufferable package and dependency hell that we are living in today. If you like suffering through that dependency hell, good for you. Most of us prefer to ship code and make things happen.


I think you're overlooking the beginning of this article, where they specifically call out that the use case is moving towards their monorepo(s). Do you really think all of Uber's source could be built and rebuilt continuously on one developer laptop while keeping pace with organization-wide releases? It's one thing to build a submodule or a few modules locally, but they're specifically talking about the efficiencies gained through caching ALL build artifacts and source code, and leveraging data locality to place an otherwise inordinate amount of compute power in the hands of developers to do as they please.


How do you run a job that requires 64GB of RAM and dies when you run out of memory? Or is very CPU intense and takes 3 days on your laptop? If you test it in CI, how long is the dev cycle waiting for your CI to run? What if every dev ends up with slightly different local env that creates different build deps, that turn into test differences? How do you connect the cloud stuff to the non-cloud stuff without running into yet more complexity trying to connect them together?


Same here. We have a similar setup as described in this article. But they also issued me a 16-core workstation with 64 GB of RAM! Like just let me use my hardware goddammit.


In a large org those local resources are allocated for running Tanium.


This sounds great. Certainly the IP zealots love it because the code never leaves the walled garden, and it’s a super-secure blah blah environment. Consistent tooling, easy onboarding… you get the idea.

All good until you have an outage. Then your development team’s productivity drops to exactly zero while you fix it and your entire production environment is now potentially vulnerable to defects you can’t fix until the development environment is fixed (better hope it stays running). This is when companies realize that the development environment is actually a service and it needs higher SLA targets than production, but it will never get the attention that it needs to achieve those targets (because it’s just dev, right?).


Every service has the risk of introducing a new point of failure, of course.

But compare dev productivity lost due to service downtime with that of each new SWE in your org burning time to A) set up their own unique snowflake of an environment and B) futz with and debug it when it breaks or there's a software update.


> burning time to set up their own unique snowflake of an environment

Anyone good is doing that even if you tell him not to. One size does not fit all.


You already have the same concern with source control, code review, continuous integration, continuous deployment, etc.

Dev servers are arguably much simpler to provide without issues.


But I don’t really have those concerns—they’re better mitigated. If source control goes down, I have a complete clone on my machine and so do my coworkers (yay git). I have done code reviews by pasting `git am` formatted patches between coworkers during particularly long GitHub outages. CI/CD is more problematic, but I only need those things to integrate work. I can still do work and verify correctness with tests that I can run locally.

None of that is possible in a centrally-managed dev VM setup. When the VMs go down, you send your dev team home until it’s fixed. You’re still paying their salaries while you pay another team to make them productive again.


Depends on the scale you're working with. If your complete code can fit locally, you're working at a different scale. Huge monorepos won't fit on a laptop-sized local disk these days, and a full local dev env would take a half dozen VMs to even replicate half-successfully.


We’re talking generally here about all software development, so of course we can find exceptional cases to refute anything anyone says. Let’s scope the discussion to what is likely a common developer scenario where `git clone` works and it’s possible to run a reasonable facsimile of the application locally (maybe with some simulators for cloud services).


The article we're discussing is from Uber, which has a huge monorepo. It's exactly the use case they designed Devpod for.


> Huge monorepos won't fit on a laptop-sized local disk these days

Reason 743 why monorepos create more problems than they solve


Nothing is stopping developers from running the docker image locally.


Not sure what's done at Uber but I imagine HA is a requirement.

There are mitigations, and a couple of hours of dev downtime is IMO not the end of the world. Sure, a prod incident could overlap with a devpod incident, but it sounds like they still have the choice to do local dev.


> All good until you have an outage

Better to have hundreds, or * gasp * thousands of engineers having to fix local package dependency problems on their computers every other day. Much better than those engineers having to keep that gigantic context loaded in their mind so that they fix those local problems.

That effort spent for maintaining local environments and navigating the package management hell is effort not being spent on creating actual code.


That time is spread randomly over thousands of devs and not lost all at once during a sev1


That time is spread randomly over thousands of devs, and each of them encounters similar package dependency issues every other day. It's not 1 in 1000 devs every day. Everyone will need to solve it locally themselves.


> higher SLA targets than production

Lost revenue is usually more impactful than the occasional lost developer time.


Can someone explain the rationale behind switching to a monorepo? I just don't get it.

Does it mean also having a single unified production environment build?

I have been managing a stack composed of 500+ repositories, communicating through web services, files and ABIs, for many years now, and never quite hit many of the issues cited as reasons to switch to monorepos.

Having multiple small independent environments for each deployed service is a feature to me. It reduces the surface of bugs and regressions introduced by new dependencies.

Not having to update dependencies globally has been a feature as well. It allows us to prioritize which environments to migrate first. I found that the burden of big-bang dependency updates is the #1 reason for _not_ updating a dependency, while allowing per-service dependency migration ensures we can be fast to update the most important and supported services.

Switching between repositories has never really been an issue to me. I found that if the structuring of projects in repositories make sense, rarely do you have to work across more than a few of them at the same time.

Each repository is its own package, with its own dependencies. Features that cross repository boundaries are much less frequent than isolated ones, and updating dependencies to other projects is part of each project's PR anyway.

I never tried monorepo because I never quite felt the need to. To me it seemed to be a step backward to end up with a megafat repository, where individual service history would be lost in the overall monorepo history. The deployment seems like a nightmare too, having to update the whole stack at once because you then have no idea which individual service changed between releases.

Not to mention I really don't want people to spend time migrating project X - that does its work perfectly without issue since 5 years - to the latest version of LibFooBar just because project Y wants it.

What am I missing?


I maintain the web monorepo at Uber, so I think I can give some context.

Monorepos allow us to centralize important dependency upgrades. E.g. fixing log4j vulns is a lot easier when you can patch everything simultaneously. Same for tzdata (2022g gave very little heads-up). Auditing for npm supply-chain attacks was a lot simpler in a monorepo than in microrepos. Etc.

Monolithic version control doesn't have to mean monolithic everything.

Our web projects can be deployed independently of each other, and we leverage tools like yarn workspace focus and Bazel for granular installs and builds/tests/etc.

It doesn't have to mean monoversions either. We support multiple versions of libraries, though we prefer coalescing them as much as possible to facilitate effort centralization. Finding out that your library change will break downstreams before you land the change is a feature.

We had microrepos before and the main problem is that to this day I still get some random team coming to me for help w/ some rediscovered 7 year old repo that doesn't even build anymore cus lockfiles weren't a thing back then.

At a large enough org, you'll inevitably see the full spectrum of team quality, from the really good teams to the one intern/contractor getting thrown into the deep end of some unloved ancient thing. You want a common denominator that lets you do things like patch vulns in unstaffed projects.

I've done fairly large migrations both to and from monorepos. Each has pros and cons. For us and companies like Google, monorepos work well with our organization model. For others it may not.


Thanks for the insights.

> E.g. fixing log4j vulns is a lot easier when you can patch everything simultaneously.

This is true, though the refactoring can also be distributed and carried out via automated pull request creation across multiple repos. You will still deploy the changes over a period of time with a degree of parallelism.

> We had microrepos before and the main problem is that to this day I still get some random team coming to me for help w/ some rediscovered 7 year old repo that doesn't even build anymore cus lockfiles weren't a thing back then.

You can solve this by forcing a CI build for every repo to be run periodically.

> For us and companies like Google, monorepos work well with our organization model.

What are the concrete aspects of the organization model that make both orgs favor monorepos?


Upgrades come in a spectrum. Some are able to automerge via Dependabot or similar tools; some need minor codemodding and can be managed by tools like Sourcegraph's new offering. Some take months of dedicated effort and digging through the ripple effects across a web of ecosystem libraries (React comes to mind).

The problem monorepos attempt to solve isn't a technology problem, it's more of a people problem. For example, say your cron CI job fails. Then what? Someone needs to look at it.

It's easier for someone to fix things they currently have context for (e.g. if I upgrade Python or update some security-related config and something breaks, I can reason it was my change that broke it), vs an unsuspecting contractor getting around to some backlog task 6 months after the fact with no context.

Organizationally, we can shard tasks to match areas of expertise. We only need one Node.js expert, one tzdata expert, one JRE expert, etc, to upgrade each of these, instead of everyone needing to obtain above average familiarity w/ obscure FFI bullshit or whatever in each technology.


Thanks for your answer.

How does CI work in practice? How do you avoid rebuilding and retesting the whole repo at every change? That would be an insane waste of resources and time.


You use a build system. There are a number of open source ones, we use Bazel. In a nutshell, it keeps track of dependency graphs, so you can test only affected projects.

It can also cache execution of unchanged transitive steps, so you can skip builds/tests that were already run previously (e.g. you could skip most of a large second CI run if all you did in a code review was edit one file).

You can also parallelize execution across cloud nodes.
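
Roughly this idea, sketched in plain Python rather than actual Bazel (the package names are made up):

  # Toy "affected targets": edges point from a package to what it depends on.
  deps = {
      "app":      {"lib_ui", "lib_net"},
      "lib_ui":   {"lib_core"},
      "lib_net":  {"lib_core"},
      "lib_core": set(),
      "tools":    set(),
  }

  def affected(changed, deps):
      # Invert the graph, then walk upwards from the changed packages.
      rdeps = {pkg: set() for pkg in deps}
      for pkg, ds in deps.items():
          for d in ds:
              rdeps[d].add(pkg)
      todo, seen = list(changed), set(changed)
      while todo:
          for parent in rdeps[todo.pop()]:
              if parent not in seen:
                  seen.add(parent)
                  todo.append(parent)
      return seen

  # Changing lib_core requires retesting lib_ui, lib_net and app, but not tools.
  print(affected({"lib_core"}, deps))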


Monorepos make the dev experience simpler, at the expense of more complex CI.

I was resistant at first, but have found it to be a worthwhile trade-off


By only testing and building packages that have a dependency on the changed package.


And how do you determine that dependency structure easily? What tooling helps with this?


In 7 years someone will discover a random subdirectory that doesn't build because wormholes weren't a thing back then ;)

Monorepos solve none of the problems you listed. You are just basking in the short-lived light after a big refactoring.

The only thing a monorepo does is make it easier to update some shared code and have all code which uses it run tests before pushing the new lib version. With many repos ("micro repos" is false speech to justify monorepos) you first publish the shared lib and then find out about downstream failures.

Anyone using a monorepo in a way where you can't build a small piece locally is doing it very wrong.


A monorepo is indeed not necessarily the one true best way for everyone. But it is _a_ way that works well for certain large corps, such as Google, that can dedicate teams to solve challenges that indeed come with it.

Some advantages off the top of my head:

- Let's say I am getting an error in production, and it was built from version X of the repo. In 3 seconds I am navigating the source tree of all of my tens of thousands of dependencies at that exact version.

- While developing, doing experiments, or reproing a bug, I can trivially make temporary changes to any dependency. It's a zero effort thing, so I often jump into any dependency without hesitating. For example for adding some extra logging.

- Step debugging into code from any library dependency is trivial.

- Making changes to a library let me use the build and test system to find if it breaks any user. Because the build and test system is completely consistent across the monorepo, I can easily dig into any user's code, fix it, and run their tests with my changed library.

The "Software Engineering At Google" book page discusses more pros and cons: https://abseil.io/resources/swe-book/html/ch16.html#version_...


At large companies with developer experience teams, it can be nice. You setup a singular repo (or even several), use a cross-language and cross-platform build system, and you can get a lot of gains. You can make sweeping code mods too and update your customer’s libraries for them.

It makes working in all the languages mostly consistent, and provides a nice platform for optimizations like only testing code which changed.

I think the mistake is smaller companies adopting it without understanding the large amount of investment it requires. If you have a mono-repo mostly in one language, and you hire someone who is going to work in another language then you could be in for a world of pain. Unless you resource the language support.


>Does it mean also having a single unified production environment build?

No, the way you store the source code for projects is independent of how you ship them.

>Not having to update dependencies globally has been a feature as well.

For a large upgrade where a bot can't fix what will break, it is typical to introduce the new version and have both versions in the monorepo at the same time. You still migrate each project one at a time and eventually remove the old version. Of course, you could also never finish migrating and end up having to support both versions indefinitely.

>I found that if the structuring of projects in repositories make sense

I don't see how that makes any less sense than putting projects into folders.

>The deployment seems like a nightmare too, having to update the whole stack at once because you then have no idea which individual service changed between releases.

If your landing page is edited there is no reason that you should be deploying a new version of your mobile app. Deploying everything on every commit is a tooling issue.

>Not to mention I really don't want people to spend time migrating project X - that does its work perfectly without issue since 5 years - to the latest version of LibFooBar just because project Y wants it.

Then pay the cost of supporting two versions of LibFooBar in your monorepo.

You can completely emulate what you do with multirepo in a monorepo. A monorepo gives you extra things, like a single revision that lets you see the state of everything from when a build was made, and the ability to depend on the latest version of libraries without having to constantly bump them, either manually or via bots.


It's much more of a political tool than a technical one.

Every benefit (minus atomic commits*) can be had immediately on micro-repos with a for-loop to do the monorepo-thing in each one. If bazel is what you want, great, use bazel! In microrepos! Want consistent dependencies? Enforce them! HEAD must build? Wonderful, nothing's stopping you from doing that! It's all solvable, and quite easily.
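
For illustration, that for-loop really can be a couple of lines of shell (the repo layout under ~/src and the use of bazel as the build command are placeholders, not a prescription):

    # Treat a directory of sibling checkouts as one "virtual monorepo"
    for repo in ~/src/*/; do
        (cd "$repo" && git pull --ff-only && bazel test //...)
    done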

What you actually want is reliability and consistency. A monorepo gives you one political entity, with a clear adoption path, to argue with when enforcing those kinds of requirements. So they're much more likely to actually be achieved. In a big company, that may be worthwhile... but oh boy are the downsides large, and the only way to deal with them is massive eng effort and money.

* Atomic commits are a completely false promise. Your code on multiple machines doesn't change execution atomically, and being able to split a breaking change is a good thing, and massively harder to support in monorepos - it lets you adopt changes gradually, rather than forcing it on everything at once. You know, the same thing that every safe-change-practice handbook says you should do. Except in monorepos apparently.


Monorepo lets me have a _single_ Github PR with the changes across 4 projects in front of me. To replicate what you're talking about in a multirepo setup, you have to introduce changesets in 4 different places, then merge them in the right order, make a bunch of tiny releases, update commit hashes to point to the right one, and do a lot of other bookkeeping because in this house we keep our projects separated.


Indeed, the small additional bookkeeping is the price to pay. I'm quite used to it, so I don't even see it anymore.

Is that all? Are those 5 saved minutes of bookkeeping the killer feature of monorepos?

Because, as far as I see it, there's an insane amount of engineering to make a monorepo work even at small scale. Are these 5 minutes per multi-project PR worth it?

I wouldn't be surprised if those 5 minutes were more than offset by the additional hours of CI testing time introduced by having to run tests on the whole monorepo at each commit instead of just the project that changed.


> hours of CI testing time introduced by having to run tests on the whole monorepo at each commit instead of just the project that changed.

This is rarely the case. Most monorepos shard tests such that wall time is minimised and have dependency analysis that only runs tests affected by the changed code.

More of a problem is that IDEs and LSPs often don't deal well with having to index and navigate very large codebases.

I've maintained both monorepo and polyrepo environments and there are pros and cons to both, and they vary _wildly_ based on the language being used.


Yeah. They reduce the time spent on rare (but relatively valuable!) events, and massively increase time spent on all the little things you do dozens or hundreds of times per day. It adds up quite badly for everyone except the monorepo managing teams.


5 minutes? Bit of an underestimation.

Also I used bazel in my previous monorepo setup… so not really paying a huge CI cost.


How many people are in your organization?


People seem to conflate a monorepo with having everything else the same as well, just because that's how Google and other BigCos do it. You don't have to.

At some level a monorepo is just a way to stick all your code in one giant directory and manage it all under one VCS repository.

You could still do separate build tools per project, separate vendoring if you really wanted and so on.

However you may find that being able to simply import other first party code by path instead of doing some cross repo dependency process is a massive win.

Edit: apologies, this was meant to be a reply to the top level comment.


> However you may find that being able to simply import other first party code by path instead of doing some cross repo dependency process is a massive win.

I don't quite get the benefits of "depending by path".

The engineering seems huge, you now have to create a magic meta build and testing system so that only individual components that changed are rebuilt/tested. That seems like a scaling nightmare at best.

Also, dependency management is hardly an issue on most modern stacks anyway. Javascript, rust, python, etc, all have private package hosting tools that are trivial to deploy.


Import by path in a monorepo is an anti-pattern. It is exactly what another GP complained about: turning the monorepo into a monolith held together by soft dependencies.

https://news.ycombinator.com/item?id=34048245


The rationale is that Google, Meta, Microsoft, and other successful companies are running monorepos. Ergo, if you want your company to be successful, you have to employ a monorepo too. It's that simple!


I'm not sure it is.

Some of us have evaluated several approaches based on their merits and tried to make decisions that are best for our specific organization.


I like this workflow: use something like Mutagen to sync from your local environment to the remote machine. I prefer JetBrains IDEs, so I tried JetBrains Gateway, but it's really buggy and slow. So I use JetBrains IDEs on my local machine and VS Code Remote for interacting with the remote machine.

I wonder though, is it worth it to setup a remote dev machine if you have an M1 Mac?

I think you can get good compile times for Rust etc. on an M1. For me, I have an Intel Mac at work and at home, so the remote dev env is better for builds.


Yeah - we (usenimbus.com - a company in this space) have heard the same about JetBrains IDEs. You should check out our extension (or others') because there's been a lot of work put into making performance better.

M1s are a mixed bag for this way of working. Pre-M1, devs were running into local computing power issues. Post-M1, more compatibility and stability issues.


I think there are still some limits applied even on cutting edge hardware. GitHub previously said they hit network limitations cloning their multi-gb repo for instance.


If I am not mistaken, git clone is an atomic operation (it either succeeds or fails).

Since I see it mentioned in many threads here: for a huge repo one can always use `git clone --depth 1` to get it quickly, and later deepen it (e.g. with `git fetch --unshallow`) to retrieve the full history.
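
For example (the repo URL is a placeholder):

    # Grab just the latest commit; the transfer stays small
    git clone --depth 1 https://example.com/org/huge-repo.git
    cd huge-repo

    # Later, convert to a full clone when you actually need the history
    git fetch --unshallow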


I've definitely used the vscode remote ssh functionality before when I've had to use a particular architecture (x86) while I was developing on an M1 mac. In cases like this I already have the dev environment configuration setup for local dev so it's super easy to spin up an EC2 instance and I'm off to the races.

I definitely see the use case, but in saying that I find local development really valuable and default to it when I can. I do however run dev work almost exclusively inside a container so I'm flexible either way. I can see how some might not be.


Most developers at Uber use local builds anyway because of input lag. Devpods were mostly good for keeping the laptop from running at 100% fan volume at all times, though.


Input lag like when typing? I use a remote env in VS Code, and afaik it edits locally and syncs after the fact for lint, compilation, etc.


It was probably just an RDP connection, which either signals how old this insight is, or how far behind Uber is on these new ways to remotely code.


The post clearly shows they're not using RDP.

That said, Projector round-tripped all the keystrokes to get new draw commands (unlike VS Code), which also results in lag. The new IntelliJ remote architecture is much better, and it seems Uber is moving that way too.


They're using this now, but I imagine the user ValleZ's comment is about a past time working at Uber or hearing stories from Uber developer friends, in which case I can see why spinning up a low-latency (but still 30+ millisecond RTT) virtual machine would be an early solution for developers trying to cut down on build times and whatnot.


What's the new way? JetBrains Gateway? Have they caught up to VSCode's devcontainers?


Why is there input lag?

I’ve used something identical and mosh makes this just work. Most devs at that company swear by remote builds and hate laptop builds


Mosh doesn't work with SSH bastions (kind of obviously, but admittedly a bummer indeed), which Uber's blog post shows they use (as do many other similarly-sized companies).



The contrast between the Uber and Google setups would be twofold. Google engineers weren't doing local builds anyway (unless they intentionally inflicted those upon themselves) because of Forge. The Forge builders already had 10000 CPU cores in 2010[1]. You can probably imagine how many cores Forge or its equivalent has today. Secondly Google doesn't suffer from the performance problems of a weird freeware VCS, because they don't use git or the git workflow model at all. They built their own VCS based on the workflow model from Perforce.

One funny aspect of Google moving development into cloud machines was that a decade before, the process ran the other way. The desktop issued to most engineers was a production websearch machine turned sideways and stuck under your desk.

1: https://youtu.be/b52aXZ2yi08?t=1118


Google contains multitudes. Android, Chrome, and ChromeOS are examples of large Google projects that do not use Perforce and are routinely built locally.


Chrome, at least, is typically built in the cloud (I don't know about Android, though it would be weird for it to be the only major project that doesn't do this). It's not just that you can have beefier machines in the cloud (though that's a factor); it's also that you can have a shared cache of build artifacts for all your users, which makes most workflows a lot faster. See documentation: https://chromium.googlesource.com/infra/goma/client/


Local-ish. (At least) Chromium is built by Googlers with the aid of goma, basically a distcc[1].

[1] https://chromium.googlesource.com/chromium/src/+/778a7e84f65...


I always dream about this setup, but an unreliable network prevents it. mosh and VS Code Remote partially solved the problem of latency. There is still the problem of an unavailable network - when you are in flight, for example.

Nix solved my problem by and large for well-modularized projects, because Nix can provide a nearly identical dev environment in practice.


I'm always thinking about the use case of writing code at an airport or on a plane or somewhere remote with limited (or no) reliable internet connection. Or maybe I should just take a vacation :)


There's at least one good commercial solution that basically does the same thing the blog describes [1]. Working well for us so far.

[1] https://www.gitpod.io


For some reason this link redirects to https://www.uber.com/es-CO/blog/devpod-improving-developer-p... which results in a 404.

This URL works without redirects: https://www.uber.com/en-US/blog/devpod-improving-developer-p...


At my current startup we are doing something similar. Perhaps as inspiration for others without the huge budgets.

From a very high level it works as follows:

- developer logs in to AWS cli

- executes: dev/01-start-env.sh

- prepare infra services: dev/02-base-platform.sh

- code can be changed and run locally. But if they need to test in the bigger system: dev/03-deploy-code.sh

There is dev/99-delete-env.sh

An environment is a personal EC2 spot instance that automatically shuts down if there's no developer activity for over an hour.
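
A rough sketch of how such an idle check could be wired up (purely illustrative, not the poster's actual implementation; the one-hour threshold and the use of login sessions as the activity signal are assumptions):

    #!/usr/bin/env bash
    # idle-shutdown.sh -- run from cron every few minutes on the dev instance.
    # Powers the box off after roughly an hour with no active login session.
    STAMP=/var/run/dev-last-active
    IDLE_LIMIT=3600   # seconds

    if who | grep -q .; then
        touch "$STAMP"            # someone is connected: refresh the activity stamp
    else
        last=$(stat -c %Y "$STAMP" 2>/dev/null || echo 0)
        if [ $(( $(date +%s) - last )) -ge "$IDLE_LIMIT" ]; then
            shutdown -h now       # stopping the instance stops compute billing
        fi
    fi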

The idea is that all developers work (program/code) locally as much as possible, but deploy changes to their own private environment on a much heavier VM that runs all services in containers.

Also there is a dev/tests.sh that executes the exact same test cases as in continuous integration. In fact, we try to bring all checks enforced during CI into the developers' semi-remote dev environments for quick feedback.


We already have something like this at Meta for improving developer experience. All our dev environments are remote


Any externally visible docs on it? Google's is documented at https://cloud.google.com/blog/topics/developers-practitioner...


I don't think it's been discussed externally yet. It's called on-demand (some references to it here - https://developers.facebook.com/blog/post/2022/11/15/meta-de...).


Fascinating, thanks!


You will develop in the pod. You will not own a capable laptop.


You will fix the bugs?


This isn’t entirely honest, as Uber doesn’t have one monorepo, it has multiple. They never understood what Mono means lol.


The joke is that we love them so much that we have many of them.

In seriousness, we historically used microrepos and just getting to language-specific monorepos was already a monumental effort.


I was on the first ever team to use or set up monorepos at Uber. We even had a banner lol. In hindsight it was the right idea but a totally wrong implementation. I wish you the best, but IMO dividing the monorepos by language is and was a mistake.


Even for my home lab I use a remote setup. My local workstation/server runs ubuntu with all the bells and whistles (nginx, docker, cuda etc). My dev laptop just has vs code and connects to my local server via wifi 6 &/or gigabit ethernet.

I do not notice that I develop remotely. VS Code also has some great quality-of-life features; for example, if you run a command that exposes a port on your remote machine, it automatically forwards it to your local one. Same for Jupyter notebooks.

VSCode finally enabled the features that command line folks enjoyed for decades.


I also have a remote dev machine and love it. When I do any heavy cpu work my laptop stays nice and cool since it only has a terminal and a browser open.


Poor Australian devs needing to develop against Oregon with 200ms of latency


Having to develop with circa 360ms pings was a contributing factor to me deciding to leave Gitpod. https://ghuntley.com/tea


Wow, that's insane. Glad I decided not to use Gitpod.


fwiw, we're solving this at DevZero by having multiple AZs - given that our platform centrally manages remote compute (hibernation etc.), this is working great at making sure latencies are <20ms for users


It's bad enough with an ssh session to a bastion on the other side of the pond...


Off-topic question: what are 70 million lines of code for? I mean, yeah, there are some services like Uber and Uber Eats, but gosh, 70 million? I have worked on very sophisticated products with a bit more than 1 million LOC... Am I missing something about Uber's apps and services?


Or "4000+ services", and "500+ Web Apps", and the many thousands of devs they employ.

I've wondered for quite a while where all this code and complexity is at. My guess is a lot of it is in their self-driving cars projects and stuff like this, but I don't really know.


Glad to see industry adopting cloud server + ssh + code-server / code-remote as a standard. As a PhD student I have been working on HPCs like this with an environment I built myself for a while. It is really good and helps you to move fast.


It depends on latency. If you have a slow network connection (or are simply far away from the cloud data center) you are in for pain. That being said, running a local environment on a constantly throttling 2019 MacBook Pro that sounds like a jet engine is probably worse.


> The Linux file system, which performs better compared to laptop file systems

... say what?


Probably means "our laptops only run Windows, which infamously has performance issues with git because of different filesystem semantics (something about caching metadata?), but Linux filesystem drivers don't have that problem".


Or they were using Docker on Mac


This is the problem I hope WebAssembly fixes. 2GB of RAM per microservice is nuts. If we can get this back down to 10-100MB, then maybe we can run a company's dev env from a laptop again.


Anyone from Uber here that works on this? Do you ever expect a DevPod for iOS developers? I've been toying with the idea of doing this at work for our devs on Intel MacBooks, using -> https://github.com/sickcodes/Docker-OSX, which enables running macOS locally for iOS development.

I think it's hampered by licensing though, maybe that's why it's not used or mentioned in this post.


We've thought about this quite deeply at DevZero. Yes, the iOS developer experience can be quite hampered in the local env and requires users to frequently get beefy local laptops/machines. I saw this first-hand at Uber when CrowdStrike was rolled out broadly.

We have support for AWS-based Mac VMs on DevZero but we don’t find our customers having their biggest issues related to iOS dev yet (we also target enterprise cos that have a vast diversity of tools, mostly backend and front end)


I worked on this at Uber in the past and you explain very well why there is no iOS flavor for devpods.


The limiting factor here is the IDE experience. You can build all this amazing architecture to provision and serve the source bits of the monorepo, but if the IDE experience is janky, folks will not adopt it. The slow adoption rate speaks to that. If it truly were a productivity boon, we would see a sharp spike to near 90% adoption or more. Instead we see a slow change, which is more akin to a mandate, with all the docs heavily suggesting folks use it.


Both VS Code and IntelliJ have first-class support for remote development, as referenced in the article. I have been using GitHub Codespaces for a while now and you can hardly tell you aren't developing locally. Add to that faster builds without your laptop sounding like a jet engine.


I'm not sure what repo sizes you're developing in, but at big enough numbers, the set of protobuf/thrift definitions, package dependencies, and sheer number of files being looked at brings all the remote products to their knees. IntelliSense and other syntax highlighting choke.

The biggest services/apps at Uber are not developed in devpods. Speculative, but these IDEs were developed as local-first environments; there are lots of assumptions they make that add up to terrible latency.


I was getting a 404 error. Apparently, because the site redirects me to a non-existent localized version of the page. You can bypass this by forcing English in the URL https://www.uber.com/en-US/blog/devpod-improving-developer-p...


Wonder how it handles things like dot files?

Also does it automatically install dependencies? Feel like running npm install each time (or the equivalent) would be slow.
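
In setups I've seen, this is usually handled by baking dependencies into the pod image and running a small bootstrap hook when the pod is created. A hypothetical sketch (the dotfiles repo and install script are placeholders):

    # post-create.sh -- hypothetical hook that runs once when a fresh pod starts

    # Pull in personal dotfiles so the ephemeral machine still feels like yours
    git clone https://github.com/YOURUSER/dotfiles ~/.dotfiles && ~/.dotfiles/install.sh

    # Restore dependencies once, ideally onto a persistent/cached volume,
    # so the cost of "npm install" isn't paid on every connection
    npm ci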


This just seems like a lot of cloud cost to take on, considering how much power developers' laptops carry that would go completely untapped. Instead of having everything remote, it seems more sensible to me to have a distributed development environment. I guess that would look something like what the dagger.io folks are shooting for.


I think the steelman answer is that if this works well, you don't give developers powerful laptops; you give them cheap laptops that are little more than dumb terminals. (Whether that works is left as an exercise for the reader.)


we're not seeing cloud costs be too terrible at DevZero - with proper hibernation/suspension, cloud costs are ~$50-60/mo in the worst case. Admittedly, we target only enterprise companies where the cost of lost dev efficiency is much higher.

Say net cost to company for an engineer is $100k-$200k+. Even a net 10% savings over a year means $10k-$20k+ vs a $600-1k/yr investment (in worst case). Security posture is also significantly improved, which admittedly is harder to assign a $ value to


using remote resources as a part of your local dev flow can be very useful if your local environment is constrained on cpu/ram/gpu/ssd/bandwidth.

this can be as simple as an ephemeral ec2 spot machine that reacts every time files on its filesystem change. it then does stuff, like building and shipping.

your local setup needs to rsync files from local to remote every time you save a file.
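
a tiny sketch of that local side (assumes inotify-tools on linux; the host and paths are placeholders):

    # watch-and-sync.sh -- re-sync the working tree to the remote box on every save
    REMOTE=devbox:/home/me/project   # placeholder host and path

    while inotifywait -r -e modify,create,delete,move --exclude '\.git/' .; do
        rsync -az --delete --exclude '.git/' ./ "$REMOTE"
    done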

i’m on an upload constrained setup right now, and this[1] significantly speeds up my iterations uploading lambda zips.

fancier setups probably are similarly advantageous, but add tradeoffs proportional to their complexity.

1. https://github.com/nathants/aws-gocljs/blob/258ea5bb72d06a50...


And this, ladies and gentlemen, is where monorepos will inevitably take you.

There's actually an elementary litmus test on migrating to monorepos. It is a questionnaire with only one question: is your company Google? If the answer is "no" — you don't need a monorepo.

You are welcome.


These things usually suck when latency is high... even if you are supporting most major regions, it's not unusual to find yourself in a situation where you have a few hundred ms of latency.

Anything over ~40ms is very noticeable when typing and editing.

Then you're screwed.


> Controlled toolchain–pre-curated secure environment

I'm a hacker dammit. I'm supposed to be dangerous if you let me within whistling distance of a payphone.

The idea of slowly becoming useless without an internet connection and access to my devpod makes my skin crawl.


I've been doing remote development over RDP for the past 3 or so years. The one problem I have is that if my internet is having a bad day, the latency makes using something with a UI like IntelliJ unbearable, at which point I just use vim over ssh.


This seems like a massive amount of work and frustration instead of admitting that the monorepo doesn't work and scrapping it.

Plus super frustrating to have somebody preconfigure your environment because they will inevitably get it wrong.


This sounds awesome, but not practical for smaller companies. The cloud costs must be significant? "Faster Git, Build, and IDE experience by utilizing beefy cloud resources–up to 48 cores and 96 GB of RAM per environment"


Correct. My last job built a low-rent version of this and it was very painful. 15-45 minutes to launch a pod that would have taken 3 minutes under Compose. Pushed a new base image? The whole thing would roll over and need another 15-45 minutes.

To do this well takes manpower. There are some off-the-shelf options out there for remote, prod-like Kubernetes-based development that are smarter choices than building your own. The table stakes for doing this yourself are being a big corp that can afford to spend millions on one custom tool with a hope of positive ROI. For the rest of us, find something that works that is open source or easily licensed.


Why would it need to be per user? An environment of that size could easily support many users. Developers spend most of their time just staring and only intermittently need to build. There is no particular reason why everyone would be building at once. Most of the time you'll get all those cores to yourself even though there could be dozens of devs sharing the box.


We’re coming around full circle to the old days of thin client terminals to a beefy mainframe under the corporation’s control. Only it’s hosted in the cloud and all your data is at risk.


You don't have to run it in the cloud. Uber uses its own DCs (although I'm not sure they do that for dev). It'll probably be more expensive because you can't scale everything down for the night (unless you have dev teams spread globally).


Also, a smaller company can probably go a long way just on Codespaces. If GitHub can, 99% of projects out there likely can too.


Do smaller companies have monorepos that size?


This sounds very similar to my first IT Support role setting up thin clients for law firms in London that had VM instances for each role including development, and then a networked file system with version control for code.


so I can't really tell from the text. are they using some remote mode in the IDEs that keeps the editor itself local or are they doing NX type stuff?

also curious about the terminal access. does it maintain state on the remote (like screen/tmux) or are the sessions subject to reset under network cuts.

i'd also be curious about settings since the pods are ephemeral. i suppose people would have scripts to grab their dot files, but some things like say, the android avd tool, can store settings in myriad ways.

that said, looks very cool!


Very curious how, or if, this helps or affects non-developers also working in the monorepo, like tech writers, QA, UX/UI/visual designers, training/enablement, support, etc.


[Putting a note here for awareness. I am debo, a user/contributor to the devpods project at Uber]

I am a founder of DevZero (devzero.io) where we are taking the theme of "remote compute with local tools", but built specifically to serve engineers in enterprise companies - still pretty early, but would love for people to check it out and provide feedback!

How we're looking at the space: IDEs need to stay local, but thankfully VS Code, JetBrains, etc. all now allow connecting to remote VMs and containers - the main issue is around not having enough of your dependencies present. So outside of the standalone dev environment (VM/container), we also let companies "bring their k8s/serverless config" and let each engineer have their own ephemeral full stack to code against.

The standalone environments offer various perf boosts (super simple onboarding, switching project, we're seeing it reduces net time-to-deploy from start to deployed as well) for engineers coding in monoliths, which is true in many large companies still. We're already seeing really good traction here.

For the stuff where we're trying to take the engineer's IDE to an "ephemeral and hermetic env" that is built off of however "production workload management" works: an engineer can connect their local IDE to a remote "devpod" and do all their normal coding activities (w/ the relevant boosts in speed etc). When they want to do some form of end-to-end testing, the engineer can hit downstream pods/serverless stacks etc (their own copy, i.e., not shared tenancy). We're currently figuring out if we can enable engineers to test a full end-to-end call chain and connect live debuggers to arbitrary pods in that call chain -- I think this will make debugging amazing because so far it's been pretty hard to repro end-to-end call chains in "dev-mode". Our platform approach is basically making cloud dev environments (CDEs) even more awesome by putting them within a production-like environment for every dev.

Lots of info here (not shared widely yet, not even on the website). Please let us know if you want to use it, or consider working with us (we're actively hiring).

Re: workload management, we added support for k8s* generally but are now expanding out across the various clouds. For serverless, we started with AWS lambda and now have to tackle the other clouds. Then, we also need to do the default container mgmt for each of the cloud providers. (also looking at hashicorp nomad etc).

*Say you have helm to deploy containers/pods to prod. We look at that (and w/ a little more config re: dbs etc), give every engineer their copy of prod in their namespace alongside a devpod. Similar vibes for other workload mgmt systems.


Genuine question here. I'm hoping you folks have put a lot more thought into it than I have.

So the devpod "production OS" - the top-level devpod - contains ALL running services? At Uber scale that's what, 64 GB of RAM?

If you're already using Kubernetes... why do it this way rather than have Kubernetes namespaces with many containers in a dev cluster? Trivially this is a docker compose stack, right?

Second question - has there been an ROI recovery in terms of laptop hardware for devs? Like, you only need 8 GB RAM laptops and not more.


> at Uber scale it is what 64 GB RAM ?

I don't work there, but I bet their full stack takes more than 64 GiB RAM.

> trivially this is a docker compose stack right ?

In the way that docker is the same thing as Kubernetes, yes. However there are differences that become material when you zoom in a bit closer.

> second question - has there been a ROI recovery in terms of laptop hardware for devs ? like - u only 8 GB ram laptops and not more.

I don't think it's an optimization for dev laptop specs. At the end of the day, it's cheap for Uber to just max out the ram on dev laptops.


Dumb terminal + mainframe over IP?


I was expecting towable spherical thing with solar panels, eco toilet and a hammock


I guess they expect Uber devs to get an extra day or two off annually from their cloud provider being down:

> Unfortunately, here we are limited by the cloud provider availability that’s capped at 99.5%.


That's a big improvement over maintaining non-trivial local environments at least.


TLDR for vscode remote users? Let me be lazy and just get a Y/N over whether it has any benefit over the normal remote vscode UX.


They reimplemented GitHub Codespaces without GitHub


So they have more services than engineers - 4000+ services and 3000 engineers? So some services don't have maintainers?


Uber still has time for this crap? Counting the days till they shut down. Non businesses need to die out.


I’m one of the engineers at Coder and we build internal development platforms for a living. Here’s how our team builds Coder with Coder - https://coder.com/blog/how-our-development-team-shares-one-g...

Over at https://GitHub.com/coder/coder you'll find our source code, btw. Essentially we provision software development environments using Terraform for Linux, Windows, Mac, arm, amd64, and soon FreeBSD…



