Remember that Netflix runs their core product (content distribution) internally on their own CDN. All of the supporting technologies (billing, content discovery, etc.) are on AWS, but the core product is not.
This was a pretty recent change. Netflix historically relied on third-party CDN providers: they launched the streaming service in 2007, and it took them nine years to move 100% of their traffic onto their own CDN infrastructure. That said, they still have one of the largest AWS footprints.
I've always found it interesting that Netflix hasn't at least tried to move off AWS, if only because Amazon has a competing service in Amazon Video. I'm sure it's not an easy problem to solve, but Dropbox has moved off AWS as well; it seems like Netflix would be much better served by their own "Video Cloud" with specialized hardware for streaming and processing video.
Netflix realizes that the value presented by AWS far outweighs the costs. If you feel differently, you should probably ask yourself what information you're missing, rather than just dismissing Netflix and their decision. It's clear that Netflix knows more about operating applications with tens of millions of simultaneous global users than most companies...
I don't think Netflix and Amazon really compete yet. Very few people see the two services as either/or. They're more allies against the cable companies at the moment.
You make a good point that Netflix has a pretty sophisticated system for delivering video (not just serving it from AWS CloudFront). However, IIUC they still use AWS for preparing video files for their CDN.
Dropbox is an example of a company whose core business is very data-intensive, so AWS costs can be considerable.
I'm thinking of services, say the automated robo-built sandwiches, the easy 401(k) service, or the luxury shoes startup (all real YC-accepted companies), where the core business proposition isn't necessarily dependent on having huge infrastructure.
I have services that don't fit on one server, and I still don't need AWS. Distributing load and splitting services across a handful to a few dozen servers is not rocket science, especially with the tools we have available at the moment.
How does one begin to learn about these things? Minimizing the cost of running services sounds super interesting, but as a student I've never had to deal with it and am basically starting with zero knowledge.
I worked through the labs at https://pdos.csail.mit.edu/6.824/ for fun. It's more along the lines of "How can we write a distributed fault-tolerant database?" but you might like it anyway.
Lab 4 is a beast with more lives than <insert metaphor>. The moment you think you've finally written your distributed system correctly, the unit tests will prove that your service fails under some network-partition topology. It's very worthwhile to be forced to think about issues like that and to design distributed systems for correctness.
But to address your question more directly: it's generally just a matter of scaling your service as far as possible on a single server. The server has finite resources (CPU, memory, network, disk), so either you know how your system consumes those resources (because, e.g., you wrote the service and know it uses O(n) memory w.r.t. the workload), or you graph your usage over time and try to predict when you'll exhaust what's available. At that point you can usually think of some straightforward optimizations, which keep everything on a single server. But eventually you might run out of resources with no obvious path to optimize further, so what then?
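To make that prediction step concrete, here's a toy sketch that linearly extrapolates usage growth to estimate when you'd hit capacity; every number in it is invented:

    # Toy sketch: extrapolate when a resource will be exhausted,
    # given (day, usage) samples. All numbers here are made up.
    def days_until_exhausted(samples, capacity):
        (t0, u0), (t1, u1) = samples[0], samples[-1]
        rate = (u1 - u0) / (t1 - t0)  # average growth per day
        if rate <= 0:
            return None  # usage is flat or shrinking; no exhaustion in sight
        return (capacity - u1) / rate

    # e.g. memory in GB sampled over 90 days, on a hypothetical 64 GB box
    print(days_until_exhausted([(0, 20.0), (90, 35.0)], 64.0))  # ~174 days

In practice you'd fit more samples than two, but the idea is the same: know your growth rate, know your ceiling.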
It depends entirely on your service, but typically you can just use off-the-shelf software to scale to multiple servers. For example, you could set up three servers, each running Redis, and have Redis keep a list of "work to be done." Then a central process farms the workload out to the three servers round-robin style.
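A minimal sketch of that setup, assuming the redis-py client; the hostnames, queue name, and handle() function are all hypothetical:

    # Round-robin work queue across three Redis servers, as described above.
    # Assumes redis-py (pip install redis); hostnames are made up.
    import itertools
    import redis

    conns = [redis.Redis(host=h, port=6379)
             for h in ("redis-1.internal", "redis-2.internal", "redis-3.internal")]
    next_conn = itertools.cycle(conns)

    def enqueue(job):
        """Central process: push each job to the next server in rotation."""
        next(next_conn).lpush("work_to_be_done", job)

    def worker_loop(conn):
        """Each worker blocks on its own server's list and handles jobs."""
        while True:
            _key, job = conn.brpop("work_to_be_done")  # blocks until a job arrives
            handle(job)  # hypothetical job handler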
But at that point your service becomes a lot more brittle; e.g., you'll need to set up a failover solution so that your cluster can survive partitions and outages. (Redis provides Sentinel for that; there's a sketch below.) So it's worth keeping everything on a single server for as long as possible, if you can work out the optimizations to do so, since one server is so much simpler. (HN is still running on a single core and serving 350k daily uniques, I believe, which shows just how effective it can be to keep your architecture as simple as possible.)
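For reference, here's roughly what the Sentinel side looks like with redis-py; the sentinel hosts and the "mymaster" service name are assumptions (they come from your Sentinel config, not from the library):

    # Failover-aware clients via Redis Sentinel (redis-py).
    from redis.sentinel import Sentinel

    sentinel = Sentinel([("sentinel-1.internal", 26379),
                         ("sentinel-2.internal", 26379)],
                        socket_timeout=0.5)

    # The client returned by master_for() looks up the current master
    # through Sentinel, so it follows a failover to the promoted replica.
    master = sentinel.master_for("mymaster", socket_timeout=0.5)
    replica = sentinel.slave_for("mymaster", socket_timeout=0.5)

    master.lpush("work_to_be_done", "job-42")
    print(replica.llen("work_to_be_done"))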
Wrt the first link: I'm going to be taking a distributed systems course next semester that (hopefully) covers the same material. Nice to know that what I learn at school is somewhat applicable.
My question is about learning design ideas like having Redis keep a list of "work to be done" and such. Using modern tools to combat 'modern' problems. Is it just something you figure out after learning the fundamentals (i.e., once you can identify a problem, you know what the solution needs to be)?
Yeah, pretty much. I've never actually done it, but I know I could do it if I needed to.
If it sounds mysterious, think of it this way: Imagine you were thrown into a room with a computer, internet, and endless food, and the only way out of the room was to solve this problem. I bet you'd figure it out within a week or two, or a couple months max. (If only to go have a shower.)
One thing to watch out for: Solving modern problems can be pretty unsatisfying. Before you experience what life is like at a modern company, you tend to think of them as functional, orderly, and planned. Real companies are almost universally dysfunctional, disorderly, and haphazard. It's very rare that a plan is conceived, followed, and deployed without growing some warts and throwing in some hacks to get it done.
So I think you should enjoy this time, where you're free to think about thought problems like "What's the most correct and extensible way to solve this?" instead of being forced to solve them under a time crunch.
Use the right tool for the job.