Scaling On The Cheap

“Planning is bringing the future into the present so you can do something about it now” — Alan Lakein

I’m going to take a break from my Tech Hiring and Team Building series to write a more technical article.

I co-founded a company. We had no investors. We weren’t looking for investors. Ideas are cheap, and valuations on ideas are cheaper. What we had was an idea and a need to keep our servers up on a shoestring budget. This article is the first in a series of how we did that and how you can too.

Start on the right track

“It does not take much strength to do things, but it requires a great deal of strength to decide what to do” — Elbert Hubbard

When you start something new, there’s nothing bad, no cruft, nothing you need to explain away or be embarrassed about. On the other hand, there are so many possibilities that it can be overwhelming. It’s a time when you make decisions that can either pay massive dividends or cause a systemic failure that’s basically unrecoverable without redoing everything. This is where you need to start thinking about scale. This doesn’t mean prematurely optimizing the login flow, or buying a huge server. It means thinking about architecture in a way that lets you avoid future pain at little upfront cost. That said, you can follow this advice at any time. Maybe it can help, or maybe you’re too far down another path. That’s up to you to decide.

The first commandment

Never render HTML to be delivered to a web browser on the server on an individual request basis. Ever. This single rule, as I’ll explain, will pay you back immensely. Not just in scaling, but also in code reuse, future expansion, etc.

Our server has 2 parts: the static side and the API side. It’s worth considering keeping these in 2 separate repos (that way they CAN’T mix).

It’s worth noting that “on an individual request basis” does give you a bit of flexibility. It’s totally OK, and often preferable, to render your static HTML from templates at build time. This lets you avoid repeating common snippets like headers and footers, analytics code, etc.
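To make the build-time idea concrete, here is a minimal sketch in TypeScript. It is not our actual build script: the Page shape and the names escapeHtml, renderPage and buildSite are all made up for illustration, and writing the results to disk is left out.

```typescript
// Hypothetical page description; the field names are assumptions for this sketch.
interface Page {
  slug: string;  // becomes the output file name
  title: string; // plain text, escaped before use
  body: string;  // trusted HTML written by us, inserted as-is
}

// Escape text so it is safe to place inside HTML.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Shared snippets live in one place instead of being pasted into every page.
function header(title: string): string {
  return `<header><h1>${escapeHtml(title)}</h1></header>`;
}

function footer(): string {
  return "<footer>Common footer and analytics snippet go here</footer>";
}

// Render one complete static page from the shared pieces.
function renderPage(page: Page): string {
  return [
    "<!doctype html>",
    `<html><head><title>${escapeHtml(page.title)}</title></head>`,
    "<body>",
    header(page.title),
    `<main>${page.body}</main>`,
    footer(),
    "</body></html>",
  ].join("\n");
}

// Run once at build time: maps output file names to finished HTML.
function buildSite(pages: Page[]): Record<string, string> {
  const files: Record<string, string> = {};
  for (const page of pages) {
    files[`${page.slug}.html`] = renderPage(page);
  }
  return files;
}
```

The point is that this runs once per deploy, not once per request. The output is plain files that the web server and CDN can serve without running any of our code.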

The static site

Our static site is pure, vanilla HTML/CSS (with a few embedded script tags, etc. that I’m glossing over here). This gets served by our web server, and scaled by our CDN (Cloudflare in our case). Because of this, we can serve static pages at the same rate per second as TechCrunch, Gizmodo, Engadget or any other big player you want to name on the web. The beauty of this is it cost us next to nothing, just a little time to plan ahead.

The API server

Obviously, if we just had a static site it’d be pretty boring. Our data comes from our API server. The API server is responsible for all the unique data about users, responding to their updates, etc.: all the things you would normally associate with dynamic activities. Every request produces some JSON. It’s important not to start here, and not to just toss stuff into a bucket and call it an API. To understand what data will be needed and how, you need to design the layouts and flow of your site first. Once you have that, it’s important to organize those needs into logical groups (/api/user/, /api/album/, etc.). Suggestions for API design are beyond the scope of this post, but I’ll be writing about them later.
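As a rough illustration of “logical groups” where every request produces JSON, here is a small TypeScript sketch. It is framework-free on purpose and is not our production router. The handlers, the response shape and the /api/{group}/{id} pattern are assumptions made up for the example.

```typescript
// Any value that can be serialized to JSON.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// A handler receives the resource id from the URL and returns JSON data.
type Handler = (id: string) => Json;

// Hypothetical handlers, one per logical group. Real ones would hit a data store.
const routes: Record<string, Handler> = {
  user: (id) => ({ id, name: "example user" }),
  album: (id) => ({ id, title: "example album", tracks: [] }),
};

interface ApiResponse {
  status: number;
  contentType: string;
  body: string; // always a JSON document, never HTML
}

function jsonResponse(status: number, data: Json): ApiResponse {
  return { status, contentType: "application/json", body: JSON.stringify(data) };
}

// Dispatch paths of the form /api/{group}/{id} to the matching group handler.
function handleApiRequest(path: string): ApiResponse {
  const match = /^\/api\/([a-z]+)\/([A-Za-z0-9_-]+)\/?$/.exec(path);
  if (match === null) {
    return jsonResponse(404, { error: "not found" });
  }
  const group = match[1] ?? "";
  const id = match[2] ?? "";
  // Own-property check so names like "constructor" never resolve to a handler.
  if (!Object.prototype.hasOwnProperty.call(routes, group)) {
    return jsonResponse(404, { error: "unknown group" });
  }
  const handler = routes[group] as Handler;
  return jsonResponse(200, handler(id));
}
```

With this shape, handleApiRequest("/api/user/42") produces a 200 with a JSON body, and anything outside the known groups gets a JSON 404. Adding a group means adding one entry to routes, which is exactly the kind of deliberate step I want.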

Out of the dark ages

Not too terribly long ago, JavaScript versions weren’t consistently supported enough across browsers for us to even dream of using it as the primary way to display our content. IE 6 roamed the web and would eat half of what you tried to do with impunity. Those days are (thankfully) gone. Now all modern browsers can run at least ES5, and with the help of Babel, Modernizr and tools like them we can even use many ES2015 (or ES6) features. Adding to this, JS engines have gotten much faster (especially on mobile), and computers in general (as always) have gotten faster. These factors combine to create a perfect environment for us to distribute our scaling needs.

A million render engines

So where should we be generating our dynamic pages? The client of course. Rather than 1 (or 10 or 20) server doing a million renders and providing a page, we have a million clients do 1 render each. To do this we use ReactJS. There are several other options, but we chose React because it fits well with the functional reactive style we prefer (Aurelia is another great framework). I’d personally shy away from AngularJS as it currently stands, but that’s a topic for a different blog post.

Once the JSON payload comes back from our server, we cache and render that on the client. It’s important to note here that our app is a single page app (SPA), though you don’t have to build a SPA to use this technique. If you have multiple pages that need to access the same data (or you need to support rapid back and forward button navigation without abusing your server), you can cache your JSON responses in localStorage (unless you need to support IE 6 or something, in which case you need to be looking for another job, not reading this blog post).
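Here is one way the localStorage caching could look, as a hedged TypeScript sketch rather than our actual client code. The function names, the one-minute default lifetime and the entry format are assumptions. The store and the fetcher are passed in so the same code can run against the browser’s localStorage or the in-memory stand-in included below.

```typescript
// The subset of the Web Storage API we rely on; window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical on-disk format: the payload plus the time it was saved.
interface CacheEntry {
  savedAt: number; // milliseconds since the epoch
  data: unknown;
}

// In-memory stand-in for localStorage, handy outside the browser.
function memoryStore(): KeyValueStore {
  const items: Record<string, string> = {};
  return {
    getItem: (key) => (Object.prototype.hasOwnProperty.call(items, key) ? (items[key] ?? null) : null),
    setItem: (key, value) => { items[key] = value; },
  };
}

// Return a fresh cached entry, or null on a miss, an expired entry or corrupt data.
function readCache(store: KeyValueStore, key: string, maxAgeMs: number, now: number): CacheEntry | null {
  const raw = store.getItem(key);
  if (raw === null) return null;
  try {
    const entry = JSON.parse(raw) as CacheEntry;
    if (typeof entry.savedAt !== "number" || now - entry.savedAt > maxAgeMs) return null;
    return entry;
  } catch {
    return null; // unparseable or malformed entry: treat as a miss
  }
}

function writeCache(store: KeyValueStore, key: string, data: unknown, now: number): void {
  try {
    store.setItem(key, JSON.stringify({ savedAt: now, data }));
  } catch {
    // Storage may be full or disabled; caching is best effort.
  }
}

// Serve from the cache when possible, otherwise call the API and remember the answer.
async function cachedGetJson(
  url: string,
  store: KeyValueStore,
  fetchJson: (url: string) => Promise<unknown>,
  maxAgeMs: number = 60000,
): Promise<unknown> {
  const cached = readCache(store, url, maxAgeMs, Date.now());
  if (cached !== null) return cached.data;
  const data = await fetchJson(url);
  writeCache(store, url, data, Date.now());
  return data;
}
```

In a browser you would pass window.localStorage as the store and a small wrapper around fetch as fetchJson. How long an entry may live depends entirely on how stale your data is allowed to be, so treat the default here as a placeholder.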

Collecting dividends

So now that we’ve created a hard line between the API which serves data and the HTML/JS that consumes it, let’s see what we’ve gotten (again, for nearly free):

Duplication of code is significantly reduced.

The more painful it is to do the wrong thing, the more likely we are to do the right thing

When we’re writing pages that get generated on the server, we may find that bits of code that do basically the same thing get sprinkled all over the place. This isn’t through any coding malpractice; it just happens because people are in a hurry and assume something isn’t already done elsewhere. The API layer reduces this because creating an API has a much higher “activation energy” (the amount of energy it takes to get started doing something). Before we go to add a new API (having to come up with a name, add the boilerplate, properly categorize it, etc.) we’re very likely to see if an existing one works for us.

Everything is a first class citizen

Mobile app first? Web first? Neither. Everything first.

Because we’re not directly accessing our backend while rendering our website, anything our web client can do, our mobile client now also has access to! Additionally, if we’ve designed our API with some care (sadly, a topic beyond the scope of this post), we can even easily bring on third-party partners with whom we may share data.

Reduced attack surface

Because we’re eliminating a sizable chunk of code (template, embedded HTML, HTML generation engine, whatever) on the backend, we reduce the ways in which it can be attacked. It’s easy to ensure that each new API validates its inputs, because these don’t get added that often, and because they tend to be pretty small relative to an entire page render with lots of embedded HTML, etc. Additionally, if some strange bit of data causes the render to crash, this happens in 1 browser rather than affecting lots of otherwise unaffected clients.
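To show how small input validation can be when an endpoint only deals in JSON, here is a minimal TypeScript sketch. The AlbumUpdate shape and its limits (title length, year range) are invented for the example and are not taken from our API.

```typescript
// Hypothetical request body for an album update endpoint.
interface AlbumUpdate {
  title: string;
  year: number;
}

// Either a validated value or the list of problems found.
type Result<T> = { ok: true; value: T } | { ok: false; errors: string[] };

// Validate untrusted, already-parsed JSON before it reaches any business logic.
function parseAlbumUpdate(input: unknown): Result<AlbumUpdate> {
  if (typeof input !== "object" || input === null) {
    return { ok: false, errors: ["body must be a JSON object"] };
  }
  const record = input as Record<string, unknown>;
  const title = record["title"];
  const year = record["year"];
  const errors: string[] = [];

  if (typeof title !== "string" || title.trim().length === 0 || title.length > 200) {
    errors.push("title must be a non-empty string of at most 200 characters");
  }
  if (typeof year !== "number" || Math.floor(year) !== year || year < 1900 || year > 2100) {
    errors.push("year must be a whole number between 1900 and 2100");
  }
  if (errors.length > 0) {
    return { ok: false, errors };
  }
  // Both checks passed, so the casts below are safe.
  return { ok: true, value: { title: title as string, year: year as number } };
}
```

Because the whole contract of the endpoint fits in one function like this, it is realistic to review every new API for validation before it ships.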

Your CDN can do its thing

Here’s one of our actual traffic spikes. The orange is the traffic our CDN served from its cache; the blue is what we served from our web server. That difference is about 100K requests. That’s 100K requests that required no server resources from us.

You’re probably never going to be better at serving pages than your CDN. If you are, you’re either a multi-billion dollar company, or you need to get a new CDN. Because all your pages are now static, your CDN can provide DDoS protection and absorb the load when you’re getting hammered. This also means you’ll save money because you need a less beefy box. If you notice abuse of your API server, you have more options. Just as an example, you can temporarily add a slight delay to responses. Nothing a user would notice (perhaps 50–100 ms), but something that will slow the attacker down significantly while costing you next to nothing (assuming you process your requests in a non-blocking manner). This is possible because a legitimate user isn’t looking at a blank page while you’re dealing with malicious actors. They’re seeing a partial page which, at the very least, is letting them know you’re loading their request. Your CDN may even be able to cache some of your API responses short term.
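The “slight delay” idea can be sketched like this in TypeScript. Treat it as an illustration under assumptions, not what we run: how you count a client’s recent requests, the threshold and the exact ramp from 50 ms to 100 ms are all made up here.

```typescript
// Timer-based sleep: nothing blocks while we wait, so other requests keep flowing.
const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Decide how long to stall a client based on how far it is over the threshold.
// Well-behaved clients get 0; abusive ones get 50 ms rising to a 100 ms cap.
function penaltyMs(recentRequests: number, threshold: number): number {
  if (recentRequests <= threshold) return 0;
  const over = recentRequests - threshold;
  return Math.min(100, 50 + over);
}

// Wrap any async handler so suspicious clients wait a little before being served.
// The wait function is injectable so it can be replaced in tests.
async function respondWithPenalty<T>(
  recentRequests: number,
  threshold: number,
  handler: () => Promise<T>,
  wait: (ms: number) => Promise<void> = sleep,
): Promise<T> {
  const ms = penaltyMs(recentRequests, threshold);
  if (ms > 0) {
    await wait(ms);
  }
  return handler();
}
```

The design choice that matters is that the delay is a timer and not a busy loop or a blocked thread. If your server handles each request on a blocking thread, stalling attackers this way also ties up your own resources, which defeats the purpose.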

So what’s next?

In the coming weeks I’ll talk about rules for building good APIs, some suggestions about how to structure your front end app as well as other related topics. Stay tuned!

About Me

I’m Amir Yasin, a polyglot developer deeply interested in high performance, scalability, software architecture and generally solving hard problems. You can follow me on Medium where I blog about software engineering, follow me on Twitter where I occasionally say interesting things, or check out my contributions to the FOSS community on GitHub.
