Lessons Learned Building a Backend-as-a-Service: A Technical Deep Dive

In this post we share our technical learnings from building a multi-tenant Backend-as-a-Service (BaaS). We cover how a BaaS works, how it fits into the serverless paradigm and how performance and scalability can be achieved with tools such as Docker Swarm, MongoDB, AWS, Redis, Varnish and CDNs.

Published in Speed Kit Blog · 16 min read · May 17, 2017

TL;DR

At Baqend, we are building a Backend-as-a-Service (BaaS) that is geared towards scalability and web performance. Coming from research, we try to be open about the architecture of our platform and would therefore love to share how everything works at a technical level. This post is a combined writeup of different talks we gave.

Challenges of Modern Applications

Engaging web and mobile applications have to fulfill several requirements:

Each one individually may appear easy, but achieving them in combination turns out to be quite difficult in practice.

“Over 30 percent of web development teams deliver projects late or over-budget.” (Survey by New Bamboo)

The Classic 3-Tier Architecture

Today, the most common way of implementing web applications is a 3-tier architecture with a client, a server and a database tier. This design poses some difficulties regarding the requirements:

A typical website consists of more than 100 resources fetched from the server, which makes latency critical for performance (see HTTP archive).
With server-rendering, the client has to wait until the server has assembled the data through database queries and rendered it in a language such as Java, PHP or Python.
The database must be highly available, which is extremely difficult if you also want strong consistency, as the CAP theorem shows.
The server needs to be scalable, which is difficult when shared state such as user sessions is involved.
In many cases, business logic is duplicated between client and server, and there is communication overhead between frontend and backend teams. Both make projects more difficult and delay the time-to-market.

2-Tier Architectures to the Rescue

An alternative to the 3-tier architecture is to leave out the server tier and talk directly to a cloud service, e.g. a Database-as-a-Service (DBaaS). This has two major advantages:

You can render the site progressively in the client to make the UX interactive and engaging.
Much of the business logic can happen in client-side JavaScript, i.e. you get less hassle between backend and frontend team.

While there are good frameworks and technologies for building these application frontends (e.g. Angular and React) and storing the data (e.g. DynamoDB, or MongoDB hosters), some problems remain:

You still have to make many latency-sensitive requests, sometimes even more than before, since data comes in small, separate JSON requests.
All the heavy lifting involved with scalability and high availability is now shifted to the Database-as-a-Service.
Cloud databases are usually not equipped to handle user management, access control or server-side business logic.

The Serverless Paradigm

The two common forms of serverless architectures are Function-as-a-Service (e.g. AWS Lambda, Google Cloud Functions) and Backend-as-a-Service (e.g. Baqend, Firebase).

The great thing about Function-as-a-Service (FaaS) is that the business logic hosted there, e.g. in the form of a Node.js script, is rapidly scalable and easy to operate. However, FaaS only covers stateless code execution which means that you have to orchestrate different cloud services to handle data storage, user management, push notifications, etc. Thus, the client has to talk to many different APIs. Or the FaaS has to wrap and expose various services, which quickly becomes infeasible to manage.

This is where Backend-as-a-Service (BaaS) enters the stage: the idea is to combine the ease of use of Function-as-a-Service with all the APIs and capabilities a typical website or mobile app requires.

How a BaaS Works

Any BaaS will typically be composed of three layers.

The API is what makes the BaaS useful, as it allows you to store and query data, run server-side code, authenticate users, and more. This is usually implemented as a multi-tenant cloud service on an Infrastructure-as-a-Service (IaaS) provider such as AWS.

A BaaS also takes care of hosting and delivery. In case of web applications, this means that it will store your HTML files and other assets and deliver them to users.

As a developer, you build your website or mobile app against the BaaS using either a REST API or an SDK.

Components of a Backend-as-a-Service platform.

Let’s look at how a BaaS-backed website works.

First, the browser will load the HTML and the assets, including your JavaScript application logic written in the framework-of-the-month (Angular, React, Vue, Aurelia, …).
Next, your JavaScript will make calls using the BaaS SDK to load the data and render the site, e.g. by fetching the latest posts.
The browser will resolve any further references like images that were inserted into the DOM.

How is BaaS different from PaaS and IaaS?

BaaS basically raises the abstraction level, so as a developer you do not need to manage a full-blown application and database stack. PaaS gives you full control over the server, e.g. your Rails or Django application. In an IaaS, you can basically run any kind of application, but it leaves you with the full complexity of managing and scaling it.

The high abstraction layer of BaaS often comes at a cost: performance, expressiveness and scalability are usually quite limited. But this does not have to be the case. Let’s look at how to actually build a BaaS.

Performance

I bet every one of us has stared at blank loading screens and cursed whoever is responsible for them.

49% of users expect websites to load in 2 seconds or less, according to a survey by Akamai. These expectations are not matched in practice: the median top 500 e-commerce website has a page load time of 9.3 seconds (see source).

My three favourite findings relating web performance to user behaviour are these (there are many more):

Amazon found that 100 ms of additional loading time decreases sales revenue by 1%. With Amazon’s current revenue, the impact is over 1 billion USD per year.
Yahoo! saw 9% of visitors dropping out when the site was 400 ms slower.
When comparing user reactions to showing 30 search results instead of 10, Google measured a 20% drop in traffic. The decrease in engagement was caused by 500 ms of additional latency for the search query.

What makes websites slow?

Let’s say a user from the US visits our BaaS-backed application hosted in Europe. There are two performance bottlenecks:

The server has to do database queries, rendering, etc. which causes processing overhead.
We need to transfer more than 100 resources over a high-latency network connection, delaying the moment where the browser has enough data to show something meaningful.

Interestingly, bandwidth is usually not the bottleneck. If you observe page load time under increasing bandwidth, you’ll notice that it doesn’t get any faster above 5 Mbps. However, if you are able to decrease access latency, you will see a proportional decrease in page load time.

Or put differently:

2× Bandwidth = Same Load Time
½ Latency ≈ ½ Load Time

The effect of bandwidth vs. latency on page load time. See source.
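A rough way to see why latency dominates is a toy model in which load time is the sum of a round-trip term and a transfer term. The sketch below is only an illustration under simplifying assumptions: the page size, the number of sequential round trips and the function name are made up, and real browsers overlap requests in ways this model ignores.

```python
def page_load_time(round_trips, rtt_s, total_bytes, bandwidth_bps):
    """Toy model: time waiting on sequential round trips plus raw transfer time."""
    latency_part = round_trips * rtt_s
    transfer_part = total_bytes * 8 / bandwidth_bps
    return latency_part + transfer_part


# Hypothetical page: 1 MB in total, 40 sequential round trips,
# 100 ms round-trip time, 10 Mbps connection.
baseline = page_load_time(40, 0.100, 1_000_000, 10_000_000)
double_bandwidth = page_load_time(40, 0.100, 1_000_000, 20_000_000)
half_latency = page_load_time(40, 0.050, 1_000_000, 10_000_000)

for label, seconds in [
    ("baseline", baseline),
    ("2x bandwidth", double_bandwidth),
    ("1/2 latency", half_latency),
]:
    print(f"{label:>13}: {seconds:.2f} s")
```

In this model the round-trip term makes up most of the load time, so doubling the bandwidth only shrinks the small transfer term, while halving the latency removes almost half of the total.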

State of the Art: How to improve web performance?

Let’s look at one example of how web performance can be improved: Google’s Accelerated Mobile Pages (AMP).

In a nutshell, AMP prescribes the following:

Stripped down HTML + AMP tags (e.g. amp-img) rendered asynchronously by AMP JS runtime
CSS must be inlined and below 50 KB
No custom JS (except in iframes)
Only static sizes for DOM elements → no repaints
Limited to mobile pages and forces a Google bar
Cached in Google CDN, as long as it is crawled the next time
→ only suited for static media, e.g. news

So the basic idea of AMP is to enforce performance best practices and use CDN caching. However, there is no reason to limit these techniques to proprietary Google technology with uncertain motivations.

Tackling latency

So we know that we have to reduce latency in order to be fast. What is the oldest technique to achieve that? Caching!

This is the main idea behind the Baqend BaaS: reducing latency through web caching. And since interesting applications do not only use static data, we have to make dynamic data cacheable.

If you can cache dynamic data, you get two things:

Low Latency, since every request can be answered by a nearby web cache (e.g. your browser cache or a CDN edge node)
Less Processing overhead in the backend, as only the cache misses will hit the server

How to cache dynamic content?

The idea is quite simple. We have to keep track of every response and invalidate it upon changes in both expiration-based caches (e.g. the browser cache) and invalidation-based caches (e.g. CDNs like Akamai and Fastly).

As expiration-based caches simply use the time-to-live provided in the server’s Cache-Control HTTP header, we have to use a trick: we let the client know when something needs to be revalidated. As the full list of stale URLs can get quite long, we compress it using Bloom filters.

The concrete algorithms to handle this are not secret sauce; they are all published research. Here is how it works.

Caching everything, not just assets

The tricky thing when using web caches is that you must specify a time-to-live (TTL) when you first deliver the data from the server. After that you do not have any chance to kick the data out. It will be served by the browser cache up to the moment the TTL expires. For static assets, it is not such a complex thing, since they usually only change when you deploy a new version of your web application. Therefore, you can use tools like gulp-rev-all and grunt-filerev to hash the assets. By renaming the assets at deployment time, you ensure that all users will see the latest version of your page while using caches at their best.
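The renaming step can be sketched in a few lines. This is not how gulp-rev-all or grunt-filerev are implemented; it only illustrates the idea of deriving the file name from the content, and the helper names `revved_name` and `cache_headers` as well as the chosen header values are assumptions for this example.

```python
import hashlib
from pathlib import PurePosixPath


def revved_name(path: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content hash into the file name, e.g. js/app.<hash>.js."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


def cache_headers(is_revved: bool) -> dict:
    """Revved assets never change under the same URL, so they can get a long TTL."""
    if is_revved:
        return {"Cache-Control": "public, max-age=31536000, immutable"}
    # Entry points such as index.html keep their URL and must be revalidated.
    return {"Cache-Control": "no-cache"}


old = revved_name("js/app.js", b"console.log('v1');")
new = revved_name("js/app.js", b"console.log('v2');")
print(old, new, cache_headers(True))
```

Because a new deployment produces a new URL for every changed file, the old cached copies are simply never requested again.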

But wait! What do you do with all the data which is loaded and changed by your application at runtime? Changing user profiles, updating a post or adding a new comment are seemingly impossible to combine with the browser cache, since you cannot estimate when such updates will happen in the future. Therefore, caching is usually either disabled or used with very low TTLs.

The Bloom filter trick

We need to check the staleness of any data before we actually fetch it. At the beginning of each user session, the connect call therefore fetches a Bloom filter, which is a highly compressed representation of all recently changed resources. Before making a request, the SDK first checks whether the Bloom filter contains an entry for the resource we want to fetch.

An entry in the Bloom filter indicates that the content was changed in the recent past and that the cached copy may be stale. In such cases the SDK bypasses the browser cache and fetches the content from the nearest CDN edge server. In all other cases the content is served directly from the browser cache. Using the browser cache saves network traffic and bandwidth, and it is rocket-fast.

In addition, we ensure that the CDN always contains the most recent data, by instantly purging data when it becomes stale.
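The client-side check described above can be sketched as follows. This is a minimal, generic Bloom filter and not the data structure used in the Baqend SDK; the sizes, the hashing scheme, the URL format and the function `choose_source` are assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Compact set representation: no false negatives, rare false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: str):
        # Derive all bit positions from one SHA-256 digest (double hashing).
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


def choose_source(url: str, stale: BloomFilter) -> str:
    """Bypass the browser cache only for resources that may have changed."""
    return "cdn" if stale.might_contain(url) else "browser-cache"


# The server would build this filter from recently changed resources.
stale_urls = BloomFilter()
stale_urls.add("/db/Post/1")

print(choose_source("/db/Post/1", stale_urls))
print(choose_source("/db/Post/2", stale_urls))
```

A false positive only costs an unnecessary round trip to the CDN, whereas a false negative would serve stale data. Bloom filters never produce false negatives, which is what makes them a good fit here.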

How useful and fast is this in practice?

Here is a small example of the effects using a simple news application as an open-source benchmark. You can also run it in your browser.

This means: caching dynamic content leads to a performance advantage of roughly 15x. If you want to learn how this works for real applications, check out our findings from a high-traffic e-commerce scenario:

Building a Shop with Sub-Second Page Loads: Lessons Learned (medium.baqend.com)

Optimizing the Network

We’ve seen that caching optimizes latency. But there are more networking topics to take care of.

Baqend uses a CDN (Fastly) to improve several networking aspects:

Arguably the most important optimization is HTTP/2. For a typical use case and in particular combined with caching, it can improve performance drastically:

HTTP/2 does three important things better than HTTP/1.1:

Multiplexing requests over one single TCP connection (no head-of-line blocking at the HTTP level)
Server Push allows you to send data before the user requests it
Header Compression saves bandwidth, e.g. for repeatedly sent cookies

BaaS Performance: Summary

So in a nutshell, Baqend does two things to tackle the web performance problem:

Latency is reduced by caching dynamic content (e.g. queries).
Using HTTP/2 and CDN delivery, data is transferred faster.

Scaling a Backend-as-a-Service

How can we scale the backend? There are a few performance best practices that any scalable backend should address:

This is how Baqend Cloud implements these:

Data is stored in a set of different scalable NoSQL databases (MongoDB, Redis, ElasticSearch), depending on what data store suits the application best.
Java-based Baqend servers expose the BaaS REST API and can be scaled independently. Each tenant has its own server and a Node.js instance for running server-side business logic. This pair of servers is hosted on Docker and is scaled using Docker Swarm by adding more containers per tenant on demand.
All types of web cache (reverse-proxy caches, CDNs, ISP caches, forward proxies and browser caches) are allowed to cache the responses.

Scalability & Performance Wrap-up

All in all, there are some things that Baqend can optimize and there are things that you as a developer need to take care of:

How do I use a BaaS from a development perspective?

To add Baqend to an application, you just include the SDK and connect it to your app. Then you can start loading and storing data:

The three major interfaces you will be using with any BaaS are usually the Service Dashboard, the CLI and the SDK.

The Baqend SDK is compatible with all major frameworks and also works for hybrid apps and in Node.js. Besides wrapping the REST API, the SDK does a few useful things:

Abstracts from caching logic, and has an intelligent object-identity
Completely ES6- and TypeScript-compatible (support for Maps, Sets, Arrays, models can be ES6-classes)
Promise-based
Powerful query builder for doing search & retrieval of data
Automatic change and dependency tracking, so objects are only sent to the server when they actually change
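To illustrate the change tracking idea from the last point, here is a small sketch of an object that remembers which fields were modified, so a save can be skipped when nothing changed. It is not the SDK's implementation; `TrackedObject`, `save` and the `send` callback are hypothetical names, and dependency tracking between objects is left out.

```python
_MISSING = object()


class TrackedObject:
    """Records which fields were modified since the last save."""

    def __init__(self, **fields):
        # Bypass our own __setattr__ while setting up internal state.
        object.__setattr__(self, "_fields", dict(fields))
        object.__setattr__(self, "_dirty", set())

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for tracked fields.
        fields = object.__getattribute__(self, "_fields")
        if name in fields:
            return fields[name]
        raise AttributeError(name)

    def __setattr__(self, name, value):
        # Writing the same value again does not mark the field as dirty.
        if self._fields.get(name, _MISSING) != value:
            self._fields[name] = value
            self._dirty.add(name)

    def pending_changes(self) -> dict:
        return {name: self._fields[name] for name in sorted(self._dirty)}

    def mark_saved(self) -> None:
        self._dirty.clear()


def save(obj: TrackedObject, send) -> bool:
    """Send only the changed fields; skip the request if nothing changed."""
    changes = obj.pending_changes()
    if not changes:
        return False
    send(changes)
    obj.mark_saved()
    return True


sent = []
todo = TrackedObject(title="Write blog post", done=False)
print(save(todo, sent.append))  # nothing changed yet, so nothing is sent
todo.done = True
print(save(todo, sent.append), sent)
```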

Data Modeling

Baqend combines schemaful and schemaless data modelling: you can use and check data types wherever you need strict validation and fall back to plain JSON where the structure is not known in advance.

Types: Boolean, Integer, String, DateTime, Time, Date, GeoPoint, List, Map, Set, JSON-Object, JSON-Array
Files, references, embedded types & inheritance

Documents can be nested or have references to other objects. Each field can be indexed and efficiently queried, even nested documents and schemaless JSON.

For example, when building a todo list, you can choose between embedding and referencing depending on whether query or update performance is more important and what the access patterns look like.
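The two modelling options for the todo list can be sketched with plain dictionaries. The document layout, the IDs and the helper functions below are assumptions for illustration and do not reflect Baqend's actual storage format.

```python
# Variant A: todos are embedded in the list document (one read loads everything).
embedded_list = {
    "id": "/db/TodoList/1",
    "name": "Groceries",
    "todos": [
        {"text": "Milk", "done": False},
        {"text": "Bread", "done": True},
    ],
}

# Variant B: each todo is its own document that references its list.
todo_lists = {"/db/TodoList/1": {"name": "Groceries"}}
todos = {
    "/db/Todo/1": {"list": "/db/TodoList/1", "text": "Milk", "done": False},
    "/db/Todo/2": {"list": "/db/TodoList/1", "text": "Bread", "done": True},
}


def open_todos_embedded(todo_list: dict) -> list:
    """Reading is cheap, but every change rewrites the whole list document."""
    return [t["text"] for t in todo_list["todos"] if not t["done"]]


def open_todos_referenced(all_todos: dict, list_id: str) -> list:
    """Needs a query over todos, but each todo can be updated on its own."""
    return [
        t["text"]
        for t in all_todos.values()
        if t["list"] == list_id and not t["done"]
    ]


print(open_todos_embedded(embedded_list))
print(open_todos_referenced(todos, "/db/TodoList/1"))
```

Embedding favours reads of the whole list, while referencing favours frequent small updates and queries across lists.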

The CRUD API makes it very easy to interact with objects and files.

Every request issued from the SDK is mapped to the REST API and automatically uses the client-side Bloom filter-logic and the server-side cache invalidations:

The REST API is specified using Swagger, which also makes it browsable and allows generating simple SDKs for other languages.

The server exposing the REST API is built in Java and Jetty. It is highly tuned towards throughput and latency efficiency: a single server can handle roughly 15K requests per second on commodity cloud hardware.

User Management

There are simple APIs to handle users of your application. Users can be registered using classic email validation or through OAuth providers like Facebook or Google.

Users can have roles. Access rights can be set at the user and role level. The User schema is extensible and sessions for returning users are handled automatically.

How are ACL-protected resources handled internally?

When an end user performs a request, a token is attached to it. This token contains the roles of the user and a signature. First, the schema-level ACLs are checked (e.g., checking that anonymous read access is allowed). If they pass, the next step is to validate the object-level ACLs. This is efficiently handled by constructing a query predicate on indexed ACL attributes that is pushed down to the database for every operation.
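The two-step check can be sketched as follows. The ACL layout (an `acl.read` array per document, a missing entry meaning public access), the principal names and the functions are assumptions for this example, and token signature verification is omitted. The predicate uses MongoDB query operators (`$and`, `$or`, `$in`, `$exists`).

```python
def schema_allows(schema_acl: dict, operation: str, principals: set) -> bool:
    """Schema-level check: a missing entry means the operation is public."""
    allowed = schema_acl.get(operation)
    return allowed is None or bool(set(allowed) & principals)


def with_acl(query: dict, principals: set, operation: str = "read") -> dict:
    """Object-level check as a predicate the database can evaluate on an index."""
    field = f"acl.{operation}"
    acl_predicate = {
        "$or": [
            {field: {"$exists": False}},            # no ACL set: public object
            {field: {"$in": sorted(principals)}},   # user or one of their roles
        ]
    }
    return {"$and": [query, acl_predicate]}


def authorize(schema_acl: dict, operation: str, principals: set, query: dict) -> dict:
    if not schema_allows(schema_acl, operation, principals):
        raise PermissionError(f"{operation} not allowed on this schema")
    return with_acl(query, principals, operation)


# Principals taken from the (already verified) token: the user and their roles.
principals = {"/db/User/42", "/db/Role/editor"}
schema_acl = {"update": ["/db/Role/editor"]}  # reads are public in this example

print(authorize(schema_acl, "read", principals, {"published": True}))
```

Pushing the predicate into the query means that unauthorized objects are never loaded in the first place, instead of being filtered out after the fact.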

One of the coolest things about this ACL scheme is that the ACLs are already checked in the CDN if the data is cached, allowing high cache hit rates, even on private data.

Backend Code

Most applications have some business logic that cannot be done in the client, e.g. the checkout in an e-commerce shop. Baqend allows you to execute server-side code using the same SDK as in the client.

The code can be called as a microservice, similar to a Function-as-a-Service provider, or be triggered by operations such as update, insert, delete, etc.:

Going Real-Time

While triggers are useful, they are also pretty limited in their expressiveness and not the right tool for complex reactive applications. This is why we built InvaliDB, a streaming system for real-time queries on MongoDB. For registered real-time queries, result updates are pushed to clients and caches alike.

As illustrated below, incoming database updates are matched in an Apache Storm cluster against queries.
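The matching step itself can be illustrated in a single process. The sketch below is not InvaliDB's design: it only supports equality predicates, ignores sorting and limits, and does not distribute work across a cluster. The class and event names are assumptions.

```python
def matches(doc: dict, predicate: dict) -> bool:
    """Equality-only predicate, e.g. {"done": False}."""
    return all(doc.get(field) == value for field, value in predicate.items())


class RealTimeQuery:
    """Keeps track of which documents are currently in the query result."""

    def __init__(self, predicate: dict):
        self.predicate = predicate
        self.result_ids = set()

    def on_write(self, doc_id: str, doc):
        """Match one database write (doc is None for a delete) against the query."""
        was_match = doc_id in self.result_ids
        is_match = doc is not None and matches(doc, self.predicate)
        if is_match and not was_match:
            self.result_ids.add(doc_id)
            return "add"
        if was_match and not is_match:
            self.result_ids.discard(doc_id)
            return "remove"
        if was_match and is_match:
            return "change"
        return None  # the write does not affect this query


open_todos = RealTimeQuery({"done": False})
events = [
    open_todos.on_write("/db/Todo/1", {"text": "Milk", "done": False}),
    open_todos.on_write("/db/Todo/1", {"text": "Oat milk", "done": False}),
    open_todos.on_write("/db/Todo/2", {"text": "Bread", "done": True}),
    open_todos.on_write("/db/Todo/1", {"text": "Oat milk", "done": True}),
]
print(events)
```

Each event tells subscribers (and caches) how a registered query result changed, so they never have to re-run the query themselves.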

The unique feature of this architecture is that it scales with both queries and updates — unlike RethinkDB, Meteor and Parse that all have centralized bottlenecks that break with increasing throughput.

The coolest part is how easy it is to use this feature: it is used exactly like regular queries. You simply subscribe to any MongoDB query (including complex predicates, sorting, limit, etc.) and the SDK maintains the result in real time (< 30 ms end-to-end latency):

Accelerating Legacy Sites: Going Beyond BaaS

A limitation of any Backend-as-a-Service is that you have to build your app or website around its APIs in order to use it. Moving an existing application to a BaaS is a lot of work.

Fortunately, there’s a funky new browser technology to circumvent this problem: Service Workers. What they do is:

They proxy any HTTP request and allow you to change the browser’s default behaviour
They provide an offline cache for “progressive web apps”: if the client is disconnected, you can choose to serve a cached copy for an offline mode

How can we use this to make legacy sites easily use a BaaS?

The idea is simple:

You include our Service Worker on your site, quite similar to a Google Analytics snippet
The Service Worker gets a list of patterns of what to intercept and redirect to our CDN
Baqend pulls the data from the original site and stores its metadata and content
The legacy system (e.g. WordPress) simply triggers a refresh when something changes → Baqend iterates over all cached files and revalidates them against the origin. If something changed, it is updated in the CDN and the Bloom filter.
Caching in the Service Worker uses the Bloom filter logic to determine when to bypass the cache
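The per-request decision from the steps above can be sketched as a small routing function. A real Service Worker runs as JavaScript in the browser; this sketch only models the decision logic, and the glob-style patterns, the function name and the three return values are assumptions.

```python
from fnmatch import fnmatchcase


def route_request(path, patterns, is_possibly_stale, in_local_cache):
    """Decide where an intercepted request should be answered from."""
    if not any(fnmatchcase(path, pattern) for pattern in patterns):
        return "origin"  # not covered by the configured patterns: do not interfere
    if in_local_cache and not is_possibly_stale(path):
        return "service-worker-cache"  # fresh according to the stale-URL check
    return "cdn"  # cache miss or possibly stale: ask the CDN


# Hypothetical configuration: accelerate uploads and theme assets only.
patterns = ["/wp-content/*", "/assets/*.css"]
# A plain set stands in for the Bloom filter of recently changed URLs.
recently_changed = {"/wp-content/uploads/logo.png"}
is_stale = recently_changed.__contains__

print(route_request("/wp-admin/index.php", patterns, is_stale, True))
print(route_request("/wp-content/uploads/hero.jpg", patterns, is_stale, True))
print(route_request("/wp-content/uploads/logo.png", patterns, is_stale, True))
```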

We will release this feature soon. On sites we have tested so far, performance can often be tripled without changing the current backend architecture.

Summary: Lessons Learned

Our main learnings from building Baqend are: database systems and distributed systems are hard. Some of these complexities should be exposed to developers (e.g. indexing) and some should be hidden (e.g. scalable stream processing).

All in all, of the three categories frontend, network and backend, a BaaS can optimize many things of the network and backend that are really hard work on traditional platforms.

Curious? Try it.

Learn how to develop with a Backend-as-a-Service. It is a fun experience:

Create a Baqend app (or do: npm install -g baqend followed by baqend register)
Upload your assets (HTML, images, etc.) to Baqend. Learn how to deploy a simple web app in the quickstart.
Use the Baqend SDK to easily store & query data, register users, etc. Learn how to include the SDK.
Start building your website or app. Check out our starter kits and the guide.
