Solving Invisible Scaling Issues with Serverless and MongoDB

Even though serverless applications might seem to hide a lot of details when running, issues still need to be dealt with. We discuss how to do just that in this post.

Don't follow blindly, weigh your actions carefully.

Ever since software engineering became a profession, we have been trying to serve users all around the globe. With this comes the issue of scaling and how to solve it. Many times these thoughts of scaling up our software to unimaginable extents are premature and unnecessary.

This has turned into something else altogether with the rise of serverless architectures and back-end-as-a-service providers. Now we’re not facing issues of how to scale up and out, but rather how to scale our database connections without creating heavy loads.

With the reduced insight we have into the underlying infrastructure, there’s not much we can do except write sturdy, efficient code and use appropriate tools to mitigate this issue.

Or is there?

How Do Databases Work with Serverless?

With a traditional server, your app will connect to the database on startup. Quite logical, right? The first thing it does is hook up to the database via a connection string, and only when that’s done will the rest of the app initialize.

Serverless handles this a bit differently. The code will actually run for the first time only once you trigger a function. This means you have to both initialize the database connection and interact with the database during the same function call.

Going through this process every time a function runs would be incredibly inefficient and time-consuming. This is why serverless developers use a technique, often loosely called connection pooling, that creates the database connection only on the first function call and re-uses it for every subsequent call. Now you’re wondering, how is this even possible?

The short answer is that a lambda function is, in essence, a tiny container. It’s created and kept warm for an extended period of time, even though it is not running all the time. AWS doesn’t publish an exact figure, but only after it has been inactive for roughly 15 minutes will it be terminated.

This gives us a time frame of roughly 15 to 20 minutes in which our database connection is active and ready to be used without suffering any performance loss.

Using Lambda with MongoDB Atlas

Here’s a simple code snippet for you to check out.

// db.js
const mongoose = require('mongoose')
const connection = {}

module.exports = async () => {
  if (connection.isConnected) {
    console.log('=> using existing database connection')
    return
  }

  console.log('=> using new database connection')
  const db = await mongoose.connect(process.env.DB)
  connection.isConnected = db.connections[0].readyState
}

Once you take a better look at the code above you can see it makes sense. At the top, we’re requiring mongoose and initializing an object called connection. There’s nothing more to it. We’ll use the connection object as a cache to store whether the database connection exists or not.

The first time the db.js file is required and invoked, it will connect mongoose to the database using the connection string. Every subsequent call will re-use the existing connection.
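
A note on the value we cache: mongoose reports the connection state as a number, where 0 means disconnected, 1 connected, 2 connecting and 3 disconnecting. The snippet above stores that number and treats any non-zero value as connected. If you want a stricter check, a small helper like the one below could work. The names are hypothetical and not part of mongoose.

```javascript
// Numeric states reported by mongoose's connection.readyState
const STATES = {
  0: 'disconnected',
  1: 'connected',
  2: 'connecting',
  3: 'disconnecting'
}

// Hypothetical helper: only state 1 counts as a usable cached connection
const isUsable = (readyState) => readyState === 1

// Hypothetical helper: readable label for log messages
const describeState = (readyState) => STATES[readyState] || 'unknown'
```

In db.js you could then check isUsable(connection.isConnected) instead of relying on truthiness.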

Here’s what it looks like in the handler which represents our lambda function.

const connectToDatabase = require('./db')
const Model = require('./model')

module.exports.create = async (event) => {
  try {
    await connectToDatabase()
    // Model.create returns a promise, so wait for the saved document
    const object = await Model.create(JSON.parse(event.body))
    return {
      statusCode: 200,
      body: JSON.stringify(object)
    }
  } catch (err) {
    return {
      statusCode: err.statusCode || 500,
      headers: { 'Content-Type': 'text/plain' },
      body: 'Could not create the object.'
    }
  }
}

This simple pattern will make your lambda functions cache the database connection and speed them up significantly. Pretty cool, huh? 
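
The same idea works for any client, not just mongoose. Here is a rough sketch of a generic helper that caches the pending connection promise, so the connect function runs once per container and is retried if it fails. Both reuseConnection and fakeConnect are hypothetical names used only for illustration.

```javascript
// Wraps any async connect function so it runs at most once per container
const reuseConnection = (connect) => {
  let cached = null // kept alive for as long as the container stays warm

  return () => {
    if (!cached) {
      // Cache the promise itself; drop it on failure so the next call retries
      cached = connect().catch((err) => {
        cached = null
        throw err
      })
    }
    return cached
  }
}

// Stand-in for a real driver call such as mongoose.connect
let connectCalls = 0
const fakeConnect = async () => {
  connectCalls += 1
  return { id: connectCalls }
}

const connectToDatabase = reuseConnection(fakeConnect)
```

Every call to connectToDatabase() after the first resolves to the same object, and fakeConnect runs only once.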

All of this is amazing, but what if we hit the cap of connections our database can handle? Well, great question! Here’s a viable answer.

What about Connection Limits?

If hitting your connection limit has you worried, then you might think about using a back-end-as-a-service to solve this issue. It would ideally create a pool of connections your functions would use without having to worry about hitting the ceiling. With this approach, the provider gives you a REST API that handles the actual database interaction, and your functions only call that API.
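
To get a feel for when the limit becomes a problem, remember that every warm container holds its own cached connection (or small pool of connections), so open connections grow with the number of containers running at the same time. Here’s a back-of-the-envelope sketch, where every number is a made-up assumption and not a real limit of any database tier.

```javascript
// Worst-case open connections: each concurrent container keeps its own pool
const estimateConnections = (concurrentContainers, connectionsPerContainer) =>
  concurrentContainers * connectionsPerContainer

// Hypothetical figures, purely for illustration
const connectionLimit = 500
const estimate = estimateConnections(200, 5)

console.log(`estimated connections: ${estimate}`)
console.log(estimate > connectionLimit ? 'over the limit' : 'within the limit')
```

With these invented numbers the estimate lands above the limit, which is the situation a shared back end sitting in front of the database is meant to avoid.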

You hardcore readers will think about creating an API yourselves to house the connection pool, or using something like GraphQL. Both of those solutions are great, depending on your use case. But I’ll focus on using off-the-shelf tools for getting up and running rather quickly.

Using Lambda with MongoDB Stitch

If you’re a sucker for MongoDB, like I am, you may want to check out their back-end-as-a-service solution called Stitch. It gives you a simple API to interact with the MongoDB driver. You just need to create a Stitch app, connect it to your already running Atlas cluster, and you’re set. In the Stitch app, make sure to enable anonymous login and create your database name and collection.

Install the stitch npm module, reference your Stitch app id in your code, and then start hitting the APIs.

const { StitchClientFactory, BSON } = require('mongodb-stitch')
const { ObjectId } = BSON
const appId = 'notes-stitch-xwvtw'
const database = 'stitch-db'
const connection = {}

module.exports = async () => {
  if (connection.isConnected) {
    console.log('[MongoDB Stitch] Using existing connection to Stitch')
    return connection
  }

  try {
    const client = await StitchClientFactory.create(appId)
    const db = client.service('mongodb', 'mongodb-atlas').db(database)
    await client.login()
    const ownerId = client.authedId()
    console.log('[MongoDB Stitch] Created connection to Stitch')

    connection.isConnected = true
    connection.db = db
    connection.ownerId = ownerId
    connection.ObjectId = ObjectId
    return connection
  } catch (err) {
    console.error(err)
    // Rethrow so the calling handler's catch block can return an error response
    throw err
  }
}

As you can see, the pattern is very similar. We create a Stitch client connection and just re-use it for every subsequent request.

The lambda function itself looks almost the same as the example above.

const connectToDatabase = require('./db')

module.exports.create = async (event) => {
  try {
    const { db } = await connectToDatabase()
    const { insertedId } = await db.collection('notes')
      .insertOne(JSON.parse(event.body))

    const addedObject = await db.collection('notes')
      .findOne({ _id: insertedId })

    return {
      statusCode: 200,
      body: JSON.stringify(addedObject)
    }
  } catch (err) {
    return {
      statusCode: err.statusCode || 500,
      headers: { 'Content-Type': 'text/plain' },
      body: 'Could not create the object.'
    }
  }
}

Seems rather similar. I could get used to it. However, Stitch has some cool features out of the box like authentication and authorization for your client connections. This makes it really easy to secure your routes.

How to Know If It Works?

To make sure I know which connection is being used at every given time, I use Dashbird’s invocation view to check my Lambda logs.

[Screenshot: Lambda invocation log creating a new connection]

Here you can see it’s creating a new connection on the first invocation while re-using it on consecutive calls.

[Screenshot: Lambda invocation log using the existing connection]
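
If you’d rather check from raw logs, counting the two messages that db.js prints gives the same picture. Here’s a small sketch, assuming you have the log lines as an array of strings. The sample lines and the helper name are made up for illustration.

```javascript
// Counts how often db.js opened a new connection vs. reused the cached one
const countConnections = (logLines) => ({
  created: logLines.filter((line) =>
    line.includes('=> using new database connection')).length,
  reused: logLines.filter((line) =>
    line.includes('=> using existing database connection')).length
})

// Made-up sample: one cold start followed by two warm invocations
const sampleLogs = [
  '=> using new database connection',
  '=> using existing database connection',
  '=> using existing database connection'
]

console.log(countConnections(sampleLogs))
```

A healthy warm container should show one created connection and a growing number of reused ones.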

The service is free for 14 days, so you can check it out if you want.

Wrapping up

In an ideal serverless world, we wouldn’t need to worry about hitting our database connection limit. In practice, the number of users required to hit your APIs before you reach this scaling issue is huge. The example above shows how you can mitigate the issue by using back-end-as-a-service providers. Even though Stitch is not yet mature, it is made by MongoDB, which is an amazing database. And using it with AWS Lambda is just astonishingly quick.

If you want to read some of my previous serverless musings head over to my profile or join my newsletter!

Hope you guys and girls enjoyed reading this as much as I enjoyed writing it. Until next time, be curious and have fun.

Published at DZone with permission of