DZone

I Wrote My Own Database!

In most cases, writing your own database is a bad idea. Luckily, there are some cases where it isn't and there's no real risk. If that's your case, take the chance; it's lots of fun!

By Grzegorz Ziemoński · Jun. 28, 17 · Opinion

It's been one of the moments that I've been unconsciously waiting for ever since I started programming. I mean, writing your own database is not something you do every day. Actually, you should never do that unless you have a very, very good reason to do so. Otherwise, you're probably wasting someone's time and money, and adding a fair bit of risk in case of failures.

Driving Forces

That said, let's explore some of the "very, very good" reasons that would justify writing your own data store instead of using an existing one.

Performance

It's hard to imagine, but if there were no data store performant enough to handle your needs, then you'd have no choice. It might also happen that such stores exist, but the cost of using them is way too big for you to handle.

Disk Space

Even though disk storage is pretty cheap nowadays and existing databases are pretty good at taking little storage space, it's possible that your needs are... special. Maybe you're in need of some extreme compression or special encryption. That might not be a good enough reason for writing a whole database, but it will almost certainly require some extra coding in this area.

Deployment Model

We might very well live in the times of cloud, automation, and all this cool stuff, but there are still cases in which you might want to deploy the database in a "special" way or in a "special" environment that none of the current solutions support. (Ever wondered if you could run Oracle on a little chip implanted into a human body?!)

Ease of Use

This point is very broad but covers a few important topics regarding choosing a database. How hard is it to run and maintain the database? How hard is it to access it from the code level? How hard is it to perform schema migrations on the database? And so on and so forth.

This list is by no means complete (I didn't even touch on transactions and such!), but it conveys the most important idea: you should only write your own database if none of the existing solutions match your most crucial needs in a given situation.

Obviously, you also need to make sure that your very own solution not only solves a burning problem that other solutions don't, but also doesn't introduce any new ones.

The Project

Now that I've laid out my way of thinking about the idea of writing your own data store, we can get into the nitty-gritty details of my own case and the actual solution that I've produced.

The project that I've been working on is a relatively small, simple application for a lady working at a university. She needed an application that would aid her in conducting classes, keeping information about class attendees, and sharing information about assigned tasks. The final application, including the front-end, is around one thousand lines of code. Not that big, is it?

Requirements

I guess it's not any surprise that a small, simple application like this one has no special performance needs. In fact, the data store could be way slower than existing solutions and things would work out just fine.

When it comes to disk space, encryption, or anything of the sort, there aren't any special requirements, either. The app will probably run on a server with gigabytes or terabytes of free space while storing no more than a few hundred (worst case: a few thousand) objects.

The deployment model? That's not yet decided, but the most likely solution is simply running the app on one of the university servers. Boring!

This is probably the point at which some of you feel cheated, click-baited, etc. We're like 600 words in, and all of it for a "nothing special app" with "no special requirements."

Well, there comes the last category: ease of use. I'm writing a simple app that will store relatively few objects in the whole app lifetime. What's more, I'm handing out the complete source code of the application and from this point on, the lady has to take care of the application herself. As long as the application works well and suits her needs, all she wants to do is type a simple command and not care anymore. If by any chance she wants to change something in the data store schema, it should be as straightforward and easy as possible.

The Choice

The question that I had to answer for myself was: What kind of solution best fits the description above? Some SQL database? Document? Graph? Something else?

I meditated on this problem for a decent amount of time and my conclusion was that any of these is a significant overkill and overcomplication. I'd be just fine with a file-backed collection or something similar. Makes sense, doesn't it?

And so I Googled "Java file backed collection." Much to my surprise, there aren't too many good(-looking) options, with MapDB looking the most promising. I decided to give it a try.

Now, don't get me wrong. This might be my misuse of the tool or inability to configure it correctly. Anyway, I spent like an hour trying to replace my in-memory collections with MapDB in the project and I got seriously pissed off. I was like... God, I want the most basic, non-performant, stupid, working option. I failed.

Implementation

And so, driven by my annoyance with the failure to get things working, I sat down with a blank source file and pulled off something like this in approximately 15-20 minutes:

import com.google.gson.Gson
import java.io.File
import java.util.Base64
import kotlin.reflect.KClass

// A map-like store: every write is appended to a log file and replayed on startup.
class Store<in K : Any, V : Any>(private val log: File,
                                 private val keyType: KClass<K>,
                                 private val valueType: KClass<V>) {

    // In-memory view of the data, rebuilt from the log in init.
    private val map = mutableMapOf<K, V>()

    init {
        if (log.exists()) {
            // Replay the logged operations in order to restore the last state.
            log.readLines().forEach {
                val cmdKeyValue = it.split(" ")
                val key = deserialize(cmdKeyValue[1], keyType)
                if (cmdKeyValue[0] == "PUT") {
                    map[key] = deserialize(cmdKeyValue[2], valueType)
                } else {
                    // Any other command is treated as a removal ("REM").
                    map.remove(key)
                }
            }
        }
    }

    // Append to the log first, then update the in-memory map.
    operator fun set(key: K, value: V) {
        synchronized(log) {
            log.appendText("PUT ${serialize(key)} ${serialize(value)}\n")
            map[key] = value
        }
    }

    // Reads go straight to the in-memory map and are not synchronized.
    operator fun get(key: K) = map[key]

    val values: Iterable<V>
        get() = map.values

    fun remove(key: K) {
        synchronized(log) {
            log.appendText("REM ${serialize(key)}\n")
            map.remove(key)
        }
    }

    // JSON via Gson, then Base64 so the result contains no spaces or newlines.
    fun serialize(value: Any): String {
        val string = gson.toJson(value)
        return Base64.getEncoder().encodeToString(string.toByteArray())
    }

    fun <T : Any> deserialize(serialized: String, type: KClass<T>): T {
        val bytes = Base64.getDecoder().decode(serialized)
        return gson.fromJson(String(bytes), type.java)
    }

    companion object {
        val gson = Gson()
    }
}

As you can see, it's basically an in-memory map populated by an append-only log of operations. I didn't want to bother myself (and the nice university lady) with schema problems, so I used Gson to (de)serialize the objects and Base64 to avoid any potential problems with special characters and such.
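To make the log format a bit more tangible, here's a minimal sketch of how a single line could be built and parsed. The helper names (encodeField, decodeField, putLine, parseLine) are made up for this illustration and aren't part of the Store class; I'm also assuming the key and value arrive as ready-made JSON strings, so Gson is left out.

```kotlin
import java.util.Base64

// Hypothetical helpers mirroring the log format used by Store; not part of the original class.

// Encodes an already-serialized JSON string so it contains no spaces or newlines.
fun encodeField(json: String): String =
    Base64.getEncoder().encodeToString(json.toByteArray(Charsets.UTF_8))

// Reverses encodeField.
fun decodeField(encoded: String): String =
    String(Base64.getDecoder().decode(encoded), Charsets.UTF_8)

// Builds one "PUT <key> <value>" log line (without the trailing newline).
fun putLine(keyJson: String, valueJson: String): String =
    "PUT ${encodeField(keyJson)} ${encodeField(valueJson)}"

// Splits a log line back into command, key JSON, and (for PUT only) value JSON.
fun parseLine(line: String): Triple<String, String, String?> {
    val parts = line.split(" ")
    val value = if (parts.size > 2) decodeField(parts[2]) else null
    return Triple(parts[0], decodeField(parts[1]), value)
}
```

Because the basic Base64 alphabet has no whitespace, a key like "john doe" or a value with a line break in it still ends up as exactly one space-free token, which is what lets the reader split each line on a single space.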

Let's face some harsh truths. The performance of this is probably pretty bad, especially at startup, when the whole log has to be replayed. The storage method is largely inefficient, since every overwritten or removed entry stays in the file forever. The only way to deploy it is to ship it with the application, and it limits the deployment to a single instance of the application. Luckily, none of these is a serious problem given the way the application will be used, deployed, etc.
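If the startup time or file size ever did become a problem, one possible remedy would be to compact the log now and then. The sketch below is not something my Store does; compact is a hypothetical function that takes log lines in the format above and returns an equivalent, shorter list. It assumes that equal keys always serialize to the same encoded string, which should hold for simple keys such as strings.

```kotlin
// Hypothetical compaction step; the original Store does not do this.
// Input and output are log lines in the "PUT <key> <value>" / "REM <key>" format.
fun compact(lines: List<String>): List<String> {
    // Latest encoded value per encoded key, in first-insertion order.
    val live = LinkedHashMap<String, String>()
    for (line in lines) {
        val parts = line.split(" ")
        when (parts[0]) {
            "PUT" -> live[parts[1]] = parts[2]
            "REM" -> live.remove(parts[1])
        }
    }
    // One PUT per surviving key is enough to rebuild the same map.
    return live.map { (key, value) -> "PUT $key $value" }
}
```

The result could be written to a temporary file and swapped in for the old log while the application is stopped. For a few hundred objects it's hardly worth the trouble, though.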

On the other side, it has a few key benefits. There's literally no setup needed. The user of the app can simply run the JAR and everything works out of the box. The usage in the code is super simple, as the available operations resemble the ones in the classical Map interface. Last, but not least, any schema migration necessary can be prepared in the form of a simple, short Kotlin file:

fun main(args: Array<String>) {
    val oldStore = Store(File("a"), String::class, A::class)
    val newStore = Store(File("b"), String::class, B::class)
    oldStore.values.forEach { newStore[it.id] = B(it.id, it.someField) }
}

Summary

Before I let you go, let's do a quick wrap-up. In most cases, you should not write your own database; use an existing one instead. You should only write one of your own if you can justify the time and money invested with a reasonable benefit. Fortunately, there's the rare case when writing something of your own is actually faster than using an existing solution, with no big risks involved. If that's your case, I strongly encourage you to write a data store of your own. It's actually a lot of fun!

Opinions expressed by DZone contributors are their own.
