BSON Data Types — ObjectId

In this article, learn about the makeup and history of ObjectId and explore binary JSON (BSON).

By Ken Alger · Dec. 06, 19 · Review

In the database world, it is frequently important to have a unique identifier associated with each record. In a legacy, tabular database, these unique identifiers are often used as primary keys. In a modern database such as MongoDB, we likewise need a unique identifier, stored in the `_id` field, as a primary key. MongoDB provides an automatic unique identifier for the `_id` field in the form of the `ObjectId` data type.

If you are familiar with MongoDB documents, you've likely come across the `ObjectId` data type in the `_id` field. If you aren't, the [ObjectId](https://alger.me/mongodb-objectid) data type is automatically generated as a unique document identifier when no other identifier is provided. But what is an `ObjectId`? What makes it unique? This post will unveil some of the magic behind the BSON ObjectId data type. First, though, what is BSON?

Binary JSON (BSON)

Many programming languages have JavaScript Object Notation (JSON) support or similar data structures. MongoDB uses JSON documents to store records. However, behind the scenes, MongoDB represents these documents in a binary-encoded format called BSON. BSON provides additional data types and ordered fields to allow for efficient support across a variety of languages. One of these additional data types is ObjectId.
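To make the binary encoding concrete, here is a minimal sketch of a BSON encoder in Python, written from the public BSON specification rather than taken from any MongoDB driver. It handles only three value types (UTF-8 string, 32-bit integer, and ObjectId). The helper names `encode_element` and `encode_document` are hypothetical, and treating any 12-byte `bytes` value as an ObjectId is a simplification for illustration.

```python
import struct


def _cstring(name: str) -> bytes:
    """Field names are stored as NUL-terminated UTF-8 strings."""
    return name.encode("utf-8") + b"\x00"


def encode_element(name: str, value) -> bytes:
    """Encode one field as: type tag, field name, then the value payload."""
    if isinstance(value, str):
        data = value.encode("utf-8") + b"\x00"
        # 0x02 = string: little-endian int32 length (including the NUL), then bytes
        return b"\x02" + _cstring(name) + struct.pack("<i", len(data)) + data
    if isinstance(value, bytes) and len(value) == 12:
        # 0x07 = ObjectId: exactly 12 raw bytes (assumption: any 12-byte value)
        return b"\x07" + _cstring(name) + value
    if isinstance(value, int) and not isinstance(value, bool):
        # 0x10 = 32-bit signed integer, little-endian
        return b"\x10" + _cstring(name) + struct.pack("<i", value)
    raise TypeError(f"unsupported type in this sketch: {type(value).__name__}")


def encode_document(doc: dict) -> bytes:
    """Encode a document: int32 total size, elements in order, trailing NUL."""
    body = b"".join(encode_element(key, value) for key, value in doc.items())
    # Total size counts the 4-byte size prefix and the 1-byte terminator.
    return struct.pack("<i", len(body) + 5) + body + b"\x00"


if __name__ == "__main__":
    print(encode_document({"hello": "world"}).hex())
```

Each element begins with a one-byte type tag, which is how BSON tells an ObjectId (0x07) apart from a string (0x02), and elements are written in the order the fields were given.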

Makeup of an ObjectId

Let's start by examining what goes into an ObjectId. In its current implementation, an ObjectId is a 12-byte value, usually displayed as a 24-character hexadecimal string. That makes it smaller than a typical universally unique identifier (UUID), which is 128 bits (16 bytes). Beginning in MongoDB 3.4, an ObjectId consists of the following values:

  • 4-byte value representing the seconds since the Unix epoch
  • 5-byte random value
  • 3-byte counter, starting with a random value
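The three parts listed above can be sketched in Python as follows. This is an illustration of the layout, not MongoDB's or any driver's actual implementation. The function names are hypothetical, and two details are assumptions on my part: that the 5-byte random value is chosen once per process, and that the timestamp and counter are stored big-endian.

```python
import os
import threading
import time

# Assumption: the 5-byte random value is picked once when the process starts.
_PROCESS_RANDOM = os.urandom(5)
# The 3-byte counter starts at a random value and then increments.
_counter = int.from_bytes(os.urandom(3), "big")
_lock = threading.Lock()


def new_object_id_hex(now=None) -> str:
    """Build a 12-byte ObjectId-style value and return it as 24 hex characters."""
    global _counter
    seconds = int(time.time() if now is None else now) & 0xFFFFFFFF
    with _lock:
        _counter = (_counter + 1) % (1 << 24)  # wrap to stay within 3 bytes
        count = _counter
    raw = seconds.to_bytes(4, "big") + _PROCESS_RANDOM + count.to_bytes(3, "big")
    return raw.hex()


def split_object_id(hex_id: str) -> dict:
    """Split a 24-character hex string into its three parts."""
    raw = bytes.fromhex(hex_id)
    if len(raw) != 12:
        raise ValueError("an ObjectId is exactly 12 bytes (24 hex characters)")
    return {
        "seconds": int.from_bytes(raw[0:4], "big"),
        "random": raw[4:9].hex(),
        "counter": int.from_bytes(raw[9:12], "big"),
    }


if __name__ == "__main__":
    generated = new_object_id_hex()
    print(generated, split_object_id(generated))
```

Because the timestamp occupies the leading bytes, values generated this way sort roughly by creation time when compared as byte strings.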


[Figure: Makeup of ObjectId]

With this makeup, ObjectIds are very likely to be globally unique, and MongoDB's unique index on `_id` guarantees that no two documents in a collection share the same value. That makes them a good fit for the uniqueness requirement of the `_id` field. While the `_id` in a collection can be an auto-assigned `ObjectId`, it can also be user-defined, as long as it is unique within the collection. Remember that if you aren't using a MongoDB-generated `ObjectId` for the `_id` field, the application creating the document has to ensure the value is unique.

History of ObjectId

The makeup of the ObjectId has changed over time. Through version 3.2, it consisted of the following values:

  • 4-byte value representing the seconds since the Unix epoch,
  • 3-byte machine identifier,
  • 2-byte process id, and
  • 3-byte counter, starting with a random value.
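For comparison, the older layout listed above can be read out of the same 12 bytes by slicing at different offsets. The following is a small sketch with a hypothetical function name. The sample value is only an illustrative 24-character hex string, and the machine identifier, process id, and counter are returned as raw hex so the sketch makes no claim about how those fields were byte-ordered.

```python
def split_legacy_object_id(hex_id: str) -> dict:
    """Split a 24-character hex string using the pre-3.4 field boundaries."""
    raw = bytes.fromhex(hex_id)
    if len(raw) != 12:
        raise ValueError("an ObjectId is exactly 12 bytes (24 hex characters)")
    return {
        "seconds": int.from_bytes(raw[0:4], "big"),  # seconds since the Unix epoch
        "machine": raw[4:7].hex(),                   # 3-byte machine identifier
        "process": raw[7:9].hex(),                   # 2-byte process id
        "counter": raw[9:12].hex(),                  # 3-byte counter
    }


if __name__ == "__main__":
    print(split_legacy_object_id("507f1f77bcf86cd799439011"))
```

The middle five bytes occupy the same position in both layouts; only their meaning changed, from machine identifier plus process id to a single random value.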

The change from a machine-specific identifier and process id to a random value increased the likelihood that an `ObjectId` would be globally unique. Those five machine-specific bytes became less likely to be unique with the prevalence of virtual machines (VMs) that shared the same MAC addresses and started their processes in the same order. While uniqueness still isn't guaranteed, removing machine-specific information from the `ObjectId` reduces the chance that two different machines generate the same `ObjectId`.

ObjectId Odds of Uniqueness

The randomness of the last eight bytes in the current implementation makes the likelihood of the same ObjectId being created pretty small. How small depends on the number of inserts per second that your application does. Let's do some quick math and look at the odds.

If we do one insert per second, the first four bytes of the ObjectId change with every insert, so we can't have a duplicate ObjectId. But when multiple documents are inserted in the same second, what are the odds that *two* ObjectIds are the same? Since there are *eight* bits in a byte and *eight* random bytes in our ObjectId (5 random bytes plus a 3-byte counter with a random starting value), the denominator in our odds ratio is 2^(8*8) = 2^64, or about 1.84467441 x 10^19.

For those who have forgotten scientific notation, that's roughly 18,446,744,100,000,000,000 (the exact value is 18,446,744,073,709,551,616). Yes, that's correct: 18 quintillion and change. For perspective, the odds of being struck by lightning in the U.S. in a given year are 1 in 700,000, according to National Geographic. The odds of winning the Powerball Lottery jackpot are 1 in 292,201,338. The numerator in our odds equation is the number of documents inserted per second. Even in a write-heavy system with 250 million writes per second, the odds, while not zero, are strongly against duplicate ObjectIds being generated.
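The arithmetic above is easy to reproduce. The short Python sketch below follows the article's simplified model, in which all eight trailing bytes are treated as random and the odds are taken as writes per second divided by 2^64. The function names are hypothetical, and this is not a full collision analysis.

```python
from fractions import Fraction

RANDOM_BYTES = 5 + 3                  # 5 random bytes + 3-byte counter
ID_SPACE = 2 ** (8 * RANDOM_BYTES)    # 2^64 possible values per second


def duplicate_ratio(writes_per_second: int) -> Fraction:
    """The article's odds ratio: writes per second over the size of the space."""
    return Fraction(writes_per_second, ID_SPACE)


def one_in(writes_per_second: int) -> int:
    """Express the same ratio as '1 in N', rounded down to a whole number."""
    return ID_SPACE // writes_per_second


if __name__ == "__main__":
    print(f"ID space: {ID_SPACE:,}")
    print(f"250 million writes/second: 1 in {one_in(250_000_000):,}")
    print(f"as a decimal: {float(duplicate_ratio(250_000_000)):.3e}")
```

At 250 million writes per second this works out to roughly 1 in 73.8 billion, far longer odds than the Powerball jackpot figure quoted above.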

Wrap Up

ObjectId is one of the data types in the BSON specification that MongoDB uses for data storage. BSON is a binary representation of JSON and includes data types beyond those defined in JSON. ObjectId is a compact data type that works well as a unique identifier in MongoDB documents.

Published at DZone with permission of Ken Alger. See the original article here.
