Save Large Files in MongoDB Using Kundera

Kundera can save some time when it comes to saving big files in MongoDB. This tutorial shows you how it's done.

Kundera is a "Polyglot Object Mapper" with a JPA interface. It currently supports Cassandra, MongoDB, HBase, Redis, Oracle NoSQL, Neo4j, CouchDB, Kudu, relational databases, and Apache Spark.

If you are new to Kundera, read "Getting Started in 5 minutes" and "Working with MongoDB using Kundera" first.

Storing Large Files in MongoDB

For storing and retrieving files that exceed the BSON document size limit of 16 MB, MongoDB provides GridFS.

Instead of storing a file in a single document, GridFS divides the file into chunks and stores each chunk as a separate document. By default, GridFS uses a chunk size of 255 kB. GridFS uses two collections to store files: one holds the file chunks, and the other holds the file metadata.
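To make the chunking concrete, here is a minimal sketch of the arithmetic only. It does not call MongoDB or Kundera; the class and method names (GridFsChunkSketch, chunkCount, split) are made up for illustration, and the chunk size assumes the 255 kB default mentioned above, taken as 255 * 1024 bytes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class GridFsChunkSketch {
    // Assumed default GridFS chunk size: 255 kB, taken here as 255 * 1024 bytes.
    public static final int DEFAULT_CHUNK_SIZE = 255 * 1024;

    // Number of chunk documents needed for a file of the given length (ceiling division).
    public static long chunkCount(long fileLength, int chunkSize) {
        return (fileLength + chunkSize - 1) / chunkSize;
    }

    // Splits the data into consecutive chunks. The list index plays the role
    // of the sequence number that orders the chunks of one file.
    public static List<byte[]> split(byte[] data, int chunkSize) {
        List<byte[]> chunks = new ArrayList<>();
        for (int offset = 0; offset < data.length; offset += chunkSize) {
            int end = Math.min(data.length, offset + chunkSize);
            chunks.add(Arrays.copyOfRange(data, offset, end));
        }
        return chunks;
    }

    public static void main(String[] args) {
        long twentyMb = 20L * 1024 * 1024;
        // 80 full chunks plus one partial chunk.
        System.out.println(chunkCount(twentyMb, DEFAULT_CHUNK_SIZE)); // prints 81
    }
}
```

Under these assumptions, a 20 MB file, which is too large for a single BSON document, ends up as 81 chunk documents plus one metadata document.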

GridFS with Kundera

Kundera lets you perform CRUD and query operations on large files in MongoDB through GridFS. All you need to do is add the @javax.persistence.Lob annotation to the field you want to store using GridFS.

Sample Entity:

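import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Lob;

@Entity
public class Book
{
    @Id
    private String id;

    @Column
    private String title;

    // Stored through GridFS because of the @Lob annotation
    @Lob
    private byte[] pdfFile;

    //setters and getters
}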


Mapping

@Lob Field ==> GridFSInputFile.

Other Fields ==> GridFSInputFile.metadata.

Consider the Book entity shown above. A GridFSInputFile is created from pdfFile (the @Lob field). The other fields (id, title) are saved in the metadata of that GridFSInputFile.
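The mapping rule can be pictured with a small, dependency-free sketch. This is not Kundera's implementation: LobField is a hypothetical stand-in for javax.persistence.Lob, and FileParts and toFileParts are invented names that only model "Lob field becomes the file content, everything else becomes metadata".

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

public class LobMappingSketch {
    // Hypothetical stand-in for javax.persistence.Lob, so the sketch compiles without JPA.
    @Retention(RetentionPolicy.RUNTIME)
    public @interface LobField {}

    // Simplified copy of the Book entity from the article.
    public static class Book {
        private String id;
        private String title;
        @LobField
        private byte[] pdfFile;

        public Book(String id, String title, byte[] pdfFile) {
            this.id = id;
            this.title = title;
            this.pdfFile = pdfFile;
        }
    }

    // The two parts of a stored file: binary content and metadata.
    public static class FileParts {
        public byte[] content;
        public final Map<String, Object> metadata = new HashMap<>();
    }

    // Routes the annotated field to the content and all other fields to the metadata.
    public static FileParts toFileParts(Object entity) {
        FileParts parts = new FileParts();
        for (Field field : entity.getClass().getDeclaredFields()) {
            if (field.isSynthetic()) {
                continue; // skip compiler- or tool-generated fields
            }
            try {
                field.setAccessible(true);
                Object value = field.get(entity);
                if (field.isAnnotationPresent(LobField.class)) {
                    parts.content = (byte[]) value;
                } else {
                    parts.metadata.put(field.getName(), value);
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        FileParts parts = toFileParts(new Book("1", "Some Title", new byte[] {1, 2, 3}));
        System.out.println(parts.content.length);         // prints 3
        System.out.println(parts.metadata.get("title"));  // prints Some Title
    }
}
```

For a Book, the sketch yields the pdfFile bytes as content and a metadata map holding id and title, which mirrors the split described above.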

CRUD and Query Operations

Refer to this test case.

Limitations

  • CRUD is possible only on simple entities (without relationships, inheritance, embeddables, etc.).
  • Only SELECT queries are allowed, with WHERE and ORDER BY clauses.
  • The Lob field can only be of type byte[].
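Because the Lob field must be a byte[], the file content has to be loaded into memory before it is set on the entity. A minimal sketch using only the JDK follows; the class and method names (LobBytesSketch, readAll, writeTempFile) are invented for illustration, and the temporary file stands in for a real PDF.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class LobBytesSketch {
    // Reads the whole file into memory, producing a value suitable
    // for a byte[] field such as Book.pdfFile.
    public static byte[] readAll(Path file) {
        try {
            return Files.readAllBytes(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Helper for the demo: writes the given bytes to a new temporary file.
    public static Path writeTempFile(byte[] data) {
        try {
            Path tmp = Files.createTempFile("book", ".pdf");
            Files.write(tmp, data);
            return tmp;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = writeTempFile(new byte[] {1, 2, 3, 4});
        byte[] pdfFile = readAll(tmp); // would be passed to the entity's setter
        System.out.println(pdfFile.length); // prints 4
        Files.delete(tmp);
    }
}
```

Since the whole file is held in a single array, available heap memory is worth keeping in mind for very large files.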

Conclusion

Kundera, being JPA compliant, makes it easier to work with NoSQL databases. Irrespective of the database used, you write the same JPA queries. Also, if you want to switch databases (say, from HBase to MongoDB), there is no need to rewrite the code; you only change some configuration in the persistence.xml file.
