High-Performance Persistence With MicroStream (Part Three)
After persisting a tree in the last part, this time I try a graph-type data structure. A simple version of a graph is a ring.
For some time, there has been a new competitor in the field of persistence and serialization. We are talking about Project MicroStream. What is it exactly? MicroStream claims to be a high-performance and, most importantly, developer-friendly solution for the challenges of serialization and persistence.
In this multi-part series, we will look in detail at how easy, fast, and comfortable it really is.
What Happened Until Now
In the last part about the MicroStream Engine, we looked at how to handle a tree. It turned out to be as easy as persisting a single instance of a class. We saw how to persist this data structure to disk. The tree in that example was a trivial implementation of a binary tree.
What's Up Today
After persisting a tree last time, I will now try a graph-type data structure. A simple version of a graph is a ring. For this, I reuse the class named Node that was created in the previous part.
To form a ring, I start from one node and always set only the left child. The left child of the last node is the first node again, which closes the cycle. The question that arises is therefore: Can the MicroStream Engine deal with such a design, or are additional measures necessary to handle cycles within a data structure that is to be stored?
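The construction described above can be sketched in plain Java. The Node class shown here is only an assumed stand-in for the class from the previous part (an ID plus a left and a right child); the names RingDemo, buildRing, and ringIds are hypothetical and exist only for this illustration. Nothing in this sketch touches MicroStream yet.

```java
import java.util.ArrayList;
import java.util.List;

class RingDemo {

    /** Assumed stand-in for the Node class of the previous part. */
    static final class Node {
        final int id;
        Node left;   // the only child used to form the ring
        Node right;  // stays null in this example

        Node(int id) {
            this.id = id;
        }
    }

    /** Builds a ring of {@code size} nodes linked via {@code left}; returns the node with ID 1. */
    static Node buildRing(int size) {
        if (size < 1) {
            throw new IllegalArgumentException("size must be >= 1");
        }
        Node first = new Node(1);
        Node current = first;
        for (int id = 2; id <= size; id++) {
            current.left = new Node(id);
            current = current.left;
        }
        current.left = first; // the last node points back to the first one
        return first;
    }

    /** Walks the ring exactly once and collects the IDs in visiting order. */
    static List<Integer> ringIds(Node root) {
        List<Integer> ids = new ArrayList<>();
        Node current = root;
        do {
            ids.add(current.id);
            current = current.left;
        } while (current != root);
        return ids;
    }

    public static void main(String[] args) {
        Node root = buildRing(4);
        System.out.println(ringIds(root)); // prints [1, 2, 3, 4]
    }
}
```

The same ringIds walk can be run against a freshly loaded root node to check that the ring survived a round trip.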
Once this construction has been created, it will be passed to the MicroStream Engine with the goal of persisting this instance on the disk. The node with ID 1 is used as the root node.
The storage process runs smoothly, which is already a good sign. Thus, it seems to have been recognized that all nodes are to be stored only once. Next comes the attempt to load this data back into a newly created instance of a MicroStream Engine.
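How a persistence layer can store every node of a cyclic structure exactly once is easy to illustrate with an identity-based registry. The following is a generic sketch of that technique, not MicroStream's actual implementation: each object gets an ID the first time it is reached, and the walk stops whenever it meets an object that is already registered. The names IdentityWalk, assignObjectIds, and ring are hypothetical, and the Node class is again an assumed stand-in.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Map;

class IdentityWalk {

    /** Assumed stand-in for the Node class of the previous part. */
    static final class Node {
        final int id;
        Node left;
        Node right;

        Node(int id) {
            this.id = id;
        }
    }

    /**
     * Registers every node reachable from root exactly once.
     * The map compares keys by reference (==), so a cycle ends the walk
     * as soon as it arrives at a node that already has an object ID.
     */
    static Map<Node, Long> assignObjectIds(Node root) {
        Map<Node, Long> objectIds = new IdentityHashMap<>();
        Deque<Node> pending = new ArrayDeque<>();
        if (root != null) {
            pending.push(root);
        }
        long nextObjectId = 1;
        while (!pending.isEmpty()) {
            Node node = pending.pop();
            if (objectIds.containsKey(node)) {
                continue; // already registered, do not follow it again
            }
            objectIds.put(node, nextObjectId++);
            if (node.left != null) {
                pending.push(node.left);
            }
            if (node.right != null) {
                pending.push(node.right);
            }
        }
        return objectIds;
    }

    /** Small helper that builds a ring of the given size via the left child. */
    static Node ring(int size) {
        Node first = new Node(1);
        Node current = first;
        for (int id = 2; id <= size; id++) {
            current.left = new Node(id);
            current = current.left;
        }
        current.left = first;
        return first;
    }

    public static void main(String[] args) {
        System.out.println(assignObjectIds(ring(4)).size()); // prints 4
    }
}
```

Without the identity check, the walk over a ring would never terminate, which is exactly the failure a storage engine has to avoid.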
In the usual manner, a new instance of the storage engine is created, and the root node is loaded. A subsequent pass through the ring reveals that it is again an accurate replica of the previously created construction. Admittedly, this has been a simple construction, but it seems to work in principle.
Data — Export
After storing and loading the data, we arrive at an interesting question: How do you export such a data structure? According to the online documentation, there are in principle two ways. The first and certainly the most straightforward one is to create a binary dump.
To save the currently generated ring as a backup in a binary format, a destination directory is required. All exported files are stored there. To get the data from the storage engine, we need a connection of the type StorageConnection from the active StorageManager, which manages the data to be exported.
This connection can be used to extract the data and export it to the hard disk. In principle, it is nothing more than a copy, with the difference that it has been cleaned up a bit compared to the regular data directory. It also seems to be the case that the individual files are now sorted by type.
This should mean that each file on disk holds the data of a single type. The file names reflect that, too. Details of the different possibilities are not yet found in the documentation. Maybe this will change shortly.
Once you have made a binary export, you can convert it to a CSV format. A binary data export, therefore, seems to be mandatory. To save the data from the binary format as CSV, you need access to the previously generated files.
In this example, that is pretty straightforward, as the binary export was done just before, and the files for the persisted types are available in an instance of type XSequence<File>. From this sequence, you can get all entries as type File and then use them for the CSV export.
If you now look at a file in the CSV export directory, then you recognize the structure used here.
On the right, we recognize the data content again: the IDs from 1 to 4. Based on the reference numbers, you can also identify the ring. The format is so simple at this point that you can even add more records manually.
In this part, we looked at two things in more detail. First, whether it is possible to save data structures like graphs, that is, data structures with cyclic connections. This data was stored properly and read back into memory.
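Identifying the ring from the reference numbers amounts to following each record's reference to the next record until you arrive back at the start. The sketch below illustrates this with a deliberately simplified, made-up row layout (object ID, node ID, reference to the left child, separated by commas). This is not the actual column layout or separator of the MicroStream CSV export, and the object IDs 101 to 104 are invented sample values; the names RingFromRows and followRing are hypothetical as well.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class RingFromRows {

    /**
     * Follows the left references starting at startObjectId and returns the
     * node IDs in visiting order. Stops when a record is reached a second time.
     * Assumed row layout (illustrative only): objectId,nodeId,leftReference
     */
    static List<Integer> followRing(List<String> rows, long startObjectId) {
        Map<Long, long[]> recordsByObjectId = new HashMap<>();
        for (String row : rows) {
            String[] columns = row.split(",");
            long objectId = Long.parseLong(columns[0].trim());
            long nodeId = Long.parseLong(columns[1].trim());
            long leftReference = Long.parseLong(columns[2].trim());
            recordsByObjectId.put(objectId, new long[] {nodeId, leftReference});
        }

        List<Integer> nodeIds = new ArrayList<>();
        Set<Long> seen = new HashSet<>();
        long current = startObjectId;
        while (seen.add(current)) { // add() returns false once we are back at a known record
            long[] record = recordsByObjectId.get(current);
            if (record == null) {
                throw new IllegalStateException("Dangling reference: " + current);
            }
            nodeIds.add((int) record[0]);
            current = record[1];
        }
        return nodeIds;
    }

    public static void main(String[] args) {
        // Invented sample rows; the last record refers back to the first one.
        List<String> rows = List.of(
                "101,1,102",
                "102,2,103",
                "103,3,104",
                "104,4,101");
        System.out.println(followRing(rows, 101L)); // prints [1, 2, 3, 4]
    }
}
```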
After that, we took a quick look at how export can be carried out. My motivation here was to see how the data is stored in the CSV format, and if I recognize the ring that was created before. That worked well, in my opinion.
Thus, one can export the data in a simple text format, which in itself can be an essential criterion for later use.
In the next part, I will discuss the topic of how the MicroStream Engine will behave when using amounts of data that are larger than the available memory. We're talking about mechanisms like lazy loading here.