Are traditional storage's days numbered? The push is on to perfect an open-source technology that allows your applications to keep data in memory — shut down the application and the machine — but the data will still be there when you need it.
Eric Kaczmarek, a senior Java performance architect in Intel’s Software Solution Group, spoke about this topic at the In-Memory Computing Summit Oct. 24 in San Francisco. His session was titled, "In-Persistent-Memory Computing with Java."
I interviewed Eric and his colleague Steve Dohrmann, a senior staff software engineer at Intel, just prior to that conference. Collectively the two men have nearly a half-century of experience working with Java.
When they told me that one day soon you will no longer need traditional disks in your system — no spinning drives and no SSD drives — well, that caught my attention. Imagine, if you will, a machine that has only memory and nothing else. No storage at all.
Tom: Intel’s talk at the IMC Summit this month is titled, “In-Persistent-Memory Computing with Java.” Is ‘In-Persistent-Memory’ something new?
Steve: It’s not completely new. But the way we are trying to offer programming for it — and the size, density, and capacity — is something that hasn’t been seen before. If you think about a disk being persistent – when you write to a disk and close the file, it’s still there. And RAM being just write-accessible… this combines the two.
You program it like you would do for RAM, but it behaves like a disk. This is not brand-new, but offering it for Java — that part is new.
Eric: You can find papers from the 1990s talking about this concept. But no one has built the hardware to do it until recently. There’s a new industry push to build this new memory technology that allows your application to keep data in memory: shut down the application and the machine, and the data will still be there. It’s an old problem with a new solution.
Steve: All Java is programmed with Java classes and Java objects — and they’re all typically on the Java heap. When an application creates a new object, the JVM (Java Virtual Machine) sub-allocates a contiguous area of heap memory to store it.
What we want to do is maintain that programming model — Java classes and Java objects — but selectively you can pick some of those objects to be persistent. Specifically, you might have a map (something that maps keys and values). This would be a traditional Java map interface — but this particular map would retain its values in persistent memory. Indefinitely.
There’s no need to serialize it; there’s no need to write it to a file. It behaves like a regular Java object but this one survives across Java VM instances and even across machine restarts.
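For contrast, here is roughly what keeping a map across JVM restarts looks like today with standard Java serialization. This is a minimal sketch and not Intel's API: the class name SerializedMapDemo and its save and load helpers are made up for illustration. The persistent objects Steve describes are meant to make the explicit save and load steps unnecessary.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

// Conventional approach: a heap-resident map has to be serialized to a file
// to survive a JVM restart, then deserialized again on startup.
// (Hypothetical demo class, not part of any persistent-memory library.)
class SerializedMapDemo {

    // Write the whole map to a file using standard Java serialization.
    static void save(Map<String, Integer> map, Path file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(new HashMap<>(map));
        }
    }

    // Read the map back; this is the step a freshly started JVM would run.
    @SuppressWarnings("unchecked")
    static Map<String, Integer> load(Path file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (Map<String, Integer>) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("inventory", ".ser");
        Map<String, Integer> inventory = new HashMap<>();
        inventory.put("widgets", 42);

        save(inventory, file);                       // explicit "flush to disk"
        Map<String, Integer> restored = load(file);  // explicit "reload"

        System.out.println(restored.get("widgets")); // prints 42
        Files.delete(file);
    }
}
```

With a map whose contents live in persistent memory, the application would in principle just keep calling put and get, and the save and load methods above would have no counterpart.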
Tom: Could you give an example of how this would pertain to Apache Ignite?
Eric: If you look at Apache Ignite today, you know it’s fully in-memory. The memory is distributed across machines and that allows for quick access. But if you want to make it persistent you need to flush or store a superset of data on disk.
You wouldn’t have to do that anymore; you just keep it in memory at all times. You no longer need to access your drives or anything else. You can shut down the machine, reboot it, and the data will be there.
Tom: Are you working with the Apache Ignite community on this project?
Eric: We plan on doing that at some point down the road.
Tom: How will this change the paradigm of how developers work?
Steve: I think the first category of opportunity is in retrofitting or modifying databases and things like key-value stores and NoSQL databases. Right now, they are sometimes constrained by the fact you have to conform around their particular object types... a particular key or value — or you have to have all string data for your databases.
What persistent Java objects give you is relatively arbitrary types, so you can have a Java object with the same kind of flexibility you would have with any other Java object. But now this could go into your map.
Tom: So, then it’s really no longer a storage service, or specialized application for storing key-value pairs, rows, or columns?
Steve: Exactly. It’s just Java objects. Anything you can imagine designing with Java classes — you could make persistent. And that could be a new kind of database or data store.
Tom: What else can we expect?
Steve: The second category of ideas is things like long-lived caches. We might cache right now in RAM as a performance tool, but when the Java VM exits, you lose that cache. People complain about how long it takes to warm up their caches. But you could have a persistent cache that behaves like memory. It’s written in the same general style you’d write any Java object cache, but now it’s persistent. You can restart your JVM and it’s already warmed up.
That kind of long-lived cache is sometimes called “memoization.” If it’s an expensive computation, and you want to cache the result of that, you could also cache those kinds of things.
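The memoization pattern Steve mentions can be sketched in plain Java as follows. The Memoizer class is hypothetical, and the backing map here is an ordinary in-memory ConcurrentHashMap, so this cache is still lost when the JVM exits. The idea in the interview is that a persistent Map implementation could be supplied in its place; that substitution is an assumption about how such an API might be used, not something shown here.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical memoizing cache: results of an expensive function are stored
// in a Map so that each key is computed at most once.
class Memoizer<K, V> {
    private final Map<K, V> cache;        // backing store (in-memory in this sketch)
    private final Function<K, V> compute; // the expensive computation

    Memoizer(Map<K, V> cache, Function<K, V> compute) {
        this.cache = cache;
        this.compute = compute;
    }

    // Return the cached value, computing and storing it on first request.
    V get(K key) {
        return cache.computeIfAbsent(key, compute);
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger(); // counts real computations
        Map<Integer, Long> backing = new ConcurrentHashMap<>();
        Memoizer<Integer, Long> squares = new Memoizer<>(backing, n -> {
            calls.incrementAndGet();
            return (long) n * n; // stand-in for an expensive computation
        });

        System.out.println(squares.get(12)); // computed: prints 144
        System.out.println(squares.get(12)); // served from cache: prints 144
        System.out.println(calls.get());     // prints 1
    }
}
```

Because the cache is passed in as a plain Map, the calling code would not need to change if the backing store were swapped for one that survives restarts.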
Tom: And the third category of ideas?
Steve: It’s a big bucket of things that nobody has thought of yet — all of a sudden you have this easy-to-use persistence. We’re hoping that there will be brand-new ideas that come out of this.
Eric: I have two things to add to that. This memory will give you huge capacity. Today your machines have about 200 GB of memory. What we are talking about will bring that to terabytes. We’re talking 6+ terabytes of memory per node potentially.
Tom: Eric, I recall that after your talk in Amsterdam (at the In-Memory Computing Summit Europe) back in June, someone said this would change the way we write code and the way we compute.
Eric: Yes, I remember that. Storage, traditional storage, will disappear in time. I really believe that this will change everything.
Tom: And Intel will make the machines and will contribute to the software, which will remain open source, right?
Eric: We are pushing all the Java enabling into OpenJDK which is and will remain open source. We are also working across the Apache communities to help them adopt persistent memory into their Java applications (Cassandra, Spark, HBase, HDFS, just to name a few). All of it will remain open source. However, we believe that new usages will emerge as developers get the hardware and Java support. We are just trying to showcase what is possible.
Tom: How else are you getting the word out about this sea change in machine memory?
Eric: At Intel, we are trying to show the first usages. But look back at the iPhone, which came out in 2007. At first, there were just a few apps…. At Intel, we are trying to build the ecosystem — we cannot know all of the apps that will come from this.
Tom: Any predictions in terms of what can change in machines?
Eric: In time, you will no longer need traditional disks in your system. And I’m not just talking about spinning drives. SSD drives, as well. You will have a machine which has only memory and nothing else. You won’t need any storage at all.
Tom: What about the software traditionally used today to access that kind of storage?
Steve: Accessing the memory will use software that’s very much unified with what accesses RAM. Specialized software to get at disks won’t be as necessary.
Tom: Where do you see middleware like Apache Ignite being used on these new machines?
Eric: Apache Ignite, because it’s all in Java, is one of the easiest to use. We think with little or no changes we can work with the community to enable it for this new mode. Other software will need to be modified because it won’t be able to adapt to this new API.
Steve: It’s still early days. We’re going really fast and furious. We have an open source project on GitHub. We’re doing our development in batches but we’re doing it in the open via that GitHub project. Because things are happening so fast in this project, we’re really not in a position to effectively process pull requests. It’s meant as a “read-only” until the end of the year. But you can download it and do your own trials, prototyping and see how you might be able to use these persistent Java objects.
We’re going to keep going as fast as we can on this particular project. And as Eric mentioned, we’re working with the OpenJDK Community to try to get a path towards core support for persistent memory in the JDK.
Tom: But you’ll be working with other open-source projects, right? Like Apache Ignite, and I assume Apache Cassandra…
Eric: Yes, and also Apache Spark, Hadoop, and others. One of the form factors that will result is an in-memory DIMM (Dual In-line Memory Module) that you plug into your machine. The technology behind it is 3D-XPoint.
Tom: And for our readers, 3D-XPoint is memory storage technology jointly developed by Intel and Micron that fills a gap in the storage market between dynamic RAM (DRAM) and NAND flash.
Eric: And if someone Googles “3D-XPoint” they’ll quickly learn that there are two form factors: an SSD form factor and a memory form factor. And we’re really focusing on the memory side of things and not the storage.
Tom: OK, have I missed anything?
Eric: We are always looking for feedback on the API. I invite your readers to download and play around with the APIs and email the owner on the GitHub project because we need the input.
Steve: Yes, it’s under heavy development but the broad direction of things is obvious to see and we have a little bit of documentation up there — so we are looking forward to your feedback.