
How Memory-Mapped Files, File Systems, and Cloud Storage Work

Oren Eini
·
Aug. 22, 13 · Interview

Kelly has an interesting post about memory-mapped files and the cloud. It is a response to a comment on my post, where I stated that we don’t reserve space up front in Voron because we support cloud providers that charge by the amount of storage used.

From Kelly’s post, I assume she is thinking about running it herself on her own cloud instances, and that is what her pricing reflects. Indeed, if you want to get a 100 GB cloud disk from pretty much anywhere, you’ll pay for the full 100 GB from day one. But that isn’t the scenario I actually had in mind.

I was thinking about the cloud providers. Imagine that you go to RavenHQ and get a database there. You sign up for a 2 GB plan, and all is great. Except that, on the very first write, we allocate a fixed 10 GB, and you start paying overage charges. This isn’t a cost you face when you run on your own hardware. It is what you would have to deal with as a cloud DBaaS provider, and as a consumer of such a service.

That aside, let me deal a bit with the issues of memory-mapped files and sparse files. I created six sparse files, each of them 128 GB in size, on my E drive.

As you can see, this is a 300 GB disk, but I just “allocated” 640 GB of space on it.

image

This also shows that no space has been reserved on the disk. In fact, it is entirely possible to create files that are too big for the volume they are located on.

image
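The same effect can be reproduced in a few lines. The sketch below is only an illustration, and it assumes a POSIX-style system whose file system supports sparse files (on NTFS a file has to be marked sparse explicitly, which this sketch does not do). The helper names `make_sparse_file` and `report_sizes` are made up for this example.

```python
import os
import tempfile


def make_sparse_file(path: str, apparent_size: int) -> None:
    """Create a file of the given length without writing any data to it."""
    with open(path, "wb") as f:
        # Extending a file with truncate() leaves a "hole": the length grows,
        # but the file system does not have to back it with real blocks.
        f.truncate(apparent_size)


def report_sizes(path: str):
    """Return (apparent size, allocated bytes or None if unknown)."""
    st = os.stat(path)
    blocks = getattr(st, "st_blocks", None)  # 512-byte units on POSIX
    allocated = blocks * 512 if blocks is not None else None
    return st.st_size, allocated


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sparse.bin")
        make_sparse_file(path, 256 * 1024 * 1024)  # 256 MiB, nothing written
        apparent, allocated = report_sizes(path)
        print(f"apparent size:   {apparent} bytes")
        print(f"allocated space: {allocated} bytes")
```

On a file system that supports sparse files, the allocated figure should come out far smaller than the apparent size, which is the gap between what was "allocated" and what was actually reserved.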

I did a lot of testing with memory-mapped files and sparseness, and I came to the conclusion that you can’t trust the combination. You especially can’t trust it in a cloud scenario.

But why? Well, imagine a scenario where you need to use a new page, and the file system needs to allocate one for you. At this point, it needs to find an available page. That might fail; let's imagine that it fails because there is no free space, since that is the easiest case.

What happens then? Well, you aren’t accessing things via an API, so there isn’t an error code it can return or an exception it can throw.
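To make the contrast concrete, here is a minimal sketch of the API path, where a full device is reported as an ordinary error. It assumes Linux, where the special device `/dev/full` fails every write with ENOSPC; the helper name `try_write` is invented for this example. A store into a memory-mapped view has no equivalent: it is a plain memory write, with no return value to inspect.

```python
import errno
import os
import tempfile


def try_write(path: str, data: bytes):
    """Write data through the file API; return None on success, else the errno."""
    try:
        # buffering=0 so the write reaches the OS immediately and any
        # failure is raised by write() itself rather than at close time.
        with open(path, "wb", buffering=0) as f:
            f.write(data)
    except OSError as exc:
        return exc.errno
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(try_write(os.path.join(tmp, "ok.bin"), b"hello"))  # normal file
    if os.path.exists("/dev/full"):  # Linux-only assumption
        code = try_write("/dev/full", b"hello")
        print(code == errno.ENOSPC, os.strerror(code))
```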

On Windows, it will use structured exception handling (SEH) to raise the error. On Linux, that will probably generate a SIGxxx signal (typically SIGBUS). Now, to make things interesting, this may not actually happen when you write to the newly reserved page; the OS may defer it to a later point in time (or until you call msync or FlushViewOfFile). At any rate, at some point the OS is going to wake up and realize that it promised something it can’t deliver, and at that point (which, again, may be later than the point at which you actually wrote to that page) you are going to find yourself in a very interesting situation.

I’ve actually tested that scenario, and it isn’t a good one from the point of view of reliability. You really don’t want to get there, because then all bets are off with regard to what happens to the data you wrote. And you can’t even do graceful error handling, because by then you might be past the point where handling the error is still possible.
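One common way to sidestep the deferred failure, shown here only as a sketch and not as what Voron does, is to reserve real disk blocks through an API call before mapping the file, so that running out of space is reported as an error at a known point. The example assumes a POSIX system where Python exposes `os.posix_fallocate`; where the call is missing or unsupported, it falls back to writing zeros. The helper names `reserve` and `write_mapped` are invented for illustration.

```python
import errno
import mmap
import os
import tempfile


def reserve(fd: int, size: int) -> None:
    """Back the first `size` bytes of fd with real blocks, or raise OSError."""
    if hasattr(os, "posix_fallocate"):
        try:
            os.posix_fallocate(fd, 0, size)
            return
        except OSError as exc:
            # Some file systems do not support the call; anything else
            # (notably ENOSPC) is a real failure the caller must see.
            if exc.errno not in (errno.EOPNOTSUPP, errno.EINVAL):
                raise
    # Fallback: append zeros from the current end of the file up to `size`.
    current = os.fstat(fd).st_size
    os.lseek(fd, current, os.SEEK_SET)
    zeros = bytes(64 * 1024)
    while current < size:
        current += os.write(fd, zeros[: size - current])


def write_mapped(path: str, size: int, payload: bytes) -> None:
    """Reserve `size` bytes, then write payload at offset 0 through a mapping."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        reserve(fd, size)  # out-of-space surfaces here, as an exception
        with mmap.mmap(fd, size) as view:
            view[: len(payload)] = payload
            view.flush()  # msync on POSIX, FlushViewOfFile on Windows
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.bin")
        write_mapped(path, 1024 * 1024, b"hello")
        with open(path, "rb") as f:
            print(f.read(5))
```

The trade-off is the one this post started with: whatever is reserved this way is real, billable space from the moment the call succeeds, so the amount reserved at each step matters.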

Considering that a full disk is one of those things you really need to be aware of, you can’t really trust this intersection of features.




Published at DZone with permission of Oren Eini, DZone MVB. See the original article here.
