DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Because the DevOps movement has redefined engineering responsibilities, SREs now have to become stewards of observability strategy.

Apache Cassandra combines the benefits of major NoSQL databases to support data management needs not covered by traditional RDBMS vendors.

The software you build is only as secure as the code that powers it. Learn how malicious code creeps into your software supply chain.

Generative AI has transformed nearly every industry. How can you leverage GenAI to improve your productivity and efficiency?

Related

  • AWS Serverless Lambda Resiliency: Part 1
  • Split the Monolith: What, When, How
  • Java, Spring Boot, and MongoDB: Performance Analysis and Improvements
  • Introduction to Apache Kafka With Spring

Trending

  • Infrastructure as Code (IaC) Beyond the Basics
  • How Large Tech Companies Architect Resilient Systems for Millions of Users
  • Navigating Double and Triple Extortion Tactics
  • What Is Plagiarism? How to Avoid It and Cite Sources
  1. DZone
  2. Coding
  3. Frameworks
  4. Writing a Download Server Part I: Always Stream, Never Keep Fully in Memory

Writing a Download Server Part I: Always Stream, Never Keep Fully in Memory

By 
Tomasz Nurkiewicz user avatar
Tomasz Nurkiewicz
DZone Core CORE ·
Jun. 24, 15 · Interview
Likes (0)
Comment
Save
Tweet
Share
16.8K Views

Join the DZone community and get the full member experience.

Join For Free

Downloading various files (either text or binary) is a bread and butter of every enterprise application. PDF documents, attachments, media, executables, CSV, very large files, etc. Almost every application, sooner or later, will have to provide some form of download. Downloading is implemented in terms of HTTP, so it's important to fully embrace this protocol and take full advantage of it. Especially in Internet facing applications features like caching or user experience are worth considering. This series of articles provides a list of aspects that you might want to consider when implementing all sorts of download servers. Note that I avoid "best practices" term, these are just guidelines that I find useful but are not necessarily always applicable.

One of the biggest scalability issues is loading whole file into memory before streaming it. Loading full file into byte[] to later return it e.g. from Spring MVC controller is unpredictable and doesn't scale. The amount of memory your server will consume depends linearly on number of concurrent connections times average file size - factors you don't really want to depend on so much. It's extremely easy to stream contents of a file directly from your server to the client byte-by-byte (with buffering), there are actually many techniques to achieve that. The easiest one is to copy bytes manually:

@RequestMapping(method = GET)
public void download(OutputStream output) throws IOException {
    try(final InputStream myFile = openFile()) {
        IOUtils.copy(myFile, output);
    }
}

Your InputStream doesn't even have to be buffered, IOUtils.copy() will take care of that. However this implementation is rather low-level and hard to unit test. Instead I suggest returning Resource:

@RestController
@RequestMapping("/download")
public class DownloadController {
 
    private final FileStorage storage;
 
    @Autowired
    public DownloadController(FileStorage storage) {
        this.storage = storage;
    }
 
    @RequestMapping(method = GET, value = "/{uuid}")
    public Resource download(@PathVariable UUID uuid) {
        return storage
                .findFile(uuid)
                .map(this::prepareResponse)
                .orElseGet(this::notFound);
    }
 
    private Resource prepareResponse(FilePointer filePointer) {
        final InputStream inputStream = filePointer.open();
        return new InputStreamResource(inputStream);
    }
 
    private Resource notFound() {
        throw new NotFoundException();
    }
}
 
@ResponseStatus(value= HttpStatus.NOT_FOUND)
public class NotFoundException extends RuntimeException {
}

Two abstractions were created to decouple Spring controller from file storage mechanism.FilePointer is a file descriptor, irrespective to where that file was taken. Currently we use one method from it:

public interface FilePointer {
 
    InputStream open();
 
    //more to come
 
}

open() allows reading the actual file, no matter where it comes from (file system, database BLOB, Amazon S3, etc.) We will gradually extend FilePointer to support more advanced features, like file size and MIME type. The process of finding and creatingFilePointers is governed by FileStorage abstraction:

public interface FileStorage {
    Optional<FilePointer> findFile(UUID uuid);
}

Streaming allows us to handle hundreds of concurrent requests without significant impact on memory and GC (only a small buffer is allocated in IOUtils). BTW I am using UUID to identify files rather than names or other form of sequence number. This makes it harder to guess individual resource names, thus more secure (obscure). More on that in next articles. Having this basic setup we can reliably serve lots of concurrent connections with minimal impact on memory. Remember that many components in Spring framework and other libraries (e.g. servlet filters) may buffer full response before returning it. Therefore it's really important to have an integration test trying to download huge file (in tens of GiB) and making sure the application doesn't crash.


Writing a download server

  • Part I: Always stream, never keep fully in memory
  • Part II: headers: Last-Modified, ETag and If-None-Match
  • Part III: headers: Content-length and Range
  • Part IV: Implement HEAD operation (efficiently)
  • Part V: Throttle download speed
  • Part VI: Describe what you send (Content-type, et.al.)

The sample application developed throughout these articles is available on GitHub.

Memory (storage engine) Download Stream (computing) Spring Framework application integration test User experience IT Connection (dance)

Published at DZone with permission of Tomasz Nurkiewicz, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • AWS Serverless Lambda Resiliency: Part 1
  • Split the Monolith: What, When, How
  • Java, Spring Boot, and MongoDB: Performance Analysis and Improvements
  • Introduction to Apache Kafka With Spring

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!