Streaming Large ZIP Files in Rails
Recently, I needed to add a “Download all” button in a Rails application for managing meeting assets. Specifically, this magic button would allow attendees to download all the meeting documents in a single zip file. Before I explain how I tackled streaming of large zip files, let’s first look at the file storage implementation.
How the Files are Stored
Since Rails 5.2, there is a baked-in solution for handling file uploads named Active Storage. What I like about it is that it doesn’t require you to alter any of your application’s existing models with extra columns to support file uploads. You can easily add file uploading to any model in your application. Active Storage achieves this flexibility via a polymorphic association in the ActiveStorage::Attachment model, which is a join model between your record and the ActiveStorage::Blob model:

    # rails/active_storage/app/models/active_storage/attachment.rb
    class ActiveStorage::Attachment < ActiveRecord::Base
      ...
      belongs_to :record, polymorphic: true, touch: true
      belongs_to :blob, class_name: "ActiveStorage::Blob"
      ...
    end

The ActiveStorage::Blob record contains all the necessary file metadata, including a unique key to the storage location, the filename, content type, byte size and more. Later, we will use the Blob model to access the content of our stored files bit by bit.
Although we use Active Storage, the advice in this article is storage agnostic. There are many other great alternatives for handling file uploads such as Carrierwave, Dragonfly or Shrine. But we’re going to stick with the default storage solution here.
We aren’t going to cover setting up Active Storage from scratch. Please follow the official Rails guides on how to do it in your project. Instead, our starting point will be a Meeting model. The only thing we need to do to be able to attach many documents to our meeting is to use the has_many_attached method:

    # app/models/meeting.rb
    class Meeting < ApplicationRecord
      has_many_attached :documents
    end

A Common Approach
The common solution to downloading zip archives is to create an entire zip file with all the files first. This means reading each file into memory before writing it back to the disk as part of a single archived zip file. Once done, the web server will begin sending the zip file to the client.
Unfortunately, this approach has a few drawbacks. Depending on the sizes of files in the archive, you may need a lot of memory and disk space to generate a zip file. Even if you have ample resources, your application user may need to wait a long time before their browser starts downloading the archived file. The perceived lag and inactivity will negatively impact their experience.
So what’s the alternative?
Tricks Up the Streaming Sleeve
The solution is to stream a zip archive immediately to the client as the very first file is being read from the disk. This way, we don’t even have to wait for the file to be fully read. We can start streaming in smaller byte chunks without creating a zip file upfront. This approach removes the need for large disk space and reduces memory allocations as our zip content is sent over the wire in small chunks. With decreased latency and faster download time, the user experience improves significantly.
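To make the idea concrete before bringing in any gem, here is a minimal sketch that uses only Ruby’s standard library. It is not how zip_tricks works internally and it produces a raw zlib stream rather than a zip archive. The `deflate_in_chunks` helper and the chunk sizes are my own assumptions for illustration. What it shows is the shape of the technique: read a small piece, compress it, and hand it over straight away instead of building the whole result first.

```ruby
require "zlib"
require "stringio"

# Hypothetical helper (not part of any library): reads +input+ in fixed-size
# pieces, deflates each piece and yields the compressed bytes immediately.
def deflate_in_chunks(input, chunk_size: 16 * 1024)
  deflater = Zlib::Deflate.new
  while (chunk = input.read(chunk_size))
    # SYNC_FLUSH forces the compressor to emit output for this piece now
    # instead of holding it back in its internal buffer.
    compressed = deflater.deflate(chunk, Zlib::SYNC_FLUSH)
    yield compressed unless compressed.empty?
  end
  # Emit whatever remains and terminate the stream.
  tail = deflater.finish
  yield tail unless tail.empty?
ensure
  deflater&.close
end

# Usage: in a real application the block would write to the response stream.
source = StringIO.new("meeting notes\n" * 10_000)
sent = []
deflate_in_chunks(source, chunk_size: 4096) { |piece| sent << piece }

# The pieces, joined in order, form one valid compressed stream.
restored = Zlib::Inflate.inflate(sent.join)
```

At no point does the helper hold more than one input chunk and its compressed counterpart, which is the property we are after for large archives.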
To stream large files in a single zip archive, we’re going to use the zip_tricks gem. The library boasts the ability to handle millions of zip files generated per day. So, we have our backs covered with the volume of archived files here as well.
Let’s add the download button.
The Download Button
We start by creating a request path that will handle streaming of our download. To do so, we add a download route to the meeting resources that will use a custom controller:

    # config/routes.rb
    resources :meetings do
      member do
        post :download, to: "zip_streaming#download"
      end
    end

It’s worth noting that we chose the POST method in place of GET. This way, we can skip dealing with template rendering in our controller action.
Next, we add a custom controller zip_streaming_controller.rb with a download action that will handle streaming of the zip archive:

    # app/controllers/zip_streaming_controller.rb
    class ZipStreamingController < ApplicationController
      before_action :set_meeting

      def download
      end

      private

      def set_meeting
        @meeting = Meeting.find(params[:id])
      end
    end

And finally, in our view, we add the “Download all” button that will trigger the zip file download:

    # app/views/meetings/show.html.erb
    <%= button_to "Download all", download_meeting_path(@meeting), method: :post, data: { "no-turbolink" => true } %>

Now we’re ready to discuss the implementation of the download action.
File Download Response Headers
When discussing downloading files of any kind, we need to touch on the subject of HTTP response headers and, in particular, the Content-Disposition header. The Content-Disposition response header tells the browser how to display the response content. If the browser knows how to handle the MIME type, the inline value displays the content as part of the web page. Otherwise, the content is immediately downloaded. We can also instruct the browser to always download the content and save it locally. To do this, we use an attachment disposition. When the “Save as” dialog is presented, by default, the filename is the last segment of the URL. To change this, we can use the filename attribute to name the downloaded file:

    response.headers["Content-Disposition"] = "attachment; filename=\"download.zip\""

We also want to inform the browser about the content type. To do so, we use the Content-Type response header with the “application/zip” MIME type:

    response.headers["Content-Type"] = "application/zip"

To help the user identify their download, we name our archived file after the meeting title using an easy-to-read slug identifier. Putting it all together, we add the Content-Disposition and Content-Type response headers to the download action:

    def download
      zipname = "#{@meeting.slug}.zip".gsub('"', '\"') # escape quotes
      disposition = "attachment; filename=\"#{zipname}\""
      response.headers["Content-Disposition"] = disposition
      response.headers["Content-Type"] = "application/zip"
    end

There are many quirks when dealing with the filename attribute of a Content-Disposition header. For starters, the filename may contain special characters that need escaping. To handle the edge cases in filenames and make the solution more robust, we can use ActionDispatch::Http::ContentDisposition and its format method:

    def download
      zipname = "#{@meeting.slug}.zip"
      disposition = ActionDispatch::Http::ContentDisposition.format(disposition: "attachment", filename: zipname)
      response.headers["Content-Disposition"] = disposition
      response.headers["Content-Type"] = "application/zip"
    end

However, we’re not done with the response headers yet. Since we’re dealing with streaming, we do not know the byte length of our content. When the Content-Length header is omitted, the browser will assume that the content will be streamed in chunks in a single request/response cycle. So we ensure that the header is removed:

    response.delete_header("Content-Length")

If our meeting app becomes widely successful, we want to be kind to our server resources and send a cached copy when possible. To control cache settings, we use the Cache-Control header with the “no-cache” directive. Contrary to popular belief, “no-cache” doesn’t mean that the response is never cached. It means that a cache must revalidate its stored copy with the server before reusing it.

    response.headers["Cache-Control"] = "no-cache"

For cache validation to work, we need to provide a validator in our response as well. One choice is to use the Last-Modified response header to validate the cached archive file. We use the httpdate method of the Time class to provide the date and time when the archive was last modified in the expected format:

    response.headers["Last-Modified"] = Time.now.httpdate.to_s

Before we finish the headers declaration, we need to deal with the HTTP server buffering problem. Web servers like Nginx perform buffering to reduce the overhead of writing and reading streamed content. The problem is that if you stream chunks of content, Nginx will store them in a buffer and send them to the client only when the buffer fills up or the stream closes. Unfortunately, this will make the browser wait for content. To disable this behaviour, we can use the X-Accel-Buffering header to stop Nginx from buffering:

    response.headers["X-Accel-Buffering"] = "no"

Finally, the download action with all the response headers looks like this:

    def download
      zipname = "#{@meeting.slug}.zip"
      disposition = ActionDispatch::Http::ContentDisposition.format(disposition: "attachment", filename: zipname)
      response.headers["Content-Disposition"] = disposition
      response.headers["Content-Type"] = "application/zip"
      response.headers.delete("Content-Length")
      response.headers["Cache-Control"] = "no-cache"
      response.headers["Last-Modified"] = Time.now.httpdate.to_s
      response.headers["X-Accel-Buffering"] = "no"
    end

Streaming the Zip File
Now, we can turn our attention to actually streaming the zip file content. To do this, we use ZipTricks::BlockWrite, which will be responsible for streaming chunks of the zip archive back to the browser. Each time the writer receives a chunk of content, it will call a block and write the content directly onto the response stream:

    def download
      ...
      writer = ZipTricks::BlockWrite.new do |chunk|
        response.stream.write(chunk)
      end
    end

Having specified our writer, we’re ready to open a stream for writing. We use ZipTricks::Streamer and call the open method with the previously created writer to begin writing the zip archive. As we do so, we ensure that we close the stream when the streaming is done, otherwise the socket could be left open forever:

    def download
      ...
      writer = ZipTricks::BlockWrite.new do |chunk|
        response.stream.write(chunk)
      end

      ZipTricks::Streamer.open(writer) do |zip|
      end
    ensure
      response.stream.close
    end

Next, one by one, we begin to retrieve meeting documents for streaming. We use the write_deflated_file method to create an entry in the zip archive. This method takes the document filename as an argument and yields a writer object that we will use for writing the document content:

    def download
      ...
      ZipTricks::Streamer.open(writer) do |zip|
        @meeting.documents.each do |doc|
          zip.write_deflated_file(doc.filename.to_s) do |file_writer|
            ...
          end
        end
      end
    ensure
      response.stream.close
    end

Thanks to the ActiveStorage::Attachment association, we can access document metadata via the blob record. ActiveStorage::Blob provides a download method which, when called with a block, will stream the file content in chunks. Be careful here though, as calling this method without a block would read the entire file into memory before returning its content - not what we want. Notice that since the file writer is an IO-like object, it responds to the << message that we can use to write our chunks:

    def download
      ...
      writer = ZipTricks::BlockWrite.new do |chunk|
        response.stream.write(chunk)
      end

      ZipTricks::Streamer.open(writer) do |zip|
        @meeting.documents.each do |doc|
          zip.write_deflated_file(doc.filename.to_s) do |file_writer|
            doc.blob.download do |chunk|
              file_writer << chunk
            end
          end
        end
      end
    ensure
      response.stream.close
    end

The Missing Piece
Unfortunately, calling response.stream.write isn’t enough to make file streaming work. If you were to try running our code now, it would work but hold the browser from downloading until the full archived file is ready. Each chunk written to the response.stream object would be added to the response buffer and sent to the client only when the entire response body finishes.
There is one more piece missing from this puzzle - the ActionController::Live module. To make all your actions capable of streaming live data, all you need to do is mix this module into your controller:

    # app/controllers/zip_streaming_controller.rb
    class ZipStreamingController < ApplicationController
      include ActionController::Live
      ...
    end

Once ActionController::Live is included, response.stream.write will stream data to the client in real-time without buffering. When downloading, you will see the archive file size growing as the chunks arrive.

Under the covers, the streaming is done by executing an action in a child thread. This lets Rails, and specifically Rack, process the response body in parallel with sending data to the client. Because of this, you need to make sure your action is thread-safe. It also means that the web server needs to be capable of multithreading and streaming. But Puma, the default Rails web server, has you covered here. The final caveat is that you need to specify response headers before writing data to the response stream.
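The threading arrangement is easier to picture with a small model. The sketch below uses plain Ruby and is not the Rails implementation. The `LiveStreamSketch` class, its method names and the queue capacity are hypothetical names I made up for illustration. It only mimics the idea: the action produces chunks in one thread while another thread consumes them, and closing the stream is what lets the consumer finish.

```ruby
# Hypothetical stand-in for a live response stream (not the Rails class):
# the producer pushes chunks onto a bounded queue while the consumer pops
# them concurrently.
class LiveStreamSketch
  def initialize(capacity: 4)
    @queue = SizedQueue.new(capacity)
  end

  # Blocks when the queue is full, so a slow client slows the producer down.
  def write(chunk)
    @queue << chunk
  end

  def close
    @queue.close
  end

  # Yields chunks until the stream is closed and fully drained.
  def each
    while (chunk = @queue.pop)
      yield chunk
    end
  end
end

stream = LiveStreamSketch.new

# The "action" runs in its own thread, similar in spirit to a live action.
producer = Thread.new do
  begin
    5.times { |i| stream.write("chunk-#{i}") }
  ensure
    stream.close # without this the consumer below would wait forever
  end
end

# The "server" side reads chunks as soon as they become available.
received = []
stream.each { |chunk| received << chunk }
producer.join
```

The `ensure` in the producer plays the same role as `response.stream.close` in our action: if it is skipped, the reading side never learns that the body is complete.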
Summing it all up, the entire zip streaming controller with download action looks like this:

    # app/controllers/zip_streaming_controller.rb
    class ZipStreamingController < ApplicationController
      include ActionController::Live

      before_action :set_meeting

      def download
        zipname = "#{@meeting.slug}.zip"
        disposition = ActionDispatch::Http::ContentDisposition.format(disposition: "attachment", filename: zipname)
        response.headers["Content-Disposition"] = disposition
        response.headers["Content-Type"] = "application/zip"
        response.headers.delete("Content-Length")
        response.headers["Cache-Control"] = "no-cache"
        response.headers["Last-Modified"] = Time.now.httpdate.to_s
        response.headers["X-Accel-Buffering"] = "no"

        writer = ZipTricks::BlockWrite.new do |chunk|
          response.stream.write(chunk)
        end

        ZipTricks::Streamer.open(writer) do |zip|
          @meeting.documents.each do |doc|
            zip.write_deflated_file(doc.filename.to_s) do |file_writer|
              doc.blob.download do |chunk|
                file_writer << chunk
              end
            end
          end
        end
      ensure
        response.stream.close
      end

      private

      def set_meeting
        @meeting = Meeting.find(params[:id])
      end
    end

Improving the Design
Even though we have a working implementation, there is still room for improvement. Apart from the method being verbose, it’s usually a bad practice to have so much code logic in a single controller action. Let’s do something about it.
We will deal with the headers first. As it turns out, Rails provides a convenient method, send_file_headers!, for specifying the Content-Disposition and Content-Type headers. This method will ensure the right format and escaping for the file attachment. This will reduce the header specification to this:

    def download
      zipname = "#{@meeting.slug}.zip"
      send_file_headers!(type: "application/zip", disposition: "attachment", filename: zipname)
      response.delete_header("Content-Length")
      response.headers["Cache-Control"] = "no-cache"
      response.headers["Last-Modified"] = Time.now.httpdate.to_s
      response.headers["X-Accel-Buffering"] = "no"
      ...
    end

However, we can cut down the above response headers setup even further. The response.stream.write method from the ActionController::Live module deletes the Content-Length header and sets Cache-Control to “no-cache” for us, so we can remove both lines as well:

    def download
      zipname = "#{@meeting.slug}.zip"
      send_file_headers!(type: "application/zip", disposition: "attachment", filename: zipname)
      response.headers["Last-Modified"] = Time.now.httpdate.to_s
      response.headers["X-Accel-Buffering"] = "no"
      ...
    end

Next, we extract the streaming behaviour into a separate class called DocumentsStreamer. In the constructor, it will accept a documents collection and allow us to enumerate over all the streamed chunks with the each method, essentially turning our class into an Enumerable object. As a convenience, we add a class-level method stream to abstract the underlying plumbing and provide a verb that expresses the class purpose.

    # app/services/documents_streamer.rb
    class DocumentsStreamer
      include Enumerable

      def self.stream(documents, &chunks)
        streamer = new(documents)
        streamer.each(&chunks)
      end

      attr_reader :documents

      def initialize(documents)
        @documents = documents
      end

      def each(&chunks)
        writer = ZipTricks::BlockWrite.new(&chunks)

        ZipTricks::Streamer.open(writer) do |zip|
          documents.each do |doc|
            zip.write_deflated_file(doc.filename.to_s) do |file_writer|
              doc.blob.download do |chunk|
                file_writer << chunk
              end
            end
          end
        end
      end
    end

Using the DocumentsStreamer, we can reduce our download action code down to this:

    def download
      ...
      DocumentsStreamer.stream(@meeting.documents) do |chunk|
        response.stream.write(chunk)
      end
    ensure
      response.stream.close
    end

In the end, our refactored action uses Rails to its full potential and tells a more succinct story of how the download works:

    def download
      zipname = "#{@meeting.slug}.zip"
      send_file_headers!(type: "application/zip", disposition: "attachment", filename: zipname)
      response.headers["Last-Modified"] = Time.now.httpdate.to_s
      response.headers["X-Accel-Buffering"] = "no"

      DocumentsStreamer.stream(@meeting.documents) do |chunk|
        response.stream.write(chunk)
      end
    ensure
      response.stream.close
    end

Summary
This concludes our overview of streaming large zip files in Rails. We covered a lot of ground by lifting the lid on how Active Storage can facilitate streaming of files. We explored the HTTP response headers that instruct clients to download content. None of this would be possible without the great zip_tricks gem and convenient Rails APIs. We finished by cleaning our code up and abstracting away the streaming, thus making the whole thing more maintainable.
I hope this was a useful article that showcased how you can implement any type of download feature and take advantage of Rails streaming API. Feel free to post a comment on social media.
I'm Piotr Murach. I document my programming journey and share my coding experiences with practical examples to improve your day-to-day work. If you enjoy my articles or open source projects, please consider supporting what I do. Be sure to also subscribe to my newsletter and feed.