Compressing Documents To Zip


If you have many uploaded files that a visitor needs to download, you will want to compress them to reduce the total size and combine them into a single zip archive. We will do this using the rubyzip gem. Keep in mind that if the files are audio or video, compressing them may have little or no effect on the size, since those formats are usually already compressed.

Installation

We will first add the gem to our project.

# Gemfile
gem "rubyzip"

View

= link_to 'Download Zip', route_path(format: :zip)
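
The route_path helper above is a placeholder; the link assumes a route pointing at a zip-generating action like the one below. A minimal sketch of such a route (the posts controller name is hypothetical):

# config/routes.rb
get 'zip_download', to: 'posts#zip_download', as: :zip_download

With this route in place, the link above becomes zip_download_path(format: :zip), and the ids of the uploads to include can be passed as query parameters.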

Using a Controller Route

Next, you will need to create an endpoint that generates the zip file, whether it contains a single document or many. You can nest subdirectories inside the archive by including them in the path you pass as the entry name.

def zip_download
  post_files = Post::Uploads.where(id: params[:ids])

  respond_to do |format|
    format.zip do
      # Build the archive entirely in memory
      compressed_filestream = Zip::OutputStream.write_buffer do |output|
        post_files.each do |post|
          # The entry name can include directories to nest files inside the archive
          output.put_next_entry "post_files/#{post.id}/#{post.name}.#{post.file.file.extension}"
          output.print File.binread("public/#{post.file}")
        end
      end
      compressed_filestream.rewind
      send_data compressed_filestream.read, filename: "post_files.zip"
    end
  end
end

The above method generates a compressed zip file on the fly. Another approach is to write the archive to a file on disk, which makes it well suited to running in a background job, and would be the ideal option if that is what you need. Keep in mind that the file will persist on disk until you delete it.

def zip_files
  post_files = Post::Uploads.where(id: params[:ids])

  # Writes the archive to a file on disk rather than building it in memory
  Zip::File.open('your_local_path/your_desired_name.zip', Zip::File::CREATE) do |z|
    post_files.each do |f|
      z.add(f.name, "public/#{f.file}")
    end
  end
end
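
If you do run this in a background job, a minimal ActiveJob sketch could look like the following; the job name and output path are hypothetical, and the lookup mirrors zip_files above.

# app/jobs/zip_posts_job.rb
require "zip"

class ZipPostsJob < ApplicationJob
  queue_as :default

  def perform(ids)
    post_files = Post::Uploads.where(id: ids)

    # Write the archive to tmp/ so it can be picked up or sent later
    Zip::File.open(Rails.root.join("tmp", "post_files.zip").to_s, Zip::File::CREATE) do |zipfile|
      post_files.each do |f|
        zipfile.add(f.name, "public/#{f.file}")
      end
    end
  end
end

Enqueue it with ZipPostsJob.perform_later(params[:ids]).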

Whether you build the file inline or in a job, you can then choose to send the data for direct download, or omit this step if it's a background job.

send_file 'your_local_path/your_desired_name.zip', type: 'application/zip',
      disposition: 'attachment', filename: "post_files.zip"
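
Because the archive persists on disk, remember to delete it once it is no longer needed, for example after it has been sent or after the job that consumes it has finished. Something along these lines, reusing the placeholder path from above:

zip_path = 'your_local_path/your_desired_name.zip'
File.delete(zip_path) if File.exist?(zip_path)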

JSON File Uploads or Downloads

It is possible to have a user upload a zip of JSON files and insert each one as a row in your database.

Since we only want .json files in this case, we grab just the JSON entries from the archive with glob.

# controller.rb
def create
  if params[:zip_file].present?
    # Open the uploaded archive and import every .json entry it contains
    Zip::File.open(params[:zip_file].tempfile.path) do |zip_file|
      zip_file.glob('*.json').each { |payload| Model.from_json(payload) }
    end
  end
  redirect_to root_path
end
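
For completeness, the upload itself can come from a simple multipart form; a minimal sketch in the same view style as above (the models_path helper and field name are assumptions that should match your routes and the params key used in create):

= form_with url: models_path, multipart: true do |f|
  = f.file_field :zip_file
  = f.submit 'Upload Zip'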

You can also use extract instead of get_input_stream, but here we just want to read the entry into memory rather than extracting the file somewhere on disk. We can also control which columns are accepted, to prevent users from changing information we don't want them to, so we create a PERMITTED_COLUMNS constant that specifies the columns a user is allowed to set when uploading a file. JSON.load converts our file into a hash, from which we select only the permitted columns.

# app/models/model.rb
class Model < ApplicationRecord
  PERMITTED_COLUMNS = ['title', 'user_id', 'description'].freeze

  class << self
    def from_json(payload)
      # Read the zip entry from memory and keep only the permitted columns
      attributes = JSON.load(payload.get_input_stream.read)
      Model.create!(attributes.select { |key, _value| PERMITTED_COLUMNS.include?(key) })
    rescue => e
      warn e.message
    end
  end
end

If you want to do the opposite and convert your model rows into JSON files, you can do so with the following code. When we convert the contents to JSON, we can specify which columns to include, just as we did above, using the :only option.

def index
  @model = Model.all

  respond_to do |format|
    format.html
    format.zip do
      compressed_filestream = Zip::OutputStream.write_buffer do |payload|
        @model.each do |item|
          payload.put_next_entry "#{item.name}_#{item.id}.json"
          payload.print item.to_json(only: [:title, :user_id, :description])
        end
      end
      compressed_filestream.rewind
      send_data compressed_filestream.read, filename: "model.zip"
    end
  end
end
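
As in the View section above, a link that requests the zip format is enough to hit this branch (the models_path helper is hypothetical):

= link_to 'Download JSON Zip', models_path(format: :zip)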

Using Tar.gz

This approach can be more efficient, depending on your needs, for compressing a large number of files that would cause a timeout if you used the methods above. Basically, we loop through all the download links and write their paths to a list file, one per line. We then pass that list file to tar, which finds each file and bundles them all together. You can also pass in a whole directory and tar will compress it recursively (see the sketch at the end of this section).

def download_files
  rails_tmp = File.join(Rails.root, "tmp")
  file_mass_tar_list = File.join(rails_tmp, "file_mass_tar_list.txt")
  file_mass_tar_path = File.join(rails_tmp, "file_mass.tar.gz")
  location = File.join(Rails.root, "public")
  download_links = Download.all

  # Write one relative file path per line for tar's -T (files-from) option
  File.open(file_mass_tar_list, "w+") do |f|
    download_links.each do |link|
      f.puts "downloads/#{link.name}_#{link.id}.#{link.file.file.extension}"
    end
  end

  # -c create, -z gzip, -f output file, -T read the list of files from a file
  `cd #{location}/uploads && tar -czf #{file_mass_tar_path} -T #{file_mass_tar_list}`

  # Open the archive before deleting it; the open handle stays readable after the delete
  tar_ball = File.open(file_mass_tar_path, "rb")
  File.delete(file_mass_tar_path)
  File.delete(file_mass_tar_list)
  send_data tar_ball.read, filename: "file_mass.tar.gz"
end
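
As mentioned above, if you just want everything under a directory, you can skip the list file and hand tar the directory itself, which it will compress recursively. A minimal sketch with illustrative paths:

# Compress the entire public/uploads directory recursively
`cd #{File.join(Rails.root, "public")} && tar -czf #{File.join(Rails.root, "tmp", "uploads.tar.gz")} uploads`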