Sometimes you need automated tests on production

In this article I make the case that sometimes you just need to run automated tests against real production: the real systems, with real data, for real users.

The case

We have a feature on one of our platforms:

  1. User clicks on “Export” for a “record”
  2. A job is scheduled. It generates a CSV file with information about the record and uploads it to S3. Then a presigned URL valid for 72 hours is generated, and an email is sent to the user with a link to download the file.

The question is: how do you test this?

Confidence

When it comes to specs, I like to write automated specs that give me confidence that I am delivering quality software. I am not particularly religious about what kind of spec it is, as long as it gives me confidence and does not stand in my way by being too fragile.

Sometimes these specs are model/unit specs, many times they are system/feature/integration specs, but there are cases where you just need to run a test on production against the production db, production S3, production env, production user, production everything.

Going for a System/Integration spec

A spec that would give me confidence here simulates the user behavior with Rails system specs.
The user clicks on Export, and I check that an email was delivered and that this email contains a link:

  scenario "create an export, uploads it on s3 and send an email" do
    # Set up the record
    user = FactoryBot.create(:user)
    record = FactoryBot.create(:record)
    ... 

    # Start the spec
    login_as user
    visit "/records"
    click_on "Export"
    expect(page).to have_text "Export successfully scheduled. You will receive an email with a link soon."

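    # Use the decoded HTML body; calling to_s on the part returns the encoded part, headers included
    export_email = ActionMailer::Base.deliveries.select { |email| email.subject == "Successful export" }.last
    mail_html_content = export_email.html_part.body.decoded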
    expect(mail_html_content).to have_xpath "//a[text()='#{export_name}']"
    link_to_exported_zip = Nokogiri::HTML(mail_html_content).xpath("//a[text()='#{export_name}']").attribute("href").value

    csv_content = read_csv_in_zip_given_my_link(link_to_exported_zip)
    expect(csv_content).not_to be_nil
    expect(csv_content).to include user.username
  end

This spec does not work!

First problem – AWS was stubbed

We have a lot of other specs that use the S3 API, so S3 is stubbed in the test suite. This is good practice, as you don’t want all your specs to touch S3 for real: it is slow and it couples the suite to an external service. But for this spec it hid a problem. A file was uploaded to S3, but the file was empty. The reason was that one of the machines running the specs had no ‘zip’ command installed, and we use ‘zip’ to create a zip of the CSV files. With S3 stubbed, the spec never looked at what was uploaded and could not notice.

Because of this I wanted to upload an actual file and check what is actually in it.
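A cheap additional guard against the missing ‘zip’ binary is to check for it explicitly before shelling out, so the job fails loudly instead of producing an empty archive. What follows is only a sketch under assumptions: `command_available?` and `ensure_zip_installed!` are hypothetical names, and where such a check would be called in the export job is not something the original code shows.

```ruby
# Returns true when an executable file called `name` exists in one of the
# directories listed in `path` (defaults to the PATH environment variable).
def command_available?(name, path: ENV.fetch("PATH", ""))
  path.split(File::PATH_SEPARATOR).any? do |dir|
    candidate = File.join(dir, name)
    File.file?(candidate) && File.executable?(candidate)
  end
end

# Hypothetical guard: raise early instead of silently creating an empty zip.
def ensure_zip_installed!(path: ENV.fetch("PATH", ""))
  return if command_available?("zip", path: path)

  raise "The 'zip' command is not installed, so the export archive would be empty"
end
```

A guard like this only catches the missing binary. It does not tell you whether the uploaded file has the right content, which is why the spec still needs to look at the real file.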

I created a spec filter that would start a specific spec with real S3.

# spec/rails_helper.rb
RSpec.configure do |config|
  config.before(:each) do
    # Stub S3 for all specs
    Aws.config[:s3] = {
      stub_responses: true
    }
  end

  config.before(:each, s3_stub_responses: false) do
    # Specs tagged with "s3_stub_responses: false" do not stub S3 and call the real S3.
    Aws.config[:s3] = {
      stub_responses: false
    }
  end
end

This allows us to run the spec against the real S3:

  scenario "create an export, uploads it on s3 and send an email", s3_stub_responses: false do
    # Now S3 is not stubbed in this spec and the file is really uploaded
  end

Yes, we could run a local S3 server instead, but then the second problem comes up.

Second problem – mailer was adding invalid params

In the email we send a presigned URL to the S3 file, as the file is not public.
But the mailer we were using was appending “utm_campaign=…” to the URL params.
The query string is part of what the S3 signature covers, so the presigned URL in the email was no longer valid. Checking that there is a URL in the email was simply not enough. We had to actually download the file from S3 to make sure the link is correct.
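To make this kind of tampering visible, one option is to compare the link found in the email with the URL the job originally generated. This is a minimal sketch under assumptions: `added_query_params` is a hypothetical helper, and the URLs and the `utm_campaign` value are made-up placeholders rather than real presigned URLs.

```ruby
require "uri"

# Hypothetical helper: returns the [name, value] query pairs that are present
# in `emailed_url` but absent from `original_url`.
def added_query_params(original_url, emailed_url)
  original = URI.decode_www_form(URI.parse(original_url).query.to_s)
  emailed  = URI.decode_www_form(URI.parse(emailed_url).query.to_s)
  emailed - original
end

# Placeholder URLs for illustration only; a real presigned URL carries more parameters.
original = "https://example.com/exports/record.zip?X-Amz-Expires=259200&X-Amz-Signature=abc123"
emailed  = "#{original}&utm_campaign=export"

added_query_params(original, emailed) # => [["utm_campaign", "export"]]
```

An expectation that this list is empty would flag a link changed by the mailer, but it assumes the spec can get hold of the original URL. Downloading the file through the emailed link remains the stronger check.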

This was still not enough.

It is still not working on production

All the tests were passing with real S3 and a real mailer in the test and development environments, but on production the feature was not working.

The problem was with the configuration. To upload to S3 we need to know the bucket. The bucket was configured for test and development but was missing for production:

config/environments/development.rb:  config.aws_bucket = 'the-bucket'
config/environments/test.rb:  config.aws_bucket = 'the-bucket'
config/environments/production.rb: # there was no config.aws_bucket

The only way I could make sure that the configuration in production is correct and that the bucket is set up correctly is to run the spec on real production.
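A complementary guard is to fail fast at boot when a required setting is absent. It does not replace the production spec, because it cannot tell whether the bucket exists or is accessible, but it would have caught the missing `config.aws_bucket`. The sketch below is an assumption-laden illustration: `missing_settings` is a hypothetical helper, and the `Struct` stands in for the real application configuration object.

```ruby
# Hypothetical helper: returns the names of the required settings that are
# not defined on `config`, or are defined but nil or blank.
def missing_settings(config, required)
  required.select do |name|
    value = config.respond_to?(name) ? config.public_send(name) : nil
    value.nil? || value.to_s.strip.empty?
  end
end

# Stand-in for the production configuration, where aws_bucket was never set.
FakeConfig = Struct.new(:aws_bucket)
production_config = FakeConfig.new(nil)

missing = missing_settings(production_config, [:aws_bucket])
warn "Missing configuration: #{missing.join(', ')}" unless missing.empty?
```

In a real application the same check could raise from an initializer instead of warning, so that a deploy with a missing bucket setting fails immediately.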

Should we run all specs on a real production?

Of course not. But there should be a few specs, for a few features, that check that the buckets have the right permissions and are accessible, and that the configuration in production is right. This is what I’ve added. Once a day a spec runs on production and tests that everything works with real S3, real db, real env and configuration, the same way that users will use the feature.
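The core of such a production check is a round trip: upload known content, follow the link a user would receive, and compare what comes back. The sketch below shows only that shape and is not the spec described above. `check_export_round_trip`, `storage` and `downloader` are hypothetical names; in production they would wrap the real S3 client and a real HTTP download.

```ruby
# Result of one round trip: `ok` is true or false, `error` explains a failure.
RoundTripResult = Struct.new(:ok, :error)

# Uploads `content` under `key`, asks the storage for a download link, fetches
# the link through `downloader`, and compares the result with what was uploaded.
# `storage` must respond to upload(key, content) and presigned_url(key);
# `downloader` must respond to call(link) and return the downloaded body.
def check_export_round_trip(storage:, downloader:, key:, content:)
  storage.upload(key, content)
  link = storage.presigned_url(key)
  downloaded = downloader.call(link)
  return RoundTripResult.new(true, nil) if downloaded == content

  RoundTripResult.new(false, "downloaded content differs from uploaded content")
rescue StandardError => e
  # A missing bucket, bad permissions or an invalid link all end up here.
  RoundTripResult.new(false, "#{e.class}: #{e.message}")
end
```

Returning a result object instead of raising makes it easy for a scheduled run to report why the check failed.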

How is this part of the CI/CD?

It is not. We do not run this spec before deploy. We run all the other specs before deploy, and they give us 99% confidence that everything works. But for the remaining one percent we run a spec once every day (or after a deploy) just to check a real, complex scenario involving the communication between different systems.

It pays off.