Zhimin Zhan

Summary

This article details the author's perspective on the exemplary practices of Continuous Integration (CI) at Facebook, emphasizing the three key goals of high-signal, rapid, and frequent feedback in the development process.

Abstract

The article discusses the author's journey with Continuous Integration (CI) since 2005, highlighting the challenges faced and the solutions implemented. The author recommends a presentation by Katie Coons from Facebook, which states that the primary goal of CI at Facebook is developer efficiency: computers should wait on humans, not the other way around. The presentation emphasizes three key goals for CI: high-signal (reducing false alarms), rapid (providing quick feedback), and frequent (ensuring ease of triggering builds). The author concurs with these goals based on personal experience and criticizes other CI presentations for missing these essential points. Facebook's CI practices, including the use of WebDriver for end-to-end tests and their custom CI server Sandcastle, are presented as superior to traditional CI servers and methods. The article also touches on the importance of developers writing their own tests, the handling of flaky tests, and the impressive build farm infrastructure at Facebook, which handles a high volume of test results daily.

Opinions

  • The author believes that the presentation by Katie Coons is the best on CI/CD and DevOps, offering practical solutions and addressing real challenges.
  • The article suggests that most CI presentations fail to address the actual challenges of CI and often do not provide practical solutions.
  • The author emphasizes the importance of having a CI build process that is trustworthy, with rapid feedback and frequent execution, to facilitate quicker bug fixes and developer efficiency.
  • The author advocates for developers to be responsible for writing tests for their code, which is a practice at Facebook.
  • The use of Web

Recommend a Great CI/DevOps Presentation: “Continuous Integration at Facebook”

See what a real Continuous Integration is like.

Image credit: from the slides in the video presentation

I started practising Continuous Integration in late 2005, using CruiseControl (CC for short), the so-called “Grand-daddy” of CI servers. The most challenging part was running automated UI tests; it took me a couple of years to master it.

I had to hack into the CruiseControl code (Java) and implement some features myself. If you are interested, read ‘My Continuous Testing Journey’.

Since then, I have attended a number of CI presentations at conferences/meetups. Frankly, none was good (check out CI/CD Clarified for the reasons). They did not address the real challenges of CI at all, let alone offer practical solutions.

On 20/06/2015, I found an excellent presentation (YouTube) on how Facebook does CI, from F8 (the Facebook Developer Conference), by Katie Coons, a software engineer in Product Stability (a good division name) at Facebook. It is quite short, only 15 minutes, but holds a wealth of wisdom. I have watched it many times, and it is still valid. It remains, in my opinion, the best presentation on CI/CD and DevOps. I was very surprised by the low number of views and likes on YouTube.

Katie started the presentation with:

“№1 goal of CI at Facebook is developer efficiency”

“We want computers waiting on humans.”

Then the three key goals of CI at Facebook:

  • High-Signal

“We don’t want our developers to chase down the failures that are not their fault, that’s a waste of developers’ time.”

  • Rapid

“The system must provide rapid feedback to developers because our developers will not wait for it. Moving fast is a really important part of Facebook’s culture.”

  • Frequent

“The system must provide frequent feedback to developers. Because the sooner we let them know about a problem, the easier and faster it is for them to fix it”

This matches my experience. Ironically, the other CI presentations I have seen missed all of these key goals, with the possible exception of Rapid, and even then they were often vague. I like that Katie gave a specific target time for a CI build: 10 minutes.

“10-Minute Build” is an Extreme Programming practice.

Here, I share my understanding of these 3 CI goals:

1. High-Signal

Reduce false alarms, i.e., test failures that are flagged but are not real failures. If the team does not trust the results of automated test executions, everything goes out of the window, right? Yet we frequently see projects executing automated tests every night in Jenkins with a mere ~40% pass rate. Why bother?

2. Rapid

The feedback from a CI build must be very quick. As we all know, the classic software engineering rule is: the earlier a bug is found, the cheaper and quicker it is to fix. However, this is easier said than done. Few companies have a proper parallel testing lab set up to make executing automated functional tests faster.
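To show why a parallel testing lab matters for the Rapid goal, here is a minimal Python sketch. It is only a back-of-the-envelope estimate under my own assumptions, not how Facebook or any particular CI server schedules tests: the function name `estimate_build_minutes`, the test durations and the agent counts are all hypothetical. It spreads tests over build agents with a simple greedy rule (longest test first, always to the least-loaded agent) and reports the resulting wall-clock time.

```python
import heapq


def estimate_build_minutes(test_minutes, agents):
    """Estimate wall-clock minutes when tests are spread over build agents.

    Greedy assignment: take the longest remaining test and give it to the
    agent with the smallest total so far. All inputs are hypothetical.
    """
    if agents < 1:
        raise ValueError("need at least one build agent")
    loads = [0.0] * agents           # min-heap of per-agent total minutes
    heapq.heapify(loads)
    for minutes in sorted(test_minutes, reverse=True):
        lightest = heapq.heappop(loads)
        heapq.heappush(loads, lightest + minutes)
    return max(loads)                # the build ends when the busiest agent ends


# 60 hypothetical UI tests taking 2 minutes each
suite = [2.0] * 60
print(estimate_build_minutes(suite, 1))    # 120.0 (sequential)
print(estimate_build_minutes(suite, 12))   # 10.0  (12 agents in parallel)
```

With these made-up numbers, a two-hour sequential run only reaches the 10-minute target once the tests are spread over a dozen agents.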

Once, a manager (of another division) wanted to review their CI implementation (over the phone). She said: “Our DevOps guys spent a lot of effort, the build runs for 10 hours …”. I replied: “It has already failed. Your DevOps guys had no idea how to reduce the build time”. The result was exactly as I predicted: the whole thing was discarded shortly after.

3. Frequent

It must be easy to trigger a CI build run. There is no point in a total execution time of 10 minutes if each build's preparation takes 2 hours.

On developer’s involvement in automated testing and CI:

“It is really important at Facebook I will never, as a developer, write code and toss over the fence and for someone else to write test for it. It is my job as a developer at Facebook to write tests for my code, and my reviewers make sure I do”.

💡Show the above to your programmers who don’t want to write automated tests. The fact: most programmers don’t have the capability to develop automated unit tests, let alone Automated End-to-End UI Testing which is much more challenging.

On WebDriver:

“For all of our end-to-end tests at Facebook we use WebDriver. WebDriver is an open-source JSON wire protocol, I encourage you all to check it out if you haven’t already.”

For more on Testing Pyramid, check out “Testing Pyramid Clarified”.

“One of the great advantages of WebDriver is that it gets applications across all platforms.”
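To make “WebDriver is a wire protocol” concrete, here is a small Python sketch of the HTTP commands a WebDriver client sends for a trivial “open a page and click a link” test. It only builds the requests and sends nothing. The endpoints shown (`/session`, `/session/{id}/url`, `/session/{id}/element`, `/session/{id}/element/{id}/click`) exist in both the original JSON Wire Protocol and the W3C WebDriver specification that succeeded it; the new-session body uses the W3C shape. The function name, the IDs, the URL and the selector are placeholders of mine; in a real run the server returns the session and element IDs.

```python
import json


def webdriver_commands(session_id, element_id, url, css):
    """Return (method, path, body) tuples for a minimal WebDriver exchange.

    session_id and element_id are placeholders; a real server issues them.
    """
    base = f"/session/{session_id}"
    return [
        # Start a browser session (W3C-style capabilities body)
        ("POST", "/session",
         {"capabilities": {"alwaysMatch": {"browserName": "chrome"}}}),
        # Navigate to a page
        ("POST", f"{base}/url", {"url": url}),
        # Locate an element with a CSS selector
        ("POST", f"{base}/element", {"using": "css selector", "value": css}),
        # Click the element that was found
        ("POST", f"{base}/element/{element_id}/click", {}),
        # End the session
        ("DELETE", base, None),
    ]


for method, path, body in webdriver_commands(
        "SESSION_ID", "ELEMENT_ID", "https://example.com", "a"):
    print(method, path, "" if body is None else json.dumps(body))
```

Because every browser driver speaks the same commands over HTTP, the same test script can drive different browsers and platforms, which is the advantage the quote refers to.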

💡Show the above section to your team members who want to create ‘own test framework’ or use other commercial/proprietary ones. If you are interested in why Selenium WebDriver is the best choice for test automation, please read my other articles:

On Continuous Testing Server:

Facebook uses its own server, Sandcastle; of course, not Jenkins/Bamboo/TeamCity, which are traditional CI servers. Have you ever seen Level 2 of AgileWay CT Grading (50+ E2E tests) implemented in Jenkins? I mean reliable test executions with quick feedback. I haven’t.

> 1000 test results per second every single day

On addressing flaky automated tests:

“Failures inevitable at Scale”

“Infrastructure failures get retried”

“You would be surprised how many tests cannot run in parallel with themselves”

Please note the test execution retries here. From the context, the retries were handled by the Sandcastle CI/CT server, NOT by the test scripts or the test framework, as in Cypress (bad). If you are interested in this topic, please read my other article: “Why Auto-Retry of Test Execution in a Test Framework is Wrong?”

On testing infrastructure:

“At Facebook, we have some of our top engineers working on development infrastructure”

And this impressive Facebook build farm.

(the above is pretty much what I got at the moment)

Wow!

That’s why I gave Level 6 (highest, a handful in the world) to Facebook on AgileWay CT Grading.

