eBay’s Definition of Done
A while back, I was interviewed by TalentBuddy about how we use JavaScript at eBay. In that interview, I mentioned our software development checklist, which must be completed before any code is considered dev complete.
Pilots started using checklists in the early days of aviation, when it became clear that many common problems, and even deaths, could have been prevented if pilots had simply verified a few basic things before taking off: Do I have enough fuel? Do my instruments work? Through the use of these checklists, air travel became an incredibly safe endeavor. The same ideas apply to software development. Nobody likes it when their code fails in production, or when an A/B test has to be re-run because the right data wasn’t tracked. A checklist helps ensure these concerns are addressed before code is completed, which reduces the technical debt that normally accumulates and lets teams spend their time innovating rather than fixing.
At eBay, we call this checklist our Definition of Done. It was implemented in the last year as part of a broader effort to place greater importance on engineering and quality, with the intention of achieving the goals described above.
eBay’s Definition of Done
- The code is well-written and conforms to our coding standards; we do not feel any need to immediately refactor or rewrite it. Code is reviewed and checked in.
- Automated Testing is completed and passed. Test coverage is >70%.
- All P1/P2 bugs are closed and verified with automated tests.
- The code meets architecture requirements as defined for the domain or project.
- The code meets performance/scalability requirements.
- The code meets security standards.
- The code meets accessibility standards.
- The code supports internationalization to defined locales.
- The code meets published CommerceOS standards.
- User Experience has reviewed the feature and all P1/P2 issues are fixed and verified.
- The code is instrumented for logging and monitoring to identify faults and key decisions (both service level and algorithmic outcomes). Alerts have been implemented to detect and warn about meaningful faults.
- Usability testing is scheduled where necessary, and feedback incorporated into backlog.
- The code has been appropriately documented (minimum: a Service Dependency Map, and any production monitors and alerts relevant to the feature) and documentation is checked in.
- The code includes tracking support.
The majority of the items on this checklist are generally applicable to software development; however, some may not apply, depending on the size of the project or its specific requirements (this is true even within eBay). Some of the language and terminology might be unfamiliar, so I’ve added a bit of explanation to each of the items below:
The code is well-written and conforms to our coding standards; we do not feel any need to immediately refactor or rewrite it. Code is reviewed and checked in.
There’s a reason this one is first. As a developer, you should be proud of the code you wrote and how it was implemented. At eBay, as at most large companies, we have internally defined coding standards (which I will be sharing later in the year). Standards make development, and especially maintenance, significantly easier and help minimize the accumulation of technical debt. Requiring code review also ensures that any code written is approved by others. This is particularly important, as most software at eBay will be touched by several developers over its lifetime. Standards make code easy for other developers to read, understand, and maintain, and review ensures all code meets those standards.
Automated Testing is completed and passed. Test coverage is >70%.
Automated testing can be whatever makes sense for your organization. At eBay, we consider this a combination of unit testing, integration testing, and end-to-end testing. We track coverage metrics for all of these, with a company-wide baseline of 70%. There are a lot of tools and frameworks that help you measure coverage in any language, and I would encourage all developers to write unit tests at a minimum. It’s also important to note that coverage isn’t a perfect metric: it measures what code was executed during a test, not whether the tests covered the appropriate cases. However, within our Definition of Done, test code must also adhere to our coding standards, so bad or poorly written tests should be caught and addressed during code review.
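To make this concrete, here’s a minimal sketch of what a unit test might look like, using Node’s built-in test runner (Node 18+); formatPrice is a hypothetical function for illustration, not eBay code:

```js
const test = require('node:test');
const assert = require('node:assert');

// A hypothetical function under test.
function formatPrice(cents) {
  if (!Number.isInteger(cents) || cents < 0) {
    throw new RangeError('cents must be a non-negative integer');
  }
  return `$${(cents / 100).toFixed(2)}`;
}

// Cover the happy path and the error branch; coverage tools reward
// executing every branch, but the cases still need to be meaningful.
test('formats whole-dollar amounts', () => {
  assert.strictEqual(formatPrice(500), '$5.00');
});

test('rejects negative amounts', () => {
  assert.throws(() => formatPrice(-1), RangeError);
});
```

Run it with node --test; a coverage tool such as nyc or c8 can then report whether you’re clearing a baseline like 70%.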
All P1/P2 bugs are closed and verified with automated tests.
We categorize all bugs and features with a four-level priority system, with P1 being a critical bug and P4 a relatively low-impact one (whether by the number of users affected or by how key the functionality is to the product). We generally draw the line at P2, meaning any open P1 or P2 bugs block a release from going to production. When such a bug is found, automated tests must be written to cover the specific case that caused it, helping ensure the fix holds in any future regression runs.
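As an illustration, naming the regression test after the bug report keeps the link between the two obvious; the bug, ID, and function here are all hypothetical:

```js
const test = require('node:test');
const assert = require('node:assert');

// Hypothetical P2 fix: quantities with surrounding whitespace were
// being rejected at checkout.
function parseQuantity(input) {
  const n = Number.parseInt(String(input).trim(), 10);
  return Number.isNaN(n) ? null : n;
}

// The bug ID in the test name ties the regression test back to the report.
test('BUG-1234: accepts quantities with surrounding whitespace', () => {
  assert.strictEqual(parseQuantity(' 3 '), 3);
});
```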
The code meets architecture requirements as defined for the domain or project.
Your team may not have architecture requirements written down or specifically stated, but you should still ensure that the design of a component or system is generally agreed upon (ideally before development starts, but at the very least upon completion).
The code meets performance/scalability requirements.
We have a host of requirements at eBay around how performant our applications need to be, and since any production application will easily handle hundreds of millions of requests daily, it’s important that all code scales appropriately. Often this is influenced by the architecture requirements, but it must be followed through in implementation. Furthermore, developers should be asking themselves questions like these at every stage: Is parallelization of tasks used where possible? Are you lazy-loading images on the front end? Did you request enough database capacity to handle the projected number of requests? Your specific application may not have the same requirements as what we build at eBay, but it does have requirements. If you aren’t defining these before development, you should start.
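As one small example of the parallelization question, here’s a sketch using hypothetical service calls; awaiting them one after another would pay both latencies back to back, while Promise.all overlaps them:

```js
// Hypothetical stand-ins for two independent service calls.
const delay = (ms, value) =>
  new Promise((resolve) => setTimeout(() => resolve(value), ms));
const fetchItemDetails = (id) => delay(100, { id, title: 'Vintage camera' });
const fetchShippingQuote = (id) => delay(120, { id, costCents: 499 });

async function loadItemPage(itemId) {
  // The calls don't depend on each other, so run them concurrently:
  // total time is roughly the slower call (~120ms), not the sum (~220ms).
  const [details, shipping] = await Promise.all([
    fetchItemDetails(itemId),
    fetchShippingQuote(itemId),
  ]);
  return { details, shipping };
}

loadItemPage(42).then(console.log);
```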
The code meets security standards.
We have internally published security standards to ensure our applications don’t leak user information, aren’t vulnerable to well-known attacks, and address similar common security concerns. There’s no guarantee of 100% secure software, but you should be considering security as you develop.
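For example, one of those well-known attacks is cross-site scripting, and a standard defense is to encode untrusted output before it reaches markup; a minimal sketch:

```js
// Encode the five HTML-significant characters so user input can't
// break out of its context and execute as script.
function escapeHtml(untrusted) {
  return String(untrusted)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

const query = '<script>alert(1)</script>'; // hypothetical user input
console.log(`You searched for: ${escapeHtml(query)}`);
```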
The code meets accessibility standards.
eBay users are incredibly diverse, and it’s important that we ensure everyone can easily shop on our site. While accessibility isn’t applicable to all development (for example, backend batch jobs that just aggregate data), it is important enough that we want all developers to think about it. The WCAG guidelines do a great job of defining accessibility standards for web applications to conform to, and we use them as a baseline for all our software.
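To give one concrete example of what that baseline implies for JavaScript: when script changes UI state, the matching ARIA state should change with it. The element IDs below are hypothetical:

```js
// A disclosure widget: keep aria-expanded in sync so assistive
// technology announces whether the panel is open.
const button = document.querySelector('#filters-toggle');
const panel = document.querySelector('#filters-panel');

button.addEventListener('click', () => {
  const expanded = button.getAttribute('aria-expanded') === 'true';
  button.setAttribute('aria-expanded', String(!expanded));
  panel.hidden = expanded; // hide when collapsing, show when expanding
});
```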
The code supports internationalization to defined locales.
eBay is a global product, with users in every country. While we don’t offer a fully internationalized experience, we do translate the site for over 30 countries and offer local currency display for many more. We ensure that, at a minimum, static text, currency display, and date/time/number formatting are localized where required.
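In JavaScript, the built-in Intl APIs handle the formatting side of that minimum (translating static text still needs its own pipeline); the locale here is a hypothetical user setting:

```js
const locale = 'de-DE'; // hypothetically read from the user's settings

// Currency display
new Intl.NumberFormat(locale, { style: 'currency', currency: 'EUR' })
  .format(1234.5); // -> "1.234,50 €"

// Date formatting
new Intl.DateTimeFormat(locale, { dateStyle: 'long' })
  .format(new Date(2014, 5, 1)); // -> "1. Juni 2014"

// Number formatting
new Intl.NumberFormat(locale).format(1234567.89); // -> "1.234.567,89"
```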
The code meets published CommerceOS standards.
CommerceOS is an internal standard that essentially defines how we build applications and services at eBay, covering everything from authorization to internationalization to REST service naming to common object definitions. This one isn’t specifically applicable outside of eBay, but developers should check that their applications follow broader design standards, such as returning meaningful status codes from a REST service.
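Here’s a quick sketch of the status-code point, assuming Express; the route and in-memory data are hypothetical:

```js
const express = require('express');
const app = express();

const items = new Map([[1, { id: 1, title: 'Vintage camera' }]]);

app.get('/items/:id', (req, res) => {
  const id = Number.parseInt(req.params.id, 10);
  if (Number.isNaN(id)) {
    return res.status(400).json({ error: 'id must be an integer' }); // malformed request
  }
  const item = items.get(id);
  if (!item) {
    return res.status(404).json({ error: 'item not found' }); // missing resource
  }
  res.json(item); // 200 with the representation
});

app.listen(3000);
```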
User Experience has reviewed the feature and all P1/P2 issues are fixed and verified.
This relates to the earlier P1/P2 bug item, but focuses on the user experience instead. You could build a feature that technically has no bugs, yet the user experience it creates isn’t desirable. For example, it occasionally happens that a mockup that looked right in Photoshop just doesn’t feel the same once built for the web, and it requires further refinement.
The code is instrumented for logging and monitoring to identify faults and key decisions (both service level and algorithmic outcomes). Alerts have been implemented to detect and warn about meaningful faults.
This is huge, and I mentioned it in the interview with TalentBuddy. We’ve spent many years developing an insanely detailed monitoring and logging system at eBay, and there’s a significant amount of automation around error detection and fault tolerance that all developers at eBay should be leveraging. What counts as a fault worth logging or as a key decision depends on the application, but making this information available makes live debugging and root-cause analysis after a failure much easier.
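In miniature, such instrumentation might look like the sketch below; the ranking function, event names, and log shape are all hypothetical stand-ins for a far more elaborate real system:

```js
// Emit JSON lines, which most log-aggregation and alerting systems
// can consume and key alerts off of.
function log(level, event, fields = {}) {
  console.log(
    JSON.stringify({ level, event, time: new Date().toISOString(), ...fields })
  );
}

// Hypothetical stand-in for a ranking algorithm that can fail.
const personalizedRank = async (listings) => listings.slice().reverse();

async function rankListings(listings) {
  try {
    const ranked = await personalizedRank(listings);
    // A key algorithmic decision: record which path was taken.
    log('info', 'ranking.decision', { algorithm: 'personalized', count: ranked.length });
    return ranked;
  } catch (err) {
    // A meaningful fault: an alert can trigger on this event's rate.
    log('error', 'ranking.fallback', { reason: err.message });
    return listings; // degrade gracefully to the unranked list
  }
}
```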
Usability testing is scheduled where necessary, and feedback incorporated into backlog.
Usability testing is important for getting qualitative feedback on your product, and it should be used in addition to the typically quantitative feedback that A/B testing provides. A/B testing will tell you how many users performed an action, but it can be very hard to get at why they performed it. Usability testing can uncover that, and it should be considered a priority for any application that has users.
The code has been appropriately documented (minimum: a Service Dependency Map, and any production monitors and alerts relevant to the feature) and documentation is checked in.
Going back to the first item on the checklist - software tends to live much longer than we expect it to when first writing it. A key part of ensuring future maintainers can smartly make changes when needed is having good documentation and knowing where to look if something goes wrong. Applications get particularly complex at eBay - the homepage itself depends on over 18 services, which in turn might depend on 18 more, and so on. Without good documentation, it can be incredibly difficult to track down which service might be responsible for an error happening in your application.
The code includes tracking support.
At eBay, we define tracking as any information sent about how a product is being used. This is different from logging and monitoring, which are about the health of your application; tracking is about the “what” of your application. For example, we track impressions and clicks so they can later be compared, answering questions like how many users clicked through to view an item after we launched our new relevancy algorithm in the feed. Ensuring the appropriate information is tracked to answer these kinds of questions saves work later on (and helps you iterate faster).
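On the browser side, a tracking call can be as small as the sketch below; the endpoint, event names, and data attributes are hypothetical:

```js
// navigator.sendBeacon queues the payload so tracking doesn't delay
// the user navigating away from the page.
function track(event, fields = {}) {
  const payload = JSON.stringify({ event, time: Date.now(), ...fields });
  navigator.sendBeacon('/track', payload); // hypothetical endpoint
}

// Record an impression for each item on the page, and a click when
// the user follows one.
document.querySelectorAll('[data-item-id]').forEach((el) => {
  track('impression', { itemId: el.dataset.itemId });
  el.addEventListener('click', () => track('click', { itemId: el.dataset.itemId }));
});
```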
Does working with us sound interesting? Check out our open positions at http://www.ebaynyc.com/jobs.