Planning and Supporting Growth: Scalability at Workday, Part 2

This is the second part of our two-part interview with Workday technologists Jon Ruggiero and Jim Stratton on how Workday was built to scale from the start. In part one, we talked about the impact of system scalability and why it’s a critical element of the Workday architecture. In this post, Ruggiero and Stratton discuss the role of scalability in meeting customer needs and how Workday keeps up with different types of customer demands.

How does Workday meet customer needs through scalability?

Ruggiero: As an engineering organization, we have always set goals for ourselves and then created projects to deliver on them. These goals are tied to Workday business and customer objectives, such as delivering a system that could scale from hundreds to hundreds of thousands of employees. Today, some large enterprise customers are managing more than 1 million active and former employees (such as retirees) in Workday.

Can you point to specific architectural changes that have helped Workday provide its customers with better scalability?

Stratton: We are always investing in our architecture and applications to meet the scalability goals we set for ourselves. In the early days, when our customers consisted of relatively small organizations, we were able to scale up mainly through more powerful hardware. When we realized that approach wasn’t going to get us very far, we invested heavily in architectural improvements that allowed us to scale horizontally by adding more computing resources.

One of the first changes we made was to add the ability to scale computing capacity for our read workload. We generally have to be prepared for a factor of 10x for our write workload. For example, as we looked to scale our payroll processing, we invested in technology that enabled us to distribute the workload across a compute grid to efficiently calculate pay in parallel.
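As a rough illustration of the idea of calculating pay in parallel, here is a minimal Python sketch. It is not Workday's implementation: the `Employee` fields, the hours-times-rate pay rule, the batch size, and the use of a local thread pool as a stand-in for grid nodes are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Employee:
    # Hypothetical fields; a real payroll record carries far more.
    employee_id: int
    hourly_rate: float
    hours_worked: float


def calculate_pay(employee):
    """Hypothetical pay rule: hours times rate, rounded to cents."""
    gross = round(employee.hourly_rate * employee.hours_worked, 2)
    return employee.employee_id, gross


def chunk(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def calculate_batch(batch):
    """Work done by one worker: compute pay for every employee in a batch."""
    return [calculate_pay(employee) for employee in batch]


def run_payroll(employees, workers=4, batch_size=2):
    """Fan batches out to a pool of workers and merge the results.

    The thread pool here only stands in for a set of grid nodes.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for batch_result in pool.map(calculate_batch, chunk(employees, batch_size)):
            results.update(batch_result)
    return results
```

Because each employee's pay is computed independently in this sketch, adding workers shortens the run without changing the result, which is the property that makes this kind of job a natural fit for horizontal scaling.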

The grid is a compute resource shared by all of our customers, so it gives Workday a very efficient way to provide enormous computing power. As the other areas of our application evolved and grew, customers were able to benefit from this scalable platform there as well, so many of our largest jobs, such as those found in Workday Financial Management, now run on this parallelized compute grid.

Today we are working on technology that will allow us to scale specific application areas independently of each other. We will be able to allocate resources to meet the needs of specific customers at the moment they need them. This will allow us to scale for computing demands that are specific to business cycles such as open enrollment periods, time entry at the start and end of business days, financial close periods, and other time-sensitive needs.

Does scalability mean the same thing across all Workday applications?

Ruggiero: Every application area has its own unique characteristics. As an engineering organization, it is critical that we closely align with business goals. Given that we have decided to separate our application development from our platform development, we need to ensure that our platform can meet the needs of the applications that are built on top of it.

The biggest factor when thinking about scaling Workday Human Capital Management (HCM) is the number of employees a customer has. Employee count drives many of the core processes around payroll, open enrollment, time entry, and merit cycles. We’ve become pretty good at estimating how much compute capacity a Workday HCM customer will need if we know how many employees it has.

For Workday Financial Management, employee count is not a good indication of scalability requirements. A company with a relatively small employee population can generate a significant volume of financial transactions. We focus our scalability efforts in financial management around journal entries and journal lines entered per year to understand how much compute capacity we will need to allocate for a customer.
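To make the contrast between the two sizing drivers concrete, here is a small Python sketch. The constants and the notion of a "capacity unit" are hypothetical placeholders chosen for illustration; they are not Workday's actual sizing model or figures.

```python
import math

# Hypothetical sizing constants, chosen only for illustration.
EMPLOYEES_PER_UNIT = 10_000
JOURNAL_LINES_PER_UNIT = 5_000_000


def hcm_units(employee_count):
    """Estimate HCM capacity units from employee count (at least one unit)."""
    return max(1, math.ceil(employee_count / EMPLOYEES_PER_UNIT))


def financials_units(journal_lines_per_year):
    """Estimate financials capacity units from yearly journal line volume."""
    return max(1, math.ceil(journal_lines_per_year / JOURNAL_LINES_PER_UNIT))


# A company with few employees but heavy transaction volume.
small_but_busy = {"employees": 500, "journal_lines_per_year": 40_000_000}
hcm_need = hcm_units(small_but_busy["employees"])
financials_need = financials_units(small_but_busy["journal_lines_per_year"])
```

Under these made-up constants, the 500-employee company needs one unit for HCM but eight for financials, which is why employee count alone would understate its requirements.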