Evidence-based scheduling: The Monte Carlo method

The problem with predicting the future is that it is genuinely impossible, and nowhere more so than in product development and delivery.

When you throw in a highly volatile working environment or a complex technology stack you have limited exposure to, it should be no surprise to hear your team say, “It’ll be done when it’s done…”. However, users have real-world problems we are trying to fix, and they understandably want to know when they will benefit from the amazing work you are doing. So, when our stakeholders ask the most fundamental question, “When will it be done?”, we need to be able to answer with some degree of reliability.

There are countless techniques to assist us in these future predictions, and I want to introduce you to the one that has helped me and my teams the most over the years: Evidence-based Scheduling through Monte Carlo Simulation. The beauty of this method is that it brings together real-world data and probability simulations to enable truly data-driven decision making.

There is a reason “The House always wins” in Vegas: every game is rigged to give the casino an edge of at least a few percentage points. However, this is only a viable advantage when combined with a large throughput of bets. In a one-off bet on red or black in roulette, the house has a substantial chance of losing (~48%). Multiply that bet by thousands over hours, days and weeks, though, and probability ensures that a ~2% advantage turns over massive profits for the casino.
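You can check the casino’s arithmetic yourself. Here is a minimal Python sketch (the function name and trial counts are my own illustration, not part of the Excel tool described later) simulating a player betting one unit on red at European roulette, where the player wins 18 of the 37 pockets:

```python
import random

def house_profit_per_player(bets, trials=5_000):
    """Average house profit (in betting units) from a player who places
    `bets` one-unit bets on red at European roulette.
    The player wins 18 of the 37 pockets (~48.6% of the time)."""
    total = 0
    for _ in range(trials):
        for _ in range(bets):
            if random.random() < 18 / 37:
                total -= 1  # player wins, house pays out one unit
            else:
                total += 1  # house keeps the stake
    return total / trials

# A single bet: the house loses almost half the time, its edge barely visible.
print(house_profit_per_player(bets=1))     # roughly 0.03 units
# A thousand bets: the same ~2.7% edge compounds into a reliable profit.
print(house_profit_per_player(bets=1000))  # roughly 27 units
```

A tiny edge plus high volume is the whole trick, and it is exactly the trick the forecasting method below borrows.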

Why am I telling you all this? Because you can supercharge your project delivery forecasts with the same advantage. It won’t be perfect every time, but it will be a lot closer than pure gut feel or guesswork, and it doesn’t require your development team to be solid estimators every time.

The demonstration below outlines a simple Monte Carlo probability forecasting tool written in Excel. It requires several input data points that you collect from your team and provides several outputs indicating the aggregated probability of delivering a project/product increment in a certain timeframe (e.g. weeks from today).


  • Data set of completed backlog items (example data set 23 weeks)
  • Broken down product increment you wish to schedule (example 63 work items)
  • Link to Excel file HERE

In this example, we take 3 months of data on the throughput of a development team. Specifically, the number of issues/tasks completed (to the ‘Definition of Done’ standards) in any given week. 

Please note that no sizing is incorporated in these issues/tasks; this is deliberate, as the method assumes humans are particularly bad at estimating the size and complexity of work. It also deliberately includes all the downtime, delays and wait time a piece of functionality goes through on its journey to completion. Just because your developer was not coding for the whole 10 hours the ticket took to complete does not mean that the conversation over making a coffee or a quick scroll of Reddit at lunch isn’t a normal part of the team’s day. This tenet of measuring total elapsed time is important for Monte Carlo forecasting because it captures the actual time it takes to see results delivered, rather than an estimate of what you predict.

So, you have your data set. Now you need to break down the project deliverable or epic into appropriately small pieces. Generally, the smaller you break down the work items, the better understood they are by the team. Another tip at this stage is to set a target size for work items. In my current team, we strive to keep every work item to a day or so from “to do” to “done”. Of course, we are not always successful, and that’s okay.

For simplicity, consider tossing a coin three times.

Aggregated probability is used to analyse and predict the joint occurrence of events or outcomes. It allows you to understand the likelihood of specific combinations of events happening. For example, getting either 3×H or 3×T (one occurrence each among the eight possible sequences) is less likely than getting a combination of H and T (six occurrences).
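You can verify those counts with a few lines of Python (illustrative only; the blog’s tool itself is built in Excel) by enumerating every sequence of three tosses:

```python
from itertools import product

# All possible sequences of three coin tosses: 2^3 = 8 equally likely outcomes.
outcomes = list(product("HT", repeat=3))

all_same = [o for o in outcomes if len(set(o)) == 1]  # HHH or TTT
mixed    = [o for o in outcomes if len(set(o)) == 2]  # any mix of H and T

print(len(outcomes))               # 8
print(len(all_same), len(mixed))   # 2 6
print(len(mixed) / len(outcomes))  # 0.75, so a mixed result is far more likely
```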

In our example, these combinations are specific amounts of work item throughput in a given time period.

Now you have your data set of past performance and your broken-down piece of functionality you wish to schedule, it’s time to simulate the future!  

The Monte Carlo simulation takes your total number of work items (e.g. 63), picks one random week from your data set, imagines it happening again, and repeats this process until the running total meets or exceeds your target (e.g. Week 1: 15, Week 2: 18, Week 3: 7, Week 4: 7, Week 5: 1, Week 6: 18, which together sum to 66 work items delivered in 6 weeks).

The simulation then repeats a few thousand times to generate a few thousand alternative realities based on your team’s actual data.
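Outside of Excel, the whole simulation fits in a few lines. This Python sketch uses a made-up 23-week throughput list (the real numbers live in the linked spreadsheet) and resamples past weeks until a hypothetical 63-item backlog is cleared:

```python
import random

# Hypothetical weekly throughput: completed work items per week for 23 weeks,
# standing in for the real data set collected from your team.
weekly_throughput = [15, 18, 7, 7, 1, 18, 12, 9, 14, 6, 11, 16,
                     8, 13, 10, 5, 17, 9, 12, 14, 7, 11, 10]

def forecast_weeks(throughput, backlog, runs=10_000):
    """Monte Carlo forecast: replay random past weeks until the backlog
    is cleared, recording how many weeks each simulated future took."""
    results = []
    for _ in range(runs):
        done, weeks = 0, 0
        while done < backlog:
            done += random.choice(throughput)  # one random past week happens again
            weeks += 1
        results.append(weeks)
    return sorted(results)

results = forecast_weeks(weekly_throughput, backlog=63)
# Aggregated probability: the number of weeks within which 50% / 85% / 95%
# of the simulated futures finished all 63 work items.
for pct in (50, 85, 95):
    weeks = results[int(len(results) * pct / 100) - 1]
    print(f"{pct}% of simulations finished within {weeks} weeks")
```

The 85th and 95th percentiles are the numbers worth quoting to stakeholders: “we are 85% confident this lands within N weeks” is a far more honest answer than a single-point estimate.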

So if you somehow have 4 weeks back-to-back where you managed to knock it out of the park and deliver 20 work items every week, which you have only ever managed once before, then in theory you could deliver this project in 4 weeks. But as we know from gut feeling, this is highly unlikely and only happens in 4% of our few thousand simulations. 

Likewise, if you have 7 of your worst-ever performing weeks as a team back-to-back and only deliver 4/5/6 work items each week, it will take you 9 weeks to deliver this project. Again, though, the probability suggests this is highly unlikely and would (based on past performance) only happen in 2% of the thousands of alternate-reality simulations.

No scheduling tool is ever going to be a silver bullet. There is no replacing the Agile principles of collaboration, working in the open, responding to change and focusing on delivering value. So:

  • Be proactive in removing blockers and improving your ways of working and processes.
  • Acknowledge your environment and its unique complexity. If you have a habit of scope creep due to late-stage requirement changes, build that into your initial scope (e.g. in my current team we add 5-10 additional work items per sprint to our estimate, as this is the consistent scope creep visible in our sprint burndown charts).
  • A schedule doesn’t remove the need for difficult conversations with stakeholders about what will be delivered in a certain timeframe – if it doesn’t fit, it doesn’t fit. At least this way, you get to have that conversation up front with the stakeholders before they have spent their budget which allows for better business decisions (great mixed with WSJF).
  • Only the people who are doing the work should estimate/break it down (Reliability is helped by a consistent team).
  • Fix issues as you find them and charge the time back to the original task.
  • Don’t let managers badger people into shorter estimates. If it doesn’t fit, it doesn’t fit.
  • Whilst you are building up your past performance data set, be careful who you share the schedule with.

It’s pretty easy! A day or so at the start of the iteration to plan out the work and break it down, plus a few seconds per day at its most in-depth. My version takes about 5 minutes per week!

Four steps:

  • Break it down.
  • Track elapsed time.
  • Simulate the future.
  • Be proactive.

Realistic schedules should lead to informed decisions, which in turn should lead to a better product, a happier client and a consistent pace of work, enabling you to log off at 5pm come deadline day!

A blog by

Joe Walker

Principal Consultant