How to Excel with past, present, and future data

By Liam Bastick, FCMA, CGMA

February 19, 2019

Please note: This item is from our archives and was published in 2019. It is provided for historical reference. The content may be out of date and links may no longer function.

Many accountants and analysts have to analyse trends using past and current performance to create informed forecasts for the future. It’s a common problem that causes consternation and bad Excel modelling far too frequently. However, it can be circumvented using a simple layout and the application of functions discussed in previous editions of this column (more on that below).

This requires three distinct data sets:

Historical information: Data that actually happened. These data often are recorded in management information systems, general ledgers, tax returns, etc. Data in this category tend to have one thing in common: They are a matter of fact.
Actual data: Whilst historical information summarises past data that have been locked down, “actuals” may be slightly more fluid in that data may be available for the first nine months of the year, say, combined with three months’ projections, based upon predicted orders, work schedules, hours of operation, and so on.
Forecast projections: These numbers are usually based upon calculations and correlations with the historical/actual data. The first two data sets may be distinct and independent, but typically, there is a relationship with at least one of the other types and forecast projections.

You may or may not agree with my above summation. That’s not really the point. The key takeaway is that when we undertake financial modelling, we often have more than one situation that occurs over the time periods in the same row of the financial model, as shown in the screenshot below.

The problem with data set out like this is twofold:

Less experienced modellers will use three types of data entry for the three segments, which may be a combination of formulas and/or hard code. The concern is someone else will come along and copy one formula across the entire row that might not be appropriate for the other data categories.
More experienced modellers acknowledge one consistent formula must be used across the row — but then they might try to do all calculations in the one cell and construct a formula that typically starts =IF(IF(… etc.

Neither option is optimal, but the second is nearer the truth. The problem with =IF(IF(… is it intimidates end users who try to follow a succession of nested calculations. Excel 2016’s IFS function is even worse as people have to understand both the concept of nesting and the prevailing syntax. Stepping out seems way simpler. And it gives me an opportunity to bring together quite a few of my past articles in an applied example.

Let me explain. I am going to start with an example with three scenarios (download this Excel file to follow along):

Historical: In my example, these will be inputs, but they could link to another file.
Actual: In my example, again, these will be inputs, but this could be a calculation based on actual data for the period to date and then extrapolated.
Forecast: In my example, this will be a formula.

You could have five situations: two linked, one hard-coded, and two formulaic — the idea would remain the same. The first thing is to collate all necessary inputs, as shown in the screenshot below.

My plan here is to use the first row of data for historical periods, use the second row for the actual period, and then use the growth rate on the previous period for forecasts. Clearly, only one value would be used in each year (column).

Next, I created a selector for which year was the actual year, as shown below.

The input cell (Actual, in yellow and using the data validation technique detailed in last month’s column) drives the other two dates, as Historical would have to be all periods up to the year before and Forecast from the following year. More situations would require more inputs, but the general idea would remain similar.

I can now create a “flag” system (1 for on/yes, 0 for off/no) to denote which situation relates to which period, as demonstrated in the screenshot below. These 1’s and 0’s are then (conditionally) formatted to look like ticks (for zeroes) and crosses (for ones) in row 41. Allow me to clarify.

The Historical flags (row 37 of the image above) are 1 for years 2015, 2016, and 2017, using the formula

=(G$36<=$G25)*1

in cell G37, for example. This checks that the year in question is less than the value in cell G25, which is 2017. Similarly, the Actual formula is

=(G$36=$G26)*1

and the Forecast formula is

=(G$36>=$G27)*1

which means all formulas are simple.

A check has been put in row 41 to ensure there is a “1” (but only one “1”) in each column (year), too:

=(SUM(G37:G39)<>1)*1

Finally, the Choice (row 43) uses the MATCH function (see this past article for more information) to determine where the 1 in each column is, ie, 1 if it is in the first row, 2 if it is in the second, etc.

=MATCH(1,G$37:G$39,0)

MATCH(lookup_value,lookup_array,[match_type]) returns the relative position of an item in an array that (approximately) matches a specified value. It is not case-sensitive.

The third argument, match_type, does not have to be entered, but for most situations, I strongly recommend that it is specified. It allows one of three values:

match_type 1 [default if omitted]: Finds the largest value less than or equal to the lookup_value — but the lookup_array must be in strict ascending order, limiting flexibility;
match_type 0: Probably the most useful setting, MATCH will find the position of the first value that matches lookup_value exactly. The lookup_array can have data in any order and even allows duplicates; and
match type -1: Finds the smallest value greater than or equal to the lookup_value — but the lookup_array must be in strict descending order, again limiting flexibility.

When using MATCH, if there is no (approximate) match, #N/A is returned (this may also occur if data are not correctly sorted depending upon match_type).

For example:

In the figure above, MATCH(“d”,F12:F22,0) gives a value of 6, being the relative position of the first “d” in the range. Note that having match_type 0 here is important. The data contain duplicates and are not sorted alphanumerically. Consequently, match_types 1 and -1 would give the wrong answer: 7 and #N/A, respectively.

Returning to the 1’s and 0’s section (the latter displayed as dashes in the graphic), these values have also been used to drive conditional formatting in the original input section, too, as shown in the screenshot below.

To create grey cells, I have simply highlighted the entire range and used conditional formatting (in the Styles section of the Home tab of the ribbon, Alt+O+D), as shown below.

We are pretty much done now. All that is needed is a formula for the output. Let’s consider the following screenshot.

The formula

=CHOOSE(G$43,G$13,G$14,F53*(1+G$15))

in cell G53 uses the value in row 13 if the choice is 1 (Historical selected), in row 14 if the choice is 2 (Actual) and multiplies the previous period by 1 plus the growth rate in row 15 if the choice is 3 (Forecast), using the CHOOSE function.

The CHOOSE function employs the following syntax to operate:

CHOOSE(index_number, value1, [value2])

As a reminder, the CHOOSE function has the following arguments:

index_number: This is required and is used to specify which value argument is to be selected. The index_number must be a number between 1 and 254, or a formula or reference to a cell containing a number between 1 and 254.
- If index_number is 1, CHOOSE returns value1; if it is 2, CHOOSE returns value2; and so on.
- If index_number is less than 1 or greater than the number of the last value in the list, CHOOSE returns the #VALUE! error value.
- If index_number is a fraction, it is truncated to the lowest integer before being used.
value1, value2, …: value 1 is required, but subsequent values are optional. There may be between 1 and 254 value arguments from which CHOOSE selects a value or an action to perform based on index_number. The arguments can be numbers, cell references, defined names, formulas, functions, or text.

You may recall CHOOSE from my October 2018 article. The great thing about this method is regardless of whether end users know about conditional formatting, INDEX or CHOOSE, the actual final calculation is easy to follow on paper, as shown in the screenshot below.

And that’s what financial modelling is all about!

— Liam Bastick, FCMA, CGMA, FCA, is director of SumProduct, a global consultancy specialising in Excel training. He is also an Excel MVP (as appointed by Microsoft) and author of An Introduction to Financial Modelling. Send ideas for future Excel-related articles to him at liam.bastick@sumproduct.com. To comment on this article or to suggest an idea for another article, contact Jeff Drew, an FM magazine senior editor, at Jeff.Drew@aicpa-cima.com.

How to Excel with past, present, and future data

Up Next

Gen Z workers looking for new skills, purpose — along with a new job

Our Websites

Up Next

Most Read

LATEST STORIES

Related Articles