Advertisement

Focus on 2 new Excel functions: UNIQUE and FILTER

In Part 5 of a seven-part series, Excel MVP Liam Bastick takes a look at two of the seven new functions Microsoft has created to go with its spreadsheet program’s new Dynamic Arrays capabilities.

Editor’s note: Microsoft recently announced a partial rewrite of Excel’s calculation engine. The result is an evolutionary leap that promises to bring numerous improvements to Excel users in the coming years. Already unveiled are a new type of calculation (Dynamic Arrays), seven new functions, and two new errors. These changes have not yet been rolled out to all Excel users, but you don’t have to wait to see what they can do. Excel MVP Liam Bastick is covering all the changes in a seven-part series. In this instalment, Bastick shows how to use two of the new functions, FILTER and UNIQUE.  

In Part 4 of this series, we covered arguably the two most exciting of the seven fresh functions that work with Excel’s new Dynamic Arrays capabilities. You don’t have to read Part 4 before reading this article, but it’s worth checking out (you should find it at least SORT of helpful).

If you haven’t read Parts 1–3, don’t despair! If you want to catch up, the links are at the bottom of this article. If you don’t want to read three articles before continuing this one, the following section covers what you need to know about Dynamic Arrays (if you have read the other articles, you can jump ahead to the “Fiddling With the FILTER Function” section).

Dynamic Arrays: A quick review

Microsoft’s new Dynamic Arrays functionality greatly simplifies the process of working with arrays in Excel. There’s no more copying formulas to the entire range and having to press Ctrl+Shift+Enter to see multiple results. Instead, you enter a formula in one cell and, poof, the data you are looking for just spill out. Just look at the screenshots below. As you can see, the formula entered into cell F33 in the right image is =F12:H27, which references the original data shown in the left image.

excel5-1


The functionality shown above is unprecedented in Excel. Of the seven new functions that come with Dynamic Arrays, the two covered in this article are interesting on their own but combine them with other functions and good things start to happen, as shown with the SORT(UNIQUE(FILTER())) formula mentioned later in the article. But first ...

Fiddling with the FILTER function

The FILTER function will accept an array, allow you to filter a range of data based upon criteria you define, and return the results to a spill range.

The syntax of FILTER is as follows:

=FILTER(array, include, [if_empty]).

It has three arguments:

  • array: This is required and represents the range that is to be filtered.
  • include: This is also required. This specifies the condition(s) that must be met.
  • if_empty: This argument is optional. This is what will be returned if no data meets the criterion/criteria specified in the include argument. It’s generally a good idea to at least use "" (ie, a pair of double quotation marks) here.

For example, consider the following source data:

excel5-2


To begin with, I will perform a simple FILTER:

excel5-3


Here, in cell F36, I have created the formula

=FILTER(F12:I27,G12:G27=G33,"Not Located.")

F12:I27 is my source array and I wish only to include shapes (column G12:G27) that are Triangles (specified by cell G33). If there are no such shapes, then “Not Located.” is returned instead. To show this, I will change the shape as follows:

excel5-4


That is about as basic as it gets. I can get cleverer. Consider the following example:

excel5-5


I have repeated the source array (cells F48:I63) for clarity. The formula

=FILTER(F48:I63,(G48:G63=G69)*(H48:H63=G70),{"-","None","N/A","N/A"})

looks horrible to begin with, but it’s not quite as bad as it appears upon further scrutiny. The argument below,

(G48:G63=G69)*(H48:H63=G70)

contains two conditions. Firstly, G48:G63=G69 means that the “Shape” (column G48:G63) has to be a “Triangle” (cell G69) and that the “Colour” (column H48:H63) has to be “Red” (cell G70). The multiplication operator (*) is used to denote AND. The Excel function AND cannot be used with arrays — this is nothing special to Dynamic Arrays; AND does not work with Ctrl+Shift+Enter formulas either. This syntax is similar to how you would create AND criteria with the SUMPRODUCT function, for example.

The final argument is similar to the syntax in SORT: {"-","None","N/A","N/A"}. Braces (typed in!) are used to create an array argument that specifies what should be written in each column should there be no record that meets both criteria, eg.

excel5-6


See? Not as bad as you might first think.

My final example is very similar:

excel5-7


Once you realise I have simply repeated referencing for clarity, the formula

=FILTER(F84:I99,(G84:G99=G105)+(H84:H99=G106),{"-","None","N/A","N/A"})

is nothing more than the OR equivalent of the previous example, with + replacing * to switch from ensuring both conditions are met to only one condition being met. As at the time of writing, XOR is not catered for, but I am sure some clever person will create an equivalent in due course (if Microsoft doesn’t beat them to it), necessity being the mother of invention and all that jazz.

This function takes a UNIQUE approach

The hilarious thing about UNIQUE is that it does two things. It details distinct items (ie, provides each value that occurs with no repetition), and it also can return values that occur once and only once in a referred range. I understand that Excel users may welcome the former use with open arms and that database developers may be very interested in the latter. I still think there should have been two functions though. Otherwise, let’s just extend the AGGREGATE function to do just everything (it almost does now) and be done with it!

The UNIQUE function has the following syntax:

=UNIQUE(array, [by_column], [occurs_once]).

It has three arguments:

  • array: This is required and represents the range or array from which to return unique values.
  • by_column: This argument is optional. This is a logical value (TRUE/FALSE) indicating how to compare. If you wish to compare by row, the argument should be FALSE or omitted (since this is the default). To compare by column, you will need to select TRUE.
  • occurs_once: This argument is also optional. This requires a logical value, too:
    • TRUE: Return only unique values that occur once;
    • FALSE: Include all distinct values (default if omitted).

It’s probably clearer with some examples. Let’s give it a go. As always, I need source data:

excel5-13


Time for the most basic illustration:

excel5-8


In cell L13, I have simply typed

=UNIQUE(F13:F41).

No optional arguments; everything in default. If I have made an error, it’s going to be my default. This has simply listed each store that appears; if “North” and “North ” (extra space) were there, then both would appear. UNIQUE is not case sensitive though, and each entry would appear as it first occurs reading down the range F13:F41. The other columns contain similar formulas, and UNIQUE looks like it takes seconds to learn.

It’s just as simple if you want to see unique records for two (or more) columns, as shown in the screenshot below.

excel5-9


You can see UNIQUE is sort of crying out for SORT, but we’ll get to that shortly.

As mentioned earlier, it’s not the only way of using UNIQUE (no, having a unique use would be just what “they” were expecting, whoever “they” are …). You can use it to determine values that occur only once:

excel5-10


Here, the formula in cell L56,

=UNIQUE(G56:G84,0,1)

uses the nondefault value of 1 for the optional occurs_once (third) argument. This means it identifies the salespeople who occur only once in cells G56:G84. Brilliant! I can die content now.

The real power starts coming when you start playing with Excel’s existing functions and features, together with these new functions. Take this comprehensive example:

excel5-11


Let me step you through some of this. The formulas in cells L94 and M94 use UNIQUE in a similar manner to my first example, to generate the list of distinct values in the “Section” and “Manager” fields. However, did you notice they have been sorted? That’s because I used the formula

=SORT(UNIQUE(H94:H122))

in cell L94, for example. Honestly, I think UNIQUE should have another argument for sorting (ascending/descending/none [default]). Watch Microsoft ignore that suggestion.

But then I did something really cool. Cells M105 and M106 use data validation (Alt+D+L) to generate a list from the “Lookup Data” section. That requires taking a closer look:

excel5-12


Do you see the source for the data validation in cell M105? =$L$94# — so elegant! This takes the “Section” list and automatically makes the dropdown list the required length! People create all sorts of tricks using OFFSET, dynamic range names, and the like to achieve a similar effect. No more. =$L$94# (with the #, the Spilled Range Operator) is all that is needed. That’s my favourite thing in all of these new functions and features. I’m impressed.

The AND/OR dropdown is a bit of an anticlimax after that, but the final formula that generates the final table, namely

=SORT(UNIQUE(FILTER(F93:I122,IF(M108="OR",(H93:H122=M105)+(I93:I122=M106),
(H93:H122=M105)*(I93:I122=M106)),{"N/A","-","-","-"})),{1;2;3;4},{1;1;1;1})

is rather fun. I am not going to go through it though — as every aspect of this formula is simply a rehash of an earlier point (assuming you know the IF function!). See if you can work your way through it for yourself.

An Excel Evolution: The Series

Getting started:

Part 1: Excel: Microsoft Unveils Dynamic Upgrades

Part 2: Excel: Arrays, Tables, and “The Twilight Zone”

Part 3: Excel: 2 New Errors and a SINGLE Function

Function focus:

Part 4: This Excel Change is More Than SORT of a Big Deal

Part 5: Focus on 2 New Excel Functions: UNIQUE and FILTER

Part 6: Excel New Function Focus: SEQUENCE and RANDARRAY

Final thoughts

Part 7: Will Dynamic Arrays Kill Data Tables, PivotTables?

— Liam Bastick, FCMA, CGMA, FCA, is director of SumProduct, a global consultancy specialising in Excel training. He is also an Excel MVP (as appointed by Microsoft) and author of Introduction to Financial Modelling. Send ideas for future Excel-related articles to him at liam.bastick@sumproduct.com. To comment on this article or to suggest an idea for another article, contact Jeff Drew, an FM magazine senior editor, at Jeff.Drew@aicpa-cima.com.