blogunique

COUNT UNIQUE DUPLICATE VALUES IN EXCEL

COUNTIF is an excellent function to count only those cells whose  value meets a certain criteria.

But as excellent as it is, it becomes a challenge to count Unique or Repeats in a range that contain duplicates.

For example, Using below data count;

  1. Total number of unique customers
  2. Total Non-Repeat Customers
  3. Total Repeat customers

countif-duplicates

There are 4 ways of counting unique values in excel;

  • Using an array formula (SUM & COUNTIF)
{=SUM(1/COUNTIF(C2:C23,C2:C23))} =16
  • Using SUMPRODUCT (SUMPRODUCT & COUNTIF)
=SUMPRODUCT((1/COUNTIF(C2:C23,C2:C23)))=16
  • Using a combination of SUM, IF, FREQUENCY & MATCH
=SUM(IF(FREQUENCY(MATCH(C2:C23,C2:C23,0),MATCH(C2:C23,C2:C23,0))>0,1,0))=16

The Fourth way comes from below Comment by Hervé Thiriez ( http://monsieur-excel.blogspot.co.ke/)

  • Using a combination of SUMPRODUCT, ROW & MATCH
=SUMPRODUCT(1*(ROW(C2:C23)=MATCH(C2:C23,C:C,0)))=16 OR {=SUM(1*(ROW(C2:C23)=MATCH(C2:C23,C:C,0)))}=16

How it works;

  • COUNTIF(C2:C23,C2:C23)generates an array of occurences of each customer
    {2;2;1;2;2;2;2;1;1;1;1;1;1;2;2;1;1;2;2;2;2;1}
  • 1/COUNTIF(C2:C23,C2:C23) generates an array of reciprocal of occurences
     {0.5;0.5;1;0.5;0.5;0.5;0.5;1;1;1;1;1;1;0.5;0.5;1;1;0.5;0.5;0.5;0.5;1}

Then Either SUM (array formula) or SUMPRODUCT adds up the reciprocals

►As for the combination of SUM, IF, FREQUENCY & MATCH;

  • The MATCH function searches for a specified item in a range of cells, and then returns the relative position of that item in the range.
MATCH(C2:C23,C2:C23,0)={1;1;3;4;4;6;6;8;9;10;11;12;13;14;14;16;17;18;18;20;20;22}
  • The FREQUENCY function calculates how often values occur within a range of values, and then returns a vertical array of numbers.
FREQUENCY(MATCH(C2:C23,C2:C23,0),MATCH(C2:C23,C2:C23,0))= {2;0;1;2;0;2;0;1;1;1;1;1;1;2;0;1;1;2;0;2;0;1;0}
  • The IF function returns a value of 1 if  Frequency is greater than Zero.
IF(FREQUENCY(MATCH(C2:C23,C2:C23,0),MATCH(C2:C23,C2:C23,0))>0,1,0)={1;0;1;1;0;1;0;1;1;1;1;1;1;1;0;1;1;1;0;1;0;1;0}
  • The SUM function adds all the numbers returned by IF function.
    =SUM({1;0;1;1;0;1;0;1;1;1;1;1;1;1;0;1;1;1;0;1;0;1;0})=16

►Combination of SUMPRODUCT, ROW & MATCH

  • ROW(C2:C23) returns the row numbers
{2;3;4;5;6;7;8;9;10;11;12;13;14;15;16;17;18;19;20;21;22;23}
  • MATCH(C2:C23,C:C,0) just like in FREQUENCY formula above returns the relative position of each item in the range
{2;2;4;5;5;7;7;9;10;11;12;13;14;15;15;17;18;19;19;21;21;23}
  • (ROW(C2:C23)=MATCH(C2:C23,C:C,0)) returns a Boolean array TRUE= unique rows, FALSE= Duplicate rows
{TRUE;FALSE;TRUE;TRUE;FALSE;TRUE;FALSE;TRUE;TRUE;TRUE;TRUE;TRUE;TRUE;TRUE;FALSE;TRUE;TRUE;TRUE;FALSE;TRUE;FALSE;TRUE}
  • 1*(ROW(C2:C23)=MATCH(C2:C23,C:C,0)) by multiplying this boolean array with 1 you convert TRUE=1 While FALSE=0
{1;0;1;1;0;1;0;1;1;1;1;1;1;1;0;1;1;1;0;1;0;1}

SUMPRODUCT just sums up the array.

 TOTAL NUMBER OF NON-REPEAT CUSTOMERS

Armed with above understanding it is easier to calculate the total number of non repreat cusotmers. Just sum customers whose frequency is  equal to 1

=SUM(IF(FREQUENCY(MATCH(C2:C23,C2:C23,0),MATCH(C2:C23,C2:C23,0))=1,1,0))=10

OR

=SUMPRODUCT(--(COUNTIF(C2:C23,C2:C23)=1)) =10

 TOTAL NUMBER OF REPEAT CUSTOMERS

As for repeat customers, just sum the customer whose frequency is more than 1

=SUM(IF(FREQUENCY(MATCH(C2:C23,C2:C23,0),MATCH(C2:C23,C2:C23,0))>1,1,0))=6

OR

=SUMPRODUCT(--(FREQUENCY(MATCH(C2:C23,C2:C23,0),MATCH(C2:C23,C2:C23,0))>1))=6

OR

{=SUM(IF((ROW(C2:C23)=MATCH(C2:C23,C:C,0)),0,1))=6}

DOWNLOAD WORKSHEET

RECOMENDATION:

Watch these Videos on FREQUENCY

4 thoughts on “COUNT UNIQUE DUPLICATE VALUES IN EXCEL”

  1. Hervé Thiriez says:

    Another solution:
    =SUMPRODUCT(1*(ROW(C2:C23)=MATCH(C2:C23,C:C,0)))

    It is a non-matrix formula, but you could also use a similar matrix formula with SUM instead of SUMPRODUCT…

    1. Thanks Herve’….I did not think of that
      well done

  2. Crispo, this is cool.
    The beauty of Excel is that it offers you myriad ways to achieve the same thing.
    I would add 2 more approaches to the above:
    METHOD 1:
    Add a helper column in your data using the formula =COUNTIF($C$2:C2,C2).
    Assuming this helper column is in the range G2:G23, then the following formula should give you distinct count of customers, i.e. =SUMIF($G$2:$G$23,1)

    METHOD 2:
    This would work best for those with Excel 2013 and above.
    In this method, I would use Pivot tables to summarize the data. In the Create PivotTable dialog box, I would then ensure I tick “Add this to the Data Model”. I would go ahead and insert a pivot table with the customer name in the Values box of the pivot table, and then change the function that is the basis of the resultant summary from COUNT to DISTINCTCOUNT

    1. Thanks William….True Excel offers different alternatives to same end. Will definately try the pivot table method

Leave a Comment:

Your email address will not be published. Required fields are marked *