I have a crosstab with customersID x agegroup. See picture.
When the scope of the analysis is 1 year then many customers have had a birthday in the period. But not all of them. For example if a new customer arrives in July, he/she can already have had his/her birthday in April. This gives a mix where some customers are represented I one age group and others are represented in two age groups - cluttering the total number of clients. I.e. the total number of clients does not equal the total for the 3 age groups as 2.695 does not egual 2.601(A customer can also have had his/her birthday and not change age group).
The first 3 columns (d1:d3) are my original age group dimensions and the last 4 columns are calculated columns. Column 4 (c1) “# age groups pr. customer” is the sum of d1:d3. A customer with the value 2 is represented in 2 age groups and has to be handled (cleaned). I use underscore _ to remind myself that this is a calculated column. Column 5 (c2) “_Age -29 Cleaned” does the following.
if sum(d1, 0, m1) = 1 then sum(d1, 0, m1) - sum(d2, 0, m1) else 0
The condition simply states that if the value of columns 1 for “this row” is 1 then (column 1 – column 2) else write 0 as there is no value in column 1. I can presume this as a customer only can be represented once or not at all in one age group. Stating (column 1 – column 2) makes sure that a customer is not represented in both column 1 and column 2. He/she is pushed up in the next/older age group.
I repeat this logic for the middle age group and it goes without saying that the oldest age group does not need to be handled/cleaned.
The last column “_# age groups pr. Customer Cleaned” is simply checking that all customers only belong to one age group. The check is done by summing the 2 new calculated columns _Age -29 Cleaned, _Age 30-59 Cleaned and the original Age 60- dimension. When there is 1 in all rows all customers only belong to one age group. This is verified as sum equals allcount for the entire column.
I'm nicing the output via visibility. As i'm only showing the total for_Age -29 Cleaned, _Age 30-59 Cleaned and the original Age 60- dimension and editing Dynamic captions for relevant members.
The problem is……Now I have to do the same exercise not on age groups but on actual age. I.e. 1,2,3,4…99,100…. This gives me about 100 columns and the method described here does not seem optimal.
I’m not proud to say that I solved the problem in Excel, but I would like to know if someone has an idea to how the problem can be solved in the TARGIT BI Suite. For example by returning the age group dimension value?
Edited by user Friday, September 14, 2012 10:20:32 AM(UTC)
| Reason: Not specified