how to transform a table in Excel from vertical to horizontal but with different length

Here, a solution that uses DROP/REDUCE/VSTACK pattern to generate each row. Check for example @JvdV’s answer from this question: How to split texts from dynamic range? and a similar idea DROP/REDUCE/HSTACK pattern to generate the columns for a given row. In cell E2 put the following formula:

=LET(set, A2:B13, IDs, INDEX(set,,1), dates, INDEX(set,,2),
  HREDUCE, LAMBDA(id, arr, REDUCE(id, arr, LAMBDA(acc, x, HSTACK(acc, x)))),
  output, DROP(REDUCE("", UNIQUE(IDs), LAMBDA(ac, id, VSTACK(ac, LET(
    idDates, FILTER(dates, ISNUMBER(XMATCH(IDs, id))), HREDUCE(id, idDates)
  )))),1), IFERROR(VSTACK(HSTACK("ID", "Dates"), output), "")
)  

and here is the output:

sample excel file

Update

As @JdvD pointed out in the comments section there is a shorted way:

=LET(set, A3:B13, title, A1:B1, IDs, INDEX(set,,1), dates, INDEX(set,,2),
  IFERROR(REDUCE(title, UNIQUE(IDs),LAMBDA(ac, id, 
    VSTACK(ac,HSTACK(id,TOROW(FILTER(dates,IDs=id)))))),"")
)

The main idea is to use the title as a way to initialize the VSTACK accumulator (no need to use DROP), and have all the dates for a given id all at once via the FILTER function. As a side note, it can be expressed in terms of the pattern we explained in the Explanation section (see below), as follow:

=LET(set, A3:B13, title, A1:B1, IDs, INDEX(set,,1), dates, INDEX(set,,2),
  HREDUCE, LAMBDA(id, HSTACK(id, TOROW(FILTER(dates,IDs=id)))),
  IFERROR(REDUCE(title, UNIQUE(IDs),LAMBDA(ac,id, VSTACK(ac, HREDUCE(id)))),"")
)

Note: Keeping the same name of the user LAMBDA function (HREDUCE) for sake of consistency with the Explanation section, but there is no need to use REDUCE. A more appropriate name would be PIVOT_DATES.

Explanation

HREDUCE is a user LAMBDA function that implements the DROP/REDUCE/HSTACK pattern. In order to generate all the columns for a given row, this is the pattern to follow:

DROP(REDUCE("", arr, LAMBDA(acc, x, HSTACK(acc, func))),,1) 

It iterates over all elements of arr (x) and uses HSTACK to concatenate column by column on each iteration. DROP function is used to remove the first column, if we don’t have a valid value to initialize the first column (the accumulator, acc). The name func is just a symbolic representation of the calculation required to obtain the value to put on a given column. Usually, some variables are required to be defined, so quite often the LET function is used for that.

In our case we have a valid value to initialize the iteration process (no need to use DROP function), so this pattern can be implemented as follow via our user LAMBDA function HREDUCE:

LAMBDA(id, arr, REDUCE(id, arr, LAMBDA(acc, x, HSTACK(acc, x))))

In our case the initialization value will be each unique id value. The func will be just each element of arr, because we don’t need to do any additional calculation to obtain the column value.

The previous process can be applied for a given row, but we need to create iteratively each row. In order to do that we use a DROP/REDUCE/VSTACK pattern, which is a similar idea:

DROP(REDUCE("", arr, LAMBDA(acc, x, VSTACK(acc, func))),1)

Now we append rows via VSTACK. For this case we don’t know how to initialize properly the accumulator (acc), so we need to use DROP to remove the first row. Now fun will be: HREDUCE(id, idDates), i.e. the LAMBDA function we created before to generate all the dates columns for a given id. Now we use a LET function to name the selected dates for a given id (idDates).

At the beginning of each row (first column), we are going to have the unique IDs (UNIQUE(IDs)). To find the corresponding dates for each unique ID (id) we use the following:

FILTER(dates, ISNUMBER(XMATCH(IDs, id)))

and name the result idDates.

Finally, we build the output including the header. We pad non existing values with the empty string to avoid having #NA values. This is the default behavior of V/HSTACK functions. We use IFERROR function for that.

IFERROR(VSTACK(HSTACK("ID", "Dates"), output), "")

Note: Both patterns are very useful to avoid Nested Array Error (#CALC!) usually produced by some of the new Excel array functions, such as BYROW, BYCOL, MAP when using TEXTSPLIT for example. This is one of the effective ways to overcome it.

Leave a Comment