They’re both correct, but yours is preferred from the point of view of numerical stability.
You start with
e ^ (x - max(x)) / sum(e^(x - max(x))
By using the fact that a^(b – c) = (a^b)/(a^c) we have
= e ^ x / (e ^ max(x) * sum(e ^ x / e ^ max(x)))
= e ^ x / sum(e ^ x)
Which is what the other answer says. You could replace max(x) with any variable and it would cancel out.