Matplotlib: avoiding overlapping datapoints in a “scatter/dot/beeswarm” plot

Extending the answer by @user2467675, here’s how I did it:

def rand_jitter(arr):
    stdev = .01 * (max(arr) - min(arr))
    return arr + np.random.randn(len(arr)) * stdev

def jitter(x, y, s=20, c="b", marker="o", cmap=None, norm=None, vmin=None, vmax=None, alpha=None, linewidths=None, verts=None, hold=None, **kwargs):
    return scatter(rand_jitter(x), rand_jitter(y), s=s, c=c, marker=marker, cmap=cmap, norm=norm, vmin=vmin, vmax=vmax, alpha=alpha, linewidths=linewidths, **kwargs)

The stdev variable makes sure that the jitter is enough to be seen on different scales, but it assumes that the limits of the axes are zero and the max value.

You can then call jitter instead of scatter.

Leave a Comment