Speed up plot() function for large dataset

(This question is closely related to Scatterplot with too many points, although that question focuses on the difficulty of seeing anything in the big scatterplot rather than on performance issues …)

A hexbin plot actually shows you something (unlike the scatterplot @Roland proposes in the comments, which is likely to be just a giant, slow blob). For your example it takes about 3.5 seconds on my machine (about 0.5 seconds on a modern laptop):

set.seed(101)
a<-rnorm(1E7,1,1)
b<-rnorm(1E7,1,1)
library(hexbin)
system.time(plot(hexbin(a,b)))  ## 0.5 seconds, modern laptop
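To see why this scales, it can help to look at the binned object itself: hexbin() reduces the points to per-hexagon counts, and plot() then draws only the hexagons. The following is a minimal sketch, not part of the original timing; the smaller n and the choice xbins = 50 are assumptions made purely for illustration.

```r
library(hexbin)  # not part of base R: install.packages("hexbin")

set.seed(101)
n <- 1e6                 # assumption: smaller than the 1e7 above, to keep the sketch quick
a <- rnorm(n, 1, 1)
b <- rnorm(n, 1, 1)

# Bin once; xbins (default 30) sets the number of hexagons across the x range.
hb <- hexbin(a, b, xbins = 50)

length(hb@count)         # number of non-empty hexagons that plot() has to draw
sum(hb@count) == n       # every point is counted in exactly one hexagon

# Plotting the stored object reuses the bins instead of recomputing them.
system.time(plot(hb))
```

Storing the result of hexbin() this way lets you redraw the plot without repeating the binning step.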


Another, slightly slower alternative is the base-R smoothScatter function: it plots a smooth density plus as many extreme points as requested (1000 in this case).

system.time(smoothScatter(a,b,cex=4,nrpoints=1000))  ## 3.3 seconds
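Two arguments of smoothScatter are worth knowing when tuning it: nbin sets the resolution of the density grid (default 128) and nrpoints sets how many points from the sparsest regions are overplotted (default 100). A rough sketch for comparing settings follows; the reduced n and the specific values are assumptions for illustration, and timings will vary by machine.

```r
set.seed(101)
n <- 1e5                 # assumption: a reduced sample so the comparison runs quickly
a <- rnorm(n, 1, 1)
b <- rnorm(n, 1, 1)

# Coarse density grid, no individual points overplotted
t_coarse <- system.time(smoothScatter(a, b, nbin = 64, nrpoints = 0))

# Finer density grid, plus the 1000 points from the lowest-density regions
t_fine <- system.time(smoothScatter(a, b, nbin = 256, nrpoints = 1000))

rbind(coarse = t_coarse, fine = t_fine)[, "elapsed", drop = FALSE]
```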

