numpy array TypeError: only integer scalar arrays can be converted to a scalar index

I ran into this problem when venturing to use numpy.concatenate to emulate a C++-style push_back for 2D vectors: if A and B are two 2D numpy arrays, then numpy.concatenate(A, B) yields the error. The fix was simply to add the missing inner parentheses, numpy.concatenate((A, B)), which are required because the arrays to be concatenated … Read more
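A minimal sketch of the failure and the fix (array contents chosen arbitrarily for illustration):

```python
import numpy as np

A = np.arange(6).reshape(2, 3)
B = np.arange(6, 12).reshape(2, 3)

# np.concatenate(A, B) raises the TypeError above: the second positional
# argument of np.concatenate is `axis`, and an array is not a scalar index.
C = np.concatenate((A, B))   # arrays go in one tuple; default axis=0
```

Here `C` has shape (4, 3): the rows of `B` are stacked below those of `A`.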

In what situation would the AVX2 gather instructions be faster than individually loading the data?

Newer microarchitectures have shifted the odds towards gather instructions. On an Intel Xeon Gold 6138 CPU @ 2.00 GHz with Skylake microarchitecture, we get for your benchmark: 9.383e+09 8.86e+08 2.777e+09 6.915e+09 7.793e+09 8.335e+09 5.386e+09 4.92e+08 6.649e+09 1.421e+09 2.362e+09 2.7e+07 8.69e+09 5.9e+07 7.763e+09 3.926e+09 5.4e+08 3.426e+09 9.172e+09 5.736e+09 … Read more

Concatenate range arrays given start, stop numbers in a vectorized way – NumPy

Think I have cracked it finally with a cumsum trick for a vectorized solution: def create_ranges(a): l = a[:,1] - a[:,0] clens = l.cumsum() ids = np.ones(clens[-1],dtype=int) ids[0] = a[0,0] ids[clens[:-1]] = a[1:,0] - a[:-1,1]+1 out = ids.cumsum() return out Sample runs: In [416]: a = np.array([[4,7],[10,16],[11,18]]) In [417]: create_ranges(a) Out[417]: array([ 4, … Read more
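Laid out as a proper function, the cumsum trick from the excerpt looks like this (assuming half-open [start, stop) ranges, consistent with the sample run):

```python
import numpy as np

def create_ranges(a):
    # a is an (n, 2) array of [start, stop) pairs
    l = a[:, 1] - a[:, 0]                        # length of each range
    clens = l.cumsum()                           # cumulative lengths = boundaries
    ids = np.ones(clens[-1], dtype=int)          # default step of +1 ...
    ids[0] = a[0, 0]                             # ... seeded with the first start
    ids[clens[:-1]] = a[1:, 0] - a[:-1, 1] + 1   # jump at each range boundary
    return ids.cumsum()

create_ranges(np.array([[4, 7], [10, 16], [11, 18]]))
# array([ 4,  5,  6, 10, 11, 12, 13, 14, 15, 11, 12, 13, 14, 15, 16, 17])
```

The single final cumsum turns the per-position step sizes into the concatenated ranges, which is what makes this loop-free.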

Efficiently replace elements in array based on dictionary – NumPy / Python

Approach #1: Loopy one with array data One approach would be extracting the keys and values into arrays and then using a similar loop: k = np.array(list(mapping.keys())) v = np.array(list(mapping.values())) out = np.zeros_like(input_array) for key,val in zip(k,v): out[input_array==key] = val The benefit of this one over the original is the spatial locality of the … Read more
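Filled in with a toy mapping (the names `mapping` and `input_array` come from the excerpt; the values here are made up for illustration), Approach #1 runs like this:

```python
import numpy as np

mapping = {1: 10, 2: 20, 5: 50}          # toy dictionary for illustration
input_array = np.array([1, 2, 1, 5, 3])

k = np.array(list(mapping.keys()))
v = np.array(list(mapping.values()))

out = np.zeros_like(input_array)          # note: unmapped elements stay 0
for key, val in zip(k, v):
    out[input_array == key] = val

print(out)  # [10 20 10 50  0]
```

Note the element 3, absent from the dictionary, maps to 0 because of `zeros_like`; if unmapped values should pass through unchanged, start from `input_array.copy()` instead.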

creating a column which keeps a running count of consecutive values

You can use the compare-cumsum-groupby pattern (which I really need to get around to writing up for the documentation), with a final cumcount: >>> df = pd.DataFrame({"binary": [0,1,1,1,0,0,1,1,0]}) >>> df["consec"] = df["binary"].groupby((df["binary"] == 0).cumsum()).cumcount() >>> df binary consec 0 0 0 1 1 1 2 1 2 3 1 3 4 0 0 5 0 … Read more
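Spelled out step by step, the compare-cumsum-groupby pattern from the answer works like this:

```python
import pandas as pd

df = pd.DataFrame({"binary": [0, 1, 1, 1, 0, 0, 1, 1, 0]})

# Each zero increments the group id, so every run of consecutive values
# gets its own group; cumcount then restarts from 0 within each run.
group_id = (df["binary"] == 0).cumsum()
df["consec"] = df["binary"].groupby(group_id).cumcount()

print(df["consec"].tolist())  # [0, 1, 2, 3, 0, 0, 1, 2, 0]
```

Runs of 1s count up 1, 2, 3, … while every 0 resets the counter to 0, matching the table in the answer.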

NumPy version of “Exponential weighted moving average”, equivalent to pandas.ewm().mean()

I think I have finally cracked it! Here's a vectorized version of the numpy_ewma function that's claimed to produce the correct results, from @RaduS's post: def numpy_ewma_vectorized(data, window): alpha = 2/(window + 1.0) alpha_rev = 1-alpha scale = 1/alpha_rev n = data.shape[0] r = np.arange(n) scale_arr = scale**r offset = data[0]*alpha_rev**(r+1) pw0 = … Read more
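For reference, the recurrence that pandas ewm(span=window, adjust=False).mean() computes, and that the vectorized version above is claimed to reproduce, can be written as a plain loop. This loop is a sketch of my own for checking results, not the vectorized answer itself:

```python
import numpy as np

def ewma_loop(data, window):
    # Recurrence behind pandas ewm(span=window, adjust=False).mean():
    #   out[i] = alpha*data[i] + (1 - alpha)*out[i-1]
    alpha = 2 / (window + 1.0)
    out = np.empty(len(data))
    out[0] = data[0]
    for i in range(1, len(data)):
        out[i] = alpha * data[i] + (1 - alpha) * out[i - 1]
    return out

ewma_loop(np.array([1.0, 2.0, 3.0]), window=3)  # array([1.  , 1.5 , 2.25])
```

With window=3, alpha is 0.5, so each output is the average of the new sample and the previous smoothed value; the vectorized version trades this loop for cumsum over rescaled terms.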

Fast Algorithms for Finding Pairwise Euclidean Distance (Distance Matrix)

Well, I couldn't resist playing around. I created a MATLAB mex C file called pdistc that implements pairwise Euclidean distance for single and double precision. On my machine, using MATLAB R2012b and R2015a, it's 20-25% faster than pdist (and the underlying pdistmex helper function) for large inputs (e.g., 60,000-by-300). As has been pointed out, this problem … Read more
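One common vectorization of the pairwise-distance computation (this is a generic NumPy sketch, not the author's mex implementation) uses the identity ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y, which turns the whole distance matrix into a single matrix multiply:

```python
import numpy as np

def pairwise_dist(X):
    # ||xi - xj||^2 = ||xi||^2 + ||xj||^2 - 2*xi.xj, via one GEMM
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.maximum(d2, 0.0, out=d2)   # clamp tiny negatives from rounding
    return np.sqrt(d2)

pairwise_dist(np.array([[0.0, 0.0], [3.0, 4.0]]))
# array([[0., 5.],
#        [5., 0.]])
```

The clamp matters in practice: floating-point cancellation can make near-zero squared distances slightly negative, and sqrt of those would yield NaN.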