Difference between df[x], df[[x]], df['x'], df[['x']] and df.x

df[x]: index a column using the variable x. Returns pd.Series.
df[[x]]: index/slice a single-column DataFrame using the variable x. Returns pd.DataFrame.
df['x']: index a column named 'x'. Returns pd.Series.
df[['x']]: index/slice a single-column DataFrame having only one column named 'x'. Returns pd.DataFrame.
df.x: dot accessor notation, equivalent to df['x'] (there are, however, …
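
The return types listed above can be checked directly. The following is a minimal sketch; the DataFrame, its column names and the variable x are made up for illustration and are not taken from the original answer.

```python
import pandas as pd

# Hypothetical frame: two columns, one of them literally named "x".
df = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]})
x = "y"  # a variable that happens to hold the column name "y"

by_variable = df[x]           # Series for column "y" (the value of x)
frame_by_variable = df[[x]]   # one-column DataFrame for column "y"
by_literal = df["x"]          # Series for the column named "x"
frame_by_literal = df[["x"]]  # one-column DataFrame for the column named "x"
by_attribute = df.x           # same Series as df["x"]

for result in (by_variable, frame_by_variable, by_literal,
               frame_by_literal, by_attribute):
    print(type(result).__name__)
```

Note that df[x] and df.x refer to different columns here, because the attribute form always uses the literal name x rather than the variable's value.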

Python Pandas iterate over rows and access column names

I also like itertuples():

    for row in df.itertuples():
        print(row.A)
        print(row.Index)

Since row is a named tuple, this should be MUCH faster if you only need to access the values on each row.

Speed run:

    df = pd.DataFrame([x for x in range(1000*1000)], columns=['A'])

    st = time.time()
    for index, row in df.iterrows():
        row.A
    print(time.time() - st)
    45.05799984931946

    st = time.time()
    for row in df.itertuples(): …
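
A self-contained version of this comparison might look like the sketch below. It assumes a much smaller frame (10,000 rows) than the one in the answer so that it finishes quickly, and it uses time.perf_counter() rather than time.time(); the variable names are illustrative. Absolute timings depend on the machine and the pandas version, so none are shown.

```python
import time

import pandas as pd

# Smaller than the frame in the answer, so the loop finishes quickly.
df = pd.DataFrame({"A": range(10_000)})

# iterrows() yields (index, Series) pairs; building a Series per row is costly.
start = time.perf_counter()
total_iterrows = 0
for index, row in df.iterrows():
    total_iterrows += row.A
iterrows_seconds = time.perf_counter() - start

# itertuples() yields one namedtuple per row, with the index as row.Index.
start = time.perf_counter()
total_itertuples = 0
for row in df.itertuples():
    total_itertuples += row.A
itertuples_seconds = time.perf_counter() - start

# The field names of the namedtuple give access to the column names.
field_names = next(df.itertuples())._fields

print(f"iterrows:   {iterrows_seconds:.4f} s")
print(f"itertuples: {itertuples_seconds:.4f} s")
print(field_names)
```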

Convert Pandas series containing string to boolean

You can just use map:

    In [7]: df = pd.DataFrame({'Status': ['Delivered', 'Delivered', 'Undelivered', 'SomethingElse']})

    In [8]: df
    Out[8]:
              Status
    0      Delivered
    1      Delivered
    2    Undelivered
    3  SomethingElse

    In [9]: d = {'Delivered': True, 'Undelivered': False}

    In [10]: df['Status'].map(d)
    Out[10]:
    0     True
    1     True
    2    False
    3      NaN
    Name: Status, dtype: object
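
The result above has dtype object because the unmapped value becomes NaN. As a possible follow-up (not part of the original answer), the sketch below flags the unmapped rows and converts the result to pandas' nullable "boolean" dtype, which assumes pandas 1.0 or later. The names mapped, unmapped and nullable are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {"Status": ["Delivered", "Delivered", "Undelivered", "SomethingElse"]}
)
d = {"Delivered": True, "Undelivered": False}

# Values missing from the dict become NaN, so the dtype is object.
mapped = df["Status"].map(d)

# Rows whose status had no entry in the mapping.
unmapped = mapped.isna()

# Assumption: pandas >= 1.0, where the nullable "boolean" dtype exists.
# It keeps True/False and represents the missing entry as <NA>.
nullable = mapped.astype("boolean")

print(df.loc[unmapped, "Status"].tolist())
print(nullable.dtype)
```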

Getting a list of indices where pandas boolean series is True

Using Boolean Indexing

    >>> s = pd.Series([True, False, True, True, False, False, False, True])
    >>> s[s].index
    Int64Index([0, 2, 3, 7], dtype='int64')

If you need a np.array object, get the .values

    >>> s[s].index.values
    array([0, 2, 3, 7])

Using np.nonzero

    >>> np.nonzero(s)
    (array([0, 2, 3, 7]),)

Using np.flatnonzero

    >>> np.flatnonzero(s)
    array([0, 2, 3, 7])

Using np.where

    >>> np.where(s)[0] …
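
With the default integer index used above, the index-based and NumPy-based approaches return the same numbers. They differ once the index is not 0..n-1: boolean indexing gives index labels, while the NumPy functions give integer positions. A minimal sketch, using a made-up string index and converting to a NumPy array explicitly with to_numpy():

```python
import numpy as np
import pandas as pd

# Hypothetical Series with string labels instead of the default index.
s = pd.Series([True, False, True], index=["a", "b", "c"])

# Boolean indexing keeps the index labels of the True entries.
labels = s[s].index.tolist()

# flatnonzero works on the underlying array and returns integer positions.
positions = np.flatnonzero(s.to_numpy()).tolist()

print(labels)     # ['a', 'c']
print(positions)  # [0, 2]
```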