Counting the Frequency of words in a pandas data frame

IIUIC, use value_counts()

In [3361]: df.Firm_Name.str.split(expand=True).stack().value_counts()
Out[3361]:
Society       3
Ltd           2
James's       1
R.X.          1
Yah           1
Associates    1
St            1
Kensington    1
MMV           1
Big           1
&             1
The           1
Co            1
Oil           1
Building      1
dtype: int64

Or,

pd.Series(np.concatenate([x.split() for x in df.Firm_Name])).value_counts()

Or,

pd.Series(' '.join(df.Firm_Name).split()).value_counts()

For top N, for example 3

In [3379]: pd.Series(' '.join(df.Firm_Name).split()).value_counts()[:3]
Out[3379]:
Society    3
Ltd        2
James's    1
dtype: int64

Details

In [3380]: df
Out[3380]:
      URN                   Firm_Name
0  104472               R.X. Yah & Co
1  104873        Big Building Society
2  109986          St James's Society
3  114058  The Kensington Society Ltd
4  113438      MMV Oil Associates Ltd

Leave a Comment