Efficiently count word frequencies in Python

The most succinct approach is to use the tools Python gives you.

    from future_builtins import map  # Only on Python 2

    from collections import Counter
    from itertools import chain

    def countInFile(filename):
        with open(filename) as f:
            return Counter(chain.from_iterable(map(str.split, f)))

That’s it. map(str.split, f) makes a generator that returns the list of words from each line. Wrapping …
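As a rough illustration of how the approach in this excerpt behaves, here is a self-contained Python 3 sketch. The function name count_in_file and the sample text are hypothetical, chosen only for the demonstration, and the future_builtins import is left out because it applies only to Python 2.

```python
from collections import Counter
from itertools import chain
import os
import tempfile


def count_in_file(filename):
    """Return a Counter mapping each whitespace-separated word to its count."""
    with open(filename) as f:
        # map(str.split, f) lazily yields one list of words per line;
        # chain.from_iterable flattens those lists into one stream of words.
        return Counter(chain.from_iterable(map(str.split, f)))


# Hypothetical sample text, written to a temporary file for the demonstration.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("the cat sat\nthe cat ran\nthe end\n")
    sample_path = tmp.name

counts = count_in_file(sample_path)
os.remove(sample_path)

print(counts.most_common(2))  # [('the', 3), ('cat', 2)]
```

Note that str.split with no arguments splits on any whitespace and does not strip punctuation or normalize case, so "The" and "the," would be counted as separate words in this sketch.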

How to generate distributions given mean, SD, skew and kurtosis in R?

There is a Johnson distribution in the SuppDists package. Johnson will give you a distribution that matches either moments or quantiles. Other comments are correct that four moments do not a distribution make, but Johnson will certainly try. Here’s an example of fitting a Johnson to some sample data:

    require(SuppDists)
    ## make a weird dist …
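To make the four quantities in the question concrete, the following is a minimal Python sketch (not the R code from the excerpt, and not a Johnson fit) that computes the mean, SD, skewness and kurtosis of a sample. The function name sample_moments is hypothetical, and the sketch assumes population-style estimates that divide by n; R's sd() divides by n - 1, so its SD would differ slightly.

```python
import math


def sample_moments(xs):
    """Return (mean, sd, skewness, kurtosis) of a sequence of numbers.

    Assumption: central moments divide by n (population-style), and
    kurtosis is the plain fourth standardized moment (3 for a normal),
    not the excess kurtosis.
    """
    n = len(xs)
    mean = sum(xs) / n
    # Second, third and fourth central moments.
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    sd = math.sqrt(m2)
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return mean, sd, skewness, kurtosis


mean, sd, skewness, kurtosis = sample_moments([1, 2, 3, 4, 5])
print(mean, sd, skewness, kurtosis)
```

Many different distributions share the same four values, which is why a moment-matching fit yields only one candidate distribution rather than a unique answer.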