Calculate mean and standard deviation from a vector of samples in C++ using Boost

I don’t know whether Boost has more specific functions, but you can do it with the standard library. Given std::vector<double> v, this is the naive way:

    #include <cmath>    // std::sqrt
    #include <numeric>  // std::accumulate, std::inner_product

    double sum = std::accumulate(v.begin(), v.end(), 0.0);
    double mean = sum / v.size();

    double sq_sum = std::inner_product(v.begin(), v.end(), v.begin(), 0.0);
    double stdev = std::sqrt(sq_sum / v.size() - mean …
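The excerpt above is cut off, so the following is not the answer's own code. It is a minimal Python sketch of the textbook sum-of-squares formula for the population standard deviation (mean of the squares minus the square of the mean), which appears to be what the naive approach is building toward. The function name `mean_and_stdev` and the sample data are hypothetical.

```python
import math

def mean_and_stdev(samples):
    """Population mean and standard deviation from the sum and the sum of squares."""
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    total = math.fsum(samples)                    # counterpart of std::accumulate
    sq_total = math.fsum(s * s for s in samples)  # counterpart of std::inner_product(v, v)
    mean = total / n
    # Rounding can push the difference slightly below zero, so clamp it.
    variance = max(sq_total / n - mean * mean, 0.0)
    return mean, math.sqrt(variance)

# Hypothetical sample data, chosen only for illustration.
mean, stdev = mean_and_stdev([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```

This form subtracts two potentially large, nearly equal numbers, so it can lose precision when the mean is large relative to the spread of the data; a two-pass computation that sums squared deviations from the mean is usually more robust.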

Product() aggregate function

The logarithm/power approach is the one generally used. For Oracle, that is:

    select exp(sum(ln(col))) from table;

I don’t know why the original database designers didn’t include PRODUCT() as an aggregate function. My best guess is that they were all computer scientists, with no statisticians among them. Such functions are very useful in statistics, but they don’t show …
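The identity behind the query is that the product of positive numbers equals exp of the sum of their logarithms. A minimal Python sketch of the same idea follows; the function name `product_via_logs` is hypothetical, and the guard reflects the fact that ln() is undefined for zero and negative values, so a column containing those would need separate handling.

```python
import math

def product_via_logs(values):
    """Product of strictly positive numbers, computed as exp(sum(ln(v)))."""
    if any(v <= 0 for v in values):
        raise ValueError("ln() is only defined for positive values")
    return math.exp(math.fsum(math.log(v) for v in values))

# Hypothetical column values, chosen only for illustration.
result = product_via_logs([2.0, 3.0, 4.0])
```

Because the computation goes through floating-point logarithms, the result can differ from the exact product by a small rounding error, so compare it with a tolerance rather than with strict equality.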

Compute a confidence interval from sample data assuming unknown distribution

If you don’t know the underlying distribution, then my first thought would be to use bootstrapping: https://en.wikipedia.org/wiki/Bootstrapping_(statistics)

In pseudo-code, assuming x is a numpy array containing your data:

    import numpy as np

    N = 10000
    mean_estimates = []
    for _ in range(N):
        re_sample_idx = np.random.randint(0, len(x), x.shape)
        mean_estimates.append(np.mean(x[re_sample_idx]))

mean_estimates is now a list of 10000 …
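The excerpt stops before it says how the resampled means become an interval. One common choice, assumed here and not taken from the answer, is the percentile bootstrap: take the lower and upper percentiles of the resampled means. The sketch below is a vectorized variant under that assumption; the function name `bootstrap_mean_ci`, its parameters, and the synthetic data are all hypothetical.

```python
import numpy as np

def bootstrap_mean_ci(x, n_resamples=10000, confidence=0.95, seed=0):
    """Percentile-bootstrap confidence interval for the mean of x."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    # One row of indices per resample, drawn with replacement from 0..len(x)-1.
    idx = rng.integers(0, len(x), size=(n_resamples, len(x)))
    mean_estimates = x[idx].mean(axis=1)
    alpha = 1.0 - confidence
    lower, upper = np.percentile(
        mean_estimates, [100 * alpha / 2, 100 * (1 - alpha / 2)]
    )
    return lower, upper

# Synthetic data, used only to exercise the function.
data = np.random.default_rng(1).normal(loc=10.0, scale=2.0, size=200)
low, high = bootstrap_mean_ci(data)
```

Drawing all indices at once uses memory proportional to n_resamples times len(x); for large inputs, the explicit loop from the excerpt is the safer shape.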