Calculate mean and standard deviation from a vector of samples in C++ using Boost

I don’t know if Boost has more specific functions, but you can do it with the standard library. Given std::vector<double> v, this is the naive way:

    #include <cmath>    // std::sqrt
    #include <numeric>  // std::accumulate, std::inner_product

    double sum = std::accumulate(v.begin(), v.end(), 0.0);
    double mean = sum / v.size();

    double sq_sum = std::inner_product(v.begin(), v.end(), v.begin(), 0.0);
    double stdev = std::sqrt(sq_sum / v.size() - mean …
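The excerpt above is cut off mid-expression, so the following is only a sketch of the same naive idea in Python, assuming the usual population formula: the square root of the mean of squares minus the square of the mean. The function name `mean_and_stdev` is made up for illustration and does not come from the answer.

```python
import math


def mean_and_stdev(samples):
    """Return (mean, population standard deviation) of a non-empty sequence."""
    n = len(samples)
    if n == 0:
        raise ValueError("samples must not be empty")
    mean = math.fsum(samples) / n
    # Naive form: E[x^2] - mean^2. It can lose precision when the
    # variance is small relative to the mean.
    sq_sum = math.fsum(x * x for x in samples)
    variance = max(sq_sum / n - mean * mean, 0.0)  # clamp tiny negative rounding error
    return mean, math.sqrt(variance)


mean, stdev = mean_and_stdev([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(mean, stdev)
```

This divides by n (population form). For a sample estimate you would divide the sum of squared deviations by n - 1 instead.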

Get the means of sub groups of means in R

With base, using aggregate:

    > aggregate(WLN~GROUP+WORD, mean, data=df)
      GROUP WORD      WLN
    1     1    1 3.333333
    2     1    2 2.333333
    3     2    3 1.333333
    4     2    4 1.000000

where df is @Metrics’ data. Another alternative is using summaryBy from the doBy package:

    > library(doBy)
    > summaryBy(WLN~GROUP+WORD, FUN=mean, data=df)
      GROUP WORD WLN.mean
    1     1    1 3.333333
    2     1 …

Calculating mean for every n values from a vector

I would use

    colMeans(matrix(a, 60))
    .colMeans(a, 60, length(a) / 60)  # more efficient (without reshaping to matrix)

Enhancement on user adunaic's request:

"This only works if there are 60×100 data points. If you have an incomplete 60 at the end then this errors. It would be good to have a general solution for others looking …"
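The request quoted above asks for a version that tolerates an incomplete final group. The answer's own R solution is cut off here, so the following is a separate minimal sketch in Python of one way to do it; `chunk_means` is a hypothetical name, and averaging the short final chunk over its actual length is an assumption about the desired behaviour.

```python
def chunk_means(values, n):
    """Mean of each consecutive group of n values; the last group may be shorter."""
    if n <= 0:
        raise ValueError("n must be positive")
    means = []
    for start in range(0, len(values), n):
        chunk = values[start:start + n]  # slicing past the end just yields a shorter chunk
        means.append(sum(chunk) / len(chunk))
    return means


result = chunk_means([1, 2, 3, 4, 5, 6, 7], 3)
print(result)  # [2.0, 5.0, 7.0]
```

If a partial group should be dropped rather than averaged, skip any chunk whose length is less than n.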

Mean of a column in a data frame, given the column’s name

I think you’re asking how to compute the mean of a variable in a data frame, given the name of the column. There are two typical approaches to doing this, one indexing with [[ and the other indexing with [:

    data(iris)
    mean(iris[["Petal.Length"]])
    # [1] 3.758
    mean(iris[,"Petal.Length"])
    # [1] 3.758
    mean(iris[["Sepal.Width"]])
    # [1] 3.057333
    mean(iris[,"Sepal.Width"])
    # …

Reading multiple files and calculating mean based on user input

So, you can simulate your situation like this:

    # Simulate some data:
    # Create 332 data frames
    set.seed(1)
    df.list<-replicate(332,data.frame(sulfate=rnorm(100),nitrate=rnorm(100)),simplify=FALSE)
    # Generate names like 001.csv and 010.csv
    file.names<-paste0('specdata/',sprintf('%03d',1:332),'.csv')
    # Write them to disk
    invisible(mapply(write.csv,df.list,file.names))

And here is a function that would read those files:

    pollutantmean <- function(directory, pollutant, id = 1:332) {
      file.names <- list.files(directory)
      file.numbers …
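The R function above is truncated, so its exact behaviour is not shown. As a rough illustration of the general idea only, here is a Python sketch that reads the zero-padded files for the requested ids and pools one column into a single mean. The name `pollutant_mean`, the treatment of empty cells and "NA" as missing, and the demo data are all assumptions made for this example.

```python
import csv
import os
import tempfile


def pollutant_mean(directory, pollutant, ids=range(1, 333)):
    """Mean of one column pooled across 001.csv, 002.csv, ... for the given ids."""
    values = []
    for i in ids:
        path = os.path.join(directory, f"{i:03d}.csv")
        with open(path, newline="") as handle:
            for row in csv.DictReader(handle):
                cell = row[pollutant]
                if cell not in ("", "NA"):  # assumed encoding of missing values
                    values.append(float(cell))
    if not values:
        raise ValueError("no non-missing values found")
    return sum(values) / len(values)


# Tiny demo with made-up data written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    demo = {1: [1.0, 2.0], 2: [3.0], 3: [6.0]}
    for i, sulfate in demo.items():
        with open(os.path.join(tmp, f"{i:03d}.csv"), "w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(["sulfate", "nitrate"])
            for s in sulfate:
                writer.writerow([s, "NA"])
    demo_mean = pollutant_mean(tmp, "sulfate", ids=[1, 2])

print(demo_mean)  # 2.0
```

Pooling all values before averaging weights every observation equally. Averaging per-file means instead would weight every file equally, which gives a different number when files differ in length.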

Group by in group by and average

If you want to first take mean on the combination of ['cluster', 'org'] and then take mean on cluster groups, you can use:

    In [59]: (df.groupby(['cluster', 'org'], as_index=False).mean()
                .groupby('cluster')['time'].mean())
    Out[59]:
    cluster
    1    15
    2    54
    3     6
    Name: time, dtype: int64

If you want the mean of cluster groups only, then you can use: In …
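To make the difference between the two aggregations concrete, here is a small sketch using only the Python standard library rather than pandas. The rows are made-up illustration data, not the data frame from the question.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (cluster, org, time)
rows = [
    (1, "a", 10.0),
    (1, "a", 20.0),
    (1, "b", 30.0),
    (2, "c", 4.0),
    (2, "d", 8.0),
]

# Step 1: mean per (cluster, org) pair.
by_pair = defaultdict(list)
for cluster, org, time in rows:
    by_pair[(cluster, org)].append(time)

# Step 2: mean of those pair means within each cluster.
pair_means = defaultdict(list)
for (cluster, _org), times in by_pair.items():
    pair_means[cluster].append(mean(times))
mean_of_means = {cluster: mean(ms) for cluster, ms in pair_means.items()}

# For comparison: plain mean per cluster, ignoring org.
by_cluster = defaultdict(list)
for cluster, _org, time in rows:
    by_cluster[cluster].append(time)
plain_means = {cluster: mean(ts) for cluster, ts in by_cluster.items()}

print(mean_of_means)  # {1: 22.5, 2: 6.0}
print(plain_means)    # {1: 20.0, 2: 6.0}
```

Cluster 1 differs because org "a" contributes two rows: the mean of means counts each org once, while the plain mean counts each row once.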