reshape - w3toppers.com

R: Reshaping Multiple Columns from Long to Wide

An option would be to replace the duplicated elements within each 'Letter' group with NA and then, in the reshaped data, remove the columns that are all NA:

    library(data.table)
    out <- dcast(setDT(sample_df)[, lapply(.SD, function(x) replace(x, duplicated(x), NA)), Letter],
                 Letter ~ rowid(Letter), value.var = c("Number", "Fruit"))
    nm1 <- out[, names(which(!colSums(!is.na(.SD))))]
    out[, (nm1) := NULL][]
    # Letter Number_1 Number_2 …

How to pivot/unpivot (cast/melt) data frame? [duplicate]

I still can't believe I beat Andrie with an answer. 🙂

    > library(reshape)
    > my.df <- read.table(text = "Country 2001 2002 2003
    + Nigeria 1 2 3
    + UK 2 NA 1", header = TRUE)
    > my.result <- melt(my.df, id = c("Country"))
    > my.result[order(my.result$Country),]
      Country variable value
    1 Nigeria    X2001     1
    3 Nigeria    X2002     2
    …

Aggregate and reshape from long to wide

Your data are already in a long format that can be used easily by "reshape2", like this:

    library(reshape2)
    dcast(df, bdate ~ sex + age + diag, value.var = "admissions")
    #        bdate Female_35-64_card Female_35-64_cere Female_65-74_card Female_65-74_cere
    # 1 1987-01-01                 1                 6                 1                 6
    # 2 1987-01-02                 4                 4                 0                 6
    # 3 1987-01-03                 2                 6                 4 …

Getting a stacked area plot in R

You can use the ggplot2 package from Hadley Wickham for that.

    R> library(ggplot2)

An example data set:

    R> d <- data.frame(t=rep(0:23,each=4),var=rep(LETTERS[1:4],4),val=round(runif(4*24,0,50)))
    R> head(d,10)
      t var val
    1 0   A   1
    2 0   B  45
    3 0   C   6
    4 0   D  14
    5 1   A  35
    6 1   B  21
    7 1   C  13
    …

How do I resize a matrix in MATLAB?

reshape is of course the proper solution, as stated by @gnovice. A nice feature of reshape is that it allows this:

    A = 1:12;
    B = reshape(A,4,[]);
    B =
         1     5     9
         2     6    10
         3     7    11
         4     8    12

So if you don't know how many columns there will be, reshape will compute …
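The inference of the missing dimension can be sketched in plain Python. This is only an illustration, not MATLAB's implementation: `reshape_column_major` is a hypothetical helper that fills column by column, as MATLAB does, and derives the column count from the number of elements.

```python
def reshape_column_major(values, rows):
    """Arrange `values` into `rows` rows, filling column by column.

    The number of columns is inferred from len(values), mirroring the
    role of the [] placeholder in reshape(A, 4, []).
    """
    if rows <= 0 or len(values) % rows != 0:
        raise ValueError("number of elements must be divisible by rows")
    cols = len(values) // rows  # the inferred dimension
    return [[values[c * rows + r] for c in range(cols)] for r in range(rows)]


A = list(range(1, 13))          # same data as A = 1:12
B = reshape_column_major(A, 4)  # 4 rows, column count inferred as 3
for row in B:
    print(row)
```

An element count that is not divisible by the requested number of rows raises an error instead of silently padding or truncating.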

Compute mean and standard deviation by group for multiple variables in a data.frame

This is an aggregation problem, not a reshaping problem as the question originally suggested: we wish to aggregate each column into a mean and standard deviation by ID. There are many packages that handle such problems. In base R it can be done using aggregate like this (assuming DF is the input …
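As a language-neutral illustration of the same idea, here is a minimal Python sketch of per-group mean and standard deviation using only the standard library. The data and the names `rows` and `summary` are hypothetical and not taken from the question; `statistics.stdev` is the sample standard deviation (n - 1 denominator), the same convention as R's `sd`.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical long-format data: (ID, value) pairs.
rows = [(1, 2.0), (1, 4.0), (2, 1.0), (2, 3.0), (2, 5.0)]

# Collect the values belonging to each ID.
groups = defaultdict(list)
for group_id, value in rows:
    groups[group_id].append(value)

# Aggregate each group into a (mean, sample standard deviation) pair.
summary = {gid: (mean(vals), stdev(vals)) for gid, vals in groups.items()}

for gid, (m, s) in sorted(summary.items()):
    print(gid, round(m, 3), round(s, 3))
```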

Subsetting R data frame results in mysterious NA rows

Wrap the condition in which:

    df[which(df$number1 < df$number2), ]

How it works: It returns the row numbers where the condition matches (where the condition is TRUE) and subsets the data frame on those rows accordingly. Say that:

    which(df$number1 < df$number2)

returns row numbers 1, 2, 3, 4 and 5. As such, writing:

    df[which(df$number1 < df$number2), …
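The effect of which can be mimicked in a small Python sketch, using `None` as a stand-in for R's NA. The `which` function below is a hypothetical analogue, not R's implementation, and the two columns are made-up data: it keeps only the positions where the condition is definitely true, so missing comparisons never select a row.

```python
def which(condition):
    """Return 1-based positions where the condition is True, skipping None (NA)."""
    return [i for i, flag in enumerate(condition, start=1) if flag is True]


# Hypothetical columns; None plays the role of NA.
number1 = [1, 5, None, 2, 0]
number2 = [3, 4, 2, None, 7]

# A comparison involving a missing value is itself missing, as in R.
condition = [
    None if a is None or b is None else a < b
    for a, b in zip(number1, number2)
]

print(condition)         # three-valued result: True, False, or None
print(which(condition))  # only the rows where the comparison is True
```

Subsetting with the raw three-valued condition is what produces the NA rows in R; subsetting with the positions returned by which avoids them.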

Grouped bar plot in ggplot

EDIT: Many years later. For a pure ggplot2 + utils::stack() solution, see the answer by @markus!

A somewhat verbose tidyverse solution, with all non-base packages explicitly stated so that you know where each function comes from:

    library(magrittr) # needed for %>% if dplyr is not attached
    "http://pastebin.com/raw.php?i=L8cEKcxS" %>%
      utils::read.csv(sep = ",") %>%
      tidyr::pivot_longer(cols = c(Food, …

Easy way to convert long to wide format with counts [duplicate]

The aggregation parameter in the dcast function of the reshape2 package defaults to length (= count). The data.table package implements an improved version of the dcast function. So in your case this would be:

    library('reshape2') # or library('data.table')
    newdf <- dcast(sample.data, Case ~ Decision)

or, using the parameters explicitly:

    newdf <- dcast(sample.data, Case ~ Decision, …
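What "aggregating by length" means can be sketched in plain Python: count how often each (Case, Decision) pair occurs and lay the counts out with one row per Case and one column per Decision. The records below are hypothetical sample data, and this is an illustration of the idea rather than how dcast works internally.

```python
from collections import Counter

# Hypothetical long-format records: (Case, Decision) pairs.
records = [("A", "yes"), ("A", "no"), ("A", "yes"), ("B", "no")]

# Counting occurrences corresponds to aggregating with length.
counts = Counter(records)

cases = sorted({case for case, _ in records})
decisions = sorted({decision for _, decision in records})

# Wide layout: one row per Case, one column per Decision, 0 where a pair never occurs.
wide = {case: {d: counts[(case, d)] for d in decisions} for case in cases}

for case, row in wide.items():
    print(case, row)
```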

Reshape an array in NumPy

    import numpy as np

    a = np.arange(18).reshape(9,2)
    b = a.reshape(3,3,2).swapaxes(0,2)

    # a:
    array([[ 0,  1],
           [ 2,  3],
           [ 4,  5],
           [ 6,  7],
           [ 8,  9],
           [10, 11],
           [12, 13],
           [14, 15],
           [16, 17]])

    # b:
    array([[[ 0,  6, 12],
            [ 2,  8, 14],
            [ 4, 10, 16]],

           [[ 1,  7, 13],
            [ 3,  9, 15],
            [ 5, …
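To see why the result looks this way, the index mapping can be reproduced without NumPy. This is a sketch using nested lists, assuming NumPy's default row-major layout: reshaping to (3, 3, 2) places flat element 6*i + 2*j + k at position [i][j][k], and swapping axes 0 and 2 exchanges the roles of i and k. The names `flat`, `r` and `b` are chosen for illustration.

```python
flat = list(range(18))  # same values as np.arange(18)

# Equivalent of a.reshape(3, 3, 2): r[i][j][k] is flat[6*i + 2*j + k].
r = [[[flat[6 * i + 2 * j + k] for k in range(2)]
      for j in range(3)]
     for i in range(3)]

# Equivalent of swapaxes(0, 2): b[k][j][i] equals r[i][j][k].
b = [[[r[i][j][k] for i in range(3)]
      for j in range(3)]
     for k in range(2)]

for block in b:
    for row in block:
        print(row)
    print()
```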