IF THEN on a data frame in R with LAG
I have a data frame with multiple columns, and 2 columns in particular are interesting to me. column1 contains values that are either 0 or a number (>0); column2 contains numbers as well.
I want to create 21 new columns containing new information from column2, given column1.
So when column1 is positive (not 0), I want the first new column, column01, to take the value of column2 that goes 10 back, column02 the value that goes 9 back, ..., column11 the exact same column2 value, ..., and column21 the value 10 forward.
For example:
    column1  column2  columns01  columns02  ..  columns11  ..  columns20  columns21
    0        5        0          0              0              0          0
    0        2        0          0              0              0          0
    0        0        0          0              0              0          0
    1        3        0          0              3              5          4
    0        10       0          0              0              0          0
    0        83       0          0              0              0          0
    0        2        0          0              0              0          0
    0        5        0          0              0              0          0
    0        4        0          0              0              0          0
    1        8        0          5              8              5          3
    0        6        0          0              0              0          0
    0        5        0          0              0              0          0
    0        55       0          0              0              0          0
    0        4        0          0              0              0          0
    2        3        10         83             3              5          0
    0        2        0          0              0              0          0
    0        3        0          0              0              0          0
    0        4        0          0              0              0          0
    0        5        0          0              0              0          0
    0        3        0          0              0              0          0
    1        22       6          5              22             0          0
    0        12       0          0              0              0          0
    0        0        0          0              0              0          0
    0        5        0          0              0              0          0

I hope this makes sense and that someone can help.
Here's one way, using the newly implemented shift() function from data.table v1.9.5:
    require(data.table) ## v1.9.5+
    setDT(dat)                                                        ## (1)
    cols = paste0("cols", sprintf("%.2d", 1:21))                      ## (2)
    dat[, cols[1:10] := shift(column2, 10:1, fill=0)]                 ## (3)
    dat[, cols[11] := column2]                                        ## (4)
    dat[, cols[12:21] := shift(column2, 1:10, fill=0, type="lead")]   ## (5)
    dat[column1 == 0, (cols) := 0]                                    ## (6)

1. Assuming dat is a data.frame, setDT(dat) converts it to a data.table by reference (the data is not physically copied to a new location in memory, for efficiency).
2. Generate all the column names.
3. Generate the lagged vectors of column2 for periods 10:1, and assign them to the first 10 columns.
4. The 11th column is equal to column2.
5. Generate the leading vectors of column2 for periods 1:10, and assign them to the last 10 columns.
6. Get the indices of the rows where column1 == 0, and replace/reset the newly generated columns at those indices with 0.
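To make steps 3 and 5 concrete, here is a minimal sketch of what shift() returns when it is given several periods at once. The vector x is made-up illustration data, not taken from the question.

```r
library(data.table)

# Made-up illustration data
x <- c(5, 2, 0, 3, 10)

# With a vector of periods, shift() returns a list with one vector per period.
# The default type is "lag": values move down and the gap is filled with `fill`.
lags <- shift(x, 2:1, fill = 0)
# lags[[1]] (2 back): 0 0 5 2 0
# lags[[2]] (1 back): 0 5 2 0 3

# type = "lead" moves values up instead.
leads <- shift(x, 1:2, fill = 0, type = "lead")
# leads[[1]] (1 forward): 2 0 3 10 0
# leads[[2]] (2 forward): 0 3 10 0 0
```

Because the result is a list, it can be assigned directly to several columns at once with :=, which is what steps 3 and 5 rely on.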
Use setDF(dat) if you want a data.frame back.
You can wrap this in a function that takes the values -10:10 and chooses type="lag" or type="lead" accordingly, depending on whether the values are negative or positive. I'll leave that to you.
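One possible shape for such a wrapper is sketched below. The name add_window_cols and its arguments are hypothetical (they are not part of data.table), and the sketch assumes the input is already a data.table with numeric columns.

```r
library(data.table)

# Hypothetical helper, not part of data.table.
# For each offset k it adds one column: k < 0 looks back (lag),
# k > 0 looks forward (lead), and k == 0 copies the value column.
add_window_cols <- function(dt, value_col, flag_col, offsets = -10:10, fill = 0) {
  stopifnot(is.data.table(dt))
  cols <- paste0("cols", sprintf("%.2d", seq_along(offsets)))
  v <- dt[[value_col]]
  shifted <- lapply(offsets, function(k) {
    if (k < 0) {
      shift(v, -k, fill = fill, type = "lag")
    } else if (k > 0) {
      shift(v, k, fill = fill, type = "lead")
    } else {
      v
    }
  })
  # Add all new columns by reference
  set(dt, j = cols, value = shifted)
  # Reset the new columns on rows where the flag column is 0
  zero_rows <- which(dt[[flag_col]] == 0)
  if (length(zero_rows) > 0) {
    set(dt, i = zero_rows, j = cols, value = fill)
  }
  invisible(dt)
}

# Small usage example with made-up data and a window of -1:1
dt <- data.table(column1 = c(0, 1, 0, 2), column2 = c(5, 3, 10, 8))
add_window_cols(dt, "column2", "column1", offsets = -1:1)
# cols01 (1 back):    0  5 0 10
# cols02 (same row):  0  3 0  8
# cols03 (1 forward): 0 10 0  0
```

The sketch uses set() rather than := inside the function so that the local variable names cannot clash with column names in dt. Calling it with the default offsets = -10:10 should give the same 21 columns as the step-by-step code above.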