IF THEN on a data frame in R with LAG
I have a data frame with multiple columns, two of which are of particular interest to me. column1 contains values that are either 0 or a number greater than 0; column2 contains numbers as well.
I want to create 21 new columns containing information from column2, conditional on column1.
So when column1 is positive (not 0), I want the first new column, column01, to take the value of column2 from 10 rows back, column02 the value from 9 rows back, and so on; column11 is exactly the same as the column2 value, and column21 is the value 10 rows forward.
For example:
    column1  column2  columns01  columns02 ..  columns11 ..  columns20  columns21
          0        5          0          0             0             0          0
          0        2          0          0             0             0          0
          0        0          0          0             0             0          0
          1        3          0          0             3             5          4
          0       10          0          0             0             0          0
          0       83          0          0             0             0          0
          0        2          0          0             0             0          0
          0        5          0          0             0             0          0
          0        4          0          0             0             0          0
          1        8          0          5             8             5          3
          0        6          0          0             0             0          0
          0        5          0          0             0             0          0
          0       55          0          0             0             0          0
          0        4          0          0             0             0          0
          2        3         10         83             3             5          0
          0        2          0          0             0             0          0
          0        3          0          0             0             0          0
          0        4          0          0             0             0          0
          0        5          0          0             0             0          0
          0        3          0          0             0             0          0
          1       22          6          5            22             0          0
          0       12          0          0             0             0          0
          0        0          0          0             0             0          0
          0        5          0          0             0             0          0

I hope this makes sense and that you can help.
Here's one way using the newly implemented `shift()` function from data.table v1.9.5:
    require(data.table) ## v1.9.5+
    setDT(dat)                                                      ## (1)
    cols = paste0("cols", sprintf("%.2d", 1:21))                    ## (2)
    dat[, cols[1:10] := shift(column2, 10:1, fill=0)]               ## (3)
    dat[, cols[11] := column2]                                      ## (4)
    dat[, cols[12:21] := shift(column2, 1:10, fill=0, type="lead")] ## (5)
    dat[column1 == 0, (cols) := 0]                                  ## (6)

1. Assuming `dat` is a data.frame, `setDT(dat)` converts it to a data.table by reference (the data is not physically copied to a new location in memory, for efficiency).
2. Generate all the column names.
3. Generate lagged vectors of `column2` for periods `10:1`, and assign them to the first 10 columns.
4. The 11th column = `column2`.
5. Generate leading vectors of `column2` for periods `1:10`, and assign them to the last 10 columns.
6. Get the indices of rows where `column1 == 0`, and replace/reset the newly generated columns for those indices with `0`.
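As a quick sanity check, the steps above can be run on the two input columns from the question's example. This is only a sketch: the values of `column1` and `column2` are copied from the example table, `dat` is rebuilt here so the snippet is self-contained, and it assumes a data.table version in which `shift()` is available. The left-hand sides of `:=` are wrapped in parentheses, which is the usual way to assign to column names stored in a variable.

```r
library(data.table)  # assumes a version that provides shift()

# Input columns copied from the example table in the question
dat <- data.frame(
  column1 = c(0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0,
              0, 0, 2, 0, 0, 0, 0, 0, 1, 0, 0, 0),
  column2 = c(5, 2, 0, 3, 10, 83, 2, 5, 4, 8, 6, 5,
              55, 4, 3, 2, 3, 4, 5, 3, 22, 12, 0, 5)
)

setDT(dat)                                       # convert by reference
cols <- paste0("cols", sprintf("%.2d", 1:21))    # "cols01" ... "cols21"

# Lags 10..1 fill cols01..cols10; out-of-range positions become 0
dat[, (cols[1:10]) := shift(column2, 10:1, fill = 0)]
# The middle column is column2 itself
dat[, (cols[11]) := column2]
# Leads 1..10 fill cols12..cols21
dat[, (cols[12:21]) := shift(column2, 1:10, fill = 0, type = "lead")]
# Reset every new column on rows where column1 is 0
dat[column1 == 0, (cols) := 0]

# Show only the rows that were kept, with the columns listed in the question
dat[column1 != 0, .(column1, column2, cols01, cols02, cols11, cols20, cols21)]
```

For the row where `column1` is 2 (row 15), this yields `cols01 = 10`, `cols02 = 83`, `cols11 = 3`, `cols20 = 5` and `cols21 = 0`, which agrees with the corresponding row of the question's table.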
Use `setDF(dat)` if you want a data.frame back.
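A minimal sketch of that round trip, using a small made-up `dat` (the three rows here are illustrative, not the question's data) and assuming a data.table version that ships both `setDT()` and `setDF()`:

```r
library(data.table)  # assumes setDT() and setDF() are both available

# Illustrative data only
dat <- data.frame(column1 = c(0, 1, 0), column2 = c(5, 3, 10))

setDT(dat)            # now a data.table, converted in place
is.data.table(dat)    # TRUE

setDF(dat)            # back to a plain data.frame, again in place
is.data.table(dat)    # FALSE
class(dat)            # "data.frame"
```

Both calls modify `dat` itself, so there is no need to assign the result back.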
You can wrap this in a function that takes the values `-10:10`, choosing `type="lag"` or `type="lead"` accordingly, depending on whether the values are negative or positive. I'll leave that to you.
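One possible shape for such a wrapper is sketched below. The function name `add_window_cols` and its arguments `offsets` and `prefix` are hypothetical, and the column names `column1` and `column2` are hard-coded from the question. Unlike the code above, this version copies its input with `as.data.table()` and returns a new table, which avoids surprises from modifying a function argument by reference.

```r
library(data.table)  # assumes a version that provides shift()

# Hypothetical helper: one new column per entry in `offsets`.
# Negative offsets look back (lag), positive ones look ahead (lead),
# and 0 copies column2 unchanged.
add_window_cols <- function(df, offsets = -10:10, prefix = "cols") {
  dt <- as.data.table(df)  # work on a copy; the input is left untouched
  new_cols <- paste0(prefix, sprintf("%.2d", seq_along(offsets)))

  shifted <- lapply(offsets, function(k) {
    if (k == 0) return(dt$column2)
    shift(dt$column2, abs(k), fill = 0,
          type = if (k < 0) "lag" else "lead")
  })

  dt[, (new_cols) := shifted]          # one list element per new column
  dt[column1 == 0, (new_cols) := 0]    # reset rows where column1 is 0
  dt[]
}

# Small illustrative input with a window of one row on each side
df  <- data.frame(column1 = c(0, 1, 0, 0), column2 = c(5, 3, 10, 83))
res <- add_window_cols(df, offsets = -1:1)
res
```

Here the new columns are numbered by position in `offsets`, so with `-1:1` the names are `cols01` (one row back), `cols02` (same row) and `cols03` (one row forward). Calling it with the default `-10:10` gives the 21 columns described in the question.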