The reason that t_id = 1 doesn't show up in the output is because a rolling join takes the row where the key-combination occurs last. From the documentation (emphasis mine):

Applies to the last join column, generally a date but can be any ordered variable, irregular and including gaps. If roll=TRUE and i's row matches to all but the last x join column, and its value in the last i join column falls in a gap (including after the last observation in x for that group), then the prevailing value in x is rolled forward. This operation is particularly fast using a modified binary search. The operation is also known as last observation carried forward (LOCF).

Let's consider somewhat larger datasets:

> DT t_id airport thisTime 1: 1 a 5.1 2: 4 a 5.1 3: 3 a 5.1 4: 2 d 6.2 5: 5 d 6.2 > DT_LU f_id airport thisTime 1: 1 a 6 2: 2 a 6 3: 2 a 8 4: 1 b 7 5: 1 c 8 6: 2 d 7 7: 1 d 9

When you perform a rolling join just like in your question:

DT[DT_LU, nomatch=0, roll=Inf]

you get:

t_id airport thisTime f_id 1: 3 a 6 1 2: 3 a 6 2 3: 3 a 8 2 4: 5 d 7 2 5: 5 d 9 1

As you can see, from both the key combination a, 5.1 and d, 6.2 the last row is used for the joined datatable. Because you use Inf as roll-value, all the future values are incorporated in the resulting datatable. When you use:

DT[DT_LU, nomatch=0, roll=1]

you see that only the first value in the future is included:

t_id airport thisTime f_id 1: 3 a 6 1 2: 3 a 6 2 3: 5 d 7 2

If you want the f_id 's for for all combinations of airport & thisTime where DT$thisTime is lower than DT_LU$thisTime , you can achieve that by creating a new variable (or replacing the existing thisTime ) by means of the ceiling function. An example where I create a new variable thisTime2 and then do a normal join with DT_LU :

DT[, thisTime2 := ceiling(thisTime)] setkey(DT, airport, thisTime2)[DT_LU, nomatch=0]

which gives:

t_id airport thisTime thisTime2 f_id 1: 1 a 5.1 6 1 2: 4 a 5.1 6 1 3: 3 a 5.1 6 1 4: 1 a 5.1 6 2 5: 4 a 5.1 6 2 6: 3 a 5.1 6 2 7: 2 d 6.2 7 2 8: 5 d 6.2 7 2

Applied to the data you provided:

> dt[, thisTime2 := ceiling(thisTime)] > setkey(dt, airport, thisTime2)[dt_lookup, nomatch=0] t_id airport thisTime thisTime2 f_id 1: 1 a 5.1 6 1 2: 3 a 5.1 6 1 3: 1 a 5.1 6 2 4: 3 a 5.1 6 2

When you want to include al the future values instead of only the first one, you need a somewhat different approach for which you will need the i.col functionality (which is not documented yet):

1: First set the key to only the airport columns:

setkey(DT, airport) setkey(DT_LU, airport)

2: Use the i.col functionality (which is not documented yet) in j to get what you want as follows:

DT1 <- DT_LU[DT, .(tid = i.t_id, tTime = i.thisTime, fTime = thisTime[i.thisTime < thisTime], fid = f_id[i.thisTime < thisTime]), by=.EACHI]

this gives you:

> DT1 airport tid tTime fTime fid 1: a 1 5.1 6 1 2: a 1 5.1 6 2 3: a 1 5.1 8 2 4: a 4 5.1 6 1 5: a 4 5.1 6 2 6: a 4 5.1 8 2 7: a 3 5.1 6 1 8: a 3 5.1 6 2 9: a 3 5.1 8 2 10: d 2 6.2 7 2 11: d 2 6.2 9 1 12: d 5 6.2 7 2 13: d 5 6.2 9 1

Some explanation: In case when you are joining two datatables where the same columnnames are used, you can refer to the columns of the datatable in i by preceding the columnnames with i. . Now it's possible to compare thisTime from DT with thisTime from DT_LU . With by = .EACHI you assure that all combinations for with the condition holds are included in the resulting datatable.

Alternatively, you can achieve the same with:

DT2 <- DT_LU[DT, .(airport=i.airport, tid=i.t_id, tTime=i.thisTime, fTime=thisTime[i.thisTime < thisTime], fid=f_id[i.thisTime < thisTime]), allow.cartesian=TRUE]

which gives the same result:

> identical(DT1, DT2) [1] TRUE

When you only want to include future values within a certain boundary, you can use:

DT1 <- DT_LU[DT, { idx = i.thisTime < thisTime & thisTime - i.thisTime < 2 .(tid = i.t_id, tTime = i.thisTime, fTime = thisTime[idx], fid = f_id[idx]) }, by=.EACHI]

which gives:

> DT1 airport tid tTime fTime fid 1: a 1 5.1 6 1 2: a 1 5.1 6 2 3: a 4 5.1 6 1 4: a 4 5.1 6 2 5: a 3 5.1 6 1 6: a 3 5.1 6 2 7: d 2 6.2 7 2 8: d 5 6.2 7 2

When you compare that to the previous result, you see that now the rows 3, 6, 9, 10 and 12 have been removed.

Data: