Consider the following sequence. Let the first element be \(a_0 = 0\).

Then, for \(a_i\), when \(i>0\), let’s do the following. Consider the last element \(a_{i-1}\). Did this element appear previously in the sequence? If not (meaning that \(a_{i-1}\) is its first occurrence), set \(a_i\) to \(0\). If \(a_{i-1}\) did appear before, then let \(a_k\) be its previous occurrence. Set \(a_i\) to the value \(i-1-k\) (meaning, “how far back did we see this value?”).

So, we have 0,0,1,0,2,0,2,2,1,6,…

For example, the last 6 is there because the previous element 1 has appeared 6 positions earlier.

This is known as Van Eck’s sequence and appears as A181391 in the wonderful On-line encyclopedia of integer seqences. It has been added to OEIS by Jan Ritsema van Eck in 2010. There are many interesting properties of this sequence (for instance, it was shown that it contains infinitely many zeros), and even more intriguing open questions. For more details see the OEIS entry, or these slides by OEIS founder N.J.A. Sloane.

Here is a variation on the above sequence:

Instead of counting the “number of elements”, let’s count only the “number of distinct elements”. That is, again we start with 0, and then if the last element was new, we write 0, otherwise we write “how many distinct elements” have we seen since its previous appearance.

So the sequence goes: 0,0,1,0,2,0,2,2,1,3,…

For example, the last 3 is there because between the previous element 1 and its last appearance there are 3 distinct elements: 0,1,2.

I thought of this sequence independently, but searching for it on OEIS I found that it has been considered before by Nathaniel Shar, who added it to the OEIS as A268755. See here for a plot.

The rule for defining the sequence is similar to the concept of “working set” in the theory of data structures, which refers exactly to the number of distinct queries seen since the previous occurrence of the last query. Therefore, I think a fitting name for the sequence would be “working set sequence”.

So what can we say about the working set sequence (a.k.a. Shar’s sequence)? Does it also contain an infinite number of zeros, similarly to the Van Eck’s sequence? I generated about half a million elements by computer, and the answer seems yes. It could happen, of course, that at some point the sequence reaches a cycle that it can never escape, such as 1,1,1,1,1,… or the less trivial …,1,2,3,3,1,3,2,3,2,2,1,3,3,… (exercise: check that this is really a cycle!)

However, the computer experiment suggests that this does not happen, and eventually every positive integer appears in the sequence.

Can this be proven formally? Yes! Let’s leave this as an exercise (I added a proof sketch to the OEIS entry). As a hint, an easier property to observe is that \(k\) can appear only after \(0,…,k-1\) have already appeared.

Theorem. The working set sequence contains infinitely many zeros.

Zeros seem to appear much less frequently in the working set sequence than in Van Eck’s sequence. But how frequent are they? Empirically it seems that there is a gap of roughly a constant times \(k\) between the \(k\)th and \(k+1\)th zeros, although nothing like this has been proven. Also, the zeros often come in pairs: if, for a long time there have been no new elements, eventually the new element \(k\) comes, followed by \(0\), and then, since \(0\) has also not appeared for a very long time, there is high chance that \(k+1,0\) follow right away. This seems to be the case quite frequently. Overall, however, nothing seems to be known about the statistics of this sequence, e.g. the asymptotic growth rate of \(a_n\).

The indices of new elements in the working set sequence (or alternatively, the indices of zeros minus one) have been added to OEIS as A275668.

I mentioned above two possible cycles in the sequence (although they don’t actually appear if we start from 0). What is the set of all possible cycles, does it have a nice characterization? How many different cycles are there with \(k\) distinct elements? These questions are open.

Finally, another open question about the working set sequence (as far as I know, the same question is open for the Van Eck’s sequence as well):

Problem. Does every pair of nonnegative integers (a,b) eventually appear in the working set sequence as a pair of consecutive elements? (except for (1,1) of course, which cannot appear).

[NOTE added August 2017] As pointed out by Jan Ritsema van Eck in a comment (see below), the Van Eck sequence also excludes pairs of the type (p,p+1). Prove it as a fun exercise, it is not difficult! So let’s modify the conjecture to also exclude (p,p+1) pairs besides (1,1). The conjecture doesn’t seem extremely plausible to be honest (as Jan points out, there are weaker statements one should prove first), so let’s formulate it as a challenge, to move things forward. Can you find an (a,b) pair not of the form (p,p+1) and not (1,1) that does not appear in the van Eck sequence A181391?

This observation does not apply to the working set sequence. For example, (4,5) appears at positions (21,22). So the similar challenge is: Can you find an (a,b) pair other than (1,1) that does not appear in the working set sequence A268755?