Collections and Sequences in Clojure

Purpose

Newcomers to Clojure are often confused by the collection and sequence abstractions and how they relate to one another. This document aims to provide an overview of these concepts and how they may be used in one's code.

TL;DR

"Collection" and "sequence" are abstractions, not a property that can be determined from a given value. Collections are bags of values. Sequences are a type of collection, supporting linear access only. A seq can be derived from any collection (and some non-collections.) Many linear-access functions derive a seq from their argument using seq . The main use of seq in user code is to check if a collection or collection-like will yield elements.

Collections

Figure: Some collection types divided by their properties in a Venn diagram.

Clojure's Collection API provides a generic mechanism for creating and handling compound data. Technically, a Clojure collection is an object that implements the clojure.lang.IPersistentCollection interface, which can be checked with the predicate coll?. The associated conceptual abstraction is that of a bag o' values supporting certain operations: adding, removing, counting, finding, and iterating over the values. Commonly seen examples are lists, maps, vectors, sets, and seqs, but these are not the only collection types that Clojure provides.
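As a rough illustration of the operations listed above, the following sketch exercises coll? and then the adding, removing, counting, finding, and iterating operations. The var names are invented for this example, and the choice of a set as the main subject is arbitrary.

```clojure
;; coll? is true for anything implementing IPersistentCollection.
(def coll-checks
  (mapv coll? ['(1 2 3) [1 2 3] {:a 1} #{1 2 3} "abc" nil]))
;; coll-checks is [true true true true false false]

(def s #{1 2 3})                  ; illustrative starting value
(def added   (conj s 4))          ; adding: equal to #{1 2 3 4}
(def removed (disj s 1))          ; removing (disj is set-specific): equal to #{2 3}
(def size    (count s))           ; counting: 3
(def found?  (contains? s 2))     ; finding: true
(def total   (reduce + 0 s))      ; iterating over the values: 6

;; The same generic conj works on other collection types;
;; each type decides where the new value goes.
(def v-added (conj [1 2 3] 4))    ; [1 2 3 4], vectors grow at the end
(def l-added (conj '(1 2 3) 4))   ; (4 1 2 3), lists grow at the front
```

Note that conj is the one "adding" function shared by all of these types, while removal is type-specific (disj for sets, dissoc for maps, pop for lists and vectors).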

Different collection types have different APIs, performance characteristics, and intended patterns of usage. Any given collection may match one or more of the following predicates, which group collection types by their broad characteristics:

counted?
These colls know their size and can calculate their count in constant time, without actually traversing their data. This is not just a performance characteristic: some collections are infinite or may not be able to predict their size without running arbitrary code.

associative?
Associative colls support key-value lookups. Maps are the traditional associative data structure, but vectors can be treated as mappings of indices to values.

sequential?
Sequential colls retain a linear ordering under insertion and deletion. Lists, seqs, and vectors have this property. Note that a vector is both sequential and associative, while a set is neither.

Here we see which collections support which predicates, as well as how some non-collections are treated:

Note that the string is not a collection, but may be converted into one.
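Because counted?, associative?, and sequential? are ordinary functions, the grouping can be checked at a REPL. The sketch below uses a hypothetical helper, traits, which is not part of clojure.core, to collect the three answers for a handful of sample values.

```clojure
(defn traits
  "Hypothetical helper: reports which of the three predicates x satisfies."
  [x]
  {:counted?     (counted? x)
   :associative? (associative? x)
   :sequential?  (sequential? x)})

(def vector-traits (traits [4 5 6]))      ; all three true
(def list-traits   (traits '(1 2 3)))     ; counted and sequential
(def map-traits    (traits {:a 1, :b 2})) ; counted and associative
(def set-traits    (traits #{7 8 9}))     ; counted only
(def range-traits  (traits (range)))      ; sequential only; an infinite seq is not counted
(def string-traits (traits "hello"))      ; none; a string is not a collection

;; A vector can be treated as a mapping from indices to values.
(def looked-up (get [4 5 6] 0))           ; 4
(def replaced  (assoc [4 5 6] 1 :x))      ; [4 :x 6]

;; A string is not a collection, but seq converts it into one.
(def chars-of-hello (seq "hello"))        ; (\h \e \l \l \o)
```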

TODO: Raid clojure.core repo for instances of IPersistentCollection, ISeq, etc.

Sequences

A sequence is a data structure that is expected to be accessed in a sequential manner. It may be infinitely long, and may require additional computation in order to read.
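A small sketch of what "infinitely long" and "additional computation" can look like in practice. The var names are invented for illustration; the point is only that a finite prefix of an infinite sequence can be taken safely.

```clojure
;; (range) with no arguments describes the integers 0, 1, 2, ... without end,
;; so only ever ask for a finite prefix of it.
(def first-five (take 5 (range)))                   ; (0 1 2 3 4)

;; iterate describes infinitely many doublings; only the elements
;; that are actually requested get computed.
(def powers-of-two (take 3 (iterate #(* 2 %) 1)))   ; (1 2 4)
```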

Table: results of seq and empty? applied to (range), '(1 2 3), [4 5 6], {:a 1, :b 2}, #{7 8 9}, "hello", (), [], "", nil, and 17.
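The behaviour of seq and empty? on a range of inputs can be sketched as follows. The helpers describe and rejects? are hypothetical names introduced here, and the exception type assumes Clojure on the JVM; other hosts signal the error differently.

```clojure
(def seq-of-vector (seq [4 5 6]))          ; (4 5 6)
(def seq-of-map    (seq {:a 1, :b 2}))     ; a seq of the two map entries
(def seq-of-string (seq "hello"))          ; (\h \e \l \l \o)
(def seq-of-set    (seq #{7 8 9}))         ; the three elements; order is not guaranteed

;; Take only a prefix of an infinite seq; printing all of (range) never finishes.
(def range-prefix (take 3 (seq (range))))  ; (0 1 2)

;; Anything without elements yields nil rather than an empty seq.
(def empty-seqs (mapv seq [() [] "" nil])) ; [nil nil nil nil]

;; empty? answers the opposite question.
(def empty-checks (mapv empty? [[4 5 6] "hello" () [] "" nil]))
;; empty-checks is [false false true true true true]

(defn describe
  "Hypothetical helper showing (seq x) used as a truthiness test."
  [x]
  (if (seq x) :has-elements :no-elements))

(defn rejects?
  "Hypothetical helper: true if (f x) throws IllegalArgumentException.
  Assumes Clojure on the JVM."
  [f x]
  (try (f x) false
       (catch IllegalArgumentException _ true)))

;; A number is neither a collection nor seqable, so both calls throw.
(def seq-rejects-17?   (rejects? seq 17))     ; true
(def empty-rejects-17? (rejects? empty? 17))  ; true
```

Because seq returns nil for anything without elements, (if (seq x) ...) works uniformly across lists, vectors, maps, sets, strings, and nil.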

TODO: seq API

Table: results of first, next, and rest applied to [1 2], [1], [], nil, and 17.
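The three core seq functions differ mainly in what they return when nothing is left. The sketch below bundles them in a hypothetical helper, seq-parts, so the results can be compared side by side.

```clojure
(defn seq-parts
  "Hypothetical helper: returns [(first x) (next x) (rest x)]."
  [x]
  [(first x) (next x) (rest x)])

(def two-elements (seq-parts [1 2]))  ; [1 (2) (2)]
(def one-element  (seq-parts [1]))    ; [1 nil ()]
(def no-elements  (seq-parts []))     ; [nil nil ()]
(def from-nil     (seq-parts nil))    ; [nil nil ()]

;; (seq-parts 17) would throw, because a number cannot be turned into a seq.
```

The distinction matters in conditionals: next returns nil (falsey) when there are no more elements, while rest returns an empty seq, which is truthy.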

Relationship between the two

All sequences are also collections.

You can derive a sequence backed by any collection. Some collections have more than one seq implementation; a vector, for example, also supports rseq, as in (rseq []).
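A brief sketch of a collection offering two different seqs, using a vector and a sorted map as sample values (the var names are illustrative only):

```clojure
(def v [1 2 3])
(def forward  (seq v))      ; (1 2 3)
(def backward (rseq v))     ; (3 2 1)
(def nothing  (rseq []))    ; nil, just as (seq []) is nil

;; Sorted maps are reversible too; their seqs contain map entries.
(def m (sorted-map :a 1 :b 2))
(def entries-forward  (seq m))   ; ([:a 1] [:b 2])
(def entries-backward (rseq m))  ; ([:b 2] [:a 1])
```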

You can read any finite sequence into a new collection.
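For instance, a lazy seq produced by map can be poured into whichever collection type is needed. This is a minimal sketch with invented var names, using vec, set, and into:

```clojure
(def incremented (map inc [1 2 3]))            ; a lazy seq, (2 3 4)

(def as-vector (vec incremented))              ; [2 3 4]
(def as-set    (set incremented))              ; equal to #{2 3 4}
(def as-list   (into () incremented))          ; (4 3 2): conj adds to the front of a list
(def as-map    (into {} (seq {:a 1, :b 2})))   ; equal to {:a 1, :b 2}
```

Reading an infinite sequence such as (range) into a collection this way would never finish, so take a finite prefix first.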

Nota Bene: Sequences are not necessarily implemented as lists; they just act a lot like them and may be backed by similar data structures.

Table: results of coll? and seq? applied to (range), '(1 2 3), [4 5 6], {:a 1, :b 2}, #{7 8 9}, "hello", nil, and 17.
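The relationship can be checked directly with coll? and seq?. The helper classify below is a hypothetical name introduced for this sketch.

```clojure
(defn classify
  "Hypothetical helper: reports whether x is a collection and whether it is a seq."
  [x]
  {:coll? (coll? x) :seq? (seq? x)})

(def range-kind  (classify (range)))       ; both true
(def list-kind   (classify '(1 2 3)))      ; both true
(def vector-kind (classify [4 5 6]))       ; a collection, but not a seq
(def map-kind    (classify {:a 1, :b 2}))  ; a collection, but not a seq
(def set-kind    (classify #{7 8 9}))      ; a collection, but not a seq
(def string-kind (classify "hello"))       ; neither
(def nil-kind    (classify nil))           ; neither
(def number-kind (classify 17))            ; neither

;; A seq derived from a non-seq collection is itself both a seq and a collection.
(def derived-kind (classify (seq [4 5 6]))) ; both true
```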

TODO: counted?

Equality

TODO: Equality partitions re: seqs and colls

Table: results of comparing each of (), [], {}, and #{} against one another using =, via #(= () %), #(= [] %), #(= {} %), and #(= #{} %).
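As a sketch of how = treats the four empty collections, the following builds the full comparison grid. Sequential collections compare equal to one another when they hold equal elements in the same order, while maps and sets only compare equal to other maps and sets respectively. The var names are invented for this example.

```clojure
(def empties [() [] {} #{}])

;; Row i, column j holds (= (nth empties i) (nth empties j)).
(def equality-table
  (mapv (fn [a] (mapv #(= a %) empties)) empties))
;; equality-table is
;; [[true  true  false false]    ; ()
;;  [true  true  false false]    ; []
;;  [false false true  false]    ; {}
;;  [false false false true ]]   ; #{}

;; The same grouping holds for non-empty values.
(def list-equals-vector? (= '(1 2 3) [1 2 3]))   ; true
(def vector-equals-set?  (= [1 2 3] #{1 2 3}))   ; false
```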

TODO: effect of metadata, sortedness

TODO: comparison with other Java Collections

Further reading

Different implementations of colls: sorted-set, hash-set, sorted-map, hash-map, array-map

Polymorphism under the collection and sequence APIs (e.g., invisible shifts in implementation as you hammer on a coll).

Performance characteristics.