How do Signal Indexes work?

Each file in the log stream managed by Seq is naturally partitioned into pages by the host operating system and virtual memory manager:

[Figure: File divided into 4 KiB pages. Read operations work in page-sized units.]

When Seq reads events from storage, it must do so in page-sized chunks, as these are the units in which the OS and virtual memory manager work.

A signal index is a bitmap index in which each page in the log stream corresponds to a single bit. If the page contains at least one event that matches the signal, the corresponding bit is "on"; otherwise, it is "off".
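The page-to-bit mapping can be illustrated with a minimal sketch. This is not Seq's implementation: the `(offset, length, is_match)` event layout, the function name `build_signal_index`, and the choice to mark every page an event touches are assumptions made for illustration.

```python
PAGE_SIZE = 4096  # 4 KiB pages


def build_signal_index(events, page_size=PAGE_SIZE):
    """Return an int used as a bitmap: bit i is set if page i holds a matching event.

    `events` is a hypothetical list of (offset, length, is_match) tuples,
    where offset and length are byte positions within the log file.
    """
    bitmap = 0
    for offset, length, is_match in events:
        if not is_match or length == 0:
            continue
        first_page = offset // page_size
        last_page = (offset + length - 1) // page_size
        # Assumption: an event that straddles a page boundary marks every page it touches.
        for page in range(first_page, last_page + 1):
            bitmap |= 1 << page
    return bitmap


events = [
    (100, 200, False),    # page 0, does not match the signal
    (5000, 300, True),    # page 1, matches
    (8100, 200, True),    # bytes 8100..8299 cross from page 1 into page 2
    (20000, 50, False),   # page 4, does not match
]
index = build_signal_index(events)
print(format(index, "05b"))  # bit 0 is the rightmost digit
```

Here only pages 1 and 2 end up with their bits set, so pages 0, 3, and 4 would never need to be read when searching within this signal.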

When Seq searches within a signal, it only needs to retrieve the pages that the index marks as possibly containing matching events, so the total amount of I/O required is drastically reduced:
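As a rough sketch of how an index narrows a read, the following walks the set bits of a bitmap and fetches only those pages. The names `pages_to_read` and `read_candidate_pages` are hypothetical, the bitmap is held as a plain Python integer, and real storage engines would typically use memory-mapped I/O rather than explicit seeks.

```python
def pages_to_read(bitmap):
    """Yield the page numbers whose bit is set, lowest first."""
    page = 0
    while bitmap:
        if bitmap & 1:
            yield page
        bitmap >>= 1
        page += 1


def read_candidate_pages(path, bitmap, page_size=4096):
    """Yield (page_number, bytes) for each page marked in the bitmap.

    Pages whose bit is off are skipped entirely, which is where the
    I/O saving comes from.
    """
    with open(path, "rb") as f:
        for page in pages_to_read(bitmap):
            f.seek(page * page_size)
            yield page, f.read(page_size)


# Five pages in the file, three of them marked: only those three are fetched.
candidates = list(pages_to_read(0b10110))
print(candidates)
```

The events inside each retrieved page still have to be checked against the query, because a set bit only says that the page contains at least one match.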

[Figure: Signal matching events in a particular region. The "on" bits in the signal index correspond to pages in the log stream that contain matching events.]

As signals are selected in the Seq Events screen, their bitmap indexes are combined using bitwise operations, so that the combination of the Errors signal and the one shown above will further restrict the search space to only pages that contain matches for both signals.
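Combining signals can be sketched as a bitwise AND over their bitmaps. The bit patterns below are invented for illustration, and `combine_signals` is a hypothetical helper rather than anything exposed by Seq.

```python
from functools import reduce
from operator import and_


def combine_signals(bitmaps):
    """AND together one or more page bitmaps (each held as a Python int)."""
    return reduce(and_, bitmaps)


# Made-up indexes over an eight-page log stream; bit 0 is the rightmost digit.
errors_index = 0b01101010  # pages 1, 3, 5, 6 contain errors
region_index = 0b00111100  # pages 2, 3, 4, 5 contain events from the region

combined = combine_signals([errors_index, region_index])
print(format(combined, "08b"))
```

Only pages 3 and 5 survive the intersection. A surviving bit means the page holds a match for each signal, not necessarily in the same event, so the events on those pages are still evaluated individually.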

Signal indexes are extremely space-efficient: the size of an index is approximately 0.003% of the size of the indexed log data. This makes them much easier to maintain than alternative index types, which frequently exceed the size of the data set they index.
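The 0.003% figure is consistent with a back-of-the-envelope calculation, assuming exactly one index bit per 4 KiB page and ignoring any file headers or compression the real format may apply:

```latex
\frac{\text{index size}}{\text{data size}}
  = \frac{1\ \text{bit}}{4096\ \text{bytes} \times 8\ \text{bits/byte}}
  = \frac{1}{32768}
  \approx 3.05 \times 10^{-5}
  \approx 0.003\%
```

Under the same assumption, 1 TiB of log data ($2^{40}$ bytes) would need about $2^{40} / 2^{15} = 2^{25}$ bytes, or 32 MiB, of index per signal.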