avatarGabriel Shanahan

Free AI web copilot to create summaries, insights and extended knowledge, download it at here

5278

Abstract

it twice will result in an exception.</p><p id="9222">I tried to find out <i>why</i> this is so and couldn’t find any official material referencing the reason. I had a hunch this had something to do with <code>Sequences</code> that cause <a href="https://en.wikipedia.org/wiki/Side_effect_(computer_science)">side effects</a> that you wouldn’t want repeated, e.g. writing something to the database. After asking on reddit, my <a href="https://www.reddit.com/r/Kotlin/comments/ypfkmq/why_is_sequence_built_using_generatesequence/">suspicions were confirmed</a>, and I’ll refer you to the <a href="https://www.reddit.com/r/Kotlin/comments/ypfkmq/comment/iviyp9k/?utm_source=share&amp;utm_medium=web2x&amp;context=3">excellent answer</a> there instead of reproducing it.</p><p id="9335">It should be noted that you can constrain any <code>Sequence</code> to run only once using the aptly named method <a href="https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.sequences/constrain-once.html"><code>constrainO</code>nce</a>.</p><p id="5f42"><b>Example</b></p> <figure id="f6aa"> <div> <div> <img class="ratio" src="http://placehold.it/16x9"> <iframe class="" src="https://cdn.embedly.com/widgets/media.html?src=https%3A%2F%2Fpl.kotl.in%2FWoKTImqZa&amp;display_name=Kotlin+Playground&amp;url=https%3A%2F%2Fpl.kotl.in%2FWoKTImqZa&amp;key=a19fcc184b9711e1b4764040d3dc5c07&amp;type=text%2Fhtml&amp;schema=kotl" allowfullscreen="" frameborder="0" height="300" width="800"> </div> </div> </figure></iframe></div></div></figure><p id="667d">There are two variants of <code>generateSequence</code> which allow you to make the process dependent on some value (and specify an initial value to use). Both these variants are <b>not</b> constrained to run only once.</p><div id="0e4c"><pre><span class="hljs-function"><span class="hljs-keyword">fun</span> <span class="hljs-type"><T : Any></span> <span class="hljs-title">generateSequence</span><span class="hljs-params">( seed: <span class="hljs-type">T</span>?, nextFunction: (<span class="hljs-type">T</span>) -> <span class="hljs-type">T</span>? )</span></span>: Sequence<T></pre></div><div id="d18f"><pre><span class="hljs-function"><span class="hljs-keyword">fun</span> <span class="hljs-type"><T : Any></span> <span class="hljs-title">generateSequence</span><span class="hljs-params">( seedFunction: () -> <span class="hljs-type">T</span>?, nextFunction: (<span class="hljs-type">T</span>) -> <span class="hljs-type">T</span>? )</span></span>: Sequence<T></pre></div><p id="3bc2">In both cases, the return value of <code>nextFunction</code> is used as the input in the following iteration, assuming its non-null. If it’s <code>null</code>, the <code>Sequence</code> is terminated.</p><h2 id="88d6">How Sequences Work & Terminal Operations</h2><p id="c6cc">When you apply operations to a <code>Sequence</code>, you’re basically creating a hierarchy of objects that describe the operations you want to perform on its elements. In other words, instances of <code>Sequence</code> are <b>sets of instructions that describe how to create and transform their elements. </b>The operations you have at your disposal are identical to the counterparts defined on <code>Iterable</code><code>map</code>, <code>filter</code>, <code>take</code>, <code>fold</code>, <code>groupBy</code>, you name it.</p><p id="879b">Whenever possible, calling an operation simply adds a new instruction, i.e. creates a new instance of <code>Sequence</code> with the new operation wrapping the previous ones (Dave Leeds has <a href="https://typealias.com/guides/inside-kotlin-sequences/">a great article</a> that talks about this in more detail). For example, both <code>map</code> and <code>filter</code> do this. Such operations are called <i>intermediate</i> operations.</p><p id="d9a9">However, some operations that require traversing all the elements to give the correct result. For example, <code>fold</code>, <code>count</code>, <code>groupBy</code>, <code>forEach</code> and others are all functions that need to traverse all the elements to compute their result. Such operations are called <i>terminal</i> operations, and the moment they are called is the moment all the operations in the sequence definition actually get executed.</p> <figure id="0c72"> <div> <div> <img class="ratio" src="http://placehold.it/16x9"> <iframe class="" src="https://cdn.embedly.com/widgets/media.html?src=https%3A%2F%2Fpl.kotl.in%2FSGWy74NZX&amp;display_name=Kotlin+Playground&amp;url=https%3A%2F%2Fpl.kotl.in%2FSGWy74NZX&amp;key=a19fcc184b9711e1b4764040d3dc5c07&amp;type=text%2Fhtml&amp;schema=kotl" allowfullscreen="" frameborder="0" height="300" width="800"> </div> </div> </figure></iframe></div></div></figure><h2 id="7b46">Eager vs. lazy, finite vs. infinite</h2><p id="c2ca">You will often encounter articles that describe the difference between <code>Sequence</code> and <code>Iterable</code> in terms of eagerness and laziness — <code>Iterable</code> is eager (runs immediately), while <code>Sequence</code> is lazy

Options

(only builds a set of instructions and does not run until you apply a terminal operation). You will also often encounter descriptions in terms of finiteness — <code>Sequences</code> can be infinite, as opposed to <code>Iterable</code>, which is always finite.</p><p id="bf88">While both statements are technically correct, I dislike this approach because I think that they address <i>consequences</i>, and not core properties. Framing the difference between <code>Iterable</code> and <code>Sequence</code> in this manner distracts from what’s fundamentally going on. If you’ve ever felt like laziness vs. eagerness and finite vs. infinite is an awfully ad-hoc way to discriminate between collection hierarchies, this is probably why —<b> it’s not at the core of what’s actually going on</b>.</p><p id="c293">The laziness of <code>Sequence</code> is a <i>consequence</i> of the fact that it traverses operations first, and therefore, cannot execute immediately when an instance constructed. If it did, you couldn’t specify multiple operations! By definition, we require all the operations to be applied to the first element before it moves on to the second element, but that can’t happen if it executes immediately, before we’ve even specified which operations should be run. Therefore, <b>by necessity, <code>Sequence</code> cannot start immediately</b>. On the other hand, since <code>Iterable</code> traverses elements first, it is able to start immediately — no need to wait for all the operations to be specified, it traverses the elements, spits out the result, and we’re free to chain another operation if we want to.</p><p id="15a2">But it also works the other way around — <code>Iterable</code> traverses elements first, and so <b>it can’t work unless all elements have been specified and are loaded into memory</b>. By necessity, this means that you cannot have an infinite amount of these elements — <code>Iterable</code> applies the first operation to all elements before moving on to the next operation, which would never finish executing. On the other hand, <code>Sequence</code> traverses operations first, and so it <i>doesn’t</i> need all the elements to be loaded in memory — it suffices to fetch/compute the next one once it’s finished applying all the operations to the current one. Therefore, as a <i>consequence</i>, <code>Sequences</code> can be infinite and still be used to give meaningful results.</p><p id="e785">The use-cases for infinite sequences are essentially the same as for infinite loops. Think of reading a sensor that is continuously producing data, which you want to transform and render to the screen. You could never model this using <code>Iterable</code> — the stream of data never ends, so <code>Iterable</code> would never get around to even running the first operation — and therefore, you couldn’t use <code>map</code> etc.</p><p id="e258">With <code>Sequence</code>, it’s no problem at all:</p> <figure id="c12a"> <div> <div> <img class="ratio" src="http://placehold.it/16x9"> <iframe class="" src="https://cdn.embedly.com/widgets/media.html?src=https%3A%2F%2Fpl.kotl.in%2Fl_cB3v1jM&amp;display_name=Kotlin+Playground&amp;url=https%3A%2F%2Fpl.kotl.in%2Fl_cB3v1jM&amp;key=a19fcc184b9711e1b4764040d3dc5c07&amp;type=text%2Fhtml&amp;schema=kotl" allowfullscreen="" frameborder="0" height="300" width="800"> </div> </div> </figure></iframe></div></div></figure><p id="373b">You saw another example of an infinite sequence using <code>generateSequence</code>:</p> <figure id="ecd7"> <div> <div> <img class="ratio" src="http://placehold.it/16x9"> <iframe class="" src="https://cdn.embedly.com/widgets/media.html?src=https%3A%2F%2Fpl.kotl.in%2FEvrnugUto&amp;display_name=Kotlin+Playground&amp;url=https%3A%2F%2Fpl.kotl.in%2FEvrnugUto&amp;key=a19fcc184b9711e1b4764040d3dc5c07&amp;type=text%2Fhtml&amp;schema=kotl" allowfullscreen="" frameborder="0" height="300" width="800"> </div> </div> </figure></iframe></div></div></figure><p id="4a9b">To sum up, eagerness/laziness and finiteness/infiniteness are both a consequence of the fact that <code>Sequence</code> requires all <i>operations</i> to be specified before it starts, while <code>Iterable</code> requires all <i>elements</i> to be specified before it starts. And in turn, both of these are themselves consequences of the fact <b>that <code>Sequence</code> traverses operations first, while <code>Iterable</code> traverses elements first.</b> This is the fundamental property, from which the others follow.</p><p id="8e74">Go back to <a href="https://readmedium.com/sequences-why-we-need-them-45b320e8922">Sequences: Why We Need Them</a>, jump to the <a href="https://readmedium.com/table-of-contents-c52573cfa291">Table of Contents</a>, or continue to <a href="https://readmedium.com/sequences-yield-vs-return-how-pausing-functions-work-e7d0c32917b1">Sequences: Yield vs. Return & How Pausing Functions Work</a>.</p><figure id="8ecd"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*biBSB579iezsNvEQ_NMLBg.png"><figcaption></figcaption></figure></article></body>

Sequences: How They Work & How To Use Them

How to create a Sequence, how they work under the hood & the difference between intermediate and terminal operations, and how laziness and infiniteness arise from a more fundamental property.

— — — — — — — — — — — — — — —

THE CURRENT VERSION OF THIS ARTICLE IS PUBLISHED HERE.

— — — — — — — — — — — — — — —

Tags: #FYI++

This article is part of the Kotlin Primer, an opinionated guide to the Kotlin language, which is indented to help facilitate Kotlin adoption inside Java-centric organizations. It was originally written as an organizational learning resource for Etnetera a.s. and I would like to express my sincere gratitude for their support.

It is recommended to read the Introduction before moving on. Check out the Table of Contents for all articles.

Creating a Sequence

There are three main ways you can create a sequence:

  • By directly specifying its elements, via sequenceOf
  • From an Iterable, via asSequence
  • By using a lambda that calculates each element, via generateSequence
  • By using what are essentially generators, i.e. functions that can return multiple times using yield and yieldAll, via sequence. The way this works is sufficiently involved to warrant its own separate article.

sequenceOf

fun <T> sequenceOf(vararg elements: T): Sequence<T>

Creates a Sequence from the passed in elements. Functionally equivalent to listOf, setOf etc., but for Sequence.

Example

asSequence

fun <T> Iterable<T>.asSequence(): Sequence<T>

Creates a Sequence from the contents of the Iterable.

Example

generateSequence

fun <T : Any> generateSequence(nextFunction: () -> T?): Sequence<T>

Generates a Sequence by repeatedly calling nextFunction, until it returns null.

There are a few variants of this method that we will discuss shortly, however be aware that the Sequence returned by the above variant is constrained to only run once! Attempting to run it twice will result in an exception.

I tried to find out why this is so and couldn’t find any official material referencing the reason. I had a hunch this had something to do with Sequences that cause side effects that you wouldn’t want repeated, e.g. writing something to the database. After asking on reddit, my suspicions were confirmed, and I’ll refer you to the excellent answer there instead of reproducing it.

It should be noted that you can constrain any Sequence to run only once using the aptly named method constrainOnce.

Example

There are two variants of generateSequence which allow you to make the process dependent on some value (and specify an initial value to use). Both these variants are not constrained to run only once.

fun <T : Any> generateSequence(
    seed: T?,
    nextFunction: (T) -> T?
): Sequence<T>
fun <T : Any> generateSequence(
    seedFunction: () -> T?,
    nextFunction: (T) -> T?
): Sequence<T>

In both cases, the return value of nextFunction is used as the input in the following iteration, assuming its non-null. If it’s null, the Sequence is terminated.

How Sequences Work & Terminal Operations

When you apply operations to a Sequence, you’re basically creating a hierarchy of objects that describe the operations you want to perform on its elements. In other words, instances of Sequence are sets of instructions that describe how to create and transform their elements. The operations you have at your disposal are identical to the counterparts defined on Iterablemap, filter, take, fold, groupBy, you name it.

Whenever possible, calling an operation simply adds a new instruction, i.e. creates a new instance of Sequence with the new operation wrapping the previous ones (Dave Leeds has a great article that talks about this in more detail). For example, both map and filter do this. Such operations are called intermediate operations.

However, some operations that require traversing all the elements to give the correct result. For example, fold, count, groupBy, forEach and others are all functions that need to traverse all the elements to compute their result. Such operations are called terminal operations, and the moment they are called is the moment all the operations in the sequence definition actually get executed.

Eager vs. lazy, finite vs. infinite

You will often encounter articles that describe the difference between Sequence and Iterable in terms of eagerness and laziness — Iterable is eager (runs immediately), while Sequence is lazy (only builds a set of instructions and does not run until you apply a terminal operation). You will also often encounter descriptions in terms of finiteness — Sequences can be infinite, as opposed to Iterable, which is always finite.

While both statements are technically correct, I dislike this approach because I think that they address consequences, and not core properties. Framing the difference between Iterable and Sequence in this manner distracts from what’s fundamentally going on. If you’ve ever felt like laziness vs. eagerness and finite vs. infinite is an awfully ad-hoc way to discriminate between collection hierarchies, this is probably why — it’s not at the core of what’s actually going on.

The laziness of Sequence is a consequence of the fact that it traverses operations first, and therefore, cannot execute immediately when an instance constructed. If it did, you couldn’t specify multiple operations! By definition, we require all the operations to be applied to the first element before it moves on to the second element, but that can’t happen if it executes immediately, before we’ve even specified which operations should be run. Therefore, by necessity, Sequence cannot start immediately. On the other hand, since Iterable traverses elements first, it is able to start immediately — no need to wait for all the operations to be specified, it traverses the elements, spits out the result, and we’re free to chain another operation if we want to.

But it also works the other way around — Iterable traverses elements first, and so it can’t work unless all elements have been specified and are loaded into memory. By necessity, this means that you cannot have an infinite amount of these elements — Iterable applies the first operation to all elements before moving on to the next operation, which would never finish executing. On the other hand, Sequence traverses operations first, and so it doesn’t need all the elements to be loaded in memory — it suffices to fetch/compute the next one once it’s finished applying all the operations to the current one. Therefore, as a consequence, Sequences can be infinite and still be used to give meaningful results.

The use-cases for infinite sequences are essentially the same as for infinite loops. Think of reading a sensor that is continuously producing data, which you want to transform and render to the screen. You could never model this using Iterable — the stream of data never ends, so Iterable would never get around to even running the first operation — and therefore, you couldn’t use map etc.

With Sequence, it’s no problem at all:

You saw another example of an infinite sequence using generateSequence:

To sum up, eagerness/laziness and finiteness/infiniteness are both a consequence of the fact that Sequence requires all operations to be specified before it starts, while Iterable requires all elements to be specified before it starts. And in turn, both of these are themselves consequences of the fact that Sequence traverses operations first, while Iterable traverses elements first. This is the fundamental property, from which the others follow.

Go back to Sequences: Why We Need Them, jump to the Table of Contents, or continue to Sequences: Yield vs. Return & How Pausing Functions Work.

Java
Kotlin
Programming
Functional Programming
Sequence
Recommended from ReadMedium