Iterators
Outdated Notice
An iterator is not a collection, but rather a way to access the elements of a collection one by one. The two basic operations on an iteratorit
arenext
andhasNext
. A call toit.next()
will return the next element of the iterator and advance the state of the iterator. Callingnext
again on the same iterator will then yield the element one beyond the one returned previously. If there are no more elements to return, a call tonext
will throw aNoSuchElementException
. You can find out whether there are more elements to return usingIterator’shasNext
method.
The most straightforward way to “step through” all the elements returned by an iteratorit
uses a while-loop:
while (it.hasNext) println(it.next())
Iterators in Scala also provide analogues of most of the methods that you find in theTraversable
,Iterable
andSeq
classes. For instance, they provide aforeach
method which executes a given procedure on each element returned by an iterator. Usingforeach
, the loop above could be abbreviated to:
it foreach println
As always, for-expressions can be used as an alternate syntax for expressions involvingforeach
,map
,withFilter
, andflatMap
, so yet another way to print all elements returned by an iterator would be:
for (elem <- it) println(elem)
There’s an important difference between the foreach method on iterators and the same method on traversable collections: When called on an iterator,foreach
will leave the iterator at its end when it is done. So callingnext
again on the same iterator will fail with aNoSuchElementException
. By contrast, when called on a collection,foreach
leaves the number of elements in the collection unchanged (unless the passed function adds or removes elements, but this is discouraged, because it may lead to surprising results).
The other operations that Iterator has in common withTraversable
have the same property. For instance, iterators provide amap
method, which returns a new iterator:
scala> val it = Iterator("a", "number", "of", "words")it: Iterator[java.lang.String] = non-empty iteratorscala> it.map(_.length)res1: Iterator[Int] = non-empty iteratorscala> res1 foreach println1625scala> it.next()java.util.NoSuchElementException: next on empty iterator
As you can see, after the call toit.map
, theit
iterator has advanced to its end.
Another example is thedropWhile
method, which can be used to find the first elements of an iterator that has a certain property. For instance, to find the first word in the iterator above that has at least two characters you could write:
scala> val it = Iterator("a", "number", "of", "words")it: Iterator[java.lang.String] = non-empty iteratorscala> it dropWhile (_.length < 2)res4: Iterator[java.lang.String] = non-empty iteratorscala> res4.next()res5: java.lang.String = number
Note again thatit
was changed by the call todropWhile
: it now points to the second word “number” in the list.In fact,it
and the resultres4
returned bydropWhile
will return exactly the same sequence of elements.
One way to circumvent this behavior is toduplicate
the underlying iterator instead of calling methods on it directly.Thetwo iterators that result will each return exactly the same elements as the underlying iteratorit
:
scala> val (words, ns) = Iterator("a", "number", "of", "words").duplicatewords: Iterator[String] = non-empty iteratorns: Iterator[String] = non-empty iteratorscala> val shorts = words.filter(_.length < 3).toListshorts: List[String] = List(a, of)scala> val count = ns.map(_.length).sumcount: Int = 14
The two iterators work independently: advancing one does not affect the other, so that each can bedestructively modified by invoking arbitrary methods. This creates the illusion of iterating overthe elements twice, but the effect is achieved through internal buffering.As usual, the underlying iteratorit
cannot be used directly and must be discarded.
In summary, iterators behave like collectionsif one never accesses an iterator again after invoking a method on it. The Scala collection libraries make this explicit with an abstractionTraversableOnce, which is a common superclass ofTraversable andIterator. As the name implies,TraversableOnce
objects can be traversed usingforeach
but the state of that object after the traversal is not specified. If theTraversableOnce
object is in fact anIterator
, it will be at its end after the traversal, but if it is aTraversable
, it will still exist as before. A common use case ofTraversableOnce
is as an argument type for methods that can take either an iterator or a traversable as argument. An example is the appending method++
in classTraversable
. It takes aTraversableOnce
parameter, so you can append elements coming from either an iterator or a traversable collection.
All operations on iterators are summarized below.
Operations in class Iterator
WHAT IT IS | WHAT IT DOES |
---|---|
Abstract Methods: | |
it.next() | Returns next element on iterator and advances past it. |
it.hasNext | Returnstrue ifit can return another element. |
Variations: | |
it.buffered | A buffered iterator returning all elements ofit . |
it grouped size | An iterator that yields the elements returned byit in fixed-sized sequence “chunks”. |
xs sliding size | An iterator that yields the elements returned byit in sequences representing a sliding fixed-sized window. |
Duplication: | |
it.duplicate | A pair of iterators that each independently return all elements ofit . |
Additions: | |
it ++ jt | An iterator returning all elements returned by iteratorit , followed by all elements returned by iteratorjt . |
it padTo (len, x) | The iterator that first returns all elements ofit and then follows that by copies ofx until lengthlen elements are returned overall. |
Maps: | |
it map f | The iterator obtained from applying the functionf to every element returned fromit . |
it flatMap f | The iterator obtained from applying the iterator-valued function f to every element init and appending the results. |
it collect f | The iterator obtained from applying the partial functionf to every element init for which it is defined and collecting the results. |
Conversions: | |
it.toArray | Collects the elements returned byit in an array. |
it.toList | Collects the elements returned byit in a list. |
it.toIterable | Collects the elements returned byit in an iterable. |
it.toSeq | Collects the elements returned byit in a sequence. |
it.toIndexedSeq | Collects the elements returned byit in an indexed sequence. |
it.toStream | Collects the elements returned byit in a stream. |
it.toSet | Collects the elements returned byit in a set. |
it.toMap | Collects the key/value pairs returned byit in a map. |
Copying: | |
it copyToBuffer buf | Copies all elements returned byit to bufferbuf . |
it copyToArray(arr, s, n) | Copies at mostn elements returned byit to arrayarr starting at indexs . The last two arguments are optional. |
Size Info: | |
it.isEmpty | Test whether the iterator is empty (opposite ofhasNext ). |
it.nonEmpty | Test whether the collection contains elements (alias ofhasNext ). |
it.size | The number of elements returned byit . Note:it will be at its end after this operation! |
it.length | Same asit.size . |
it.hasDefiniteSize | Returnstrue ifit is known to return finitely many elements (by default the same asisEmpty ). |
Element Retrieval Index Search: | |
it find p | An option containing the first element returned byit that satisfiesp , orNone is no element qualifies. Note: The iterator advances to after the element, or, if none is found, to the end. |
it indexOf x | The index of the first element returned byit that equalsx . Note: The iterator advances past the position of this element. |
it indexWhere p | The index of the first element returned byit that satisfiesp . Note: The iterator advances past the position of this element. |
Subiterators: | |
it take n | An iterator returning of the firstn elements ofit . Note: it will advance to the position after then ‘th element, or to its end, if it contains less thann elements. |
it drop n | The iterator that starts with the(n+1) ‘th element ofit . Note:it will advance to the same position. |
it slice (m,n) | The iterator that returns a slice of the elements returned from it, starting with them ‘th element and ending before then ‘th element. |
it takeWhile p | An iterator returning elements fromit as long as conditionp is true. |
it dropWhile p | An iterator skipping elements fromit as long as conditionp istrue , and returning the remainder. |
it filter p | An iterator returning all elements fromit that satisfy the conditionp . |
it withFilter p | Same asit filterp . Needed so that iterators can be used in for-expressions. |
it filterNot p | An iterator returning all elements fromit that do not satisfy the conditionp . |
Subdivisions: | |
it partition p | Splitsit into a pair of two iterators: one returning all elements fromit that satisfy the predicatep , the other returning all elements fromit that do not. |
it span p | Splitsit into a pair of two iterators: one returning all elements of the prefix ofit that satisfy the predicatep , the other returning all remaining elements ofit . |
Element Conditions: | |
it forall p | A boolean indicating whether the predicate p holds for all elements returned byit . |
it exists p | A boolean indicating whether the predicate p holds for some element init . |
it count p | The number of elements init that satisfy the predicatep . |
Folds: | |
it.foldLeft(z)(op) | Apply binary operationop between successive elements returned byit , going left to right and starting withz . |
it.foldRight(z)(op) | Apply binary operationop between successive elements returned byit , going right to left and starting withz . |
it reduceLeft op | Apply binary operationop between successive elements returned by non-empty iteratorit , going left to right. |
it reduceRight op | Apply binary operationop between successive elements returned by non-empty iteratorit , going right to left. |
Specific Folds: | |
it.sum | The sum of the numeric element values returned by iteratorit . |
it.product | The product of the numeric element values returned by iteratorit . |
it.min | The minimum of the ordered element values returned by iteratorit . |
it.max | The maximum of the ordered element values returned by iteratorit . |
Zippers: | |
it zip jt | An iterator of pairs of corresponding elements returned from iteratorsit andjt . |
it zipAll (jt, x, y) | An iterator of pairs of corresponding elements returned from iteratorsit andjt , where the shorter iterator is extended to match the longer one by appending elementsx ory . |
it.zipWithIndex | An iterator of pairs of elements returned fromit with their indices. |
Update: | |
it patch (i, jt, r) | The iterator resulting fromit by replacingr elements starting withi by the patch iteratorjt . |
Comparison: | |
it sameElements jt | A test whether iterators it andjt return the same elements in the same order. Note: Using the iterators after this operation is undefined and subject to change. |
Strings: | |
it addString (b, start, sep, end) | Adds a string toStringBuilder b which shows all elements returned byit between separatorssep enclosed in stringsstart andend .start ,sep ,end are all optional. |
it mkString (start, sep, end) | Converts the collection to a string which shows all elements returned byit between separatorssep enclosed in stringsstart andend .start ,sep ,end are all optional. |
Laziness
Unlike operations directly on a concrete collection likeList
, operations onIterator
are lazy.
A lazy operation does not immediately compute all of its results. Instead, it computes the results as they are individually requested.
So the expression(1 to 10).iterator.map(println)
would not print anything to the screen. Themap
method in this case doesn’t apply its argument function to the values in the range, it returns a newIterator
that will do this as each one is requested. Adding.toList
to the end of that expression will actually print the elements.
A consequence of this is that a method likemap
orfilter
won’t necessarily apply its argument function to all the input elements. The expression(1 to 10).iterator.map(println).take(5).toList
would only print the values1
to5
, for instance, since those are only ones that will be requested from theIterator
returned bymap
.
This is one of the reasons why it’s important to only use pure functions as arguments tomap
,filter
,fold
and similar methods. Remember, a pure function has no side-effects, so one would not normally useprintln
in amap
.println
is used to demonstrate laziness as it’s not normally visible with pure functions.
Laziness is still valuable, despite often not being visible, as it can prevent unneeded computations from happening, and can allow for working with infinite sequences, like so:
def zipWithIndex[A](i: Iterator[A]): Iterator[(Int, A)] = Stream.from(0).zip(i)
Buffered iterators
Sometimes you want an iterator that can “look ahead”, so that you can inspect the next element to be returned without advancing past that element. Consider for instance, the task to skip leading empty strings from an iterator that returns a sequence of strings. You might be tempted to write the following
def skipEmptyWordsNOT(it: Iterator[String]) = while (it.next().isEmpty) {}
But looking at this code more closely, it’s clear that this is wrong: The code will indeed skip leading empty strings, but it will also advanceit
past the first non-empty string!
The solution to this problem is to use a buffered iterator. ClassBufferedIterator is a subclass ofIterator, which provides one extra method,head
. Callinghead
on a buffered iterator will return its first element but will not advance the iterator. Using a buffered iterator, skipping empty words can be written as follows.
def skipEmptyWords(it: BufferedIterator[String]) = while (it.head.isEmpty) { it.next() }
Every iterator can be converted to a buffered iterator by calling itsbuffered
method. Here’s an example:
scala> val it = Iterator(1, 2, 3, 4)it: Iterator[Int] = non-empty iteratorscala> val bit = it.bufferedbit: scala.collection.BufferedIterator[Int] = non-empty iteratorscala> bit.headres10: Int = 1scala> bit.next()res11: Int = 1scala> bit.next()res12: Int = 2scala> bit.headOptionres13: Option[Int] = Some(3)
Note that callinghead
on the buffered iteratorbit
does not advance it. Therefore, the subsequent callbit.next()
returns the same value asbit.head
.
As usual, the underlying iterator must not be used directly and must be discarded.
The buffered iterator only buffers the next element whenhead
is invoked. Other derived iterators,such as those produced byduplicate
andpartition
, may buffer arbitrary subsequences of theunderlying iterator. But iterators can be efficiently joined by adding them together with++
:
scala> def collapse(it: Iterator[Int]) = if (!it.hasNext) Iterator.empty else { | var head = it.next | val rest = if (head == 0) it.dropWhile(_ == 0) else it | Iterator.single(head) ++ rest | }collapse: (it: Iterator[Int])Iterator[Int]scala> def collapse(it: Iterator[Int]) = { | val (zeros, rest) = it.span(_ == 0) | zeros.take(1) ++ rest | }collapse: (it: Iterator[Int])Iterator[Int]scala> collapse(Iterator(0, 0, 0, 1, 2, 3, 4)).toListres14: List[Int] = List(0, 1, 2, 3, 4)
In the second version ofcollapse
, the unconsumed zeros are buffered internally.In the first version, any leading zeros are dropped and the desired result constructedas a concatenated iterator, which simply calls its two constituent iterators in turn.
Contributors to this page:
Contents
- Introduction
- Mutable and Immutable Collections
- Trait Traversable
- Trait Iterable
- The sequence traits Seq, IndexedSeq, and LinearSeq
- Sets
- Maps
- Concrete Immutable Collection Classes
- Concrete Mutable Collection Classes
- Arrays
- Strings
- Performance Characteristics
- Equality
- Views
- Iterators
- Creating Collections From Scratch
- Conversions Between Java and Scala Collections
- Migrating from Scala 2.7