Making a Words Offset Iterator with Iterators & Generators in Typescript
Background
In my everyday job, it is very rare where I can find a use-case for generators. Recently, I ran into a LeetCode problem where I needed to iterate through every word in a line and get its starting index. I thought this might actually be a good use-case for generators and decided to explore how to do it.
I’ll demonstrate an implementation for simple words iterator using iterators and generators with Typescript.
For a quick refresher, I found the following to be excellent references to learn more about iterators and generators:
- Iterators gonna iterate by Jake Archibald: Jakes explains it in the simplest way possible.
- Iterators and generators by MDN
Problem
Given a string, return an iterator that yields the starting offset of every word.
const wordsIter = new WordsIterator("Boy you got my heartbeat runnin' away")
for (const offset of wordsIter) {
console.log(`offset: ${offset}`)
}
// output:
// offset: 0
// offset: 4
// offset: 8
// offset: 12
// offset: 15
// offset: 25
// offset: 33
Motivation
I wanted to learn more about the following:
- Use
WordsIterator
for input validation in slatejs rich text editor - Learn more on generators in an actual use-case
- Learn more about type-systems with typescript
- Implementation
Implementation
export class WordsIterator implements Iterable<number> {
constructor(private doc: string) {}
*[Symbol.iterator](): Iterator<number> {
let offset = 0;
while (offset < this.doc.length) {
if (isWhiteSpace(this.doc[offset])) {
offset += 1;
continue;
}
yield offset;
offset += wordLength(this.doc, offset) + 1;
}
}
}
function isWhiteSpace(str: string): boolean {
return /\s/.test(str);
}
function wordLength(doc: string, index: number): number {
let start = index;
while (index < doc.length && doc[index].match(/\S/)) {
index++
}
return index - start;
}