Queues In JavaScript
Recently I was working on something with Node.js
where I needed to use a queue; in a nutshell, I needed
a data structure where I could put a bunch of elements in it, and then get the elements back out in the
order in which they were added. I've been working with the broader JS ecosystem for a few years now and up until
this point I realized I've never actually had a use case for a queue before! I've used them in Python and
so I expected to be able to find something within the language and be able to do something like this:
However there isn't really an agreed upon convention for dealing with queues in JS. It appears we either have to roll with a
custom solution (build ourselves or pull one via NPM) or use the JS array
as a queue in Node.js
.
To be precise, array
has a built in method shift()
that removes the first element from the collection and returns it. In effect, this gives us what we'd want out of a queue:
Although this seems to work just fine, I wondered what would happen if instead of thousands of elements,
I had to run shift()
on tens of thousands of elements or even millions of elements?
To put this to the test, I used the PerformanceObserver
interface from Node.js
and achieved the following results on my machine (MacBook Pro M1 Max, 10-Core CPU and 24-Core GPU):
For a queue with one million elements, it takes over a minute to empty the collection using shift()
. This quick benchmark reveals that shift()
is very
likely a linear time operation on average; hence it takes quadratic time to empty the queue entirely.
Optimizing
Can we do a little better? Well, our bottleneck is items leaving the queue and we want to speed up that operation as much as possible. One approach
could be to implement a doubly linked list to use as a queue. Having a pointer in the front and back would allow us to push
and popleft
in
constant time. Another approach, with perhaps fewer lines of code and potentially easier to reason about, would be to create a queue around the
built in array
. This solution is explored below.
The built in array
already has a very fast method for adding items to the collection, push(item)
. So we don't need to modify this aspect. What
we do need though is an efficient way of getting items out of the collection. Borrowing an idea from the doubly linked list, we could maintain
a pointer at the front and move the pointer each time an item leaves the queue. Note that this efficiently gets the front item simply by indexing into
the collection at the position of the pointer.
Great! So that's it? Well, not quite. Notice that even though we're getting the correct item and returning it, the item itself hasn't actually left the collection. We're still allocating memory for that item. So we need an efficient-ish way to reduce the size of the collection. When it comes to knowing when to resize a data structure, there are a lot of approaches and optimizations we could make, but to keep things simple, we'll see how far we can get with resizing when the collection has reduced in size by half. Putting together these ideas, we get the following:
Obviously this code isn't perfect, and we'd need to handle some edge cases, but how does it perform? On the same machine using the same parameters as before, we get the following results:
Recall that previously it took over a minute to empty the queue with a million elements. With roughly 20 lines of code, we were able to reduce the
time down to roughly 8ms
, pretty cool!