How to move JavaScript functions out of their closure to save memory

A long time ago I talked about a pattern I use (a lot!) that abuses the functional methods of JavaScript to save some memory and CPU usage. I promised to write a blog post about it, and almost 4 months later (too much energy-draining work...), here it is.

The problem

Imagine you have a list with info about Ubuntu versions...

const versions = [
  {codename: 'Disco Dingo', version: '19.04'},
  {codename: 'Eoan Ermine', version: '19.10'},
  {codename: 'Focal Fossa', version: '20.04'}
]

...and you want to filter the versions published last year:

function getUbuntuVersion()
{
  return versions.filter(function(info)
  {
    return info.version.startsWith('19.')
  })
}

Simple, isn't it? But we have a nested anonymous function as the filter() test callback. Since JavaScript is a dynamic language, this function object is created each time we call getUbuntuVersion(). That not only wastes CPU time by making the JavaScript VM create a new function object on each call [^1], but the objects are also discarded once we leave getUbuntuVersion(), wasting heap memory until the garbage collector finally destroys them (and, when it does, wasting more CPU cycles too).
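A minimal sketch of why this happens, as far as the language semantics go (engines may optimize the allocation away when the function never escapes). makeNestedTest() is a hypothetical helper, not part of the original code: it returns the nested callback instead of using it, so the objects created by two calls can be compared.

```javascript
// Hypothetical helper: returns the nested callback instead of passing it to
// filter(), so we can compare the function objects created by two calls.
function makeNestedTest() {
  return function (info) {
    return info.version.startsWith('19.')
  }
}

const first = makeNestedTest()
const second = makeNestedTest()

// Same source code, but each call evaluated the function expression again
console.log(first === second) // false: two distinct function objects
```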

The solution is simple too: just move the anonymous function out of the getUbuntuVersion() body to the parent (global) scope:

function filterTest(info)
{
  return info.version.startsWith('19.')
}

function getUbuntuVersion()
{
  return versions.filter(filterTest)
}

This is a simple operation because filterTest() doesn't use any value from the closure created by the outer getUbuntuVersion() function. Because of that, modern JavaScript VMs probably do this optimization too (or maybe even inline the anonymous function in place, which would lead to faster code since there's no function call, and would use less memory since there's no need to create a new Function object). But what if we want getUbuntuVersion() to be configurable and accept the year to search for? Then filterTest() will need access to the getUbuntuVersion() argument...

function getUbuntuVersion(year = 19)
{
  return versions.filter(function(info)
  {
    return info.version.startsWith(`${year}.`)
  })
}

...and we are back to the original code, with its waste of CPU and memory. How can we solve this? How can we pass filterTest() an additional parameter with the year to check for, if Array.prototype.filter() already has a fixed signature for its test callback?

A quick look at function binding

Contrary to popular belief, JavaScript is not a class-based object-oriented programming language; it's prototype-based. This is because it was designed to have a low memory footprint, and due to that not only are object methods in fact regular functions, but the this keyword doesn't necessarily point to the object that owns the "method", as happens in other languages like C++; it points to the current execution context. This execution context is always kept up to date by the JavaScript VM, works like a special register, and can point to any object[^2].

In fact, you can explicitly tell the JavaScript VM which this object to use when calling a function. The best-known way is the Function.prototype.bind() method (yes, in JavaScript functions are objects too, and yes, they have methods themselves, and yes, I've already said that methods don't exist in JavaScript but are regular functions, go get a life and leave me alone). It returns a new function that has the this object (the execution context) bound to the original function, so you can call it standalone, similar to what happens with arrow functions or with methods in Python. You can also bind some additional arguments to it so you don't need to provide them later, which is known as partial application (often loosely called currying). The next two code blocks are logically equivalent (I personally prefer the first one):

const a =
{
  b(){}
}

const binded = a.b.bind(a)

binded()

const a =
{
  b(){}
}

function binded(...rest)
{
  return a.b(...rest)
}

binded()
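As a sketch of the partial application mentioned above, applied to the year problem: filterTestByYear() is a hypothetical variant of filterTest() that takes the year as its first parameter, so bind() can pre-fill it. It works, but note that bind() still allocates a new function on every call.

```javascript
const versions = [
  {codename: 'Disco Dingo', version: '19.04'},
  {codename: 'Eoan Ermine', version: '19.10'},
  {codename: 'Focal Fossa', version: '20.04'}
]

// Hypothetical two-argument test: `year` goes first so bind() can pre-fill it
function filterTestByYear(year, info) {
  return info.version.startsWith(`${year}.`)
}

function getUbuntuVersion(year = 19) {
  // bind() returns a NEW function with `this` set to null and `year`
  // pre-filled; filter() then supplies `info` as the next argument
  return versions.filter(filterTestByYear.bind(null, year))
}

console.log(getUbuntuVersion(20).map(info => info.codename)) // [ 'Focal Fossa' ]
```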

But for our purposes it's better to use the sister methods of Function.prototype.bind(), since they don't need to create a new Function object (wasting memory): apply(), which calls the function with the this object given as its first argument and an explicit array of arguments as the second, or better call(), where the arguments are passed one by one after the this object, so the signature is almost the same as that of the original function. Equivalent machinery is what the functional methods of Array objects use internally to set the execution context of their callback functions... like the Array.prototype.filter() method we were using at the beginning of this blog post.
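A small sketch of both methods, plus a simplified stand-in for filter() to show where the execution context gets set. simpleFilter() is hypothetical and only for illustration: real engines implement filter() natively, and this version skips details like sparse arrays.

```javascript
const versions = [
  {codename: 'Disco Dingo', version: '19.04'},
  {codename: 'Eoan Ermine', version: '19.10'},
  {codename: 'Focal Fossa', version: '20.04'}
]

function filterTest(info) {
  return info.version.startsWith(`${this.year}.`)
}

// Simplified stand-in for Array.prototype.filter(), NOT the real implementation
function simpleFilter(array, callback, thisArg) {
  const result = []

  for (let index = 0; index < array.length; index++) {
    const item = array[index]

    // call() sets `this` for this single invocation, no new function needed
    if (callback.call(thisArg, item, index, array)) result.push(item)
  }

  return result
}

const config = {year: 19}
const disco = versions[0]

console.log(filterTest.call(config, disco))    // true
console.log(filterTest.apply(config, [disco])) // true
console.log(simpleFilter(versions, filterTest, config).length) // 2
```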

The thisArg argument of functional methods

The functional methods of Array objects call the provided callback with the default execution context (the global object in sloppy mode, undefined in strict mode), but most of them, like map() or filter(), let us set it through an additional optional thisArg argument (the most notable exception is reduce(), since it already accepts an optional initial accumulator value, and if only one of the two were given it would be ambiguous which one was meant). Since the execution context can be any object, we can set thisArg to whatever we want, so for functional methods it's perfect for passing some config common to all the calls of the callback, without wasting memory on nested closures or on creating a bound function:

function filterTest(info)
{
  const year = this.valueOf()

  return info.version.startsWith(`${year}.`)
}

function getUbuntuVersion(year = 19)
{
  return versions.filter(filterTest, year)
}

This also allows passing mapping objects, so instead of doing...

return Object.keys(data).map(function(pid)
{
  const {childs, comm: label} = data[pid]

  return {
    label,
    nodes: pstree2archy(childs)
  }
})

...we can do...

return Object.keys(data).map(pid2archy, data)

function pid2archy(pid)
{
  const {childs, comm: label} = this[pid]

  return {
    label,
    nodes: pstree2archy(childs)
  }
}

...without requiring pid2archy to live in the same closure where data is defined; it can sit in a more global scope, or even be defined in another file, like a library.
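To try the mapping-object version end to end, here is a runnable sketch. The sample process tree and the body of pstree2archy() are assumptions made for illustration; the snippets above only show part of the original function.

```javascript
// Hypothetical process tree keyed by PID; field names follow the snippets above
const data = {
  1: {
    comm: 'init',
    childs: {
      42: {comm: 'bash', childs: {}}
    }
  }
}

function pid2archy(pid) {
  // `this` is the mapping object given as thisArg to map()
  const {childs, comm: label} = this[pid]

  return {
    label,
    nodes: pstree2archy(childs)
  }
}

// Assumed shape of the enclosing function
function pstree2archy(data) {
  return Object.keys(data).map(pid2archy, data)
}

const tree = pstree2archy(data)

console.log(JSON.stringify(tree))
// → [{"label":"init","nodes":[{"label":"bash","nodes":[]}]}]
```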

This has a little drawback: since the execution context has to be an object, in sloppy mode a primitive value passed as thisArg is automatically promoted to an Object instance: a String for strings, a Number for numbers, and so on (a strict-mode callback receives the primitive untouched). That can cause problems, for example when searching for a field using the provided value (the identity comparison === will never match this ad-hoc String object, since it is always a different instance). That's why I'm using valueOf() to recover the original primitive. Another option, to avoid juggling with the conversion to an object and extracting the primitive, is to provide an object ourselves as thisArg and set the value we are passing as a property of it, so we get the original unmodified value:

function filterTest(info)
{
  const {year} = this

  return info.version.startsWith(`${year}.`)
}

function getUbuntuVersion(year = 19)
{
  return versions.filter(filterTest, {year})
}
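The promotion of primitives can be observed directly. This is only a sketch with hypothetical helper names; the first result depends on whether the code runs in sloppy mode (classic scripts, CommonJS modules) or strict mode (ES modules, class bodies).

```javascript
// Hypothetical helpers, only to observe what `this` is inside the callback
function describeThis() {
  // Sloppy mode: `this` is a wrapper object. Strict mode: still the primitive.
  return typeof this
}

function strictDescribeThis() {
  'use strict'

  return typeof this
}

function matchesYear(info) {
  // valueOf() gives back the primitive whether or not `this` was wrapped
  return info.year === this.valueOf()
}

function matchesYearProperty(info) {
  // An object passed as thisArg is never wrapped, so its properties are intact
  return info.year === this.year
}

const kinds = [0].map(describeThis, '19')             // ['object'] in sloppy mode
const strictKinds = [0].map(strictDescribeThis, '19') // ['string']

console.log(kinds, strictKinds)
console.log([{year: '19'}].some(matchesYear, '19'))                 // true
console.log([{year: '19'}].some(matchesYearProperty, {year: '19'})) // true
```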

Passing an object as thisArg also saves us from refactoring if we later need to pass more values to the function. Also, since the object we pass reaches the callback unmodified, we can use it to store an accumulator value and use it in a reduce()-like function.
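A sketch of that accumulator idea, assuming forEach() as the reduce()-like method. countByYear() and countVersionsByYear() are hypothetical names, not part of the original code.

```javascript
const versions = [
  {codename: 'Disco Dingo', version: '19.04'},
  {codename: 'Eoan Ermine', version: '19.10'},
  {codename: 'Focal Fossa', version: '20.04'}
]

// Hypothetical callback: the object passed as thisArg works as the accumulator
function countByYear(info) {
  const year = info.version.split('.')[0]

  this[year] = (this[year] || 0) + 1
}

function countVersionsByYear() {
  const counts = {}

  // The same `counts` object is the execution context of every call
  versions.forEach(countByYear, counts)

  return counts
}

const counts = countVersionsByYear()

console.log(counts['19'], counts['20']) // 2 1
```

The callback lives in the outer scope, and the only per-call allocation is the plain counts object that is returned anyway.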

Just a final disclaimer: in case you actually need to get Ubuntu release info, maybe it would be better to use my package OS lifecycle instead of fetching the info and doing the queries yourself... ;-)

[^1]: I would need to check if this is still true with modern JavaScript engines that use JIT compilers, or if they now do optimizations like code inlining.

[^2]: In fact, this is the root cause of the infamous "undefined is not a function" error, especially when event handlers are defined by devs coming from other OOP languages, who intuitively think that this will point to the object whose method they are registering as the handler. In fact they are registering just a function, detached from its object, and this will be whatever the dispatcher sets when it calls it (Node.js EventEmitter, by design, sets it to the emitter itself). For a plain function call in sloppy mode, that is the default execution context: the global window object (and no, the method you are calling will probably not be there). To make things "easier" to work with, in strict mode (which is always on inside ES6 classes) the default this is undefined instead of the global object, so the engine can give a more helpful error, along the lines of "cannot read properties of undefined".

Written on December 6, 2020
