UPDATE / TLDR;
As it turns out, this is the approach that, under the hood, makes ajv the fastest JSON validator library.
Also - someone took my idea to the next level, and used it to speed up "summing over an object's properties" by over 100x across the browser spectrum - find his jsperf here:
The pink bar represents his "pre-compiled sum" approach which just leaves all the other approaches and operations in the dust.
What is the trick?
Using the pre-compiled sum approach, he used my code to automatically generate this code:
var x = 0;
x += o.a;
x += o.b;
x += o.c;
// ...
which is way faster than this:
var x = 0;
for (var key in o) {
x += o[key];
}
...especially if the order in which we access the properties (a
, b
, c
) matches the order in o
's hidden class.
Long explanation follows:
Faster object property loops
Let me start by saying, for ... in
loops are just fine, and you only want to think of this in performance-critical code with a lot of CPU and RAM usage. Usually, there is more important stuff you should spend your time on. However, if you are a performance-freak, you might be interested in this near-perfect alternative:
Javascript Objects
Generally, there are two use-cases for JS objects:
- "Dictionaries", a.k.a "associative arrays" are general containers with a varying set of properties, indexed by string keys.
- "Objects of constant type" (for which the so-called hidden class is always the same) have a fixed set of properties of fixed order. Yes! - While the standard does not guarantee any order, modern VM implementations all do have a (hidden) order, to speed things up. It will be crucial to always maintain that order, as we explore later.
Using "objects of constant type" instead of "dictionary types" is generally a whole lot faster because the optimizer understands the structure of these objects. If you are curious as to how to achieve that, you might want to check out Vyacheslav Egorov's blog which sheds a whole lot of light on how V8 but also other Javascript run-times, work with objects. Vyacheslav explains Javascript's object property look-up implementation in this blog entry.
Looping over an object's properties
The default for ... in
is certainly an Ok choice to iterate over all properties of objects. However, for ... in
might treat your object as a dictionary with string keys, even if it has a hidden type. In that case, in every iteration you have the overhead of a dictionary lookup, which is often implemented as a hashtable lookup. In many cases, the optimizer is smart enough to avoid that, and performance is on par with constant naming of your properties, but it is simply not guaranteed. Often enough, the optimizer can't help you, and your loop will run a whole lot slower than it should. The worst thing is though that sometimes that is unavoidable, especially if your loop gets more complex. Optimizers are just not that smart (yet!). The following pseudocode describes how for ... in
works in slow mode:
for each key in o: // key is a string!
var value = o._hiddenDictionary.lookup(key); // this is the overhead
doSomethingWith(key, value);
An unrolled, un-optimized for ... in
loop, looping over an object with three properties ['a', 'b', 'c'] of given order, looks like this:
var value = o._hiddenDictionary.lookup('a');
doSomethingWith('a', value);
var value = o._hiddenDictionary.lookup('b');
doSomethingWith('b', value);
var value = o._hiddenDictionary.lookup('c');
doSomethingWith('c', value);
Assuming that you cannot optimize doSomethingWith
, Amdahl's law tells us that you can gain a lot of performance if and only if:
doSomethingWith
is very fast already (when compared to the cost of the dictionary lookup) and
- you can actually get rid of that dictionary lookup overhead.
We can indeed get rid of that lookup using, what I call, a pre-compiled iterator, a dedicated function that iterates over all objects of a fixed type, i.e. a type with a fixed set of properties of fixed order, and performing a specific operation on all of them. That iterator explicitly calls a callback (let's call it doSomethingWith
) on each of your properties by their proper name. As a result, the run-time can always make use of the type's hidden class, without having to rely on promises by the optimizer. The following pseudocode describes how the pre-compiled iterator works for any object with the three properties ['a', 'b', 'c']
in given order:
doSomethingWith('a', o.a)
doSomethingWith('b', o.b)
doSomethingWith('c', o.c)
There is no overhead. We don't need to look anything up. The compiler already can trivially compute the exact memory address of each of the properties, using the hidden type information, and it even uses the most cache-friendly order of iteration. This is also (very very close to) the fastest code you can get with for...in
and a perfect optimizer.
Performance Test
This jsperf shows that the the pre-compiled iterator approach is quite a bit faster than the standard for ... in
loop. Note though that the speed-up largely depends on how the object is created and on the complexity of the loop. Since this test only has very simple loops, you sometimes might not observe much of a speed-up. However, in some of my own tests, I was able to see a 25x speed-up of the pre-compiled iterator; or rather a significant slow-down of the for ... in
loop, because the optimizer was not able to get rid of the string-lookups.
With more tests coming in, we can draw some first conclusions on different optimizer implementations:
- The pre-compiled iterator is generally performing a whole lot better, even in very simple loops.
- In IE, the two approaches show the least variance. Bravo Microsoft for writing a decent iteration optimizer (at least for this particular problem)!
- In Firefox,
for ... in
is the slowest by a huge margin. The iteration optimizer does not do a good job over there.
However, the tests have a very simple loop body. I am still looking for a test case where the optimizer can never achieve constant indexing, across all (or almost all) browsers. Any suggestions are very welcome!
Code
JSFiddle here.
The following compileIterator
function pre-compiles an iterator for any type of (simple) object (disregarding nested properties, for now). The iterator needs a bit of extra information, representing the exact type of all objects it should iterate over. Such type information can generally be represented as an array of string property names, of the exact order, which the declareType
function takes to create a convenient type object. If you want to see a more complete example, refer to the jsperf entry.
//
// Fast object iterators in JavaScript.
//
// ########################################################################
// Type Utilities (define once, then re-use for the life-time of our application)
// ########################################################################
/**
* Compiles and returns the "pre-compiled iterator" for any type of given properties.
*/
var compileIterator = function(typeProperties) {
// pre-compile constant iteration over object properties
var iteratorFunStr = '(function(obj, cb) {
';
for (var i = 0; i < typeProperties.length; ++i) {
// call callback on i'th property, passing key and value
iteratorFunStr += 'cb('' + typeProperties[i] + '', obj.' + typeProperties[i] + ');
';
};
iteratorFunStr += '})';
// actually compile and return the function
return eval(iteratorFunStr);
};
// Construct type-information and iterator for a performance-critical type, from an array of property names
var declareType = function(propertyNamesInOrder) {
var self = {
// "type description": listing all properties, in specific order
propertyNamesInOrder: propertyNamesInOrder,
// compile iterator function for this specific type
forEach: compileIterator(propertyNamesInOrder),
// create new object with given properties of given order, and matching initial values
construct: function(initialValues) {
//var o = { _type: self }; // also store type information?
var o = {};
propertyNamesInOrder.forEach((name) => o[name] = initialValues[name]);
return o;
}
};
return self;
};
And here is how we use it:
// ########################################################################
// Declare any amount of types (once per application run)
// ########################################################################
var MyType = declareType(['a', 'b', 'c']);
// ########################################################################
// Run-time stuff (we might do these things again and again during run-time)
// ########################################################################
// Object `o` (if not overtly tempered with) will always have the same hidden class,
// thereby making life for the optimizer easier:
var o = MyType.construct({a: 1, b: 5, c: 123});
// Sum over all properties of `o`
var x = 0;
MyType.forEach(o, function(key, value) {
// console.log([key, value]);
x += value;
});
console.log(x);
JSFiddle here.