feat: Improve performance of Uint8Array Hex functions#1510
feat: Improve performance of Uint8Array Hex functions#1510zloirock merged 15 commits intozloirock:masterfrom
Conversation
|
My older benchmarks used buggy implementations. The newer benchmarks comparing the 3 approaches is posted below: IE11 benchmarks for from-hexVariant 0(Original polyfill) function uncurryThis(fn) {
return function() {
return Function.prototype.call.apply(fn, arguments);
};
}
var min = Math.min;
var NOT_HEX = /[^\da-f]/i;
var exec = uncurryThis(NOT_HEX.exec);
var stringSlice = uncurryThis(''.slice);
exp = function (string, into) {
var stringLength = string.length;
if (stringLength % 2 !== 0) throw new SyntaxError('String should be an even number of characters');
var maxLength = into ? min(into.length, stringLength / 2) : stringLength / 2;
var bytes = into || new Uint8Array(maxLength);
var read = 0;
var written = 0;
while (written < maxLength) {
var hexits = stringSlice(string, read, read += 2);
if (exec(NOT_HEX, hexits)) throw new SyntaxError('String should only contain hex characters');
bytes[written++] = parseInt(hexits, 16);
}
return { bytes: bytes, read: read };
};
var testString = "deadbeefcafebabedeadbeefcafebabedeadbeefcafebabedeadbeefcafebabedeadbeefcafebabedeadbeefcafebabedeadbeefcafebabedeadbeefcafebabedeadbeefcafebabe";
var iterations = 10000;
console.time("hexToBytes benchmark");
for (var i = 0; i < iterations; i++) {
exp(testString);
}
console.timeEnd("hexToBytes benchmark");Results: Variant 1(Using exec with each segment) function uncurryThis(fn) {
return function() {
return Function.prototype.call.apply(fn, arguments);
};
}
var min = Math.min;
var NOT_HEX = /[^\da-f]/i;
var exec = uncurryThis(NOT_HEX.exec);
var stringSlice = uncurryThis(''.slice);
exp = function (string, into) {
var stringLength = string.length;
if (stringLength & 1) throw new SyntaxError('String should be an even number of characters');
var maxLength = (into && into.length < (stringLength >> 1) ? into.length : (stringLength >> 1));
var bytes = into || new Uint8Array(maxLength);
var segments = stringMatch(string, /.{2}/g);
for (var written = 0; written < maxLength; written++) {
if (exec(NOT_HEX, segments[written])) {throw new SyntaxError("String!!!")};
var result = parseInt(segments[written], 16);
bytes[written] = result;
}
return { bytes: bytes, read: written << 1 };
};
var testString = "deadbeefcafebabedeadbeefcafebabedeadbeefcafebabedeadbeefcafebabedeadbeefcafebabedeadbeefcafebabedeadbeefcafebabedeadbeefcafebabedeadbeefcafebabe";
var iterations = 10000;
console.time("hexToBytes benchmark");
for (var i = 0; i < iterations; i++) {
exp(testString);
}
console.timeEnd("hexToBytes benchmark");Result: Variant 2(used) var $Number = Number;
var $isNaN = $Number.isNaN;
var stringMatch = uncurryThis(''.match);
exp = function (string, into) {
var stringLength = string.length;
if (stringLength & 1) throw new SyntaxError('String should be an even number of characters');
var maxLength = (into && into.length < (stringLength >> 1) ? into.length : (stringLength >> 1));
var bytes = into || new Uint8Array(maxLength);
var segments = stringMatch(string, /.{2}/g);
for (var written = 0; written < maxLength; written++) {
var result = $Number("0x" + segments[written]);
if ($isNaN(result) && result.trim() === result) throw new SyntaxError('String should only contain hex characters');
bytes[written] = result;
}
return { bytes: bytes, read: written << 1 };
};
var testString = "deadbeefcafebabedeadbeefcafebabedeadbeefcafebabedeadbeefcafebabedeadbeefcafebabedeadbeefcafebabedeadbeefcafebabedeadbeefcafebabedeadbeefcafebabe";
var iterations = 10000;
console.time("hexToBytes benchmark");
for (var i = 0; i < iterations; i++) {
exp(testString);
}
console.timeEnd("hexToBytes benchmark");Result: |
|
@zloirock This is ready for you take a look at. I'll leave the base64 aspects to another PR. |
|
@zloirock It's been a few days -- if you've got a moment, could you please take a look at this? Thanks. |
| var stringSlice = uncurryThis(''.slice); | ||
| var $Number = globalThis.Number; | ||
| var $isNaN = $Number.isNaN; | ||
| var stringMatch = uncurryThis(''.match); |
There was a problem hiding this comment.
Use I think here it's OK. But I'm not sure that it's good for performance..exec if it's possible, since, unlike .match, it does not have significant side effects.
There was a problem hiding this comment.
It is very good for performance -- with one of the benchmarks for IE11 in my original post (the last few are wrong implementations...) it speeds up a few times than slicing every time.
|
@zloirock Done with requested changes. |
|
@zloirock It's been a few days, when you've got time after you're done fixing various bugs, can you take a look at this again? Thanks. |
| // 2x faster than naively using a regex to check each hexit. Number constructor | ||
| // is maximally strict, except for whitespace which it ignores, so special-case | ||
| // this. | ||
| var result = $Number('0x' + segments[written] + '0'); |
There was a problem hiding this comment.
Since Number constructor is polyfilled before this module, it definitely can't be fast.
| var bytes = into || new Uint8Array(maxLength); | ||
| var read = 0; | ||
| // This splitting is faster than using substrings each time. | ||
| var segments = stringMatch(string, /.{2}/g); |
There was a problem hiding this comment.
If you really wanna make it fast, you could just iterate by chars and check char codes.
There was a problem hiding this comment.
@zloirock tried that, actually benchmarked slower.
There was a problem hiding this comment.
@johnzhou721 it can't be slower. In any engine. It can be bad tests.
|
Thanks. |
|
@zloirock Thanks for the reviews, and yes you're right about the bad testing... i got myself confused here with all the different parsing options I was testing. Also sorry for the repeated pings... I wanted to make sure nothing was missed. I see you pushing a lot to the main branch but being less responsive on issues/prs -- were you just really busy with the long list of bugs you have at hand, and only check periodically? If so, I'll respect that. |
Refs #1503 for the source of inspiration to do this. This PR significantly improves the performance of Uint8array hex functions.
Benchmarks / Console logs on IE11 for from hex (including final performance of from-hex functions in addition to all the moments used for the decisions of what do put)