Mitigating Catastrophic Backtracking in Node.js Regular Expressions

Recently I’ve been learning a bit about Node.js
I have a guilty pleasure of really enjoying writing Regular Expressions, I think it stems from some anti-spam work I did a while back.

The issue of catastrophic backtracking is a plague on regex, and can be a pain to test as it can only happen with certain input that may not be covered during testing.

Libraries like node-rsa on npm have had issues where a catastrophic backtrack can cause a full denial of service (DOS) of the node process. My angle of interest is from the information security side, where a malicious adversary may craft a payload such as this with the intent to actually DOS your server-side JavaScript.

As the RegExp constructor in JavaScript doesn’t have a timeout feature (which it should!) I thought of a trick using Node.js’s core vm module. (available in v0.12.x, v4.x and v5.x branches)

const util = require('util');
const vm = require('vm');
 
var sandbox = {
    result: null
};
 
var context = vm.createContext(sandbox);
console.log('Sandbox initialized: ' + vm.isContext(sandbox));
 
var script = new vm.Script('result = /^(A+)*B/
    .test(\'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAC\');');
 
try{
    // One could argue if a RegExp hasn't processed in a given time.
    // then, its likely it will take exponential time.
    script.runInContext(context, { timeout: '1000' }); // milliseconds
} catch(e){
    console.log('ReDos occurred'); // Take some remedial action here...
}
 
console.log(util.inspect(sandbox)); // Check the results

Ideally using JavaScript’s protatypal inheritance one would overload the RegExp() constructor to include a timeout to tidy away all this code. I didn’t go to those lengths here, but perhaps I would in future.