[jQuery] jQuery selectors speed improvements

[jQuery] jQuery selectors speed improvements


I've recognized that lots of people recently complained about the
jQuery selector speeds. Especially because Base2 and DomQuery seem
to be ultra-fast in this area nowadays. As a man with parsing
backgrounds, I've looked at jQuery's current (version in SVN as of
today) "selector.js", noticed a few points and spontaneously hacked a
little bit on it.
In general IMHO the code does too much (in case they are invariants)
or even unneccessary (in case they are not used) calculations under
run-time of jQuery.find(). Those calculations can be either completely
eliminated or at least moved out of the run-time loops. After about 2
hours of surgical hacking in "selector.js" this morning I at least got
the jQuery test suite already running in about 10% less time:
Run# Before After
1 7365 6543
2 7417 6631
3 7343 6931
4 7504 6788
5 7544 6720
6 7551 6781
7 7574 6805
8 7616 7024
9 7415 6920
10 7523 6906
==== ====
AVERAGE: 7485 6804 (milliseconds)
But let's forget those benchmarks numbers here. Any benchmark numbers
should just give an optimization trend and not treated like really
comparable run-times.
But as the test suite does also lots of non-selector based tests, the
bare speed increase in the selector engine might be even a little bit
larger. But hey, the 10% are already fine enough compared to the fact
that I've still not done heavy optimizations and instead just focused
on some eye-catching peep-hole optimizations to keep the code diff as
small as possible. So, there are certainly lots of possibilities to make
jQuery faster (again).
What I've done was just this:
- FLEXIBILITY: the sub-regex " *" in general should be replaced with
"\s*" as "\s" also allows the other whitespace characters \f, \n, \r,
\t, \v, \u00A0, \u2028, \u2029, etc.
- SPEED: as long as the order is not important (which is the case
here), constructs like "'?"?" can be replaced with the (usually faster)
character set based construct "['"]?".
- SPEED: unnecessary capturing parenthesis can be removed to
speed up processing. Or they at least can be replaced with the
non-capturing/grouping-only constructs "(?:...)".
- SPEED: unnecessary re-parsing of regular expressions under run-time
through the RegExp() constructor can be replaced with single load-time
parsings by moving the RegExp() constructors at least outside the
run-time function calls.
- SPEED: regex engine based constructs like
m = re.exec(); [...]; t = t.replace(re, "");
can be CAREFULLY replaced with the faster
m = re.exec(); [...]; t = t.substring(m[0].length);
as long as (AND ONLY IF THIS IS THE CASE) "re" is anchored to the
beginning with "^".
You see, nothing spectacular, just the usual amount of small peep-hole
optimizations one can make without changing the code a lot. But it
already helped. If more would be investigated on "selector.js" I think
we can achieve even more speed increase for jQuery because there are
more small invariants or other calculations which could be at least
moved out of the run-time loops or perhaps even replaced with some
data structure based lookups, etc. But at least you get the general
idea: let us do as less calculations under run-time as possible and if
calculations are needed try to do them just once.
Find my patch (a diff against jQuery SVN as of today) appended below.
If you want take it as a seed for your optimizations...
Ralf S. Engelschall
rse@engelschall.com
www.engelschall.com
Index: src/selector/selector.js
===================================================================
--- src/selector/selector.js    (revision 1613)
+++ src/selector/selector.js    (working copy)
@@ -67,24 +67,28 @@
    // The regular expressions that power the parsing engine
    parse: [
        // Match: [@value='test'], [@foo]
-        /^\[ *(@)([\w-]+) *([!*$^=]*) *('?"?)(.*?)\4 *\]/,
+        /^\[\s*(@)([\w-]+)\s*([!*$^=]*)\s*(['"]?)(.*?)\4\s*\]/,
        // Match: [div], [div p]
        /^(\[)\s*(.*?(\[.*?\])?[^[]*?)\s*\]/,
        // Match: :contains('foo')
-        /^(:)([\w-]+)\("?'?(.*?(\(.*?\))?[^(]*?)"?'?\)/,
+        /^(:)([\w-]+)\(["']?(.*?(\(.*?\))?[^(]*?)["']?\)/,
        // Match: :even, :last-chlid, #id, .class
-        new RegExp("^([:.#]*)(" +
-            ( jQuery.chars = "(?:[\\w\u0128-\uFFFF*-]|\\\\.)" ) + "+)")
+        new RegExp("^([:.#]?)((?:[\\w\u0128-\uFFFF*-]|\\\\.)+)")
    ],
+    parse2: {
+        "id1": new RegExp("^(\\w+)(#)((?:[\\w\u0128-\uFFFF*-]|\\\\.)+)"),
+        "id2": new RegExp("^([#.]?)((?:[\\w\u0128-\uFFFF*-]|\\\\.)*)")
+    },
+
    token: [
-        /^(\/?\.\.)/, "a.parentNode",
-        /^(>|\/)/, "jQuery.sibling(a.firstChild)",
-        /^(\+)/, "jQuery.nth(a,2,'nextSibling')",
-        /^(~)/, function(a){
+        /^\/?\.\./, "a.parentNode",
+        /^(?:>|\/)/, "jQuery.sibling(a.firstChild)",
+        /^\+/, "jQuery.nth(a,2,'nextSibling')",
+        /^~/, function(a){
            var s = jQuery.sibling(a.parentNode.firstChild);
            return s.slice(jQuery.inArray(a,s) + 1);
        }
@@ -160,7 +164,7 @@
                            r.push( c );
                ret = r;
-                t = t.replace( re, "" );
+                t = t.substring( m[0].length );
                if ( t.indexOf(" ") == 0 ) continue;
                foundToken = true;
            } else {
@@ -178,7 +182,7 @@
                            fn : new Function( "a", "return " + fn ) );
                        // And remove the token
-                        t = jQuery.trim( t.replace( re, "" ) );
+                        t = jQuery.trim( t.substring( m[0].length ) );
                        foundToken = true;
                        break;
                    }
@@ -203,8 +207,8 @@
                    t = " " + t.substr(1,t.length);
                } else {
-                    // Optomize for the case nodeName#idName
-                    var re2 = new RegExp("^(\\w+)(#)(" + jQuery.chars + "+)");
+                    // Optimize for the case nodeName#idName
+                    var re2 = jQuery.parse2["id1"];
                    var m = re2.exec(t);
                    // Re-organize the results, so that they're consistent
@@ -214,7 +218,7 @@
                    } else {
                        // Otherwise, do a traditional filter check for
                        // ID, class, and element selectors
-                        re2 = new RegExp("^([#.]?)(" + jQuery.chars + "*)");
+                        re2 = jQuery.parse2["id2"];
                        m = re2.exec(t);
                    }
@@ -269,7 +273,7 @@
                        ret = r;
                    }
-                    t = t.replace( re2, "" );
+                    t = t.substring( m[0].length );
                }
            }