Saturday, July 30, 2011

SPDY Server Push in node-spdy Revisited


I just realized that some of the SPDY server push stuff that I write about in the SPDY Book is not actually available in node-spdy. Today, I would like to rectify that situation.

I had been spiking in the node_modules directory of a test application. That was not the brightest thing to do (I should have either installed from my local copy of node-spdy or sym-linked it). Anyhow, I copy my changes into my local copy of the node-spdy repo and am left with changes to two files:
➜  node-spdy git:(server-push-fixes) ✗ gst
# On branch server-push-fixes
# Changes not staged for commit:
# (use "git add <file>..." to update what will be committed)
# (use "git checkout -- <file>..." to discard changes in working directory)
#
# modified: lib/spdy/push_stream.js
# modified: lib/spdy/response.js
#
no changes added to commit (use "git add" and/or "git commit -a")
The only change to the PushStream class is the last-modified header, which is required (at least by Chrome) for pushing CSS into the browser cache. I had hard-coded that value:
    this._headers["last-modified"] = "Wed, 20 Jul 2011 01:34:27 GMT";
To get that date format (GMT), I cannot simply call toString() on a new Date object. Rather, I have to call toGMTString():
> (new Date).toString();
'Sat Jul 30 2011 15:15:50 GMT-0400 (EDT)'
> (new Date).toGMTString();
'Sat, 30 Jul 2011 19:15:55 GMT'
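As an aside, toGMTString() is the legacy spelling; the ECMAScript specification prefers toUTCString(), which yields the same string. A minimal sketch, in which the helper name httpDate is my own invention and not part of node-spdy:

```javascript
// Hypothetical helper (not part of node-spdy): format a Date the way the
// last-modified header expects.
function httpDate(date) {
  // toUTCString() is the preferred spelling of the older toGMTString()
  return (date || new Date()).toUTCString();
}

// Months are zero-based, so 6 is July
console.log(httpDate(new Date(Date.UTC(2011, 6, 30, 19, 15, 55))));
// → Sat, 30 Jul 2011 19:15:55 GMT
```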
I do not use the actual last-modified date here, because there is no cache invalidation in SPDY server push. SPDY server push pushes directly into browser cache regardless of whether or not the browser already has the files. That is not a huge loss since most push occurs after the original request has been fulfilled. In an HTTP world, SPDY server push is all bonus.

My change to PushStream thus becomes no more than the following change to the constructor:
  this._headers = {
    status: 200,
    version: "http/1.1",
    url: url,
    "last-modified": (new Date).toGMTString()
  };
Most of the actual change takes place in the Response class. It is the Response that needs to initiate push streams and send data out before and after the response proper has been sent. The API that I would like to support is a callback to the standard express.js createServer call:
  push: function(pusher) {
    // Only push in response to the first request
    if (pusher.streamID > 1) return;

    // Trailing slash included, so the paths below must not start with one
    var host = "https://jaynestown.local:3000/";

    // Push immediately with pushFile
    pusher.pushFile("public/stylesheets/style.css", host + "stylesheets/style.css");

    // Push resources that can be deferred until after the response is
    // sent
    pusher.pushLater([
      ["public/one.html", host + "one.html"],
      ["public/two.html", host + "two.html"],
      ["public/three.html", host + "three.html"]
    ]);
  }
In either case, I need to tell the push stream where to find the resource on the filesystem and under what URL to push the response. It is possible to infer the URL from the filesystem location, but I will not worry about that for now.
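For the curious, inferring the URL might look something like the following. This is only a sketch: urlForFile is a hypothetical helper of my own naming, and it assumes that everything under a single static root (public here) is served from the top of the host.

```javascript
var path = require("path");

// Hypothetical helper (not part of node-spdy): map a file under the
// static root to the URL from which it would be served.
function urlForFile(filename, root, host) {
  // Path relative to the static root, e.g. "stylesheets/style.css"
  var relative = path.relative(root, filename);
  // Use forward slashes on every platform and join with a single slash
  return host.replace(/\/+$/, "") + "/" + relative.split(path.sep).join("/");
}

console.log(urlForFile("public/stylesheets/style.css", "public",
                       "https://jaynestown.local:3000/"));
// → https://jaynestown.local:3000/stylesheets/style.css
```

Stripping the trailing slash from the host before joining avoids the double-slash mistake that is easy to make when concatenating by hand.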

In Response, when writing the response data to the browser, I invoke the push callback:

Response.prototype._write = function(data, encoding, fin) {
  if (!this._written) {
    this._flushHead();
    this._push_stream();
  }
  // ...
};
The pushLater() method from the push callback is responsible for sending out the response headers (SPDY headers are separate from data) and for remembering the data to be pushed:
Response.prototype.pushLater = function(resources) {
  var that = this;

  this.deferred_streams = [];

  // Send headers for each post-response server push stream, but DO
  // NOT send data yet
  resources.forEach(function(push_contents) {
    var filename = push_contents[0]
      , url = push_contents[1]
      , data = fs.readFileSync(filename)
      , push_stream = createPushStream(that.cframe, that.c, url);

    push_stream._flushHead();
    push_stream._written = true;
    that.deferred_streams.push([push_stream, data]);
  });
};
Then, after the response is written, the deferred data can be pushed:
  this.c.write(dframe);

  // ...

  // Push any deferred data streams
  this._pushLaterData();
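The body of _pushLaterData() is not shown here, so the following is a guess at its shape rather than the code in my branch. It uses a stand-in FakePushStream (hypothetical, with an assumed end() method) that only records what would be sent, which is enough to show the intended ordering: push headers, then the response, then the deferred push data.

```javascript
// Stand-in for a node-spdy push stream; it only records what is sent.
function FakePushStream(url, log) {
  this.url = url;
  this.log = log;
}
FakePushStream.prototype._flushHead = function() {
  this.log.push("headers " + this.url);
};
FakePushStream.prototype.end = function(data) {
  this.log.push("data " + this.url);
};

// Assumed shape of _pushLaterData: drain the [stream, data] pairs that
// pushLater() queued, once the main response has gone out.
function pushLaterData(deferred_streams) {
  deferred_streams.forEach(function(pair) {
    var push_stream = pair[0]
      , data = pair[1];
    push_stream.end(data);
  });
  deferred_streams.length = 0;
}

var log = [];
var deferred = ["one.html", "two.html"].map(function(url) {
  var stream = new FakePushStream(url, log);
  stream._flushHead();               // headers go out before the response
  return [stream, "<html>...</html>"];
});
log.push("response data");           // the originally requested page
pushLaterData(deferred);             // deferred bodies go out last
console.log(log);
```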
The only "real" change to the overall behavior is the support of deferred push with the addition of the pushLater method. I have also renamed the old push_file to pushFile to better fit JavaScript and node-spdy conventions (I do leave a deprecated push_file() wrapper to retain backwards compatibility).
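A deprecated wrapper of this kind usually amounts to a warning plus a delegation. The sketch below uses a stand-in Response constructor rather than node-spdy's own, and the warning text is made up for illustration:

```javascript
// Stand-in for node-spdy's Response; only the naming pattern matters here.
function Response() {
  this.pushed = [];
}

// New camelCase name
Response.prototype.pushFile = function(filename, url) {
  this.pushed.push([filename, url]);
};

// Deprecated alias: warn, then delegate to the new method
Response.prototype.push_file = function(filename, url) {
  console.warn("push_file() is deprecated; use pushFile() instead");
  return this.pushFile(filename, url);
};

var response = new Response();
response.push_file("public/stylesheets/style.css", "/stylesheets/style.css");
console.log(response.pushed.length); // 1
```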

With that, there is little to do aside from trying it out in the browser. I do it right this time and sym-link my copy of the node-spdy repository into the application's node_modules. Loading the page in the browser and checking the SPDY tab in Chrome's about:net-internals, I see that the reply to the web page request is immediately followed by the CSS being pushed directly into browser cache:
t=1312064818809 [st=124]     SPDY_SESSION_SYN_REPLY  
--> flags = 0
--> connection: keep-alive
content-length: 50360
content-type: text/html
status: 200 OK
version: HTTP/1.1
x-powered-by: Express
--> id = 1
t=1312064818812 [st=127] SPDY_SESSION_PUSHED_SYN_STREAM
--> associated_stream = 1
--> flags = 2
--> last-modified: Sat, 30 Jul 2011 22:26:58 GMT
status: 200
url: https://jaynestown.local:3000/stylesheets/style.css
version: http/1.1
--> id = 2
t=1312064818812 [st=127] SPDY_SESSION_RECV_DATA
--> flags = 0
--> size = 111
--> stream_id = 2
t=1312064818812 [st=127] SPDY_SESSION_RECV_DATA
--> flags = 0
--> size = 0
--> stream_id = 2
Even the data associated with the CSS is pushed into browser cache as evidenced by the SPDY_SESSION_RECV_DATA events with the stream ID (2) of the CSS push stream.

Once node-spdy has sent out the pushFile() resource, it is time to push the pushLater() resources, but only the headers:
t=1312064818814 [st=129]     SPDY_SESSION_PUSHED_SYN_STREAM  
--> associated_stream = 1
--> flags = 2
--> content-type: text/html
last-modified: Sat, 30 Jul 2011 22:26:58 GMT
status: 200
url: https://jaynestown.local:3000/one.html
version: http/1.1
--> id = 4
t=1312064818815 [st=130] SPDY_SESSION_PUSHED_SYN_STREAM
--> associated_stream = 1
--> flags = 2
--> content-type: text/html
last-modified: Sat, 30 Jul 2011 22:26:58 GMT
status: 200
url: https://jaynestown.local:3000/two.html
version: http/1.1
--> id = 6
t=1312064818817 [st=132] SPDY_SESSION_PUSHED_SYN_STREAM
--> associated_stream = 1
--> flags = 2
--> content-type: text/html
last-modified: Sat, 30 Jul 2011 22:26:58 GMT
status: 200
url: https://jaynestown.local:3000/three.html
version: http/1.1
--> id = 8
t=1312064818821 [st=136] SPDY_SESSION_RECV_DATA
--> flags = 0
--> size = 8184
--> stream_id = 1
Once all of the push headers have been sent, node-spdy begins to send the response to the originally requested resource. The stream ID (1) tells us that this is for the original request and not one of the push streams, which all have IDs of 2 or higher.
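The IDs in the log (2, 4, 6, 8) follow the SPDY rule that client-initiated streams get odd IDs and server-initiated streams get even ones. A tiny sketch of that check, where isPushStream is a hypothetical helper and not something node-spdy provides:

```javascript
// SPDY assigns odd stream IDs to client-initiated streams and even IDs
// to server-initiated (pushed) streams.
function isPushStream(streamID) {
  return streamID > 0 && streamID % 2 === 0;
}

// Stream IDs seen in the net-internals log
var ids = [1, 2, 4, 6, 8];
console.log(ids.filter(isPushStream)); // [ 2, 4, 6, 8 ]
```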

Only after all of the response data has been sent out does the data for the deferred push resources begin to go out:
t=1312064818823 [st=138]     SPDY_SESSION_RECV_DATA  
--> flags = 0
--> size = 0
--> stream_id = 1
t=1312064818857 [st=172] SPDY_SESSION_RECV_DATA
--> flags = 0
--> size = 8184
--> stream_id = 4
That continues all the way through stream ID #8, at which point Chrome acknowledges that we have a legitimate push stream via a SPDY_STREAM_ADOPTED_PUSH_STREAM event:
t=1312064818884 [st=199]     SPDY_SESSION_RECV_DATA  
--> flags = 0
--> size = 0
--> stream_id = 8
t=1312064818929 [st=244] SPDY_STREAM_ADOPTED_PUSH_STREAM
Nice! That's a good stopping point for today. I will push my new branch to the node-spdy GitHub repository and discuss it with Fedor Indutny to make sure it aligns with his thinking.

For now, it's back to slogging through the last edits of SPDY Book!


Day #98
