-
-
Notifications
You must be signed in to change notification settings - Fork 238
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding callback to res.write and res.end for streaming support #80
Conversation
Thanks! This looks fine to me. Can you add tests and documentation as well? We cannot accept this pull requests without at least tests fully testing all new code paths. From looking at this, we probably need tests for at least checking that the callback works for .write, it works for .end and maybe that they are called in the correct order. |
Also thinking about the problem described, it almost seems like another (better?) solution would be to actually have the call to res.flush() internally queue up a flush for when all previous writes complete. This may have the added benefit to work more like what people would expect when calling .write() + .flush(). Thoughts? |
Thanks for the fast response! I'd be happy to write tests and documentation, although that might have to wait until tomorrow. As for your suggestion, you're saying it might be better to make it so If the implementation you're envisioning is simple, I say let's go for your recommendation. Otherwise, this PR works just fine. I'm trying to think through how we would determine that all queued |
I was doing some more thinking and it seems to me that providing callbacks is the most flexible solution. Let's say that, for some reason, a customer wants to execute a specific set of code only after a specific write or flush step has completed. If write does not provide a callback then this isn't possible. If flush is blocked by queued write commands but doesn't provide a callback then we've simply moved the race condition. I still like your proposed solution, but it doesn't appear to mitigate the need for callbacks. So we could do only callbacks or we could block flushes on queued writes and allow flush to accept a callback. What do you think? |
Yea, I definitely agree that having the callbacks are useful for certain workflows. I guess I was mostly thinking about your original report that calling So perhaps from your description plus your proposal here gives us two separate tasks (two different pull requests):
Adding the callbacks is useful, absolutely, but not really a change that helps with the |
Good points, lets fix it right! I'd be happy to help because I would really benefit from this in some production applications as well as in my open-source work referenced above. Does the rough algorithm I outlined above seem reasonable, or do you have a better suggestion? I'm happy to code this if we can agree on an algorithm ahead of time. |
I apologize for being a badger on this, but I was hoping we could discuss implementation so I know how you prefer I code this. Or, if you're coding it, please let me know and I'll stop bugging you :) |
Hey, sorry, just been busy catching up after a long vacation :) I'm not coding it currently, no, so there would be no duplicate effort if you were going to do it. As for the algorithm, I can't really say for sure without digging into the code, but it roughly sounds fine to me :) |
OK thanks, Doug! I'll get on it and hopefully have a proposition (without unit tests and updated docs at first) in 1-3 days. |
Doug, I've taken a first stab at making calls to What do you think? |
Hi @jpodwys, ignoring any roughness of the code, it does seem pretty different than your initial idea of just doing counting, and seems like it's probably a lot more complex than necessary. I would think you could just track when something was written to the zlib stream and then when that stream drained out. Knowing that state, you could then handle a flush by either just passing it though if zlib stream has been drained, or queue up all future writes to the zlib stream until it drains, then flush + write all and start the loop again. This would probably make it easier to implement back-pressure semantics, which the current WIP is lacking (and seems like it would be pretty difficult to add in). |
I'm not having the time to implement the complete race condition fix we've outlined here like I thought I would. Are you willing to merge the addition of callbacks so users can get around the async issue in the mean time? |
Hi @jpodwys, I certainly can! This just brings us back to the comments in #80 (comment) to address here :) |
I've rebased and updated the unit tests and readme. |
test/compression.js
Outdated
.get('/') | ||
.set('Accept-Encoding', 'gzip') | ||
.expect('Content-Encoding', 'gzip') | ||
.end(function(){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first argument to this callback is an err
and the code here is not handling the error. Please add error handling.
Awesome, @jpodwys! Is there a reason we have to change the example? I thought it was agreed to be a bug and that it needs to be fixed, just not in this PR. As such, unless we are going to put the word out there is a breaking change everyone needs to make to use the callback, I think the readme should not be changed at this time. Also, it looks like you still have some outstanding tasks from #80 (comment), namely:
I don't see any test verifying the callback order, or even all the code paths being covered. An example of one of the un-covered code path is there is no tests that the callbacks are functioning when the response is being compressed (i.e. the callbacks are only tests in half of those ternaries). |
test/compression.js
Outdated
res.setHeader('Content-Type', 'text/plain') | ||
res.write('Hello', null, function(){ | ||
callbacks++ | ||
res.flush() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the test is labeled "when callbacks are used should call the passed callbacks", it doesn't sound like the test should be testing .flush()
functionality. Is there a reason to call .flush()
in theses callbacks? If it is testing something, should that be a different test, or can it be reflected in the name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I'll remove it. I was simply including .flush()
for completeness since nothing makes it to the client without that call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha. So perhaps that would call for two different tests? Testing the callback when not flushed vs testing with flushed? DO you think that would be a meaningful test?
I prefer the first option. Doing the below, I'm able to determine whether the available cb = (res._write.length === 3) ? cb : noop You'll notice that in my latest commit, I attached This is getting a little strange--I've rarely seen assumptions based on argument list lengths. Any feedback is welcome. But at least there's progress--the tests now pass in all configured node versions. |
index.js
Outdated
|
||
var _end = res.end | ||
res._end = res.end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You cannot store the references as res._end and _write. There is no code where that check nothing is being overwritten, for example. Using this module twice will no longer function, etc.
Doing this causes too many bugs to be able to accept. Why did you have to change this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right that doing this isn't going to work, but I documented why I did it in the comment just above this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'll notice that in my latest commit, I attached _write and _end to res . I did this so that I can determine from the unit tests whether the pre-existing .write() and .end() calls accept a callback parameter.
Ah, sorry, I missed that. Unfortunately that is not an acceptable solution, due to the bugs it is causing from actual real-world apps I dropped your PR into in the last few hours. There is other code from npm that has written to those variables and responses are hanging or causing a stack overflow now.
You will need you use a different method for that check in the tests, not compromise the self-contained nature of this module. For example, add a middleware prior to this on in the tests and check against that.
In fact, the necessity to do this seems like a critical flaw to this functionality, for example, another module overwrites the moths with the three argument form, and just passes them upstream. This will now erroneously think callbacks are supported and we end up in the same buggy situation we set out to solve. It seems the method of using function length is just too fragile and I encourage you to come up with a different solution.
It seems that the solution you proposed where we detect whether the upstream stream accepts callbacks is not reasonable. Essentially, I'd have to accept a black box function that could be any number of layers abstracted from the original Another solution you proposed is to implement support for calling the callbacks at the correct time even if the upstream stream does not support callbacks. I would prefer to only implement it when it's not already natively supported, but because I can't tell whether it's supported, I would have to implement it 100% of the time thereby bypassing native support for it in newer versions of node. In order to accomplish this, I would need to either prototype into This brings us back to the idea of forcing |
So all the proposals so far were just my initial thoughts, off the top of my head kinds of thoughts. Without having the time to actually sit down to look into this new feature, that's pretty much the best I can do. I'm leaving it up to you to bring a solution to the table until I can get time to work on this (since this is an enhancement, not a bug fix, it falls at least in my enhancement queue (you can publicly view this queue at https://github.com/pulls?q=is%3Aopen+assignee%3Adougwilson+label%3Aenhancement+sort%3Acreated-asc). I can take some time out to consult on your feature, and comment on the implementation, issues, and provide possible solutions, but I just don't know the solution off the top of my head. This may bring us back around to #80 (comment) on if you are actually trying to fix a bug, perhaps we should focus our efforts on trying to fix the bug vs trying to add a new feature that would enable you to work-around the bug? I'm not sure, it's up to you on the best approach you want to take. |
In my latest attempt, I'm looking at Some changes you requested earlier to the unit test error handling still need to happen, but please let me know what you think of this approach so I know how to proceed. |
Perhaps this latest commit still fails this test though:
|
Yea, I mentioned not testing for Node.js versions, because I have been burned multiple times trying to do this (and the io.js team stressed only doing feature detection, and they are now the Node.js team), especially with people these days running these modules on non-Node.js runtimes, for better or worse. Currently there is a big push to get Express to work better with non-Node.js-core HTTP servers, while this PR is now in direct conflict with a core Express.js directive (this module falls under the Another issue from jumping up and checking the "root write" is that it also glosses over people trying to use other middleware in their Express.js application. There are many popular middleware that overwrite |
Noticed the link to the other issue here. As a note, I'm still trying to get synchronous executing working. I believe I've simplified my original algorithm quite a bit. Hopefully I'll have a new approach within a few days. |
@dougwilson I've finally come up with an insanely simple way to make I'd like to know what you think of the code changes (+6 and -1 lines). It relies on the fact that I've tested it in the application that originally prompted me to open this PR and it works like a charm. |
Hi @jpodwys, it seems fine. I would like to understand why the test doesn't pass on 0.8 (to determine if it's actually an issue to be resolved) and also see an issue I would like to discuss. I don't want to diverge from this pull request which attempts to add callback support for |
OK I've moved it to #84 but it has all the commit history from this one with it. |
@jpodwys I was having the same problem you were but found a work around.
Basically what this does is it attempts to write the content. If stream needs to drain then it will wait for the drain and then flush. If the write was successful it will just flush and move on. |
d7bb81b
to
cd957aa
Compare
@whitingj sorry for the extremely late response, I just stumbled onto this PR again. Thanks for the note! |
I want to be able to use
compression
to enable GZIP on streamed responses that flush multiple chunks to the browser whenever the implementer wants. (This is especially useful when implementing a BigPipe algorithm such as with myexpress-stream
library.) As a result, I need a sure way of flushing only after a chunk of output has been zipped.Currently, the following code encounters a race condition because
res.write
is not blocking. As a result, it's possible thatres.flush
will execute beforeres.write
completes.I've confirmed the above by ensuring this works as expected:
What I'd prefer to do, and what this PR enables, is the following: