You mock, but I can imagine a perfectly reasonable implementation using one of those top-of-document coloured bars. It'd be better for the user to see a short, non-modal, notice along the lines of "This document may be truncated as less information was received from the server than was initially promised" rather than reading down to the cut off point in some HTML and wondering where the rest of it is.
In fact, I think there are a wide variety of places where browsers are uniformly bad at helpfully surfacing information on problems to the user because, you know, error handling isn't sexy.
Posted by anon at April 30, 2009 3:20 AM
"Thank God"?
Tell that to those who do a "save as" and get a corrupted download, because it was truncated but the UA didn't tell them.
Posted by Julian Reschke at April 30, 2009 3:43 AM
What's wrong with putting a "downloaded %d of %d bytes" in the statusbar if the page load stalls for a second, and a warning icon if it times out or downloads too much? No need for something idiotic like a modal box.
Posted by ant at April 30, 2009 6:33 AM
The neat thing is that with a fully compliant HTTP 1.1 server (with connection keepalives and pipelining) the browser can't actually know - if there's less data than in the header the browser will just hang, and if there's more the browser will ignore it. Of course, then the subsequent requests get stuffed up, which is why pipelining sucks in practice...
The HTTP spec is better than the FTP spec, though. RFC959 has the 'LIST' command, which says: "Since the information on a file may vary widely from system to system, this information may be hard to use automatically in a program, but may be quite useful to a human user." There's the NLST command, but that only gives you names. Yay specifications!
Posted by Bradley Baetz at April 30, 2009 7:22 AM
anon, the user doesn't care whether the HTML got cut off before or after the HTTP server computed the Content-Length. They just care that it was cut off. And it's not that error handling isn't sexy, it's that users (myself included, when I have that hat on) don't care precisely why the server broke. They just want their stuff. A diagnostic tool is a different matter, but a MUST applies to all implementations equally, in theory.
Julian, there's nothing wrong with notifying the user on Save As (and some UAs do precisely that if they detect a connection drop). But the MUST means it has to be done for all HTTP requests. And of course the MUST requires reporting if the Content-Length is too small (actually more common, from my testing, than it being too big, and usually due to it just being computed wrong on the server).
ant, you seem to be assuming that the only HTTP request that matters is "the page". A typical "page" results in dozens, if not hundreds of HTTP requests.
Posted by Boris at April 30, 2009 11:09 AM
Clearly, the solution is to avoid "detecting" invalid lengths. If we don't detect it, we don't have to notify the user.
Posted by Jesse Ruderman at April 30, 2009 1:40 PM
It's a definite possibility, Jesse, but there's a problem with that solution, too. If an invalid length is not detected because no detection is being done, we have a problem. For example, a PHP page could send a custom Content-Length header, cutting things off (possibly for testing for compliance with the HTTP 1.1 spec), resulting in either a page with only a fraction of the necessary content or the UA hanging (unless the UA is smart enough to notice that no more data is being sent by the server, in which case a page timeout error would occur). In either case, there are issues.
However, is it really possible to detect this sort of thing in the case of the Content-Length header being greater than the actual length of the data? A UA would want to keep requesting data, but at the same time it wouldn't want to kill the server just to get the data (especially if the data doesn't exist). In the case of content_length > content_length_header, it is a simple case of requesting an extra few bytes of data to determine if the value of the Content-Length header was too small in comparison to the data of the page.
I'd say detecting invalid lengths is a bit more trivial than many of us might be led to believe. Of course, that doesn't mean it's a UA's job to do so. After all, who said anything about a UA being designed with courtesy in mind? ^_^
Posted by Dustin at April 30, 2009 5:36 PM
> However, is it really possible to detect this
> sort of thing in the case of the Content-Length
> header being greater than the actual length of
> the data?
Yes, of course, if you get a Connection: close. On a pipelined connection, it's impossible, as Bradley pointed out.
I have no idea what you're talking about wrt "keep requesting data". The UA makes one and only one HTTP request per resource, typically (modulo redirects and such).
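To make the Connection: close case above concrete, here is a minimal Python sketch. It is not taken from any browser; the function name `classify_body` and its result labels are invented purely for illustration. It assumes `received` holds every body byte read so far for one response, and that the caller knows whether the server has closed the connection.

```python
from typing import Optional


def classify_body(declared_length: Optional[int],
                  received: bytes,
                  connection_closed: bool) -> str:
    """Compare a declared Content-Length with the bytes actually received.

    Hypothetical helper for illustration only; the labels are made up.
    """
    if declared_length is None:
        # No Content-Length header, so there is nothing to compare against.
        return "unknown"

    if not connection_closed:
        if len(received) < declared_length:
            # Persistent connection still open: more data may yet arrive,
            # so the client cannot tell a slow server from a wrong header.
            return "undetermined"
        # Any surplus bytes would be parsed as the start of the next
        # response, so this response on its own looks fine.
        return "ok"

    # The server closed the connection, so the byte count is final.
    if len(received) < declared_length:
        return "truncated"
    if len(received) > declared_length:
        return "too-long"
    return "ok"
```

For example, `classify_body(100, b"x" * 60, True)` returns `"truncated"`, while the same 60 bytes on a still-open connection return `"undetermined"`, which is the pipelining problem Bradley described.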
When I refer to "keep requesting data", I'm referring to moments during which data transfer is interrupted due to connection issues, in which case a UA might initiate another request to read the rest of the data.
After all, if a UA requested 100 KB (100 KO) of data for example, would it read all of it at once or would it read it in chunks? Obviously if the connection is terminated abruptly, it can't be completely read in chunks. As a result, the data must be requested again.
That is what I meant when I wrote "keep requesting data".
Posted by Dustin at May 1, 2009 6:28 AM
Dustin, you have a pretty weird idea of how HTTP works... With the exception of Range GET requests, a UA doesn't request "100KB" of data. It sends one and only one request; the server sends a response. Period. Initiating another request is explicitly forbidden in many cases (e.g. POST requests), and highly undesirable in all cases.
Posted by Boris at May 1, 2009 9:30 AM
I believe we're looking at it from two different points of view, Boris. I'm looking at it from an implementation perspective, and you are looking strictly at the specification. The specification states that persistent connections are a part of the default behaviour for HTTP/1.1 in contrast to earlier versions of HTTP, and once a close has been signalled, the client must not send any more requests [1]. However, if the client is not signalled with a close due to an abnormal connection termination, would it not be the client's job to attempt to finish receiving the data, assuming that the client's time-out period was not reached?
[1] - http://tools.ietf.org/html/rfc2616#section-8.1.2
Posted by Dustin at May 2, 2009 12:22 AM
My initial reaction was: 'Surely it would be nice if the browser let the user know that the server is buggy. Maybe more server bugs would be fixed then.'
But after reading the comments I got the impression that the spec could be impossible to implement in practice. So which one is it? Fix the server or fix the spec? Or both?
Dustin,
> However, if the client is not signalled with a
> close due to an abnormal connection termination
> would it not be the client's job to attempt to
> finish receiving the data
You mean make more requests? No. That would in fact be a violation of the spec.
Mike,
> Surely it would be nice if the browser let the
> user know that the server is buggy
Could be! That doesn't mean the spec has to REQUIRE the browser to let the user know the server is buggy, which is the current state of things. And yes, in many cases the browser can't tell, but the spec doesn't require it to do anything there.
Posted by Boris at May 3, 2009 12:26 PM