April Fool's Day happened over the weekend, which means no April Fool's Day post. I mean, I could have, but instead, I thought it would be fun to look into something that, once upon a time, wasn't a WTF, but honestly, always should have been.
Our story starts with Jake, a relatively modern web developer, who was helping another developer, Jane, build a site-scraper for a local government's site. See, the government site couldn't be bothered with anything like RSS, so if you wanted a list of upcoming events, notifications, or other information, your only option was to load the page manually, or whip up a site-scraper that automated extracting that information.
The problem was, Jane was finding some absolute nonsense in the HTML of the page, and it was incredibly difficult to parse out the information they were looking for. "What? How hard could this possibly be?" Jake wondered. He opened the site, viewed source, and was greeted with this:
<form method="post" action="NewsChannelSummary.aspx?NRMODE=Published&NRNODEGUID=%7b50A895D1-42F3-4442-BAD6-FD12D9CE38F7%7d&NRORIGINALURL=%2fyourcouncil%2fnews%2fnewsreleases&NRCACHEHINT=Guest" id="Form1">
<div><input type="hidden" name="__VIEWSTATE" value="<several kilobytes of base64 encoded data redacted>">
… page content here
</form>
Now, Jake was flummoxed by this. It just seemed like the absolute wrong way to build web pages. Clearly, this __VIEWSTATE field contained the data, encoded somehow, but why was it done this way? Why not use JSON or just do something sane with it? Anything but this had to be better.
Well, Jake had fortunately made it to this point in his career without having to deal with Microsoft's WebForms.
We've discussed them before: WebForms was Microsoft's attempt to make designing a web page like designing a client-side form, complete with a drag-and-drop WYSIWYG designer, and even event handlers. Drop a button on a form, double click on the button, and the IDE would take you to the click event handler, where you could write your .NET code to respond to the event. The .NET code ran server side, which meant that when you clicked a button, the browser would send an HTTP request up to the server, and the server would then execute an entire page lifecycle, firing a series of events like Init, Load, and, because you clicked a button, a Click event. If any textboxes were changed, appropriate Change events would fire too. At the end of this lifecycle, .NET would generate an HTML document and send it back down to the client. These controls which could fire server-side events were known as "Server Controls". There were also "Client Controls": just regular HTML elements, which had no special representation on the server side. They were just form fields that'd be part of any POST requests.
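If you never had the pleasure, here's a toy model of that postback cycle in Python. To be clear, this is a sketch of the idea and not the ASP.NET API: the Page class, the event_target form key, and the btnSave handler are all made up for illustration.

```python
class Page:
    """Hypothetical stand-in for a server-side page object."""

    def __init__(self):
        self.fired = []           # names of events, in the order they ran
        self.click_handlers = {}  # control id -> callable

    def handle_post(self, form):
        # Every request replays the whole lifecycle from the top.
        self.fired.append("Init")
        self.fired.append("Load")
        # The posted form says which control caused the postback.
        target = form.get("event_target")
        handler = self.click_handlers.get(target)
        if handler is not None:
            self.fired.append("Click:" + target)
            handler(form)
        # Finally, a complete HTML document goes back to the browser.
        return "<html><body>" + ", ".join(self.fired) + "</body></html>"


def save_clicked(form):
    # Stands in for the code written after double-clicking the button.
    print("saving", form.get("title"))


page = Page()
page.click_handlers["btnSave"] = save_clicked
html_out = page.handle_post({"event_target": "btnSave", "title": "Council news"})
```

The thing to notice is that one button click costs a full request, a full run through every lifecycle event, and a full page of HTML on the way back.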
And, because a whole page reload was kind of expensive, Microsoft also included some extensions in Internet Explorer which would allow IE to update the rendered display only after the entire response had been loaded and pre-rendered to a back buffer, meaning the user wouldn't see a page load at all, just a seemingly "instant" response.
It was terrible, fragile, and the mismatch between desktop UI metaphors and the realities of web-based applications made it an absolute nightmare to maintain. But there was an additional "feature" here, which is our WTF above. The ViewState object.
HTTP is stateless. You can store data in a session object, on the server, but that's expensive, especially if you have a lot of simultaneous users. For small amounts of session data, ASP.Net offered a different option: ViewState. On the server-side, it was just a dictionary that you could stuff objects into. That data would then get base64 encoded and dropped into a hidden form field on the page, which is what you see above. When the user submitted the page (by triggering a server-side event), that ViewState would get sent back to the server, to be deserialized back into a dictionary.
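The round trip is easier to see in code. Here's a minimal Python sketch of the same idea, with one big assumption: it uses JSON as the serializer, where ASP.NET used its own serialization format. The function names and the sample dictionary are invented for the example.

```python
import base64
import html
import json


def render_viewstate(state):
    """Serialize a dict and wrap it in a hidden form field."""
    raw = json.dumps(state).encode("utf-8")  # JSON is a stand-in serializer
    encoded = base64.b64encode(raw).decode("ascii")
    return '<input type="hidden" name="__VIEWSTATE" value="%s">' % html.escape(encoded)


def load_viewstate(posted_value):
    """Decode the value the browser posted back into a dict."""
    return json.loads(base64.b64decode(posted_value).decode("utf-8"))


state = {"page_index": 3, "sort_column": "date"}
field = render_viewstate(state)
# The browser echoes the value attribute back unchanged on the next POST.
posted = field.split('value="')[1].split('"')[0]
restored = load_viewstate(posted)
```

The server never has to remember anything between requests, because the client carries the state back and forth on every single one. That is also exactly why the field gets so large.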
You, as the developer, could decide what to put in there, and thus how much data you stored in the ViewState. But that wasn't the only source of data for the ViewState. Each one of the controls you added to the page might want to track some state on its own. So, by default, each control would chuck a little data into the ViewState dictionary. If you had a huge number of controls on the page? You'd have a similarly huge ViewState form field. And the problem only got worse when you started using custom controls, especially third party controls, which powered a lot of their advanced features by stuffing information into the ViewState object.
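If you're curious how bad a given page is, a scraper can measure it. The following is a rough sketch using Python's standard html.parser module; the helper names and the sample page are made up, and the 4000-character blob is a placeholder rather than real ViewState.

```python
from html.parser import HTMLParser


class ViewStateFinder(HTMLParser):
    """Collects the value of any hidden input named __VIEWSTATE."""

    def __init__(self):
        super().__init__()
        self.value = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "input" and attributes.get("name") == "__VIEWSTATE":
            self.value = attributes.get("value") or ""


def viewstate_share(page_html):
    """Return the fraction of the page's characters taken up by ViewState."""
    finder = ViewStateFinder()
    finder.feed(page_html)
    if not finder.value:
        return 0.0
    return len(finder.value) / len(page_html)


# Made-up page: 4000 characters of fake ViewState around a tiny body.
sample = (
    '<form method="post" id="Form1">'
    '<input type="hidden" name="__VIEWSTATE" value="' + "A" * 4000 + '">'
    "<p>Council meeting, 7pm</p></form>"
)
share = viewstate_share(sample)
```

On a page like the sample, nearly everything the browser downloads is the hidden field, and the content a human (or a scraper) actually wants is a rounding error.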
Now, this wasn't a foregone conclusion. You could, on a control-by-control basis, disable their use of the ViewState, or even disable using the ViewState entirely. But by default, every page would include a big ol' base64 encoded, serialized blob of junk. And developers working on small budgets and under tight time constraints, like perhaps the web developer responsible for maintaining the town council's news feed, frequently wouldn't think about optimizing the ViewState.
Now, I have some highly cynical takes on the state of modern web development. It's gotten incredibly complicated, it's surprisingly difficult to just dive in and get started, and it sometimes feels like it's just frameworks all the way down. I hate it so much that I recently whipped up the Stupid Site Generator, a static site generator that's just a thin bash script wrapping around Pandoc.
To paraphrase Mitch Hedberg: Web development used to be bad. It still is, but it also used to.