Thursday, August 28, 2008

Banish XSS Forever!

The two biggest classes of web site vulnerabilities are SQL injection and Cross-Site Scripting (XSS). The ever-vigilant programmer can reduce the occurrence of these problems, but proper architecture can eliminate them completely.

The SQL Problem Has Been Solved

I couldn't stop myself from referencing the XKCD SQL injection example. But for coders using prepared statements or parameterized SQL, SQL injection is already a bad memory. The problem of unsanitized data getting into your SQL can be avoided by following one rule: Never put any data into SQL. Instead of this:

$sql = "SELECT id FROM usertable WHERE username = '".$uname."';";
$result = pg_query($sql);

You do this:

$sql = 'SELECT id FROM usertable WHERE username = $1';
$result = pg_query_params($sql, array($uname));

Now when a script kiddie pastes some exploit code into my account sign-up page, he can't drop my tables or read all my password hashes. Little Bobby Tables can grow up to live a rich life without incurring the wrath of any more schools.
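The difference is easy to check end to end. What follows is a minimal sketch in Python using the standard sqlite3 module and an in-memory database, not the PostgreSQL calls shown above. The table name is borrowed from the example; the function names and the sample payload are made up for illustration.

```python
import sqlite3

# In-memory database with one user, just for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE usertable (id INTEGER PRIMARY KEY, username TEXT)")
conn.execute("INSERT INTO usertable (username) VALUES (?)", ("alice",))


def find_user_ids_unsafe(conn, uname):
    # Data is spliced into the SQL text, so the parser sees it as SQL.
    sql = "SELECT id FROM usertable WHERE username = '" + uname + "'"
    return [row[0] for row in conn.execute(sql)]


def find_user_ids(conn, uname):
    # The ? placeholder keeps uname out of the SQL text entirely.
    sql = "SELECT id FROM usertable WHERE username = ?"
    return [row[0] for row in conn.execute(sql, (uname,))]


hostile = "x' OR '1'='1"
print(find_user_ids_unsafe(conn, hostile))  # matches every row
print(find_user_ids(conn, hostile))         # matches nothing
```

With the concatenated version the payload rewrites the WHERE clause; with the placeholder it is only ever compared as a username.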

The XSS Problem is the Same Problem

SQL injection happens when unescaped data is interpreted by your SQL parser. XSS happens when unescaped data is interpreted by the client web browser. It's really the same problem, and the solution is pretty much the same.
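To make the parallel concrete, here is a small Python sketch of the same mistake on the HTML side. The function names and the payload are hypothetical; html.escape is from the standard library. The second function is the "remember to escape" approach, which works only as long as nobody forgets it.

```python
import html


def render_comment_unsafe(comment):
    # Data is spliced straight into markup, so the browser parses it as markup.
    return "<p>" + comment + "</p>"


def render_comment_escaped(comment):
    # html.escape replaces &, <, > and quotes with character references.
    return "<p>" + html.escape(comment) + "</p>"


payload = "<script>alert('xss')</script>"
unsafe = render_comment_unsafe(payload)
escaped = render_comment_escaped(payload)
print(unsafe)
print(escaped)
```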

I said "pretty much" because in the case of SQL, all the main relational databases have added APIs to allow "data" to be sent separately from the "query". I don't see browser makers doing anything analogous. But you don't need your database to support parameterized queries natively - you can use a wrapper library that does it for you. Likewise, web site developers can adopt web frameworks that enforce the separation of markup and text on the server side.

DOM View to the Rescue?

Stay with me here. The "traditional" way to build web pages programmatically was top-to-bottom. First you send your HTTP headers, your DTD, your opening <html> and <head> tags, etc. until you finally send the closing </html> tag at the end. We're treating the web page as a string of text. Just because it is text on the wire doesn't mean we need to treat it that way. We should treat it as a tree, just like the browser on the other end does after it parses all that HTML.

Anyone who writes JavaScript is familiar with the Document Object Model (DOM) that is the web browser's internal representation of the web page. The page is a tree, with the <html> element at the root, and every element or chunk of text as a node. I propose that we write web frameworks that build web pages as trees - never having the programmer output any markup.

This system would start out with a basic empty web page.

html
|-head
| \-title
\-body

Set the title (please don't mind the syntax of my pseudo-code):

head.title.appendtext("Hello World")
html
|-head
| \-title
|   \-textnode value="Hello World"
\-body

Add a footer:

body.appenddiv(class="footer").appendtext("© 2008 Me")
html
|-head
| \-title
|   \-textnode value="Hello World"
\-body
  \-div class="footer"
    \-textnode value="© 2008 Me"

Add our main content:

maindiv = body.prependdiv(class="maincontent")
maindiv.appendh1().appendtext("Hello World!")
html
|-head
| \-title
|   \-textnode value="Hello World"
\-body
  |-div class="maincontent"
  | \-h1
  |   \-textnode value="Hello World!"
  \-div class="footer"
    \-textnode value="© 2008 Me"

And so on. When you're done, your framework generates the HTML or XHTML or what-have-you from the tree and spits it out to the client. No markup is ever generated by hand and text can only go into text nodes or attribute values. The markup generator will automatically make the "&copy;" for the copyright symbol and knows how to escape quotes embedded in attribute values as well.
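As a rough illustration of the idea, here is a minimal Python sketch of such a tree builder. It is not an existing framework: the Element class, its method names, and the _escape helper are all hypothetical, loosely modeled on the pseudo-code above. It emits numeric character references (such as "&#169;") rather than named ones like "&copy;".

```python
import html


def _escape(text, quote):
    # Escape markup characters, then turn non-ASCII characters
    # into numeric character references.
    escaped = html.escape(str(text), quote=quote)
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


class Element:
    """Hypothetical element node: a tag, attributes, and child nodes."""

    def __init__(self, tag, **attrs):
        self.tag = tag
        # "class" is a Python keyword, so callers write class_="footer"
        # and the trailing underscore is stripped here.
        self.attrs = {name.rstrip("_"): value for name, value in attrs.items()}
        self.children = []  # Element instances or plain strings (text nodes)

    def append(self, tag, **attrs):
        child = Element(tag, **attrs)
        self.children.append(child)
        return child

    def prepend(self, tag, **attrs):
        child = Element(tag, **attrs)
        self.children.insert(0, child)
        return child

    def append_text(self, text):
        # Text is stored as data; it is only escaped when rendered.
        self.children.append(str(text))
        return self

    def render(self):
        attrs = "".join(
            ' {}="{}"'.format(name, _escape(value, quote=True))
            for name, value in self.attrs.items()
        )
        inner = "".join(
            child.render() if isinstance(child, Element)
            else _escape(child, quote=False)
            for child in self.children
        )
        return "<{0}{1}>{2}</{0}>".format(self.tag, attrs, inner)


# Rebuild the example page, plus one hostile string as page content.
page = Element("html")
head = page.append("head")
head.append("title").append_text("Hello World")
body = page.append("body")
body.append("div", class_="footer").append_text("© 2008 Me")
main = body.prepend("div", class_="maincontent")
main.append("h1").append_text("Hello World!")
main.append("p").append_text("<script>alert(1)</script>")
print(page.render())
```

The sketch assumes tag and attribute names come from the programmer, never from user input. A real framework would also need special handling for elements such as <script> and <style>, and for URL-valued attributes, where entity escaping alone is not enough.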

This is the solution I propose - not the Sisyphean task of "remember to escape your strings". Maybe I'll build it.

2 comments:

Unknown said...

There'd definitely be a benefit to doing that but there's kind of a trade-off too, I think. It seems to me that the advantage of using HTML templates as so many web frameworks do is that it's more readable (at least when things haven't degenerated into tag soup) than generating the entire document imperatively. You can look at the template and get a sense of the structure of the page. I think it's harder to get that same sense with a series of statements describing the construction of a tree.

Neal said...

Will - You have a point about HTML-based templates being more readable. In my experience though, once you start jamming a bunch of code in there it loses that quality pretty quickly.

Alternatively, you could start with a pure HTML (and CSS) template and override elements in the code. That way, your initial template could be built and revised by a designer, and as long as it had the proper hooks (ids and classes), the code would still be able to work. This would also maintain the separation of logic and HTML.