Friday, May 29, 2009

Network Printer

If you're trying to set up a printer that is attached to the network, do you think you should select the option that says "Local printer attached to this computer" or the option that says "A network printer, or a printer attached to another computer"? Hint: this is Windows.

Scroll down to the bottom of the dialog box for the answer. I can't understand why they don't just relabel the options. I would make a guess, but Al Gore tells me that my sarcasm footprint is already too high.

[Image: addprinter]

Wednesday, October 15, 2008

The Forgotten Half of Computer Literacy

Gruber today noted that the New York Times is adding an API to their campaign finance data (bravo!). He also pointed to an article from 2006 titled "The programmer as journalist: a Q&A with Adrian Holovaty".

The interview is pretty concise and lays out how and why journalism can better be accomplished with help from software. Holovaty breaks down the journalistic process of collecting, filtering and disseminating information and how computer automation can be applied to each step, leading to better journalism. Near the end, he argues that all journalists should have at least some experience with programming, if for no other reason than just to know what is possible.

His points are persuasive, but they're also applicable to almost any field - not just journalism. I believe that everyone should at least know how to write software in the same way that everyone should be able to write in general.

Literacy describes the ability to read and write. When we talk about computer literacy, we usually think of the ability to use software - which I would liken to the "reading" part. We often ignore the "writing" part - the ability to make software. But writing software is incredibly powerful. If you write a page-turning novel, it inspires the imagination of everyone who reads it. If you write an informative article, all its readers are enlightened. Likewise, great software entertains and/or empowers every person who makes use of it.

The analogy holds up pretty well. No - it holds up too well - it's not an analogy at all! Writing software is writing. It might have an apparently arcane grammar, but making software is merely writing down instructions that a computer can understand. And the language is not more complicated than English. Have you seen English? It's the most bastardized, crazy language on the planet! The languages we use to talk to computers are incredibly simple - necessarily so. Computers aren't going to pick the correct meaning of a word out of a dozen possibilities based on context. They're not going to detect our sarcasm in step two of the instructions we give them. They need to be told exactly what to do in the simplest way possible, and programming languages are the way we do that. Consequently, programming languages have only gotten simpler over the years. Mandatory static typing is going the way of the dodo. How often do you need to know what a pointer is any more? All the complicated things are getting hidden away in libraries or abstracted by the VM. There has never been an easier time to become a programmer than today.

I don't believe that everyone has to become a "programmer" any more than everyone has to become a "writer". But everyone should be able to write: a sentence, a paragraph, an essay, a script that their computer executes.

Tuesday, October 14, 2008

Exercising Git

At work I use SVN for source control, but I've been itching to try out Git. I also recently stumbled across Project Euler, a series of math problems intended to be solved by writing small programs. I think you see where this is going. So - as of a few minutes ago - my Project Euler solutions (such as they are) are up on my GitHub account.

I am a complete Git newb, so I followed some instructions on setting up Git and a new GitHub repository. Since I can see all the code on GitHub, I guess it's working.

My Project Euler solutions have been very fun so far. You're allowed to implement your solution in whatever language you want. I chose Ruby because it is the most expressive language I've found.

My solutions involving primes and factoring have been a mixed bag. I think I did OK finding the largest prime factor of 600851475143 (problem 3), but my solution to problem 10 (sum all the primes below two million) was an ugly brute force. I really should have known a better algorithm (I won't say it by name here, but it was referenced in one of the Dark Tower books).

Some of the problems have been quite easy, but I'm only up to problem 17. I've peeked ahead and it looks like they get pretty tough. Sometimes it feels like using Ruby is cheating - for instance, problem 16 is "sum all the digits of 2^1000". Pity the poor C++ programmer!
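To show why it feels like cheating, here is a minimal sketch of problem 16. It relies only on Ruby's built-in arbitrary-precision integers. The method name digit_sum is invented for this illustration and is not necessarily what the posted solution uses.

```ruby
# Sum the decimal digits of a non-negative integer.
# Ruby integers grow to arbitrary size, so 2**1000 needs no special handling.
def digit_sum(n)
  n.to_s.each_char.inject(0) { |sum, ch| sum + ch.to_i }
end

puts digit_sum(2**1000)
```

In a language with fixed-width integers, you would first have to write (or find) a big-number routine before you could even start on the digits.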

The tricks I have up my sleeve are mainly inject, memoization and recursion. Recursion is especially helpful in simplifying problems. Memoization can be invaluable when you're performing expensive math. Inject is just plain fun - who wants to write all that boring loop code?
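As a small illustration of the "boring loop code" point, here is the same sum written twice, once with an explicit accumulator and once with inject. The example (a sum of squares) and both method names are made up for this sketch and do not come from any particular Euler problem.

```ruby
# The explicit loop: set up an accumulator, walk the range, update it, return it.
def sum_of_squares_loop(limit)
  total = 0
  (1..limit).each { |n| total += n * n }
  total
end

# The same computation with inject: the block receives the running total
# and the next element, and returns the new running total.
def sum_of_squares_inject(limit)
  (1..limit).inject(0) { |total, n| total + n * n }
end

puts sum_of_squares_loop(100)
puts sum_of_squares_inject(100)
```

Both methods return the same value. The inject version just has no bookkeeping to get wrong.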

One reason that people might shy away from Ruby in Project Euler is the perceived performance benefit of more low-level languages. That is a complete red herring. When I was writing assembly language in high school, I read Michael Abrash's "Graphics Programming Black Book". One thing I learned was that extremely optimized assembly language will only net you a 2x to 10x improvement over compiled code. "Isn't that a lot?" you say? No - it is nothing compared to the gains you can usually get from improving your algorithms - often orders of magnitude. Using an expressive high-level language like Ruby allows you to concentrate on the high-level algorithms. I was especially proud of my solution to problem 15. I first broke the problem down with recursion. When that ran for more than a minute I realized I was solving the same problem more than once so I memoized it. Now it runs in less than a second even on my super-slow EeePC. Of course, there are much more elegant solutions out there - my solution that I was so proud of was far from the best. It just goes to show how far a good algorithm can take you.
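The recurse-then-memoize step can be sketched roughly as follows. This assumes problem 15 is the grid-routes problem (count the paths from one corner of a 20x20 grid to the opposite corner, moving only right or down). The method name routes and its parameters are hypothetical, and this is not the code from the repository.

```ruby
# Count the routes across a grid when only "right" and "down" moves are allowed.
# `right` and `down` are the numbers of moves still to be made in each direction.
# The memo hash caches each sub-grid so it is solved only once.
def routes(right, down, memo = {})
  # With no moves left in one direction, only one straight path remains.
  return 1 if right.zero? || down.zero?

  memo[[right, down]] ||= routes(right - 1, down, memo) + routes(right, down - 1, memo)
end

puts routes(2, 2)
puts routes(20, 20)
```

Without the memo hash, the two recursive calls keep re-solving the same sub-grids and the running time grows exponentially. With it, each of the roughly 400 sub-grids is computed once.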

This turned out to be a rambling post - sorry about that. In conclusion, Git and GitHub are great, Project Euler is fun and algorithms matter. Good night!

Thursday, August 28, 2008

Banish XSS Forever!

The two biggest classes of web site vulnerabilities are SQL injection and Cross-Site Scripting (XSS). The ever-vigilant programmer can reduce the occurrence of these problems, but proper architecture can eliminate them completely.

The SQL Problem Has Been Solved

I couldn't stop myself from referencing the XKCD SQL injection example. But for coders using prepared statements or parameterized SQL, SQL injection is already a bad memory. The problem of unsanitized data getting into your SQL can be avoided by following one rule: Never put any data into SQL. Instead of this:

$sql = "SELECT id FROM usertable WHERE username = '".$uname."';";
$result = pg_query($sql);

You do this:

$sql = 'SELECT id FROM usertable WHERE username = $1';
$result = pg_query_params($sql, array($uname));

Now when a script kiddie pastes some exploit code into my account sign-up page, he can't drop my tables or read all my password hashes. Little Bobby Tables can grow up to live a rich life without incurring the wrath of any more schools.

The XSS Problem is the Same Problem

SQL injection happens when unescaped data is interpreted by your SQL parser. XSS happens when unescaped data is interpreted by the client web browser. It's really the same problem, and the solution is pretty much the same.

I said "pretty much" because in the case of SQL, all the main relational databases have added APIs to allow "data" to be sent separately from the "query". I don't see browser makers doing anything analogous. But you don't need your database to support parameterized queries natively - you can use a wrapper library that does it for you. Likewise, web site developers can adopt web frameworks that enforce the separation of markup and text on the server side.

DOM View to the Rescue?

Stay with me here. The "traditional" way to build web pages programmatically was top-to-bottom. First you send your HTTP headers, your doctype, your opening <html> and <head> tags, etc. until you finally send the closing </html> tag at the end. That approach treats the web page as a string of text. Just because it is text on the wire doesn't mean we need to treat it that way. We should treat it as a tree, just like the browser on the other end does after it parses all that HTML.

Anyone who writes JavaScript is familiar with the Document Object Model (DOM) that is the web browser's internal representation of the web page. The page is a tree, with the <html> element at the root, and every element or chunk of text as a node. I propose that we write web frameworks that build web pages as trees - never having the programmer output any markup.

This system would start out with a basic empty web page.

html
|-head
| \-title
\-body

Set the title (please don't mind the syntax of my pseudo-code):

head.title.appendtext("Hello World")

html
|-head
| \-title
|   \-textnode value="Hello World"
\-body

Add a footer:

body.appenddiv(class="footer").appendtext("© 2008 Me")

html
|-head
| \-title
|   \-textnode value="Hello World"
\-body
  \-div class="footer"
    \-textnode value="© 2008 Me"

Add our main content:

maindiv = body.prependdiv(class="maincontent")
maindiv.appendh1.appendtext("Hello World!")

html
|-head
| \-title
|   \-textnode value="Hello World"
\-body
  |-div class="maincontent"
  | \-h1
  |   \-textnode value="Hello World!"
  \-div class="footer"
    \-textnode value="© 2008 Me"

And so on. When you're done, your framework generates the HTML or XHTML or what-have-you from the tree and spits it out to the client. No markup is ever generated by hand and text can only go into text nodes or attribute values. The markup generator will automatically make the "&copy;" for the copyright symbol and knows how to escape quotes embedded in attribute values as well.
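To make the idea a little more concrete, here is a minimal sketch of such a tree builder. Every name in it (Element, TextNode, append, prepend, append_text, to_html, escape) is hypothetical and invented for this example. It is not an existing framework. It also simplifies one thing: it escapes only the five markup-significant characters and leaves © as a literal character, which assumes the page is served as UTF-8.

```ruby
# Characters that must never reach the browser unescaped.
ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;', "'" => '&#39;' }.freeze

def escape(text)
  text.to_s.gsub(/[&<>"']/, ESCAPES)
end

# A leaf holding plain text. Escaping happens at serialization time,
# so the caller never has to remember to do it.
class TextNode
  def initialize(value)
    @value = value
  end

  def to_html
    escape(@value)
  end
end

# An element with a tag name, attributes and child nodes.
class Element
  def initialize(tag, attrs = {})
    @tag = tag
    @attrs = attrs
    @children = []
  end

  # Append a child element and return it so calls can be chained.
  def append(tag, attrs = {})
    child = Element.new(tag, attrs)
    @children << child
    child
  end

  # Insert a child element before the existing children and return it.
  def prepend(tag, attrs = {})
    child = Element.new(tag, attrs)
    @children.unshift(child)
    child
  end

  # Append a text node and return this element.
  def append_text(value)
    @children << TextNode.new(value)
    self
  end

  # Markup is generated only here, from the tree.
  def to_html
    attrs = @attrs.map { |name, value| %( #{name}="#{escape(value)}") }.join
    "<#{@tag}#{attrs}>#{@children.map(&:to_html).join}</#{@tag}>"
  end
end

body = Element.new('body')
body.append('div', class: 'footer').append_text('© 2008 Me')
main = body.prepend('div', class: 'maincontent')
main.append('h1').append_text('<script>alert("xss")</script>')
puts body.to_html
```

The hostile string in the h1 comes out as harmless text because the only way to add it to the page is as a text node. There is no method that accepts raw markup, so there is nothing to forget.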

This is the solution I propose - not the Sisyphean task of "remember to escape your strings". Maybe I'll build it.

Stack Overflow

I have been having so much fun with the beta of Stack Overflow - a new programmer help site by Jeff Atwood and Joel Spolsky.

If you're interested in the beta, you can get in here (scroll down a bit). Hopefully it will be live to the world soon. I expect scaling issues (not that it's had any yet, but I expect massive popularity).

Wednesday, July 30, 2008

In Praise of jQuery

At work I'm starting to weave jQuery into the interface to our job database. Until now my JavaScript needs have been limited to a menu at the top of the screen. Now I am trying to create a time entry interface, and it is complicated.

Each employee needs to be able to say "I worked on task alpha from 8:00 to 10:10, beta from 10:10 to 12:00 and gamma from 13:00 to 17:00". It needs to be easy to use, give them a list of the tasks that they are working on to choose from and make sure that their times don't overlap. This is all to say that some more advanced scripting is called for. I hacked around with vanilla JavaScript for a while, but soon decided to check out one of the free libraries available.
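The "times don't overlap" rule is simple to state in code. Here is a rough sketch of the check, written in Ruby rather than in the JavaScript the interface actually runs. The Entry struct, the minutes helper and the no_overlaps? method are all invented for this illustration. It also assumes that an entry starting exactly when the previous one ends (10:10 to 10:10) is allowed.

```ruby
# One time entry: a task name plus start and finish times in minutes since midnight.
Entry = Struct.new(:task, :start, :finish)

# Convert an "H:MM" string to minutes since midnight.
def minutes(hhmm)
  hours, mins = hhmm.split(':').map(&:to_i)
  hours * 60 + mins
end

# Sort by start time, then compare each entry with the one after it.
# After sorting, any overlap must show up between neighbours.
def no_overlaps?(entries)
  sorted = entries.sort_by(&:start)
  sorted.each_cons(2).all? { |earlier, later| earlier.finish <= later.start }
end

day = [
  Entry.new('alpha', minutes('8:00'),  minutes('10:10')),
  Entry.new('beta',  minutes('10:10'), minutes('12:00')),
  Entry.new('gamma', minutes('13:00'), minutes('17:00'))
]
puts no_overlaps?(day)
```

Sorting first means only adjacent entries need comparing, instead of checking every pair against every other pair.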

Enter jQuery, and I could not be happier. The core feature of jQuery is that you can select elements in your document by XPath or CSS-style selectors. This alone is well worth the price of entry. For instance, originally I had to do a "document.getElementById("timebar")", use ".getElementsByTagName("div")" on that, and then check each element's class as I looped through them to remove a class. Now I just write: "$("#timebar .selentry").removeClass("selentry")". That's an if statement, a loop and a lot of long DOM method names I didn't need.

jQuery gives you all sorts of additional nice things - most of which I haven't tried yet. It includes:

  • Easy creation and insertion of new elements into your document tree.
  • Adding, modifying and removing attributes.
  • Modifying styles on an element without wiping out unrelated styles.
  • Easy event handlers, with cross-browser problems smoothed out for you.
  • Iterators.
  • Easy handling of Ajax requests.
  • Plugins to handle even more things. (I was able to easily drop the date picker into my existing date fields.)

I resisted using a JavaScript library for a long time. I thought they were just for fancy Ajax and slide-in effects. I also worried that they would force all my scripts to work inside their framework. With jQuery at least, my fears were unfounded. Even if you just use it to find elements in your document tree it will be extremely helpful.

Tuesday, March 25, 2008

Bug Zero

Bug zero is that the program does not exist. Fix that bug first.