Friday, November 30, 2007

Amazon or... Amazing?

I am completely blown away by Amazon Web Services. S3 is wonderful in its simplicity and EC2 is revolutionary.

S3 stands for Simple Storage Service, and it lives up to its name. Put stuff up there and take it back down again. You can also have it generate auto-expiring URLs that others can access. You pay based on how much you upload, how much you download and how much you store.

EC2 stands for Elastic Compute Cloud. It is just a pay-by-the-hour server service. You provide the Linux image you want to run (or use one of their pre-built ones) and tell them to fire it up. The service assigns an IP address to the newly created machine and the rest is up to you. Pricing is such that it could destroy the dedicated server hosting business with a couple of small changes (basically a load balancer with a static IP would do it). Its biggest strength now is on-demand number crunching. It is great for development/demo servers too.

Sunday, November 11, 2007

Creating a Tree-based Task List in a Relational Database

Things we want:

  • tasks
  • subtasks
  • task owners (responsible for that task)
  • task start time
  • task end time
  • prerequisite tasks

How do we represent these things in a relational database? Tasks are represented as a tree. Only leaf tasks will have owners and start/end times. This is because any non-leaf task will be made up entirely of its subtasks. In other words, when all the subtasks are done, the parent task is done. The start time of a non-leaf task is the earliest start time from among its descendants. Likewise, its end time equals the last end time from among its descendants (null if any are not yet finished).
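The roll-up rule above can be sketched in a few lines of Python. This is only an illustration, assuming the rows have already been fetched into plain dictionaries; the names rollup, children and leaves are made up for the example and are not part of the schema.

```python
from datetime import datetime

def rollup(taskid, children, leaves):
    """Return (start, end) for a task, derived from its leaf descendants.

    children: dict mapping taskid -> list of child taskids (hypothetical)
    leaves:   dict mapping leaf taskid -> (starttime, endtime); either may be None
    """
    kids = children.get(taskid, [])
    if not kids:
        # Leaf task: its times are stored directly (the taskleaf row).
        return leaves.get(taskid, (None, None))

    spans = [rollup(kid, children, leaves) for kid in kids]
    starts = [s for s, _ in spans if s is not None]
    ends = [e for _, e in spans]
    # Earliest start among the descendants that have started.
    start = min(starts) if starts else None
    # Latest end, but only once every descendant has finished.
    end = max(ends) if all(e is not None for e in ends) else None
    return start, end

# Example data: task 1 has subtasks 2 and 3; task 3 has subtask 4.
children = {1: [2, 3], 3: [4]}
leaves = {
    2: (datetime(2007, 11, 1), datetime(2007, 11, 3)),
    4: (datetime(2007, 11, 2), None),  # still in progress
}
root_start, root_end = rollup(1, children, leaves)
```

With the example data, the root starts when task 2 starts and has no end time yet, because task 4 is still open.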

Note: "bigserial" and "serial" are PostgreSQL abstractions for 8-byte and 4-byte integers that have an auto-incrementing default. They're mostly used for primary keys.

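CREATE TABLE task (
    taskid BIGSERIAL NOT NULL PRIMARY KEY,
    -- a NULL parentid marks a root task
    parentid BIGINT REFERENCES task(taskid),
    -- the job and employee tables are defined elsewhere
    jobid INTEGER NOT NULL REFERENCES job(jobid),
    taskname VARCHAR(200) NOT NULL
);

-- one row per leaf task; non-leaf tasks have no row here
CREATE TABLE taskleaf (
    taskid BIGINT NOT NULL PRIMARY KEY
      REFERENCES task(taskid),
    employeeid INTEGER NOT NULL
      REFERENCES employee(employeeid),
    starttime TIMESTAMPTZ,  -- NULL until the task is started
    endtime TIMESTAMPTZ     -- NULL until the task is finished
);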

Problems:

Every task has a "jobid" - only the root task really needs this. Make the jobs themselves the root tasks? That wouldn't work in my pre-existing job scheme. The redundancy does have an upside, though: if my application ever became database-bound, I could shift the tree-decoding work to the web server by selecting all of a job's tasks by jobid in one query.
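Decoding the tree on the web server could look roughly like this. It is a sketch in Python, assuming a single query has already returned taskid, parentid and taskname for every task of one job; the function name build_tree and the dictionary layout are invented for the example.

```python
def build_tree(rows):
    """Turn flat (taskid, parentid, taskname) rows into nested dicts.

    Returns the list of root nodes (rows whose parentid is None).
    Assumes every parentid refers to a taskid in the same result set.
    """
    nodes = {
        taskid: {"taskid": taskid, "taskname": name, "subtasks": []}
        for taskid, _parent, name in rows
    }
    roots = []
    for taskid, parentid, _name in rows:
        if parentid is None:
            roots.append(nodes[taskid])
        else:
            # Attach to the parent; one pass, no further queries needed.
            nodes[parentid]["subtasks"].append(nodes[taskid])
    return roots

# Example rows as they might come back for a single job.
rows = [
    (1, None, "Ship release"),
    (2, 1, "Write code"),
    (3, 1, "Write docs"),
    (4, 2, "Fix bugs"),
]
tree = build_tree(rows)
```

The rows can arrive in any order, since every node is created before any parent link is followed.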

Prevent circular references? I could make a trigger or rule in the database that would check the path back to a root node before allowing a parentid to be assigned. While I am at it, I can check to make sure the parent's "jobid" matches.
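The check such a trigger would perform can be sketched outside the database. This is an illustration in Python rather than an actual PostgreSQL trigger, assuming the parent links and job ids are held in dictionaries; can_assign_parent, parent_of and job_of are hypothetical names.

```python
def can_assign_parent(taskid, new_parentid, parent_of, job_of):
    """Return True if giving `taskid` the parent `new_parentid` is safe.

    parent_of: dict taskid -> parentid, None for a root (hypothetical)
    job_of:    dict taskid -> jobid (hypothetical)
    """
    if new_parentid is None:
        return True  # becoming a root can never create a cycle
    if job_of[new_parentid] != job_of[taskid]:
        return False  # the parent must belong to the same job
    # Walk from the proposed parent back to the root. Meeting the task
    # itself on the way means the assignment would close a loop.
    seen = set()
    current = new_parentid
    while current is not None:
        if current == taskid:
            return False
        if current in seen:
            return False  # existing data is already cyclic; refuse
        seen.add(current)
        current = parent_of.get(current)
    return True

# Example: 3 is a child of 2, which is a child of 1; 9 belongs to another job.
parent_of = {1: None, 2: 1, 3: 2, 9: None}
job_of = {1: 10, 2: 10, 3: 10, 9: 11}
```

Here, making task 3 the parent of task 1 is rejected because the walk from 3 leads back to 1, and making task 9 the parent of task 2 is rejected because the job ids differ.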

The parentid column above is what is usually called the "adjacency list" model. I looked at the "nested set" method as an alternative, but I don't think it would work well for my needs.

Thursday, November 8, 2007

Are You Watching These?

Every time I watch one of the Google techtalks, I am blown away. As far as I can tell, this is how these things get made:

  1. Google chooses a subject related to their business.
  2. They find someone who knows that subject to give a talk.
  3. They distribute the video so employees who couldn't attend can still get the info.

This is especially good because:

  1. All subjects are related to their business.
  2. They always seem to get the foremost authority on any subject (I bet they would reanimate Lincoln if they had a techtalk on the Civil War).
  3. They give the video out to everybody.

Tuesday, November 6, 2007

PHP is Funny

There are a lot of things that I like about PHP. For instance, the addition operator (+) is different from the concatenation operator (.). This prevents the common ambiguity in weakly-typed languages of "'1' + 1": in PHP that is always the number 2, while "'1' . 1" is always the string '11'.

PHP has a lot of quirks though - due to its highly "organic" development. Variable variable names are a prime example:

$a = 'hello';
$$a = 'world'; // same as $hello = 'world'
echo "$a $hello"; // prints "hello world"

Confusing, no? Also, the object-oriented stuff seems tacked on (probably because it was tacked on) and the function to escape text so that it is HTML-safe is called "htmlspecialchars()". For a language whose main function is to output web pages, couldn't they have thought up a shorter function name?

Intro

This will be my space to dump my software development thoughts. Welcome to a piece of my mind.