Moving from Django to Hyde

Posted on January 15th, 2010.

Almost exactly one year ago I posted a blog entry about how I rewrote this site using Django. It's a new year, and once again I've rewritten the site.

If you've visited my site before you may have noticed some small style tweaks. I've cleaned up a few things and I think the site looks better overall. The biggest change, however, is that I completely rewrote the core — this time with Hyde.

Hyde is a "static site generator" written in Python, similar to Jekyll, Blatter and Pilcrow. I chose it over the others because it's written in Python (unlike Jekyll), more flexible than Blatter and more mature than Pilcrow.

Rewriting an existing site that gets a decent amount of visitors is different than creating a new site from scratch, so I decided to write this entry to describe some of the issues I ran into and how I tackled them.

  1. Why Static?
    1. Less Memory Needed
    2. Static Pages are Faster
    3. Easier to Maintain
    4. Easier to Backup
  2. Getting Started
  3. Layout and Styling
    1. Cleanup
    2. Rewriting the Layout and Styles
    3. A New CSS Framework
    4. Pagination
  4. Using Fabric to Type Less
  5. Moving the Content
  6. Converting the Comments
  7. Rewriting the Old URLs
  8. Creating an RSS Feed
  9. Merging the Repositories
  10. Still to Come
    1. "Ago" Dates
    2. Categories
    3. Use LessCSS
  11. View the Code

Why Static?

There are a couple of reasons I decided to switch to a static site.

Less Memory Needed

I use WebFaction for my web hosting, and with my current plan I have a limit on how much memory I can use at any given time. Each Django site takes up around 20mb of memory on the server.

By switching to a static set of HTML pages for my site I save those 20mb so I can use them for something else (like a staging site for another project). My site doesn't particularly need all of the functionality Django can provide, so I figured I'd switch to a static site and save the memory.

Static Pages are Faster

My site doesn't get an enormous amount of traffic, so the Django instances weren't really working very hard. Still, serving a static page through Apache or nginx is always going to be faster than running it through Django. Faster-loading pages is always a good thing, even if the speed increase isn't very large.

Easier to Maintain

This is the main reason I switched. Previously, to edit a blog entry I would log in through Django's admin interface and edit the entry. With Hyde, I simply edit a file (or create a new one) on my machine and then run a single command to publish it.

Another problem with the old site is that I needed to be connected to the internet to get to the admin interface, so I couldn't update entries (or publish new ones) offline. I usually have an internet connection, but occasionally I don't. Now I can edit as much as I like on my own machine and only need a connection to publish the finished product.

Easier to Backup

One more advantage of Hyde is that the site structure and content are all held in the same place. I keep everything in a Mercurial repository, and so every time I push that repository somewhere it creates a full backup of the site's code and content. If WebFaction's server catches on fire I still have everything on my local machine (and on BitBucket).

I've toyed around with backing up Django's database tables when I had my old site, but this new method is far less work.

Getting Started

I wanted a fresh start when I was rewriting the site, so I went ahead and created a brand new folder and Mercurial repository for it.

I used hyde --init to lay out a skeleton in the new folder. Then I stripped out a lot of the default items that get added with --init and created the directory structure I wanted to use, with stubs for each of the main content files I knew I'd need.

Finally, I went ahead and filled some of the basic values in the settings.py file.

Layout and Styling

Once I had a skeleton in place I started working on layout of the site.

Cleanup

The templates created by hyde --init are functional, but when you look at the code they're a mess. The indentation is strange and inconsistent and there's trailing whitespace all over the place. I like clean code so I sat down and cleaned everything up before I started making any real changes.

Rewriting the Layout and Styles

After I finished cleaning up the templates I duplicated the HTML structure and styles of the old site from scratch. The old site had gone through a bunch of iterations and I was a bit sloppy in my editing, so there was a lot of cruft that had snuck in. I wanted a truly fresh start for the site, so I buckled down and did it all again.

A New CSS Framework

Previously I used the Blueprint CSS framework to make laying out the site easier. Blueprint is a great framework, but it's more powerful than I need for a site as simple as this one. The site now uses Aardvark Legs, which is a much simpler framework that simply sets up a great vertical rhythm and leaves you free to lay out the horizontal structure yourself.

Pagination

Before the rewrite the list of blog entries used to be paginated. Hyde supports pagination but I decided against using it because I simply don't write enough to make it necessary. All the blog entries are now listed on a single page, which means that you can use Cmd+F to find an article if you're looking for something specific.

Using Fabric to Type Less

I realized very quickly that typing hyde -g -s . && hyde -w -s . would get old pretty quickly, so I installed Fabric and wrote a fabfile.

Fabric is a tool written in Python that lets you define tasks and execute them by running fab taskname. Fabfiles are pure Python, so you can build larger tasks out of smaller ones very easily and do just about anything you want. It's similar to ant, but without the excessive over-engineering.

You can view the current fabfile on BitBucket. At the time of this entry it looks like this:

from fabric.api import *
import os
import fabric.contrib.project as project

PROD = 'sjl.webfactional.com'
DEST_PATH = '/home/sjl/webapps/slc/'
ROOT_PATH = os.path.abspath(os.path.dirname(__file__))
DEPLOY_PATH = os.path.join(ROOT_PATH, 'deploy')

def clean():
    local('rm -rf ./deploy')

def regen():
    clean()
    local('hyde -g -s .')

def serve():
    local('hyde -w -s .')

def reserve():
    regen()
    serve()

def smush():
    local('smusher ./media/images')

@hosts(PROD)
def publish():
    regen()
    project.rsync_project(
        remote_dir=DEST_PATH,
        local_dir=DEPLOY_PATH.rstrip('/') + '/',
        delete=True
    )

The task I use most often while developing is fab reserve, which regenerates the site and then starts serving it so I can view the result in a browser.

I use fab smush whenever I add new images. This runs smusher on all of the images to optimize them without reducing quality.

When I'm ready to publish changes to the live site I run fab publish, which will regenerate my local version and copy it up to the WebFaction server.

Moving the Content

The content of the old site (blog entries, projects, and static pages like the about page) was fairly easy to migrate over to the new one, because both versions use Markdown to format the text.

First I created an empty file for each page with a filename that matched the "slug" (last part of the URL) of the old page. Then I manually copied over the title, creation time and content for every page. I could have written a script to do this, but I don't have enough pages on the site to make it worth the time.

Converting the Comments

At this point the new site was looking very much like the old one. The styles were (roughly) matching and the blog posts, entries, and static pages were all rendered nicely.

The next big task was migrating all of the comments. Since the new site is static it can't handle dynamic content like comments on its own, so I decided to use Disqus. I use Disqus for the comments on the hg tip site and it's very nice. For that site I used it from the beginning, but for this rewrite I needed to somehow import all the old comments.

To migrate the comments over I used django-disqus, but there was a small snag that I needed to deal with first.

When I first wrote the old site I was just getting back into Django after not using it for a long time. I didn't know about Django's built-in comment models, so I created my own. They weren't as good as Django's, but they did the job and I didn't care enough to change them.

This became a problem when it was time to import the old comments into Disqus. django-disqus only supports importing comments from Django's built-in comment models. To work around this I first had to convert the old comments into Django's built-in ones. I wrote a small, hacky Python script to do it:

:::python
#!/usr/bin/env python

from django.core.management import setup_environ
import settings
setup_environ(settings)

from markdown import Markdown
from django.contrib.comments.models import Comment
from django.contrib.sites.models import Site
from stevelosh.blog.models import Comment as BlogComment
from stevelosh.projects.models import Comment as ProjectComment


mdown = Markdown()

site = Site.objects.all()[0]
blog_comments = BlogComment.objects.filter(spam=False)
project_comments = ProjectComment.objects.filter(spam=False)

for bc in blog_comments:
    c = Comment()
    c.content_object = bc.entry
    c.user_name = bc.name
    c.comment = mdown.convert(bc.body)
    c.submit_date = bc.submitted
    c.site = site
    c.is_public = True
    c.is_removed = False
    c.save()
    # print 'http://%s%s' % (site.domain, c.content_object.get_absolute_url())

for pc in project_comments:
    c = Comment()
    c.content_object = pc.project
    c.user_name = pc.name
    c.comment = mdown.convert(pc.body)
    c.submit_date = pc.submitted
    c.site = site
    c.is_public = True
    c.is_removed = False
    c.save()
    # print 'http://%s%s' % (site.domain, c.content_object.get_absolute_url())

Yes, it's ugly. No, I don't care. It was run once or twice locally and once on the live server. It worked and I'll never need to run it again.

Once I had the comments converted I could use django-disqus to migrate them. The import went very smoothly — I ran one command and after a couple of minutes everything was in Disqus. Once that finished it was just a matter of adding the Disqus JavaScript to one of the templates.

Rewriting the Old URLs

Since I was pretty much starting from scratch with this rewrite I decided to clean up the URL structure of my blog. Previously a blog entry's URL looked like this:

http://stevelosh.com/blog/entry/2009/08/30/a-guide-to-branching-in-mercurial/

The /entry/ part of the URL was useless — it just took up space, so I got rid of it. The year and month are useful to get an idea of how old an entry is just by looking at the link, so I left them in. The day, however, probably doesn't matter that much, so I took it out.

The new URLs look like this:

http://stevelosh.com/blog/2009/08/a-guide-to-branching-in-mercurial/

The problem with rewriting the URL structure is that there are already links around the web pointing at my entries. I didn't want those old links to break, so I crafted an .htaccess file that would rewrite the old URLs into the new ones:

{% templatetag openblock %} if GENERATE_CLEAN_URLS {% templatetag closeblock %}
RewriteEngine on
RewriteBase {% templatetag openvariable %} node.site.settings.SITE_ROOT {% templatetag closevariable %}

# Old URLs
RewriteRule ^blog/entry/(\d+)/(\d\d)/\d+/([^/]*)/?$ /blog/$1/$2/$3/ [R=301,L]
RewriteRule ^blog/entry/(\d+)/(\d)/\d+/([^/]*)/?$ /blog/$1/0$2/$3/ [R=301,L]

{% templatetag openblock %} hyde_listing_page_rewrite_rules {% templatetag closeblock %}

# listing pages whose names are the same as their enclosing folder's
RewriteCond %{REQUEST_FILENAME}/$1.html -f
RewriteRule ^([^/]*)/$ %{REQUEST_FILENAME}/$1.html

# regular pages
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^.*$ %{REQUEST_FILENAME}.html

{% templatetag openblock %} endif {% templatetag closeblock %}

Most of that file is the stock Hyde .htaccess file — the two lines under # Old URLs are the ones that I added. It took me a while to get it right because I don't work with Apache's mod_rewrite very often, but it was worth it to avoid breaking all of the old links.

Creating an RSS Feed

I had an RSS feed for the old site which Django made very easy to set up. I definitely needed a feed for the new site, and fortunately Hyde provides a simple sample template that can be used to make an ATOM feed.

I cleaned up the whitespace and formatting of that template a bit, adjusted a few variables for my needs, put it on FeedBurner and everything was ready.

Merging the Repositories

The last step before I was finished was to merge the old and new repositories together so I wouldn't lose any of the history. It's probably not too important to keep the old site's history around, but I put a lot of work into it over the past year and it has some sentimental value.

Fortunately it's very easy to combine Mercurial repositories. I just pulled the old repository into the new one, merged the old head into the new one while discarding all the changes, and pushed it up to BitBucket.

Still to Come

At this point I felt the site was ready to be released, so I set up a new site on WebFaction and used Fabric to deploy it.

I'm very happy with the result, but there are still a few things I'm going to fix/change in the future.

"Ago" Dates

UPDATE: This is done. I've started using timeago.js to render the "ago" dates.

If you look at the top of each blog entry and project there's a line beneath the title that looks something like:

Posted on Monday, November 16, 2009 (1 month, 3 weeks ago).

The (1 month, 3 weeks ago) part of that line is something that I really appreciate on blogs. When they just list the date I always have to do the math in my head to figure out roughly how old something is.

With a static site, however, those times will quickly become inaccurate if I don't regenerate and publish the site regularly. I'm still thinking about the best way to work this out.

One option is to use a cron job on the WebFaction server to regenerate the site every day, which would keep the times fairly accurate.

Another option would be to use a bit of JavaScript to calculate and render the time when the page is loaded. This would make it completely accurate but wouldn't work if someone is browsing with JavaScript turned off.

Categories

UPDATE: I've added categories. Check out this changeset to see what I had to do.

Right now all the blog entries (and projects) are listed in a single chronological list. It would be great to break them up into categories so people can easily find the articles they're interested in.

Hyde supports categories but I haven't spent the time to learn how to use them yet. I also need to figure out a way to work a list of categories into the design without cluttering things up too much.

Use LessCSS

UPDATE: This is done. The main style file now uses LessCSS.

LessCSS is a language that extends CSS with some useful features like variables, mixins, operations and nested rules. It can make the styles of a site much, much cleaner.

Hyde includes a LessCSS "processor" that will automatically render your LessCSS files into normal CSS. I'm planning on rewriting the site's styles in LessCSS and using the processor once I get some free time.

View the Code

The code is on BitBucket and GitHub. Feel free to poke around and see how I've set things up. If you have questions or suggestions I'd love to hear them!