<![CDATA[TravisSwicegood.com]]> 2015-01-17T18:46:10-08:00 http://travisswicegood.com/ Octopress <![CDATA[Design Thinking vs Development Thinking]]> 2015-01-17T18:44:00-08:00 http://travisswicegood.com/2015/01/17/design-thinking-vs-development-thinking This morning I read an article on what the ideal operating system should look like. I devoured all three parts, and it got me thinking about my thought process and how I approach development. This post is a loose collection of those thoughts.

What Problem?

One thing that I’ve discovered about my thought process is how I approach problems. Too many times, it’s easiest to start from where I am right now and how I can modify the existing tool / code / product to do what I need. This provides a good starting point for context of what’s immediately possible, but not for solving the problem.

For example, let’s consider the text editor. The main purpose of a text editor is writing things down. You want to be extremely good at that if you’re going to be an editor that people want to use. Based on this description you can build an editor that’s a joy to use and makes the process of getting information into the editor easy and intuitive. There’s a problem with it: what happens when a user is done with the new document that they’ve created? My original description did not include anything about saving or exporting the documents that are created.

Realizing that you’ve left saving out as a feature, you might write up a job story that looks something like this:

When writing a story I want to ensure that it’s been saved so that I can share the saved document with other people.

If you start from where you are, you might think to add a Save feature and tie that to a menu item, a keyboard shortcut, and maybe even a toolbar to provide multiple options to your user. This is a valid concern, but it overlooks one key thing. The user doesn’t care about saving, they just want it saved.

The user’s job is to write, not to save something. Explicitly saving something is a task. Users aren’t interested in performing a task unless they have to. Auto-save is what the user needs. At this point in the process the only thing they need to know is that their work is saved. Instead of focusing on the job at hand and how this feature supports that job, adding a Save feature focuses on the task.

I’ve fallen victim to thinking that focuses on the task instead of focusing on the overall job, but I guard against it now. This causes me to think differently than a lot of developers: rather than focus on fixing one particular thing, I focus on what the underlying (or overarching?) problem or job is. This means I talk past people sometimes because I forget that we’re talking about different things.

How to fix a problem

On an open source project that I work on, I recently opened a pull request that introduced a new higher-level concept to the project in the service of fixing one discrete bug. To me, the discrete bug was a manifestation of the lack of that higher-level structure. Without that common vocabulary, different parts of the code were touched by different developers at different times and there was a discrepancy in how the concept was represented.

To me, that larger problem was what needed fixing. To other developers, the bug needed fixing. Thinking about that larger problem, I tackled that and fixed the bug. Another developer on the project focused on the explicit problem and added the one-line fix to that code path that solved that one bug that manifested itself. On the surface, the one-line fix seems simpler because less code was involved (my fix was a little more than 30 lines). The one-line solution was only simpler when viewed as the task “fix this bug” not “fix the problem that gave rise to this bug.”

To be fair, both are legitimate ways to approach the problem. The one-line fix that focuses on the task at hand fixes the bug and avoids possible over-engineering that might happen by thinking about the bigger picture. It also runs the risk of having the same problem solved in different ways throughout the code base as each “just one-line” fix adds another branch into the complexity of the program.

Thinking like a developer vs like a designer

This all ties back to the story that started this post because of the way the problem was approached. Most developers I know would balk at the idea of creating an operating system, then starting by removing the file system and applications. “But where will I store my files and how will I access them?!” I hear them all exclaim at once. Most designers I know would hear that idea, think for a second, then say “ok, so what replaces it?” followed closely by “and what was the user trying to do when they accessed those files?”

Designers tend to think in terms of solutions to general problems. Developers tend to think in terms of solutions to explicit problems. This is still a nascent revelation to me, but it starts to explain why I’ve always felt slightly out of place in the development world.

It’s also making me question my description: am I still a developer with a bit of design knowledge or a designer that happens to program?

<![CDATA[Rethinking Web Frameworks in Python]]> 2014-07-25T15:41:00-07:00 http://travisswicegood.com/2014/07/25/rethinking-web-frameworks-in-python Listening to @pragdave talk about Elixir’s pipes, I heard him describe how these two styles, while fundamentally the same, have vastly different readability:


Try to explain that line of code to someone who doesn’t program. You start by telling them to just skip over everything until they hit the center; that’s the starting point. Then, you work your way back out, with each new function adding one more layer of functionality.

As programmers, we’ve taught ourselves how to read that way, but it isn’t natural. Consider this pseudo code:

"cat" | list | sorted | join

This code requires that you simply explain what | does, then it goes naturally from one step to the next to the next and the final result should be the joined sorted string.
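The contrast can be tried out in plain Python. The following is a minimal sketch, not the code from the talk: the nested line is an assumed Python equivalent of the pipeline above, and the `Pipe` wrapper with the names `as_list`, `as_sorted`, and `joined` is hypothetical, introduced only to make `|` work on ordinary functions.

```python
class Pipe:
    """Hypothetical wrapper that lets `value | Pipe(func)` call func(value)."""

    def __init__(self, func):
        self.func = func

    def __ror__(self, value):
        # Python falls back to this reflected method when the left operand
        # (a str or list here) does not know how to `|` with a Pipe.
        return self.func(value)


as_list = Pipe(list)
as_sorted = Pipe(sorted)
joined = Pipe("".join)

# Inside-out: the reader has to start at "cat" in the middle.
nested = "".join(sorted(list("cat")))

# Left to right: each step feeds the next.
piped = "cat" | as_list | as_sorted | joined

print(nested, piped)  # both are "act"
```

Both forms produce the same string; only the reading order changes.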

Seeing that code example got me thinking about some of the discussions I’ve had with new programmers as I explain how Django works. I start explaining the view, to which I’m almost always asked “ok, how does the request know what view to execute?” I follow this up by moving over to URL route configuration. After that’s explained, I’m asked “ok, so how do requests come in and get passed through that?” And this goes on, until we’re standing on top of 20 turtles looking down at the simple Hello World we wrote.

In that vein, what would a web framework look like that started with the premise that a regular, non-programmer should be able to read it? Here’s an idea:

def application(request):
    request > get("/") > do_response()
    request > get("/msg") > say_hello()

So, you define an application function that takes a request, that request is then run through a get function with a route, and if that matches it would finally pass off to a final function that does something that would generate the response.
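As a rough illustration of how that syntax could be made to run, here is a minimal sketch in plain Python. It is separate from the werkzeug-based script discussed in this post: the `Request` and `Route` classes are hypothetical stand-ins, and the handlers are assumed to be factories that return a callable. It leans on the fact that Python evaluates `a > b > c` as `a > b and b > c`, with `b` evaluated only once.

```python
class Request:
    """Hypothetical stand-in for a real request object."""

    def __init__(self, method, path):
        self.method = method
        self.path = path
        self.response = None


class Route:
    """Hypothetical route that takes part in `request > route > handler`."""

    def __init__(self, method, path):
        self.method = method
        self.path = path
        self.request = None

    def __lt__(self, request):
        # `request > route` lands here because Request defines no __gt__.
        matched = (
            request.response is None
            and request.method == self.method
            and request.path == self.path
        )
        self.request = request if matched else None
        return matched

    def __gt__(self, handler):
        # Only evaluated when the route matched (the chain short-circuits).
        self.request.response = handler(self.request)
        return True


def get(path):
    return Route("GET", path)


def do_response():
    return lambda request: "index"


def say_hello():
    return lambda request: "Hello, world!"


def application(request):
    request > get("/") > do_response()
    request > get("/msg") > say_hello()
    return request.response
```

Calling `application(Request("GET", "/msg"))` returns the greeting, while an unmatched path leaves the response as `None`. A linter will complain about comparison results being discarded, which is part of why this stays a thought experiment.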

To that end, I’ve hacked up this simple script that uses werkzeug to do a simple dispatch. The implementation is a little odd and would need to be cleaned up to actually be useful, but I think I could be on to something. Just imagine this syntax:

request > get("/admin") > require_login > display_admin()

At this point, require_login can return early if you’re not logged in, and display_admin could repeat the entire application style and be “mounted” on top of the /admin route and respond to a request.path that is slightly different.

request > get("/users/") > display_user_list()
request > get("/user/<id>/") > display_user()
request > post("/user/<id>/") > edit_user()
# or...
request > route("/user/<id>/", methods=["GET", "POST"]) > handle_user()

Any thoughts?

<![CDATA[My First Docker]]> 2014-07-16T09:00:00-07:00 http://travisswicegood.com/2014/07/16/my-first-docker I’ve been told I should check out Docker for over a year. Chris Chang and Noah Seger at the Tribune were both big proponents. They got excited enough I always felt like I was missing something since I didn’t get it, but I haven’t had the time to really dig into it until the last few weeks.

After my initial glance at it, I couldn’t see how it was better/different than using Vagrant and a virtual machine. Over the last few weeks I’ve started dipping my toes in the Docker waters and now I’m starting to understand what the big deal is.

Docker versus VM

I’ve been a longtime fan of Vagrant as a way to quickly orchestrate virtual machines. That fits my brain. It’s a server that’s run like any other box, just not on existing hardware. Docker goes a different route by being more about applications, regardless of the underlying OS. For example, let’s talk about my npm-cache.

Using this blog post as a base, I wanted to create an easily deployable nginx instance that would serve as a cache for npmjs.org. The normal route for this is to get nginx installed on a server and set it up with the right configuration. You could also add it to an existing nginx server if you have one running.

Docker views something like this npm-cache less as the pieces of that infrastructure (nginx and the server it’s on) and more as an application unto itself with an endpoint that you need to hit. It’s a subtle shift, but important in a service-oriented world.

Getting Started

Docker has been described as Git for deployment, and there’s a reason. Each step of a deployment is a commit unto itself that can be shared and re-orchestrated into something bigger. For example, to start my npm-cache, I started by using the official nginx container.

The nginx container can be configured by extending it and providing your own configuration. I used the configuration from Yammer, created a few empty directories that are needed for the cache to work, and then I was almost ready to go. The configuration needed to know how to handle rewriting the responses to point to the caching server.

Parameterizing a Container

This is where things got a little tricky for me as a Docker newbie. nginx rewrites the responses from npm and replaces registry.npmjs.org with your own host information. Starting the container I would know that information, but inside the running container, where the information was needed, I wouldn’t know unless I had a way to pass it in.

I managed this by creating a simple script called runner that checks for two environment variables to be passed in: the required PORT and the optional HOST value. HOST is optional because I know what it is for boot2docker (what I use locally). PORT is required because you have to tell Docker to bind to a specific port so you can control what nginx uses.

My runner script outputs information about whether those values are available, exiting if PORT isn’t, modifies the /etc/nginx.conf file, then starts nginx. The whole thing is less than 20 lines of code and could probably be made shorter.
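The steps that runner goes through can be sketched as follows. This is a Python rendering under stated assumptions, not the original script: the `{{HOST}}` and `{{PORT}}` placeholders in the config, the default host address, and the function names are all hypothetical.

```python
import subprocess
import sys

# Assumption: a typical boot2docker VM address, used when HOST is not given.
DEFAULT_HOST = "192.168.59.103"
# Config path as named in the post.
CONFIG_PATH = "/etc/nginx.conf"


def resolve_settings(env):
    """Report what was passed in; exit if the required PORT is missing."""
    port = env.get("PORT")
    if not port:
        sys.exit("PORT is required, e.g. -e PORT=8080")
    host = env.get("HOST", DEFAULT_HOST)
    print("Using HOST={} PORT={}".format(host, port))
    return host, port


def render_config(template, host, port):
    """Fill in the hypothetical {{HOST}} and {{PORT}} placeholders."""
    return template.replace("{{HOST}}", host).replace("{{PORT}}", port)


def main(env, config_path=CONFIG_PATH):
    host, port = resolve_settings(env)
    with open(config_path) as handle:
        template = handle.read()
    with open(config_path, "w") as handle:
        handle.write(render_config(template, host, port))
    # "daemon off;" keeps nginx in the foreground so the container stays up.
    subprocess.call(["nginx", "-g", "daemon off;"])
```

Inside a container this would be started with something like `main(os.environ)`; it is left uncalled here so the pieces can be exercised on their own.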

Deploying with Docker

I got all of this running locally, but then the thought occurred to me that this shouldn’t be that hard to get running in the cloud. We use Digital Ocean a lot at Continuum, so I decided to see what support they have for Docker out-of-the-box. Turns out, you can launch a server with Docker already configured and ready to run.

With that, deploying is ridiculously easy. I started a small box with Docker installed, then used ssh to connect to the box, and ran the following commands:

docker pull tswicegood/npm-cache
export PORT=8080
docker run -d -e HOST=<my server's IP> -e PORT=$PORT -p $PORT:80 tswicegood/npm-cache

That’s it! Including network IO downloading the npm-cache, I spent less than five minutes from start to finish to get this deployed on a remote server. The best part, I can now use that server to deploy other infrastructure too!


Making deployment of a piece of infrastructure this easy is not a simple problem. I’m sure there are all sorts of edge cases that I haven’t hit yet, but kudos to the Docker team for making this so easy.

Check out Docker if you haven’t. The Getting Started tutorial is really great.

<![CDATA[Timeless Way of Coding]]> 2014-06-22T19:52:00-07:00 http://travisswicegood.com/2014/06/22/timeless-way-of-coding

… we must begin by understanding that every place is given its character by certain patterns of events that keep on happening there.

The above quote is in the opening chapter of one of my favorite books of all time, The Timeless Way of Building by Christopher Alexander. Alexander is famed in programming circles as the author of A Pattern Language, which set the stage for programming design patterns nearly two decades before the Gang of Four wrote the book.

The Timeless Way is the lesser known of his two-volume set. It sets up his pattern book by defining why patterns are important. It is a more thorough explanation of quality than Zen and the Art of Motorcycle Maintenance, without the personal account of a descent into madness and with a focus on quality through the lens of architecture and places. It is on my list of must-read books for anyone who takes themselves seriously as a programmer.

If you’ve ever had one of my code reviews, you’ve probably seen something like this:

All functions need two \n characters between them

Or this gem:

Syntax of 'key' : 'value' in dictionaries will raise a flag on pyflakes. Best to avoid.

Both of these are from a commit message this past week with some simple cleanup, code gardening if you will, on code. My change didn’t affect what the code did at all, but it did make sure that it was more idiomatic Python. Pythonistas pride themselves on a certain style so much that there is even a coined term for this: Pythonic.

The importance of these small changes is summed up in the opening quote from this post. To paraphrase:

Things keep happening the way they happen.

By focusing on producing clean, readable, simple, uncomplicated code, you create an environment where more clean, readable, simple, uncomplicated code can flourish.

Tools I Use

You can stop here if you’re not interested in specific tools, otherwise, here are a few things I use to help keep my code clean.

The editor I use the majority of the time is Sublime Text 3 (though I will always have a soft spot in my heart for Vim). I start with these language-specific settings in Python, which you can use by opening a .py file, then going to Sublime Text 3 > Preferences > Settings - More > Syntax Specific - User and copying this JSON blob into that file.

    {
        "detect_indentation": false,
        "rulers": [72, 80],
        "tab_size": 4,
        "translate_tabs_to_spaces": true,
        "use_tab_stops": true
    }

Beyond some basic settings that cause spaces instead of tabs to be used and setting the tab size correctly, the most important part of those settings is the rulers. There are two lines that are displayed at characters 72 and 80 in every Python file I open.

Docblock comments in Python are supposed to be less than 72 characters. This allows the docblock to be displayed indented in Python’s built-in help and not wrap to the next line. I try hard to ensure all docblocks I write stop before I hit that mark. The second line at 80 characters shows the point where my Python code needs to stop.

I know many developers think that the 80-character limit is too limiting. “I have a big monitor” I hear you say. The optimal character length for a line of text is around 60 characters. Going much beyond that makes it harder for the human brain to process what it’s seeing without scanning back and forth. Plus, take your code and increase it so someone at a meet-up can see your code sitting 20 feet away from the screen, then see how your 120 characters look.

There’s an even more practical consideration when thinking about line length. Forcing this constraint on yourself causes you to think really hard about what is the most effective use of those characters. Is that line really best expressed with an 80 character string in the middle, or can that be hidden behind a variable? Do all of those and conditions in your if statement make your code more readable, or would an intent-revealing function help this code? Constraints, even annoying ones, can really help hone your code design skills.
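To make the last point concrete, here is a small sketch of pulling a chain of `and` conditions out into an intent-revealing function. The `Story` class, its fields, and the function names are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Story:
    """Hypothetical content object with a few flags to check."""
    status: str
    embargoed: bool
    has_byline: bool


def is_ready_for_homepage(story):
    """Name the rule once so callers read as intent, not as a condition chain."""
    return (
        story.status == "published"
        and not story.embargoed
        and story.has_byline
    )


def homepage_stories(stories):
    # The call site now fits comfortably inside 80 characters.
    return [story for story in stories if is_ready_for_homepage(story)]
```

The inline version of that check would run past the 80-character ruler; naming it keeps every line short and gives the rule one place to change.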

Next up, I use the Python Flake8 Lint. This tool scans your code using pyflakes and flags errors for you. Out of the box, it can be a little annoying (especially when you’re learning pep8’s rules). It displays a pop-up when you save your file and tells you all the places your code has errors. This is really useful on your own projects, as it causes you to pay attention to make sure that your code doesn’t raise these errors. But when you’re working with other developers’ code, you might want to reduce the chattiness. You can tweak the settings under Preferences > Package Settings > Python Flake8 Lint > Settings - User. Here are the settings I use for it:

    // run flake8 lint on file saving
    "lint_on_save": true,
    // run flake8 lint on file loading
    "lint_on_load": true,

    // popup a dialog of detected conditions?
    "popup": false,

    // show a mark in the gutter on all lines with errors/warnings:
    // - "dot", "circle" or "bookmark" to show marks
    // - "" (empty string) to do not show marks
    "gutter_marks": "bookmark",

This adds a mark to the gutter on each line that has an error, suppresses the popup, and makes sure that pyflakes is run when I open a file so I can see the errors immediately. To see the actual error, I move my cursor to a line that’s marked and this plugin displays the error message in the status bar.

These might seem like draconian tools that get in the way of coding quickly. Coding fast and coding sloppy are not synonymous. Spend a little time working within these constraints and your fellow developers will thank you.

Plus, you’ll be making sure that the code you write helps to create a better codebase by increasing the quality of the patterns that keep happening there.

<![CDATA[The Case for Django]]> 2014-03-03T13:42:00-08:00 http://travisswicegood.com/2014/03/03/the-case-for-django I get asked a lot where to start if you’re looking to Python for web backend work. A lot of people look at Django and Flask and feel that Flask is where they should start. It’s nice and small, very simple, and after all they’re not doing anything big and complicated, so why start with a big, complicated framework?

This reminds me of something that happens in the running world. People get started running then either a) read Born to Run, or b) hear someone talking about the benefits of so-called barefoot running. (For the record, I’ve only seen a few people actually run barefoot. Most run with minimalist shoes like Vibram FiveFingers™.)

There are many benefits to running with minimal shoes. Proponents point to studies that show lower injury rates amongst bare footers. They talk about our natural instinct to run and how the modern shoe with all of its support and cushioning is actually doing more harm than good.

The next part of their pitch is ignored by many of the so-called Born-to-Runners: it takes a lot of practice to be able to get to the point where you can run 10k, much less an ultra-marathon with minimal shoes. You practically have to start over and slowly build. There is a huge payoff, but it takes time. Otherwise, you’re more likely to injure yourself.

I’m speaking from experience. I didn’t read Born to Run, but I know the claims. When I started running a few years ago, I alternated between a minimal pair of running shoes and a pair of FiveFingers™. I figured since I was just starting out I wouldn’t have any bad habits to break.

There was one snag in my plan: I wasn’t ready for them. I hadn’t built up the running specific muscles. My form wasn’t there yet. I quickly started having plantar fasciitis issues. They weren’t debilitating, but enough to make me take a week off to rest and work on stretching. It flared right back up as soon as I started running again. I had a half marathon a few months out so something had to give. A trip to the running store and about $100 later I had a pair of running shoes that felt like pillows on my feet and a week later the pain was completely gone.

The same thing applies to web frameworks. It might seem like a good idea to stick with frameworks that can be coded in one file, or ones that don’t do everything. But the full-featured frameworks are built on top of a lot of hard-won lessons.

When you’re starting out, you don’t know what a properly factored web application looks like (yet). You don’t know where to draw the line between your model and controller layers (yet). You don’t really know the trade-offs involved in choosing between a relational database and a NoSQL database. And that’s ok. Micro frameworks assume you do, though. They give you a lot (or a little, depending on how you look at it) of rope and it’s really easy to end up with your app looking an awful lot like a noose.

So skip the minimalist when starting out, whether that’s shoes or web frameworks. Build on the experience of others, then start stripping away those layers once you’ve got a solid base.

<![CDATA[Moving On]]> 2013-07-30T10:55:00-07:00 http://travisswicegood.com/2013/07/30/moving-on I took over as the Director of Technology at the Texas Tribune a year ago this month, but if you’re reading this you probably already knew that. What you probably don’t know is that a year ago June I had one foot out the door. So what kept me at the Tribune? A beer with Emily Ramshaw.

She reminded me why I left the start-up world in the first place: to make a difference in the world with my work. I do that at the Texas Tribune, but my role has always been a supporting one. It needed to be, given my straddling the business side, the editorial side, and everything in between. I used to describe my job as not being on the front lines of changing the world, but I was the one supplying the ammo.

I’ve realized over the last few months that I wanted that dynamic to change. It became crystal clear in the early morning hours of June 26 as I watched tools I had helped put in place shed a light on the inner workings of Texas politics. Without our work, everyone would have read about the filibuster and questionable voting times the morning after instead of watching it happen live.

I’m happy to announce that as of today, I’m moving on from Director of Technology at the Texas Tribune to the editorial staff as the Tribune’s first News Apps and Data Editor. I’ll be continuing the amazing work that’s already been done and working full time with Ryan Murphy on our newly formed News Apps team.

I gotta say, I’m super stoked.

<![CDATA[Past, Present, and Future of Armstrong]]> 2013-06-03T21:21:00-07:00 http://travisswicegood.com/2013/06/03/past-present-and-future-of-armstrong Most of you who know me have heard me talk about Armstrong, the open-source news platform that I helped create when I first joined the Texas Tribune. I have talked and continue to talk at length about Armstrong and its future, but I’ve never collected those thoughts into one cohesive document outlining how we got to where we are now, what the current state of the project is, and where I hope to see it go.

This post is my attempt to do that.


Armstrong is peculiar if you look at it from the outside. It might be hard to understand exactly what it is and how it got to the point it is. This section should help you understand that a bit more.

Not a CMS

Using common names can be unfortunate. Most people hear Armstrong CMS and think it’s a content management system akin to Eidos on the closed-source side or Wordpress and Drupal on the open-source side. I’ve always envisioned Armstrong as a platform to build on top of, not simply a CMS. The distinction is small, but important.

You work within a CMS

You build on top of a platform

A CMS is something you use. It provides the tools you need to manage content. A platform is something that provides a base to build upon. It’s my belief that 95% (maybe more) of the pieces that make a news-focused website a news website are all the same. Everyone needs a way to collect similar content into sections, a way to schedule content, or a way to control publication status on content. That last 5% of content, however, is unique and what makes a site interesting. Being able to reuse data about bills throughout your site without having to confine it to a big blob of text is what allows us to do interesting things with a site. This is where Armstrong comes in.

Armstrong isn’t meant to be used out-of-the-box any more than a Lego™ train set is meant to be a toy train set as soon as you unpack it. Both require assembly, but both allow you to exercise some creativity in how you assemble them. To do this, we’ve taken an unconventional approach to packaging everything in the project.

Everything is a package

Python is famous for its horrible packaging solutions. It’s gotten a lot better over the last few years, but people still package most software in one big bundle. I created the packaging schema for Armstrong based on the way I wished Python packages were handled, as if they followed an adapted version of the Unix Philosophy:

Packages should do one thing, and do it well. Packages should work together.

This means that there are some 25 packages on PyPI for Armstrong. Many of these work in concert with other Armstrong packages to form a bigger whole. I broke it down along two main lines with a few others thrown in.


This section of Armstrong contains all of the pieces essential to nearly all websites. These packages either have no models, or are meant to be used almost exactly as-is with little or no modification.

All core packages contain an arm_ prefix in the last part of their name: armstrong.core.arm_content, armstrong.core.arm_wells, and so on. This is to avoid potential naming conflicts since Django still assumes that its apps are all flat without full module names.
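The naming problem can be illustrated without Django installed. The sketch below mimics how a flat app label is derived from the last part of a dotted module name. The helper names are hypothetical, and `armstrong.core.content` and `newsroom.content` are made-up module names used only to show a clash.

```python
def default_app_label(dotted_name):
    """Flat label taken from the last part of a dotted module name."""
    return dotted_name.rpartition(".")[2]


def find_label_clashes(app_names):
    """Return pairs of apps that would end up with the same flat label."""
    seen = {}
    clashes = []
    for name in app_names:
        label = default_app_label(name)
        if label in seen:
            clashes.append((seen[label], name))
        else:
            seen[label] = name
    return clashes


# Without a prefix, two unrelated "content" apps collide...
unprefixed = find_label_clashes(["armstrong.core.content", "newsroom.content"])
# ...while the arm_ prefix keeps the labels distinct.
prefixed = find_label_clashes(["armstrong.core.arm_content", "newsroom.content"])
```

Here `unprefixed` holds one clashing pair and `prefixed` is empty, which is the whole point of the prefix.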

Try as I might, armstrong.core.arm_content did end up pretty big. It contains most of the mixins used to build the larger models throughout the system. Anything that can loosely be considered content in an Armstrong project probably has some connection to this package.


These apps are meant to be usable out-of-the-box, but are most useful as example implementations. This is one area that I didn’t document as well as I should have. Almost no one would (or should, for that matter) use armstrong.apps.articles as their out-of-the-box article implementation. For very simple sites it will probably work, but my assumption has always been that most sites will take that as a guide and build something similar to it.

A great example of this is the armstrong.apps.donations project. We use that pretty extensively at the Texas Tribune since we’re a member-driven organization, but we don’t use it in its out-of-the-box configuration. We have a custom tt_donations app that extends the views to add extra functionality and we have custom backends for all of our payment processing and CRM syncing.

Hatband and Pops

Any news tool is only as good as its admin interface. Unfortunately, most of our time while funded by the Knight Foundation was spent on backend code, but we do have a solid start of a custom admin interface that’s broken down into two pieces.

Hatband is the collection of Armstrong-specific interfaces to the Django admin. It’s meant to be used as a drop-in replacement for django.contrib.admin that extends the behavior. It provides several custom inline interfaces and will hopefully contain even more. It exposes a JSON API for searching any type of model that’s registered with the admin and has search_fields turned on. The plan has always been to use this to create a rich API on top of the admin. Search is simply the first foray into that.

Where Hatband is behind-the-scenes, Pops is the user-interface side. It’s currently built using Twitter’s Bootstrap framework and was built on top of a fork of django-admintools-bootstrap. Pops is meant to be entirely standalone and have no ties to Armstrong at all since it’s simply the skin on the interface.

Current State

The original Knight Foundation grant to The Bay Citizen and The Texas Tribune ended in early summer of 2012. Since that time there hasn’t been any full-time development dedicated to Armstrong, but that’s not to say that development has stopped. Both the BC and TT, along with a handful of other organizations, use Armstrong internally and are continuing development on the project.

There are a few things I want to call out.

Timed Releases of Armstrong

The original idea, and one I’ve deviated from since last year, was to do timed releases every three months. You might have noticed that the version numbers are pretty high. That’s because they follow the format vYY.MM. That way you know when the last major release of Armstrong was. Each one of those releases is just the latest stable code from all of the components of Armstrong that are considered production ready.
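Reading a release date back out of that format is straightforward. A minimal sketch follows; the `parse_release` helper is hypothetical and assumes two-digit years refer to the 2000s.

```python
import re


def parse_release(version):
    """Turn a vYY.MM release name into a (year, month) tuple."""
    match = re.fullmatch(r"v(\d{2})\.(\d{2})", version)
    if match is None:
        raise ValueError("not a vYY.MM release: {!r}".format(version))
    # Assumption: two-digit years are in the 2000s.
    year = 2000 + int(match.group(1))
    month = int(match.group(2))
    if not 1 <= month <= 12:
        raise ValueError("month out of range in {!r}".format(version))
    return year, month
```

For example, `parse_release("v11.09")` gives `(2011, 9)`, a release from September 2011.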

One key point, however, is that the main armstrong package (note the lowercase, that means it is code in its packaged state, not the project as a whole), is just a collection of other packages. You don’t have to install armstrong to be able to use various components. For example, you can pull armstrong.utils.backends into any project without using anything else from Armstrong if that’s all you need.

Release Components

All of the components of Armstrong are released independent of the major armstrong releases. There’s been a handful of component releases in the last year and more are being worked on right now. Each of these follows Semantic Versioning, or SemVer. That means that you can always upgrade within a major version, say from v1.2 to v1.6, and not worry about your code breaking. If anything breaks, a new major number is introduced. So far, we’ve only had to do that once: armstrong.apps.related_content.
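The upgrade rule can be written down as a small check. This is a sketch with hypothetical helper names. It ignores pre-release and build tags, and it treats 0.x versions as unstable, as SemVer does.

```python
def parse_semver(version):
    """Parse 'v1.2' or '1.2.3' into a (major, minor, patch) tuple."""
    parts = version.lstrip("v").split(".")
    parts += ["0"] * (3 - len(parts))
    return tuple(int(part) for part in parts[:3])


def is_safe_upgrade(current, target):
    """True when target is a newer-or-equal release within the same stable major."""
    cur = parse_semver(current)
    new = parse_semver(target)
    return cur[0] >= 1 and new[0] == cur[0] and new >= cur
```

With this, `is_safe_upgrade("v1.2", "v1.6")` is true, while a move from `v1.6` to `v2.0` is flagged as potentially breaking.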

All of the components in Armstrong that reach a v1.0.0 release, other than those grandfathered in for armstrong’s first stable release (v11.09), are being used in production. Following SemVer, production code goes to v1.0.0 as soon as it’s in production. Part of the criteria for code making it to stable is that it’s being used on at least one site as production code. There’s one v0.x release that has running code so you can install it, then once that’s ready the v1.0.0 should be a simple version number bump with no code change.

The production requirement goes for point releases of code once it goes stable as well. Someone has to be using the code that’s in a pull request in order for it to be considered stable (and tested) enough for a release. Unit tests are a requirement, but code that is running on production is the final requirement for any component that’s released as stable.

This puts the burden mainly on the Bay Citizen and Texas Tribune to make sure we’re running code that we’re trying to get released. Right now we’re the main ones that are affected by this, but it allows us to ensure the quality of the code. As more organizations start to contribute, they’ll have to play by the same rules.

Future Plans

There are a handful of areas where I would like to see Armstrong grow in the future. These coincide with the technical direction here at the Texas Tribune, so we’ll be driving many of these changes based on restructuring of our internal code.

Streams of Content

One of the biggest regrets in Armstrong was that I relented when arguing that we should structure content to exist independently with streams that any content could opt in to. The argument against this route was that we had a tried solution (monolithic concrete-model dependencies), so why try something new until we’d replicated what we knew works? The old method does work, but it’s not scalable, whereas independent streams that you push content into mean you can decouple that relationship and scale it to many different types of content.

For example, say you have an Education Stream that contains content related to education. Stories can put themselves in that stream by providing the information the stream expects, but so can an update to data about a campus. All the data has to do is be able to render itself the way the stream expects and it can opt-in.

My initial plan is that an object would provide at least one rendered version of itself and its canonical URL. For a section stream, that rendered version would probably be the title, plus artwork, byline, and description.
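The opt-in contract described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the names `StreamItem`, `Stream`, `Story`, and `CampusUpdate` are hypothetical and do not come from Armstrong. The one thing the sketch takes from the plan is that an item supplies a rendered version of itself and its canonical URL.

```python
from dataclasses import dataclass, field
from typing import List, Protocol


class StreamItem(Protocol):
    """Hypothetical contract: what a stream expects from anything that opts in."""

    canonical_url: str

    def render(self) -> str:
        ...


@dataclass
class Story:
    title: str
    byline: str
    canonical_url: str

    def render(self) -> str:
        return f"{self.title} by {self.byline}"


@dataclass
class CampusUpdate:
    campus: str
    canonical_url: str

    def render(self) -> str:
        return f"Updated data for {self.campus}"


@dataclass
class Stream:
    name: str
    items: List[StreamItem] = field(default_factory=list)

    def push(self, item: StreamItem) -> None:
        # The stream never inspects the item's type, only the contract.
        self.items.append(item)

    def rendered(self) -> List[str]:
        return [f"{item.render()} <{item.canonical_url}>" for item in self.items]


education = Stream("Education")
education.push(Story("School funding", "A. Reporter", "/stories/school-funding/"))
education.push(CampusUpdate("Example High", "/campuses/example-high/"))
```

The stream has no import of, or foreign key to, any concrete content model, which is the decoupling the monolithic approach lacks.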

This decentralization means that the stream display can be moved around to different services. You could also make it streams all the way down. Content notifies data streams that expect certain JSON documents and those data streams notify content destination streams (think: HTML, iOS, computers on the dashboard, TVs, and so on and so forth).

Those with a background in enterprise software might recognize this type of decentralization by another name: Service Oriented Architecture, or SOA. This type of architecture is not simply nice, it’s a requirement in a multi-device world. Building services that can only return HTML is shortsighted and going to cause problems as the number of devices our content is displayed on explodes in the coming years. Decoupling content from the various streams they’re consumed in is the first step in future proofing Armstrong.

Testbed for a new Django Admin

One area where I think a SOA allows greater freedom is the admin interface. The Django admin is great for what it gets you out-of-the-box, but you outgrow it very quickly especially when it’s laid next to modern web tools. You have to remember, the Django admin was designed in 2004/2005 when your main option for dealing with any type of data was phpMyAdmin and editing the database directly!

One thing I hope to do with the Hatband/Pops combo is create a testbed for experimenting on top of Django’s admin. These roles aren’t set in stone, but my thought is that Hatband will serve as the place for Armstrong-specific experimentation and Pops will be the place for generic Django experimentation.

Since starting Hatband and Pops, a few other tools have popped up in this space. Nobody has gotten significant traction, but I’m not opposed to joining forces with one of them if somebody does start to head down a solid path. There are a few things that I see as requirements, though.

  1. It needs to build on top of the existing django.contrib.admin code. The admin’s bones (with a few exceptions) are really solid. Rewriting it from the ground up isn’t a good use of time. It needs a lot of refactoring and many of the changes wouldn’t be backwards compatible, but it’s possible to make it happen by simply building on top of what already exists.

  2. It needs to focus on decoupling the actual interface from the discovery and registration of apps along with exposing an API to them. Right now, the biggest wart on the Admin is how tightly coupled display and discovery are. Any new admin needs to focus on providing a solid API (both Python and REST) for working with the apps that are registered. On top of that you can build a solid, default client interface. Having a dogfooded API ensures that others can build alternative interfaces on top of it. Think: native iOS apps for the admin!

  3. It needs to exist outside of core Django. I think the Admin is one of the reasons Django has gotten the traction it has. That’s great, but right now it moves too slowly for that to be useful for such a potentially rich web application. Having the admin exist as a separate project means that it can move more nimbly, release more often if it needs to, and gather support from those who might have no interest in working on a traditional web framework, but would love to work on a web application.
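To make the second requirement concrete, here is a rough sketch of registration kept separate from display. Everything in it (`Registry`, `register`, `describe`, the field lists) is hypothetical and is not the django.contrib.admin API. It only shows a registry that knows what is registered and can describe it as JSON for any client to consume.

```python
import json
from typing import Dict, List


class Registry:
    """Hypothetical registry: tracks what is registered, not how it is shown."""

    def __init__(self) -> None:
        self._models: Dict[str, List[str]] = {}

    def register(self, name: str, fields: List[str]) -> None:
        # Copy the list so later changes by the caller do not leak in.
        self._models[name] = list(fields)

    def describe(self) -> str:
        """Serialize the registry so any client (web, native app) can read it."""
        return json.dumps(self._models, sort_keys=True)


registry = Registry()
registry.register("story", ["title", "byline", "body"])
registry.register("section", ["title", "slug"])
```

A default web interface and an alternative client would both start from the output of `describe()`, which is what dogfooding the API means in practice.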

Separating Editing and Publishing

Currently, CMS tools assume that you’re editing and publishing content in the same tool. Those are two different workflows that need to have different tools: a collaborative authoring/editing tool and a publishing tool. They can exist on the same domain and even within the same major tool, but each workflow needs a different presentation and to be separated from the other.

The editing tool needs to have real-time feedback for the user when edits are happening. It should update in real-time showing who is editing what, the changes they’ve made, and so forth. It could include the ability to lock a field by focusing on it and return control to other users by removing focus, but ideally it’s smart enough to work with multiple users editing the same content.

This interface needs to focus the user on writing and editing. Things like sections, tags, locations, and so forth are all secondary content that should be tucked away, accessible only when called on, to allow the user to focus on the task at hand. It would take a lot of user testing to design this system in a way that it could replace the existing tools (the number of emailed Word documents at the Tribune is still a source of embarrassment for me), but you’d end up with a solid workflow to take something from idea to finished product with the right focus.

Once the content has been written and edited, it needs to be published. Focusing explicitly on content takes all layout decisions away from the authoring experience and moves them to a place where you can make device-specific choices.

Responsive design should be the first choice for all content so it reaches the broadest audience, but there is also room for device-specific display where it makes sense. Layout tools should enable this.

There is a start for this in armstrong.core.arm_layout with some code to help reuse model-specific rendering throughout the site, but those are baby steps toward a GUI-based layout tool.

Being able to control any aspect of the display of the site across multiple devices is the Holy Grail of a news platform, and one I hope we’re able to tackle as part of the Armstrong community.

Up Next

There are some very immediate plans, however. First, we need to roll another release of armstrong. I plan on creating armstrong v13.06 at the end of the month with the latest stable versions of all of the stable components.

v13.06 is going to be a maintenance release to get everything updated to the latest and remove the component-level requirement on Django and ensure full testing of Django from Django 1.3.x through Django 1.5.x. From here forward, the only dependency on Django is going to be specified at the armstrong level, leaving it up to developers to work through whether they can upgrade. We’ll continue to test components against all supported versions of Django.

This release will put us back on the timed release. The next release after this will be v13.09. I know the folks over at the old Bay Citizen have some new code they would like to see released and I’m hoping to have a solid admin interface for armstrong.apps.related_content by then as well.

<![CDATA[Tools vs Materials]]> 2013-05-20T13:21:00-07:00 http://travisswicegood.com/2013/05/20/tools-vs-materials Last week I attended the Artifact Conference. It was my first design conference, and I was impressed. The talk lineup was amazing and speakers consistently delivered. The talk that stuck in my head (and not only because it was the last talk I attended) was Dan Rose’s Photoshop’s New Groove.

The talk was about using Photoshop while designing for the web. It was the counter balance to all of the random snarky comments on Twitter and so forth about Photoshop and how it should die in a fire (I’m paraphrasing, but just barely) and how you’re not a real web designer if you use Photoshop and so on and so forth. It was a great talk.

Dan spent a lot of time talking about using the tools that work for you. Dan Mall had a great line in his talk on Monday about fighting your tools as a way to kill creativity. I think he’s 100% right. Find the tools that work for you and use the hell out of them.

That said…

HTML and CSS aren’t the tools of web design, they’re the raw materials.

The difference between tools and materials means the world. It’s like a chef who doesn’t cook their new recipes because they know how the various ingredients are supposed to work together. A huge part of culinary school is learning how various ingredients interact with each other to produce different effects. It borders on a chemistry degree. The training doesn’t stop with the chemical reactions, though.

A chef wouldn’t produce a new dish by only thinking about the interaction of certain ingredients and putting together something she thinks would work well, she actually makes it. The designer who is creating for the web who doesn’t move to HTML and CSS with their design is like the chef who relies on the cooks to know exactly what to do. Until you get your hands dirty with code, you’re simply brainstorming ideas.

It’s taken me a long time to actually be able to articulate this. It wasn’t until I heard the Dans refer to HTML and CSS as tools that it clicked: we’re looking at the world differently (shocking! I know!!). Hearing designers at Artifact talk about HTML and CSS as tools akin to Photoshop is what gave me the proper lens to realize where the disconnect was.

<![CDATA[Open Source Licenses]]> 2013-03-06T10:32:00-08:00 http://travisswicegood.com/2013/03/06/open-source-licenses IANAL, but I like to pretend like I am on the Internets. This past week at NICAR, the discussion of open source licenses came up in one of the evening tracks over a few bourbons, or it might have been wine by that point, but I digress. The general theme: licenses are confusing.

I know a little bit about them, and I’m hoping to shed some light on them for fellow journalisty type developers who are thinking about releasing their code but aren’t sure which license they should use.

Caveats and such: I’m seriously not a lawyer, this isn’t legal advice, and so on, et cetera. Please talk to one if you have serious legal questions.

Range of Licenses

There are 69 official open source licenses in use. There are many, many more that are snowflake licenses—licenses that have provisions that are unique to them. Many companies, including ones that I’ve worked for in the past, have created custom licenses by modifying one of the main open source licenses. Many of these have been written by lawyers, but snowflake licenses are an unknown quantity until they’ve been tried in court.

You should avoid snowflake licenses for your open source code. Having a license that is unique to your project increases the barrier to entry. Each developer has to read and understand the license and try to tease out any differences you have with the more common licenses.

Instead of going the snowflake route, opt for one of the popular open source licenses that are commonly used. Each of the licenses has its place, but I’m going to touch on the three that are the most common and one additional license that I think journalists should be familiar with.

GPL: The Viral License

GPL, the GNU General Public License, is possibly the most popular and familiar of the open-source licenses. It’s the license that the Linux kernel and many of the tools that ship with the Linux operating system are released under, as is the wildly popular WordPress blogging platform. I can distribute GPL software any way I want. I can give it away, I can charge, I can do some hybrid of those two. One thing I can’t do is limit what you do with it after you acquire it.

The GPL is a copyleft license, sometimes referred to as a viral license. It’s viral because it forces your hand when it comes to licensing derivative works. Any derivative software must be distributed under a compatible license like the GPL. In other words, if I came up with a way to modify Linux and wanted to distribute it, I would have to distribute it under the GPL license. That distribution could be paid, but anyone who pays for it could then redistribute it at will.

GPLv3 has some interesting provisions too, namely the Additional Terms. These are optional things that the author can add. For example, 7b requires “preservation of… author attributions” in a project. This is useful for businesses that want to release their software but want to make sure that a competitor can’t do a find-and-replace on the company’s name and repackage the software as its own. Anyone redistributing it has to fully credit the original authors, including displaying logos in the user interface and such.

New BSD and MIT: Do what you will

On the other end of the spectrum are the New BSD (more commonly referred to simply as BSD) and the MIT licenses. These two licenses are much more permissive, allowing redistribution with only minor restrictions.

The MIT simply requires that the copyright notice be transmitted with “all copies or substantial portions of the Software.” Essentially, you have to tell the outside world that the software you’re distributing contains the MIT licensed software. Both Backbone.js and Underscore.js, two JavaScript projects that originated in the DocumentCloud project, are licensed as MIT.

The New BSD license says the same thing, plus one other clause that says you can’t reuse the original package’s name nor the names of any of the contributors to “endorse or promote products derived from this software without specific prior written permission.” FreeBSD and OpenBSD use the BSD license as does Django.

Licenses and Communities

My thoughts on licenses have evolved over the years. Jacob Kaplan-Moss introduced me to the idea of thinking of licenses as community identifiers (Side note: he was introduced to this thought process by Van Lindberg, the current PSF Chairman and author of the book Intellectual Property and Open Source). All communities have certain things that they use to identify those who they have a common interest with. Rockabillies have a fashion sense and music that are unique to them. Gangs have the color of their clothes. Developers have their languages and their licenses.

Each sub-community in the open-source community has its preferred license. For example, jQuery is dual-licensed as GPL/MIT, so most developers releasing software for jQuery use a similar license. The JavaScript and Ruby communities tend to use the MIT license, as is evidenced by the amount of MIT code on npmjs.org and Rails. The Python community, and particularly the Django community, has a bias toward BSD.

Releasing software meant to be a part of those communities without following the cultural norms within those communities is a sure way to stick out. It’s like walking into a rockabilly bar dressed in a suit. You should always have a good reason for bucking the norm within a community that you want to be a part of. Trying to release GPL licensed code that builds on top of Django means that you’re not part of the community—you’ve set yourself up as an outsider.

Releasing your software with a more restrictive license than is common in a community that you’re trying to participate in also means you’re placing further restrictions on those in the community. You can use their BSD or MIT licensed code, but they can’t use your GPL code in their projects. That’s essentially telling the other developers that you love their contribution, but not enough to let them use what you’ve built under an equally permissive license.

So what to use?

This is where I should mention discussions of being in Rome and so on, however, I think you should use another license: Apache License 2.0. Apache is essentially a BSD license with two very distinct modifications.

  1. Any contribution to the project is considered to be made under the terms of the Apache License. Contributor License Agreements (CLAs) can be used to enforce something similar with BSD or MIT licenses, but they aren’t guaranteed. The Apache License bakes the terms of the contribution in by default.

  2. Apache grants full rights to any current or future patents that might be derived from the contribution.

This last part is the reason to use Apache. When we started the Armstrong Project, I called up Jacob Kaplan-Moss to ask his opinions on licenses. He sold me on Apache with this line:

If I had [the licensing of Django] to do over again, it would be Apache today.

JKM’s endorsement on the grounds of patent protection was the reason that I advocated to use the Apache License on the Armstrong Project when we started instead of BSD, which is more common in the Python community (remember, community signifiers and all). I’m not worried about any current contributor, I’m worried about who might own the work a contributor makes in 1, 2, or 5 years.

Most newspapers are in a state of flux right now. Let’s say The Timbuk2 Independent contributes a few components to Armstrong. In a few years, they get bought by MegaNewsProfitExtraction, Inc. who then starts evaluating all of the intellectual property they’ve acquired. They realize the contribution from The Independent is patentable and apply for and receive a patent for their small contribution. Under a license like BSD or MIT, MNPE, Inc. can now go around attempting to collect all of the patent licensing fees they’re due based on your use of Armstrong.

I don’t think that scenario is that far out there. Remember, you never write the rules for the guy you like, you write them for the one you don’t. Assuming this scenario, the best thing we can all do to protect ourselves is use a license that protects us from the future patent trolls that are lurking under the bridges of acquisitions.

Got other ideas? I’m interested in hearing them.

<![CDATA[Generic Dangers]]> 2012-09-30T11:08:00-07:00 http://travisswicegood.com/2012/09/30/generic-dangers Here at the Texas Tribune, we started using a project called django-chunks some time last year. Consider this post a cautionary tale and think long and hard before you start using it. We didn’t. We’re paying the price.

The Promise

django-chunks gives you the ability to inject arbitrary chunks of HTML into any template inside Django. You load up a template tag library, call a templatetag, and you’re off to the races. No more waiting on clients (or other departments) to get you copy. “Here’s the chunk, go to town,” you tell them.

That seems pretty good, right? We’re all lazy, that’s why we program. The idea of making a computer do something for us tends to be at the core of why most programmers got into programming. Second only to making computers do the work is making other people do the work, so django-chunks lets us offload that work to someone else.

The Problem

Once you’ve started making things chunks, everything becomes a chunk. It’s a golden hammer of sorts.

This field needs to be copy edited once in its entire existence? Bring in the chunks!

This type of thinking is shortsighted at best, and harmful at worst. This morning I took a look at our join page only to discover that we were giving users a security error due to a mishandled chunk. An image URL had been entered as http instead of https, mixing insecure content into a secure page. Yes, there was a chunk created whose sole purpose in life was to be edited to change out an <img> tag!

The Solution

Don’t use chunks. The case above is where a simple model should have been used if we really need to let non-technical people change out the image. We have a hard requirement on https on that page, and as programmers we can enforce that.
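As a rough idea of what enforcing that requirement could look like, here is a small check written in plain Python so it runs without Django. The `require_https` name is made up for this sketch. Wiring it into a model field as a validator is an assumption about how you would use it, not something from our codebase.

```python
from urllib.parse import urlparse


def require_https(url: str) -> str:
    """Return the URL unchanged if it uses https, otherwise raise ValueError."""
    scheme = urlparse(url).scheme
    if scheme != "https":
        raise ValueError(f"expected an https URL, got {url!r}")
    return url
```

With a check like this in front of the image field, the http URL that broke the join page would have been rejected when it was entered instead of being discovered by users.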

Django is meant for building custom stuff. Go do that. Use things that help you build that custom stuff faster, but don’t go looking for turn-key. Someone else’s answer to the problem isn’t going to be as good as what you could have built.

<![CDATA[Python for Beginners]]> 2012-03-09T08:33:00-08:00 http://travisswicegood.com/2012/03/09/python-for-beginners Yesterday I attended the Pycon Web Summit and there was a lot of talk about getting new programmers started in Python. I’ve been thinking about this a lot the last year since helping found the Austin Web Python User’s Group and I think I have a solution.

Success early, success often

One of the key things we need to be able to do is get developers on every platform up and running quickly. An IPython shell is a wonderful place for a newbie. Do you remember the first time you typed code into a REPL and it did what you told it to? 2 + 2 returned 4, then I’ll bet you tried 2 + 3 just to see that it wasn’t some trick. That sense of wonder, excitement, and, most importantly, accomplishment needs to be priority number one as we move forward whether we’re starting someone on raw Python for data manipulation or Django for full web application development.

The number of steps between “I want to learn to do X” and actually making Python do something for you needs to be minimal. That means the first words can’t be “pip install django” to get Django installed. We need to teach newbies about pip and virtualenv and how to install Python and all of the steps that go in between, but not yet.

The tool

I think an environment built on top of Vagrant is the right solution. We can bootstrap a virtual machine that’s ready to start accomplishing things the second it’s launched. We can’t start teaching Python by telling people they have to go find, download, and install Ruby, Rubygems, VirtualBox, and Vagrant.

The solution to this problem is a one-click installer that gets Vagrant and all of its dependencies installed and presents you with a GUI to select the type of environment you want to create. Need Django, Pyramid, NumPy, SciPy, or hell, even a setup with csvkit? Select that and a few minutes later (assuming a broadband connection), you’re up and running with a prompt that lets you start working.

This is doable. I haven’t done it. I’m not sure that I could (I haven’t programmed for Windows in well over a decade). I want this out there though. I want more people thinking about it and hopefully someone can kick the process off. I’ll help in any way I can and I’ll definitely use it if someone starts the project.

Got a better idea? Let’s hear it.

<![CDATA[Deploying TileStream to Heroku]]> 2012-03-01T13:14:00-08:00 http://travisswicegood.com/2012/03/01/deploying-tilestream-to-heroku This past week I attended the 2012 IRE conference. Remember all of those #nicar12 tweets you saw from me and a few other programmery/journalisty type people? That’s the conference we were all hanging out at.

Custom maps were one of the big themes. There were a few TileMill talks and they were all packed. TileMill, for those who aren’t familiar, is a tool that lets you create custom map tiles (the images that make up maps like Google Maps) so you can have a map that’s entirely unique. An example of this is the Idaho Unemployment Map by the folks over at State Impact.

We’ve been talking about using TileMill at the Texas Tribune for months now, but we’ve yet to actually deploy one. A few of us have TileMill locally and have played with it, but the tile serving component is something we haven’t touched.

I came back from the conference and got sick. Yesterday, while trying to kill some time without thinking of anything particularly important I decided to see what was involved with deploying TileStream.

TileStream is a tile server written in Node by MapBox, the creators of TileMill, to generate and serve the tiles for a map you create. Since tiles are simply PNGs, it seems like you should be able to just generate a whole host of files, upload them to a server, and call it a day. The problem that a tile server solves is having to generate all of those tiles at once. Generating them, then uploading them once is a pain, but what happens if you need to make a change to them?

Lately, I’ve been on a “no new servers” kick. I’m tired of seeing the amount of time spent tweaking servers instead of working on code. DevOps is fun, don’t get me wrong, but sysadmins we are not. With that in mind, I decided to take a look at what’s involved in deploying TileStream to Heroku, a “cloud application platform” that supports a whole host of languages—including Node.

Preparing for Deploy

The very first thing you have to do is create a map and export it. That’s a topic unto itself, so I’m not going to cover it here. I created a simple copy of the state of Texas with all of its counties outlined and colored in. I forget where I procured the shape file, but some Googling should turn it up if you want to follow along.

Make sure to export the file as the mbtiles format when you export it. Where you export it to isn’t important right now, just remember where it’s at.

Next, you have to make sure Heroku is installed. If you already have a working Ruby and gems environment with Git and so on installed, you can run gem install heroku to get the command line client. If you don’t, check out the Heroku Toolbelt for a quick start to get set up. Once you have the command line tools set up, log in to your Heroku account with heroku login and follow the directions.

The next step is to create a new Git repository. Heroku uses Git as its means of tracking files to deploy. You’re going to have to learn at least a little bit of Git if you’re going to use Heroku (side note: I’ve written two books on Git and highly recommend Pragmatic Version Control using Git if you’re new to version control). Once you have a Git repository, run the command heroku create -s cedar inside your working tree. You should see something similar to this:

prompt> heroku create -s cedar
Creating hollow-fire-2448... done, stack is cedar
http://hollow-fire-2448.herokuapp.com/ | git@heroku.com:hollow-fire-2448.git
Git remote heroku added

hollow-fire-2448 is the name of my Heroku application. Yours will be different. Now you have to tell Heroku what to install. To do that for Node applications, Heroku uses a package.json file. That’s the file that Node applications use to set up the dependencies to make sure that everything is installed. For this server, you just need to declare a simple dependency on tilestream. My package.json file looks like this:

{
  "name": "texas-counties",
  "version": "0.0.1",
  "dependencies": {
    "tilestream": "1.0.0"
  }
}

Add that to the repository using git add followed by git commit. The next step is telling Heroku how to run TileStream. Heroku uses a Procfile to handle starting and stopping applications. The Procfile is run using Foreman and can define all of the processes required to run an application. The format is <name>: <command> and for this application you only need to add one line:

web: tilestream --host hollow-fire-2448.herokuapp.com --uiPort=$PORT --tilePort=$PORT --tiles=./tiles

There are a couple of things going on there. First, notice that I’m explicitly adding a --host name and using the name of the app that Heroku told me when I called heroku create earlier. TileStream currently only responds to requests on hosts that it recognizes. You’re going to need to change that line to be whatever your Heroku app’s name is.

Next, notice that both --uiPort and --tilePort are set to the value of $PORT. Heroku exposes $PORT as an environment variable to let your application know what port to listen to for incoming connections.

Finally, you set the directory for tiles to ./tiles. Commit this, then push to Heroku to verify that everything went according to plan.

prompt> git push heroku master 
Counting objects: 6, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (5/5), done.
Writing objects: 100% (6/6), 659 bytes, done.
Total 6 (delta 0), reused 0 (delta 0)

-----> Heroku receiving push
-----> Node.js app detected
… and a whole bunch more output …

Go ahead and stand up and stretch. Go grab some coffee or tea or water, whatever your vice. This step takes a few minutes while Heroku installs all of the dependencies and such for TileStream for the first time. It’s kind of awesome, though. Without a single bit of server administration, you’re just a few minutes away from having a fully operational TileStream server.

… waiting on Heroku to finish up …

Ok, done? Now run heroku open. This launches your browser and opens the URL of the Heroku application. If everything went well, you should see the empty TileStream server like this.

[Image: Empty TileStream]

If you don’t get a page like the above, check the logs by running heroku logs to see if it gives you any clues. Another thing to double check is the process list. Run heroku ps to make sure that web.1 has a state of up.

That big error is the non-user-friendly way of saying there’s nothing in the tiles directory to read and display. Remember the mbtiles file you created earlier? Now it’s time to move it into place. Inside your Git repository, create a directory called tiles and copy the mbtiles file into it. Once the file is in place, add it to Git, then push the new commit to Heroku.

This push is going to take a little bit, depending on how fast your connection is. It has to send the entire mbtiles file over the wire to Heroku. Having done this a few times now, it seems like Heroku might throttle large uploads. I start out at a few hundred KB/sec, then it drops down to around 100KB/sec for about 30 seconds before settling in at 80KB/sec. Their business isn’t receiving huge files, so it would make sense if Heroku did throttle to make sure one large upload didn’t take over their entire pipe.

Once the push has finished, reload your browser window and you should see your new map, much like this:

[Image: TileStream with one map]

And now, you have a tile server. Deploying to Heroku for this is a great fit for the standard news application. You need the ability to handle tons of traffic as you launch, then scale back until it hits maintenance mode where you only need a skeleton server running.

Heroku gives you one dyno (think of that as one process on a server) for free, with each additional dyno costing $0.05/hour (see the Heroku pricing page). That means you can spin up several dynos to handle the initial flood of traffic, then scale back to a smaller set and only have to pay for the initial spike. All without any additional work on your end setting up or configuring servers.
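To put numbers on that, here is a back-of-the-envelope calculation in Python. It assumes the pricing quoted above (one free dyno, $0.05 per hour for each additional one) and ignores anything else Heroku might bill for. The `dyno_cost` helper is just for illustration.

```python
def dyno_cost(dynos: int, hours: float, rate: float = 0.05, free_dynos: int = 1) -> float:
    """Estimated cost in dollars of running `dynos` dynos for `hours` hours."""
    # Only dynos beyond the free allowance are billed in this simplified model.
    billable = max(dynos - free_dynos, 0)
    return round(billable * hours * rate, 2)


# Five dynos for a two-day launch spike, then one dyno for the rest of the month.
launch = dyno_cost(5, 48)
maintenance = dyno_cost(1, 28 * 24)
```

Under those assumptions the spike costs four billable dynos times 48 hours times $0.05, and the single maintenance dyno costs nothing.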

Now, the one caveat to all of this is that I haven’t actually tried running TileStream like this with a production load. I’m not sure what kind of performance we could get out of it or what limitations there might be. The only way to answer that is to try. Hopefully we’ll be able to pluck one of the projects out of our pipeline and do some custom maps for it using TileMill and TileStream.

Where to from here?

The next thing you need to do is write some JavaScript to interact with the tile server. Leaflet has gained a lot of popularity and seems to be the default choice. I’ve yet to play around with it, but that’s a topic for another blog post.

If you’re interested in seeing what all of the pieces look like together, my Heroku app is still online at hollow-fire-2448.herokuapp.com. I’ll try to leave it spinning, but if I take it down, I’ve posted the repository on GitHub so you can see all of the files in their original state.

<![CDATA[Importance of Context]]> 2012-01-14T14:20:00-08:00 http://travisswicegood.com/2012/01/14/importance-of-context Today I discovered the 99% Invisible podcast on architecture and design. Their latest podcast, Pruitt–Igoe Myth, tackles the problems associated with the Pruitt–Igoe housing project which was built in the 1950s in St. Louis to provide affordable housing in the St. Louis urban core. Due to a variety of reasons, which the podcast explores, it was torn down in the 1970s. From Wikipedia:

[Pruitt-Igoe’s] 33 buildings were torn down in the mid-1970s, and the project has become an icon of urban renewal and public-policy planning failure.

After listening to the podcast, you come away with the impression that this isn’t a fair assessment. It was built at the beginning of the White Flight, in a part of the city that saw a decrease in population rather than the forecasted increase of 100,000 every decade. These and other issues contributed to it turning into the very thing it was trying to prevent: a slum.

The building is considered the example of the failure of Modernist architecture as it applied to public housing, but if you view it in the context above you can see that there are many external factors that contributed. It’s easy to pick one particular piece of the puzzle and lay the blame on that for the failure. It’s much harder to try and understand the complex relationship around what caused the issue.

Applied to Programming

This type of logical error is present in many (not all, but many) of the conversations about what framework or language to use, what methodology should be adopted, or even where to found your startup. It’s easy to point to one success or failure and declare “X is why Z happened, so if I want to duplicate Z, then I must/must not do X.” This type of cargo-cult behavior is dangerous and should be guarded against.

Yesterday I tweeted this:

Whoa! JustinTV is moving from #rails to #django. I’m telling ya, Python & the web with a little Django mixed in is about to blow up.

It gives the impression of just that type of “Y leads to X” kind of thought process that I’m against. To clarify, I wholeheartedly expected what kvogt wrote when explaining why they’re moving to Django. To paraphrase: “it just makes sense right now to be on one platform.” Justin.tv isn’t going to suddenly take the world by storm after moving to Django any quicker than they would have if they had moved their Python backend to Ruby.

That said, I stand behind the final point of that tweet. There are tons of shops using Python and Django that aren’t vocal about their use. Python is powering business logic that runs on servers sending me music, tracking my location, displaying my news, and a whole host of other things. Python can do everything from low-level system tasks to scientific and financial analytical calculations to high-level business logic for websites and everything in between.

I can’t help but think there are going to be more Justin.tv-style announcements this year: shops standardizing on one language and that one language being Python.

<![CDATA[Using Basketweaver with GitHub]]> 2011-12-21T15:44:00-08:00 http://travisswicegood.com/2011/12/21/using-basketweaver Last month I blogged about using Travis CI with Armstrong. Things have been going along fine until the last few weeks. Tests were failing due to network timeouts while talking to PyPI. Never one to take failing tests lightly, I set out to fix it.

From local testing, it appeared that there was some sort of selective filtering happening at the server level on PyPI that was causing our tests to fail. All of our tests in the CI environment follow these steps:

  • Install all of the development requirements with pip install -r requirements/dev.txt
  • Install the local package
  • Execute the tests using fab test

I could follow these steps to the letter locally in a fresh virtualenv, but the second they hit the Travis-CI server they would time out while trying to install everything. We’ve seen similar behavior at the Tribune when we roll out new servers. PyPI appears to be up, but installs fail due to timeouts.

Once I confirmed this, I started looking at alternatives to pypi.python.org as our main index for testing. My initial thought was to have a dynamic server that would act as a proxy to PyPI and cache everything locally. This requires the least amount of work long-term—assuming the server stays up. The problem was that nothing worked quite the way I wanted. The closest I found was collective.eggproxy. It felt a little odd and wasn’t very configurable without going the Paster route, so I decided to fall back on basketweaver.

Basketweaver builds a static index suitable for using with pip via the --index-url option. It takes a directory of files, then generates the HTML that pip can scrape to determine if the package exists. This HTML can be hosted anywhere that can serve a static HTML page, such as GitHub Pages.
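As a rough illustration of what such a static index looks like, the sketch below builds one from a list of file names. It is not Basketweaver’s actual code: the function names, the `files/` and `index/` layout, and the file-name parsing (everything before the last `-` is the project name) are my own simplifications.

```python
import html
from collections import defaultdict


def project_name(filename):
    """Guess the project name from an sdist like 'South-0.7.3.tar.gz'.

    Simplification: strip the archive extension, then treat everything
    before the last '-' as the name.
    """
    stem = filename
    for ext in (".tar.gz", ".tar.bz2", ".zip"):
        if stem.endswith(ext):
            stem = stem[: -len(ext)]
            break
    return stem.rsplit("-", 1)[0]


def build_index(filenames):
    """Return {relative_path: html_text} for a simple static package index."""
    by_project = defaultdict(list)
    for name in sorted(filenames):
        by_project[project_name(name)].append(name)

    pages = {}
    # Top-level page: one link per project directory.
    links = "".join(
        f'<a href="{html.escape(p)}/">{html.escape(p)}</a><br/>\n'
        for p in sorted(by_project)
    )
    pages["index/index.html"] = f"<html><body>\n{links}</body></html>\n"

    # One page per project: links to each downloadable file.
    for project, files in by_project.items():
        links = "".join(
            f'<a href="../../files/{html.escape(f)}">{html.escape(f)}</a><br/>\n'
            for f in files
        )
        pages[f"index/{project}/index.html"] = f"<html><body>\n{links}</body></html>\n"
    return pages
```

Writing each entry of the returned dictionary to disk gives a tree of plain HTML pages, which is all a static host needs to serve.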

Working with GitHub

There are a few hoops to jump through when deploying to GitHub Pages. First, make sure you include an empty .nojekyll file. GitHub assumes everything you want to publish is in Jekyll, but this file tells GitHub to not parse your files.

Next, and I can’t count the number of times I’ve done this, GitHub Pages doesn’t give you directory indexes. Basketweaver generates its index in the /index/ directory so you can’t hit the plain GitHub Pages URL and expect to see anything more than an error message. Make sure to add the /index/ after your GitHub Pages URL to view it once you’ve published your changes.

The next thing I do is rework where basketweaver looks for files to build the indexes. I really don’t want to look at a full directory of files at my root directory; instead, I want all of the files stored in the creatively named ./files/ directory. Basketweaver installs a script called makeindex which I can never remember, so I created a run.py file that remembers it for me.
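A wrapper along these lines could be sketched as follows. This is not the run.py from the repository, and it assumes that makeindex accepts the package files as plain arguments, which is worth checking against its own help output before relying on it. The function names are hypothetical.

```python
import glob
import os
import subprocess
import sys


def makeindex_command(files_dir="files"):
    """Build the argument list for basketweaver's makeindex script.

    Assumption: makeindex takes the package files as positional arguments.
    """
    packages = sorted(glob.glob(os.path.join(files_dir, "*")))
    return ["makeindex", *packages]


def main():
    command = makeindex_command()
    if len(command) == 1:
        sys.exit("no packages found in ./files/")
    # Runs makeindex from the current directory.
    subprocess.check_call(command)
```

Calling main() at the bottom of the file turns it into the one command to remember.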

The last thing to do is to use the newly created index when installing packages. For Armstrong, we do this with:

pip install -i http://armstrong.github.com/pypi.armstrongcms.org/index/ \
    -r requirements.txt

I haven’t gone to the trouble of setting up a CNAME for pypi.armstrongcms.org yet, so we’re using the main github.com-based address.

There’s one final gotcha: PyPI uses routing that treats http://pypi.python.org/pypi/South/ and http://pypi.python.org/pypi/south/ as the same URL. That’s why pip install Django and pip install django both work even though the former is the correct package name. The URL spec is ambiguous as to whether this is correct, but most web servers are case sensitive, including GitHub Pages.

This will get you if you have dependencies on packages that don’t use all lowercase names, such as South, Fabric, or Django. All three of these are dependencies of Armstrong. The fix is to make sure that your install_requires and requirements files have the correct case. The easiest way to determine this is to look at the output of pip freeze and make sure you’re using the same package name as it generates.
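The comparison against pip freeze can be automated. Here is a minimal sketch, assuming you have the lines of your requirements file and the lines of pip freeze output as two lists of strings; the function names are hypothetical and the name parsing is deliberately simple.

```python
import re


def requirement_name(line):
    """Extract the package name from a requirements line such as 'South==0.7.3'.

    Returns None for blank lines, comments, and option lines like '-e ...'.
    """
    line = line.split("#", 1)[0].strip()
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
    return match.group(0) if match else None


def case_mismatches(requirements, freeze_output):
    """Return (written, canonical) pairs whose names differ only by case."""
    canonical = {}
    for line in freeze_output:
        name = requirement_name(line)
        if name:
            canonical[name.lower()] = name

    mismatches = []
    for line in requirements:
        name = requirement_name(line)
        if name and name.lower() in canonical and canonical[name.lower()] != name:
            mismatches.append((name, canonical[name.lower()]))
    return mismatches
```

For example, `case_mismatches(["south==0.7.3", "Fabric"], ["South==0.7.3", "Fabric==1.3.2"])` flags `south` as needing to be written `South`, which is exactly the kind of entry that installs fine from PyPI but 404s on a case-sensitive static index.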


At the end of the day, this keeps our tests from being held hostage whenever PyPI goes on the fritz or starts randomly filtering requests as it seemed to do this past week. All that said, we’re still borrowing other people’s infrastructure. GitHub had a little blip while I was writing this post, underlining that you get what you pay for.

While you can use Basketweaver and GitHub to create a mirror of sorts for your packages, make sure you control the infrastructure if it’s mission critical that everything always stays up. That, or pay for it so there’s someone to call when it goes down.

<![CDATA[Editing Mode]]> 2011-11-23T16:51:00-08:00 http://travisswicegood.com/2011/11/23/editing-mode In case you didn’t know, I use computers. A lot. Between working as a programmer, writing books, and the occasional leisure time spent playing on computers, the vast majority of my life is spent with a screen of some sort in front of me. That time means I come across and try a lot of different tools, and some of them actually make my life better.

One such tool I’ve started using extensively while writing my latest book is Notability. It’s a note taking application that lets you import PDFs that you can write directly on top of. This is important because I can’t see the typos from within my text editor.

Switching environments when switching tasks is an important concept I picked up a while back. For me, that switching comes when I build the book and switch over to a PDF version on my iPad to read it. With Notability, I can take the PDF version of my book and change rooms or sometimes just turn the chair around away from the desk, and switch into “editing” mode.

I’m not alone in using an iPad for editing. I hadn’t found an app that worked well for note taking though, so I often switched back to my text editor to write notes. Having to mentally switch context back and forth and back and forth as I physically switched devices hurt my productivity. Being able to do it all in one app has made iPad editing much more feasible.

Once I’ve finished an edit pass and have a whole slew of changes to make, I switch back to my computer with my iPad close by. Notability lets you change the color of the pen you use, so I swap it out for green, and slightly larger for impact, then start slashing through all of the red as I mark edits off. The satisfaction from marking something off with a physical slash can’t be overstated.

I’ve been using Notability for about a month now and don’t know how I managed to edit without it. I highly recommend it if you have an iPad and are doing any type of writing/drafting work.

Question for the Reader

I’ve been considering a series of short posts like this about tools that I use and how they fit into my work flow. I love to watch people work and see how they interact with their systems, though. Is this something that interests you?

<![CDATA[Travis and Python]]> 2011-11-11T15:39:00-08:00 http://travisswicegood.com/2011/11/11/travis-and-python Today I took my name back and got Armstrong tests running on Travis CI. Travis CI is the distributed, community run continuous integration server that the Ruby community has put together. It lets you do all manner of fun things, like testing in dozens of different Ruby version configurations.

You’re probably wondering what Armstrong is doing there with all of this talk of Ruby. No, I didn’t rewrite Armstrong in Rails last night. No, I didn’t convert all of our fabfiles over to Rakefiles either. Instead, I subverted it from within.

Travis CI uses a .travis.yml file for all of its configuration. There are two key fields that it gives you that let you do fun things with it: before_script and script.

before_script runs before your tests start. It’s like setup in the xUnit world, but for your whole environment. Each of the Armstrong components ships a requirements/dev.txt file, so I tell Travis to do a pip install -r of that during setup. That’s right, Travis CI has pip installed!

Next, I’ve set the script to use our test runner, fab test and we’re set. I had to add a few environment variables to turn off our coverage reports—they don’t provide much value when there’s no one there to view them—and we don’t need to do a re-install like we do on a local environment.

You can see this in action by checking out the current build status for the armstrong.core.arm_wells component here. Here’s the .travis.yml file’s contents:

rvm:
  - 1.9.3
before_script:
  - sudo pip install -r requirements/dev.txt
  - sudo pip install .
script:
  - fab test
notifications:
  email: false
  irc:
    - "irc.freenode.net#armstrongcms"

There’s work happening to bring native Python support. Native support means being able to test against multiple versions and such. Be sure to check out the #travis channel on Freenode if you’re interested in helping out.

<![CDATA[Elegantly Simple]]> 2011-10-23T22:16:00-07:00 http://travisswicegood.com/2011/10/23/elegantly-simple JavaScript catches a lot of flak for its “ugliness,” but I’m rather fond of the language. Its first-class functions make up for any quirks you have to deal with in the language. Consider this test case:

It generates this output when run with --spec:

I’m using test cases like this throughout my upcoming Programming Node.js book to test output of some of the simple scripts.

Yes, I know you can get some amazingly expressive test cases in other languages, but I dare people who say that JavaScript is an ugly language to find fault with this bit of code.

<![CDATA[50 Days]]> 2011-06-18T00:00:00-07:00 http://travisswicegood.com/2011/06/18/50-days Shh… Don’t tell my editor I’m blogging. I’m procrastinating by writing this blog post instead of working on Programming Node. I’ll still get to that, but this is on the brain right now.

Today marks the 50th straight day of pushing code to GitHub. My work on Armstrong has made a lot of this possible—it’s easy to push code when you’re getting paid to write open source software—but not every day has been Armstrong related code.

During the course of the last 50 days, I’ve rediscovered a few things that I want to share, in case anyone else thinks that they can’t possibly do this without changing jobs.

Keep it small

I’ve written about manageable chunks in writing, but not in contribution. It’s easy to make excuses about why you aren’t pushing code on a daily basis. You need to clean the code up; it’s not good enough yet; or it’s not really significant enough to make a difference.

Excuses. All of them.

Every single piece of code you write has importance. Otherwise you wouldn’t write it. There are exceptions to this rule, but those are outliers. Most of the stuff you and I would write and go to the trouble of committing is going to be useful to someone.

Case in point, earlier this week I helped add some interactivity to a timeline on the Texas Tribune. My contribution was trivial, but it might be useful to someone trying to do something similar, so it’s up on GitHub.

There’s always something

There is always something you can do with 5 minutes. I’ve made a lot of contributions to bash-it. Think of it as your terminal on steroids, with pretty colors. I started out with some minor tweaks, then found some places where code could be better handled, then other devs built on that, and I’ve started refactoring some other parts.

I spend the vast majority of my time looking at a terminal, so it needs to fit like a glove. Working on bash-it means I’m getting more and more familiar with my environment and making some pretty cool enhancements.

Find something that you use, something that would make your life a little bit better if it just had X, then go to town and try to figure out how to do X in it. My bash programming sucks. Seriously, I wouldn’t know where to start to write a real bash program, but I can muck around in the internals and figure it out. Just because you don’t know how to program in a language doesn’t mean you: 1) can’t, 2) shouldn’t, 3) aren’t fully capable of figuring it out as a smart human being which I know at least some of you are.

Just start

It’s really easy to get part of the way through a month and say “oh, I’ll start the first of next month.” No. No you won’t. Well, if you’re me you won’t. I have a horrible tendency to want to go big or go home. Not necessarily a bad thing in and of itself, but not good for just getting shit done.™

It’s especially bad when “going big” is “I’m going to commit code every day in a month” and you’re already into an existing month. Then ya wait and you lose that initial momentum.

So the answer for me is to just start. The raw #s are what matters. Get out there, do something, start tallying it up.

<![CDATA[Armstrong on Vagrant]]> 2011-06-07T00:00:00-07:00 http://travisswicegood.com/2011/06/07/armstrong-on-vagrant We released our first version of Armstrong this past Wednesday. After taking a quick breather, I set out on getting Armstrong set up inside a Vagrant virtual machine to make evaluation easy. I finally got it running. There’s more information about getting started in the README, where it belongs, but I ran into some interesting technical issues while setting it up that I want to document here.

Vagrant + Puppet + pip

I initially wanted to create a full build-script inside Vagrant that could be used to set up the entire environment. I used puppet to start the process and found the puppet-pip provider so I was even going to be able to install Armstrong easily. Or so I thought.

There’s something happening when puppet runs pip that causes the installation to fail. I’m a big subscriber to select not being broken, but in this case I think there’s something odd in the combination of pip and puppet. The reason is that pip install armstrong via an ssh connection to the same virtual machine works. After briefly discussing it on #pip on Freenode, I opened ticket #298 which outlines the issues we ran into.

I finally decided to go the pragmatic route. For the time being I have a box that’s installed the way you would if you had a raw box yourself. It’s not ideal, but our new armstrong box (warning, that’s a 500 MB download) boots up with everything you need to start playing with Armstrong.

Eventually, either I’ll figure out what the issue with pip+puppet is or I’ll switch to some other method that will work. My reason for picking puppet was pretty simple. The provisioning section of the getting started guide for Vagrant shows you puppet code and says essentially “Chef is too complex to simply show you how, so just use this prepared stuff.” I like simple. Right or wrong decision, I’m not 100% sure yet.

Django Server on Bootup

The server runs on startup thanks to upstart in Ubuntu. As far as Ubuntu is concerned, Armstrong is now a service that can be started and stopped with start armstrong, stop armstrong, and so on.

Upstart works on the concept of events. Different tasks emit different events that other tasks can be configured to react to. There’s a startup event and a net-device-up event and so on. I tried all manner of combinations before it dawned on me: the VM boots first, and only then does Vagrant mount the NFS share with the project. None of the boot-time events fire late enough for the project files to be there.

Once I figured that part out, this recipe helped get things started. A quick task that starts monitoring for the config/development.py file that is mounted after booting was all I needed to get runserver_plus going on “bootup”. You can check out the upstart scripts being used in the repository.
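The real thing is a pair of upstart jobs, but the idea of waiting for the mount is easy to show in Python. This is only a sketch of the polling approach, not the scripts from the repository; the function names, timeout, and polling interval are my own choices.

```python
import os
import subprocess
import time


def wait_for_file(path, timeout=300.0, interval=1.0):
    """Poll until `path` exists.

    Returns True once the file is there, or False if `timeout` seconds
    pass first.
    """
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def start_when_mounted(project_dir, command):
    """Run `command` in `project_dir` once config/development.py shows up."""
    marker = os.path.join(project_dir, "config", "development.py")
    if not wait_for_file(marker):
        raise RuntimeError(f"{marker} never appeared; is the share mounted?")
    return subprocess.Popen(command, cwd=project_dir)
```

Something like `start_when_mounted("/vagrant", ["python", "manage.py", "runserver_plus"])` would then stand in for the service start, with both the directory and the exact command being placeholders for whatever your box uses.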

I chose runserver_plus from django-extensions rather than the built-in runserver because of issue 15880. Since I’m starting the script on start up, there’s no interactive interface and the watcher gets a little wonky. It works out though, because you get the awesome werkzeug debugger for development.


Minus a few oddities in the process, I’m really pleased with the end result. It should be noted that this is meant for development only. As we near our first stable release later this year I hope to be able to create another box that’s more deployment ready, but hopefully this will get you started down the right path.

<![CDATA[TekXI Recap]]> 2011-05-28T00:00:00-07:00 http://travisswicegood.com/2011/05/28/tek11-recap Had a good week at 2011’s version of tek. Thanks to Marco Tabini and his whole crew for putting together another great conference this year. I haven’t professionally developed in PHP for several years now, but still consider this a must attend conference. This was my 4th year. The people and the content make it worth attending, even though I’m mostly doing Python work these days.

I gave two talks this year, both on Git. Both talks went well, but my advanced Git talk needs some tweaking so I can get it in at an hour. I always plan too much material when I first give it, so it needs a little more taken out.

As promised, I am going to get both talks online over the next week. Each of the repositories we walked through in the advanced talk is going to be posted to GitHub in its “before” state so that you can play with it. They also include README files that explain what you’re doing and how to do it.

I’ve already posted my amending and rebasing repositories. You can search my github for pres. to see all of the repositories. I’ll post again once I have them all up.

One of the more interesting evenings this year was a late-night hackathon that involved two 5 gallon kegs from Jason Sweat’s personal stash. I went to bed early’ish, but got this image emailed to me around 1:30. I’m told whiskey fueled its creation. :-)

Stealing Swicegood Code