
The problem with Python namespaces and modules (or, Python Namespaces. There be dragons this way.)

Yesterday I lamented the issues with namespaces in Python. It's not really the namespaces, it's the marketing of namespaces. Newbies to the community (something I still consider myself for most purposes) are drawn to modules thinking that there's a one-to-one relationship between file hierarchy and namespaces. And there is. Well, sort of.

You have to read the entire manual or happen to have someone point out the difference between namespaces and modules to even realize there is one. Under most circumstances, you won't notice they are different until you start to do something slightly complex. Say, for example, building an application with multiple modules inside a shared namespace, each module in a separate repository with its own history. There needs to be a "there be dragons" warning to let people know.

Those dragons are these: you have two paths inside sys.path that each contain a package with the same name, as I do with all of the Domain51 code. Package one has foo as a module while package two contains bar. The assumption would be that Python acts like most other languages and exhausts its sys.path trying to find both packages, but that's not the case. It gets to the first one, then pretends the second doesn't exist.
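Here's a minimal sketch of that behavior, assuming Python 3 and regular packages (each copy has its own __init__.py). The names dragons_demo, one and two are throwaway placeholders, not anything from the Domain51 code.

```python
import importlib
import os
import sys
import tempfile


def make_package(root, package, module, body):
    """Create root/package/__init__.py and root/package/<module>.py."""
    pkg_dir = os.path.join(root, package)
    os.makedirs(pkg_dir)
    open(os.path.join(pkg_dir, "__init__.py"), "w").close()
    with open(os.path.join(pkg_dir, module + ".py"), "w") as fh:
        fh.write(body)


base = tempfile.mkdtemp()
first = os.path.join(base, "one")
second = os.path.join(base, "two")

# Two separate sys.path entries, each holding its own copy of the package.
make_package(first, "dragons_demo", "foo", "NAME = 'foo'\n")
make_package(second, "dragons_demo", "bar", "NAME = 'bar'\n")
sys.path[:0] = [first, second]
importlib.invalidate_caches()

# The copy under "one" is found first, so its module imports fine.
foo = importlib.import_module("dragons_demo.foo")
print(foo.NAME)  # foo

# The copy under "two" is never consulted, so bar looks like it is missing.
try:
    importlib.import_module("dragons_demo.bar")
    found_bar = True
except ImportError:
    found_bar = False

print(found_bar)  # False
```

Swap the order of the two sys.path entries and it's foo that goes missing instead.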

The fix for this is explicitness, one of Python's cardinal virtues. You have to declare the namespace in order for it to work. Since you hardly see this anywhere in Python, because Pythonistas feel that namespaces are a bad idea, here's the code you need to include in your __init__.py file to make the package declare itself as a namespace:


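# Goes in the __init__.py of every copy of the namespace package.
# pkg_resources ships with setuptools.
import pkg_resources
pkg_resources.declare_namespace(__name__)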

That's it. Now Python becomes smart again, and you can have real namespaces with identical directory structures existing side by side. Python is perfectly capable of finding them, now. Which raises an interesting question: why does Python scan the entire sys.path looking for files, building up a list of what declares which namespaces and where, only to ignore it unless the packages are explicit about it? I haven't dug into the source to be sure, but it seems it would have to scan the __init__.py files in order to know whether there's something there.
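For what it's worth, the standard library has a counterpart to that declaration in pkgutil.extend_path. The sketch below swaps it in for pkg_resources so it runs without setuptools. It assumes Python 3, and shared_demo and the directory names are made up for illustration.

```python
import importlib
import os
import sys
import tempfile

# Stdlib counterpart of the declaration, written into every copy's __init__.py.
NAMESPACE_INIT = (
    "from pkgutil import extend_path\n"
    "__path__ = extend_path(__path__, __name__)\n"
)


def make_portion(root, package, module, body):
    """Create one copy of the package that declares itself as a namespace."""
    pkg_dir = os.path.join(root, package)
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as fh:
        fh.write(NAMESPACE_INIT)
    with open(os.path.join(pkg_dir, module + ".py"), "w") as fh:
        fh.write(body)


base = tempfile.mkdtemp()
first = os.path.join(base, "one")
second = os.path.join(base, "two")
make_portion(first, "shared_demo", "foo", "NAME = 'foo'\n")
make_portion(second, "shared_demo", "bar", "NAME = 'bar'\n")
sys.path[:0] = [first, second]
importlib.invalidate_caches()

# With the declaration in place, modules from both directories are importable.
foo = importlib.import_module("shared_demo.foo")
bar = importlib.import_module("shared_demo.bar")
print(foo.NAME, bar.NAME)  # foo bar
```

The only difference from the failing layout is the two lines in __init__.py, which extend the package's __path__ to cover every matching directory on sys.path.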

But I digress. There's a bigger dragon that's not even hinted at: Python's inability to find modules.

Take, for example, my python-stupidity repository on GitHub. Run the test.py file and you can see the error for yourself. There are two barfoo modules within the path, but Python decides to act the village dolt and stops as soon as it hits the first one that might match. This particular case is caused by foobar trying to import a name from barfoo that doesn't exist in foobar.barfoo.

This is, in my opinion, a huge issue. Note that foobar.barfoo declared its namespace. It said loudly, "I am me", and Python ignored that fact in favor of relative imports. Not only that, but it stopped and started pouting as soon as one module that said it was foobar.barfoo couldn't match.

Why not finish looking through the rest of sys.path? Why not pay attention to that precious declaration Python wants you to add to explicitly become a namespace?

Like almost all problems with programming languages, however, there is a fix. At first glance, I thought it might be the way PHP handled it: just include a separator at the beginning of the import. That didn't work, but in searching for the solution, I found out that Python supports relative imports through its support for intra-package references. The fix within the foobar module is to use from ..barfoo import base_barfoo, but this only covers you if you're on Python 2.5 or later.

According to my understanding, you use it to explicitly say "I want a sibling module named X" without having to spell out the entire namespace or accidentally picking up a module from the global namespace. Fair enough, but my solution to the problem above is to use the relative import to trick Python into thinking it couldn't find a module with the same name.

You can see the code in the d51.django.auth package. I have a d51.django.auth.facebook module which takes precedence over PyFacebook's facebook module, but only inside d51.django.auth.
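A stripped-down sketch of that kind of arrangement follows, with hypothetical names (auth_demo, fb_demo, backend) standing in for the real packages. It assumes Python 3, where a bare import is always absolute. On Python 2 without from __future__ import absolute_import, the bare import would pick up the sibling first, which is the behavior I complain about above.

```python
import importlib
import os
import sys
import tempfile


def write(path, body=""):
    """Write a file, creating parent directories as needed."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as fh:
        fh.write(body)


root = tempfile.mkdtemp()

# A top-level module, standing in for a third-party library.
write(os.path.join(root, "fb_demo.py"), "ORIGIN = 'top-level'\n")

# A package containing a sibling module with the same name.
write(os.path.join(root, "auth_demo", "__init__.py"))
write(os.path.join(root, "auth_demo", "fb_demo.py"), "ORIGIN = 'sibling'\n")
write(
    os.path.join(root, "auth_demo", "backend.py"),
    "from . import fb_demo as local_fb  # explicit relative: the sibling\n"
    "import fb_demo as global_fb  # absolute: looked up on sys.path\n",
)

sys.path.insert(0, root)
importlib.invalidate_caches()

backend = importlib.import_module("auth_demo.backend")
print(backend.local_fb.ORIGIN)   # sibling
print(backend.global_fb.ORIGIN)  # top-level
```

The leading dot is what makes the choice explicit: one dot means "this package", two dots mean "the package above it".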

I'm not saying that namespaces are a bad idea in Python or any other language. I'll gladly take namespaces, in any form I can get them and use them. They provide a great way to segregate code into small, independent, re-usable packages while continuing to say "I'm from over here." They allow facebook to be used as a module in multiple places without causing an issue, other than the ones listed here.

No, my problem is not with namespaces. My problem is with Python's current method of searching for them, its failure to expose namespaces and modules and their differences up front, and its brain-dead way of halting on the first partial hit. I'm amazed that a language that prides itself on explicitness, on not doing anything that's not asked for, decides that it's OK to stop looking for matching code just because it found one thing that doesn't match. It smacks of premature optimization.