Recently, I stumbled across a weird conundrum when playing with Python code. I’ve been working on a number of small Python projects recently, and while working on a problem, it occurred to me – “boy, it’d sure be handy if I used that function from another project to accomplish this!”
After all, isn’t it a ubiquitous phrase in programming to “never solve the same problem twice” or so? If I can learn to re-use my code from other projects, then I’ll save time! I might even learn something about writing more easily repurposable code in the process.
At first, I thought - this will be easy! I can just import the relevant file from my other project and be done with it. But it’s not quite so simple. As it turns out, import only really works for local modules contained within a subdirectory of your current working directory. If your file structure is anything like mine, importing modules between projects like this is a no-go.
.
└── my_python_projects
├── project_a
| └── the_module_i_wanna_import.py
└── project_b
└-- the_script_i_wanna_import_it_into.py
I know that, at the very least, I could manually copy my project_a code into project_b, and for some hobbyists this is fine! But to me, this solution feels cumbersome, and I know there must be something better out there!
I also know that, at most, if we wanted to go all out, we could publish our code as a Python package on PyPi – then we could simply pip install the package whenever. However, this must be more work than necessary, right? Also, what about polluting PyPi’s package namespace? Package names can only be used once there, after all – that would be kinda selfish…
To summarize my goal here, I’m looking for the simplest approach that:
.py files from project_a into project_bproject_a code without leaving project_bpip to the rescue!As it turns out, pip provides built-in functionality for installing a local project as a package! The basic use is:
$ pip install ../project_a
However, to make this work, we need to add some additional structure to project_a first, so that it “looks” like a package to pip. Specifically, we need to add a pyproject.toml file (or the archaic setup.py file – learn more here) to our project_a directory.
Additionally, we need our .toml to contain the following three lines:
[build-system]
requires = ["yourbuildsystemofchoice"]
build-backend = "yourbuildsystemofchoice.building.module"
Here, we define a “build system” for pip, which tells pip which tool to use to build a local version of our project when running pip install. Some options are setuptools (Python’s default recommend), distutils (archaic), poetry (hipster start-up vibes), maybe others!
The tool that we choose determines what other structural changes we need to make to project_a in order to convince pip that it’s a package. For example, setuptools likes seeing a setup.cfg file in your project_a directory (though it’s not required).
setuptoolsOn the surface, setuptools looks like a great option for our minimalist goal! To use, the basic requirement is to make your .toml look like this:
[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"
[project]
name = 'project_name'
version = '0.0.1'
And technically, even this still builds – a minimalist’s dream!
[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"
Note – With the minimal solution above, if exactly one
.pyfile is found in yourproject_afolder,setuptoolsgives your package its name, andversion=0.0.0. If no.pyfile is found (or if they’re hidden in subfolders)setuptoolsmakes a package calledUNKNOWNinstead,version=0.0.0. In both cases, you canimport path.to.scriptin Python, wherepath.to.scriptis the relative path from withinproject_a.However, if two or more
.pyfiles sit in yourproject_afolder (instead of in subfolders),pip installwill fail. This kinda setup needs a more fleshed out.tomlfile.
Despite this wonderful simplicity, we run into trouble when we successfully pip install. You see, ever since the release of pip version 21.3 on October 11th, 2021, pip switched to what they call “in-tree builds” as a way to save time & space when installing local packages.
The setuptools project hasn’t adjusted to this new workflow yet. As a result, if we use setuptools as our build system of choice (as in the .toml above), pip install pollutes your project_a directory with two directories called build and project_a.egg-info whenever pip install is used.
Additionally, these new files/dirs hang around kinda like a cache, so if you make changes to project_a, you also need to remember to delete project_a/build in order to force pip install to re-build, so that your changes are incorporated. Otherwise, the cached build files are re-used, and you may not notice that your changes to project_a are missing from project_b even after a reinstall! Yuck.
People are hot for change in this Github issue here, but until something is implemented, I’m steering clear of setuptools for our purposes.
poetrySince my research shows that setuptools (and not pip) is the source of the mess described above, what about an alternative? Poetry does have stricter requirements, but let’s check it out!
First, we need a more detailed .toml file – Poetry requires all the following fields:
[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
[tool.poetry]
name = "project_a"
version = "0.0.1"
description = ""
authors = ["test-name <test@name.com>"]
Technically,
pip installwill complete as long asauthorscontains at least one string, and that string is of the form"a <b>", whereaandbare non-whitespace strings themselves. Something like the following works fine!
authors = ["- <->"]
Additionally, poetry prefers that project_a employ a more package-like directory structure (covered below). However, if your .py files are just hanging out in your project_a directory like in our example above, you can add the following line to your .toml instead:
[tool.poetry]
packages = [{include = "the_module_i_wanna_import.py"}]
… And that’s it! With these requirements in place, you can run pip install ../project_a in your project_b folder, and successfully import the_module_i_wanna_import in Python – without dumping anything into your project_a folder as a side-effect!
Keep in mind that pip install still builds a local snapshot of project_a for use with project_b – so if you want to carry over project_a changes, you do need to pip install ../project_a again to rebuild & incorporate those changes into project_b.
To populate the required .toml with the details poetry needs:
$ poetry init
…within your project_a directory. This provides you an interactive prompt where you’re prompted for name, version, etc. Mostly, it’s super handy & easy to remember!
We’ve basically reached our goal, but with one caveat. poetry init doesn’t prompt you to customize your packages values – it just adds:
[tool.poetry]
packages = [{include = "[name]"}]
… where [name] is the project name you pass at the name prompt. Basically, poetry assumes your project has a subdir that shares a name with your package.
This means pip install will fail unless:
.toml; or.py files in project_a into a subfolder named project_a again (or similar)More on this in the last section below!
import easyNow, that we’ve got an easy, memorable way to install local packages, how do we make them easy to import too? For example, look how nice this import flow is for Python’s random library!
import random
x = random.randrange(4)
However, if we try something similar with our package, we get an error instead!
import project_a
x = project_a.the_module_i_wanna_import.a_func()
# AttributeError: module 'project_a' has no attribute 'the_module_i_wanna_import'
Note the language used in the error message here! When imported this way, Python treats project_a as its own module (rather than a parent package). With this kind of import statement, we only actually gain access to whatever’s in the parent package’s __init__.py (ie, nothing).
Python doesn’t automatically give you immediate access to all the child modules like we might expect. In practice, it takes something more like the import below to get things right – and how ugly is that?
import project_a.the_module_i_wanna_import
x = project_a.the_module_i_wanna_import.a_func()
To make import more intuitive, we can add an import line to the __init__.py of the package itself, saving our users (or future us) a lil bit of hassle.
# __init__,py
import project_a.the_module_i_wanna_import as the_module_i_wanna_import
With this added to our package-level __init__.py, our original import now works like a charm! Then just rinse and repeat, adding an import line to your __init__.py for each module in your package!
Technically, you can leave out the
as the_module_i_wanna_importclause, but there’s a good reason for it! Without this,project_aends up within its own scope - which means thatproject_a.project_abecomes a valid expression. If you rely on autocomplete for package/module names, this gets annoying fast. Stay up a little too late coding, and you’ll start seeing arbitrarily nestedproject_a.project_a.project_a.the_module_i_wanna_importvariants littered throughout your code!
If you don’t like the default values that poetry suggests (or the format of the generated .toml), that’s “too bad” by poetry standards. In short, we don’t have a config option for this.
Instead, like this StackOverflow thread suggests, you can use a project templating tool like cookiecutter to start new projects from template (which you can define/customize).
I gave cookiecutter a shot myself, and in short, here’s what I came up with! I like what this tool offers because, as mentioned at the beginning of this article, I want a memorable, easy way to start new Python package projects, so that I can focus my inspired energy into those projects themselves.
Using cookiecutter is easy, even if designing your own “cookiecutter” (project template) takes a bit of getting used to. To use, simply navigate to the folder you normally add new projects to, and:
$ cookiecutter https://github.com/ciraben/.cookie
Or you can git clone the repo into a /cookie directory within your projects folder. Then you can just run
$ cookiecutter cookie
… and you’ll be prompted through providing a new project name & details interactively!
If you’d like learn more about creating your own custom cookiecutter template, check out the tutorial here! Or if, you want to learn more about basic recommended project structures, check out the Python structure docs here as well. (Keep in mind that most mention of setup.py can be replaced with pyproject.toml if you’re using poetry instead of setuptools.)