How docassemble uses packages
docassemble interviews can be packaged into Python packages that bundle together:
- YAML files representing interview files that start interviews, or YAML files that are included in other files;
- Any document templates that your interviews use;
- Static content (images, downloadable files) that you want to include with your interview;
- Python modules, which include any classes and other code that you might write to go along with your interviews; and
- Data files (translations, machine learning training data) on which your interview depends.
A Python package also contains metadata such as author information, a brief description, and a list of other Python packages on which the package depends. It also contains a “README” file, typically styled with Markdown, which describes the package in detail.
A package containing docassemble interviews must be a subpackage of the docassemble package. In practice, this means that the official package name begins with “docassemble.”; e.g., docassemble.missourifamilylaw. The docassemble package itself is just a shell (a namespace package) that contains subpackages. These subpackages include docassemble’s core components (docassemble.base, docassemble.webapp) as well as user-created packages (e.g., docassemble.missourifamilylaw).
Using the Playground to make packages
One of the features of the docassemble web application is the Playground, a browser-based development platform for developing and testing interviews. This is a “sandbox” in which you can make changes to an interview and immediately test the effect of those changes. You can use the Playground to create a variety of resources (interview files, template files, static files, data files) and then build packages out of particular resources.
Every interview that you run on docassemble is part of a package. When you are running an interview in the Playground, the package is called docassemble.playground1 or something similar; the 1 at the end is the user ID of the person who is using the Playground.
Packages like docassemble.playground1 should be considered temporary and should only be used for development and testing. When you want your users to actually use your interview, you should move your interview out of the Playground and into a package with an appropriate name.
Suppose you have an interview file called custody.yml, which you are developing in the Playground. When you right-click the “Share” button and copy the link URL, or look at your browser location bar when you are testing the interview, you will see that the URL for the interview contains ?i=docassemble.playground1:custody.yml. This means that the interview (i) is in the docassemble.playground1 package and is called custody.yml. This link always runs the version of your interview that is current in the Playground. So if you make a change that accidentally introduces a bug, anyone who uses the interview will see an error message.
When you have actual users, you want to be able to make “upgrades” to your interviews without causing users to see error messages. You need to have a distinction between your “development” version and your “production” version. Here is a workflow for accomplishing this:
- Develop your custody.yml interview in the Playground and keep improving it until it is ready for use by users.
- In the “Packages” folder of the Playground, create a package called, e.g., johnsmithlaw. Give it a version number like 0.0.1. Include within the package the custody.yml interview file and any other resources on which custody.yml depends.
- This will create a Python package called docassemble.johnsmithlaw.
- In the “Packages” folder of the Playground, click the “Install” button. This will install the docassemble.johnsmithlaw package on your server.
- Your users can now access your interview at a URL that ends with /interview?i=docassemble.johnsmithlaw:data/questions/custody.yml.
- Now you can continue making changes to the custody.yml interview in the Playground, and even if you break something, your users will not get an error, because they will be using the installed version of the package, not the version in your Playground.
- When the new version of your interview is ready, you can go to the “Packages” folder and press “Install” again.
- Your users will still use the same URL to access the interview (one that ends with /interview?i=docassemble.johnsmithlaw:data/questions/custody.yml), but now they will be using the new version of your interview.
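The difference between the development and production URLs in this workflow can be sketched in Python. This is only an illustration: the function name interview_url and the host https://example.com are hypothetical, and the sketch assumes the Playground link uses the same /interview path as the installed one.

```python
from urllib.parse import urlencode


def interview_url(base_url, package, filename, installed=True):
    """Build the URL that starts an interview (illustrative helper).

    Installed packages address the file as package:data/questions/<file>;
    Playground packages address it as package:<file>.
    """
    path = f"data/questions/{filename}" if installed else filename
    # Keep ':' and '/' unescaped so the URL reads like the ones in the text.
    query = urlencode({"i": f"{package}:{path}"}, safe=":/")
    return f"{base_url.rstrip('/')}/interview?{query}"


# Development version: always reflects the current Playground state.
dev = interview_url("https://example.com", "docassemble.playground1",
                    "custody.yml", installed=False)
# Production version: only changes when the package is installed again.
prod = interview_url("https://example.com", "docassemble.johnsmithlaw",
                     "custody.yml")
print(dev)
print(prod)
```

The production URL stays the same across upgrades because only the installed package contents change, not the package name or file path.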
For more information, see how you run a docassemble interview, the Playground, the packaging your interview section of the hello world example, and the development workflows section.
From the “Packages” folder, you can also press “Download” to obtain a ZIP file of the Python package. If you open this file, you can see what the structure of a docassemble extension package looks like.
Anatomy of a docassemble package
Here is the file structure of a (fictional) docassemble package called docassemble.baseball.
The package is known as docassemble.baseball within Python code, but the name docassemble-baseball, replacing the dot with a hyphen, is used in other contexts, such as when referring to a package published on the PyPI site, or a folder on your system that contains the source code of the package.
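As a rough sketch of that layout, the following Python script creates an empty skeleton in a temporary directory and prints the resulting tree. The entries are taken from the files and directories named in this section; README.md is an assumed file name, and a real package may contain additional files (such as a license file) that are not shown.

```python
import tempfile
from pathlib import Path

# Entries ending in "/" are directories; all others are (empty) files.
# README.md is an assumed name for the README file.
LAYOUT = [
    "docassemble-baseball/setup.py",
    "docassemble-baseball/README.md",
    "docassemble-baseball/docassemble/baseball/__init__.py",
    "docassemble-baseball/docassemble/baseball/baseballstats.py",
    "docassemble-baseball/docassemble/baseball/data/questions/hitters.yml",
    "docassemble-baseball/docassemble/baseball/data/static/",
    "docassemble-baseball/docassemble/baseball/data/sources/",
    "docassemble-baseball/docassemble/baseball/data/templates/",
]


def create_skeleton(root):
    """Create the directories and empty files listed in LAYOUT under root."""
    for entry in LAYOUT:
        target = Path(root) / entry
        if entry.endswith("/"):
            target.mkdir(parents=True, exist_ok=True)
        else:
            target.parent.mkdir(parents=True, exist_ok=True)
            target.touch()


with tempfile.TemporaryDirectory() as tmp:
    create_skeleton(tmp)
    # Collect every created path relative to the temporary root.
    tree = sorted(p.relative_to(tmp).as_posix() for p in Path(tmp).rglob("*"))
    print("\n".join(tree))
```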
There are a lot of subdirectories (this is the nature of namespace packages). There are reasons for all of these subdirectories.
- The top-level directory, docassemble-baseball, is important because a complete Python package should be all in one directory. If you are publishing a package on GitHub, this directory should be the root of the repository; docassemble-baseball/.git will contain the git-related information for the package.
- Within that, the docassemble directory is necessary so that the package is a subpackage of docassemble.
- Within that, the baseball directory is necessary because when packages within the docassemble namespace package are installed on a system, Python needs them to be in a subdirectory under a directory called docassemble.
- Within baseball, you have baseballstats.py, which contains Python code. Files in this directory correspond with files in the “Modules” folder of the Playground. The __init__.py file is necessary for declaring baseball to be a package; you never have to edit that file. It is mostly empty except for a __version__ definition, but its presence is still important.
- There is also a data directory with subdirectories questions, static, sources, and templates. These are for interviews, static files, data files, and document templates, respectively. The questions directory contains the YAML files that are in the main part of the Playground.
Note that in the Playground, there is a “Modules” folder along with a “Templates” folder, a “Static” folder, etc., but in a Python package, there is no docassemble/baseball/data/modules folder, even though there is a docassemble/baseball/data/templates folder and a docassemble/baseball/data/static folder. The modules files (.py files) are located in a different place: directly under docassemble/baseball. Since the main purpose of a Python package is to store Python modules, modules files are the “main attraction” and everything else is just associated “data.” A Python module that you refer to as docassemble.baseball.baseballstats must be a file baseballstats.py that is located in the subdirectory docassemble/baseball/. (This is how Python works.)
When installed on the server, the interview hitters.yml can be run by going to a link like https://example.com/interview?i=docassemble.baseball:data/questions/hitters.yml.
In your own interviews, you can include resources from this package by writing things like the following:
The first block uses include to incorporate by reference a YAML interview file located in the data/questions directory of the package.
The second block uses a file reference to refer to an image file in the data/static directory of the package.
The third block uses content file within an attachment to refer to a Markdown file in the data/templates directory of the package.
The fourth block uses modules to import Python names from the baseballstats.py file.
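Both kinds of reference map onto the package directory structure in a predictable way: a file reference has the form package:path, and a module reference is a dotted Python name. The following sketch shows that mapping. The helper names reference_to_path and module_to_path are hypothetical and are not part of docassemble.

```python
from pathlib import PurePosixPath


def reference_to_path(reference):
    """Map a 'package:relative/path' file reference to its location
    inside the package source tree (illustrative helper)."""
    package, _, relative = reference.partition(":")
    return PurePosixPath(*package.split(".")) / relative


def module_to_path(module_name):
    """Map a dotted module name to the .py file that must hold it."""
    return PurePosixPath(*module_name.split(".")).with_suffix(".py")


interview = reference_to_path("docassemble.baseball:data/questions/hitters.yml")
module = module_to_path("docassemble.baseball.baseballstats")
print(interview)
print(module)
```

Both results are relative to the top-level docassemble-baseball directory.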
Since a Python package is just a collection of files in a particular structure, you can maintain your docassemble extension packages “offline” (outside of the Playground) if you want. This allows you to expand the directory structure beyond what the Playground supports.
Dependencies
If your package uses code from other Python packages that are not distributed with the standard docassemble installation, you will need to indicate that these packages are “dependencies” of your package.
This will ensure that if you share your package with someone else and they install it on their system, the packages that your package needs will be automatically installed. Otherwise, that person will get errors when they try to use your package.
If you maintain your package in the Packages area of the Playground, you can indicate the dependencies of your package by selecting them from a multi-select list. (If the Python package you need is not listed, you need to install it on your system using “Package Management” on the menu.)
If you maintain your package off-line, you will need to edit the setup.py file and change the line near the end that begins with install_requires. This refers to a list of Python packages. For example:
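A minimal sketch of such a line, shown in the context of the keyword arguments that a setup.py passes to setuptools.setup(). The name and version values are illustrative assumptions; only the two dependencies come from this section.

```python
# Keyword arguments that a setup.py would pass to setuptools.setup().
# The name and version shown here are illustrative assumptions.
setup_kwargs = dict(
    name="docassemble.baseball",
    version="0.0.1",
    # Packages that pip installs before installing this one.
    install_requires=["docassemble.helloworld", "kombu"],
)
print(setup_kwargs["install_requires"])
```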
This line indicates that the package relies on the docassemble extension package docassemble.helloworld, as well as the Python package kombu. When someone tries to install docassemble.baseball on their system, docassemble.helloworld and kombu will be installed first, and any packages that these packages depend on will also be installed.
Note that if your package depends on a package that exists on GitHub but not on PyPI, you will also need to add an extra line so that the system knows where to find the package. For example, if docassemble.helloworld did not exist on PyPI, you would need to include:
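A sketch of what such an extra line could look like, assuming it uses the setuptools dependency_links keyword. That assumption, the repository URL, and the 0.1 version suffix are all illustrative; recent versions of pip ignore dependency_links, so check the setup.py that the Playground generates for the form your version expects.

```python
# Assumed form of the extra setup() keyword; the repository URL and the
# "-0.1" version suffix are hypothetical values for illustration only.
extra_setup_kwargs = dict(
    install_requires=["docassemble.helloworld", "kombu"],
    dependency_links=[
        "git+https://github.com/jhpyle/docassemble-helloworld"
        "#egg=docassemble.helloworld-0.1",
    ],
)
print(extra_setup_kwargs["dependency_links"][0])
```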
If you use the Packages area of the Playground to maintain your package, this is all handled for you.
Installing a package
You can install a docassemble extension package, or any other Python package, using the docassemble web application.
From the menu, go to “Package Management.”
docassemble installs packages using the pip package manager. This installation process may take a long time. A log of the output of pip will be shown when the installation is complete. The server will restart so that any old versions of the package that are still in memory will be refreshed.
Installing through GitHub
One way to install Python packages on a server is through GitHub.
- Find the GitHub URL of the package you want to install. This is in the location bar when you are looking at a GitHub repository. For example, the GitHub URL of the docassemble-baseball package may be https://github.com/jhpyle/docassemble-baseball. (No such package actually exists.)
- In the docassemble web app, go to Package Management.
- Enter https://github.com/jhpyle/docassemble-baseball into the “GitHub URL” field.
- The “GitHub Branch” field will be updated with the default branch of the repository (usually master). You can select another branch if you wish to install a different branch of the repository.
- Click “Update.”
If you want to install from a private GitHub repository, you will need a URL that points to your repository and that includes authentication information. To make such a URL, you will need a “personal access token” from GitHub. If you do not already have a personal access token, log into GitHub, go to your “Settings,” go to “Developer settings,” and go to the “Personal access tokens” tab. Click “Generate new token.” You can set the “Token description” to whatever you like (e.g. “for installing on docassemble”). Check the “repo” checkbox, so that all of the capabilities under “repo” are selected. Then click “Generate token.” Copy the “personal access token” and keep it in a safe place.
If your token is e8cc02bec7061de98ba4851263638d7483f63d41, your GitHub username is johnsmith, and your package is called docassemble-missouri-familylaw, then the GitHub URL for your private repository will be:
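A sketch of how such a URL can be assembled, assuming the common GitHub convention of placing the token in the username position with the literal password x-oauth-basic. The helper name private_github_url is hypothetical.

```python
def private_github_url(token, username, repository):
    """Build an HTTPS URL that embeds a personal access token.

    Assumes the token goes in the username position and the literal
    string "x-oauth-basic" in the password position.
    """
    return f"https://{token}:x-oauth-basic@github.com/{username}/{repository}"


url = private_github_url(
    "e8cc02bec7061de98ba4851263638d7483f63d41",
    "johnsmith",
    "docassemble-missouri-familylaw",
)
print(url)
```

Because the URL contains the token, treat it like a password and avoid storing it anywhere that other people can read.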
You can enter this into the “GitHub URL” field.
Installing through a .zip file
You can also install Python packages from ZIP files. For example, if you have a package docassemble-baseball, the ZIP file will be called docassemble-baseball.zip. It will contain a single directory called docassemble-baseball, which in turn contains setup.py, a subdirectory called docassemble, and other files.
- In the docassemble web app, go back to Package Management.
- Under “Zip File,” upload the .zip file you want to install.
- Click “Update.”
Installing through PyPI
You can also install Python packages from PyPI. PyPI is the central repository for Python software. Anyone can register on PyPI and upload software to it. For example, if you want to install the docassemble-baseball package:
- Make sure the docassemble-baseball package exists on PyPI (note: it doesn’t; it is just a fictional package).
- In the docassemble web app, go to Package Management.
- Type docassemble.baseball into the “Package on PyPI” field.
- Click “Update.”
Once a version of a package is published on PyPI, it exists there permanently. When you publish a new version of a package to PyPI, the version number is incremented automatically.
Permissions on packages
The “Package Management” system remembers which user installed which package. Thus, if developer Fred installs a package called docassemble.guardianship, and developer George tries to install a package called docassemble.guardianship, George will get an error. These permissions do not apply to package dependencies, however. A user with admin privileges can install or uninstall any package.
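The rules in the preceding paragraph can be modeled with a small function. This is a sketch of the described behavior only, not docassemble’s actual implementation; the function name may_install and the owners dictionary are hypothetical.

```python
def may_install(user, package, owners, is_admin=False, as_dependency=False):
    """Return True if the user may install or replace the package.

    owners maps package names to the user who first installed them.
    Administrators and dependency installs bypass the ownership check.
    """
    if is_admin or as_dependency:
        return True
    owner = owners.get(package)
    return owner is None or owner == user


owners = {"docassemble.guardianship": "fred"}
print(may_install("fred", "docassemble.guardianship", owners))
print(may_install("george", "docassemble.guardianship", owners))
print(may_install("george", "docassemble.guardianship", owners, is_admin=True))
```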
The purpose of the permission system is to facilitate the use of a single server by a team of developers by implementing some minor safeguards that prevent developers from interfering with each other’s work. It is not the case that each developer works in an insulated “sandbox.” All interviews on the system run with the same system UID (www-data).
Running interviews from installed packages
Once a docassemble extension package is installed, you can start using its interviews. For example, if you installed docassemble.baseball, and there was an interview file in that package called questions.yml, you would point your browser to http://localhost/interview?i=docassemble.baseball:data/questions/questions.yml (substituting the actual domain and base URL of your docassemble site). Note that a URL like this is different from the URL you see when you are running an interview in the Playground (see above).
For more information about starting docassemble interviews, see how you run a docassemble interview.
Updating Python packages
To upgrade a package that you installed from a GitHub URL or from PyPI, you can press the “Update” button next to the package name on the “Package Management” screen. You will only see these Update buttons if you are an administrator or if you are the person who caused the packages to be installed. Also, the “Update” buttons will not appear if the package was installed using a ZIP file.
Publishing a package
Publishing on PyPI
The best place to publish a docassemble extension package is on PyPI, the central repository for Python software.
In order to publish to PyPI, you will first need to create an account on PyPI. You will need to choose a username and password and verify your e-mail address.
Then, go to “Configuration” on the menu and enable the PyPI publishing feature in the docassemble configuration like so:
After you save the configuration, go to “Profile” on the menu and fill in “PyPI Username” and “PyPI Password” with the username and password you obtained from PyPI.
Next, go to the “Packages” folder of the docassemble Playground and open the package you want to publish (e.g., docassemble-baseball).
Press the PyPI button to publish the package to PyPI.
If your package already exists on PyPI, then pressing the Publish button will increment the version of your package. This is necessary because you cannot overwrite files that already exist on PyPI.
When the publishing is done, you will see an informational message with the output of the uploading commands. Check this message to see if there are any errors.
If the publishing was successful, then at the bottom of the page describing your package, you should see a message that the package now exists on PyPI.
You can click on the link to see what the package looks like on PyPI.
Now, on the docassemble menu (of this server or another server), you can go to Package Management and install the package by typing “docassemble.baseball” into the “Package on PyPI” field.
Publishing on GitHub
You can publish your package on GitHub in addition to (or instead of) publishing it on PyPI. (Publishing on both sites is recommended. PyPI is the simplest and safest way to distribute Python packages, while GitHub is a version control system with many features for facilitating sharing and collaboration.)
To configure integration with GitHub, follow the steps in the setting up GitHub integration section, and edit the GitHub section of the configuration.
When that configuration is done, each user who is a developer will need to connect their GitHub account with their account on your docassemble server. From the menu, the user should go to “Profile,” click “GitHub integration,” and follow the instructions. If the user is not currently logged in to GitHub in the same browser, GitHub will ask for login information. (Users without GitHub accounts can create one.) Users will need to consent to giving the docassemble server the privilege of making changes to repositories and SSH keys within the GitHub account.
(Note: it is not possible to connect more than one docassemble account on a single docassemble server with the same GitHub account. However, it is possible to connect accounts on multiple servers with the same GitHub account, so long as the appname on each docassemble server is different.)
To publish a package on GitHub, go to the Packages area of the Playground and press the GitHub button. You will be asked for a “commit message.” This is a brief, one-line message that describes the changes made to your package since the last time you “committed” changes. Each “commit” is like a snapshot, and the history of “commit” messages is a record of the development of your project.
When you press the “Commit” button after writing the commit message, your package will be “pushed” to a GitHub repository in your account. If a repository does not already exist on GitHub with the name of your package, a new repository will be created.
You can follow the hyperlink to your package’s page on GitHub.
After your first commit, GitHub reports that there have been two commits; this is because the initial creation of the repository caused a commit (containing a LICENSE file only) and then the addition of the files of your package caused a second commit.
Once your package is on GitHub, then on the docassemble menu, you can go to Package Management and install the package using its GitHub URL.
How GitHub integration works
When you commit changes to GitHub, docassemble will first decide whether to create a new repository or commit the changes to an existing repository. It will use the GitHub API to look for a repository that has the same name as your package. That is, if your package is named familylaw, and your GitHub username is jsmith, it will look for the repository https://github.com/jsmith/docassemble-familylaw. If that repository does not exist, docassemble will look through the repositories to which you are permitted to commit (https://api.github.com/user/repos). If no repository called docassemble-familylaw is found, it will search through GitHub “organizations” of which you are a member (https://api.github.com/user/orgs). For example, if you are a member of an organization called abcinc, it will look for a repository called https://github.com/abcinc/docassemble-familylaw.
When GitHub integration is enabled, then at the bottom of the Packages screen, you will see a box that says “This package is not yet published on your GitHub account” or “This package is published on GitHub.” If it says “This package is not yet published on your GitHub account,” this means that docassemble was not able to find a package called docassemble-familylaw among your repositories or repositories to which you have access. Thus, when you press the Commit button, a new repository https://github.com/jsmith/docassemble-familylaw will be created.
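The lookup order described above can be sketched as a plain function that takes already-fetched data instead of calling the GitHub API. The function name find_repository and its parameters are hypothetical, and docassemble’s real implementation may differ in detail.

```python
def find_repository(package_name, username, own_repos, writable_repos, org_repos):
    """Return the 'owner/name' of the first matching repository, or None.

    own_repos:      set of repository names in the user's own account
    writable_repos: list of 'owner/name' strings the user may commit to
    org_repos:      dict mapping organization name to a set of repo names
    """
    repo_name = f"docassemble-{package_name}"
    # 1. A repository in the user's own account.
    if repo_name in own_repos:
        return f"{username}/{repo_name}"
    # 2. Any repository the user is permitted to commit to.
    for full_name in writable_repos:
        if full_name.split("/", 1)[-1] == repo_name:
            return full_name
    # 3. Repositories of organizations of which the user is a member.
    for org, repos in org_repos.items():
        if repo_name in repos:
            return f"{org}/{repo_name}"
    # Nothing found: a new repository would be created on commit.
    return None


found = find_repository("familylaw", "jsmith", set(), [],
                        {"abcinc": {"docassemble-familylaw"}})
missing = find_repository("familylaw", "jsmith", set(), [], {})
print(found, missing)
```

A result of None corresponds to the “not yet published” message, in which case a commit creates a new repository under the user’s own account.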
If you expect to be able to push changes to a repository in another account, but docassemble is reporting “This package is not yet published on your GitHub account,” make sure that you actually have access to that repository. One known issue is that if a repository belongs to an organization, and the administrator of that organization adds you as a collaborator on that repository, GitHub will not list that repository in its response to https://api.github.com/user/orgs. Thus, if you want to be able to make commits to a repository owned by an organization, ask the administrator of the organization to make you a member of the organization.
When you press the “Commit” button, the Git actions that are performed are as follows:
- git clone is used to copy the files from GitHub to the server.
- If a commit or pull had previously been performed using the Playground, git checkout is used to switch to the commit that was current as of the time the commit or pull took place.
- The Playground files that are selected to be part of the package are copied on top of the cloned files.
- A new branch is created and committed.
- The new branch is merged into the master branch of the repository and pushed.
This means that the changes you have made in the Playground will not overwrite changes that have been made to the remote repository since the last time you did a “pull” or “commit” in the Playground.
It also means that if you delete a file in your Playground, and that file is already part of a GitHub repository, then the next time you do a “pull” or “commit,” that file will be recreated in your Playground. There is no feature in docassemble that implements git rm. Thus, if you want to delete such a file, delete it both on GitHub and in the Playground.
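The Git actions listed above can be sketched as a sequence of command lines. This is an approximation under stated assumptions: the exact commands, the branch name, and the working directory name repo are hypothetical, and docassemble may invoke Git differently. The function only builds the commands and does not run them.

```python
def commit_commands(repo_url, branch, message, previous_commit=None):
    """Return the Git commands for one Playground commit, in order.

    All commands after the clone are assumed to run inside the cloned
    directory ("repo"). Copying the Playground files on top of the
    clone happens between the checkout and the new branch.
    """
    commands = [["git", "clone", repo_url, "repo"]]
    if previous_commit is not None:
        # Return to the state as of the last Playground commit or pull.
        commands.append(["git", "checkout", previous_commit])
    commands += [
        ["git", "checkout", "-b", branch],   # create the new branch
        ["git", "add", "."],                 # stage the copied files
        ["git", "commit", "-m", message],
        ["git", "checkout", "master"],
        ["git", "merge", branch],            # merge keeps remote changes
        ["git", "push", "origin", "master"],
    ]
    return commands


for command in commit_commands("https://github.com/jsmith/docassemble-familylaw",
                               "playground-update", "Fix typo in custody.yml",
                               previous_commit="abc1234"):
    print(" ".join(command))
```

Because the files are only added, never removed, a file deleted in the Playground remains in the repository.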
Best practices for packaging your interviews
It is a good practice to bundle related interviews in a single package. Think about making it easy for other people to install your packages on their system and make use of your questions and code.
It is also a good practice to separate your interview into at least three files, separately containing:
- mandatory and initial code
- initial blocks
- question and code blocks
This way, other people can take advantage of your work product in interviews that might have a very different purpose.
How docassemble loads modules
During the docassemble web app startup process, the packages in the docassemble namespace are scanned and modules are imported if they contain any class definitions or if the update_language_function() function is called. The early loading of class definitions helps to prevent problems when unpickling data from the SQL database. The update_language_function() has a global effect, and it is important that docassemble’s linguistic behavior is uniform over time and does not vary depending on which interviews the server has used. This also means that you have to be careful about which packages you install on your server: even a package you never use could have an effect on the way your server operates.
If you want to disable the auto-loading of a module, put this line near the top of the .py file (before any class definitions or references to update_language_function()):
If this line is present, then the server will not pre-load the module file.
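The pre-loading decision can be sketched as a scan of a module’s source text. This is an illustration, not docassemble’s actual code: the function should_preload is hypothetical, and the marker text “# do not pre-load” is an assumption about the exact comment that docassemble looks for.

```python
import re

# Assumed marker comment; check the docassemble documentation for the
# exact text your version recognizes.
MARKER = re.compile(r"^# do not pre-load", re.MULTILINE)
# A top-level class definition or any reference to update_language_function().
TRIGGER = re.compile(r"^class\s+\w+|update_language_function\(", re.MULTILINE)


def should_preload(source):
    """Decide whether a module's source text calls for pre-loading."""
    trigger = TRIGGER.search(source)
    if trigger is None:
        return False  # nothing that requires early import
    marker = MARKER.search(source)
    # The marker only counts if it appears before the first trigger.
    if marker is not None and marker.start() < trigger.start():
        return False
    return True


opted_out = "# do not pre-load\nclass Player:\n    pass\n"
with_class = "class Player:\n    pass\n"
functions_only = "def batting_average(hits, at_bats):\n    return hits / at_bats\n"
print(should_preload(opted_out), should_preload(with_class),
      should_preload(functions_only))
```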