Custom Python Environments for Jupyter Notebooks
For additional information about using Python virtual environments with conda, please see the UFRC page or the Software Carpentries pages from which these procedures were derived.
Background (from UFRC page)
Many projects that use Python code require careful management of their Python environments. Rapid changes in package dependencies, package version conflicts, deprecation of APIs (function calls) by individual projects, and obsolescence of system drivers and libraries make it virtually impossible to use an arbitrary set of packages or to create one all-encompassing environment that will serve everyone's needs over long periods of time. The high velocity of change in the popular ML/DL frameworks and packages and in GPU computing exacerbates the problem.
Getting Started: Conda Configuration
The ~/.condarc configuration file
conda's behavior is controlled by a configuration file in your home directory called .condarc. The dot at the start of the name means that the file is hidden from the ls file-listing command by default. If you have not run conda before, you won't have this file. Whether or not the file exists, the steps here will help you modify it to work best on HiPerGator. The first load of the conda environment module on HiPerGator will put the current "best practice" .condarc into your home directory.
The file lives in your home directory, i.e., Home/<myname>/.condarc. Since Home/<myname> is usually abbreviated as ~, the path is written ~/.condarc.
conda package cache location
conda caches (keeps a copy of) all downloaded packages, by default in the ~/.conda/pkgs directory tree. If you install a lot of packages you may end up filling up your home quota. You can change the default package cache path. To do so, add or change the pkgs_dirs setting in your ~/.condarc configuration file, e.g.:
pkgs_dirs:
  - /blue/akeil/share/conda/pkgs
…or:
pkgs_dirs:
  - /blue/akeil/$USER/conda/pkgs
Replace akeil with your actual group name.
conda environment location
conda puts all packages installed in a particular environment into a single directory. By default, "named" conda environments are created in the ~/.conda/envs directory tree. They can quickly grow in size and, especially if you have many environments, fill the 40GB home directory quota. For example, the environment we will create in this training is 5.3GB in size. As such, it is important to use "path" based (conda create -p PATH) conda environments, which allow you to use any path for a particular environment, for example letting you keep a project-specific conda environment close to the project data in /blue/, where your group has terabytes of space.
You can also change the default path for "named" environments (conda create -n NAME) if you prefer to keep all conda environments in the same directory tree. To do so, add or change the envs_dirs setting in the ~/.condarc configuration file, e.g.:
envs_dirs:
  - /blue/akeil/share/conda/envs
…or:
envs_dirs:
  - /blue/akeil/$USER/conda/envs
Replace akeil with your actual group name.
Editing your ~/.condarc file
One way to edit your ~/.condarc file is to type:
nano ~/.condarc
If the file is empty, paste in the text below, editing the envs_dirs: and pkgs_dirs: paths as shown. If the file has contents, update those lines.
Your ~/.condarc should look something like this when you are done editing (again, replacing akeil and the USER portion of the paths with your actual group and username):
channels:
  - conda-forge
  - bioconda
  - defaults
envs_dirs:
  - /blue/akeil/$USER/conda/envs
pkgs_dirs:
  - /blue/akeil/$USER/conda/pkgs
auto_activate_base: false
auto_update_conda: false
always_yes: false
show_channel_urls: false
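With ~/.condarc in place, you can create the environment used later in this training. The sketch below is only an illustration: the environment name mne_1_x is the one this tutorial uses, but the Python version and package list are assumptions you should adjust to your own project.

```shell
# Load the conda/mamba environment module on HiPerGator
module load conda

# Create a named environment; with envs_dirs set in ~/.condarc it is
# created under /blue rather than in your home directory.
# The python version and packages here are illustrative choices.
mamba create -n mne_1_x python=3.10 mne jupyterlab

# Activate it before installing anything else or querying paths
conda activate mne_1_x
```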
Use your environment from the command line or scripts
Now that we have our environment ready, we can use it from the command line or a script using something like:
module load conda
conda activate mne_1_x
# Run my python script
python amazing_script.py
…or use a path-based setting:
# Set path to environment and prepend it to the PATH variable
env_path=/blue/akeil/share/conda/envs/mne_1_x/bin
export PATH=$env_path:$PATH
# Run my python script
python amazing_script.py
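Prepending (rather than appending) matters because the shell searches PATH left to right and the first matching executable wins. A self-contained demo with throwaway paths (nothing here touches a real environment):

```shell
# Create a fake environment bin directory containing a stub "python"
demo=$(mktemp -d)
mkdir -p "$demo/bin"
printf '#!/bin/sh\necho from-demo-env\n' > "$demo/bin/python"
chmod +x "$demo/bin/python"

# Prepend it to PATH; the stub now shadows any system python
export PATH=$demo/bin:$PATH
command -v python    # prints $demo/bin/python
```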
Set up a Jupyter kernel for our environment
Often, we want to use the environment in a Jupyter notebook. To do that, we can create our own Jupyter kernel.
Add the jupyterlab package
In order to use an environment in Jupyter, we need to make sure the jupyterlab package is installed in the environment:
mamba install jupyterlab
Copy the template_kernel folder to your kernels directory
On HiPerGator, Jupyter looks in two places for kernels when you launch a notebook:
/apps/jupyterhub/kernels/ for the globally available kernels that all users can use (also a good place to look when troubleshooting your own kernel)
~/.local/share/jupyter/kernels for each user's own kernels (again, .local is in your home directory and hidden because its name starts with a dot)
Make the ~/.local/share/jupyter/kernels directory:
mkdir -p ~/.local/share/jupyter/kernels
Copy the /apps/jupyterhub/template_kernel folder into your ~/.local/share/jupyter/kernels directory:
cp -r /apps/jupyterhub/template_kernel/ ~/.local/share/jupyter/kernels/mne_1_x
This also renames the folder in the copy. It is important that the directory name be distinct from the other kernels in your directory and in the global /apps/jupyterhub/kernels/ directory.
Edit the template_kernel files
The template_kernel directory has four files: the run.sh and kernel.json files need to be edited in a text editor (we will use nano in this tutorial). The logo-64X64.png and logo-32X32.png files are icons for your kernel that help visually distinguish it from others. You can upload icons of those dimensions to replace the files, but they must keep those names.
Edit the kernel.json file
Let's start by editing the kernel.json file. As an example, we can use:
nano ~/.local/share/jupyter/kernels/mne_1_x/kernel.json
The template has most of the information and notes on what needs to be updated. Edit the file to look like:
{
"language": "python",
"display_name": "MNE v1.x",
"argv": [
"~/.local/share/jupyter/kernels/mne_1_x/run.sh",
"-f",
"{connection_file}"
]
}
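A quick way to confirm your edits left the file as valid JSON (a stray or missing comma is the most common mistake) is Python's built-in json.tool; the mne_1_x path matches the kernel directory used in this tutorial:

```shell
# Pretty-prints the file if it is valid JSON; exits non-zero and
# points at the error otherwise
python -m json.tool ~/.local/share/jupyter/kernels/mne_1_x/kernel.json
```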
Edit the run.sh file
The run.sh file needs the path to the python application in our environment. The easiest way to get that is to make sure the environment is activated and run: which python
The path it outputs should look something like: /blue/group/share/conda/envs/mne_1_x/bin/python
Copy that path.
Edit the run.sh file with nano:
nano ~/.local/share/jupyter/kernels/mne_1_x/run.sh
The file should look like this, but with your path:
#!/usr/bin/bash
exec /blue/akeil/share/conda/envs/mne_1_x/bin/python -m ipykernel "$@"
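Jupyter executes run.sh directly, so the file must be executable. The copied template usually already is, but if your kernel fails to start with a permission error, set the executable bit explicitly:

```shell
# Make the kernel launcher script executable
chmod +x ~/.local/share/jupyter/kernels/mne_1_x/run.sh
```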
If you are doing this in a Jupyter session, refresh your page. If not, launch Jupyter.
Your kernel should be there ready for you to use!
Working with yml files
Export your environment to an environment.yml file
Now that you have your environment working, you may want to document its contents and/or share it with others. The environment.yml file defines the environment and can be used to build a new environment with the same setup.
To export an environment file from an existing environment, run:
conda env export > mne_1_x.yml
You can inspect the contents of this file with cat mne_1_x.yml. This file defines the packages and versions that make up the environment as it is at this point in time. Note that it also includes packages that were installed via pip.
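For orientation, an exported file looks roughly like the sketch below. The channels, versions, and pip entries shown are placeholders, not the actual contents of this training's environment; the prefix: line at the end records where the environment was built:

```yaml
name: mne_1_x
channels:
  - conda-forge
dependencies:
  - python=3.10          # versions here are illustrative
  - mne=1.0
  - jupyterlab
  - pip:
      - some-pip-package # hypothetical pip-installed package
prefix: /blue/akeil/share/conda/envs/mne_1_x
```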
Create an environment from a yaml file
If you share the environment yaml file created above with another user, they can create a copy of your environment using the command:
conda env create --file mne_1_x.yml
They may need to edit the last line (prefix:) to change the location to match where they want their environment created.
Group environments
It is possible to create a shared environment that a group on HiPerGator can access, storing the environment in, for example, /blue/akeil/share/conda. In general, this works best if only one user has write access to the environment. All installs should be made by that one user and communicated to the other users in the group.
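One way to enforce the single-maintainer pattern is to make the shared environment read-only for the rest of the group. This is a sketch: the path is the example from above, and your group's permission conventions may differ.

```shell
# Group members can read and traverse the environment but not modify it;
# only the owner (the designated maintainer) keeps write access.
chmod -R g+rX,g-w,o-rwx /blue/akeil/share/conda
```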