How Orchest works
Orchest is powered by your filesystem. Upon launching, Orchest will
mount a directory called the userdir. Its default location is
orchest/orchest/userdir/. Inside this directory it will store the following
files for each pipeline:
- Your scripts that make up the pipeline, for example
- The Orchest Data passing SDK stores step outputs in the
.data directory to pass data between pipeline steps.
- Logs are stored in
.logs to show STDOUT output from scripts in the pipeline view.
- An autogenerated pipeline.json file that defines the properties of the pipeline and its steps. This includes: execution order, names, images, etc. Orchest needs this pipeline definition file to work.
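As a rough illustration of the layout described above, the following sketch checks a pipeline directory for the three entries named in the list (.data, .logs and pipeline.json). The function name inspect_pipeline_dir and the idea of pointing it at a single pipeline directory are assumptions made for this example; the exact nesting of pipelines inside the userdir is not specified here.

```python
from pathlib import Path

# Entries the text above says Orchest keeps for each pipeline.
EXPECTED_ENTRIES = (".data", ".logs", "pipeline.json")


def inspect_pipeline_dir(pipeline_dir):
    """Return a dict mapping each expected entry name to whether it exists.

    pipeline_dir is a hypothetical path to one pipeline's directory.
    """
    root = Path(pipeline_dir)
    return {name: (root / name).exists() for name in EXPECTED_ENTRIES}
```

A missing pipeline.json in the result would suggest the directory is not a pipeline Orchest can run, since the text notes that Orchest needs this file to work.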
Orchest runs as a collection of Docker containers and only stores a global configuration file. The
location for this config is
~/.config/orchest/config.json for Unix-based systems and
%UserProfile%\.orchest\config.json for Windows.
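The two locations above can be resolved programmatically. The sketch below is only an illustration: the helper name orchest_config_path and its os_name and home parameters are assumptions introduced here, not part of Orchest.

```python
import os
from pathlib import Path


def orchest_config_path(os_name=None, home=None):
    """Return the global config path described above for a platform.

    os_name and home can be overridden for illustration; by default they
    are taken from the running interpreter and the current user.
    """
    os_name = os.name if os_name is None else os_name
    home = Path.home() if home is None else Path(home)
    if os_name == "nt":
        # Windows: %UserProfile%\.orchest\config.json
        return home / ".orchest" / "config.json"
    # Unix-based systems: ~/.config/orchest/config.json
    return home / ".config" / "orchest" / "config.json"
```

Calling orchest_config_path() with no arguments returns the location for the current machine and user.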
Installing additional packages
We plan on supporting custom images and/or container commits, to avoid having to reinstall packages each time a pipeline step is run.
Installing additional Python packages
Execute commands inside the scripts to install the package before use.
For Jupyter notebooks you can run the following code in a cell:
!conda install <package name>
or, for pip packages, run:
!pip install <package name>
Or directly from within Python (i.e. for Python scripts):
from pip._internal import main as pip
pip(['install', '--user', '<package name>'])
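The call above relies on pip's internal module, which the pip project does not treat as a stable API. A possible alternative, sketched here with hypothetical helper names (pip_install_command and pip_install), is to run pip as a subprocess of the current interpreter:

```python
import subprocess
import sys


def pip_install_command(package, user=True):
    """Build the argument list for installing a package with pip."""
    # sys.executable makes sure pip runs under the same interpreter
    # as the script that calls it.
    cmd = [sys.executable, "-m", "pip", "install"]
    if user:
        cmd.append("--user")
    cmd.append(package)
    return cmd


def pip_install(package, user=True):
    """Run pip in a subprocess; raises CalledProcessError on failure."""
    subprocess.check_call(pip_install_command(package, user=user))
```

In a pipeline script this would be used as pip_install('<package name>') before the first import of that package.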
Installing additional R packages
R packages can be installed with the regular command: