
Before the release of Databricks Unity Catalog, we used init scripts to generate the pip.conf file during cluster startup, so each cluster could have its own auth token. Init scripts are no longer available in Unity Catalog's shared access mode, so an alternative approach is required.

One workaround is to place a prepared pip.conf in the Databricks workspace and set the PIP_CONFIG_FILE environment variable to point to that file. This raises a security concern: the pip.conf file contains the auth token in plain text, and because it lives in the workspace, the token is potentially exposed to all users and clusters that can read it. See here for details of this workaround.
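To make the exposure concrete, here is a minimal sketch of how anyone with read access to such a file could recover the token. The file contents, the host `pypi.example.com`, the user `build-user`, and the token value are all hypothetical placeholders, and the sketch assumes the credentials are embedded in `index-url`, which is one common layout but not the only one.

```python
import configparser
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Hypothetical contents of a shared pip.conf; host, user, and token are placeholders.
SHARED_PIP_CONF = """\
[global]
index-url = https://build-user:EXAMPLE_TOKEN@pypi.example.com/simple
"""


def extract_credentials(pip_conf_text: str) -> Tuple[Optional[str], Optional[str]]:
    """Return the (username, password) embedded in the index-url of a pip.conf."""
    # Disable interpolation so a '%' inside a token is not treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(pip_conf_text)
    index_url = parser.get("global", "index-url")
    parts = urlsplit(index_url)
    return parts.username, parts.password


if __name__ == "__main__":
    user, token = extract_credentials(SHARED_PIP_CONF)
    # Avoid printing the token itself; only report that it was recoverable.
    print(f"user={user!r}, token recovered: {token is not None}")
```

Nothing here is specific to Databricks: a few lines of standard-library parsing are enough, which is why a workspace-readable pip.conf effectively shares the token with every reader.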

In contrast, Unity Catalog's single user access mode still supports init scripts. Here, the pip auth token is stored securely in a vault and accessed through a Databricks secret scope. On cluster startup, the init script fetches the token from the vault and generates the pip.conf file, so the token never has to sit in a file shared across the workspace. This approach is considerably more secure than the shared mode workaround.
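The generation step could look roughly like the sketch below. It is written in Python for illustration only; an actual init script is a shell script and would need to invoke logic like this. It assumes the secret is exposed to the cluster as an environment variable (for example by setting `PYPI_TOKEN={{secrets/<scope>/<key>}}` in the cluster's environment variables). The variable name `PYPI_TOKEN`, the user `build-user`, the host, and the target path are hypothetical.

```python
import os
from pathlib import Path
from urllib.parse import quote


def render_pip_conf(token: str, index_host: str, username: str = "build-user") -> str:
    """Build the pip.conf text; the token is URL-encoded so special characters survive."""
    encoded = quote(token, safe="")
    return f"[global]\nindex-url = https://{username}:{encoded}@{index_host}/simple\n"


def write_pip_conf(target: Path, index_host: str, token_env_var: str = "PYPI_TOKEN") -> Path:
    """Read the token from the environment and write a pip.conf readable only by its owner."""
    token = os.environ.get(token_env_var)
    if not token:
        raise RuntimeError(f"environment variable {token_env_var} is not set")
    target.parent.mkdir(parents=True, exist_ok=True)
    # Create the file with mode 0600 up front so the token is never world-readable.
    fd = os.open(str(target), os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(render_pip_conf(token, index_host))
    return target


# Example call (hypothetical path and host), e.g. from a cluster startup step:
# write_pip_conf(Path("/etc/pip.conf"), index_host="pypi.example.com")
```

Because the file is written on the cluster's own disk at startup, each cluster can receive its own token, and the secret itself stays in the vault instead of in the workspace.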
