How can I use bamboolib in Databricks?

Question:

I would like to do exploratory data analysis automatically in Azure Databricks, and I have seen what bamboolib can do, for example in this post: https://towardsdatascience.com/the-easy-way-to-do-data-exploration-22b4b8e1dc20

But when I follow the same steps in Databricks, the extension is not enabled. I tried something like this:

import bamboolib as bam
import pandas as pd

I also tried adding the following lines to enable the extension:

bam.enable()

# Jupyter Notebook extensions
!python -m bamboolib install_nbextensions

I have also read that bamboolib is "joining forces" with Databricks, but I still cannot tell whether the integration is available yet, and I have not found any documentation about it.

I would really appreciate it if anyone could explain how to use bamboolib with Databricks.

Answers:

You can install the bamboolib library using either of the two approaches below.

  1. pip install bamboolib

  2. Install the library on the Databricks cluster.

You can refer to this article by Rahul Agarwal.

Answered By: Abhishek Khandave

I am on the team at Databricks working on the bamboolib integration and I am excited that you want to take bamboolib for a spin.

Update: As of September 13, 2022, bamboolib is in public preview within Databricks notebooks that use DBR 11 or higher (DBR 11.1 or higher on GCP).

Link to the AWS docs

Answered By: fwetdb

For me, the issue was solved after setting the Databricks Runtime to version 11.0. Read more about the requirements of bamboolib on Databricks here: https://docs.microsoft.com/en-us/azure/databricks/notebooks/bamboolib#requirements (DBR 11.0 is the minimum requirement)
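
If it helps, here is a minimal sketch of how one might check the runtime from inside a notebook before trying bamboolib. It assumes the cluster exposes its version in the `DATABRICKS_RUNTIME_VERSION` environment variable; the helper names `parse_runtime_version` and `runtime_supports_bamboolib` are made up for this illustration and are not part of bamboolib or Databricks. The default minimum is the 11.0 mentioned above, so on GCP you would pass `(11, 1)` instead.

```python
import os

# Minimum runtime taken from the requirement quoted above (DBR 11.0).
MIN_DBR = (11, 0)


def parse_runtime_version(raw):
    """Turn a string such as '11.3' or '11.3.x-scala2.12' into (11, 3)."""
    parts = raw.split(".")
    major = int(parts[0])  # raises ValueError if the string is not numeric
    minor_digits = ""
    if len(parts) > 1:
        # Keep only the leading digits of the minor component.
        for ch in parts[1]:
            if not ch.isdigit():
                break
            minor_digits += ch
    minor = int(minor_digits) if minor_digits else 0
    return (major, minor)


def runtime_supports_bamboolib(raw, minimum=MIN_DBR):
    """Return True if the given runtime string is at least `minimum`."""
    if not raw:
        return False
    try:
        return parse_runtime_version(raw) >= minimum
    except ValueError:
        return False


# Assumption: Databricks sets this variable on the cluster; elsewhere it is empty.
raw_version = os.environ.get("DATABRICKS_RUNTIME_VERSION", "")
if runtime_supports_bamboolib(raw_version):
    print("Runtime looks new enough for bamboolib:", raw_version)
else:
    print("Runtime missing or older than 11.0:", repr(raw_version))
```

If the check reports an older runtime, switching the cluster to DBR 11.0 or higher is the fix that worked for me.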

Answered By: nilson_020