What is Azure Databricks?
By Rajeev Sharma |
June 06, 2021 5 minute read
Azure Databricks is a cloud service that lets us set up and use a cluster of Azure instances with Apache Spark installed, with computation distributed across master and worker nodes.
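To make the master-worker idea concrete, here is a minimal sketch in plain Python. It is not Spark or Azure Databricks code; the function names (`split_into_partitions`, `worker_task`, `run_job`) are hypothetical, and a thread pool stands in for the worker nodes. It only illustrates the general pattern: the master splits the data, the workers compute partial results, and the master combines them.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(data, num_partitions):
    """Master step: divide the data into roughly equal chunks."""
    # Ceiling division, with a floor of 1 so empty input does not give a zero step.
    size = max(1, -(-len(data) // num_partitions))
    return [data[i:i + size] for i in range(0, len(data), size)]


def worker_task(partition):
    """Worker step: compute a partial result (sum of squares) for one partition."""
    return sum(x * x for x in partition)


def run_job(data, num_workers=4):
    """Master: hand partitions to the workers, then combine the partial results."""
    partitions = split_into_partitions(data, num_workers)
    # Threads stand in for worker nodes here (an assumption for illustration).
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_sums = list(pool.map(worker_task, partitions))
    return sum(partial_sums)


total = run_job(list(range(1, 11)))
print(total)  # sum of the squares of 1..10
```

On a real cluster the workers are separate machines and Spark handles the partitioning and scheduling, but the division of labour follows the same shape.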
Workspace
A workspace is an environment for accessing all Azure Databricks assets. It organizes objects like notebooks, libraries, dashboards, and experiments into folders and provides access to data objects and computational resources.
Objects contained in the Azure Databricks folders are:
- Notebook
- Dashboard
- Library
- Experiment
Notebook
A notebook is a web-based interface to documents that contain runnable commands, visualizations, and narrative text.
Dashboard
An interface that provides organized access to visualizations.
Library
A library is a package of code available to the notebook or job running on your cluster. Databricks runtimes include many libraries, and you can add your own.
Experiment
A collection of MLflow runs for training a machine learning model.
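The way a workspace groups these four object types into folders can be pictured as a small tree. The sketch below is only an illustrative model in plain Python, not the Azure Databricks API; the class names (`ObjectKind`, `WorkspaceObject`, `Folder`) and the sample folder and object names are made up for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class ObjectKind(Enum):
    """The four workspace object types described above."""
    NOTEBOOK = "notebook"
    DASHBOARD = "dashboard"
    LIBRARY = "library"
    EXPERIMENT = "experiment"


@dataclass
class WorkspaceObject:
    name: str
    kind: ObjectKind


@dataclass
class Folder:
    name: str
    objects: list = field(default_factory=list)
    subfolders: list = field(default_factory=list)

    def find(self, kind, prefix=""):
        """Return the paths of all objects of the given kind under this folder."""
        path = f"{prefix}/{self.name}"
        found = [f"{path}/{obj.name}" for obj in self.objects if obj.kind is kind]
        for sub in self.subfolders:
            found.extend(sub.find(kind, path))
        return found


# Hypothetical workspace layout, for illustration only.
root = Folder(
    "Shared",
    objects=[
        WorkspaceObject("etl", ObjectKind.NOTEBOOK),
        WorkspaceObject("sales-kpis", ObjectKind.DASHBOARD),
    ],
    subfolders=[
        Folder(
            "ml",
            objects=[
                WorkspaceObject("churn-training", ObjectKind.NOTEBOOK),
                WorkspaceObject("churn-model", ObjectKind.EXPERIMENT),
            ],
        )
    ],
)

notebook_paths = root.find(ObjectKind.NOTEBOOK)
print(notebook_paths)  # ['/Shared/etl', '/Shared/ml/churn-training']
```

Walking the tree by object kind mirrors how you browse a workspace: folders hold the objects, and each object keeps its type.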
Azure Databricks provides the latest versions of Apache Spark and allows you to integrate seamlessly with open source libraries. It operates out of a control plane and a data plane.
The control plane includes backend services that Azure Databricks manages in its own Azure account. Notebook commands and many other workspace configurations are stored in the control plane and encrypted at rest.
The data plane is managed by your Azure account; it is where your data resides and where it is processed. Azure Databricks connectors let clusters connect to external data sources outside the Azure account, either to ingest data or to store it. You can also ingest data from external streaming sources, such as events data, IoT data, and more.