Azure Databricks Training

Module 1: Introduction to Azure Databricks

Lesson 1: Overview of Azure Databricks

  • What is Databricks? (Unified Data Analytics Platform)
  • Key components: Workspace, Clusters, Notebooks, Jobs
  • Integration with Azure services (Blob Storage, Synapse, ADF)

Lesson 2: Architecture & Key Concepts

  • Databricks Runtime (Optimized Spark)
  • Workspace organization (Folders, Repos, Teams)
  • Cluster types (All-purpose, Job, High-Concurrency)

Lesson 3: Pricing & Cost Optimization

  • DBU (Databricks Unit) pricing model
  • Cluster auto-scaling & termination policies
  • Spot instances & cost-saving best practices
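
As a rough illustration of the DBU pricing model covered in this lesson, the sketch below estimates the cost of a cluster run as a DBU charge plus the charge for the underlying virtual machines. The function name, parameters, and all rates are hypothetical placeholders chosen for easy arithmetic; they are not actual Azure prices.

```python
def estimate_cluster_cost(nodes, hours, dbu_per_node_hour, dbu_rate, vm_rate_per_hour):
    """Return (dbu_cost, vm_cost, total_cost) for one cluster run.

    All inputs are assumptions supplied by the caller:
      nodes             - number of VMs in the cluster
      hours             - how long the cluster runs
      dbu_per_node_hour - DBUs one node consumes per hour
      dbu_rate          - price per DBU
      vm_rate_per_hour  - price per VM per hour
    """
    dbu_cost = nodes * hours * dbu_per_node_hour * dbu_rate
    vm_cost = nodes * hours * vm_rate_per_hour
    return dbu_cost, vm_cost, dbu_cost + vm_cost


# Hypothetical comparison: a cluster left idle for 6 extra hours versus one
# that auto-terminates after the 2 hours of real work.
left_running = estimate_cluster_cost(4, 8, 0.75, 0.5, 0.25)
auto_terminated = estimate_cluster_cost(4, 2, 0.75, 0.5, 0.25)
savings = left_running[2] - auto_terminated[2]
```

Because both charges scale with node count and running time, auto-scaling and auto-termination reduce both parts of the bill at once.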

Module 2: Setting Up Azure Databricks

Lesson 1: Deployment & Configuration

  • Creating a Databricks workspace in Azure
  • Microsoft Entra ID (formerly Azure Active Directory, AAD) integration
  • Network security (VNet, Private Link)

Lesson 2: Workspace Navigation

  • Databricks UI overview
  • Notebooks, Repos, and Workspace organization
  • Managing users and permissions

Lesson 3: Hands-on Lab

  • Deploy a Databricks workspace
  • Create your first notebook
  • Run a simple PySpark query

Module 3: Data Ingestion & Processing

Lesson 1: Reading & Writing Data

  • Connecting to Azure Data Lake (ADLS Gen2)
  • Delta Lake format (ACID transactions, schema enforcement)
  • Supported data sources (CSV, JSON, Parquet, SQL DB)

Lesson 2: ETL with Spark

  • Transformations (filter, join, aggregate)
  • Structured Streaming for real-time data
  • Optimizing Spark jobs (partitioning, caching)

Lesson 3: Hands-on Lab

  • Ingest data from Azure Blob Storage
  • Clean and transform data using PySpark
  • Write processed data to Delta Lake

Module 4: Delta Lake & Advanced Data Engineering

Lesson 1: Delta Lake Deep Dive

  • Time travel (querying historical data)
  • MERGE, UPDATE, DELETE operations
  • Optimizations (Z-ordering, compaction)

Lesson 2: Databricks SQL & BI Integration

  • SQL Warehouses in Databricks
  • Creating dashboards with Databricks SQL
  • Connecting Power BI to Databricks

Lesson 3: Hands-on Lab

  • Implement SCD (Slowly Changing Dimensions) in Delta
  • Optimize a Delta table for fast queries
  • Build a Databricks SQL dashboard
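
To make the SCD lab step more concrete, here is a minimal sketch of Type 2 logic in plain Python, with an in-memory list of dicts standing in for a Delta table. In the lab itself this would more likely be expressed with a Delta MERGE; the function name and the column names (`valid_from`, `valid_to`, `is_current`, `customer_id`, `city`) are assumptions made for illustration.

```python
from datetime import date


def apply_scd2(dimension, updates, key, tracked, effective):
    """Return a new dimension list with SCD Type 2 changes applied.

    dimension: rows with the key, tracked columns, 'valid_from',
               'valid_to' and 'is_current'
    updates:   rows holding the latest source values
    key:       name of the business-key column
    tracked:   columns whose changes should create a new version
    effective: date on which the new versions become valid
    """
    result = [dict(row) for row in dimension]  # copy; input is not modified
    current = {row[key]: row for row in result if row["is_current"]}

    for update in updates:
        existing = current.get(update[key])
        if existing is not None and all(existing[c] == update[c] for c in tracked):
            continue  # nothing changed, keep the current version
        if existing is not None:
            # Close the old version instead of overwriting it.
            existing["valid_to"] = effective
            existing["is_current"] = False
        new_row = {key: update[key]}
        new_row.update({c: update[c] for c in tracked})
        new_row.update({"valid_from": effective, "valid_to": None, "is_current": True})
        result.append(new_row)
        current[update[key]] = new_row
    return result


# Hypothetical sample data.
dim = [{"customer_id": 1, "city": "Oslo", "valid_from": date(2023, 1, 1),
        "valid_to": None, "is_current": True}]
src = [{"customer_id": 1, "city": "Bergen"}, {"customer_id": 2, "city": "Paris"}]
new_dim = apply_scd2(dim, src, "customer_id", ["city"], date(2024, 1, 1))
```

The changed customer ends up with two rows (one closed, one current), which is what distinguishes Type 2 from a plain overwrite.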

Module 5: Machine Learning & AI in Databricks

Lesson 1: MLflow for Model Tracking

  • Experiment tracking
  • Model registry & deployment
  • Hyperparameter tuning

Lesson 2: Distributed ML with Spark

  • MLlib for scalable machine learning
  • Feature engineering at scale
  • Integrating with Azure ML

Lesson 3: Hands-on Lab

  • Train a machine learning model using MLflow
  • Deploy a model for batch inference
  • Monitor model performance

Module 6: Job Automation & Orchestration

Lesson 1: Databricks Jobs

  • Scheduling notebooks & workflows
  • Job clusters vs. all-purpose (interactive) clusters
  • Error handling & retries

Lesson 2: Integration with Azure Data Factory

  • Triggering Databricks jobs from ADF
  • Parameterized notebooks
  • Monitoring job performance

Lesson 3: Hands-on Lab

  • Schedule a daily ETL job
  • Set up alerts for job failures
  • Orchestrate a multi-step pipeline

Module 7: Security, Governance & Performance Tuning

Lesson 1: Access Control & Security

  • Role-based access control (RBAC)
  • Secret management (Databricks Secrets)
  • Encryption & compliance

Lesson 2: Performance Tuning & Optimization

  • Cluster configuration best practices
  • Spark UI & debugging
  • Delta Lake performance tuning

Lesson 3: Hands-on Lab

  • Configure table access control
  • Optimize a slow-running Spark job
  • Implement data masking

Module 8: Real-World Use Cases & Capstone Project

Lesson 1: Industry Applications

  • Real-time analytics (IoT, clickstream)
  • Data warehousing with Databricks SQL
  • AI/ML use cases (fraud detection, recommendation engines)

Lesson 2: Capstone Project

  • End-to-end data pipeline: Ingest → Process → Analyze → Visualize
  • Example: Retail sales forecasting

Price: 3000
Day(s): 2
