Intermediate · 2 days

Azure Data Lake Storage Gen2

Build enterprise-grade data lakes on scalable cloud storage

Overview

Azure Data Lake Storage Gen2 (ADLS Gen2) is the storage foundation of most modern analytics platforms on Azure. Built on Azure Blob Storage with a hierarchical namespace, it delivers Hadoop-compatible access at cloud scale, fine-grained POSIX-style access control, and multi-petabyte capacity for any data format. This training teaches you how to architect, secure, and optimise a production data lake that integrates seamlessly with Azure Databricks, Azure Synapse Analytics, Azure Data Factory, and Microsoft Fabric.

What you'll learn

  • Understand the ADLS Gen2 architecture and how the hierarchical namespace differs from regular Blob Storage
  • Design a folder structure following data lake zoning best practices (raw, enriched, curated)
  • Implement fine-grained access control using RBAC and POSIX ACLs
  • Configure lifecycle management policies to tier and expire data cost-effectively
  • Connect ADLS Gen2 to Spark, Synapse, ADF, and Microsoft Fabric workloads
  • Monitor storage performance, diagnose access issues, and secure data at rest and in transit

Programme

Day 1 — Architecture, provisioning & security
  • What makes ADLS Gen2 different: hierarchical namespace and the ABFS driver
  • Storage account types, replication options, and performance tiers
  • Designing data lake zones: raw, enriched, curated, and sandbox layers
  • Access control deep-dive: Azure RBAC, POSIX ACLs, and Shared Access Signatures
  • Encryption at rest and in transit: Microsoft-managed vs customer-managed keys
  • Hands-on: provision a storage account, enable hierarchical namespace, and configure zone folders with correct ACLs
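The Day 1 hands-on can be sketched with the Azure CLI. This is an illustrative outline, not the exact lab script: account, resource-group, and region names (mylakeacct, rg-datalake, westeurope) are placeholders, and <group-object-id> stands for a real Microsoft Entra ID group object ID.

```shell
# Create a resource group and a StorageV2 account with the hierarchical
# namespace enabled — this is what makes it ADLS Gen2 rather than plain Blob.
az group create --name rg-datalake --location westeurope

az storage account create \
  --name mylakeacct \
  --resource-group rg-datalake \
  --kind StorageV2 \
  --sku Standard_LRS \
  --enable-hierarchical-namespace true

# Create one file system (container) per data lake zone.
for zone in raw enriched curated sandbox; do
  az storage fs create --name "$zone" --account-name mylakeacct --auth-mode login
done

# Example POSIX-style ACL on the raw zone root: owner full access,
# a data-engineering group read/execute, everyone else no access.
az storage fs access set \
  --acl "user::rwx,group::r-x,group:<group-object-id>:r-x,other::---" \
  --path "/" \
  --file-system raw \
  --account-name mylakeacct \
  --auth-mode login
```

Note that ACLs set on a folder apply only to that folder unless default ACLs are also configured, which is one of the access-control pitfalls the deep-dive covers.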
Day 2 — Integration, lifecycle management & optimisation
  • Connecting ADLS Gen2 to Azure Databricks, Synapse Analytics, and Microsoft Fabric
  • Azure Data Factory: reading and writing to ADLS Gen2 at scale
  • Optimising file formats: Parquet, Delta Lake, and ORC — when to use each
  • Lifecycle management policies: automatically move data to cool and archive tiers
  • Network security: private endpoints, firewalls, and virtual network service endpoints
  • Hands-on: build a multi-zone data lake pipeline from raw ingestion to curated Delta tables
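As a taste of the lifecycle-management topic, a minimal policy might tier and expire data in the raw zone based on age. This is a sketch under assumed names (mylakeacct, rg-datalake, rule name tier-and-expire-raw); the day thresholds are examples, not recommendations.

```shell
# policy.json — move block blobs under raw/ to the cool tier after 30 days,
# to archive after 90 days, and delete them after 365 days,
# all measured from last modification time.
cat > policy.json <<'EOF'
{
  "rules": [
    {
      "name": "tier-and-expire-raw",
      "enabled": true,
      "type": "Lifecycle",
      "definition": {
        "filters": { "blobTypes": ["blockBlob"], "prefixMatch": ["raw/"] },
        "actions": {
          "baseBlob": {
            "tierToCool":    { "daysAfterModificationGreaterThan": 30 },
            "tierToArchive": { "daysAfterModificationGreaterThan": 90 },
            "delete":        { "daysAfterModificationGreaterThan": 365 }
          }
        }
      }
    }
  ]
}
EOF

# Apply the policy to the storage account.
az storage account management-policy create \
  --account-name mylakeacct \
  --resource-group rg-datalake \
  --policy @policy.json
```

Because the policy filters on a prefix, a well-designed zone folder structure (Day 1) directly determines how cheaply data can be tiered and expired.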

Who is this for?

  • Data engineers designing or modernising an Azure data platform
  • Cloud architects evaluating scalable storage options for analytics
  • Platform teams responsible for data governance and access control
  • Professionals working with Azure Synapse, Databricks, or Microsoft Fabric

Prerequisites

  • Basic Azure familiarity (portal, resource groups, subscriptions)
  • Understanding of cloud storage concepts (containers, blobs)
  • Some experience with data engineering tools is helpful

Tools & technologies covered

  • Azure Data Lake Storage Gen2
  • Azure Blob Storage
  • ABFS Driver
  • Azure RBAC
  • Azure Data Factory
  • Delta Lake
  • Azure Portal
  • Azure CLI