Intermediate · 2 days

Azure Data Lake Storage Gen2

Build enterprise-grade data lakes on scalable cloud storage

Overview

Azure Data Lake Storage Gen2 (ADLS Gen2) is the storage foundation of most modern analytics platforms on Azure. Built on Azure Blob Storage with a hierarchical namespace, it delivers Hadoop-compatible access at cloud scale, fine-grained POSIX-style access control, and multi-petabyte capacity for any data format. This training teaches you how to architect, secure, and optimise a production data lake that integrates seamlessly with Azure Databricks, Azure Synapse Analytics, Azure Data Factory, and Microsoft Fabric.

What you'll learn

  • Understand the ADLS Gen2 architecture and how the hierarchical namespace differs from regular Blob Storage
  • Design a folder structure following data lake zoning best practices (raw, enriched, curated)
  • Implement fine-grained access control using RBAC and POSIX ACLs
  • Configure lifecycle management policies to tier and expire data cost-effectively
  • Connect ADLS Gen2 to Spark, Synapse, ADF, and Microsoft Fabric workloads
  • Monitor storage performance, diagnose access issues, and secure data at rest and in transit

Programme

Day 1 — Architecture, provisioning & security
  • What makes ADLS Gen2 different: hierarchical namespace and the ABFS driver
  • Storage account types, replication options, and performance tiers
  • Designing data lake zones: raw, enriched, curated, and sandbox layers
  • Access control deep-dive: Azure RBAC, POSIX ACLs, and Shared Access Signatures
  • Encryption at rest and in transit: Microsoft-managed vs customer-managed keys
  • Hands-on: provision a storage account, enable hierarchical namespace, and configure zone folders with correct ACLs
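The Day 1 hands-on can be sketched with the Azure CLI. This is an illustrative outline, not the exact lab script: account, resource-group, and region names (mylakeacct, rg-datalake, westeurope) are placeholders, and <group-object-id> stands for a real Microsoft Entra ID group object ID.

```shell
# Create a resource group and a StorageV2 account with the hierarchical
# namespace enabled — this is what makes it ADLS Gen2 rather than plain Blob.
az group create --name rg-datalake --location westeurope

az storage account create \
  --name mylakeacct \
  --resource-group rg-datalake \
  --kind StorageV2 \
  --sku Standard_LRS \
  --enable-hierarchical-namespace true

# Create one file system (container) per data lake zone.
for zone in raw enriched curated sandbox; do
  az storage fs create --name "$zone" --account-name mylakeacct --auth-mode login
done

# Example POSIX-style ACL on the raw zone root: owner full access,
# a data-engineering group read/execute, everyone else no access.
az storage fs access set \
  --acl "user::rwx,group::r-x,group:<group-object-id>:r-x,other::---" \
  --path "/" \
  --file-system raw \
  --account-name mylakeacct \
  --auth-mode login
```

Note that ACLs set on a folder apply only to that folder unless default ACLs are also configured, which is one of the access-control pitfalls the deep-dive covers.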
Day 2 — Integration, lifecycle management & optimisation
  • Connecting ADLS Gen2 to Azure Databricks, Synapse Analytics, and Microsoft Fabric
  • Azure Data Factory: reading and writing to ADLS Gen2 at scale
  • Optimising file formats: Parquet, Delta Lake, and ORC — when to use each
  • Lifecycle management policies: automatically move data to cool and archive tiers
  • Network security: private endpoints, firewalls, and virtual network service endpoints
  • Hands-on: build a multi-zone data lake pipeline from raw ingestion to curated Delta tables
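As a taste of the lifecycle-management topic, a minimal policy might tier and expire data in the raw zone based on age. This is a sketch under assumed names (mylakeacct, rg-datalake, rule name tier-and-expire-raw); the day thresholds are examples, not recommendations.

```shell
# policy.json — move block blobs under raw/ to the cool tier after 30 days,
# to archive after 90 days, and delete them after 365 days,
# all measured from last modification time.
cat > policy.json <<'EOF'
{
  "rules": [
    {
      "name": "tier-and-expire-raw",
      "enabled": true,
      "type": "Lifecycle",
      "definition": {
        "filters": { "blobTypes": ["blockBlob"], "prefixMatch": ["raw/"] },
        "actions": {
          "baseBlob": {
            "tierToCool":    { "daysAfterModificationGreaterThan": 30 },
            "tierToArchive": { "daysAfterModificationGreaterThan": 90 },
            "delete":        { "daysAfterModificationGreaterThan": 365 }
          }
        }
      }
    }
  ]
}
EOF

# Apply the policy to the storage account.
az storage account management-policy create \
  --account-name mylakeacct \
  --resource-group rg-datalake \
  --policy @policy.json
```

Because the policy filters on a prefix, a well-designed zone folder structure (Day 1) directly determines how cheaply data can be tiered and expired.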

Who is this for?

  • Data engineers designing or modernising an Azure data platform
  • Cloud architects evaluating scalable storage options for analytics
  • Platform teams responsible for data governance and access control
  • Professionals working with Azure Synapse, Databricks, or Microsoft Fabric

Prerequisites

  • Basic Azure familiarity (portal, resource groups, subscriptions)
  • Understanding of cloud storage concepts (containers, blobs)
  • Some experience with data engineering tools is helpful

Tools & technologies covered

  • Azure Data Lake Storage Gen2
  • Azure Blob Storage
  • ABFS Driver
  • Azure RBAC
  • Azure Data Factory
  • Delta Lake
  • Azure Portal
  • Azure CLI