Saturday, 23 March 2024

Azure Data Engineer Online Training in Hyderabad

Introduction to Azure

· Introduction to Azure Cloud

· Difference between Azure Cloud and On-Premises

· What are Subscriptions and Resource Groups

· Different Cloud offerings: IaaS, PaaS and SaaS

· Creation of Virtual Machine

Introduction to Storage

· Azure Storage

o Azure Blob

o Table

o Message

o Queue

· Azure Data Lake Storage Gen1 & Gen2

o What is Data Lake

o Data Lake vs. Hadoop

o Blob Storage vs. Data Lake

o Hierarchical Namespace

o Ingestion through different tools, e.g. Azure Data Explorer, AzCopy, Azure CLI, PowerShell

Introduction to Azure SQL Database

· Introduction to Azure SQL Database

· Why choose SQL Server in Azure

· Azure IaaS vs. PaaS database offerings

· IaaS vs. Managed Instance

· SQL Server PaaS deployment options

· Demo - Azure Single Database

· Purchasing models and Service Tier

· Azure Database vs. Azure Data Warehouse

· Elastic Database Pool

o Introduction

o Azure Elastic Database

o Demo - Azure Elastic Database

· Managed Instance Database

o Introduction

o Azure Managed Instance Database

o Difference between on-premises and managed instance

o Migration options for Managed Instance

o Service tiers for Managed Instance

o Demo - Managed Instance

· Azure Database Security

o Introduction

o Azure Database and Managed Instance Security options

o Encrypting Data at Rest and Motion

o High Availability vs. Disaster Recovery

o RTO vs. RPO

o Azure SQL Database High Availability and Disaster Recovery options

o Azure SQL Database Scaling

· Installation of SQL Server 2016 and above in Virtual Machine

· Creation of External Table or PolyBase in On-Premises SQL Server

o Creation of Master Key

o Creation of Database Scoped Credential

o Creation of External Data Source

o Creation of External File Format

o Creation of External Table

· Creation of External Table or PolyBase in Azure SQL Data Warehouse

o Creation of Master Key

o Creation of Database Scoped Credential

o Creation of External Data Source

o Creation of External File Format

o Creation of External Table

· Different Distribution or Sharding Patterns

o ROUND_ROBIN

o HASH

o REPLICATE

· Cross Query Databases in Azure SQL Database

o Creation of Master Key

o Creation of Database Scoped Credential

o Creation of External Data Source

o Creation of External Table

· Creation of Elastic Pools in Azure SQL Server between Databases

Data Warehouse Internals and Architecture

· Introduction

· Azure Synapse MPP Architecture

· Storage and Sharding patterns

· Data Distribution and Distributing Keys

· Data Types and Table Types

· Partitioning

· Data Warehouse Concepts

· Dimensions and Facts

· Types of Dimensions and Facts

· Different types of Schemas in Data Warehouse

· Relationship types in Data Warehouse

· Best Practices for Fact and Dimension tables

· Demo - Analyze Data distribution before migration to Azure Synapse

Azure Data Factory

· Introduction to Azure Data Factory

· Creation of Linked Services, Datasets, Pipelines

· Creation of Integration Runtime and different types

· Slowly Changing Dimensions

· Design and implement a Type 1 slowly changing dimension with mapping data flows

· Debug data factory pipelines

· Understand the Azure SSIS Integration Runtime

· Set-up Azure SSIS Integration Runtime

· Run SSIS Package in Azure Data Factory

· Migrate SSIS Packages to Azure Data Factory

· Integrate SQL Server Integration Services Packages within Azure Data Factory

· Activities

o Copy

o Data flow

o Stored Procedure

o Lookup

o ForEach

o Get Metadata

o Filter Activity

o Spark

o U-SQL

o Databricks Notebooks

o Web

o If Condition

o Delete

· Data Flows

o Derived Column

o Join

o Filter

o Exists

o Conditional Split

o Lookup

o Select

o Aggregate

o Rank

o Sort

o Alter Row

· Dynamic Queries in ADF

· Sending mails through Logic Apps

· A few more Activities ...

· Dataset and Pipeline Parameterization

· Monitoring: Azure Monitor and visual monitoring

· Setup Alerts from Azure Data Factory

Realize Integrated Analytical Solutions with Azure Synapse Analytics

· Introduction

· What is Azure Synapse Analytics

· How Azure Synapse Analytics works

· When to use Azure Synapse Analytics

· Create Azure Synapse Analytics workspace

· Exercise - Create and manage Azure Synapse Analytics workspace

· Describe Azure Synapse Analytics SQL

· Explain Apache Spark in Azure Synapse Analytics

· Exercise - Create pools in Azure Synapse Analytics

· Orchestrate data integration with Azure Synapse pipelines

· Exercise - Identify Azure Synapse pipeline components

· Visualize your analytics with Power BI

· Understand hybrid transactional analytical processing with Azure Synapse Link

· Use Azure Synapse Studio

· Understand the Azure Synapse Analytical processes

· Explore the Data hub, Develop hub, Integrate hub

· Explore the Monitor hub, Manage hub

· Describe a modern data warehouse

· Define a modern data warehouse architecture

· Exercise - Identify modern data warehouse architecture components

· Design ingestion patterns for a modern data warehouse

· Understand data storage for a modern data warehouse

· Understand file formats and structure for a modern data warehouse

· Prepare and transform data with Azure Synapse Analytics

· Serve data for analysis with Azure Synapse Analytics

Azure Synapse Analytics

Introduction

· Why Warehouse in cloud

· Traditional vs. Modern Warehouse architecture

· What is Synapse Analytics Service

· Create Dedicated SQL Pool and Spark Pool

· Create Azure Synapse Analytics Studio Workspace

· Analyze Data using Dedicated SQL Pool and Spark Pool

· Analyze Data using Apache Spark Notebook

· Analyze Data using Serverless SQL Pool

· Azure Synapse Benefits

Azure Event Hub, IoT Hub and Azure Stream Analytics

· Introduction to Azure Event Hub, IoT Hub and Stream Analytics

· Azure Stream Analytics Job

· Azure Stream Analytics Components

· Batching Streaming using Azure Event Hub

· Real Time Streaming using Azure IoT Hub

· Types of Window Functions

o Tumbling Window

o Hopping Window

o Sliding Window

o Session Window

Azure Databricks

· Spark Basics

· Why is Spark difficult? Why did Databricks evolve?

· Why Databricks in Cloud? Introduction to Azure Databricks

· Demo

· Provision Databricks, Clusters and Notebooks

· Mount Data Lake to Databricks DBFS

· Explore, Analyze, Clean, Transform and Load Data in Databricks

· Azure Databricks Clusters

· Azure Databricks other Important Components

· Databricks - Monitoring

· How to create Cluster

· How to work with Databricks File System

· How to create notebooks and Integrate with ADF

· How to import and export the Notebooks

· How to connect to blob, SQL DB from Databricks

· How to read data files from Azure Blob and Azure Data Lake Store

§ Using Scala, R, Python, Spark SQL Language

· Creating Data Frames

· Converting Data Frames into Temporary Table or Temporary View

· Incremental and Full Load with Azure SQL Data Warehouse

· Understand the architecture of Azure Databricks spark cluster

· Understand the architecture of spark job

· Read data in CSV format

· Read data in JSON format

· Read data in Parquet format

· Read data stored in tables and views

· Write data

· Describe a DataFrame

· Use common DataFrame methods

· Use the display function

· Exercise: Distinct articles

· Describe the difference between eager and lazy execution

· Describe the fundamentals of how the Catalyst Optimizer works

· Define and identify actions and transformations

· Describe the column class

· Work with column expressions

· Perform date and time manipulation

· Use aggregate functions

· Exercise: Deduplication of data

· Describe the Azure Databricks platform architecture

· Perform data protection

· Describe Azure key vault and Databricks security scopes

· Secure access with Azure IAM and authentication

· Describe security

· Exercise: Access Azure Storage with key vault-backed secrets

· Describe the open source Delta Lake

· Exercise: Work with basic Delta Lake functionality

· Describe how Azure Databricks manages Delta Lake

· Exercise: Use the Delta Lake Time Machine and perform optimization

· Describe Azure Databricks structured streaming

· Perform stream processing using structured streaming

· Work with Time Windows

· Process data from Event Hubs with structured streaming

· Describe bronze, silver, and gold architecture

· Perform batch and stream processing

· Schedule Databricks jobs in a data factory pipeline

· Pass parameters into and out of Databricks jobs in data factory

· Integrate with Azure Synapse Analytics

· Understand workspace administration best practices

· List security best practices

· Describe tools and integration best practices

· Explain Databricks runtime best practices

· Understand cluster best practices

Azure Cosmos DB

Introduction to NoSQL DB

· Introduction to NoSQL

· SQL vs. NoSQL

· Types of NoSQL

· NoSQL Offerings by Microsoft

Introduction to Cosmos DB

· Cosmos DB Features

· Cosmos DB - Multi Model 5 APIs

· Table Storage vs. Cosmos DB

· Provision Cosmos DB Account

On-Premises Database Migration

· Azure Database Migration Service (DMS)

· On-Premises SQL Server to Azure Virtual Machine

· On-Premises SQL Server to Azure SQL Server

VM Connector: Publish vs. Publish Consume in MuleSoft

The VM connector in Mule 4 passes messages between flows through in-memory or persistent queues. Despite the name, it has nothing to do with virtual machine hosts. It offers two primary operations for sending messages: Publish and Publish Consume. Here is a breakdown of what each does and when to use it:

VM Publish:

Concept: This operation sends a message to a specific VM queue. The message is placed in the queue, and the publishing flow continues its execution without waiting for a response.
Use Cases:

Ideal for situations where a flow needs to send data to another flow asynchronously.
Useful for decoupling message producers from consumers, allowing them to operate independently.
Well-suited for one-way communication scenarios where the sending flow doesn't require a reply.
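
The fire-and-forget behaviour described above can be pictured with a small Python model. This is only a conceptual sketch, not Mule code: the `orders` queue, the `publish` helper, and the `consumer` function are hypothetical stand-ins for a VM queue, the Publish operation, and the flow that listens on the queue.

```python
import queue
import threading

# Hypothetical in-memory stand-in for a VM queue; this is not the Mule API.
orders = queue.Queue()
processed = []

def consumer():
    """Plays the role of the flow that listens on the queue."""
    while True:
        message = orders.get()
        if message is None:  # sentinel used only to stop this demo
            break
        processed.append(message.upper())

def publish(message):
    """Enqueue the message and return immediately; no reply is awaited."""
    orders.put(message)

worker = threading.Thread(target=consumer)
worker.start()

publish("order-1")
publish("order-2")
# The publishing side is free to carry on here while the consumer works.

orders.put(None)  # shut the worker down for this demo
worker.join()
print(processed)
```

The point of the sketch is that `publish` returns as soon as the message is queued, so the producer never learns whether or when the consumer finished.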

VM Publish Consume:

Concept: This operation combines publishing a message with waiting for a response from the consuming flow. It sends a message to a VM queue and then waits, up to a configurable timeout, for the consuming flow to send back a response message.
Use Cases:

Applicable for scenarios where a flow needs to send data and expect a reply from another flow.
Useful for request-response communication patterns within your application.
Well-suited for situations where the sending flow relies on the response from the consuming flow to proceed further.
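
The request-response behaviour described above can be modelled in the same spirit. Again, this is a conceptual Python sketch rather than Mule code: `requests`, `listener`, `publish_consume`, and the per-request reply queue are assumptions made for illustration, and say nothing about how the connector is implemented internally.

```python
import queue
import threading

# Hypothetical in-memory stand-in for a VM queue; this is not the Mule API.
requests = queue.Queue()

def listener():
    """Plays the role of the flow that consumes the queue and produces a reply."""
    while True:
        item = requests.get()
        if item is None:  # sentinel used only to stop this demo
            break
        payload, reply_to = item
        reply_to.put(payload * 2)

def publish_consume(payload, timeout_seconds=2.0):
    """Send a message, then block until a reply arrives or the timeout expires."""
    reply_to = queue.Queue(maxsize=1)  # private reply channel for this request
    requests.put((payload, reply_to))
    # queue.Empty is raised if no reply arrives within the timeout.
    return reply_to.get(timeout=timeout_seconds)

worker = threading.Thread(target=listener)
worker.start()

result = publish_consume(21)

requests.put(None)  # shut the worker down for this demo
worker.join()
print(result)
```

Unlike the fire-and-forget case, the caller is blocked inside `publish_consume` until the listener answers, which is why a timeout is needed.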

Key Differences:

Feature	VM Publish	VM Publish Consume
Operation Type	One-way communication	Request-response communication
Waits for Reply	No	Yes, waits for a defined timeout period
Use Case	Asynchronous data exchange	Synchronous data exchange with response

Choosing the Right Approach:

The selection between VM Publish and VM Publish Consume depends on your communication requirements:

Use VM Publish when you want to send data asynchronously without waiting for a response. This is suitable for decoupling flows and fire-and-forget scenarios.
Use VM Publish Consume when a flow needs to send data and retrieve a response from the receiving flow. This is ideal for request-response interactions within your application.

Additional Considerations:

Error Handling: Consider proper error handling mechanisms for both approaches. In VM Publish, you might need to handle potential failures in the receiving flow that prevent message processing. For VM Publish Consume, handle timeouts or unexpected responses from the receiving flow.
Message Correlation: Implement message correlation techniques (e.g., using correlation IDs) if you require matching responses to specific requests sent using VM Publish Consume.
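
One way to picture timeout handling and correlation IDs together is the Python sketch below. It is illustrative only: the shared `replies` table, the `request` helper, and the "slow" payload that simulates an unresponsive receiver are all hypothetical, and none of this reflects the connector's actual API.

```python
import queue
import threading
import uuid

requests = queue.Queue()
replies = {}  # correlation ID -> reply payload
replies_ready = threading.Condition()

def listener():
    """Consumes requests and files each reply under its correlation ID."""
    while True:
        item = requests.get()
        if item is None:  # sentinel used only to stop this demo
            break
        correlation_id, payload = item
        if payload == "slow":
            continue  # simulate a receiver that never answers
        with replies_ready:
            replies[correlation_id] = f"ack:{payload}"
            replies_ready.notify_all()

def request(payload, timeout_seconds=2.0):
    """Return the reply matching this request, or None if the wait times out."""
    correlation_id = str(uuid.uuid4())
    requests.put((correlation_id, payload))
    with replies_ready:
        arrived = replies_ready.wait_for(
            lambda: correlation_id in replies, timeout=timeout_seconds
        )
        if not arrived:
            return None  # the caller decides how to handle the timeout
        return replies.pop(correlation_id)

worker = threading.Thread(target=listener)
worker.start()

ok = request("hello")
timed_out = request("slow", timeout_seconds=0.1)

requests.put(None)  # shut the worker down for this demo
worker.join()
print(ok, timed_out)
```

Each caller only ever picks up the reply filed under its own ID, and a missing reply surfaces as an explicit `None` rather than an indefinite wait.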

By understanding the distinctions between VM Publish and VM Publish Consume, you can effectively design asynchronous and synchronous communication patterns within your Mule 4 applications using the VM connector.