Azure Data Factory (ADF)

Introduction

What is Azure Data Factory (ADF)?



Azure Data Factory (ADF) is a serverless, fully managed cloud ETL service for ingesting, preparing, and transforming all of your data at scale. It provides scale-out data integration and transformation, with a code-free user interface for straightforward authoring and single-pane-of-glass monitoring and management.
Existing SSIS packages can also be lifted and shifted to Azure and run in ADF with full compatibility. Because the Azure-SSIS Integration Runtime is a fully managed service, you don't have to worry about infrastructure maintenance.


Code-Free ETL as a Service
  • Ingest
  • Control Flow
  • Data Flow
  • Schedule
  • Monitor
        Ingest
    • Multi-cloud & on-premises hybrid copy data
    • 90+ native connectors
        Control Flow
    • Design code-free data pipelines
    • Generate pipelines via SDK
    • Utilize workflow constructs: loops, branches, conditional execution, variables, parameters
        Data Flow
    • Code-free data transformations that execute in Spark
    • Scale-out with Azure Integration Runtimes
    • Generate data flows via SDK
    • Designers for data engineers and data analysts
        Schedule
    • Build and maintain operational schedules for your data pipelines
    • Wall clock, event-based, tumbling windows, chained
        Monitor
    • View active executions and pipeline history
    • Detail activity & data flow executions
    • Establish alerts & notifications
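
Of the schedule types listed above, tumbling windows are probably the least self-explanatory. The sketch below illustrates the general idea only: fixed-size, contiguous, non-overlapping time intervals. It is not how ADF implements its trigger, and the function name tumbling_windows is made up for this example.

```python
from datetime import datetime, timedelta, timezone


def tumbling_windows(start, end, size):
    """Yield (window_start, window_end) pairs covering [start, end).

    Windows are fixed-size, contiguous, and non-overlapping. Only windows
    that fit completely before `end` are produced.
    """
    if size <= timedelta(0):
        raise ValueError("window size must be positive")
    window_start = start
    while window_start + size <= end:
        yield window_start, window_start + size
        window_start += size


if __name__ == "__main__":
    # Hypothetical schedule: hourly windows over a three-hour range.
    first = datetime(2024, 1, 1, tzinfo=timezone.utc)
    last = first + timedelta(hours=3)
    for ws, we in tumbling_windows(first, last, timedelta(hours=1)):
        print(ws.isoformat(), "->", we.isoformat())
```

Each window ends exactly where the next one begins, so every slice of time is processed once. That is the property that makes this style of schedule a good fit for periodic batch loads.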
Accelerate Data Transformation with code-free data flows
Azure Data Factory provides a data integration and transformation layer that can be used across your digital transformation projects.
  • Enable citizen integrators and data engineers to drive business-led and IT-led analytics and business intelligence.
  • Prepare data, build ETL and ELT processes, and orchestrate and monitor pipelines without writing any code.
  • Intelligent intent-driven mapping automates copy tasks, allowing you to transform faster.



Re-host & extend SSIS in a few clicks
Organizations wishing to modernize SSIS can benefit from Azure Data Factory.
  • Realize up to 88% cost savings with the Azure Hybrid Benefit.
  • Take advantage of the only fully compatible service for moving all of your SSIS packages to the cloud.
  • Use the deployment wizard to simplify migration.
  • Combine the Data Factory cloud data pipelines with your strategy for hybrid big data and warehousing projects.
Ingest all your data with built-in connectors


Ingesting data from diverse & multiple sources can be expensive and time-consuming, and can require multiple solutions. Azure Data Factory offers a single, pay-as-you-go service. You can:
  • Choose from more than 90 built-in connectors to acquire data from Big Data sources like Amazon Redshift, Google BigQuery, HDFS; enterprise data warehouses like Oracle Exadata, Teradata; SaaS apps like Salesforce, Marketo & ServiceNow & all Azure Data Services.
  • Use the full capacity of the underlying network bandwidth, with up to 5 GB/s throughput.
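
To put the 5 GB/s figure in perspective, here is a back-of-the-envelope sketch. It assumes the peak rate is sustained for the whole transfer, which real copies rarely achieve, so treat the result as a lower bound. The function name is made up for this example.

```python
def min_transfer_seconds(size_gb, throughput_gb_per_s=5.0):
    """Best-case transfer time, assuming the peak rate is sustained throughout."""
    if throughput_gb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return size_gb / throughput_gb_per_s


if __name__ == "__main__":
    # 1 TB taken as 1000 GB: 1000 / 5 = 200 seconds at the quoted peak.
    print(min_transfer_seconds(1000))
```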

How does Azure Data Factory (ADF) work?
Azure Data Factory (ADF) is made up of a set of interconnected systems that together give data engineers a complete end-to-end platform.
Its pipelines typically work through four stages: connect & collect, transform & enrich, CI/CD & publish, and monitor.

Connect & Collect
Enterprises have data of various types (structured, unstructured, and semi-structured) located in disparate sources on-premises and in the cloud, all arriving at different intervals and speeds.
The first stage in creating an information production system is to gather all of the necessary data and processing sources.
The next step is to move data as needed to a centralized location for subsequent processing.

Without Azure Data Factory, the enterprise must build custom data movement components or write custom services to integrate these data sources & processing. Such systems are expensive & hard to integrate & maintain. They also often lack the enterprise-grade monitoring, alerting, & controls that a fully managed service can offer.

With Azure Data Factory, you can use Copy Activity in a pipeline to move data from on-premises and cloud source data stores to a centralized data store in the cloud for further analysis.
You can, for example, collect data in Azure Data Lake Storage and then transform it using an Azure Data Lake Analytics compute service. You can also collect data in Azure Blob storage & transform it later by using an Azure HDInsight Hadoop cluster.
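
Behind the code-free designer, a pipeline is stored as a JSON definition. As a rough sketch of what a single Copy Activity pipeline looks like in that form, the snippet below builds one as a plain Python dictionary. The pipeline, activity, and dataset names are placeholders invented for this example, and the BlobSource and BlobSink types assume a Blob-to-Blob copy; other stores use different source and sink types.

```python
import json


def copy_pipeline(name, source_dataset, sink_dataset):
    """Build a pipeline definition containing one Copy activity.

    The dataset names must refer to datasets that already exist in the
    factory; the ones used below are hypothetical.
    """
    return {
        "name": name,
        "properties": {
            "activities": [
                {
                    "name": "CopyToCentralStore",  # hypothetical activity name
                    "type": "Copy",
                    "inputs": [
                        {"referenceName": source_dataset, "type": "DatasetReference"}
                    ],
                    "outputs": [
                        {"referenceName": sink_dataset, "type": "DatasetReference"}
                    ],
                    # Assumption: both ends are Azure Blob storage.
                    "typeProperties": {
                        "source": {"type": "BlobSource"},
                        "sink": {"type": "BlobSink"},
                    },
                }
            ]
        },
    }


if __name__ == "__main__":
    pipeline = copy_pipeline("IngestPipeline", "SourceBlobDataset", "CentralBlobDataset")
    print(json.dumps(pipeline, indent=2))
```

Building the definition in code like this is one way to generate many similar pipelines, which is what the "generate pipelines via SDK" feature is aimed at.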

Transform & enrich
Once the data is in a centralized data store in the cloud, process or transform it using ADF mapping data flows. Data flows let data engineers build and maintain data transformation graphs that run on Spark without having to understand Spark clusters or Spark programming.
If you prefer to code transformations by hand, ADF supports external activities that execute your transformations on compute services such as HDInsight Hadoop, Spark, Data Lake Analytics, and Machine Learning.
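
In a pipeline definition, "collect, then transform" is expressed by making the transformation activity depend on the copy activity. The sketch below shows that chaining with plain dictionaries. The activity and data flow names are invented for illustration, the Copy step is left as a stub without its inputs and outputs, and after_success is a hypothetical helper, not part of any SDK.

```python
def after_success(activity, upstream_name):
    """Return a copy of `activity` that runs only after `upstream_name` succeeds."""
    chained = dict(activity)
    chained["dependsOn"] = [
        {"activity": upstream_name, "dependencyConditions": ["Succeeded"]}
    ]
    return chained


# Stub only: a real Copy activity also needs inputs, outputs, and typeProperties.
copy_step = {"name": "CollectRawData", "type": "Copy"}

# Mapping data flow step; "CleanSalesData" is a hypothetical data flow name.
transform_step = {
    "name": "TransformWithDataFlow",
    "type": "ExecuteDataFlow",
    "typeProperties": {
        "dataflow": {"referenceName": "CleanSalesData", "type": "DataFlowReference"}
    },
}

activities = [copy_step, after_success(transform_step, copy_step["name"])]
```

The helper returns a new dictionary and leaves the original activity untouched, so the same step definition can be reused in different chains.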

CI/CD & publish
Azure Data Factory offers full support for CI/CD of your data pipelines using Azure DevOps & GitHub. This allows you to incrementally develop & deliver your ETL processes before publishing the finished product. After the raw data has been refined into a business-ready consumable form, load the data into Azure Synapse Analytics (formerly Azure SQL Data Warehouse), Azure SQL Database, Azure Cosmos DB, or whichever analytics engine your business users can point to from their business intelligence tools.
  • The three primary continuous practices are:
    • Continuous Integration
    • Continuous Delivery 
    • Continuous Deployment
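
In practice, promoting a factory from one environment to the next usually means deploying the same exported ARM template with a different parameters file. The sketch below generates such parameter files, one per environment. The environment names and factory names are invented for this example, and it assumes the template exposes a factoryName parameter; check your own exported template for the parameters it actually defines.

```python
import json

ARM_PARAMETERS_SCHEMA = (
    "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#"
)


def arm_parameters(values):
    """Wrap plain key/value pairs in the ARM deployment-parameters file layout."""
    return {
        "$schema": ARM_PARAMETERS_SCHEMA,
        "contentVersion": "1.0.0.0",
        "parameters": {key: {"value": value} for key, value in values.items()},
    }


# Hypothetical environments and factory names.
environments = {
    "test": {"factoryName": "adf-demo-test"},
    "prod": {"factoryName": "adf-demo-prod"},
}

if __name__ == "__main__":
    for env, values in environments.items():
        print(f"--- {env} ---")
        print(json.dumps(arm_parameters(values), indent=2))
```

Keeping the template fixed and varying only the parameters is what lets the same pipeline definitions move unchanged from test to production.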

Monitor
Azure Data Factory has built-in support for pipeline monitoring via Azure Monitor, API, PowerShell, Azure Monitor logs, & health panels on the Azure portal.
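
Whichever channel you pull run history from, the usual first step is to count runs by status and pick out the failures. The sketch below does that over a list of run records. The records are made-up sample data, not real output, and the field names are assumed to mirror those of pipeline run records (runId, pipelineName, status, durationInMs).

```python
from collections import Counter


def summarize_runs(runs):
    """Count pipeline runs by status."""
    return Counter(run["status"] for run in runs)


def failed_runs(runs):
    """Return only the runs that ended in failure."""
    return [run for run in runs if run["status"] == "Failed"]


# Hypothetical sample data for illustration only.
sample_runs = [
    {"runId": "run-001", "pipelineName": "IngestPipeline", "status": "Succeeded", "durationInMs": 42000},
    {"runId": "run-002", "pipelineName": "IngestPipeline", "status": "Failed", "durationInMs": 1500},
    {"runId": "run-003", "pipelineName": "TransformPipeline", "status": "InProgress", "durationInMs": None},
]

if __name__ == "__main__":
    print(dict(summarize_runs(sample_runs)))
    for run in failed_runs(sample_runs):
        print("Needs attention:", run["pipelineName"], run["runId"])
```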


