Data Factory

Official Documentation

Service Description

Data processing in today's companies involves heterogeneous data stores (SQL, NoSQL, unstructured data, and so on) and heterogeneous processing components (databases, big data processors, and so on). Data often follows a complex path: from generation or receipt, through various processing components, to storage or distribution to various recipients. With Data Factory, on-premises data, such as data from SQL Server, can be processed together with cloud data from Azure SQL Database, Azure Blob storage, and Azure Table storage. These data flows can be created, run, and monitored through simple, highly available data pipelines. Data sources and data recipients can be defined, and the movement of data through the company can be traced and monitored from a central location.
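To make the idea of a pipeline with a defined source and recipient more concrete, here is a minimal sketch in Python that assembles a pipeline definition as a dictionary and prints it as JSON. The overall shape (a named pipeline whose properties hold a list of activities, each with inputs, outputs, and a source and sink type) follows the Data Factory version 1 JSON format as best understood here; treat it as an illustration and check it against the JSON reference listed under Reference below. The function name, the pipeline name, the activity name, and the dataset names are all hypothetical.

```python
import json


def build_copy_pipeline(name, source_dataset, sink_dataset, start, end):
    """Return a dict shaped like a pipeline definition with one Copy activity.

    All names passed in are placeholders chosen by the caller; the datasets
    they refer to would have to be defined separately in the data factory.
    """
    return {
        "name": name,
        "properties": {
            "description": "Copy rows from an on-premises SQL Server table to Blob storage",
            "activities": [
                {
                    "name": "CopySqlServerToBlob",  # hypothetical activity name
                    "type": "Copy",
                    "inputs": [{"name": source_dataset}],
                    "outputs": [{"name": sink_dataset}],
                    "typeProperties": {
                        "source": {"type": "SqlSource"},
                        "sink": {"type": "BlobSink"},
                    },
                }
            ],
            # Active period of the pipeline, as ISO 8601 timestamps.
            "start": start,
            "end": end,
        },
    }


pipeline = build_copy_pipeline(
    name="SamplePipeline",                  # hypothetical
    source_dataset="OnPremSqlTableDataset",  # hypothetical
    sink_dataset="BlobOutputDataset",        # hypothetical
    start="2017-04-01T00:00:00Z",
    end="2017-04-02T00:00:00Z",
)

print(json.dumps(pipeline, indent=2))
```

The sketch only builds and prints the definition; deploying it would go through one of the tools covered in the tutorials listed below (Azure portal, Visual Studio, PowerShell, Azure Resource Manager template, REST API, or .NET API).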

Getting Started

  1. Azure Data Factory Learning Path
    9/7/2016, Webpage
  2. Introduction to Azure Data Factory
    10/7/2015, Video, 0:59:31
  3. Orchestrating Data and Services with Azure Data Factory
    2/25/2016, MVA
  4. Cortana Intelligence Suite End-to-End
    1/16/2017, MVA
  5. Intro to Data Factory Deep Dive
    10/7/2015, Video, 0:31:34
  6. Building Hybrid Big Data Pipelines with Azure Data Factory
    10/7/2015, Video, 1:11:26

Latest Content

Title | Type
Announcing Data Refresh APIs in the Power BI Service | Blog
Cloud Tech 10 - 3rd April 2017 | Video
Azure Data Factory’s Data Movement is now available in the UK | Blog
Cloud Tech 10 - 27th March 2017 | Video
Azure Data Factory February new features update | Blog
Azure Data Factory and SSIS compared | Blog
Design for Big Data with Microsoft Azure SQL Data Warehouse | Video
Orchestrating Big Data Pipelines with Azure Data Factory | Video
Interview with Lace Lofranco | Video
Data Factory supports multiple web service inputs for Azure ML Batch Execution | Blog
Ingest data from Apache Cassandra, Salesforce and Data Management Gateway 2.0 release | Blog
Advancements in Data Technology | Video

Azure Documentation

1. Overview
     1.1. Introduction to Azure Data Factory
     1.2. Concepts
          1.2.1. Pipelines and activities
          1.2.2. Datasets
          1.2.3. Scheduling and execution
2. Get Started
     2.1. Tutorial: Create a pipeline to copy data
          2.1.1. Copy Wizard
          2.1.2. Azure portal
          2.1.3. Visual Studio
          2.1.4. PowerShell
          2.1.5. Azure Resource Manager template
          2.1.6. REST API
          2.1.7. .NET API
     2.2. Tutorial: Create a pipeline to transform data
          2.2.1. Azure portal
          2.2.2. Visual Studio
          2.2.3. PowerShell
          2.2.4. Azure Resource Manager template
          2.2.5. REST API
     2.3. Tutorial: Move data between on-premises and cloud
     2.4. FAQ
3. How To
     3.1. Move Data
          3.1.1. Copy Activity Overview
          3.1.2. Data Factory Copy Wizard
               3.1.2.1. Load 1 TB in 15 minutes
          3.1.3. Performance and tuning guide
          3.1.4. Security considerations
          3.1.5. Connectors
               3.1.5.1. Amazon Redshift
               3.1.5.2. Amazon S3
               3.1.5.3. Azure Blob Storage
               3.1.5.4. Azure Cosmos DB
               3.1.5.5. Azure Data Lake Store
               3.1.5.6. Azure Search
               3.1.5.7. Azure SQL Database
               3.1.5.8. Azure SQL Data Warehouse
               3.1.5.9. Azure Table Storage
               3.1.5.10. Cassandra
               3.1.5.11. DB2
               3.1.5.12. File system
               3.1.5.13. FTP
               3.1.5.14. HDFS
               3.1.5.15. HTTP
               3.1.5.16. MongoDB
               3.1.5.17. MySQL
               3.1.5.18. OData
               3.1.5.19. ODBC
               3.1.5.20. Oracle
               3.1.5.21. PostgreSQL
               3.1.5.22. Salesforce
               3.1.5.23. SAP Business Warehouse
               3.1.5.24. SAP HANA
               3.1.5.25. SFTP
               3.1.5.26. SQL Server
               3.1.5.27. Sybase
               3.1.5.28. Teradata
               3.1.5.29. Web table
          3.1.6. Data Management Gateway
     3.2. Transform Data
          3.2.1. HDInsight Hive Activity
          3.2.2. HDInsight Pig Activity
          3.2.3. HDInsight MapReduce Activity
          3.2.4. HDInsight Streaming Activity
          3.2.5. HDInsight Spark Activity
          3.2.6. Machine Learning Batch Execution Activity
          3.2.7. Machine Learning Update Resource Activity
          3.2.8. Stored Procedure Activity
          3.2.9. Data Lake Analytics U-SQL Activity
          3.2.10. .NET custom activity
          3.2.11. Invoke R scripts
          3.2.12. Reprocess models in Azure Analysis Services
          3.2.13. Compute Linked Services
     3.3. Develop
          3.3.1. Azure Resource Manager template
          3.3.2. Samples
          3.3.3. Functions and system variables
          3.3.4. Naming rules
          3.3.5. .NET API change log
     3.4. Monitor and Manage
          3.4.1. Monitoring and Management app
          3.4.2. Azure Data Factory pipelines
          3.4.3. Using .NET SDK
          3.4.4. Troubleshoot Data Factory issues
          3.4.5. Troubleshoot issues with using Data Management Gateway
4. Reference
     4.1. PowerShell
     4.2. .NET
     4.3. REST
     4.4. JSON
5. Resources
     5.1. Azure Roadmap
     5.2. Case Studies
     5.3. Learning path
     5.4. MSDN Forum
     5.5. Pricing
     5.6. Release notes for Data Management Gateway
     5.7. Request a feature
     5.8. Service updates
     5.9. Stack Overflow
     5.10. Videos
          5.10.1. Customer Profiling
          5.10.2. Process large-scale datasets using Data Factory and Batch
          5.10.3. Product Recommendations

Web Content

Content | Type
Azure Data Factory Learning Path | Webpage

Tools

Tool | Description