Azure Data Factory - ForEach Activity
The ForEach activity in Azure Data Factory has some important limitations. One of them is that, when working in batch mode, it is better to embed only pipeline activities inside.
This post is based on the official Azure documentation (Asynchronous messaging options, Compare Azure messaging services, Enterprise integration using message broker and events, Azure Well-Architected Framework) and summarizes the differences and use cases of the Azure messaging services: Service Bus, Event Grid, and Event Hubs. The official documentation is very good and comprehensive; this post serves as my personal quick reference.
MS Graph API's endpoint for retrieving users, GET /users, can return all users of the tenant. The default page size is 100 users, and the maximum is 999 users per page. If there are more than 999 users, the response contains an @odata.nextLink field, which is a URL to the next page of users. For a big company with a large number of users (50,000, 100,000, or even more), retrieving all users this way can be time-consuming.
Since MS Graph API provides generous throttling limits, we can parallelize the queries. This post explores sharding as a strategy to retrieve all users in a matter of seconds. The idea is to divide users based on the first character of the userPrincipalName field. For instance, shard 1 would cover users whose userPrincipalName starts with a, shard 2 would handle users starting with b, and so forth.
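A minimal sketch of the sharding idea in Bash, assuming a valid access token is available in GRAPH_TOKEN (acquiring the token is not shown); the function names and output file names are my own:

```shell
# Build the Graph URL for one shard: users whose userPrincipalName
# starts with the given character, with the maximum page size of 999.
build_shard_url() {
  printf "https://graph.microsoft.com/v1.0/users?\$filter=startswith(userPrincipalName,'%s')&\$top=999" "$1"
}

# Fetch all shards in parallel with background jobs; each shard's first
# page lands in users_<char>.json. Following @odata.nextLink per shard
# is still needed for shards larger than 999 users.
fetch_all_shards() {
  local c
  for c in {a..z} {0..9}; do
    curl -s -H "Authorization: Bearer $GRAPH_TOKEN" "$(build_shard_url "$c")" > "users_$c.json" &
  done
  wait
}
```

Because each shard is an independent filtered query, the shards can run concurrently without stepping on each other's pagination.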
During CI/CD, we often have large log output, so it is handy to have some common scripts to format it and make it easier to find the information we need.
Recently, while working with Sonar, I found that they have some scripts for such output formatting.
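As one hedged example of such formatting helpers (the function names are my own), GitHub Actions understands the ::group::/::endgroup:: workflow commands, which render a log section as a collapsible block; a couple of Bash wrappers make them easy to apply around noisy commands:

```shell
# Collapsible log groups (GitHub Actions syntax; Azure Pipelines uses
# ##[group]name / ##[endgroup] instead).
log_group()    { echo "::group::$1"; }
log_endgroup() { echo "::endgroup::"; }

# Run a command inside a collapsed group, preserving its exit code.
run_grouped() {
  log_group "$1"
  shift
  "$@"
  local rc=$?
  log_endgroup
  return $rc
}
```

For example, run_grouped "npm install" npm ci collapses the whole install log under one expandable line in the job output.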
Although Azure already provides a GitHub Action for Azure Web App to deploy static files, we can also do it ourselves with an Azure CLI command.
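A minimal sketch of that CLI route, assuming az login has already been done and the static files have been zipped (e.g. zip -qr site.zip dist/); the resource group and app names in the usage example are placeholders, and the wrapper function name is mine:

```shell
# Deploy a zip of static files to an Azure Web App via the Azure CLI.
deploy_static() {
  local rg=$1 app=$2 zipfile=$3
  az webapp deploy --resource-group "$rg" --name "$app" --src-path "$zipfile" --type zip
}
```

Usage: deploy_static my-rg my-static-app site.zip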
During CI/CD processes, and particularly during CI, we frequently hash dependency files to create cache keys (the key input in the GitHub Actions actions/cache action and the key parameter in the Azure Pipelines Cache@2 task). However, the default hash functions come with certain limitations (see this comment). To address this, we can use the following pure Bash shell command to manually generate the hash value.
Recently, I began a new project that requires migrating some processes from Azure Pipelines to GitHub Actions. One of the tasks involves retrieving secrets from Azure Key Vault.
In Azure Pipelines, we have an official task called AzureKeyVault@2 designed for this purpose. However, its official counterpart in GitHub Actions, Azure/get-keyvault-secrets@v1, has been deprecated. The recommended alternative is the Azure CLI. While the Azure CLI is a suitable option, it operates in a Bash shell without multithreading; if numerous secrets need to be fetched, this can be time-consuming.
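One workaround is to parallelize the Azure CLI calls ourselves with Bash background jobs. A minimal sketch, assuming az login has already run; the vault and secret names in the usage example are placeholders, and the function names are mine:

```shell
# Fetch one secret's value from Key Vault.
fetch_secret() {
  az keyvault secret show --vault-name "$1" --name "$2" --query value -o tsv
}

# Fetch many secrets concurrently; each value is written to secret_<name>.txt.
fetch_secrets_parallel() {
  local vault=$1
  shift
  local s
  for s in "$@"; do
    fetch_secret "$vault" "$s" > "secret_$s.txt" &
  done
  wait  # block until every background fetch has finished
}
```

Usage: fetch_secrets_parallel my-vault db-password api-key storage-conn. Since each az call is network-bound, running them as background jobs cuts the total time to roughly that of the slowest single fetch.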