Azure SFTP: a Bicep example

For many years, I’ve been interested in SFTP. There are two reasons. One, I worked at Ipswitch in the early 2000s and, two, SFTP is the protocol that won’t die. It’s so deeply embedded in inter-enterprise workflows that cloud vendors have had to offer SFTP capabilities in order to be considered fully enterprise-capable platforms.

AWS was the first to offer a cloud SFTP service, and I’ve been a fan of AWS SFTP for some time.

And now it’s Azure’s turn. While it’s still in public preview as of January 2022, Azure’s SFTP solution is unique in several ways:

  • It’s based on Azure storage accounts. There is no SFTP “server” visible to the user or application.
  • It uses the Azure Data Lake hierarchical file system on top of blob storage. This permits the creation and use of a Windows- or Linux-like hierarchical file system. It’s better in some respects than AWS SFTP’s use of S3, which uses object keys to mimic a hierarchical file system.

Both Azure SFTP and AWS SFTP are implemented on top of REST-based file systems, which is a good thing. Azure uses blob storage and AWS uses S3. You get the best of both worlds: old school folders when you want them and access to modern app APIs when you want them.

The real challenge with any SFTP environment, no matter whose it is, is securing something that by necessity must be exposed on the public internet. Here, cloud SFTP resources have real advantages. They are inexpensive, so they don’t have to be shared (in contrast to expensive managed file transfer, or MFT, systems, which cost so much they have to be a global resource). That reduces the chance that one account in a shared MFT environment can exfiltrate data from, or attack, another.

Networking can be used to limit access to the virtual network, further isolating exposure. And, because cloud SFTP uses the provider’s network for external connectivity, that provider has to deal with edge attacks like denial-of-service. In addition, cloud SFTP resources can be isolated to a dedicated cloud account or resource group with its own dedicated virtual network. This is especially attractive for enterprises that have used RFC 1918 private addressing to “extend” their on-premises network to their Azure environment.

Plus, the typical application workflows that MFT systems provide can be easily integrated using cloud functions like Azure Logic Apps and Azure Functions. This can reduce or eliminate technical debt resulting from the use of MFT vendor APIs and scripts. Storage costs can be reduced through automated inventory and storage tiering and migration to lower-cost storage.

So, there are many reasons to deploy not just a single centralized cloud SFTP resource but to encourage architects and developers to create SFTP-enabled storage accounts per use — at a very granular level, be that at a client, application or even at the single data exchange level.

But an Azure storage account is one of the most complicated Azure resources one can deploy. There are many, many settings and choices — and now with Azure SFTP, even more are available. How can you get Azure SFTP right, in terms of security, management and logging?

Easy: just customize the Bicep template below and give it a try in your Azure environment. I’m a relatively new convert to Bicep, and it has proved its value in the development of this template. As you will see in the copious comments in the template, there are some intricacies involved in getting Azure Log Analytics working (for logging) and, since Azure SFTP is in preview, I had to hack at the APIs a bit since they are (at this time) undocumented. But I am glad I didn’t have to deal with those complexities in JSON, where I always forget a comma here and there.

The template also produces a daily inventory, deletes files untouched for 30 days (a number you can adjust in the lifecycle policy) and turns on Microsoft Defender for the storage account.

I hope you find this Azure SFTP Bicep template useful.
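Before deploying, you’ll need an OpenSSH key pair for the `sshPublicKey` parameter. Here’s a sketch of generating one and connecting once the template is deployed; the key file names and storage account name below are placeholders, not values the template produces:

```shell
# Generate an OpenSSH RSA key pair for the SFTP user. The file name is
# hypothetical; pick whatever suits your key management practices.
# The contents of sftp_partner_key.pub go into the template's sshPublicKey
# parameter; the connection partner keeps the private key.
ssh-keygen -t rsa -b 4096 -f sftp_partner_key -N "" -C "sftpuser"

# After deployment, the partner connects with a username of the form
# {storage-account}.{container}.{local-user}. With this template that is
# <account>.sftp.sftpuser; the account name below is a placeholder:
#
#   sftp -i sftp_partner_key sftpstgexample.sftp.sftpuser@sftpstgexample.blob.core.windows.net
```

The three-part username is how Azure routes the connection to the right container and local user, since there is no per-account SFTP hostname beyond the blob endpoint.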

/*

This template creates an Azure SFTP-enabled storage account using the public preview levels of APIs.

The template creates a new storage account, permits Vnet subnets to be specified that can connect to the storage account, allows whitelisting of external IP addresses (effectively disallowing all others), creates a container (named 'sftp') and a single user ('sftpuser') in the newly created storage account, and creates various logging, management and lifecycle policies for the storage account.

Specifically: 
- Vnet subnets that require access to this SFTP storage account must be specified in the 'vnetSubnetResourceIds' parameter with their _full_ Azure resource ID.
- Vnet subnets that require access to this SFTP storage account _must_ have the 'Microsoft.Storage' service endpoint enabled. Private link services are not supported.
- SFTP-enabled storage accounts must use the hierarchical naming capability provided by data lake V2 storage. This is enabled by default.
- Blobs created in the storage account are automatically deleted 30 days after last access to reduce costs.
- A log analytics workspace is created into which transfers are logged. Retention is set to 30 days.
- A daily inventory of the blobs in the sftp container is created and stored at the root of the container ('/'). These may be deleted and are there for the convenience of the user.

Here is an example of the correct way to deploy this template using Azure PowerShell: 

New-AzResourceGroupDeployment -Name deploySftpStorageAccount -ResourceGroupName AzSftpStorageAccountRg -TemplateFile ./main.bicep -Verbose

IMPORTANT: The deployment MUST contain a public key provided by the other end to be placed into parameters sshPublicKey. 

IMPORTANT: Userid and password authentication SHOULD NEVER BE USED. This is why this template deploys ONLY public key/private key authentication.

IMPORTANT: the only parameters and/or properties that should be changed are
- The public key supplied by the connection partner.
- Vnet subnet resource IDs
- Whitelisted IP addresses for connection partners
- The required AppID tag.

IMPORTANT: As of 2022-01-06, the preview capabilities of SFTP-enabled storage accounts _do not_ permit per-folder permissions for users. IOW, if user A and user B are defined for the storage account and user A has permissions on the container, user A will be able to use those permissions on user B's folders in the container. For this reason, the template defines _only_ user 'sftpuser'. If a dev team needs more than one user whose contents needs protection from another, create a separate storage account. Separate per application storage accounts are best practice in any event.

TODO: Update to the GA API level for SFTP-enabled storage accounts when available and define users, if an API exists at that time to do so.

TODO: If/when GA API level for SFTP-enabled storage accounts permits, rework template to allow separate folder and user permissions.

an 2022-01-25

(c) 2022 Air11 Technology LLC -- licensed under the Apache OpenSource 2.0 license, https://opensource.org/licenses/Apache-2.0
	Licensed under the Apache License, Version 2.0 (the "License");
	you may not use this file except in compliance with the License.
	You may obtain a copy of the License at
	http://www.apache.org/licenses/LICENSE-2.0
	
	Unless required by applicable law or agreed to in writing, software
	distributed under the License is distributed on an "AS IS" BASIS,
	WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
	See the License for the specific language governing permissions and
	limitations under the License.
	
	Author's blog: https://yobyot.com

*/

// Begin template

// Begin parameters

@description('The Azure region in which the SFTP storage account will be created.')
param location string = 'eastus2'

@description('Generate a unique name based on the resource group name for the storage account.')
@maxLength(24)
@minLength(3)
param sftpStorageAccountName string = 'sftpstg${uniqueString(resourceGroup().id)}'

@description('This is the name of the container -- the "root" -- folder in the ADLS hierarchy.')
param sftpRootContainterName string = 'sftp'

@description('The name of the SFTP userid which uses public key authentication ONLY.')
param sftpUserName string = 'sftpuser'

@description('Resource IDs of the Vnet subnets authorized to access this storage account. Note: the service endpoint for Microsoft.Storage MUST be enabled before this template is deployed. This should be limited to subnets that are authorized to connect to the SFTP storage blob -- ideally, NSGs are used to limit resource access to/from these subnets from accessing anything else.')
param vnetSubnetResourceIds array = [
  '/subscriptions/12345678-0000-1111-2222-87654321/resourceGroups/yourRg/providers/Microsoft.Network/virtualNetworks/yourVnetName/subnets/yourSubnet1'
  '/subscriptions/12345678-0000-1111-2222-87654321/resourceGroups/yourRg/providers/Microsoft.Network/virtualNetworks/yourVnetName/subnets/yourSubnet2'
]

@description('An array of IPv4 addresses to be whitelisted for access to this SFTP storage account and container. Do not specify RFC 1918 addresses nor CIDRs smaller than /30. This should be a list of the IPs representing machines at the other end of the SFTP transfer.')
param sftpWhiteListedIps array = [
  '173.76.134.153'
]

@description('This parameter must contain the ssh public key to match the private key provided by the other end.')
param sshPublicKey string = 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDFs9cM+ty6smYoXR97RDxsLOukg+lwQJkHVsgeAWk890H2gNg6wNg2VK9HFKZi5WspUxK3IRS4geRyMz+fQDLxlIXwpD+UvB8xq7zTQ5fFGBP7DY7UqPBTPaKOEmbzdab9rcENMHsN+Rd1nmKGUd+X3AYh8OUihHh2SfpN84y1lt6phVMtLTCUeMI4ctPbr6HBTTLl1H0IoNuu1Xaf+2QtMd8OOjaDgS1GD8mtsgx2eepAVu/zP3VH6U8ScPY/cbFalA71buKC9P88LAafqgUqrdjkXArB3A4O/hhNKg/SzUV6PJqs80AKc+xsnSQdXMl1DRJ7SqKbmJN/t6ZBODSPkam8CWZSSBmQSh86dtmKwGCQr0fzIV6DL2qjHq2Pl39Kyx3sCUeqHxx4SncheQk3gquflpcQrBs0YlxUktb/+n7AEq/tDFT2tXFhBBaX5Hhsmn+4cIUXR/nCGqi6wd+P3LXWdqkqyAhgvK5Z9S6nX3lQYf2XD/SqCbuGhp0jWSvAo10LvZFeqqkUXbjoLEYg1wyTqhSyBtpdmCe24qTF/T2K7kb+cDOAqqvgexaaPgF0Ec12AU6MstRo7MmmdcPG6soAAabYTPwSuPG+cIsnxFoPiSdf17I+gkcWh6bqteaLF27skh8XWaR7w3KDv0h9UAdNehMlRUoOp/l/2TCByQ== somerandompublickey'

// End parameters

// Begin variables

var sftpStgAcctAnalyticsWorkspaceName = 'sftpStgLaWksp${uniqueString(resourceGroup().id)}'

// End variables

// Begin resource definitions

// API level 2021-08-01 is undocumented as SFTP is in preview as of 2021-12
resource sftpStorageAccount 'Microsoft.Storage/storageAccounts@2021-08-01' = {
  name: sftpStorageAccountName
  location: location
  kind: 'StorageV2'
  sku: {
    name: 'Standard_LRS' // In preview, only ZRS/LRS are supported for SFTP. No need for premium storage since disk performance is not an issue in SFTP transfers
  }
  properties: {
    networkAcls: {
      defaultAction: 'Deny'
      bypass: 'AzureServices,Logging,Metrics' // Allow Azure services to access the storage account
      // '[for subnet in vnetSubnetResourceIds]' is a Bicep loop to iterate through an array
      virtualNetworkRules: [for subnet in vnetSubnetResourceIds: {
        // The array of resource IDs of Vnet subnets that have a service endpoint for 'Microsoft.Storage' enabled
        id: subnet
        action: 'Allow'
      }]
      // '[for ip in sftpWhiteListedIps]' is a Bicep loop to iterate through an array
      ipRules: [for ip in sftpWhiteListedIps: {
        // An array of public IP addresses that are allowed to send/receive files via SFTP in this storage account
        value: ip
        action: 'Allow'
      }]
    }
    minimumTlsVersion: 'TLS1_2' // Why use anything else?
    supportsHttpsTrafficOnly: true // Another obvious choice
    allowBlobPublicAccess: false // No external user should need REST access to blobs
    allowSharedKeyAccess: true // SFTP only supports public/private key; this is to permit PaaS access
    isHnsEnabled: true // Data Lake V2 hierarchical filesystem (required for SFTP)
    isSftpEnabled: true // Enables SFTP
  }
}

// Use a different (documented) API level for blobServices and the child resource
resource sftpStorageBlob 'Microsoft.Storage/storageAccounts/blobServices@2021-06-01' = {
  parent: sftpStorageAccount
  name: 'default' // This name is fixed; note that it must be used as part of the segment name for the diagnostic settings in the resource below.
  properties: {
    containerDeleteRetentionPolicy: {
      // Permit recovery of mistakenly deleted container for a week
      enabled: true
      days: 7
    }
    deleteRetentionPolicy: {
      // Permit recovery of mistakenly deleted blobs for a week
      enabled: true
      days: 7
    }
    isVersioningEnabled: false // Not needed for SFTP
    lastAccessTimeTrackingPolicy: {
      // Access time tracking is a prerequisite for the lifecycle policy below
      enable: true
      name: 'AccessTimeTracking'
      trackingGranularityInDays: 1 // Set granularity so that we can delete after 30 days
    }
  }
  // Nested resource for the sftp container follows.
  // Because of ARM template naming requirements, it's clearer to use a nested child resource rather than using parent references
  resource sftpStorageContainer 'containers' = {
    name: sftpRootContainterName // The 'root' folder in the DLV2 storage will be 'sftp'
    properties: {}
  }
}

resource sftpLocalUser 'Microsoft.Storage/storageAccounts/localUsers@2019-06-01' = {
  name: sftpUserName // Do not change this parameter, which is set to 'sftpuser'
  parent: sftpStorageAccount
  properties: {
    permissionScopes: [
      {
        permissions: 'rcwdl'
        service: 'blob'
        resourceName: sftpRootContainterName
      }
    ]
    // homeDirectory is set to the 'root' directory, which is named 'sftp'. Note the '/' which is required
    homeDirectory: '${sftpRootContainterName}/' // This user will have complete control over the "root" directory in sftpRootContainterName
    // The other end of the SFTP connection must supply an OpenSSH-generated (or compatible) public key
    sshAuthorizedKeys: [
      {
        description: 'SSH public key to authenticate with a connection originating from either sftpWhiteListedIps or vnetSubnetResourceIds'
        key: sshPublicKey
      }
    ]
    hasSharedKey: false
  }
}

// Enable a lifecycle management policy that deletes blobs that haven't been accessed in more than 30 days
resource sftpStorageBlobManagementPolicy 'Microsoft.Storage/storageAccounts/managementPolicies@2021-06-01' = {
  name: 'default'
  parent: sftpStorageAccount
  dependsOn: [
    sftpStorageBlob // If you don't specify the dependsOn explicitly, Azure may try this before the storage account is created
  ]
  properties: {
    policy: {
      rules: [
        {
          name: 'deleteSftpBlob30DaysAfterLastAccess'
          type: 'Lifecycle'
          definition: {
            actions: {
              baseBlob: {
                delete: {
                  daysAfterLastAccessTimeGreaterThan: 30 // Bye, bye old files (and the charges that go along with them)
                }
              }
            }
            filters: {
              blobTypes: [
                'blockBlob'
              ]
            }
          }
        }
      ]
    }
  }
}

// Create an inventory policy that stores the inventory of the container in the root folder each day. Inventories are created in folders by date and time of inventory.
resource sftpStorageBlobInventoryPolicy 'Microsoft.Storage/storageAccounts/inventoryPolicies@2021-06-01' = {
  name: 'default'
  parent: sftpStorageAccount
  dependsOn: [
    sftpStorageBlob::sftpStorageContainer // If you don't specify the dependsOn explicitly, Azure may try this before the storage account is created
  ]
  properties: {
    policy: {
      enabled: true
      type: 'Inventory'
      rules: [
        {
          destination: sftpRootContainterName
          enabled: true
          name: 'dailyInventorySftpContainer'
          definition: {
            format: 'Csv'
            schedule: 'Daily'
            objectType: 'Blob'
            schemaFields: [
              'Name'
              'Creation-Time'
              'Last-Modified'
              'LastAccessTime'
              'Content-Length'
              'Content-MD5'
              'BlobType'
              'AccessTier'
              'AccessTierChangeTime'
              'AccessTierInferred'
              'Expiry-Time'
              'hdi_isfolder'
              'Owner'
              'Group'
              'Permissions'
              'Acl'
              'Snapshot'
              'Metadata'
            ]
            filters: {
              blobTypes: [
                'blockBlob'
              ]
              includeSnapshots: true
            }
          }
        }
      ]
    }
  }
}

// Enable Microsoft Defender for Storage Accounts. Security alerts are reported to Microsoft Defender for Cloud
resource sftpStorageAccountAtpSettings 'Microsoft.Security/advancedThreatProtectionSettings@2019-01-01' = {
  name: 'current'
  scope: sftpStorageAccount
  properties: {
    isEnabled: true
  }
}

// Create a log analytics workspace into which file transfer operations are logged in table StorageBlobLogs
resource sftpStgAcctAnalyticsWorkspace 'Microsoft.OperationalInsights/workspaces@2021-06-01' = {
  name: sftpStgAcctAnalyticsWorkspaceName
  location: location
  properties: {
    retentionInDays: 30 // Delete logs after 30 days
    sku: {
      name: 'PerGB2018'
    }
  }
}

/* This is a _very_ confusing resource to define.
Issues include that the type is not documented, the API level is not documented and Bicep complains about the type format suggesting a 'scope' property. You must name the resource _exactly_ as shown below: [storage-account-name]/default/Microsoft.Insights/[any string for the diag settings name].
The only documentation I can find for this is the (really lousy) ARM JSON template here: https://docs.microsoft.com/en-us/azure/azure-monitor/essentials/resource-manager-diagnostic-settings#diagnostic-setting-for-azure-storage

The settings that _are_ documented (https://docs.microsoft.com/en-us/azure/templates/microsoft.insights/diagnosticsettings?tabs=bicep) will apply only to the storage account, not to the SFTP blob where the data is stored.

Diag settings are an extension resource, which is why we have to use a provider-style (undocumented and unexplained) type. The log entries for SFTP (StorageWrite, StorageRead, StorageDelete) do not apply to the storage account; they are blobServices log entries. That means that if the diagnostic settings are applied at the storage account level, no blob activities will be logged.

Why bother? Because we _must_ have logs of files sent and received.

DO NOT alter this resource. Deploy it exactly as coded here which means you should not alter any of the LA workspace names and/or the blob and container names.
*/
resource sftpStgAcctLogAnalyticsWorkspaceDiagnosticsSetting 'Microsoft.Storage/storageAccounts/blobServices/providers/diagnosticSettings@2017-05-01-preview' = {
  name: '${sftpStorageAccountName}/default/Microsoft.Insights/diagSettingsBlob'
  properties: {
    workspaceId: sftpStgAcctAnalyticsWorkspace.id
    storageAccountId: sftpStorageAccount.id 
    logs: [
      {
        category: 'StorageRead'
        enabled: true
      }
      {
        category: 'StorageWrite'
        enabled: true
      }
      {
        category: 'StorageDelete'
        enabled: true
      }
    ]
  }
}

// End resource definitions

// Begin outputs

// Output the resource ID of the SFTP container
output sftpStorageContainer string = sftpStorageBlob::sftpStorageContainer.id

// End outputs

// End template
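
Once transfers start flowing, the StorageBlobLogs table in the Log Analytics workspace the template creates can be queried for activity. Here’s a sketch using the Azure CLI (the workspace GUID is a placeholder; substitute the customer ID of the workspace this template deploys, and note the `log-analytics` CLI extension must be installed):

```shell
# List the last day's blob operations -- SFTP reads, writes and deletes
# land in StorageBlobLogs. The workspace GUID below is a placeholder.
az monitor log-analytics query \
  --workspace "00000000-0000-0000-0000-000000000000" \
  --analytics-query "StorageBlobLogs
    | where TimeGenerated > ago(1d)
    | project TimeGenerated, OperationName, AuthenticationType, Uri
    | order by TimeGenerated desc" \
  --output table
```

The same KQL works interactively in the workspace’s Logs blade, which is often the quicker way to spot-check that transfers are being logged.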
