Azure Storage offers many useful features and services that make integration into existing systems easier. New solutions can take advantage of those features at the architectural level. Knowing which common patterns Azure Storage supports natively can speed up development considerably. Here is an overview of them.
Blob Storage / Blob service / Containers
Shared access tokens
A container can delegate selected permissions to a client, for example JavaScript code running in a browser, via a SAS (shared access signature) token. Every token is signed with a secret key. There are three ways to generate a token, and each derives its permissions differently:
- Role-based permissions of a user authenticated by Azure Active Directory.
- Individual permissions delegated by a user of Azure Active Directory.
- Individual permissions based on one of the two storage account keys, which can be regenerated manually.
These tokens can easily be generated dynamically in code, so it makes sense to keep their validity period short. If a token leaks, the attacker has only a limited time to perform a malicious action.
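To make the idea of a signed, short-lived token concrete, here is a minimal sketch using only the Python standard library. It is a toy model: the token layout (`resource|permissions|expiry|signature`) and the function names `sign_token` and `is_valid` are assumptions made for illustration, not the actual SAS format or any Azure SDK API. It only shows why tampering and late use are both rejected.

```python
import base64
import hashlib
import hmac
from datetime import datetime, timedelta, timezone

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def _signature(key: bytes, payload: str) -> str:
    """HMAC-SHA256 over the payload, base64-encoded."""
    digest = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def sign_token(key: bytes, resource: str, permissions: str, expires_at: datetime) -> str:
    """Build a toy token 'resource|permissions|expiry|signature' (not Azure's format).

    expires_at is expected to be a UTC datetime; resource must not contain '|'.
    """
    payload = f"{resource}|{permissions}|{expires_at.strftime(TIME_FORMAT)}"
    return f"{payload}|{_signature(key, payload)}"


def is_valid(key: bytes, token: str, now: datetime) -> bool:
    """Accept the token only if the signature matches and it has not expired."""
    payload, _, signature = token.rpartition("|")
    if not hmac.compare_digest(signature, _signature(key, payload)):
        return False  # payload was modified or signed with another key
    expiry_text = payload.split("|")[2]
    expiry = datetime.strptime(expiry_text, TIME_FORMAT).replace(tzinfo=timezone.utc)
    return now < expiry


if __name__ == "__main__":
    key = b"hypothetical-account-key"
    issued = datetime.now(timezone.utc)
    token = sign_token(key, "photos/cat.png", "r", issued + timedelta(minutes=15))
    print(is_valid(key, token, issued))                       # within the window
    print(is_valid(key, token, issued + timedelta(hours=1)))  # after expiry
```

A leaked token of this kind stops working once its expiry passes, which is the reason for keeping the validity window short.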
Immutable Storage
Many documents must be archived for legal reasons. Immutable Storage is the digital equivalent of a document archive. A container can be locked with an immutable Blob Storage policy. A locked blob cannot be deleted, modified, or moved. There are two kinds of locks: time-based retention and legal hold. Time-based retention holds the lock for a specified period. A legal hold is an assigned tag that locks a container or blob until the tag is removed.
Metadata
A blob can hold additional key-value pairs of data. A typical example is Content-Type, which is served as an HTTP header. Custom metadata is also returned in HTTP headers, with the x-ms-meta- prefix. This is helpful in many scenarios because JavaScript can read this metadata.
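The mapping between custom metadata and HTTP headers can be sketched in a few lines. The helpers below are hypothetical (they are not part of any Azure SDK) and assume only that custom metadata travels in headers named `x-ms-meta-<name>`:

```python
from __future__ import annotations

META_PREFIX = "x-ms-meta-"


def metadata_to_headers(metadata: dict[str, str]) -> dict[str, str]:
    """Turn custom metadata into the header names used on the wire."""
    return {f"{META_PREFIX}{name}": value for name, value in metadata.items()}


def headers_to_metadata(headers: dict[str, str]) -> dict[str, str]:
    """Pick the custom metadata out of a response's headers.

    Header names are matched case-insensitively; other headers are ignored.
    """
    result = {}
    for name, value in headers.items():
        if name.lower().startswith(META_PREFIX):
            result[name[len(META_PREFIX):]] = value
    return result


if __name__ == "__main__":
    response_headers = {
        "Content-Type": "image/png",
        "x-ms-meta-author": "alice",
        "x-ms-meta-project": "demo",
    }
    print(headers_to_metadata(response_headers))
```

A browser client would do the same filtering on the headers of a blob response to recover the custom values.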
Index tags
Blob index tags provide a built-in capability to find blobs by custom attributes. A blob's tags can be set during or after upload. Each blob can have up to 10 index tags. Additional pricing is based on the monthly average number of index tags in the storage account.
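As an illustration, the sketch below checks the 10-tag limit and builds a filter expression of the form `"key" = 'value' AND ...`, which is the general shape of expressions used to find blobs by tags. The helper names are hypothetical, and the sketch assumes tag values contain no single quotes.

```python
from __future__ import annotations

MAX_TAGS_PER_BLOB = 10  # limit stated in the text above


def validate_tags(tags: dict[str, str]) -> None:
    """Raise ValueError if a blob would carry more tags than allowed."""
    if len(tags) > MAX_TAGS_PER_BLOB:
        raise ValueError(f"{len(tags)} tags given, at most {MAX_TAGS_PER_BLOB} allowed")


def build_tag_filter(tags: dict[str, str]) -> str:
    """Combine tag conditions with AND; every condition must match."""
    return " AND ".join(f"\"{key}\" = '{value}'" for key, value in tags.items())


if __name__ == "__main__":
    wanted = {"project": "alpha", "status": "done"}
    validate_tags(wanted)
    print(build_tag_filter(wanted))
    # prints: "project" = 'alpha' AND "status" = 'done'
```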
Hierarchical namespace
Every general-purpose v2 storage account can be upgraded to Data Lake Storage Gen2. This upgrade lets us take advantage of the hierarchical namespace. More specifically, it brings:
- Efficient queries of subfolders.
- Much faster renaming or moving of blobs.
- Atomic operations with the ABFS driver over the DFS endpoint.
- Granular POSIX-compliant security.
Soft delete
Containers and blobs don't have to be permanently deleted right away. You can set a retention period that delays the actual deletion. During this period the deleted items are hidden and can be restored. When the period ends, they are permanently deleted automatically.
Access tier
To save costs, data can be distributed among storage accounts whose access tier best fits the nature of the data. Azure currently offers three tiers:
- Hot tier – highest storage cost, lowest access cost.
- Cool tier – about 30 % cheaper to store than the Hot tier, but read and write operations are more expensive. If you delete a blob before it is 30 days old, you pay an early deletion fee.
- Archive tier – about 95 % cheaper to store than the Hot tier, but the delay between a data request and delivery is measured in hours. An early deletion fee applies if you delete a blob before it is 180 days old. Retrieving an archived blob can take up to 15 hours. If you pay a small extra fee for a high-priority operation, it can take less than 1 hour.
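The trade-offs above can be turned into a small calculation. The sketch below uses only the approximate figures from the list (30 % and 95 % cheaper storage, 30-day and 180-day minimums); the function names are hypothetical, the Hot tier minimum of 0 days is an assumption, and real prices vary by region and change over time.

```python
# Approximate storage price relative to the Hot tier, taken from the list above.
RELATIVE_STORAGE_PRICE = {"hot": 1.00, "cool": 0.70, "archive": 0.05}

# Minimum age in days before a blob can be deleted without an early deletion fee.
MINIMUM_AGE_DAYS = {"hot": 0, "cool": 30, "archive": 180}


def relative_storage_cost(hot_cost: float, tier: str) -> float:
    """Estimate the storage cost in a tier from what the same data costs in Hot."""
    return hot_cost * RELATIVE_STORAGE_PRICE[tier.lower()]


def early_deletion_days(tier: str, age_days: int) -> int:
    """Days still missing to the tier's minimum age (0 means no fee applies)."""
    return max(0, MINIMUM_AGE_DAYS[tier.lower()] - age_days)


if __name__ == "__main__":
    print(relative_storage_cost(100.0, "archive"))  # rough cost if Hot costs 100
    print(early_deletion_days("cool", 10))          # 20 days short of the minimum
    print(early_deletion_days("archive", 200))      # 0, old enough to delete freely
```

The savings on storage only pay off when the data is rarely read and stays in the tier longer than its minimum age.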
Blobfuse
Blobfuse lets you access block blob data in your storage account through the Linux file system. It is a virtual file system driver for the Ubuntu, Debian, SUSE, CentOS, Oracle Linux, and RHEL distributions.
Inventory
Inventory reports give you an overview of all the data in a storage account. Reports are created periodically, either daily or weekly. They are in CSV format and are stored automatically in a container you specify.
Snapshots
A snapshot is a read-only copy of a blob taken at a certain point in time. Unlike versions, snapshots are created manually. Snapshots of blobs in the Archive tier are not supported.
Versions
Azure Storage can automatically save the previous version every time a blob is modified or deleted. Previous versions can be listed via the SDK or the Azure Portal. Older versions can be stored in a different access tier than the current version.
Task: Delete old blobs
This feature, currently in preview, can simplify many cloud solutions and save many lines of code. It deletes all blobs in a specific container that are older than a given period.
Lifecycle management
Blobs that haven't been modified for a specific period can be automatically deleted or moved to cool or archive storage. A rule applies to the whole storage account or to a specific subset (excluding append blobs), selected by a blob name prefix or by blob index tags.
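A lifecycle rule is expressed as a JSON policy document. The sketch below builds one such document with the standard library; the helper `build_lifecycle_policy`, the rule name, and the prefix are hypothetical, and the field names follow the policy schema as I understand it, so check them against the current documentation before applying a policy.

```python
import json


def build_lifecycle_policy(rule_name: str, prefix: str,
                           cool_after_days: int,
                           archive_after_days: int,
                           delete_after_days: int) -> dict:
    """Build a policy that tiers block blobs down and finally deletes them.

    All day counts are measured from the blob's last modification.
    """
    if not cool_after_days < archive_after_days < delete_after_days:
        raise ValueError("expected cool < archive < delete")
    return {
        "rules": [
            {
                "enabled": True,
                "name": rule_name,
                "type": "Lifecycle",
                "definition": {
                    # Only block blobs whose name starts with the prefix.
                    "filters": {"blobTypes": ["blockBlob"], "prefixMatch": [prefix]},
                    "actions": {
                        "baseBlob": {
                            "tierToCool": {"daysAfterModificationGreaterThan": cool_after_days},
                            "tierToArchive": {"daysAfterModificationGreaterThan": archive_after_days},
                            "delete": {"daysAfterModificationGreaterThan": delete_after_days},
                        }
                    },
                },
            }
        ]
    }


if __name__ == "__main__":
    policy = build_lifecycle_policy("age-out-logs", "logs/app", 30, 90, 365)
    print(json.dumps(policy, indent=2))
```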
Table Storage / Table service
Cosmos DB Table API
Azure Cosmos DB is accessible in the same way as traditional Table Storage (with the newer Azure Table SDK). An entity in Azure Storage can be up to 1 MB in size. An entity in Azure Cosmos DB can be up to 2 MB in size.
Queue Storage
Infinite TTL interval
The maximum message lifetime used to be 7 days. It is now possible to opt in to messages that never expire.
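In the Queue service, a time-to-live of -1 marks a message that never expires, while 7 days remains the default. The following sketch models that rule with a hypothetical helper, `expiry_time`; it does not call the queue service itself.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

NEVER_EXPIRES = -1                      # special TTL value: message lives forever
DEFAULT_TTL_SECONDS = 7 * 24 * 60 * 60  # 7 days


def expiry_time(inserted_at: datetime,
                ttl_seconds: int = DEFAULT_TTL_SECONDS) -> Optional[datetime]:
    """Return when a message expires, or None if it never does."""
    if ttl_seconds == NEVER_EXPIRES:
        return None
    if ttl_seconds <= 0:
        raise ValueError("TTL must be positive, or -1 for no expiry")
    return inserted_at + timedelta(seconds=ttl_seconds)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(expiry_time(now))                 # seven days from now
    print(expiry_time(now, NEVER_EXPIRES))  # None
```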
File Shares / File Service / Azure Files
Large file shares
The maximum size of a file share is 5 TB. If you enable the Large file shares option, this limit grows to 100 TB. However, this action is irreversible. A side effect is that such a storage account cannot be geo-redundant, so it is limited to a single region.
Soft delete
If a file share is mapped as a network drive via the SMB protocol, deleted files can be restored via the Azure Portal or a PowerShell script. The maximum retention period is 365 days. Soft delete for NFS or SFTP is supported by Azure Data Lake Storage.
Premium performance
Basic storage accounts are physically hosted on HDDs. Premium accounts are hosted on SSDs, which provide much higher IOPS and much lower latency. A premium storage account can host a premium tier file share. It has lower transaction costs than the standard tier. IOPS and throughput are based on the provisioned size. On the other hand, a premium file share does not support any form of geo-redundancy.
Storage access policies
If you authorize access to Storage, Table or Queue via Azure Active Directory, you can assign certain roles to security principals (users, groups, or application services). A role permits or denies specific actions:
- Storage – read, add, create, write, delete, list
- Table – read, add, update, delete
- Queue – read, add, update, process
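The action names in this list line up with the single-letter permission codes used in SAS tokens and stored access policies. The sketch below maps names to codes; the helper `permission_string` is hypothetical, and it assumes the "Storage" row refers to blobs and containers.

```python
from __future__ import annotations

# Action name -> permission letter, listed in the order the letters are written.
PERMISSION_CODES = {
    "blob": {"read": "r", "add": "a", "create": "c", "write": "w", "delete": "d", "list": "l"},
    "table": {"read": "r", "add": "a", "update": "u", "delete": "d"},
    "queue": {"read": "r", "add": "a", "update": "u", "process": "p"},
}


def permission_string(service: str, actions: set[str]) -> str:
    """Build a permission string such as 'rl' from a set of action names."""
    codes = PERMISSION_CODES[service]
    unknown = set(actions) - set(codes)
    if unknown:
        raise ValueError(f"unsupported actions for {service}: {sorted(unknown)}")
    # Iterate over the table, not the input, so the letter order is stable.
    return "".join(code for action, code in codes.items() if action in actions)


if __name__ == "__main__":
    print(permission_string("blob", {"list", "read"}))      # rl
    print(permission_string("queue", {"process", "read"}))  # rp
```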