Architecting A Medium-Sized IT Department in Azure

IT is an essential aspect of any enterprise today. The IT department is responsible for the development and operations of software applications supporting business processes. Most enterprises used to build private, on-premises data centers for hosting these applications. In contrast, many modern enterprises build their entire IT infrastructure on the public cloud. In this post, we discuss how to architect a medium-sized IT department on Azure.

The infrastructure of the IT department we are going to build consists of five bare-metal servers, three virtual machines, and three databases.

Architectural Building Blocks

Microsoft Azure has a complete range of services for building our infrastructure. Let’s see how we are going to use them to satisfy our requirements.

Bare-Metal Servers

While virtualization brings in agility, we still have some enterprise applications that are not designed for VMs. Therefore, bare-metal servers are still a mandatory requirement in our data centers.

Azure offers a wide array of bare-metal server types with NFS, iSCSI, or Fibre Channel storage options. This storage is backed by SSD and NVMe disks.

Although most enterprise applications are not bandwidth-intensive, bare-metal servers on Azure have 40/100G NICs to accommodate high-throughput use cases. A bare-metal server can be connected to an Azure VNet similar to a VM, so workloads on bare-metal can securely communicate with other Azure resources.

We can install either Windows or Linux OS on Azure bare-metal servers by sourcing the OS licenses ourselves. We also get root access to the bare-metal servers so that we can have complete control over them.

VMs

Virtual machines are a key building block in IT infrastructure. They are much more cost-effective than bare-metal servers, so we intend to use VMs for all general-purpose applications in our IT department.

Azure offers a range of VMs from a single vCPU up to 416 vCPUs. Azure also offers VM flavors with a high CPU-to-memory ratio and other flavors with a high memory-to-CPU ratio. Therefore, we can select the most efficient VM flavor that suits our needs. We can also provision VMs with GPUs for specialized applications such as video analytics.
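As a rough illustration of how a flavor could be chosen, the sketch below picks the cheapest entry from a catalog that satisfies a vCPU and memory requirement. The catalog entries, their names, and the cost figures are placeholders invented for this example; they are not real Azure VM sizes or prices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmSize:
    name: str
    vcpus: int
    memory_gib: int
    hourly_cost: float  # made-up relative cost, not a real price


# Hypothetical catalog; the names and numbers are placeholders, not Azure SKUs.
CATALOG = [
    VmSize("general-4", 4, 16, 0.20),
    VmSize("compute-8", 8, 16, 0.34),
    VmSize("memory-4", 4, 32, 0.26),
    VmSize("memory-8", 8, 64, 0.50),
]


def cheapest_fit(catalog, vcpus, memory_gib):
    """Return the cheapest size meeting both requirements, or None."""
    fits = [s for s in catalog if s.vcpus >= vcpus and s.memory_gib >= memory_gib]
    return min(fits, key=lambda s: s.hourly_cost, default=None)


# A CPU-heavy workload and a memory-heavy workload land on different flavors.
print(cheapest_fit(CATALOG, vcpus=8, memory_gib=16).name)
print(cheapest_fit(CATALOG, vcpus=2, memory_gib=24).name)
```

In practice the catalog would come from the sizes and prices published for the chosen region, and the requirements from measuring the application.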

Azure VMs support Windows OS and a range of Linux distributions such as RHEL, CentOS, Ubuntu, and SUSE Linux.

Azure also provides block storage for VMs via Azure managed disks. Internally, these managed disks implement high availability by creating three replicas, so our data remains intact in case of hardware failures.

Networking

Azure has a robust networking solution for building a secure private network for communicating among VMs, bare-metal servers, and services such as databases. Azure Virtual Network, also called the VNet, is the fundamental building block of this networking solution.

A VNet can be considered as one logical private IP network within Azure. A VNet is dedicated to one tenant, and never shared among multiple tenants. One VNet can include multiple IP subnets. Resources such as VMs can be connected to these subnets via virtual network interfaces.
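To make the relationship between a VNet and its subnets concrete, the following sketch uses Python's standard ipaddress module to carve one VNet address space into per-service subnets. The address range 10.1.0.0/16 and the subnet names are assumptions chosen for illustration. The five reserved addresses per subnet reflect Azure's documented behavior of reserving the first four addresses and the last address of every subnet.

```python
import ipaddress

# Hypothetical VNet address space; any private (RFC 1918) range would do.
VNET_CIDR = "10.1.0.0/16"
# Hypothetical subnet names, one per service tier.
SUBNET_NAMES = ["web", "app", "database", "bare-metal"]
# Azure reserves five addresses in every subnet (the first four and the last).
AZURE_RESERVED_PER_SUBNET = 5


def plan_subnets(vnet_cidr, names, new_prefix=24):
    """Assign consecutive /new_prefix subnets of the VNet to the given names."""
    vnet = ipaddress.ip_network(vnet_cidr)
    candidates = vnet.subnets(new_prefix=new_prefix)
    return {name: next(candidates) for name in names}


plan = plan_subnets(VNET_CIDR, SUBNET_NAMES)
for name, subnet in plan.items():
    usable = subnet.num_addresses - AZURE_RESERVED_PER_SUBNET
    print(f"{name:<12}{subnet}  usable addresses: {usable}")
```

A /24 per service leaves plenty of room in a /16 for additional subnets later.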

The VNet also enables us to securely expose the VMs and bare-metal servers to the Internet via Azure Firewall or Azure Application Gateway.

Often, resources in the cloud are deployed in multiple Azure regions to achieve geographic redundancy. Using Azure global VNet peering, we can establish a secure communication channel across all these resources in multiple Azure regions. Global VNet peering runs over a private backbone network owned by Microsoft, so traffic between the peered VNets does not traverse the public Internet.

Databases

Databases are a mandatory component in an IT data center. In an on-premises data center, setting up and operating a database can cost significant time and effort. Managed database services in Azure ease this burden.

Azure offers managed versions of popular open-source relational database engines like MySQL, MariaDB, and PostgreSQL. There is also Azure SQL Database, which is a relational database service designed for the cloud. Compared to open-source databases, Azure SQL Database has better scalability. An Azure SQL Server instance can be scaled up to 8 TB to serve large capacity demands.

For applications that require NoSQL databases, we use Azure Cosmos DB. Cosmos DB can scale to handle high workloads while maintaining consistent single-digit millisecond response times. It also offers APIs compatible with Apache Cassandra and MongoDB, so existing applications can be migrated to Cosmos DB with minimal code-level changes.

File Storage

An enterprise needs to store a wide variety of structured and unstructured data. Azure provides several storage solutions for this purpose.

Azure Files is a managed file share service suitable for storing documents and images. The stored files can be accessed via the SMB protocol. For storing large volumes of unstructured data, Azure Blob storage is more suitable.

Disaster Recovery

Disaster recovery is an important responsibility in IT. We must design our IT infrastructure to recover quickly from failures so that business continuity is maintained. In Azure, we must consider disaster recovery separately for applications and databases.

To understand disaster recovery and redundancy implementation in Azure, we must be familiar with regions and availability zones. An Azure region is a geographical area that consists of multiple data centers connected via a low-latency network. Azure has multiple such regions in different geographical locations all over the globe. An availability zone is a physically separate location within a region, made up of one or more data centers. A region that supports availability zones has at least three of them.

Disaster Recovery For Applications

For applications running on VMs, we can implement redundancy by manually creating a replica of the VM in a separate availability zone or region. Then, the application can be deployed on both VMs to run in an active-standby mode.

Alternatively, we can use Azure Site Recovery to implement geographic redundancy for VMs. Azure Site Recovery replicates our VMs and bare-metal servers to another location, so that when a failure occurs we can switch over to the replicas and keep using the applications.

For applications that we deploy across multiple regions, we can use Azure Front Door as the entry point for our users. Azure Front Door can route traffic to the active and standby applications using a priority mechanism, so that users are automatically switched to the standby if the active setup becomes unavailable.
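The priority mechanism can be pictured with a small model: among the origins that currently pass their health probe, the one with the lowest priority value receives the traffic. This is a simplified sketch of the idea rather than Front Door's implementation or API; the Origin class and the region names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Origin:
    name: str
    priority: int   # lower value is preferred
    healthy: bool   # result of the most recent health probe


def pick_origin(origins):
    """Return the healthy origin with the lowest priority value, or None."""
    healthy = [o for o in origins if o.healthy]
    return min(healthy, key=lambda o: o.priority, default=None)


# Hypothetical origins: the active region is preferred, the standby is the fallback.
normal = [Origin("active-region", 1, True), Origin("standby-region", 2, True)]
outage = [Origin("active-region", 1, False), Origin("standby-region", 2, True)]

print(pick_origin(normal).name)  # active-region
print(pick_origin(outage).name)  # standby-region
```

Once the active region passes its health probes again, the same rule sends traffic back to it without any manual step.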

Disaster Recovery For Databases

The different database services in Azure implement high availability in different ways. Azure Cosmos DB implements high availability by replicating data across Azure regions. Azure SQL Database uses Azure Blob storage, which has built-in high availability features. PostgreSQL in Azure uses Azure storage for storing data. This storage creates three copies of its data within a region to ensure high availability. To further improve the availability of PostgreSQL in Azure, we could use read replicas. Then, in the event of a failure in the primary database, we could switch over to the replica.

Backup

Backup is another important aspect of disaster recovery. A proper backup mechanism should be established in addition to the geo-redundant architecture. It helps us recover from data corruption, accidental deletion, and similar incidents.

Azure Backup is an easy-to-set-up service that can back up VMs and databases. It supports periodic incremental backups, which reduce the storage required for our backups.
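A back-of-the-envelope calculation shows why incremental backups matter. The sketch below compares one initial full backup followed by daily incrementals against taking a full backup every day. The disk size, daily change rate, and retention period are assumed figures for illustration, and the model ignores compression and any service-specific storage behavior.

```python
def backup_storage_gb(full_size_gb, daily_change_percent, retained_days, incremental=True):
    """Estimate storage for one initial full backup plus one backup per retained day."""
    if incremental:
        # Each daily backup stores only the data changed since the previous one.
        per_day_gb = full_size_gb * daily_change_percent / 100
    else:
        # Each daily backup is a complete copy.
        per_day_gb = full_size_gb
    return full_size_gb + per_day_gb * retained_days


# Hypothetical workload: a 500 GB disk, 2% of data changing per day, 30 days retained.
with_incrementals = backup_storage_gb(500, 2, 30, incremental=True)
with_full_copies = backup_storage_gb(500, 2, 30, incremental=False)
print(f"incremental: {with_incrementals:.0f} GB, daily full: {with_full_copies:.0f} GB")
```

Under these assumptions the incremental approach needs 800 GB where daily full copies would need 15,500 GB.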

Network Security

Azure VNet provides us with a logically isolated environment for deploying our resources. By using a VNet, we ensure that internal traffic between applications stays on a secure private network and is not exposed to the public. We allow external traffic to reach only selected resources in our data center.

As the next level of network protection, we use Azure Network Security Groups. A Network Security Group acts as a stateful firewall for our resources. One Network Security Group can include multiple security rules, which allow or deny traffic based on source and destination IP address, port, and protocol. We use Network Security Groups with a default deny rule and allow only specific traffic between our resources.
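The default-deny approach can be illustrated with a toy model of rule evaluation: rules are checked in priority order, where a lower number is checked first, and the first matching rule decides the outcome. The Rule class, the addresses, and the rule set below are hypothetical, and the model is neither the Azure API nor a complete picture, since it leaves out ports, traffic direction, and statefulness.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    priority: int      # lower number is evaluated first
    source: str        # source address range in CIDR notation
    destination: str   # destination address range in CIDR notation
    protocol: str      # "Tcp", "Udp", or "*" for any
    access: str        # "Allow" or "Deny"


def evaluate(rules, src_ip, dst_ip, protocol):
    """Return the access decision of the first matching rule."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    for rule in sorted(rules, key=lambda r: r.priority):
        if (
            src in ipaddress.ip_network(rule.source)
            and dst in ipaddress.ip_network(rule.destination)
            and rule.protocol in ("*", protocol)
        ):
            return rule.access
    return "Deny"  # nothing matched: fall back to deny


# Hypothetical rule set: the app subnet may reach the database subnet over TCP.
RULES = [
    Rule(100, "10.1.1.0/24", "10.1.2.0/24", "Tcp", "Allow"),
    Rule(4096, "0.0.0.0/0", "0.0.0.0/0", "*", "Deny"),
]

print(evaluate(RULES, "10.1.1.10", "10.1.2.20", "Tcp"))  # Allow
print(evaluate(RULES, "10.1.0.10", "10.1.2.20", "Tcp"))  # Deny
```

Keeping the catch-all deny at the lowest precedence means every new flow between services has to be opened deliberately with its own allow rule.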

We also use Azure Firewall, the managed network firewall service from Azure, to protect our virtual resources. It acts as a perimeter firewall that blocks unwanted traffic to our virtual resources. Azure Firewall also supports SNAT, so we can provide Internet connectivity for our VMs via the firewall.

Design Summary

As discussed already, our IT infrastructure in Azure consists of five bare-metal servers, three VMs, and three databases.

We deploy our resources in two Azure regions with the applications working in active-standby mode. We use Azure Front Door so that users can be seamlessly routed to the standby region if an outage occurs in the active region. This ensures disaster recovery with minimal impact on our users.

To securely connect our resources within one region, we use Azure VNet. The connectivity across the two regions is created with Azure Global VNet peering. This enables us to connect the resources across the two regions without any VPNs or NAT.
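VNet peering requires that the peered VNets use non-overlapping address spaces, so it is worth validating the address plan for the two regions before creating anything. The sketch below performs that check with Python's standard ipaddress module. The region names and address ranges are assumptions for illustration.

```python
import ipaddress

# Hypothetical address spaces for the two regional VNets.
REGION_VNETS = {
    "primary-region": "10.1.0.0/16",
    "standby-region": "10.2.0.0/16",
}


def find_overlaps(vnets):
    """Return pairs of VNet names whose address spaces overlap."""
    parsed = {name: ipaddress.ip_network(cidr) for name, cidr in vnets.items()}
    names = sorted(parsed)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if parsed[a].overlaps(parsed[b])
    ]


overlaps = find_overlaps(REGION_VNETS)
print("safe to peer" if not overlaps else f"overlapping VNets: {overlaps}")
```

Giving each region its own /16 out of the same /8 keeps the plan easy to read and leaves room for further regions.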

In each region, we create separate subnets for each of our services. We connect the VMs to these subnets via virtual NICs. The database servers are connected to the VNet subnets via Azure VNet service endpoints. The bare-metal instances are also connected to the same VNet.

We use PostgreSQL as our relational database with read replication across regions. For applications that use NoSQL databases, we use Cosmos DB. We create the Cosmos DB account with replication across both our regions, so we can switch over immediately after a failure with no loss of data.

We deploy an Azure Firewall instance in each region to connect our VMs and bare-metal servers to the Internet via NAT.

Conclusion

Using the Azure services discussed above, we can implement a fully redundant, secure IT data center entirely on Azure without relying on an on-premises data center.
