We are still working with the conceptual design but now looking at application requirements. What does the business want to accomplish with their applications, what requirements do they have for each application, and what dependencies exist that need to be accounted for in our design?
Section 1 – Create a vSphere 6.5 Conceptual Design
Objective 1.2 for the VCAP-DCV Design exam is to gather and analyze application requirements. vBrownBag hosted a 45-minute discussion on Objective 1.2 for the Design exam:
In the first five minutes, Mark Gabryjelski states the most important thing to keep in mind is that the design is not always about what is cool, but meeting business requirements.
During interviews and workshops, we need to define which applications are in scope for the vSphere design. The business may not want to virtualize certain parts of the business, they have a migration plan in place where some apps are going to the cloud, or applications are being replaced by a SaaS.
Application Functional Requirements
Identify the functional requirements the applications support — WHAT do the applications do for the business? Do they run back up software that supports their environment? What processes and tools do they use for collaboration? Are they a hybrid cloud shop and if so, what business connections and integration requirements exist?
HOW should business applications run?
“Of course, they should run [great, fast, responsive, securely, etc.].” Well, those are subjective assessments from a business leader. What they are describing is a user experience or function of the application. We need to drill down and identify specific metrics that will help us identify nonfunctional requirements for their application.
These specific metrics are going to be identified through a few techniques, most of which are similar to what we discussed in Objective 1.1. We will gain insights into the application requirements when we engage in interviews/workshops with the subject matter experts (SMEs) that own, operate, and administer the business’s specific applications. Existing documentation and the amount of detail done in the current state analysis will capture performance and data characteristics that will drive our design decisions.
Metrics gathered will inform our decisions when it comes to designing for Availability, Manageability, Performance, Recoverability, and Security (AMPRS). Keep in mind that I will be separating concepts and features into the design qualities, but they are all interdependent. We need to be sure to keep a holistic view when designing for applications.
In most of the resources I have read through and watched, availability requirements usually boil down to some type of uptime SLA. These could be business-critical applications like a revenue-generating service, internal applications, or application dependencies. The business will have an idea of what availability means to them and each application will be different based on its function. This is also going to be tied to cost. The business justification for running a fault-tolerant application with 99.999% (five-nines) must be rooted in a quantifiable loss to the business. This is usually stated as an SLA that they are obligated to meet for their external or internal customers. This consideration will be unique to each application as well. A business’s eCommerce web server availability requirement is different from its internal knowledgebase server.
|99%||3.65 days||7.2 hours||1.68 hours|
|99.9%||8.76 hours||43.2 minutes||10.1 minutes|
|99.99%||52.6 minutes||4.32 minutes||1.01 minutes|
|99.999%||4.26 minutes||25.9 seconds||6.05 seconds|
|99.9999%||31.5 seconds||2.59 seconds||.605 seconds|
If SMEs do not have SLAs or cannot agree, I like Mark Gabryjelski’s recommendation to make your own SLA. The business will most likely refine but at least we were able to get the conversation started. Having business SLAs and business requirements will help us justify availability design decisions and what type of components may be put into the design.
As we have discussions and workshops to identify the level of availability per application, we should start to think of the tools vSphere provides to support those requirements. vMotion and Storage vMotion allows us to reduce planned downtime and enables transparent host maintenance. vSphere HA leverages multiple ESXi hosts configured as a cluster to provide rapid recovery from outages. vSphere Fault Tolerance provides continuous availability. vCenter High Availability (vCenter HA) protects not only against host and hardware failures but also against vCenter Server application failures. Predictive HA works with DRS to provide early detection and VM evacuation to a healthy host.
Availability considerations can also be impacted by hardware choices and single points of failure. We need to identify component, host, cluster, rack, and datacenter levels of availability to meet uptime SLAs. Current hardware configurations could be constraints or risks to the design.
Manageability requirements come in many forms and can impact our design. The business may need to manage all the virtual infrastructure in one place. vCenter is the core product for managing vSphere and controls datacenter, cluster, and host resources. Integrating with VMware Site Recovery Manager (SRM) centralizes part of their Business Continuity Plan (BCP).
Platform management is becoming more important with hybrid cloud deployments and VMware Validated Designs for SDDCs. If the business is needing unified management to provision, configure, and administer its multi-cloud platforms then the vRealize Suite may become part of the design.
Provisioning, configuration, and automation tools may be in place such as Ansible, Terraform, or SaltStack. How these integrate into vSphere will be part of the design.
Distributed and cloud-native applications bring unique management problems. If a business is currently using something like Kubernetes, the management of the virtual infrastructure, K8s management, and cluster provisioning can get blurred between business roles. Platform and resource consumption by application clusters should be considered. Not part of the 6.5 exam, but vSphere 7 brings Tanzu to solve some of these challenges for Kubernetes.
Like availability, most performance requirements will come from some type of metric. This time it will be linked to your datacenter resources in the form of a non-functional requirement for compute, storage, or networking performance. Customer-facing API latency needs to be less than 500ms, SharePoint needs to support 1000 users, storage will exceed 100,000 IOPS for short periods of time, and top-or-rack switches routinely pass 80+Gbps on uplinks are all performance requirements.
These requirements will be rooted in a business justification either providing a level of service to customers or defined by the performance monitoring and current state analysis we did before. Remember to consider the growth of performance needs in your design. A business may be serving one million web requests today but what will they look like in 5 years?
Some applications may have compute, storage, or network requirements that need special attention such as latency-sensitive applications. Take a look at Deploying Extremely Latency-Sensitive Applications in VMware vSphere.
Cluster design will directly impact the performance of the application. Is the application monolithic and need to scale up or is it distributed and has 1000 small VMs across many hosts? This will inform your cluster scaling strategy.
Business policies can also drive performance requirements. Perhaps management and development environments should not impact production performance. This may mean that we use Network/Storage IO Control and Resource Pools to guarantee resources to production workloads or those management/development workloads reside in different clusters altogether.
Different business units have different needs and depending on the organization you can use logical structures like vApps and Resource Pools to manage resources. We will be digging into the resource management guide in Sections 2 and 3.
The disaster recovery timeline is a critical part of the business as is understanding their business continuity plan. Every business needs a plan to continue operations if their primary site has a disaster. Business goals and application requirements need to be quantified into the main segments in the DR timeline so we can design a proper vSphere environment. This Disaster Recovery 101 post has a great outline of the timeline above and its elements.
Failures happen and we need to plan for them. Based on 2019 revenue data from online store sales, Amazon.com would lose $4,480 of revenue every second their website was unavailable. That does not include 3rd party stores hosted on Amazon or subscription services. Availability and Recoverability are usually tied due to cost constraints. Every business has limited resources and 100% uptime is very expensive. Application availability and recovery should be driven by a business objective and sized appropriately.
RPO, RTO, WRT, and MTD will dictate your backup schedule, location, for which applications, and method of failover and fail-back. Identify third-party backup tools in place.
VMware Site Recovery Manager is the primary VMware product providing automated disaster recovery failover, planned migration and disaster avoidance, and seamless workflow automation with centralized recovery plans. The technical overview of SRM is a great way to get familiar with the technology.
vSphere Replication is another tool included in vSphere Essentials Plus and higher that provides flexible recovery options, ensures consistent application and virtual machine data, and integrates with the VMware product stack.
Security has a huge impact on the conceptual design of a vSphere design. Application security requirements range from business policies on OS configurations, security software installed, and ports and protocols running through subnets. We need to identify application-specific requirements that the business needs to accomplish its goals.
Remembering Confidentiality, Integrity, and Availability can help us engage with SMEs about securing their virtual environments; VM encryption may be a requirement, policies on network segmentation, and workload separation at the host or cluster level.
Backups are part of the recovery plan but also a way to mitigate some security risk in the case of ransomware or other data loss.
The business may follow standardized security frameworks or are responsible for meeting security compliance standards like PCI DSS, HIPAA, or DISA STIGs. All these requirements for applications, data flow, and resource placement will have heavy impacts on the design.
Hybrid and multi-cloud cloud deployments face even more difficult challenges as more data moves out of and between on-premises datacenters and cloud IaaS, PaaS, and SaaS providers.
What infrastructure services do business applications depend on; AD, DNS, DHCP, NTP? These should be part of the current state analysis and listing application dependencies as well as vSphere dependencies will lay a foundation for a successful design. If you introduce new services or components, make sure they are identified.
Some vendors and software require access to the internet for updates and management while some security policies and compliance demand applications are air-gapped. We need to make sure we are identifying those applications and develop a plan to address those requirements.
Clustered VMs and distributed applications are becoming more common as the drive for higher availability continues. Clustered platforms will have internal dependencies that we need to understand.
Some applications may have specific hardware they need to function; hardware tokens, PCI Passthrough devices, direct I/O connections, and GPUs are just a few.
vSphere Upgrades and Migrations
I’ve seen other blog posts reviewing the DCV6 and 6.5 exams stress that you need to know the upgrade paths between sphere versions and component interoperability. I would place this in the conceptual design portion of needing to know what versions of vSphere the business is on and if they need to make upgrades for any of the design quality reasons.
In our logical design, we will look at the specific vSphere version upgrades and component changes like vCenter and the PSC. In line with this, we should know what licensing the business has which will inform us of any constraints on the design and engage with the business about license upgrades for their design.
Once all application requirements are identified we can assess the impact they will have on the design. Many requirements will have second and third-order effects that need to be addressed. Below are some major vendor application virtualization best practices to start identifying:
- Virtualizing Oracle with VMware
- Virtualizing Exchange with VMware
- Virtualizing SharePoint with VMware vSphere
- Virtualizing Microsoft SQL on VMware vSphere
- SAP and VMware Virtualization
If you only have time for one then I recommend this whitepaper: Virtualizing Business-Critical Applications on vSphere
When engaging business leaders, having vSphere ROI and adoption trends discussions may be beneficial to helping them understand the business value of a VMware solution:
Prepare your Conceptual Design
In the next article we will take the business and application requirements from Objectives 1.1 and 1.2; identify risks, constraints, assumptions, and risks; and develop our conceptual design.