====== Roadmap to Virtualization ======

===== General Roadmap =====

  - [[virtualization:profiling|Profile]] the workloads you're planning on virtualizing \\ [[http://spiceworks-faqs.john-refactored.com/doku.php?id=virtualization:profiling]]
  - Make a guess about other workloads you'll be adding to the mix.
  - Apply 3-5 year growth factors to these workloads.
  - Configure hardware to meet the requirements (see the sizing sketch after this list).
  - Shop the hardware among Tier 1 manufacturers. \\ Spiceworks has the RFQ tool for this.
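Steps 3 and 4 amount to simple arithmetic: sum the profiled requirements, compound the growth factor, and add headroom. Below is a minimal sketch of that calculation; the workload figures, the 15% annual growth rate, and the 25% headroom are placeholder assumptions, not recommendations, so substitute the numbers from your own profiling.

<code python>
# Rough sizing sketch: apply a compound annual growth factor to profiled
# workload requirements, then add headroom.  All figures are placeholders.

workloads = {
    "file-server": {"vcpu": 2, "ram_gb": 4,  "disk_gb": 500, "iops": 150},
    "mail-server": {"vcpu": 4, "ram_gb": 16, "disk_gb": 300, "iops": 400},
    "database":    {"vcpu": 4, "ram_gb": 24, "disk_gb": 200, "iops": 600},
}

annual_growth = 0.15   # assumed 15% growth per year -- adjust to your environment
years = 5              # size for the 3-5 year life of the hardware
headroom = 1.25        # ~25% spare capacity so the host isn't full on day one
factor = (1 + annual_growth) ** years

# Sum each resource across the workloads, then scale.
totals = {}
for specs in workloads.values():
    for resource, value in specs.items():
        totals[resource] = totals.get(resource, 0) + value

for resource, today in totals.items():
    print(f"{resource:8s} today: {today:6.0f}   sized: {today * factor * headroom:8.0f}")
</code>

The point of the sketch is that the growth factor compounds, so hardware sized only against today's numbers tends to run out of headroom well before the end of its service life.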
===== Philosophical Guidelines =====

  * Purchase from Tier 1 hardware providers with on-site repair contracts. \\ Read [[http://community.spiceworks.com/topic/post/1440240|erik-ptek's experience with Supermicro support]].
  * Use local storage until it's impossible to do so.
  * Realize that live workload migration between hosts is a higher-cost feature. Don't assume you need it. Cost-justify it.
  * If you need shared storage, investigate VSAs from [[http://www.vmware.com/products/datacenter-virtualization/vsphere/vsphere-storage-appliance/overview.html|VMware]] and [[http://h18006.www1.hp.com/products/storage/software/vsa/index.html|HP]] before purchasing an external storage device. \\ HP has a two-node, 10 TB bundle for $6,000. \\ [[http://h30094.www3.hp.com/product.aspx?mfg_partno=TA688SC]]
  * Only buy external storage devices in a way that eliminates single points of failure. Buying two hosts for failover but only a single NAS/SAN adds configuration and management complexity without eliminating the single point of failure. NAS/SAN needs to be purchased in pairs with automatic failover. Otherwise, what's the point?

===== Recipes =====

==== Entry Level ====

For environments where there are identifiable "key" workloads that need to come back up and "less key" workloads that the environment can do without in the case of a hardware failure.

  * Primary Host \\ Sized to run all the workloads with headroom for future growth.
  * Backup Host \\ Sized to house replicas of all the workloads but only runs the key workloads in non-degraded mode. Possibly able to run all the workloads with performance compromises.
  * Backup / Replication software to periodically create a replica of the workloads from the primary to the backup host.
  * Requires hardware maintenance windows.

==== Full Backup ====

Create an environment where all the workloads can be run on a second host in the case of the first host failing.

  * Primary Host \\ Sized to run all the workloads with headroom for future growth.
  * Backup Host \\ Configured to have at least the same level of resources as the Primary.
  * Backup / Replication software to periodically create a replica of the workloads from the primary to the backup host.
  * Requires hardware maintenance windows, but if problems occur on the primary host during a maintenance window, the backup environment can be promoted to Primary.

==== Local Shared Storage ====

When High Availability or vMotion/XenMotion/live-migration are business requirements.

  * Primary Host \\ Sized to run all the workloads with headroom for future growth.
  * Backup Host \\ Configured to have at least the same level of resources as the Primary.
  * Shared Local Storage \\ VMware VSAN, Starwind iSCSI SAN, HP Lefthand VSA, DRBD, etc. generally have a guest on each host that owns most of the local storage. The storage is maintained in a mirrored state and is then presented back to the hosts as shared storage. \\ Best practice is to have separate "air gap" switching infrastructure.
  * Since the hypervisor sees storage as shared, high availability features (automatic guest reboot on the backup host in case of primary host failure) and live migration are possible.
  * You still need backup software.
  * With live migration features, one generally doesn't need hardware maintenance windows.

==== Shared Storage Appliances ====

When a virtualization environment's host count reaches 4 or more, the scalability of local shared storage becomes more complex, and centralizing shared storage in an appliance becomes more cost-effective.

  * Host count is 4 or more.
  * Multiple host configurations are possible (but too many causes management overhead costs to rise).
  * High Availability and migration features are available; the remaining hosts must be able to absorb the workloads of a failed host (see the capacity-check sketch after this list).
  * To avoid single points of failure, generally two appliances are needed with automatic failover.
  * Requires separate air-gapped switching infrastructure with redundant paths to the appliances.
  * You still need backup software.
  * You need a separate backup target.
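High availability only helps if the surviving hosts can actually run the failed host's guests. Below is a minimal N-1 capacity-check sketch; the four identical hosts and the aggregate demand figures are hypothetical values, not a reference configuration, so plug in your own inventory and profiling numbers.

<code python>
# N-1 capacity check sketch: confirm the cluster can still run every workload
# if any single host fails.  Host sizes and demand below are hypothetical.

hosts = {
    "host1": {"vcpu": 16, "ram_gb": 128},
    "host2": {"vcpu": 16, "ram_gb": 128},
    "host3": {"vcpu": 16, "ram_gb": 128},
    "host4": {"vcpu": 16, "ram_gb": 128},
}

# Total resources consumed by all guests across the cluster today.
demand = {"vcpu": 40, "ram_gb": 320}

def survives_single_failure(hosts, demand):
    """Return True if the remaining hosts can absorb the loss of any one host."""
    for failed in hosts:
        remaining = [spec for name, spec in hosts.items() if name != failed]
        for resource, needed in demand.items():
            available = sum(spec[resource] for spec in remaining)
            if available < needed:
                print(f"Losing {failed}: short on {resource} "
                      f"({available} available vs {needed} needed)")
                return False
    return True

print("Tolerates a single host failure:", survives_single_failure(hosts, demand))
</code>

The same check, sized against the growth-adjusted numbers from the sketch in the General Roadmap section, tells you whether a two-host recipe or a larger cluster is the right fit.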