Primal Philosophy

Foundational thoughts on creating a computing architecture in a medium to large enterprise environment.

Introduction

Back in the glory days of Sun Microsystems, the company was visionary at a time when Ethernet and TCP/IP were emerging as the universal network topology and protocol standards. Sun adopted the marketing slogan "the network is the computer," and wisely so. That is the way I have viewed computing from the time I started architecting networks of computers. It isn't about a standalone machine performing only a specific localized task; rather, it is a cooperative service that ultimately satisfies a human need as it relates to a service and its related data.

Through the years, I have been successful in tailoring an architecture that required only a few administrators to efficiently administer hundreds of computers, desktop and server alike, running on multiple hardware and OS platforms and serving both high-end technical and business end-user communities. I have found three areas that require a standard for administration: (1) OS configuration, (2) separating data off onto devices built for the purpose of managing data (i.e., NAS), along with a taxonomy that supports its usage, and (3) network topology.

OS and Software Management

Ninety-five percent of the OS installation should be normalized into a standard set of configuration files that can be incorporated into a provisioning system such as Red Hat Satellite for Linux. Separating the data off onto dedicated data appliances means backups are performed on fewer machines, and servers do not require kernel tweaks that must satisfy both the application service running on the server and the backup workload. Considering that application software does not rely on a local registry as it does on MS Windows, the application software itself can be delivered to multiple hosts from a network share, thus making any given host more fault tolerant.
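As a minimal sketch of the idea (not any particular provisioning product), the hypothetical script below compares a host's deployed configuration files against a centrally maintained baseline kept on a network share. The baseline path and managed file list are assumptions for illustration; a real deployment would lean on the templating and drift reporting built into a system such as Red Hat Satellite.

```python
#!/usr/bin/env python3
"""Hypothetical sketch: report drift between a host's OS configuration
files and a centrally maintained baseline. Assumption for illustration:
the baseline lives on a network-share mount at BASELINE_DIR and mirrors
the paths it governs, e.g. /srv/baseline/etc/ntp.conf <-> /etc/ntp.conf."""

import hashlib
from pathlib import Path

BASELINE_DIR = Path("/srv/baseline")          # assumed network-share mount
MANAGED_FILES = ["/etc/ntp.conf", "/etc/resolv.conf", "/etc/sysctl.conf"]

def sha256_of(path: Path) -> str:
    """Return the SHA-256 digest of a file, or '' if it does not exist."""
    if not path.exists():
        return ""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def report_drift() -> list[str]:
    """List managed files whose content differs from the baseline copy."""
    drifted = []
    for name in MANAGED_FILES:
        host_copy = Path(name)
        baseline_copy = BASELINE_DIR / name.lstrip("/")
        if sha256_of(host_copy) != sha256_of(baseline_copy):
            drifted.append(name)
    return drifted

if __name__ == "__main__":
    for name in report_drift():
        print(f"DRIFT: {name} differs from baseline")
```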

The argument for “installation per host” is that if there is an issue with the installation on the network share, all hosts suffer. This is a bit of a fallacy. While it is true that an issue breaks things everywhere, fixing it in one place also fixes it everywhere. The ability to extend the enterprise-wide installation with minimal effort, maximizing your ability to administer it, outweighs the downside of breaking it everywhere. It takes discipline to methodically maintain a centralized software installation.

Data Management

Data should be stored on NAS (network attached storage) appliances, as they are suited to optimal data delivery across a network and give a central source for managing it. These days, most data is delivered across a network anyway. NAS appliances (such as NetApp) are commonly used to deliver a "data share" using SMB or NFS, or block storage over SAN protocols such as FCoE.
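To illustrate keeping the data taxonomy on the NAS rather than scattered across hosts, here is a hedged sketch (my own hypothetical example, not NetApp tooling) that parses /proc/mounts on a Linux client to confirm the expected NFS shares are actually mounted; the share names are an assumed example of a simple taxonomy.

```python
#!/usr/bin/env python3
"""Hypothetical sketch: verify that a Linux host has the NFS shares it is
expected to mount from the NAS. The share list is an assumed example of a
simple data taxonomy, not a real site layout."""

EXPECTED_SHARES = {
    "/data/projects": "nas01:/vol/projects",   # assumed export names
    "/data/home":     "nas01:/vol/home",
}

def mounted_nfs(proc_mounts: str = "/proc/mounts") -> dict[str, str]:
    """Map mount point -> source for every NFS/NFSv4 mount on the host."""
    mounts = {}
    with open(proc_mounts) as fh:
        for line in fh:
            source, mount_point, fstype = line.split()[:3]
            if fstype.startswith("nfs"):
                mounts[mount_point] = source
    return mounts

if __name__ == "__main__":
    current = mounted_nfs()
    for mount_point, source in EXPECTED_SHARES.items():
        if current.get(mount_point) != source:
            print(f"MISSING or WRONG: {mount_point} should come from {source}")
```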

In the 1990s, the argument against using an Ethernet network for delivering data came down to bandwidth and fear of what would happen if the network went down. Even back then, though, if you lost the network you lost everything anyway, since the environment was held together by backend services for identity management and DNS. In the 21st century, I always chose to install two separate network cards (with at least two ports each) in each server and configured at least one port from each card into a trunked pair. One pair serviced a data network and the other serviced frontend/user access. This worked well over the years. Virtualizing/trunking multiple network cards provides a fault-tolerant interface for both user and data access, though I have never actually seen a network card go bad.
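To make the two-card, two-network layout concrete, here is a small sketch of the pairing; the interface and bond names are assumptions for illustration, not a site standard, and the check simply confirms each bonded pair draws one port from each physical card so the loss of a whole card takes down neither network.

```python
#!/usr/bin/env python3
"""Hypothetical sketch of the two-bond layout described above: two physical
cards, two ports each, with each bonded pair spanning both cards.
Interface names are assumptions for illustration."""

# physical card -> its ports (assumed names)
CARDS = {"card0": ["eno1", "eno2"], "card1": ["ens1f0", "ens1f1"]}

# bond -> (purpose, member ports)
BONDS = {
    "bond0": ("data network",         ["eno1", "ens1f0"]),
    "bond1": ("frontend/user access", ["eno2", "ens1f1"]),
}

def card_of(port: str) -> str:
    """Return which physical card a port belongs to."""
    for card, ports in CARDS.items():
        if port in ports:
            return card
    raise ValueError(f"unknown port {port}")

def check_fault_tolerance() -> None:
    """Confirm every bond spans both cards, so either card can fail."""
    for bond, (purpose, members) in BONDS.items():
        cards_used = {card_of(p) for p in members}
        status = "OK" if cards_used == set(CARDS) else "NOT fault tolerant"
        print(f"{bond} ({purpose}): ports {members} -> {status}")

if __name__ == "__main__":
    check_fault_tolerance()
```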

There is a handful of application software that requires SAN storage, though I would avoid SAN unless it is absolutely required. You are limited by the filesystem laid down on the SAN volume and will probably have to offload management of the data from the appliance serving the volume. NetApp has a good article on SAN vs. NAS.

Business Continuance/Disaster Recovery and Virtualization

There is the subject of business continuance and disaster recovery that plays into this equation. Network virtualization is a term that covers network switches, network adapters, load balancers, virtual LANs, virtual machines, and remote access solutions. Virtualization is key to providing fault tolerance inside a given data center as well as to providing effective disaster recovery. Virtualization across data centers simplifies recovery when a single data center fails. All of this requires planning, replication of data, and procedures (automated or not) to swing a service across data centers. Cloud services provide fault-tolerant service delivery as a base offering.

The use of virtual machines is commonplace these days. I’ve been amused in the past at the administrative practice of Windows administrators who would deploy very small servers that each provided only a single service. When they discovered virtualization, they adhered to the same paradigm, providing a single service per virtual machine. Working with “the big iron”, multiple services would be served off of a single server instance where those services had roughly the same tuning requirements, and utilization and performance were monitored. With good configuration management, extending capacity was fairly simple.

Work has been done to virtualize the network topology so that you can deploy hosts on the same network worldwide. For me, this is nirvana for supporting a disaster recovery plan, since a service or virtual host can be moved to another host, no matter which physical network the hypervisor is attached to, without having to reconfigure its network configuration or its host naming service entry.

Virtual networks (e.g. Cisco Easy Virtual Network, a Layer 3 virtualization) provide the abstraction layer where network segmentation can go wide, meaning a segment can span multiple physical networks, providing larger network segments across data centers. With such a “super network”, disaster recovery becomes much simpler, as IP addresses do not have to be reconfigured and reported to related services such as DNS.

Cloud Computing

In my last job as a systems architect, I had the vision of creating a private cloud, with the goal of moving most hypervisors and virtual machines into it. Whether administering a private or public cloud, one needs a toolset for managing a "cloud". The term "cloud" was a favorite buzzword 10 years ago yet never a definitive term. For IT management it usually meant something like "I have a problem that would be easier to shove into the cloud and thus solve the problem" (it sounded like the outsourcing initiatives of the 1990s). Any problem that exists doesn't go away; if anything, the ability to manage the network just becomes more complicated.

There have been various proprietary software solutions that allow the administrator to address part of what is involved in managing a cloud, whether standing up a virtual host, carving out data space, or configuring the network. OpenStack looks to be hardware- and OS-agnostic for managing private and public cloud environments. I have no experience here, but it looks to be a solution for which hardware manufacturers and OS developers have built plugins, along with integration with the major public cloud providers.

Having experience working with an IaaS/SaaS solution, I have found that utilizing a public cloud is only effective with small data. Before initiating a public cloud contract, work out an exit plan. If you have a large amount of data to move, you likely will not be able to push it across the wire. There needs to be a plan in place, possibly a contractual term, for being able to physically retrieve the data. Most companies are anxious to enter into a cloud arrangement but have not planned for when they wish to exit.
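A back-of-the-envelope calculation shows why "push it across the wire" rarely works at scale. The data set sizes, link speeds, and the 70% utilization factor below are illustrative assumptions, not measurements from any particular provider.

```python
#!/usr/bin/env python3
"""Illustrative arithmetic only: how long it takes to move a data set out
of a public cloud over a dedicated link. Sizes, link speeds, and the 70%
utilization factor are assumptions for the example."""

def transfer_days(dataset_tb: float, link_gbps: float, utilization: float = 0.7) -> float:
    """Days needed to move dataset_tb terabytes over a link_gbps link."""
    bits = dataset_tb * 1e12 * 8                     # terabytes -> bits
    seconds = bits / (link_gbps * 1e9 * utilization)  # effective bits/second
    return seconds / 86400

if __name__ == "__main__":
    for size_tb, link in [(50, 1), (500, 1), (500, 10)]:
        print(f"{size_tb} TB over {link} Gbps: ~{transfer_days(size_tb, link):.0f} days")
```

At roughly two months to pull 500 TB across a 1 Gbps link, physically retrieving the data starts to look like the only practical exit path.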

Enterprise-Wide Management

There is the old adage that two things are certain in life: death and taxes. Where humans have made something, whether physical or abstract, one more thing is certain: it is not perfect and will likely fail at some time in the future. Network monitoring is required so the administrators know when a system has failed. Implementation should be staged: start with server up/down monitoring, then adopt algorithms for detecting when a service is no longer available. From there, performance metrics can be collected and aggregated, along with thresholds, into a form that supports capacity planning and measures whether critical success factors are met.
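A minimal sketch of the first two stages (host up/down, then service availability) is below. The hostnames and ports are placeholders, and a production environment would of course use an established monitoring platform rather than a script like this.

```python
#!/usr/bin/env python3
"""Hypothetical sketch of staged monitoring: stage one is host up/down,
stage two checks that a specific service still answers on its port.
Hostnames and ports are placeholders, not a real inventory."""

import socket
import subprocess

HOSTS = {"app01.example.com": 443, "db01.example.com": 5432}  # host -> service port

def host_up(host: str) -> bool:
    """Stage 1: a single ICMP ping (assumes a Linux-style ping command)."""
    return subprocess.run(["ping", "-c", "1", "-W", "2", host],
                          stdout=subprocess.DEVNULL).returncode == 0

def service_up(host: str, port: int, timeout: float = 3.0) -> bool:
    """Stage 2: the host may be up while the service is not answering."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    for host, port in HOSTS.items():
        print(f"{host}: host={'up' if host_up(host) else 'DOWN'}, "
              f"service:{port}={'up' if service_up(host, port) else 'DOWN'}")
```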

Another thought toward capacity management: depending on the criticality of the service offering, the environment should provide for test/dev as well as production environments. Some services under continual development (e.g. waterfall) could require separate test and dev environments in order to stage a production push.

Provisioning tools are needed to perform quick, consistent installations, whether loading an OS or enabling a software service. At a minimum, shell scripts are needed to perform the low-level configuration. At a higher level, software frameworks like OpenStack and Red Hat Satellite are needed to manage a server farm of more than a handful of servers. A thin wrapper at the script level already buys consistency, as in the sketch below.
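The sketch runs the same ordered post-install steps on every new host; the host list and commands are placeholder assumptions, and it is written in Python rather than shell purely for illustration.

```python
#!/usr/bin/env python3
"""Hypothetical sketch: apply the same ordered post-install steps to every
new host so installations stay consistent. Host names and commands are
placeholders; real sites would drive this from Satellite, Kickstart, or a
configuration management tool."""

import subprocess

NEW_HOSTS = ["web05.example.com", "web06.example.com"]   # placeholder inventory

POST_INSTALL_STEPS = [                                   # same order everywhere
    "timedatectl set-timezone UTC",
    "systemctl enable --now chronyd",
    "mkdir -p /data",
]

def provision(host: str) -> None:
    """Apply each step over ssh, stopping at the first failure on that host."""
    for step in POST_INSTALL_STEPS:
        if subprocess.run(["ssh", host, step]).returncode != 0:
            print(f"{host}: FAILED at '{step}'")
            return
    print(f"{host}: provisioned")

if __name__ == "__main__":
    for host in NEW_HOSTS:
        provision(host)
```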

Remote Access

Remote access has been around in various forms for the past 20+ years and is becoming a critical function today. VPN (virtual private network) is the term associated with providing secure packet transmission over public networks. While a secure transport is needed, outside of public cloud services there is also the need for an edge service that provides the corporate user environment "as if" the user were inside the office.

Having worked at a company with high-end graphical workstations used by technical users requiring graphics virtualization and high data performance, we worked with a couple of solutions that delivered a remote desktop. NoMachine worked well, but we migrated toward NICE Software (now an Amazon Web Services company). At the time, we were looking at not only providing a remote access solution but also a replacement for the expensive desktop workstation, while providing larger pipes in the data center to the data farm. NICE was advantageous for the end user in that they could start an interactive process on the graphics server farm as a remote application from their desk, suspend the session while the process ran, and remotely connect again to check on it from home.

Summary

When the architecture is done correctly, you create a network of computers that are consistently deployed and easily recreated should the need arise. More importantly, when managing multiple administrators, where a defined architecture exists and is understood and supported by all, the efficiency gained allows the admins to work beyond the daily issues caused by inconsistent deployment, promotes positive team dynamics, and minimizes tribal knowledge.