SaaS at the Business Edge: Are Your Downtime Fears Justified?

Software-as-a-service (SaaS) business applications have clear advantages. They have great pricing. They are convenient and easy to manage. You get cutting-edge technology. However, to get them implemented, we have to overcome a very valid objection:

Sometimes the internet breaks. 

Over the course of two hours on 24 June 2019, the internet broke down for most of the United States. Popular websites and apps were inaccessible on browsers and phones.  

The cause was achingly human while also being deeply technical. It is called a route leak: A Border Gateway Protocol (BGP) route list that was intended as a map to guide traffic between a few networks was published to networks that should not use those directions. It is like all the rush hour freeway traffic being routed to a suburban side street.  

As a result, traffic for 2,400 networks was unfortunately sent through the network of Allegheny Technologies in Pennsylvania. Their infrastructure was not up to the task and most requests failed. 

This 80-year-old metals manufacturing company was not meant to be a major hub of the internet, but for two hours in 2019, it was! (Source: Wikipedia, public domain)

BGP is one of the many arcane arts that usher traffic across the internet. The “inter-net” is a connection of many autonomous networks, and BGP provides rules for how to get from here to there by moving data from one network to another. A BGP route is somewhat like the turn-by-turn directions you get from Google Maps, only it tells data how to get from a server in Bellevue, Washington to your customer support desk in Trenton, New Jersey.

Propagation of a bad BGP table is preventable. This was clearly an error that everyone agrees never should have happened, but it did. And while the Allegheny incident was a high-profile breakage whose source we can identify, this sort of thing happens in harder-to-diagnose ways all the time.  

Given the nature of internet infrastructure and the laws of probability, such failures are inevitable. The internet will break, connections will drop, and services will fail for no obvious reason.

The more you know about how the internet functions, the more difficult it is to believe that it works at all. Along with leaky BGP routes, services depend on DNS, content delivery networks, cloud service providers, and a variety of other technologies run by different companies. All of them fall well beyond the reach of the customer support or sales person whose web browser is displaying a cute “504, timed out” message instead of the new customer’s loan document.

Where does that leave your business operations, particularly now that cloud-based SaaS applications are taking over?  

If your vendor is not taking your concerns about outages seriously, they clearly don’t know much about the “modern” internet. 

The concern naturally increases when the risks are greater. The closer the cloud-based solution is to customer engagement where customers are won and lost, the more reasonably nervous you would be about uptime.  

  • If you are a car dealer and your parts lookup is cloud-based, short downtime is awkward and undesirable.  
  • If your customer-facing staff rely on a scheduling system based in the cloud, downtime is an absolutely terrible prospect.
  • If your medical clinic’s electronic health records or electronic medical records are cloud-based, downtime is completely unacceptable. Significant downtime has to be unthinkable.

For some locations, such as many rural and suburban areas of the US, the internet breaks worse and more often. When considering a cloud-based or SaaS solution for a business, concerns about downtime are legitimate and substantiated. Regardless of the technical advantages, inconveniencing customers isn’t worth it. Putting the weak links of the internet between the business and customer interaction at the service counter isn’t worth it. 

As technologists, we can’t just complain and shirk connectivity. These applications are the key to being competitive in the modern marketplace. We have to make cloud solutions functional and reliable. They simplify business operations, keep technology up to date, and save money.  

Despite everything fragile and subject to failure between that key service and our users, we have to create the right level of resilience.

Key Network Issues for SaaS Deployments 

  • Uptime and bandwidth 
  • Management and support requirements 
  • Security 

Uptime and bandwidth 

Some things you don’t want to know, such as how many problems the internet has at any one time. Not every issue makes the news, but even very short incidents can cause problems for mission-critical real-time applications. A hiccup at the ISP can be enough to drop a call or tangle up a customer service response.  

A study of Bigleaf router performance data shows that a typical single-ISP business experiences 3.5 hours of internet downtime a month. What’s more, it experiences an additional 23 hours of severely degraded service from jitter, low throughput, and other internet problems. These problems don’t register as downtime, but their effect on applications, and thus on customer experience, is the same. It is downtime by another name.
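To put those hours on the same scale as an uptime percentage, the arithmetic can be sketched as follows. This is a rough illustration only: the 30-day month (720 hours) and the function name `availability_percent` are assumptions of this sketch, not part of the study.

```python
HOURS_PER_MONTH = 30 * 24  # assumption: a 30-day month, 720 hours


def availability_percent(unusable_hours, total_hours=HOURS_PER_MONTH):
    """Return the share of the period, in percent, during which the link was usable."""
    if not 0 <= unusable_hours <= total_hours:
        raise ValueError("unusable_hours must be between 0 and total_hours")
    return 100.0 * (total_hours - unusable_hours) / total_hours


hard_downtime = 3.5  # hours per month of outright downtime (figure quoted above)
degraded = 23.0      # hours per month of severely degraded service (figure quoted above)

print(f"Counting outages only:      {availability_percent(hard_downtime):.2f}%")
print(f"Counting degraded time too: {availability_percent(hard_downtime + degraded):.2f}%")
```

Under the 30-day assumption, the outages alone work out to roughly 99.5 percent availability, while adding the degraded hours brings the usable share down to roughly 96.3 percent.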

Calculating management and support 

When networking gets critical, the solutions can be very involved. They can become a problem in themselves. When deciding on quality of service (QoS) settings to optimize a Voice over IP (VoIP) system, are you impacting another mission-critical system? Is YouTube video downloading important to a business operation, or can you lower its priority? Do you have to manually tweak and then stress test these applications to see how they interact?

As new applications emerge and the business develops new expectations of network performance, maintaining the network, troubleshooting problems, and handling new installations can become significant time and budget burdens.

Security in all things 

Security has to be a part of every conversation now, and the resolution of our network challenges is no exception. The perimeter firewall is a centerpiece of current network security strategies. Particularly in regulated industries with compliance requirements, the business needs to have control over their firewall to keep rules and monitors up to snuff. Network solutions can interfere with existing firewalls and potentially provide a new attack vector. 

The Uptime Reality 

Bigleaf Networks was built with all of these concerns in mind. Our SD-WAN platform allows clients to seamlessly use multiple ISPs, giving their network higher reliability and performance than any one ISP can deliver by itself.

In the course of our business, we have a window into the reliability of the internet. In a recent month, all the circuits that our clients used averaged 92.5 percent reliability. That figure counts not just major outages but also moments when low throughput, errors, or jitter prevent the internet from being usable.

Our data also shows the solution: with Bigleaf implemented, uptime at the client location was 99.88 percent.
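For a sense of scale, those two percentages can be converted back into hours. The sketch below is illustrative only: it assumes a 30-day month (720 hours), and the function name `unusable_hours` is hypothetical, not part of any Bigleaf tooling.

```python
HOURS_PER_MONTH = 30 * 24  # assumption: a 30-day month, 720 hours


def unusable_hours(reliability_percent, total_hours=HOURS_PER_MONTH):
    """Convert a reliability percentage into hours per period when the link is not usable."""
    if not 0 <= reliability_percent <= 100:
        raise ValueError("reliability_percent must be between 0 and 100")
    return total_hours * (100.0 - reliability_percent) / 100.0


single_circuit = unusable_hours(92.5)   # average circuit figure quoted above
with_sd_wan = unusable_hours(99.88)     # client-location figure quoted above

print(f"Average single circuit: {single_circuit:.1f} unusable hours per month")
print(f"With multiple ISPs:     {with_sd_wan * 60:.0f} unusable minutes per month")
```

Under that assumption, 92.5 percent corresponds to about 54 unusable hours a month, while 99.88 percent corresponds to a little under one hour.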

Bringing a business-critical SaaS application into the office is exciting but scary. There are no guarantees in this world, but using the right SD-WAN solution means that the next time someone transposes a couple of numbers in a BGP table, your operation is more likely to stay up and running.
