How to wreck your reputation as a service provider

Maisy's Musings

Posted on May 23, 2016

So, I don't normally have a topic worth discussing, but whilst this is still fresh in my mind, let me tell you a little story. It is a story about how one hosting company screwed up oh so royally. And at the time of this post, it is still going on...

A bit of background

A few years ago, I discovered a VPS host called BHost. They were a small UK-based VPS company offering decent-spec Virtual Private Servers for a very good price, and they would occasionally offer recurring discounts too. To top that off, if you had a problem, you could send them a support request and they'd reply within an hour, even for things that they weren't obliged to help you out with (such as setting up a VPN with your VPS).

I had a pretty good relationship with them, and at one point, had three different VPS instances open with them at once. If you had outgrown your server and wanted to beef up the memory, you could send a request to them and their response would be along the lines of:

No problem! We've upped your container for you. Don't worry about paying the difference - you'll pay the new tariff next cycle.

That was my experience with them a few years ago, anyway. Then things started to change.

The fall

To start, BHost moved their hardware to a new datacentre, and not long after, they started suffering bandwidth and routing issues - the slightest DDoS on any of their clients and the entire cluster would be knocked offline. I should know, because one of my sites was the target of an attack and the hosts ended up tweeting about it. (That particular attacker was brought to justice and hopefully provided the authorities with intel on other, more serious criminals, but that's for another time...)

Whenever unexpected downtime occurred, we'd sometimes be left waiting for support tickets to be answered, or for information to appear on Twitter. After challenging their support about the matter, I was promised that improved measures would be put in place to deal with incidents and to relay information to customers.

Because of the patchy service and equally patchy support, I cancelled one of my instances, and then cancelled a second instance because it was going unused.

The fire

Fast forward to a few days ago. One of my users informs me that services are down - did someone sit on something, did something crash, or did someone forget to pay the invoice again? Nope. We were out of space. Which was weird. Out of a quota of 50GB, we were only using 30GB. Why were we out of space? What many providers like to do is over-sell space, knowing that only a handful of users will consume 100% of their quota and that the rest of the clients will comfortably share what is left. What that means, though, is that when the disk physically runs out of space, every single client on that machine goes down, no matter how far under quota they are.
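To put some numbers on that, here is a rough sketch of the over-selling arithmetic. Everything in it is made up for illustration: the function name, the 200GB disk, and the six clients are hypothetical, and I have no idea what BHost's real figures were. Only the 50GB quota and the 30GB usage come from my own server.

```python
def oversold_host(physical_gb, quotas_gb, used_gb):
    """Summarise an over-sold disk (hypothetical helper, illustration only)."""
    total_promised = sum(quotas_gb)
    total_used = sum(used_gb)
    return {
        # How many times over the physical disk has been promised to clients.
        "oversell_ratio": total_promised / physical_gb,
        # Space physically left, regardless of anyone's individual quota.
        "free_gb": physical_gb - total_used,
        # Once this is True, every client on the disk is out of space.
        "disk_full": total_used >= physical_gb,
    }


# Hypothetical host: a 200GB disk shared by six clients with 50GB quotas each.
quotas = [50, 50, 50, 50, 50, 50]
# The first client is using 30GB of their 50GB, well under quota.
usage = [30, 20, 25, 45, 50, 30]

report = oversold_host(200, quotas, usage)
```

In this made-up case the disk has been promised 1.5 times over, and it is full even though only one client has actually reached their quota.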

The first big mistake

Most providers can avoid this by limiting how much they provision to a single disk (or machine), and by having monitoring and alerting systems in place for when a drive gets to 80% or 90% capacity. If such a system had been in place, this kind of situation could have been avoided in the first place.
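For what it's worth, the check itself is not hard. Below is a minimal sketch in Python of the kind of threshold check I mean, using the 80% and 90% figures from above. The function names and the idea of running it on a schedule are my own assumptions; I don't know what tooling BHost had (or didn't have), and a real setup would send an alert somewhere rather than just return a string.

```python
import shutil

WARN_THRESHOLD = 0.80  # assumed warning level, taken from the 80% figure above
CRIT_THRESHOLD = 0.90  # assumed critical level, taken from the 90% figure above


def classify_usage(used_fraction):
    """Map a used-space fraction (0.0 to 1.0) to an alert level."""
    if used_fraction >= CRIT_THRESHOLD:
        return "critical"
    if used_fraction >= WARN_THRESHOLD:
        return "warning"
    return "ok"


def check_disk(path="/"):
    """Return (alert level, used fraction) for the filesystem holding path."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    used_fraction = usage.used / usage.total
    return classify_usage(used_fraction), used_fraction
```

Run something like check_disk("/") every few minutes from a scheduled job, send an email or a page whenever the level isn't "ok", and you get a warning long before the disk actually fills.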

So, no monitoring system was in place to prevent this disaster. Slow clap.

So, when (like me) you realise that this problem isn't your fault, you open a support ticket with a high priority and you ask them to investigate ASAP. They should figure out what's going on pretty quickly and get to the problem. Right? Especially when potentially dozens of other clients are having the same problem and have come to the same conclusion. We're renting Virtualised Linux Servers - we should individually have the computing smarts to understand what's going on.

The nail in the coffin

No. Checking Twitter, plenty of other people have fired off support tickets, had no reply, and then chased them up on Twitter. The fire is continuing to burn, but the fire department is nowhere to be seen - you keep calling 911 and the phone keeps going to voicemail.

The escape

If you're like me, you had a contingency plan in place. Nightly automated backups. You built your log cabin on wheels and you can steer it away from the danger and into a safer neighbourhood, on the far side of the lake. If only the fire department had just put that fire out before the move was necessary.

If you're not like me and instead you're one of the unfortunate people on Twitter who didn't have backups (why didn't you?), then your log cabin has probably caught fire and the fire department are still not answering. I feel truly, truly sorry for you. If they turn up before you've lost your home and put the fire out, you've still suffered a great deal of damage.

If I were BHost Inc. right now, I'd feel pretty flipping ashamed. You just let your customers down, and you probably let their customers down too. If I were one of BHost's clients (keep in mind that I was one myself), I'd be livid (and I am) - the moment my services were back online, I'd demand they waive any outstanding invoices, take some backups, and go somewhere else.

Can I escape too?

If you've got some MySQL databases you're having trouble dumping, then I'd do some research into taking the raw table files used by MySQL (.frm, .MYD and .MYI files for MyISAM tables, or .frm and .ibd files for InnoDB tables), copying them to a new machine, and working on restoring your data from those - it isn't quite as straightforward as having an SQL dump, but it is very possible.
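As a starting point, here is a rough Python sketch of the "grab the raw files" step only. The function names are mine, and the paths in the usage note are assumptions, so check where your own MySQL data directory lives. It assumes MySQL is stopped while you copy, and it does not do the restore itself: InnoDB tables in particular need more than just their .ibd files copied across, which is where the research comes in.

```python
import shutil
from pathlib import Path

# Raw table file extensions: .frm/.MYD/.MYI (MyISAM) and .frm/.ibd (InnoDB).
TABLE_FILE_SUFFIXES = {".frm", ".myd", ".myi", ".ibd"}


def find_table_files(datadir):
    """List raw table files under a MySQL data directory, in sorted order."""
    datadir = Path(datadir)
    return sorted(
        p for p in datadir.rglob("*")
        if p.is_file() and p.suffix.lower() in TABLE_FILE_SUFFIXES
    )


def copy_table_files(datadir, dest):
    """Copy raw table files to dest, keeping the per-database folder layout."""
    datadir, dest = Path(datadir), Path(dest)
    copied = []
    for src in find_table_files(datadir):
        target = dest / src.relative_to(datadir)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, target)  # copy2 also preserves file timestamps
        copied.append(target)
    return copied
```

Something like copy_table_files("/var/lib/mysql", "/mnt/rescue/mysql") would then give you a copy you can pull down to a safer machine. Both of those paths are examples, not gospel.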

I wish you luck.

What is "Maisy's Musings"?

It's a blog. A blog that happens to be mine. The name Maisy is just a pet one, but because it sounded catchy, I decided to call it Maisy's Musings. When I can be bothered or inspired to write a post on here, you'll see things from daily sightings, to the occasional rant or cool tech fact.

Read on at your own peril.