ShootQ Downtime 2011-08-09

What Happened to ShootQ?!

Hey guys and gals. I just wanted to jump in here and give you a full, detailed, brutally honest breakdown of what's going on with today's ShootQ downtime, why it happened, and what we're doing to fix it. Total transparency will help everyone understand precisely why their access to ShootQ was limited today.

First, a little history. ShootQ was started by a very small team, with no full-time operations staff. As a result, it made a lot of sense for us at the time to host ShootQ and its associated systems and services with a "cloud"-based provider. This gave us the ability to get started without having to purchase a bunch of hardware, and to rely on a highly skilled set of people at the provider to keep us up and running. It's served us well.

About a year ago, ShootQ was acquired by Pictage, which has a large team of amazing operations staff, lots of high-powered hardware, and high-end monitoring systems and procedures. About 6 months ago, we knew it was going to be necessary to move ShootQ to Pictage-controlled hardware and data centers, but we needed to wait on new hardware, expanded capacity, and a brand new data center.

Why did we decide this? Because we started to notice that our cloud provider was becoming less reliable. You may have noticed as well. We built systems to minimize the issues, put in better monitoring, and brought in an extra full-time person just to manage and automate ShootQ's operations.

The wait is almost over: we'll be moving in the next few weeks. But it looks like we weren't quite quick enough to avoid all problems, as we were hit with a large-scale issue this morning on our cloud provider's systems. Someone was abusing resources in the same resource pool as our "master database", causing our performance to take a huge hit. We have instant snapshot backups in place, so data is safe, but as soon as the problems hit, we took ShootQ offline to perform emergency maintenance.

We contacted our provider, who was painfully slow to respond to the issues. We called, emailed, filed tickets, and did everything we could to react, but we are extremely frustrated with our provider for the delay. Finally, our provider killed off the abuser in our resource pool, and for a short time, service was restored. Within about 15 minutes, we started to see problems again, so we took the application offline again. They've finally determined that the root cause is the hardware RAID controller that manages disk access on the machine hosting the ShootQ master database. The team is working on promoting our "slave" database to restore service ASAP.

So, what are we doing to avoid issues like this in the future?

1. First and foremost, we're moving off our existing provider as planned. We'll be moving to a brand new data center, on brand new hardware, that we share with no one. The hardware is multiple times faster and more reliable than what our current provider can offer, and we're in total control. When issues arise, it will take minutes to solve rather than hours.
2. The move will also enable highly automated monitoring and system notifications that allow us to see problems coming minutes, and perhaps hours, before they crop up. This gives us an unprecedented ability to resolve problems.
3. We're working on some processes to ensure that not only operations but also customer care and other parts of the company are notified of issues, so when you call customer care you'll get the most up-to-date information about the status of services.

This move has been planned for months, and we've been moving as quickly as we can to make it happen. I wish we could have completed the migration by now to avoid this issue with our provider, but at the very least you deserve to peek behind the curtain and see how issues arise. I want all of our customers to know that we are as frustrated about the downtime as you are, and that we have had a plan to avoid issues like this for a long time. I apologize for any inconvenience this may have caused you, and I will be happy to answer any questions you have here in detail.

Thanks for your patience!
__________________
--
Jonathan LaCour
VP, Software Products
ShootQ Code Poet Laureate

23 Comments

  • 0
    Luis Alicea

    Thanks for the clear message and good luck with the planned changes.

  • 0
    Permanently deleted user

    That's great and all...but how about a timeline for when we may be back online?

  • 0
    Adina Hayne

    My clients are trying to book.  Can you tell me when we are coming back up?

  • 0
    Saul Padua

    I can't live without ShootQ! Hurry up guys!! Hahaha

  • 0
    Justin Lund

    Hi Dan,

    We're in uncharted territory right now so I don't have a realistic timeframe to give you.  Our team is working hard to resolve it as quickly as they can.  I'll post when I know more.

  • 0
    Steven Wayne

    Mine just came up, is it really working or should I wait to post payments???

  • 0
    Justin Lund

    We are back up and running now.  We're keeping a watchful eye on the app.  Please let our support team know if you're experiencing any issues.

  • 0
    Permanently deleted user

    Unable to create a contract; it says there is a problem.

  • 0
    Justin Lund

    Thanks Jeremiah, our dev team is working on that issue right now.  It's something that worked in initial testing but failed soon after.  I'll update when it's fixed.

  • 0
    Zoe Dennis

    Having the same problem.  Tried to cancel/revise a booking and I keep getting an error message with every route I take.  Thanks!

  • 0
    Justin Lund

    Still working on the booking issues.

  • 0
    Justin Lund

    Booking issues are resolved.

  • 0
    Phil Holland

    This post is a "how to" on dealing with customer communication in times of technical challenges.  Thanks ShootQ

  • 0
    Greg Howard

    Thanks for the update. Yesterday was frustrating but I appreciate the candour.

    Greg

  • 0
    Snap Cubby San Luis Obispo

    I recently watched another very similar company, Mind Body, Inc. (http://www.mindbodyonline.com/), go through a larger, but similar, technical problem that had a much bigger effect on their clients, and I hope ShootQ can learn from their troubles to prevent a similar disaster.

    Quick background: MindBody is a very similar online client management/booking/billing program for fitness studios, hair salons, etc.  What ShootQ does for photography studios, they do for these other types of businesses.  They recently experienced a "Denial of Service" attack (http://en.wikipedia.org/wiki/Denial-of-service_attack) that left their servers down for a total of 4 days.  I'm not sure if ShootQ's problem was a similar attack, but that isn't really relevant.  The ultimate problem Mind Body's clients experienced was that ALL of their day to day business operations were unavailable when the server went down.  Like ShootQ, this included their daily schedules, client contact info, and billing info.  Friends I have that work at Mind Body said they essentially had to put all staff on board to field angry frustrated calls coming in from their frantic clients trying to run their business.

    The problem both ShootQ's customers and Mind Body's customers share is that we are extremely reliant on these two companies, which is fantastic, but when the service goes down, we have no way of accessing the most basic information of our own business.  No matter who is in charge of ShootQ's servers, ShootQ will always face the possibility of some sort of technical issue that could force downtimes.  And sometimes even the shortest downtimes can create big headaches for us users that may need to access their information during that time.

    This being said, I think it would greatly benefit ShootQ, and of course us customers, to have a simple database available for all ShootQ customers.  This doesn't have to be a fancy program to run in conjunction, just a simple series of text files that provide a snapshot of where we left off: text versions that we can back up to our own computers, containing the dates booked, the clients associated with those dates, and a list of clients with contact information.  This wouldn't take away at all from our reliance on (and appreciation of) ShootQ, but would give us a manual safety net that we could use to go back to the old-fashioned manual way of managing our business if and when ShootQ faced a technical problem.  If Mind Body had this in place, they would have saved themselves the additional headaches of dealing with their angry customers while they battled to fix their own technical problems, and the cost of business as new customers were turned away and some current customers left in anger.  And the business owners using their software would have benefited enormously.

    Thanks for listening!

    Eric

  • 0
    Daniel Hennessey

    Still having problems with clients submitting questionnaires (3 hours, no luck), and most likely a problem with contracts, because two have been sent with no returns. Hope to have this resolved quickly!

  • 0
    Daniel Hennessey

    Also not able to send proposals nor emails/questionnaires

  • 0
    Justin Lund

    Daniel - looks like our emailer had a hiccup this morning.  Everything should have gone out by 10:45AM Eastern this morning.

  • 0
    Justin Lund

    Eric - you bring up an excellent point!  While you can't download a database in order to use ShootQ offline, we are working on making it easier to keep electronic versions of your information on your computer.

    You can do some of this already by clicking the "Print" button from within any shoot.  This brings up a dossier view with shoot info, client info, questionnaires, contracts, etc.  You can print that or save it as a PDF.

  • 0
    Permanently deleted user

    Thanks for the transparency, love it!!

    I think being able to download some kind of backup to keep ourselves would be nice. Even if it was just a big old text file.  Something to have in the case of emergency.

  • 0
    Marcus Seeger

    Thought you should know that the Twitter resource for updates that is displayed when ShootQ is offline ("ShootQ will be back soon"), shootqalerts, doesn't exist.

    I am assuming you are still having problems with the migration. I was able to log in just now but saw no calendar or client details, then ShootQ crashed and is offline.....

    Oh, and thanks for the dossier view tip above to print as PDF. Great workaround for an electronic archive. Will do this when back online!

  • 0
    Marcus Seeger

    lol, 15 minutes later, all working again...thank you

  • 0
    Justin Lund

    Just to follow up, tonight's release includes a new feature to download a backup of your entire ShootQ account.  Read more about it here - ShootQ Backup
