Witt outage and data loss recap

Started by Sir Lüc, March 03, 2023, 09:32:34 AM


Sir Lüc

Azul,

as many of you will have noticed, our web presence has been compromised for the last two weeks or so, due to what our hosting provider described as a "rather significant failure with (their) primary server". This began with me being unable to access our backend, continued with various DNS woes causing outages for people in certain areas or on certain ISPs, and ended with a complete outage on March 1-ish.


What happened?

Due to the failure, as you may have noticed, we lost all data that was changed between Feb 16th and the ultimate outage on Mar 1st - Witt posts and PMs, Wiki articles, talossa.com content, and so on. Again, quoting from DoRoyal,

Quote:
The front end of our hosted websites remained online, allowing for front end edits to be made. The backend panel, however, including our backup processes, stopped working completely on the 16th, (meaning that) any edits that were made after the backup process and backend panel stopped working, would only be saved to the old server, and thus, were lost in the migration.

My possibly inaccurate interpretation is: we were on server A, then our data got moved to server B, but we still were interacting with server A. When server A finally went offline (or our address finally pointed to B -- remember all our DNS woes before the actual outage?), we were left with the data on B, which was by then two weeks out of date.

What happened *then*?
Mostly restoring all we could, recreating all subdomains and moving files to the correct places so our new server/control panel wouldn't complain anymore. This took the better part of yesterday for me and new MinTech Dan T. Most low-maintenance subdomains were up pretty quickly, Witt was restored around noon TST yesterday, and the Wiki took a bit longer but has been back up since earlier today.

Didn't we have backups?
Yes, Witt and Wiki are backed up weekly. However, due to the same frontend-backend misconfiguration issue, the backups were also unable to pick up any fresh data after the 16th.
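
For anyone wondering how stale backups like these could be caught earlier, here is a minimal sketch, not what we or DoRoyal actually run. It assumes a hypothetical folder of dated backup files ending in `.bak` and flags any backup whose bytes are identical to the one before it. The folder layout, file naming and function names are all assumptions for illustration. Note that many dump tools embed a timestamp in their output, in which case a byte-for-byte comparison would not spot an unchanged dataset.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def stale_backups(backup_dir: Path, pattern: str = "*.bak") -> list[Path]:
    """Return backups whose content equals the previous backup's content.

    Files are compared in name order, so this assumes (hypothetically)
    that file names start with an ISO date such as 2023-02-21.
    """
    stale = []
    previous = None
    for path in sorted(backup_dir.glob(pattern)):
        digest = file_digest(path)
        if digest == previous:
            # Nothing changed since the last backup: worth a warning
            # on a board that gets new posts every day.
            stale.append(path)
        previous = digest
    return stale
```

On a busy forum, two identical weekly backups in a row would be a strong hint that the backup job is reading from a copy that no longer receives new data.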

Anything else missing?
We may have lost some Witt attachments and Wiki files. Incidentally, this may be related to why all avatars were reset. I restored all I could, and I'm told there are more backups coming that were made by our host. Regardless, if you had a CoA, it can still be linked to from heraldry.talossa.com, or from the wiki.

What do I (the reader) need to do?
DM me if there's anything wrong or broken, and if you had a talossa.com email address, especially an official or mission-critical one, that needs restoring.

Anything else you (Lüc) need to do before we're 100% back up and running?
Yes, some customizations are missing, such as the favicons (no biggie) and Tapatalk integration. Also, as mentioned, email redirects need to be restored. All shall be done later today. That said, you are safe to use all of our services. Even if we can restore additional missing content, we should be able to plug it back in without disrupting fresh data coming in from today onwards.



Thank you to everyone who helped restore service in one way or another. This happened in the middle of a government transition, no less, so for some of them it wasn't even their job anymore. Thank you to Txec and Miestră for interfacing with DoRoyal, Dan T for helping with the actual recovery job, and Breneir for the public-facing updates.
Sir Lüc da Schir, UrB
Directeur Sportif, Gordon Hiatus Support Team

In my free time:
Túischac'h dal Cosă / Speaker of the Cosa
Wittmeister & Permanent Secretary of Backend Admin / Secretar Parmanint per l'Aðmistraziun del Backend
Deputy Scribe of Abbavilla / Distain Grefieir d'Abbavillă

Sir Lüc

Please use this thread to discuss the outage and as a Q&A / error reporting thread of sorts as well. I'll do my best to reply and fix anything that needs fixing.

Sir Txec dal Nordselvă, UrB

I've spoken with our hosting provider who is very upset this all happened. Because things were still mostly working, he (and we) had no idea of the coming calamity. The small-ish problems we started having were symptoms, but Steven at DoRoyal was working hard with us to figure out the problem. It wasn't until the 1st that he discovered the issue, and by then it was too late. I'm not going to place blame with DoRoyal as it looks like the real problem lies with a vendor used by DoRoyal.

His offer of indefinite free service is very generous and we should seriously consider it. We might also look into a way to do our own backups for the future. I'm not sure this is possible, but I do believe Witt backups can be done.
Sir Txec Róibeard dal Nordselvă, UrB, GST, O.SPM, SMM
Secretár d'Estat
Guaír del Sabor Talossan
The Squirrel Viceroy of Arms, The Rouge Elephant Herald, RTCoA
Cunstaval da Vuode
Justice Emeritus of the Uppermost Cort
Former Seneschal

Marcel Eðo Pairescu Tafial, UrGP

I can't upload my profile picture. It gives me an error stating the avatars directory is not writable.
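
A small diagnostic sketch for this kind of error, assuming shell access to the server. The function name is hypothetical, and the result only means something when the script runs as the same user as the web server, since that is the account that needs write permission.

```python
import tempfile
from pathlib import Path


def check_writable(directory: Path) -> str:
    """Report whether the current user can create files in `directory`."""
    if not directory.exists():
        return "missing"
    if not directory.is_dir():
        return "not a directory"
    try:
        # Actually create (and automatically delete) a temporary file,
        # which is more reliable than only inspecting permission bits.
        with tempfile.NamedTemporaryFile(dir=directory):
            pass
    except OSError:
        return "not writable"
    return "ok"
```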
Editing posts is my thing. My bad.
Feel free to PM me if you have a Glheþ translation request!

Sir Txec dal Nordselvă, UrB

I would hope the new government doesn't try to place blame for this situation on Miestrâ as outgoing MinTech. There were two minor things that were kinda weird but seemed more like glitches than real problems. Both were reported, I believe (one minor outage, and Lüc having trouble signing in to the backend, which at the time was thought to be due to Gmail and not DoRoyal). Neither Miestrâ, Lüc, nor I could have known that these were symptoms of a larger problem. Heck, even DoRoyal didn't know.

Miestrâ worked with Lüc and DoRoyal to find a way to get Lüc access and found a path forward. After the new MinTech contacted me about gaining access I added him in and from what I understand he had no problems accessing the backend.

As soon as the March 1 outage became more than just a small one which happens from time to time on even the biggest websites, I submitted a ticket to DoRoyal to find out what was going on. It was then that the major issue was discovered. The fact that it coincided with the 1st day of March and essentially the new government is mere coincidence.

Breneir Tzaracomprada

Quote from: Sir Txec dal Nordselvă, UrB on March 03, 2023, 10:57:59 AM
I would hope the new government doesn't try to place blame for this situation on Miestrâ as outgoing MinTech. There were two minor things that were kinda weird but seemed more like glitches than real problems. Both were reported, I believe (one minor outage, and Lüc having trouble signing in to the backend, which at the time was thought to be due to Gmail and not DoRoyal). Neither Miestrâ, Lüc, nor I could have known that these were symptoms of a larger problem. Heck, even DoRoyal didn't know.

Miestrâ worked with Lüc and DoRoyal to find a way to get Lüc access and found a path forward. After the new MinTech contacted me about gaining access I added him in and from what I understand he had no problems accessing the backend.

As soon as the March 1 outage became more than just a small one which happens from time to time on even the biggest websites, I submitted a ticket to DoRoyal to find out what was going on. It was then that the major issue was discovered. The fact that it coincided with the 1st day of March and essentially the new government is mere coincidence.

Why did you see a need to make this post? @Sir Txec dal Nordselvă, UrB
There has been no public commentary blaming anyone...


Distain, MC
Fighting the good fight

Sir Txec dal Nordselvă, UrB

My concern is that in the aftermath of the great Witt-out, fingers could get pointed. This all happening during the transition was so weird and even though I can't foresee anyone actually ascribing blame, I also have seen in macronational life that when something goes wrong, someone always gets the blame. I wanted to make sure to be as transparent as possible so that the country at large knew what happened.

Sir Txec dal Nordselvă, UrB

I saw a comment on Discord that said you were frustrated and angry. I just reread it and saw it wasn't directed at anyone. My apologies.

Breneir Tzaracomprada

Quote from: Sir Txec dal Nordselvă, UrB on March 03, 2023, 11:10:04 AM
I saw a comment on Discord that said you were frustrated and angry. I just reread it and saw it wasn't directed at anyone. My apologies.

Yes, Txec, this is happening during my time as Seneschal and I feel the responsibility to do as much as I can to assist in its resolution, and to help those who have more capabilities or access than I do.
My inability to be more helpful and my lack of access to information did make me angry and frustrated.

But I am aware of the vital work you, Luc, Miestra, and Danihel performed during the crisis.



Sir Txec dal Nordselvă, UrB

The last few days have been frustrating and I appreciate your support. I definitely am thankful for Lüc!!

Sir Lüc

Quote from: Lüc on March 03, 2023, 09:32:34 AM
Anything else you (Lüc) need to do before we're 100% back up and running?
Yes, some customizations are missing, such as the favicons (no biggie) and Tapatalk integration. Also, as mentioned, email redirects need to be restored. All shall be done later today.

Done, I think. Do keep reporting issues on this thread as you find them, though!

Sir Lüc

So, re. the offer of free hosting:

Quote from: Sir Txec dal Nordselvă, UrB on March 03, 2023, 09:44:16 AM
I've spoken with our hosting provider who is very upset this all happened. Because things were still mostly working, he (and we) had no idea of the coming calamity. The small-ish problems we started having were symptoms, but Steven at DoRoyal was working hard with us to figure out the problem. It wasn't until the 1st that he discovered the issue, and by then it was too late. I'm not going to place blame with DoRoyal as it looks like the real problem lies with a vendor used by DoRoyal.

His offer of indefinite free service is very generous and we should seriously consider it.

I tend to agree with Txec's assessment here, and I'm the cleanup guy. DoRoyal has always been pretty quick and helpful in responding to tickets; I'm not sure we can get better support elsewhere, and free hosting is kinda unbeatable.

It's not my call to make, ultimately, and I'll assist the government with implementing whatever they decide.

Quote:
We might also look into a way to do our own backups for the future. I'm not sure this is possible, but I do believe Witt backups can be done.

We did do our own weekly backups, and it didn't help, because those are on the old server we were migrated off of. Meanwhile, the ones on this new server were useless because they were all snapshots of the migrated copy on Feb 14.

Unfortunately, I think GV is the only one who ever used backup.wittenberg.talossa.com to grab backups and save them locally, and unless he happened to do so after Feb 19, the "old server backups" are probably gone.
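
To make "grab backups and save them locally" concrete, here is a rough sketch of an off-server copy routine. It is an illustration under assumptions, not a description of our setup: the download URL in the usage comment, the `witt-<date>.bak` file naming and the retention count are all hypothetical.

```python
import datetime as dt
import urllib.request
from pathlib import Path


def fetch(url: str, timeout: float = 60.0) -> bytes:
    """Download a backup file and return its raw bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def save_dated_copy(data: bytes, dest_dir: Path, day: dt.date, keep: int = 8) -> Path:
    """Store `data` under a dated name and keep only the newest `keep` copies."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / f"witt-{day.isoformat()}.bak"  # hypothetical naming
    target.write_bytes(data)
    # ISO dates sort chronologically, so name order is age order.
    copies = sorted(dest_dir.glob("witt-*.bak"))
    for old in copies[:-keep]:
        old.unlink()
    return target


# Hypothetical usage, with a placeholder URL:
# save_dated_copy(fetch("https://backup.example/latest.bak"),
#                 Path("witt-backups"), dt.date.today())
```

The point of the design is that the copies live on a machine the host does not control, so a failure like this one cannot take the site and its backups down together.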

mximo

Thanks to all the people who worked on this...

I hate servers when they do shit like this...

This is why I'm not an IT guy

Mximo
Mximo Carbonèl
Florencia Senator

Baron Alexandreu Davinescu

At this point, we can probably unsticky this and remove the notice at the top of the board, I think?
Alexandreu Davinescu, Baron Davinescu del Vilatx Freiric del Vilatx Freiric es Guaír del Sabor Talossan


Bitter struggles deform their participants in subtle, complicated ways. ― Zadie Smith
Revolution is an art that I pursue rather than a goal I expect to achieve. ― Robert Heinlein
