Site News: December 1st Outage

by Ryan Smith on 12/1/2022 6:00 PM EST

31 Comments

  • ballsystemlord - Thursday, December 1, 2022 - link

    I re-entered my latest comments. The others should restore fine with the rest of the site.
    Thanks for telling us.
  • GreenReaper - Thursday, December 1, 2022 - link

    Someone must've grabbed a Cyber Monday deal a year or two ago and forgot to renew...

    Sorry you've had such a stressful start to December, hope you have a good rest of the month and end up with a nice present under the tree to make up for it.
  • linuxgeex - Thursday, December 1, 2022 - link

    @Ryan Smith: Time to add a witness log to your off-site recovery plan, so that you can replay the incremental changes between those backups.
  • at_clucks - Friday, December 2, 2022 - link

    Also a good time to plan for testing the procedures periodically (even yearly). No point in taking backups if you don't know if the restore works, how long it takes, how to do it in a safe, effective manner, etc. Once per year do a restore on a separate site and you get to see what goes wrong and put it in the procedure.
  • Threska - Thursday, December 1, 2022 - link

    "Articles will be back, but we’ve likely lost any comments and user account registrations/updates made since midday Friday."

    Time to hack our caches and get them back.
  • Silver5urfer - Thursday, December 1, 2022 - link

    Good to hear, thanks for the explanation. I was worried about what could have happened.
  • GeoffreyA - Friday, December 2, 2022 - link

    Thanks. Glad the recovery plan worked. Maybe this could be an opportunity for an article on backups.
  • Threska - Friday, December 2, 2022 - link

    Connect storage to the torrent networks and give everything saucy names. Soon you'll be backed up.
  • GeoffreyA - Friday, December 2, 2022 - link

    Indeed, little chance of failure there!
  • James5mith - Friday, December 2, 2022 - link

    Onsite cloud storage?

    Can you elaborate on what that means?
  • linuxgeex - Sunday, December 4, 2022 - link

    Properly, it would mean an on-prem AWS-compatible storage engine or the like. But instead they're probably talking about taking snapshots of VMs, and learned the hard way that such snapshots corrupt databases badly.
  • dontlistentome - Friday, December 2, 2022 - link

    I remember when my colleague took out the company's website and community by deleting all the content (shift-delete is dangerous!).

    Our desks were right outside the CEO's office. That was a fun 30 minutes whilst we restored the backup from (luckily) 3 hours earlier that morning whilst making excuses about an issue with the hosting. We lost a little bit of content, but yeah, backups.
    Do them, and don't assume cloud providers are doing it or testing them for you.
  • paulwatsonjr@aim.com - Friday, December 2, 2022 - link

    reminds me of the old saying: Untested backups are nothing more than a false sense of security...
  • schizoide - Friday, December 2, 2022 - link

    I would find a bit more detail to be interesting.

    Was this logical or physical corruption? If physical, did you have a DR site? If not, that's the obvious next step.

    Actually, if it was logical, I set up DR sites replicated with a delay of several hours, so that works too.
  • iq100 - Friday, December 2, 2022 - link

    I want Anandtech to provide a detailed analysis of how they had a data-losing OUTAGE.

    What was the actual hardware that failed?
    What was the actual software that failed?

    After providing the analysis, Anandtech should put on their technical hat and offer a hardware/software design that cannot suffer a data loss.

    A design that that has no single point of failure.
    Or state there is no such design.

    I understand that this may take some time.
    BUT PLEASE POST NOW/IMMEDIATELY THAT ANANDTECH WILL DO THIS, AND AN ESTIMATE DATE FOR PROVIDING SUCH ANALYSIS. Thanks!
  • GeoffreyA - Friday, December 2, 2022 - link

    "A design that that has no single point of failure.
    Or state there is no such design."

    I think it's fair to say the latter wins the prize.
  • Threska - Friday, December 2, 2022 - link

    Only nation-states could afford it.
  • iq100 - Friday, December 2, 2022 - link

    oops .. duplicated some text, and NO WAY ON ANANDTECH TO EDIT (with history of edits), or even delete and create new post. Another Anandtech design issue?
  • GeoffreyA - Saturday, December 3, 2022 - link

    In the future, conceivably, there'll be storage off-Earth. Because, come to think of it, if Earth were wiped out, all our data would be lost, including our art, classical buildings, etc.
  • ballsystemlord - Sunday, December 4, 2022 - link

    That's right! An alien species might be interested in us after we are gone. We need to leave something for them. ;)
  • iq100 - Friday, December 2, 2022 - link

    GeoffreyA wrote:
    "A design that that has no single point of failure.
    Or state there is no such design."

    I think it's fair to say the latter wins the prize.
    ---
    Tandem Computers, long ago, had such a design. Not even expensive with today's inexpensive servers.

    reference: https://en.wikipedia.org/wiki/Tandem_Computers
    "Tandem's NonStop systems use a number of independent identical processors and redundant storage devices and controllers to provide automatic high-speed "failover" in the case of a hardware or software failure. To contain the scope of failures and of corrupted data, these multi-computer systems have no shared central components, not even main memory. Conventional multi-computer systems all use shared memories and work directly on shared data objects. Instead, NonStop processors cooperate by exchanging messages across a reliable fabric, and software takes periodic snapshots for possible rollback of program memory state."

    What do Anandtech and everyone else think? Is it possible? Write up the design here.
  • GeoffreyA - Saturday, December 3, 2022 - link

    I think it's a brilliant design, this extreme redundancy in the spirit of distribution.

    (I get the feeling that even the universe's "data structures" keep track of things in a distributed fashion. When reading current theories, one gets the impression that nothing is global, but the consistent state is built up piece by piece. Perhaps the key is message transfer, rather than storage in some "big table!")
  • The Von Matrices - Friday, December 2, 2022 - link

    If the only data loss is a few of the most recent comments, I would call that a success.

    There is always a tradeoff between what you are willing to lose and how much you are willing to pay to avoid loss. You certainly can design a system that cannot suffer a data loss event, especially on a news site where there isn't much data being generated, but whether the company can afford such a system is another issue. News, especially online news, is an extremely low-profit business.
  • Dug - Tuesday, December 6, 2022 - link

    Why? Are you paying their salaries? It's just a website that went down and came back up, it's not a big deal. The "hardware/software design that cannot suffer a data loss. A design that that has no single point of failure" does not exist.
  • The Von Matrices - Friday, December 2, 2022 - link

    When I saw the number of comments on the "Best CPUs" article decrease, I thought it was due to the release of the long-awaited comment editor. Alas, it was only data corruption. We can only dream...
  • Threska - Saturday, December 3, 2022 - link

    Well that "sponsored post" article lost a lot of comments.
  • ballsystemlord - Sunday, December 4, 2022 - link

    Hopefully only the shill posts (ha ha).
  • fervloka - Saturday, December 3, 2022 - link

    All I want to know is why Anandtech is apparently unaware that AMD launched Genoa.
  • DigitalFreak - Saturday, December 3, 2022 - link

    With the frequency articles are posted, was anything really lost?
  • supdawgwtfd - Sunday, December 4, 2022 - link

    +500
  • RedGreenBlue - Monday, December 5, 2022 - link

    I’m just glad the oldest articles weren’t lost, although I guess the Wayback Machine would still have some or most of them. Anandtech is priceless.
