jeffrallen 3 months ago

> in late 2017, ML cycles surpassed non-ML cycles in the fleet for the first time

Wow, I was not expecting that. TPUs were only introduced 2 years before.

jauntywundrkind 3 months ago

Really nice having this all written down, having some chronology & epochs laid out to talk about.

A couple of sections after that stirred me a bit. They're framed as hardware challenges, as data-center challenges, but I feel like this all applies double to the software, to what glues the system together.

> Technology islands and industry ecosystems

> The initial success of WSCs was driven by being different, out of necessity reinventing many of the then-conventional approaches to system design. However, as WSCs have scaled, and their adoption has increased via public cloud providers, a broad industry ecosystem supports WSC use cases, allowing "build vs buy" decisions.

> Custom designs work best when they target some unique needs of WSC workloads or systems that are currently not satisfied cost-effectively by existing solutions in the market (for example, the design of TPU accelerators for WSC machine learning workloads). For more mature markets, however, volume economics often reduce costs and increase velocity, favoring products built on top of industry standards (for example, server form factors). Focus on building modular, composable, and interoperable architectures built on standardized interfaces; without this focus on composability and standards, you may end up on a "tech island" unique to yourself because one custom component forces all others to be custom too. In many ways, this is the hardware equivalent of the monoliths-vs-microservices tradeoff.

And,

> Optimizing the time variable of Moore’s law: agility, modularity, and interoperability

> The traditional formulation of Moore’s law (performance doubles every two years for the same cost) typically focuses on three variables: performance, cost, and time. As performance and cost improvements start slowing down, focusing on the time variable — the velocity of hardware development — can be a good way to optimize the "area-under-the-curve" for continued improvements. Smaller incremental benefits, at finer granularities, when compounded, can still achieve exponential benefits.

> To achieve such agile, faster improvements, we need to build more modular hardware platforms with appropriate investment in interfaces, standards, etc. Chiplets in particular allow us to co-design in a multi-die system context, allowing cost advantages from die geometries, but also mix-and-match integration across heterogeneous IP blocks and different process technologies.

> The emergence of open source hardware is another particularly exciting development in this context and enables a more collaborative ecosystem that hardware designers can build on — open-source IP blocks (e.g., Caliptra root of trust), verification and testing suites (e.g., CHIPS Alliance, OpenCompute), and even open source tools/PDKs (e.g., OpenROAD). Given how profound the impact of open source software has been on WSCs, the opportunity to have a similar impact with open source hardware is significant.

The discussion on roofshotting, on mild 1.3x-2x improvements done repeatedly and on revisiting & reapplying old successes, I think dovetails with the discussion of modularity. Finding patterns that are broad & reapplicable across domains is a huge win. Kubernetes, for example, keeps getting compared to Docker Compose. But Docker Compose is good for assembling a set of containers, whereas Kubernetes is a set of management/manufacturing patterns that happen to include containers. There's platform modularity in scoping your systems layer bigger, in reusing the wins.
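
To put rough numbers on that roofshot/compounding idea (purely illustrative figures of my own, not from the article), a quick Python sketch:

    # Illustrative only: repeated modest "roofshot" gains compound multiplicatively.
    yearly_gain = 1.3   # assume one 1.3x improvement shipped per year
    years = 4
    total = yearly_gain ** years
    print(f"{years} years of {yearly_gain}x/year -> {total:.2f}x overall")  # ~2.86x

A steady stream of 1.3x wins, shipped quickly, keeps the curve exponential even when no single win looks dramatic, which is the "optimize the time variable" argument in a nutshell.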

I am very hopeful we see compatibility across datacenters start to emerge. CXL as a very fast fabric interconnect is exciting. The Ultra Ethernet Consortium borrowing some RDMA-style wins is promising. Hopefully we see industry players arise & serve this market, building a competitive and rich supply-side ecosystem that data center builders can keep extracting value from. Right now the market feels early & boutique; getting chiplets and interconnects back into the bread and butter of chip making would help drive innovation upwards.

pulse7 3 months ago

With more and more cores and RAM, soon we will not need warehouse-scale anymore... everything will be on a single server again, for almost all needs... one can get servers with more than 10TB of RAM, and soon we will have more than 1000 cores...

  • throw0101d 3 months ago

    > Wirth's law is an adage on computer performance which states that software is getting slower more rapidly than hardware is becoming faster.

    * https://en.wikipedia.org/wiki/Wirth%27s_law

    • epicureanideal 3 months ago

      There's probably an S curve on this, right? At some point the super-high-level languages and frameworks that we use to conveniently create software quickly, at the cost of poor efficiency, would have to peak, right?

      • Dylan16807 3 months ago

        Only if there's a moderate amount of effort put into optimization. Without it, it's easy to use bad algorithms or bad implementations and there is no limit on slowness.

    • CyberDildonics 3 months ago

      This is true for consumer software, but scientific software has actually gradually evolved and gotten faster on a consistent basis (according to Bill Joy).

  • jeffbee 3 months ago

    You can get a 480-core x86 server with 32TB of RAM right now if you want, but I have a hard time seeing the utility of such a thing. Having one giant thing that costs $500k and fails as one unit never seemed terribly convenient to me.

    • inhumantsar 3 months ago

      I'd guess that's for shops that are severely space- and power-constrained relative to their needs and/or need to move around immense amounts of compute.

      e.g. sports media that need to ship container-offices around the world three times a year. Two of those, one as a cold spare, running immutable OS images, a honking big disk array, and thin clients or laptops. It simplifies HVAC, power, noise control, and networking requirements, plus it's only one system to test, validate, and secure, vs. a bunch of workstations.

    • riku_iki 3 months ago

      The utility is intense offline data crunching, which will be much more cost-efficient and less complex than an equivalent cloud-based or self-hosted distributed setup.

  • wakawaka28 3 months ago

    You're assuming they won't find stuff to do with more computing power. Based on how software has bloated over time, even machines like you're talking about will be used for simple stuff.

  • Animats 3 months ago

    Speed-of-light lag and contention at the memory crossbar get you at some point.

  • threeseed 3 months ago

    No it won't.

    Customers demand perfect uptime and companies are compelled to deliver it.

    That means you need a highly available architecture, which requires multiple servers across multiple disparate data centres, especially when it's hosted in the cloud.