NVDA and SMCI making xAI a thing in 122 days
Posted on 10/29/24 at 2:52 pm
The downturn of NVDA and death of SMCI are just not a thing. If there is one thing we can all agree on, Elon Musk has a way of making good decisions.
The largest supercomputer in the world was just assembled in 122 days.
quote:
The GPU servers are Nvidia HGX H100s, a server solution containing eight H100 GPUs each. The HGX H100 platform is packaged inside Supermicro's 4U Universal GPU Liquid Cooled system, providing easy hot-swappable liquid cooling to each GPU. These servers are loaded inside racks which hold eight servers each, making 64 GPUs per rack. 1U manifolds are sandwiched between each HGX H100, providing the liquid cooling the servers need. At the bottom of each rack is another Supermicro 4U unit, this time with a redundant pump system and rack monitoring system.
https://www.tomshardware.com/desktops/servers/first-in-depth-look-at-elon-musks-100-000-gpu-ai-cluster-xai-colossus-reveals-its-secrets
All this to train Grok 3. I expect the "X" family to closely track OpenAI's advancements with its own. It will be a race to see which entity creates the next step, "Agent AI," and then eventually the mythical AGI.
Negative version: we are probably still heading toward the peak of inflated expectations until AGI is hit. I do not think the downward side will be technology-focused, but the pending loss of millions and millions of jobs. UBI will probably have to become a reality, but that is a deeper conversation for later.
Posted on 10/29/24 at 5:04 pm to DarthRebel
100k GPUs at 64 GPUs/rack = 1,562.5 racks.
I just looked up the SMCI 4U GPU box and it looks like it accepts SXM cards.
The H100 SXM appears to be a 700W part? Damn, that's hot.
700W x 100k = 70MW just for the GPUs. That doesn't include the cooling infrastructure, storage, or networking gear.
Since this is money talk, I'm not sure if I'm more impressed with the purchase price or the daily operating expenses.
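For the money-talk angle, a quick sanity check of those figures. The rack and wattage numbers come straight from the posts above; the $0.07/kWh industrial electricity rate is my own assumption, not from the article:

```python
# Back-of-the-envelope check of the numbers above (assumptions:
# 64 GPUs per rack and ~700 W per H100 SXM, as stated in the thread).
gpus = 100_000
gpus_per_rack = 64
watts_per_gpu = 700

racks = gpus / gpus_per_rack               # 1562.5 racks
gpu_power_mw = gpus * watts_per_gpu / 1e6  # 70.0 MW for the GPUs alone

# Rough electricity cost at an assumed $0.07/kWh industrial rate --
# cooling, networking, and storage overhead not included.
daily_cost_usd = gpu_power_mw * 1_000 * 24 * 0.07

print(racks, gpu_power_mw, round(daily_cost_usd))  # 1562.5 70.0 117600
```

So the GPUs alone are on the order of $100k+ per day in electricity before you count cooling and networking, which puts the "purchase price vs. operating expenses" question in perspective.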
Posted on 10/29/24 at 7:37 pm to DarthRebel
This is why we cannot have nice things here: people downvote instead of responding with their disagreement.
Nobody is going to be mad if you actually post a response to what you do not agree with.
Posted on 10/29/24 at 8:20 pm to hob
quote:
Since this is money talk, I'm not sure if I'm more impressed with the purchase price or the daily operating expenses
Well, hold on - this is just the start
https://www.tomshardware.com/pc-components/gpus/elon-musk-is-doubling-the-worlds-largest-ai-gpu-cluster-expanding-colossus-gpu-cluster-to-200-000-soon-has-floated-300-000-in-the-past
quote:
This isn't the hardware endgame for xAI's Colossus hardware expansion, far from it. Musk previously touted a Colossus packing 300,000 Nvidia H200 GPUs throbbing within.
This is great news for Memphis, by the way. Austin, and more specifically Bastrop, TX, are also going to be a center hub for some amazing things.
Going back to NVDA and SMCI though, Tesla will be making some equally large purchases.
https://www.teslarati.com/tesla-elon-musk-first-look-cortex-supercluster-giga-texas-video/
quote:
As could be seen in the video, the Giga Texas-based supercluster spans rows upon rows of Nvidia H100/H200 GPUs. Musk has previously noted on X that Cortex will feature about 100,000 H100/H200 GPUs, which should provide some serious muscle for Tesla’s AI training. “Video of the inside of Cortex today, the giant new AI training supercluster being built at Tesla HQ in Austin to solve real-world AI,” Musk wrote in his post on X.
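Extending the earlier wattage math to the expansion figures quoted above. This assumes ~700 W per GPU, which is the H100 SXM number from upthread; H200 boards are in the same ballpark, but that carry-over is my assumption:

```python
# Rough GPU power draw at the cluster sizes mentioned in the thread:
# Colossus today (100k), the announced doubling (200k), and the floated
# 300k figure. Assumes ~700 W per GPU (H100 SXM; H200 is similar).
watts_per_gpu = 700
for gpus in (100_000, 200_000, 300_000):
    mw = gpus * watts_per_gpu / 1e6
    print(f"{gpus:,} GPUs -> {mw:.0f} MW for the GPUs alone")
```

At 300k GPUs that is roughly 210 MW before cooling and networking, which is small-power-plant territory and part of why the Memphis siting matters.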
Not the time to short AI stock yet