OpenAI “is heavily GPU limited at present.”
In the last post we discussed how Nvidia joined the trillion-dollar market-cap club, driven by the insatiable demand for its Graphics Processing Unit (GPU) chips fueling the global rush to AI-enable everything possible online.
What started a couple of decades ago as supporting chips for playing video games has now become the fuel that powers Large Language Model AI (LLM AI) and Generative AI (GAI), and of course the hugely popular ChatGPT app by OpenAI.
With high-end GPU chips from Nvidia (NVDA) priced in the tens of thousands of dollars each, AI infrastructure companies are buying them by the tens of thousands of units.
Elon Musk boasted he had secured over 30,000 of Nvidia's near-top-of-the-line H100 GPU chips, which go for over $30,000 each, for his new AI startup X.AI, which he expects to go up against the likes of OpenAI, Microsoft (MSFT), Google (GOOG), and Meta (META). Already, Meta and Microsoft are amongst the largest buyers of Nvidia's GPUs this year (Meta likely #1, so we know there's a lot of AI-augmented stuff coming in Facebook, Instagram, WhatsApp, and Messenger).
That's why it was surprising to hear this from OpenAI founder/CEO Sam Altman. In a new interview at a developer event in the UK this week, he emphasized how GPU availability is gating OpenAI's plans for this year and beyond. An excerpt from the interview:
Source: Humanloop
“A common theme that came up throughout the discussion was that currently OpenAI is extremely GPU-limited and this is delaying a lot of their short-term plans.
The biggest customer complaint was about the reliability and speed of the API. Sam acknowledged their concern and explained that most of the issue was a result of GPU shortages.”
For more convenient, non-TL;DR reading, here are some summary items from a useful Twitter thread by Alvaro Cintas:
“In a recent interview, Sam Altman discussed with Raza Habib the future of OpenAI. They talked about:
- GPU Issues
- Longer Context Windows
- Multimodality
- Regulation and Open Source.
1. GPU Issues:
OpenAI has a lot of short-term goals, but they are being delayed due to GPU shortages. One of these goals is Multimodality.
2. Plans for 2023
Their top priority is to lower the cost and increase the speed of GPT-4.
Other objectives include:
- Longer context windows, with the possibility of 1 million tokens.”
MKP Note: Anthropic announced support for 100,000-token context windows just a few days ago. Memory is the key next improvement area in LLM AIs, as noted in my recent post, “AI, Memory of a Goldfish”.
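Why do longer context windows run straight into the GPU shortage? A big reason is that serving a long context eats GPU memory: during generation, the model caches a key and value vector per token, per layer (the "KV cache"), so memory grows linearly with context length. A rough back-of-the-envelope sketch, using hypothetical GPT-4-scale layer/width numbers (the real figures are not public) and ignoring optimizations like quantization or grouped attention:

```python
def kv_cache_bytes(tokens, layers=96, hidden=12288, bytes_per_value=2):
    """Rough KV-cache size for a decoder-only transformer.

    Per token, per layer, the model caches one key vector and one
    value vector of size `hidden`, at `bytes_per_value` (fp16 = 2).
    All parameter values here are illustrative assumptions.
    """
    return tokens * layers * 2 * hidden * bytes_per_value

for n in (8_000, 100_000, 1_000_000):
    gib = kv_cache_bytes(n) / 2**30
    print(f"{n:>9,} tokens -> ~{gib:,.0f} GiB of KV cache")
```

Under these toy assumptions, 8,000 tokens already needs tens of GiB of cache, and a 1-million-token window would need terabytes spread across many GPUs, which is why "longer context windows" and "more GPUs" are the same wish list item.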
“- Finetuning API to help developers
- A stateful API so the API remembers the conversation
MKP Note: MORE MEMORY FUNCTIONALITY NEEDED. ADDRESSED IN FUTURE POSTS. PUT A PIN HERE.
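Why a "stateful API" matters: today's chat APIs are stateless, so the model forgets everything between calls, and developers have to resend the accumulated conversation history on every request (paying for those tokens each time). A minimal sketch of that client-side workaround, where `call_model` is a hypothetical stand-in for a real chat API call:

```python
class Conversation:
    """Client-side conversation state: since the API itself remembers
    nothing, we resend the full message history on every call."""

    def __init__(self, system_prompt):
        self.messages = [{"role": "system", "content": system_prompt}]

    def ask(self, user_text, call_model):
        # call_model is a hypothetical stand-in for a real chat API call
        self.messages.append({"role": "user", "content": user_text})
        reply = call_model(self.messages)
        self.messages.append({"role": "assistant", "content": reply})
        return reply

# Toy stand-in "model" that just reports how much history it was sent
echo = lambda msgs: f"(model saw {len(msgs)} messages)"

convo = Conversation("You are a helpful assistant.")
convo.ask("Hello", echo)
print(convo.ask("Remember me?", echo))  # the history grows every turn
```

A stateful API would move that ever-growing `messages` list server-side, so clients stop re-uploading (and re-paying for) the whole transcript on each turn.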
3. Multimodality
This was not such great news. Due to the GPU shortage, multimodality might not arrive until 2024.
Remember how exciting it was when @gdb did this demo? We might need to wait for this a bit longer.
4. Regulation and Open Source
Sam expressed interest in regulation of future models, but not current ones, as he doesn’t think existing models are dangerous.
He is also considering open-sourcing GPT-3!”
Emphasizing the scramble to secure GPU datacenter capacity this year, Microsoft announced billions in advance orders with CoreWeave, a startup data center company in which Nvidia has an investment. The scramble for GPUs is the order of the day, all year in this case.
Bottom line: this key item, GPU GATING, applies to OpenAI and all other AI companies for this year and at least some part of 2024.
In the world of AI for now, it is indeed an “It was the best of times, it was the worst of times” moment. But in the long run, the good thing about chips, whether from silicon or spuds, is that “they’ll make more”. It’ll just take a little more time. Stay tuned.