Sidenote: The Insane Sanity of the GeForce RTX 40 Series Graphics Cards

This week, Nvidia CEO Jensen Huang announced the next-generation lineup of GeForce cards, starting with the top three for the launch – an RTX 4090 with 24 GB of VRAM, and two variants of the RTX 4080 with 16 GB and 12 GB of VRAM, respectively. These cards start at eye-watering prices – the 12 GB 4080 is cheapest at $899, with the 16 GB model moving to $1,199 and the 4090 at $1,599. These cards were shown in a lineup with the existing top-end RTX 30-series cards, which seems to make the intent clear.

Nvidia is caught in an interesting spot right now – in some respects it’s the right time to launch a new generation of graphics cards, but at the same time, they have been hit just as they were in the transition from the 10-series to the 20-series – a busted crypto market is dragging their booming sales down and flooding the market with used 30-series cards right as the next gen lines up for its shot. They’re also facing an increasingly competitive AMD, whose Radeon RX 6000 series remained reasonably popular for its run and whose successor lineup is about to be announced and launched in November. Anything they launch now faces intense competition – from within and without.

So what did they do?

The Architecture

The RTX 40 series is built on the architecture codenamed Ada Lovelace, which nearly triples the transistor count of Nvidia’s Ampere architecture, better known as the RTX 30-series. With this new generation, Nvidia is tripling down on ray tracing, with the bulk of the added silicon in the new parts dedicated to new ray tracing hardware. Their promise is that through a combination of new fourth-generation Tensor cores, third-gen RT cores, and a technology called Shader Execution Reordering, which attempts to reorganize shader work in the GPU to increase efficiency, ray tracing performance should increase quite a bit. The reasoning is straightforward – the new RT core designs are said to improve the base light-casting performance of the GPU, the new Tensor core designs increase AI denoising performance, which removes imperfections in the ray tracing from the final image, and SER (I ain’t typing all that again) increases the shading performance in games, an update which touches both traditional rasterized 3D and ray tracing to further increase framerates. There’s also a relatively decent uplift in CUDA cores, the basic unit of compute performance within Nvidia’s GPU architectures, increasing from 10,752 in the RTX 3090 Ti to 16,384 in the RTX 4090. We’ll…uh, get to the 4080 models in a moment on that front.

The crowning achievement on top of all of this is that the maximum boost clock specs on the top-end card go up substantially, from a peak of 1.86 GHz on the 3090 Ti to 2.52 GHz on the 4090. All of this increase in transistor count and speed is due to a substantial die shrink in the manufacturing process, moving from Samsung’s 8nm process to TSMC’s 4nm process, which allows more transistors in a similar die area and increases performance and power efficiency (there’s a scaling here that means some give and take, but it’s way out of scope here). Samsung’s 8nm process was widely regarded as part of the problem with the RTX 30-series cards, which ran quite hot and increased power consumption heavily over the prior generation as a result of silicon inefficiency and the fact that Nvidia cranked clock speeds on the chips quite high for the process, at the cost of steeply rising power consumption. TSMC 4nm brings things back under relative control, such that the expected power usage for a 4090 is around the same 450 watts as the 3090 Ti (although peak and transient spikes are expected to push the card up over 600 W!).
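As a rough sanity check on those specs, peak FP32 throughput can be estimated as CUDA cores × 2 operations per clock (one fused multiply-add) × boost clock. The sketch below uses only the core counts and boost clocks quoted above; the helper name is made up for illustration, and the result is a theoretical ceiling, not a prediction of how games will actually run.

```python
# Back-of-the-envelope peak FP32 throughput from the specs quoted in this post.
# Assumption: each CUDA core retires one fused multiply-add (2 FLOPs) per clock.

def peak_fp32_tflops(cuda_cores: int, boost_ghz: float) -> float:
    """Theoretical peak FP32 throughput in TFLOPS (hypothetical helper)."""
    flops_per_core_per_clock = 2  # one FMA counts as two floating-point ops
    return cuda_cores * flops_per_core_per_clock * boost_ghz / 1000.0

rtx_3090_ti = peak_fp32_tflops(10_752, 1.86)
rtx_4090 = peak_fp32_tflops(16_384, 2.52)
ratio = rtx_4090 / rtx_3090_ti

print(f"RTX 3090 Ti: {rtx_3090_ti:.1f} TFLOPS")
print(f"RTX 4090:    {rtx_4090:.1f} TFLOPS")
print(f"Ratio:       {ratio:.2f}x")
```

On paper that is roughly a doubling of raw shader throughput (about 40 versus about 83 TFLOPS), before any ray tracing or DLSS gains are counted.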

The RTX 4080 is a bit of foul play on Nvidia’s part, however. When you hear 16 and 12 GB memory variants, you surely don’t think of two different GPUs – just two different board configurations with the same brain at the heart of things. Well, you’d be wrong here! Nvidia, not content to market in a straightforward manner, has in fact made two different RTX 4080 chips, with the 16 GB model getting a CUDA core count of 9,728 and the 12 GB model getting only 7,680. For comparison, the RTX 3080 that shipped two years ago had 8,704 such cores, which means that the RTX 4080 with 12 GB of VRAM has fewer CUDA cores than the card that preceded it. Now, this doesn’t mean a lot on the surface – GPU performance is a sensitive topic because the CUDA cores can gain efficiency through architectural tweaks (like SER), and the higher clock speed of the 4080 will likely still push the cheaper model ahead of its older sibling. It is still incredibly deceptive to the consumer, though, as the RTX 4080 with 12 GB of VRAM is not the same chip as the 16 GB variant, but instead what would likely have been called a 4070 if Nvidia were playing nice. I don’t have much doubt that it will outperform 3080s like my own, but this marketing sucks and I sure wish Nvidia hadn’t done it!
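To put the core-count gap in numbers, here is a minimal sketch using only the CUDA core counts quoted above. The dictionary and helper names are made up for this example, and the "break-even" figure assumes performance scales with cores × clock, which ignores any per-core architectural gains.

```python
# CUDA core counts as quoted in this post; the keys are just labels.
CORES = {
    "RTX 3080": 8_704,
    "RTX 4080 16 GB": 9_728,
    "RTX 4080 12 GB": 7_680,
}

def relative_cores(card: str, baseline: str) -> float:
    """Core count of `card` as a fraction of `baseline` (hypothetical helper)."""
    return CORES[card] / CORES[baseline]

for card in ("RTX 4080 16 GB", "RTX 4080 12 GB"):
    delta = (relative_cores(card, "RTX 3080") - 1) * 100
    print(f"{card} vs RTX 3080: {delta:+.1f}% CUDA cores")

# Clock uplift the 12 GB card needs just to match the 3080 on cores x clock,
# ignoring any per-core efficiency improvements such as SER.
breakeven = 1 / relative_cores("RTX 4080 12 GB", "RTX 3080")
print(f"Break-even clock uplift: {breakeven:.2f}x")
```

The 12 GB card also has roughly a fifth fewer cores than the 16 GB card sharing its name, which is the heart of the complaint here.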

So what does all of this mean for performance?

The Performance Claims

Nvidia claims that the RTX 40-series parts deliver between 2x and 4x the performance of the 30-series cards they follow in the lineup, but as usual, their slides are missing a lot of data for comparison, and their claims about the 30-series cards simply did not add up once the cards were in the wild. Will the new cards be good? I think so, sure – mathematically, the raw increases to every facet of performance just kind of have to get there, unless there is a catastrophic driver error or a bottleneck like the relatively tame generational improvement in the memory bundled with the cards. There are way more CUDA cores, way more RT hardware, and clock speeds increased by roughly two-thirds of a gigahertz over what came before, all together at the same time.

The 4090 is obviously, like last-gen’s 3090, the uncrowned king (pending what AMD has to say, of course). It’s big, expensive, and loaded with a near-full Ada Lovelace GPU die – it’ll be a potent card. The 4080 variants are both positioned reasonably well performance-wise, without talking about pricing (we’ll get to that, don’t worry). However, the issue here is that as with the RTX 20-series cards, Nvidia is making a huge bet on ray tracing yet again, and a huge block of that expensive and large GPU die will simply sit idle for most games. RTX-enabled titles are still a slow trickle of releases, and while that pile has grown over the last 4 years of RTX (it’s been four years already? fuck…), it’s pretty likely that most of you have not played an RTX-enabled game. For traditional rasterization 3D, the 40-series should still offer a decent performance uplift – the raw throughput of more CUDA cores, more clock speed, and the SER technology should allow for that. If you play a DLSS-enabled game, using resolution scaling to hit higher performance at a slight fidelity cost, then the new series will be even better.

But then there’s pricing. At a floor of nine hundred US dollars for the sliced-up 12 GB 4080, the value proposition is plainly not there, especially in the middle of a recession. I love my games and I spend a ton of hours a week playing them, but even I, with the crazy system I have, cannot justify getting even that base-level model, even if I believe Nvidia when they say “2-4x faster.” At $1,199, the full-featured RTX 4080 makes even less sense – the 3080 Ti launched at that price around a year or so ago, and that card was derided for being poor value. COVID supply chain impacts on tech products are substantially lessened now and there’s no looming crypto-boom 3.0 (especially now that Bitcoin is really only in the range of ASIC miners and Ethereum has finally switched from proof-of-work to proof-of-stake, meaning that GPU mining there is not viable), so I can only sigh at Nvidia’s unchecked greed here. Obviously, the 4090 is fucking unreal pricing-wise, but I almost cut them slack for that, because it is the halo card and most people don’t buy them – I haven’t been in that market in like 15 years at this point.
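One crude way to compare the three launch prices is dollars per thousand CUDA cores. The sketch below uses only the prices and core counts quoted in this post; the names are made up for the example, and the metric deliberately ignores clocks, memory, and everything else that goes into real-world value, so treat it as a rough illustration rather than a verdict.

```python
# Launch prices (USD) and CUDA core counts as quoted in this post.
CARDS = {
    "RTX 4080 12 GB": (899, 7_680),
    "RTX 4080 16 GB": (1_199, 9_728),
    "RTX 4090": (1_599, 16_384),
}

def dollars_per_thousand_cores(price_usd: float, cuda_cores: int) -> float:
    """Price per 1,000 CUDA cores (hypothetical, deliberately crude metric)."""
    return price_usd / cuda_cores * 1000

per_k = {name: dollars_per_thousand_cores(p, c) for name, (p, c) in CARDS.items()}

for name, value in per_k.items():
    print(f"{name}: ${value:.0f} per 1,000 CUDA cores")
```

By this measure the 16 GB 4080 comes out as the most expensive per core of the three, and the 4090, absurd sticker price and all, the cheapest.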

But this brings up a further problem…

The Absolute Scum-fuckery of Nvidia

Okay, so those prices? They’re for Founders Edition cards, the sleek, undercooled pretty cards that Nvidia sells direct to consumers for a healthy-ish profit margin. Founders Edition cards are…fine, I suppose, in that they work and meet a bare minimum of function for the GPU and memory configuration in question. However, for the true best cards, you need to go to the add-in board partners, who make custom designs with better coolers and beefier power delivery hardware, and can often milk the clock speeds just that much more to keep things going faster. Those cards, the ones most people will want for the sake of performance, will be more expensive.

EVGA, Nvidia’s most stalwart partner, recently quite publicly split with Nvidia and has exited the GPU market because they could see the writing on the wall – board partners don’t get the MSRP of the Founders Edition cards until we do, and so they’re often spending months designing custom cards and coolers without a sense of how they can price the card, which makes that role a very low-margin business. EVGA had a unique problem there in that they did not own production lines of their own, but were instead outsourcing to a third party to actually build the cards, which made their margin problems worse compared to companies like Asus or MSI, who own their own PCB manufacturing and can keep the whole process in-house.

For this generation, there will be fewer board options out there, with markups being reasonably high compared to the Founders Edition MSRP, because these new cards need so much more than prior generations in terms of cooling and power delivery. While there have been rumors that the GPU design is pin-compatible with the RTX 3090 Ti boards (meaning you could drop an RTX 4090 chip right into an existing 3090 Ti design and it would just…work), the design and engineering work required to validate that and ensure proper performance and cooling is still immense, and a lot of consumers will (rightly) expect updated product lines with newer, better tech, especially when it comes to cooling. I fully expect board partner 4090 cards to start near $2,000, with the best overclocked and cooled variants hitting $2,500 or even $3,000.

Nvidia is also playing with software features, however. Nvidia announced DLSS 3.0 at the event as well, a new generation of their resolution scaling feature that is intended to increase performance by using motion vectors and machine-learning training to render a frame at lower resolution but then scale it up to a higher display resolution without losing quality. This tech leans on the AI-accelerating Tensor Cores in the RTX GPUs. However, DLSS 3.0 is only going to work on 40-series cards, because of a claimed “optical flow accelerator” in the new architecture. Now, I won’t fully claim that this is bullshit, but it sure feels like it. The technical description of this process sounds like a pretty standard computational workflow that could be programmed onto existing hardware just as well – there’s nothing in the presentation that screams of custom-built hardware. In fact, the entire point of DLSS is that it saves performance on existing hardware using AI and ML tech, which is present in all RTX cards. This tech is being held up as a thing you can only get on the 40-series, so buy today!

The only problem? In addition to it smelling of bullshit, it also only supports 35 games at launch. As with any Nvidia tech rollout, the titles vary from marquee-level games (Cyberpunk 2077, Microsoft Flight Simulator, a generic entry for Unreal Engine 4 and 5) to middling (bigot JK Rowling’s elf slavery simulator…err, I mean Hogwarts Legacy) to unheard of (can anyone tell me what PERISH or Scathe are, please?). If the Unreal Engine support is native to the engine with no patching or coding needed, that might be a sleeper feature, but otherwise…not a great start! At the same time, though, DLSS 1.0 on RTX 20-series was a flop at first and took a year or so to get its legs underneath it, so hey, maybe that is the trajectory here!

But Nvidia is also not done with the 30-series, as they quite plainly listed the 30-series as a part of the “family” of cards, with an adjusted price for the RTX 3060 to bring people into the fold. Nvidia has quite publicly had issues with a glut of production, as the crypto-boom 2.0 gave them cause to order a massive number of GPUs from foundry partner Samsung, only for those to languish in storage as crypto crashed and prices have started to steadily decline as supply has stabilized. It seems the goal here is pretty clear – by pricing the 40-series far out of reach of most gamers, you force them to instead look at the existing 30-series stock Nvidia would really like for us to buy up first, because selling them cheaply is still better than having pissed-off board partners pestering Nvidia to buy back the chips they oversupplied. The 40-series cards are said to have been in production at this point for months – since late last year there were leaks about pin-compatible GPU dies, new board designs for the 3090 Ti being used as a way to get traction for 4090 manufacturing, and the like. A lot of the timing of this announcement and launch is also strategic, trying to get people to buy 30-series stock before the announcement and now using the MSRP as a wedge to push people towards those cards in yet another way. Nvidia not even announcing a 4070 (well, at least a product named that, because obviously that’s what the 12 GB “4080” would be in a just world) is also a harbinger of this – as with each generation, we’re going to see Nvidia slow roll each new GPU model down the stack to increase sales of old parts.

So basically, this generation is incredibly marked up by design (and, I would suggest, only temporarily), and it is using features withheld from prior, otherwise working cards as a means of helping those sales. Big yikes here.

My Take/Conclusion

For me, the 40-series seems cool technically. As a nerd who loves that kind of shit, seeing big increases in CUDA cores, RT hardware, and the like is exciting. It’s new! However, as a consumer, this is a big wait-and-see from me. I’m sure the new cards will outperform their predecessors and be impressive parts – not nearly to the extent that Nvidia believes (especially since they love using DLSS for their slides and comparing to lower-tier DLSS tech or non-DLSS numbers), but a solid increase nonetheless. The biggest implication to me as a 3D modeler and renderer is that if solid software support comes to leverage that massive increase in RT hardware for Blender or other 3D rendering solutions, the 4090 could be a solid investment on that front.

For gamers though, the generation feels pretty bad right now. Withholding DLSS 3.0 from prior cards seems petty and purposeful, the price increase in current market conditions is fucking ridiculous, and Nvidia’s hype machine just says whatever the hell it wants and then independent reviewers prove it wrong. And here’s the thing – as I mentioned above, I think this is by design. The cards seem good, and I bet they will be – but all of this is sandbagging and hamstringing, designed to push buyers away from these cards and toward old stock, which will sell through and then be followed by a price drop and mid-gen refresh on the 40-series, at which point it will actually maybe be worth buying. Sure, if you’re a moneybags asshole, you can probably just get one now, and honestly, if I had a sweet fund for a new system in the next month right now, I’d be building a Ryzen 7000 rig with a 4090 and a case big enough for massive watercooling, sure. But for the vast, vast majority of folks, there’s just no reason to buy here yet, and in that way, Nvidia has masterfully designed this launch.

I’ll definitely look at performance numbers when those become available, but for now, the 40-series is just an interesting note on how to do a sabotaging product-launch so you can sell through that old product first, and Nvidia is likely going to reap the rewards of this strategy, transparent though it is.

5 thoughts on “Sidenote: The Insane Sanity of the GeForce RTX 40 Series Graphics Cards”

  1. I found the RTX Remix an interesting strategy. Giving modders — a passionate group — the ability to bring ray tracing to older DX8 and DX9 titles may end up being a powerful move to have ray tracing become more widespread. That could just be me, as I loved Morrowind and would install the mod and putter around in the game just to see all the visual changes.

    If RTX Remix becomes widespread (or viewed as common) that’s definitely going to put pressure on game developers to add ray tracing to their shipping games. “Amateur modders can add it, why am I paying you for a commercial game that doesn’t have it?” becomes an awkward question to answer/avoid.

    1. It’s definitely a thing I wish they had done sooner, for sure! Mod culture on PC is huge and there are plenty of amateur devs who are more than willing to put some time into playing around with lighting tech in older games. I’m curious as to how well it will work with games that use strong baked lighting though, since you’re often not dealing with dynamic or probe lights in those old titles and thus can’t quite influence as much change – or, at least, you’re going to see some fight between the baked lighting and the RT that could be a problem. WoW’s RT implementation is sort of halfway for that reason – adds the feature but doesn’t really change the scene in any zone.

  2. “…and while that pile has grown over the last 4 years of RTX (it’s been four years already? fuck…), it’s pretty likely that most of you have not played an RTX-enabled game.”

    I watched Gamers Nexus’ most recent news post on YouTube, and Steve aptly described it as EVGA getting out and then every other card manufacturer going absolutely bonkers. The sheer size of these cards, taking up 3 and more slot spaces on a PC, are going to strain the physical capability of PCs to simply fit them into a box. (Without breaking the motherboard.) If Intel and AMD get relatively in the ballpark with NVIDIA with maybe 1/2 to 2/3 of the price, they’ll make a killing over NVIDIA. And you just know that if bitcoin starts an upswing again there’ll be yet another run on cards and the gamers won’t have a shot at getting any of them.

    1. A big part of why I moved to watercooling is to not have a giant, motherboard-bending piece of metal in my system. I trust the engineers to do a good job of making bracing systems that work with the expansion slots in the case and motherboard makers who all case their PCIE 16x slots in metal now, but some of those coolers are just obnoxious!

      At least on the plus side, most modern systems need nothing else on the expansion slots besides a GPU, unless you use an internal HDMI capture card or PCIE NVME SSD that slots into one.

  3. This was such a great post. Detailed and yet readable.

    We’ve been thinking about updating my husband’s system for a few months now. It was new in early 2016 and it’s had some upgrades along the way. Last year (or the year before? Sometime during covid…) we had to replace his graphics card and of course the market was awful. Pretty sure we grabbed a 1070 or 1080 because it was the only thing both reasonably priced and available at the time. He’s pretty keen to go with an AMD card as they’re better (or better supported) for Linux and allegedly better for machine learning purposes. I read that they’re bad at RTX but like you say it’s a moot point for the games we play.
