IMO it's more nuanced. They're likely in production ramp-up of the M5 Ultra Mac Studio, for release in the next ~3 months; they have pre-purchased bins of memory from the supply-constrained major memory suppliers; and they need as much as they can get, because they want to push an M5 Ultra config to 768GB to continue the "you can run local models" story that the M5 Max MacBook Pro started telling last week.
Going beyond 512GB to 768GB of memory crosses something of a threshold that will allow Apple to claim local capability for significantly more models. Qwen3-235B, MiniMax M2.5, and GLM 4.7 could kind of run with no quantization in 512GB, but they'll run comfortably in 768GB. DeepSeek-V3.2 and GLM 5 may also work at some level of quantization.
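Back-of-the-envelope, the weights alone tell most of the story. A minimal sketch in Python (the parameter counts and bytes-per-weight figures are my assumptions, and KV cache plus runtime overhead come on top):

    GiB = 1024**3

    # Rough parameter counts, assumed rather than verified:
    models = {
        "Qwen3-235B": 235e9,
        "DeepSeek-class 671B": 671e9,
    }

    def weights_gib(params, bytes_per_param):
        """Memory for the weights alone at a given precision."""
        return params * bytes_per_param / GiB

    for name, params in models.items():
        bf16 = weights_gib(params, 2.0)   # BF16/FP16: 2 bytes per weight
        q4 = weights_gib(params, 0.56)    # ~4.5 bits per weight, Q4_K-style
        print(f"{name}: ~{bf16:.0f} GiB at BF16, ~{q4:.0f} GiB at ~Q4")

Qwen3-235B comes out around 440 GiB at BF16 before any cache, which is exactly why 512GB is a "kind of" fit and 768GB is comfortable; the 671B-class models only come back into range with ~4-bit quants.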
Let's assume a Mac Studio M5 would start at $1999, and that an M5 Max upgrade with 128GB would be about the $1,000 it is now. Then an M5 Ultra with 768GB would be something like $1999 + $1000 + $4000 = $6,999, cheaper than the current top of the line (which will never happen), so I'd presume about the same $10,000.
Or they could finally make the Mac Pro respectable and have it be two M5 Ultra Mac Studios stuck together (or give it NUMA RAM: on-chip + expandable).
Hasn't it been that way for years? Almost all of the people I've seen selling used Mac Pros use them for creating music. I assume the Studio is a better, cheaper option.
Workstations are more than just music, and there are still a few folks who believe Apple will some day release a new Mac Pro that fits their hardware needs, without them having to go to either Windows or Linux.
Apple will be the first company to pioneer a new "work for tokens" program; simply commit yourself to six months of servitude with The Company to pay off your new Mac Studio purchase.
> A powerful Neural Accelerator is built into each GPU core of the M5 family of chips, which dramatically speeds up AI tasks like image generation from diffusion models, large language model (LLM) prompt processing, and on-device transformer model training. [1]
My theory is that they're going to release a new Mac Pro that's about half the size of the current one. Enough space for some PCIe slots, but otherwise smaller given the enormous amount of wasted space in that thing since moving from Intel to Apple Silicon. Guessing the rack-mount model, should they continue selling it, will be 3 or 4u instead of 5u.
I know everyone thinks they're going to just kill it, but I don't see it. Apple's move under Tim Cook has been to exhaust supplies (see: filling the Intel Mac Pro chassis with air and not updating the CPU), letting people predict its death (see: 2013 -> 2019 Mac Pro silence), and then redesigning it into something people want while utilizing it as an opportunity to segment specs across their SKUs.
The Studio will remain the high-powered creator machine, whereas the Mac Pro will be retooled into an AI beast.
The reason people buy the Studio with the high-RAM config is actually the unified memory. This is unique to Apple. I'm not sure what a Mac Pro would do with PCIe cards. They would be useless for AI, because what you want is unified memory that can be used by the GPU/AI hardware, not just plain RAM.
It's not entirely unique to Apple: the Ryzen AI Max platform (in e.g. the Framework Desktop) is a unified memory platform. The PlayStation 5 also has a unified memory architecture (which, given the chiplet was made by AMD, is not too surprising). (People sleep on PlayStation hardware engineering; they're far better at skating to where the puck is headed than most hardware tech companies. Remember Cell?)
> I'm not sure what a Mac Pro would do with PCIe cards.
Video and audio engineers [1] would like to have a word. Not to mention PCIe network cards. And they do use all the slots in the Cheese Grater, although I believe a modern version could have cut those in half.
PCIe cards would indeed be useless for AI unless Apple supports third-party GPUs, but there are certainly some pro creators that would still prefer to have them. I myself work in large-template film/game scoring and while we all love our Mac Studios, they're usually housed in a Sonnet chassis so that we can continue to use PCIe cards. Had Apple kept them in parity with the Studio w/r/t CPU and RAM, the rack-mount version of the Pro would've been a no-brainer.
It is already a walking zombie, Apple clearly no longer cares about the workstation market, regardless of how many "I still believe" t-shirts get sold to wear at WWDC.
They may be trying to sell through the existing stock before a launch (soft or not) of the M5-based versions (though I've heard the rumor is there will be no M5 Ultra and we might be looking at an M6 Ultra later in the year).
This is most likely the case. They ended their production run and had inventory (or so they thought). Now, with the rush for LLM power, they've sold out and no longer have that inventory. This was a surprise to their bottom line AND their supply-chain logistics plan!
I'm sure they wanted to order more but were priced out by the increase in RAM costs. Apple probably decided it wasn't worth it until they revamped the architecture (and put a larger order in this time around).
I'm not a buyer, but I suspect that's what's playing out right now in closed-door meetings.
Regret on a $10k desktop rendered obsolete for purpose (the 512GB of RAM only has so many applications) months later is not a great look. It's good long-term brand value thinking to close the regrets window earlier.
"Rendered obsolete" is doing a lot of work here. It might have been discontinued, but it is still faster than the rest of the line and the only self-contained computer that can handle models that large.
The most I would say is that it was discontinued, and, depending on how things go, it might just be sold out for now pending memory procurement.
Interestingly, the "ultra" Mac Studio released a year ago was based on the older M3, not M4. Apparently, the work to "ultra-fy" a CPU is significant (which makes sense) so there can be a lag.
Not that they have to follow the pattern, but a Mac Studio Ultra released later this year might be based on the M4. Or one based on the M5 might be released a year or more from now.
The M4 Max lacks the UltraFusion interconnect, making an M4 Ultra impossible. We might however see an M5 Ultra, due to the new fusion architecture in the M5 Pro and M5 Max chips (just announced for the latest MacBook Pro), which uses a high-bandwidth die-to-die interconnect to bond two dies into a single unified SoC: similar in concept to UltraFusion, but evolved for better scaling, efficiency, and features like per-GPU-core Neural Accelerators.
Reports and leaks strongly indicate Apple is preparing an M5 Ultra (likely fusing or scaling from the M5 Max using this advanced interconnect tech) for a Mac Studio refresh later in 2026, based on Bloomberg/Mark Gurman and other sources. This would bring back the top-tier "Ultra" option after skipping it entirely for M4.
I suspect that the cost/benefit isn't there. Those who need the "biggest Ultra" will be happy with the previous generation or so, and so they'll refresh that on a 2 or 3 year cycle.
Given that generation gains are not sufficient to make a Max twice as fast as the previous-gen Ultra, a longer cycle is rational. The M3 Ultra is still the fastest M-series system.
The existing ultras are two max dies connected together with TSMC’s CoWoS-S interposer. But as I understand the interposer can have yield issues, so yes — you put two together, but it’s not quite as easy as snapping together legos.
It used to mean that, but the new M5 Pro and M5 Max have separate CPU and GPU chiplets with an interposer, similar to how the previous generation Ultras were based on connecting two Max full dies. So it's unclear whether there will be any Ultra for the M5.
This tells me the Max CPU chiplet has two interfaces to GPU dies. If you can connect two CPU chiplets via the same interface, an M5 Ultra becomes doable by joining two CPU chiplets, each with a GPU chiplet attached.
Ditch the aluminium and go with a copper MacBook Pro. Or silver. If you get it with a terabyte of RAM, the silver shell will be a small part of the total cost.
Argentium 960 would most likely be the best alloy for the job, as it’s a good heat conductor and doesn’t tarnish like pure silver.
There are gaming laptops that come with power bricks rated for higher output than a Mac Studio's power supply. M3 Ultra levels of power dissipation are possible to handle in a laptop, but it wouldn't look much like a MacBook Pro. That kind of gaming laptop typically has four fans (compared to two on a MacBook Pro), and large vents on the sides, bottom, and back of the machine allowing them to move a lot more air through the system.
I think it's unlikely that Apple is paying the spot price for memory. They almost certainly negotiate delivery/price contracts in advance. Maybe the contract for the chips used in the 512GB model will expire soon?
15 years ago I was an intern at Micron and learned they passed on a contract with Apple because Apple insisted on discounts and there wasn't a compelling reason to reduce profit at Micron.
So yeah, Apple probably does pay less. But the market has enough demand that suppliers do say no.
This is actually relevant, because DRAM costs just as much now per Gb as it did 15 years ago (that's controlling for inflation; it's as much as it cost 20 years ago on a pure price basis).
>Apple buys and uses so much RAM across all its product lines that it’s in a better negotiating position than the likes of Framework or Raspberry Pi, but CEO Tim Cook acknowledged in the company’s last earnings call that memory pricing could begin to eat into Apple’s profit margins later this year.
There's also the fact that they were charging $200 to add 8GB of RAM before the prices went up, when that much RAM was something like $70 at retail.
The problem then is that when the supply gets more expensive and you were already charging the maximally-extractive price to customers, they can't eat much more of a price increase, so instead most of it has to come out of margins.
Actually, that's relatively cheaper than Apple has ever sold RAM. They would always charge $200 for each RAM upgrade, and it might have been only 4GB or less back then.
The twist now, though, is that they started soldering in the RAM with the Retina MacBook, so you can't route around Apple's extortionate pricing like you could in the past by just buying components off the market.
Such a stupid cartoon-evil-villain move too, just to force us into getting RAM from them. I have never been memory bandwidth bound (Apple's excuse for soldering in the RAM) in my life, and yet I am forced to buy computers that optimize for this at the expense of things I actually care about, like serviceability. Also consider that it incentivizes people to buy more RAM than they need today in an effort to future-proof their device, in a time of RAM shortages. And who knows: by the time that RAM amount is relevant, the CPU may no longer keep up, so the hoarding might not even be for anything either.
> I have never been memory bandwidth bound (Apple's excuse for soldering in the RAM)
This isn't even a plausible excuse. For the entry level machines, the soldered RAM only has the same memory bandwidth as ordinary laptops. For the high end machines it likewise doesn't have any more than other high end machines (Threadripper/Epyc/Xeon) which just do the same thing as Apple -- use more memory channels -- without soldering the RAM.
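The channel math is easy to check. A quick sketch in Python (the configurations are illustrative, not exact product specs):

    def peak_gb_per_s(channels, mt_per_s, bus_bits=64):
        """Theoretical peak bandwidth: channels x data rate x bus width."""
        return channels * mt_per_s * (bus_bits / 8) / 1000

    # Dual-channel DDR5-6400 laptop vs. a 12-channel DDR5-4800 server part:
    print(peak_gb_per_s(2, 6400))    # ~102 GB/s
    print(peak_gb_per_s(12, 4800))   # ~461 GB/s

Same underlying DRAM technology; the server parts get there by adding channels, not by soldering.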
And it's especially a kick in the teeth right now, because it means you can't buy a machine with less RAM than you might prefer and then upgrade it later if prices come back down. If it's soldered, then whatever you can afford at today's prices is all the machine will ever have.
I think part of what's happening lately is that chip folks are starting to realize they can make margin too. Maybe it's possible thanks to consolidation, but for sure folks see the crazy margins Nvidia, Apple, etc. have, and I suspect they're like: we want that too!
I would think the price gouging on memory tiers is exactly why it's in a better negotiating position. Having a 200% markup means minor market conditions won't prevent them from paying.
It costs the same, we just mark it as an opportunity cost of unloading the memory on the spot market.
If I buy contracts for 1 gold bar at $500, and the gold price runs to $1200, I can either continue to market my gold-containing product for the same profit margin, or I can unload all that gold for $1200/bar and make a profit of $700/bar. If my profit margin is high and it doesn't take many gold bars to make a thousand units, maybe discontinuation doesn't make any sense. But if my product is "solid gold statuary of Dear Leader", and the bars are most of my cost basis, I know what I'd do.
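The same trade-off in toy code (numbers invented for illustration, following the bars-per-unit framing above):

    contract_price = 500    # locked-in price per bar
    spot_price = 1200       # current market price per bar
    bars_per_unit = 0.1     # gold consumed per product unit (assumed)
    margin_per_unit = 200   # profit per unit at the old sticker price (assumed)

    # Keep building only while the margin beats flipping the input:
    opportunity_cost = bars_per_unit * (spot_price - contract_price)  # $70
    print("keep building" if margin_per_unit > opportunity_cost else "flip the gold")

At 0.1 bars per unit, building still wins; make it the statuary and the answer flips.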
You're thinking only in finance terms. Their goal in buying the contract is to secure the good. The ability to maintain price will allow them to sell more units, which is the number they want to show.
Amazing that it's gotten to that point but I think that's right. It's more intuitive that you would need vertical integration with your processing chips because of the degree of expert specialization necessary to produce them, especially in close coordination with a major product release.
By comparison, RAM seems much more of a commodity, but the game has changed, and it seems there may be an important strategic interest in sourcing and supplying your own.
Even if so, everyone lives in the same market. If Apple has a contract for those chips at an artificially low price, it's to their advantage to sell them to someone else at market value instead of putting them in a Mac, where they'd have to increase the price significantly (and take the PR hit) to make the same profit.
Or the fact that if they sell all their RAM without putting it in devices, they won’t be able to sell devices, and some portion of their customer base will leave their ecosystem, possibly forever.
And you think this is the first sign that they’ve decided they’re going to spend the next few years being a RAM reseller before starting to sell consumer products again?
No, but "shipping less RAM" is clearly on that spectrum. The point wasn't about literal product strategy, it's that there's a limit to what actions are financially feasible and it's set by "what else could you do with that junk?"
That's my whole point: M3 Max 128GB -> M3 Ultra 512GB; M5 Max 128GB -> M5 Ultra 512GB. But if M5 Max tops out at 192GB, then M5 Ultra 768GB, i.e. the Ultra having 4x the memory of the Max.
It's Apple, and they don't like to adjust prices to the market.
Other companies would have just hiked the price of the 512GB model to reflect the lack of supply and to let the people who really need that model pay dearly for it.
But that comes with some PR damage that Apple would rather not deal with.
Yep, but if they had to double or triple it on short notice, they'd have just removed it from the store instead, and I imagine the RAM is going into 256GB systems for more $$$, but still nothing really alarming for the consumer.
This is like believing there's unlimited, instant-on capacity. The same type of "we can just tariff whatever we want, and magically, the market will figure it out".
That makes sense for a few products, but not something that takes billions of dollars, multiple factories, etc to produce.
Also mentioned here: https://news.ycombinator.com/item?id=47291513 - see the article section: "Quietly" and Other Magic Adverbs. Presumably the LLM writing style rubbing off, assuming the LLM hasn't been used to create the content in the first place.
The Raspberry Pis have been bad value for money for at least 4 or 5 years now, unless you're really sensitive to the power draw. Once you add in a case and fan (required if you don't want it to overheat and wreck the SD card), the charger, and the SD card, it generally comes in at roughly the same price as a more capable Intel 1L PC like the Lenovo M920Q (though of course, those aren't new).
Yeah, power consumption (and performance per watt) is the main reason I keep buying Raspberry Pis; I haven't found anything similar in that regard, especially for Pi Zeros.
I'm on mobile so can't easily pull up an example part number, but digital signage controllers can often be PoE powered. They're insanely overpriced new from the actual suppliers, but for hobby projects they can normally be sourced relatively easily on eBay. The trick is that many of the eBay sellers don't bother listing the specs, so you need to first search "digital sign controller/computer" on eBay, then look up the spec sheet from the model number.
True for base PoE (802.3af, 15.4W), but if you have PoE+ or greater (802.3at, 30W and up) you can start to power more common PCs - I’m running a couple repurposed Chromeboxes from PoE++ adapters.
I really like the ecosystem around them. All of the nice compact hats, the software, the 3D print files. Very googleable which also means easy to get help from an LLM.
Unless you specifically need a pi (unlikely) then they really are awful value now. Hard to really go out of the way to support them now they've stuck two fingers up at the solo/indie/educational community and gone all enterprise.
Second-hand mini PCs are a good option. Half the price of a Pi 5 + SD + power, and you often get them with 16GB RAM, a decent SSD, etc.
If you need GPIO then many of the rockchip boards are still fairly affordable and easily had.
The Pi isn't great value, but honestly, I'm finding it hard to find a better trade-off between price, performance and software support right now than the compute modules for embedded projects where you can afford to spin a custom PCB. Especially for low-ish volume or prototype stuff.
I also love the compute modules for their size. Stick one on a nano base board and they’re half the size of a Pi 5. TBH the standard Pis are a bit frustrating with all of the IO. I do not believe the average purchaser is using one as a PC replacement and wants 4 USB ports and 2 HDMI ports. I’ve never seen one in use like that. They are mostly servers or driving a single display without any user input.
100% with you on the IO. I've never even wanted two display output ports with any raspberry pi.
You know what I do want though? An actual damn HDMI port! HDMI cables are everywhere, wherever I am I have unlimited options to connect an HDMI device to some kind of screen. But micro HDMI? The literal only thing in my life that uses it is the Raspberry Pi 4 and 5. There have been plenty of times where I've reached for a Pi 3b instead of a 4 or 5 just because I didn't have a micro HDMI cable.
I do not understand what has gone through their head. How could anyone look at the use case for a Raspberry Pi and decide that two micro HDMI ports is a better choice than one HDMI port? I don't understand it. Like you, my experience with the Pi is that they mostly just sit there, headless, so the only reason I need display output is that it's useful during setup (because they don't have a proper serial console port).
I can't set up a Pi 4 or 5 without going hunting for that micro HDMI cable I bought specifically for that purpose and never use for anything else. I can set up a Pi 3b anywhere, at any time.
The micro HDMI thing (which I too loathe) is for digital signage and industrial machinery; we (home users) aren't the audience and haven't been for a long time.
Being able to run two sides of an advertising board, or two control panel screens on a big hunk of metal doing fabrication things in a factory was more important to Raspberry Pi as a business apparently.
Why the heck they didn't just go with 1x normal HDMI and 1x USB-C + DP for the Pi 5 is a mystery; perhaps the SoC doesn't support it or something.
Completely depends on what you're doing. If you're doing a lot of sustained compute, or doing graphics, then yeah you're gonna want some cooling. But it's a useful little machine for all kinds of tasks which don't cause sustained high power consumption.
Two fried on me. One was just running a print server without a case. It was summer, so ambient temperature was around 32C, but still: you're telling me you use an RPi 5 without even a cooling case?
I have been using a Pi 4 as a desktop computer for a few years (didn't have anything else) with a microSD card and without any fan, heatsink, or case. Haven't had any problems. Obviously this depends on your environment, but it worked fine for me.
I've had an rPi4 running a copy of a forum and server (for reference) in one of the fancy aluminum cases which passively cools for a couple of years now, no issues.
The big chunky aluminum ones do seem pretty good on the pi 4. I had one in the flirc case for a long time and it never seemed to have issues. Obviously adds to the cost though. Also not sure if the Pi 5 works as well in them given its higher thermals, and the Pi 4 didn't exactly run cool so imagine the 5 might throttle occasionally without active cooling.
Raspberry Pi hasn't been a cheap SBC for a long time. It's now in the same market segment as a NUC, but without the case and with worse price to performance.
Have fun reading 40 answers about how discarded Lenovos from 2017 are cheaper and idle at 5W. It springs to 3x the power usage of a Pi if they do anything with it, but who cares about performance per watt?
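To put numbers on it (a sketch; the wattages and the $0.30/kWh electricity price are assumptions):

    def annual_cost(watts, price_per_kwh=0.30):
        return watts / 1000 * 24 * 365 * price_per_kwh

    pi_idle, pc_idle = 3, 5      # watts at idle, roughly (assumed)
    pi_load, pc_load = 7, 25     # watts under light server load (assumed)

    print(f"idle delta: ${annual_cost(pc_idle) - annual_cost(pi_idle):.2f}/yr")
    print(f"load delta: ${annual_cost(pc_load) - annual_cost(pi_load):.2f}/yr")

A few dollars a year at idle, a few tens under load; whether that matters depends entirely on the duty cycle and your electricity price.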
I'm running qwen3.5:0.8b on my Orange Pi Zero 2W; low tokens/s, but it still runs. I think I paid around £14 for it over a year ago, but now the same board is double the price. I wouldn't buy a computer of any kind right now. It's a bubble.
Interesting. I have a few Pis lying around; I know they'd be low-token-rate, but I've debated putting some models on them. What does your setup look like, if you don't mind me asking? Is there a specific image or package you're using?
You do actually need to run it on a Mac, if (and only if!) you require integration with Mac-only software. But the main factor is probably just "all the cool kids are doing it" ;)
I didn't know that, sorry. But is there no way to make the CDP debugger less detectable? Seems doable to me, but maybe there's a catch, if it hasn't already been done by somebody?
iMessage is the only explanation I can find. Minis aren’t powerful enough for agentic models unless you’re getting a rather expensive version (I could see the MX Pro w/ 64GB working). At which point they don’t have the price appeal of the base model anymore.
More likely that the M5 Max Studio is coming out; the M5 Max MacBook Pros just came out.
Also, the 512GB SSD version has a slower SSD than anything 1TB and up. The new SSDs on the M5 are, I believe, much faster, and whatever's coming will likely receive that.
There's no doubt there's a RAM shortage and price increases; the biggest companies in the world lock in their pricing well in advance, and the remaining leftovers are where consumers experience shortages.
I'm trying to work out if I should buy a 48GB M4 Pro Mac Mini now, or wait for M5 Pro ones later this year. For AI/ML purposes, mostly. As far as I can tell, the new M5 MacBooks didn't go up much or any for the same amount of RAM?
I wouldn't buy a local machine for AI/ML purposes unless you have an actual defined use case and programs to run (perhaps even being able to test them at an Apple Store).
Otherwise you may end up like others using a high-spec Mac mini to just access online models.
At this point is the performance advantage of Apple CPUs even worth it if you can't upgrade the ram itself? I'm thinking you might be better off building a PC and putting the absolute bare minimum RAM in it, with plans to swap that out with good stuff in a year or two once the RAM market stops being insane.
But for ML workloads the comparison isn't between slotted CPU RAM and Apple's unified RAM, it's between Apple's unified RAM and dedicated GPU VRAM, which can more than double even the M3 Ultra's bandwidth at up to 1.8TB/sec. Apple Silicon makes a unique set of trade-offs that shine in certain areas, but they are still trade-offs nonetheless, so it really depends on what exactly you're doing with the hardware.
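For single-stream decoding, a decent first-order model is that every weight is read once per generated token, so bandwidth sets the ceiling. A sketch (dense model assumed; the 70 GB size is an arbitrary example):

    def tokens_per_s_ceiling(bandwidth_gb_s, model_size_gb):
        """Upper bound on decode speed when all weights stream per token."""
        return bandwidth_gb_s / model_size_gb

    model_gb = 70  # a ~70 GB quantized model, say
    print(tokens_per_s_ceiling(819, model_gb))    # M3 Ultra, ~819 GB/s -> ~11.7 tok/s
    print(tokens_per_s_ceiling(1800, model_gb))   # ~1.8 TB/s of VRAM -> ~25.7 tok/s

So if the model fits in VRAM, the GPU wins on speed; the Mac's trade-off is that far larger models fit at all.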
Dedicated GPU VRAM is much scarcer than the unified RAM you get on Mac platforms. This is a big deal for SOTA LLMs that combine high memory footprint with a need for high memory bandwidth in order to get acceptable performance.
Why should it be the new norm? We have an abnormal situation now, of massive amounts of investor money being poured into unprofitable bets, that this time had the side effect of eating up hardware components. There are two possible outcomes:
1. Yes, it's the new normal, then production capacity will be increased and prices fall.
2. No, it's not the new normal, the bubble pops and component prices come crashing down when buyers default etc.
Option 2 has been the normal outcome of these situations so far. But sure, questions remains how long all of this will take.
I don't know if it'll be a year or two, hard to say exactly when the AI bubble will pop, but I feel quite certain it's coming. The AI stuff is great, but most of the money being thrown at all these different companies is going to be wasted. Investors don't know who the winners and losers will be, just like when people were investing in pets.com instead of amazon.com.
RAM, disk, CPU, GPU: for me it hasn't been for quite some time. Then again, I have been mostly a Windows/UNIX person, only using Apple gear when assigned to it via project delivery.
I suspect they'll still want to offer it given the push they've been making over the last year with RDMA. My guess is the 512gb or larger studio will largely be a byproduct of the systems they're designing for their own AI efforts in the datacenter. I don't think this is the end of it for the longer term.
Apple recently introduced RDMA support in macOS. They are probably trying to push the people buying the 512GB configuration towards buying more of the 256GB configuration and clustering them together.
A consumer computer company is not going to push people towards building a miniature HPC cluster. Closest we'll ever get to that is multiple GPUs for video games.*
*Nvidia is no longer a primarily consumer company, so all the other GPU stuff is no counterpoint
Apple isn't just a consumer computer company. Both iPhones and Macs have very large business markets. In fact, I'd argue that the primary reason Apple hasn't locked down macOS as much as iOS is that it'd absolutely kill the demand from software developers.
Apple isn't really just a consumer company. It does both consumer and enterprise stuff. Just look at all the fleet management stuff it does for iOS and macOS.
And besides that, high-end MacBook Pros and Studios are workstation-class computers, not consumer-level computers.
I am doing the reverse, and trying to predict the last year that LLMs use NVIDIA GPUs. It's just an accident of history that video game cards are useful for LLMs, and there is absolutely nothing that NVIDIA is doing from a design standpoint that the big hyperscalers can't do on their own, cutting NVIDIA out, and doing a better job of it as they know their own unique needs. The only advantage NVIDIA has is supply chain relationships and it takes time to establish those, but once that's done, we'll see all the big companies rolling their own silicon and no longer relying on NVIDIA.
Weren't 512GB models selling like hot cakes to the complete surprise of Apple? Wait time was up to 3 months last time I checked. Glad I got mine last October.
> The 512GB Mac Studio was not a mass-market machine—adding that much RAM also required springing for the most expensive M3 Ultra model, which brought the system’s price to a whopping $9,499.
The number of people willing to spend $10,000 on a computer is pretty tiny. Maybe they are common enough in HN circles, but I doubt anyone at Apple is losing sleep over them.
Of course, $10,000 workstations for a corporation working on AI products might just be a necessary tool.
Just a guess, but I think it’s entirely possible that Apple sold through the full production run that they intended for this generation of the machine and they don’t want to order a new batch before the next generation of processors come out.
I have to think that Apple is close to replacing the M3 Ultra with an M5 Ultra or something of the sort.
Huge local thinking LLMs to solve math and for general assistant-style tasks. Models like Kimi-2.5-Q3, DeepSeek-XX-Q4/Q5, Qwen-3.5-Q8, MiniMax-m2.5-Q8 etc. that bring me to Claude4/GPT5 territory without any cloud. For coding I have another machine with 3x RTX Pro 6000 (mostly Qwen subvariants) and for image/video/audio generation I have 2x DGX Sparks from ASUS.
We must be twins; I've got the same three working in a cluster.
I was really excited to see where the GB300 desktops end up, with 768GB RAM, but now that data is leaking/popping up (Dell appears to only be 496GB), we may be in the $60-100k range, and that's well out of my comfort zone.
If Apple came out with a 768GB Studio at $15k, I'd bite in a heartbeat.
Yeah, I didn't want to spend more than 50k for local inference stack. I can amortize it in my taxes so it's not a big deal but beyond it would start eating into my other allocations. I might still get M5 Ultra if it pops up and benchmarks look good, possibly selling M3 Ultra.
The story here is about what lesson was learned by the DRAM cartel after they got busted and hit with large fines. One might hope the lesson learned would be, "we should not fix prices", but what got them in trouble was colluding secretly. What if we just did it via earnings reports, press releases, and other public statements?
While there is some market variance like the 2022 to 2023 glut, DRAM prices haven't fallen in real terms in over 15 years. This was all done by controlling supply, and it was all done in public. It starts with one of the big three putting out a statement like, "Samsung is considering reducing DRAM wafer output due to softness in the mobile PC segment." The actual reason varies and often makes little sense.
This is followed by similar public statements from the other large vendors expressing a willingness to reduce supply. Once everyone commits in this way, the companies follow up with announcements of actual supply reductions. You can watch this happen any time prices start to dip.
My bet is if the DOJ investigates, they will not find the same sort of embarrassing smoking gun emails between representatives of Micron, Hynix, and Samsung. The collusion was all done in public. The companies will claim it is just good business management, a strategy known as "conscious parallelism." They used this exact defense to get a 2022 antitrust lawsuit dismissed.
That said, their goal seemed to be just keeping prices fixed. They wanted to avoid boom and bust cycles, keep profits high, and keep prices stagnant. A massive price hike invites investigations and creates problems. If DRAM prices just never fall, they can enjoy healthy profits with little risk.
But what happens when your intentionally constrained supply hits a sudden large spike in demand? Prices skyrocket, everyone gets mad, and demands investigations. My guess is instead of being thrilled with the price spike, the executives at the large DRAM manufacturers are very worried someone put something incriminating in a document somewhere that can be subpoenaed ("how we're going to fix prices in public and get away with it").
Publicly announcing reductions in DRAM wafer output is not per se nefarious. You need to do it from time to time anyway, if only as part of retooling towards newer technologies that will be required when making DRAM dies for newer standards.
The video is an hour and a half long. It's a whole documentary: very detailed and well thought out, but too long for me at the moment. I'll see if it's possible to get a summary somehow.
I haven't watched the video, but I went way too far into the weeds of the RAM crisis.
I am not sure what the video suggests. This is my own understanding of things, after I got way too invested in why OpenAI needs all of this RAM all of a sudden (on a random Tuesday).
My understanding, TLDR: the Stargate project has OpenAI, Oracle, SoftBank, etc.
SoftBank got the money from a Japanese bank loan [0] at low interest rates and actually scrambled to find the $20 billion (combined with Oracle, they committed to around $500 billion).
(Btw, the datacenter side is being done in a similar fashion by Oracle.)
Almost all of that money, once given to OpenAI, was used (or will be used?) to lock up 20% of the world's RAM supply, in a more expensive package, because these companies just package RAM in a different order to get "AI RAM". And then Micron shuts down its consumer brand (Crucial).
This has now caused RAM prices to spike to 5x the cost over the past couple of months. The same inflation is happening in hard drives and NAND in general.
The largest impact I can see is that even companies like Google were scrambling to find RAM. I find this to be one of the larger reasons why they might "need" so much RAM all of a sudden. I mean, Google and Anthropic needed RAM, but not 20% of the world's supply, and not committed in such a way, and I am not sure datacenters are even being built for that RAM to be stored in [1].
OpenAI's datacenter in Argentina, for example, is operated by a shady company that appeared like 1-2 years ago, IIRC. So a $500 billion project is just picking random companies... yeah, no. I believe they don't trust it themselves, especially when the company is scrambling for money.
All of this does feel very cartel/monopoly-ish to me: pushing competitors, and the people running open-source models, out of the market. Another "benefit" for OpenAI is that we normal, everyday people get impacted too. I am sure that when they made such a large decision they thought about that internally, but we all know OpenAI's morality now, after the DoD deal.
But it doesn't seem like Google and the other companies are that impacted by it all. Only the average consumer and hosting providers are (hence OVH and Hetzner raising prices, for example). The average AWS/GCP/Azure makes enough money that they might not even raise prices for some time, and they'll be fine, with the additional benefit that more people worried about rising prices will move to Azure/GCP/AWS even more.
Edit: Gamers are being pushed out of consoles and everything too, and some people, seeing the cloud connection and AWS coming out and saying they want gamers on the cloud (paraphrasing), read it all as a move to push everything to the cloud.
I do believe that might be only half the story, as OpenAI does benefit from everything moving to the cloud (somewhat), but it was done even more to prevent competition across the whole space.
I believe they thought about it and treated it as a plus, but above all else, they thought it could help them maintain their flimsy lead in AI models, as more and more competitors catch up, by stifling competition through a 5x price rise. Gamers and normal people were just the largest casualty in this crossfire.
I was thinking this past month, when I found all this: damn, OpenAI's morality sucks, and they did all of it on purpose.
And then they had the Department of Defense deal and the whole controversy surrounding it, so yeah, that too.
OpenAI doesn't want your benefit. It wants its profit, and when the two conflict, OpenAI doesn't care a cent about you, no more than the cent you give it.
>Almost all of that money, once given to OpenAI, was used (or will be used?) to lock up 20% of the world's RAM supply [...] And then Micron shuts down its consumer brand (Crucial).
>[...]
>All of this does feel very cartel/monopoly-ish to me: pushing competitors, and the people running open-source models, out of the market. Another "benefit" for OpenAI [...]
Nothing you described is actually "cartel/monopoly-ish" beyond "big players have more money to splash around". It's fine to go look at that and go "grr, I hate big tech companies", but the claim of "It's not a shortage, it's a cartel." isn't substantiated. The latter implies some sort of malice beyond what could be explained by standard scarcity thinking, eg. "there isn't enough RAM to go around. We need RAM, so let's stock up".
My point is that there is enough RAM to go around in an ideal world, even with LLMs; it's rather that stockpiling RAM gives you so much benefit and leverage over your enemies in this space that you have no reason not to.
So it isn't that there isn't enough RAM to go around, period, but rather an ideology similar to "this town ain't big enough for the two of us" (OpenAI vs. Anthropic/Google/Chinese open-weights models).
At least that's my understanding of the situation, and I could be wrong about it too, for what it's worth.
Incredible digging. I remember reading comments saying the reason for the price hike was that Sam Altman secretly secured a deal with the few RAM producers, where they promised to reserve a large portion of their production for OpenAI for the next several years (I don't remember how long). Supposedly Sam will just put it in a warehouse to collect dust.
Thanks, I appreciate the kind words. I was thinking of writing some piece/blog about it, but procrastination is definitely a thing :) I am just happy that I finally wrote a comment at least explaining most of my understanding. That's more than fine for me.
> Incredible digging. I remember reading comments saying the reason for the price hike was that Sam Altman secretly secured a deal with the few RAM producers, where they promised to reserve a large portion of their production for OpenAI for the next several years (I don't remember how long). Supposedly Sam will just put it in a warehouse to collect dust.
I do believe that's going to be the case as well. Most of the RAM is probably not needed currently (that's what it feels like), so it's going to sit collecting dust. That, or Oracle/Microsoft will use it in their datacenters as old RAM breaks down, extending the monopoly, given their close ties to OpenAI.
Even if OpenAI internally sells it at half the market price to Microsoft/Oracle, they still technically turn a profit.
I actually felt too conspiratorial thinking about this when I first discovered it, because I was under the assumption that OpenAI actually needed the RAM too. But seeing OpenAI's recent moves with the Department of Defense, I definitely think they did this on purpose.
I would say, just post these kinds of rants on Substack and X or something. Don't LLM-format them; just lay it all out, fix whatever typos, and let loose.
It'd be interesting to just hear some thoughts and opinions from someone who has done some research on the topic, in a light way, vs. a huge article/documentary.
Couldn't find the summary button anywhere, but when searching around I found that you can apparently paste YouTube links into Google AI Studio and have it summarize them.
I've been in tech for ~40 years now and I've never seen anything like this. The downstream repercussions on consumer products that have no access to cheap memory is devastating and is an extinction level event for most low-cost providers of cell phones, tvs, etc.
This shortage in 2026 is more consequential across the board and impacts consumer electronics as a whole and the fact it's going to last years means that many low cost manufacturers are going to close up shop because they won't be profitable.
I'm pretty sure there were more DRAM manufacturers back then, and spinning up a new fab probably didn't require as much know-how, capital or even time.
One is that it’s a more complicated part with tougher fab requirements.
Two is that it’s not a commodity. AMD can’t make nVidia GPUs. They have to design their own. Everyone has patents and trade secrets and copyrights. Patents expire and knowledge diffuses but that adds another time lag.
AMD and Intel are fully aware of the demand and are working on it.
RAM is a commodity. Totally interchangeable standard part. Also simpler to fab, thus quicker and easier to scale up.
Oh, and I’d like to add: everyone is afraid it’s a bubble that will pop. Nobody wants a bunch of stranded capex. That has also happened before many times. So that puts brakes on it too.
Now how am I supposed to develop Electron apps and use Chrome?
In all seriousness, though, as one of the uninitiated, what would be the value of hosting LLMs on a machine like this that has a lot of memory that you pay for up front versus some sort of VPC-based approach?
For consumers, there's little reason to run unquanted, especially for large models which take less of a hit from quantization. I'm running a 200b model at Q3 with very little degradation. A 1000b model would see even less change.
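The footprint math behind that, as a sketch (bits-per-weight figures assumed: ~3.5 for a Q3-style quant, 16 for the unquantized baseline):

    def size_gb(params_billions, bits_per_weight):
        return params_billions * bits_per_weight / 8

    for params in (200, 1000):
        print(f"{params}B: {size_gb(params, 16):.0f} GB at FP16, "
              f"{size_gb(params, 3.5):.0f} GB at ~Q3")

A 200B model drops from ~400 GB to under 90 GB, and even a 1000B model lands around 440 GB, which is how these machines fit them at all.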
Yes, and the result of this $10k endeavour is a much slower and dumber model than any SoTA $20/mo API. On top of the maintenance burden of keeping software/models updated.
The rumor from Gurman is that the M5 Ultra Mac Studio ships in the first half of this year.
This may just be a sign that the M5 Ultra Mac Studio is shipping sooner rather than later, as it's common for Apple to push out ship dates for soon to be replaced products.
We do have leaked benchmarks showing that the M5 Max outperforms the M3 Ultra currently shipping in the Mac Studio, so buying an M3 Ultra Studio right now would be a terrible idea.
This thing has been going on for a while, unfortunately. If I told you "here's an 'Air', a vanilla, and a 'Pro'", what would you expect? The vanilla is the base, the Air is lighter, and the Pro is nicer, right?
Well, for iPads, the base has (had? I haven't followed them closely for a while) an older CPU for some reason. And the Air is actually a "Pro-lite", rather than a weight-optimized version.
Don't get me started on where the mini sits, or what happens if you want "nicer" features like 60Hz+ displays in a small form factor... a feature that budget Android tablets have had for years.