Fun fact: Overwatch must have done a similar thing, because they let you replay games up until some release, after which you could no longer replay them unless you'd saved the render.
If I remember right, there were also funny moments where replays didn't look right after patches?
More like RAM producers are selling supply to the highest bidder, no? If demand doesn't peter out, supply will eventually normalize at a higher, but less insane, price.
I guess if it only works at scale, capital may be the answer. Enough cash to buy 5, 10, or even 100 minis seems doable, but if the idea only works well with 10,000 of them running, that makes some sense.
Eh, that doesn't math out. It's the bandwidth per unit of storage (or ultimately per dollar) that matters.
If you have great cost per byte but your bandwidth per byte is bad enough, the low price doesn't make up for it, and you have an issue.
This is why manufacturers have started making hard drives with multiple independent actuators: density has increased to the point where adding more density isn't useful if it doesn't come with more bandwidth.
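A quick back-of-the-envelope sketch of the bandwidth-per-byte point (the capacities and transfer rates below are illustrative, not real drive specs): as capacity grows faster than sequential bandwidth, the time to stream or rebuild a full drive keeps growing, and doubling the actuators roughly halves it.

```python
def full_read_hours(capacity_tb: float, bandwidth_mb_s: float) -> float:
    """Hours to stream the whole drive at its sequential bandwidth."""
    seconds = (capacity_tb * 1e12) / (bandwidth_mb_s * 1e6)
    return seconds / 3600

# Hypothetical 20 TB drive: single actuator vs. dual actuator
# (doubling the independent heads roughly doubles usable bandwidth).
print(round(full_read_hours(20, 280), 1))  # ~19.8 hours
print(round(full_read_hours(20, 560), 1))  # ~9.9 hours
```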
The cost issues they're seeing (at least from what they've stated) come from users, not internally. Basically, it costs either $5 or $6.25 (depending on a 5m or 1h TTL) to re-ingest a 1M-token conversation into cache for opus 4.6. That's obviously very high, and users are unhappy with it.
I think a 400k default seems about right from my experience, but just having the ability to control it would be nice. For the record, even a single tool call at 1M tokens costs 50 cents (which could be amortized if multiple calls are made in a round), so imo costs at long context lengths are just too high for them to be the default.
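To make the arithmetic concrete, here's a small sketch using the per-million-token rates implied by the figures above (the exact rates are back-derived from this comment's numbers, not quoted from an official price sheet):

```python
# Assumed $/MTok rates, inferred from the figures in the comment:
CACHE_WRITE_5M = 5.00   # cache write, 5-minute TTL
CACHE_WRITE_1H = 6.25   # cache write, 1-hour TTL
CACHE_READ = 0.50       # reading already-cached context

def context_cost(context_mtok: float, rate: float) -> float:
    """Dollar cost of pushing `context_mtok` million tokens at `rate`."""
    return context_mtok * rate

# Re-ingesting a 1M-token conversation into cache:
print(context_cost(1.0, CACHE_WRITE_5M))  # 5.0
print(context_cost(1.0, CACHE_WRITE_1H))  # 6.25

# One tool call that re-reads 1M cached tokens:
print(context_cost(1.0, CACHE_READ))      # 0.5
```

If several tool calls happen within one cache TTL window, that 50-cent read is paid per call but the much larger write cost is amortized across them.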
Hey -- I have 0 PhDs, so take this with a grain of salt :)
I had thought for a while about a way to store data using an idea I had for sub-diffraction-limit imaging, inspired by STED microscopy.
First, an overview of STED. You have a donut-shaped (toroidal) laser that is fired at a sample; the donut's inner hole is below the diffraction limit. This laser depletes the sample's ability to fluoresce, and immediately afterward a second laser is shone on the same spot. The parts of the sample depleted by the donut laser don't fluoresce, so you only see the donut hole fluoresce. This lets you image below the diffraction limit.
My idea was to apply this along with a layer of material that exhibits sum frequency generation (SFG). You shine the donut laser at frequency A and a Gaussian laser at frequency B on the same spot; when they interact in the SFG material, you get some third frequency C. Below that material would be a layer that doesn't transmit frequencies C and A.
What you'd be left with after the light passes through those two layers is some amount of light at frequency B. The brightness inside versus outside the hole depends on how much of the frequency-B light converts to frequency C. Sum frequency generation is a very inefficient process, with only a tiny portion of the light participating, but my thinking is that if laser B is significantly dimmer than laser A, then most of laser B's light will participate in SFG with laser A, leaving only a tiny bit of light at frequency B outside the hole. That gives you a nice contrast ratio at frequency B between the hole and its surroundings, which then lets you address whatever sits below these layers at sub-diffraction-limit resolution.
In my idea the final layer is some kind of optical storage medium that can be read/written by the laser below the diffraction limit. Obviously aiming this would be hard :) My thought was some kind of spinning disk, but I never really got that far.
It will be hard to convince them otherwise when their jobs are replaced by AI and they're in their late 40s or older, with no time to adjust and learn a new craft.
Some others mentioned pijul, but I'll put in my two cents about it. I've been looking to use it because it seems really nice for working with an agent. Essentially you get patches that are independent and can be applied anywhere, instead of commits. If there's ambiguity when applying a patch, you have to resolve it, but that resolution is a sort of first-class object.