The system checking it just verifies the signature is valid, and thus that all the data presented is valid. Your browser doesn't need to query any root CAs to trust an SSL certificate; HTTPS works without internet.
Entry history, visas, etc. could be stored on the device as well.
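To illustrate why no connectivity is needed, here's a toy sketch (textbook RSA with tiny numbers, not the actual eMRTD crypto, and all names/data are made up): the checking system holds only the issuer's pinned public key, and verifying the signature over the document's data is pure local math.

```python
# Toy illustration (NOT real passport crypto): the verifier holds only the
# issuer's public key (e, n), pinned at provisioning time. Checking a
# signature is local arithmetic -- no network round-trip required.
import hashlib

# Tiny textbook-RSA keypair for demonstration only; real systems use
# 2048+ bit keys via a proper crypto library.
p, q = 61, 53
n = p * q            # 3233
e = 17               # public exponent (known to every border kiosk)
d = 2753             # private exponent (held only by the issuing authority)

def h(data: bytes) -> int:
    # Truncate the hash so it fits the toy modulus.
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

passport_data = b"name=Jane Doe;entries=...;visa=..."

# Issuer signs once, when the document is created.
signature = pow(h(passport_data), d, n)

# Border kiosk verifies offline: recompute the hash, undo the signature
# with the pinned public key, compare.
assert pow(signature, e, n) == h(passport_data)
print("signature valid, no internet needed")
```

If any byte of the stored data were altered, the recomputed hash would no longer match what the signature decrypts to, so tampering is detectable without ever phoning home.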
If you want to argue for a theoretical self-contained system that relies only on the data present on the physical (or a theoretical cryptographically signed digital) passport, you're free to do that.
But in the real world, the systems that deal with processing people's entries already cross-reference multiple other existing databases, require internet connectivity to do so, and I think you'll have a hard time convincing anyone to stop doing that.
If CBP's systems go down, they will not process foreign arrivals (they'll still process US citizens) [1], even with physical passports in front of them. I assume the EU EES works the same.
"If the internet goes down, your border checkpoint is down" is not some terrifying future we need to protect against; it's the reality of the world you live in right now.
[1]: I've had to wait for an hour, at SFO of all places, because of exactly that happening.
TBF, given that a temporary outage is abnormal, it makes a certain amount of sense to default to shutting down. During an extended outage, though, you can pick back up, as long as the key parts of your system can operate without the network.
The LLM can find material that would be hard or time-consuming for you to find yourself.
You still need to verify it, but "find the right things to read in the first place" is often a time intensive process in itself.
(You might, at that point, argue "what if the LLM fails to find a key article/paper/whatever", which I think is both a reasonable worry and an unreasonable standard to apply. "What if your Google search doesn't return it" is an obvious counterpoint, and I don't think you can make a reasonable argument that journalists should be forced to cross-compare SERPs from Google/Bing/DuckDuckGo/AltaVista or whatever.)
I believe their point is that if you give people an "extract-needle-from-haystack" machine and then tell them they have to manually find where in the haystack the needle was, it defeats the purpose of having the machine.
With that said, a good RAG solution would come with metadata to point to where it was sourced from.
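As a minimal sketch of that idea (naive word-overlap scoring standing in for a real embedding-based retriever; documents and names are invented for illustration): each retrieved chunk carries metadata recording its origin, so whatever "needle" surfaces can be traced straight back to its spot in the haystack.

```python
# Minimal RAG-with-provenance sketch: every chunk remembers where it came
# from, so a cited claim can be checked at its source. Scoring here is
# naive word overlap; a real pipeline would use embeddings.
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    source: str   # e.g. filename of the original document
    offset: int   # position within that document

corpus = [
    Chunk("Quarterly revenue rose 12% on cloud growth.", "report_q3.pdf", 0),
    Chunk("The board approved the merger on June 4th.", "minutes.pdf", 0),
    Chunk("Cloud revenue growth accelerated in Q3.", "report_q3.pdf", 812),
]

def retrieve(query: str, k: int = 2) -> list[Chunk]:
    terms = set(query.lower().split())
    scored = sorted(corpus,
                    key=lambda c: len(terms & set(c.text.lower().split())),
                    reverse=True)
    return scored[:k]

for chunk in retrieve("cloud revenue growth"):
    # The source/offset pair is the citation a human can follow to verify.
    print(f"{chunk.source}@{chunk.offset}: {chunk.text}")
```

The point is the shape of the output, not the retrieval quality: "verify the answer" becomes "open this file at this offset", which is vastly cheaper than re-searching the haystack.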
> I believe their point is that if you give people an "extract-needle-from-haystack" machine and then tell them they have to manually find where in the haystack the needle was, it defeats the purpose of having the machine.
We've got to be careful to not let the perfect be the enemy of the good.
I'm not an LLM enthusiast, but I think you have to actually compare it against what the alternative would really be. If you give the journalist a haystack but insufficient time to search it properly by hand, they're going to have to take some shortcut. Using an LLM to sort through it, then verifying it actually found a needle, is probably better than sampling documents at random or searching for keywords.
I don't want to come off as an AI-maximalist or whatever, but, I mean, at some point, skill issue, right?
You can use Google to find results reinforcing your belief that the earth is flat too, but we don't condemn Google as a research tool because of it.
If you trust whatever the LLM spits out unconditionally, that's sorta on you. But they _can_ be helpful when treated as research assistants, not as oracles.
This is a bogus analogy leading to a bogus conclusion.
If something points to the needle in the haystack (saying "this haystack has a needle positioned eighteen centimeters from the top and three left of center"), it's much easier to verify that there is indeed a needle there than it would be to find that needle in the first place.
If an LLM spits out a claim that something happened (citing a certain article), it's less work to read the article and verify the claim than it would be to DISCOVER the article in the first place.
In other words, LLMs can be a time-saving search engine, and the idea that finding and verifying information yourself is just as much work as having the LLM find it and then verifying it is hokum.
Another interpretation: if you have multiple haystacks, the machine tells you which haystack likely has a needle in it. You still need to extract the needle yourself.
There are things built into iOS and Android, and the government does send them; but not for _every_ quake, only for the bigger ones, and only if you're close to the epicenter.
For big enough quakes you get a notification from the government (a VERY loud and distinctive one too; being in public and hearing _everyone's_ phones suddenly go off is... mildly terrifying), but quakes are so frequent and (usually) non-threatening that alerts don't get sent out for _every_ one.
>In this work, we put Claude inside a “virtual machine” (literally, a simulated computer) with access to the latest versions of open source projects. We gave it standard utilities (e.g., the standard coreutils or Python) and vulnerability analysis tools (e.g., debuggers or fuzzers), but we didn’t provide any special instructions on how to use these tools, nor did we provide a custom harness that would have given it specialized knowledge about how to better find vulnerabilities. This means we were directly testing Claude’s “out-of-the-box” capabilities, relying solely on the fact that modern large language models are generally-capable agents that can already reason about how to best make use of the tools available.
You've moved the goalposts from "they haven't open-sourced the process" to "these are marketing materials by Anthropic".
I think you're right to be skeptical, but they _have_ talked about the process publicly.
And I don't think there's anything there that isn't reproducible by outsiders. They have access to the same Opus 4.6 that you and I do, though not having to pay for the tokens certainly helps.
I'm pretty sure that if you wanted to burn a couple thousand bucks, you could reproduce at least some of these findings.
The goalpost is the same: reproducibility. Talking about a process doesn't make it reproducible. This entire discussion is why I feel developers are so gullible. You are defending a process that's entirely opaque and that you can't even use. It's crazy.
You're competing with that for the "I want to make sure the person standing in front of me is of legal drinking age" use case, but for the remote KYC/age-verification use cases, you're competing with a photo of the document and/or a selfie.
Maybe bundling these under the same system is a mistake and they should be separate systems with different considerations; it would certainly help with arguments about it online ;P
I don't know about other countries, but here it requires your passport or actual driver's license, and a 12- or 24-hour wait, to actually activate the driver's license app.
The internet requirement is not there for the person presenting the document, it's for the person/system checking it.