> In fact, most of the time models fail to demonstrate introspection—they’re either unaware of their internal states or unable to report on them coherently.
There is no way for a user to know whether the LLM is genuinely introspecting in a given case, and since the answer is almost always no, it is better for everyone to assume that it is not.
You cannot trust that the model is introspecting, so for all practical purposes, from the end user's perspective, it isn't.
https://www.anthropic.com/research/introspection