I’ve found that the key is the “context window”: the tool’s working memory of what it has read and worked on during the current session. From the outside it behaves like a black box.
If I keep the tool focused on what’s already in the context window, the results have been pretty good. If I ask it to assess something it hasn’t touched, the results get much less predictable.
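As a rough mental model only (not how any particular tool implements it), the context window can be pictured as a fixed-size buffer of recent work. The sketch below is a toy: the `ContextWindow` class, the word-count budget, and the file names are all hypothetical, and real tools count tokens rather than words.

```python
from collections import deque


class ContextWindow:
    """Toy model: keeps the most recent notes that fit in a fixed word budget."""

    def __init__(self, max_words):
        self.max_words = max_words
        self.entries = deque()  # (topic, text) pairs, oldest first

    def _size(self):
        # Approximate "tokens" with whitespace-separated words (an assumption).
        return sum(len(text.split()) for _, text in self.entries)

    def add(self, topic, text):
        self.entries.append((topic, text))
        # Once the budget is exceeded, drop the oldest entries first.
        while self._size() > self.max_words and len(self.entries) > 1:
            self.entries.popleft()

    def has_seen(self, topic):
        return any(t == topic for t, _ in self.entries)


# Hypothetical session: two files worked on, with a budget too small for both.
window = ContextWindow(max_words=12)
window.add("parser.py", "refactored the tokenizer loop and fixed an off-by-one bug")
window.add("cli.py", "added a --verbose flag")

for name in ("cli.py", "parser.py", "billing.py"):
    status = "in context" if window.has_seen(name) else "not in context"
    print(f"{name}: {status}")
```

In this toy, anything the window has never seen, or has already pushed out to make room, is exactly the kind of thing I’d expect shakier answers about.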