Ask HN: How do you personally evaluate new LLM models?

2 points by _samjarman 5 hours ago

Hey folks, how do you personally evaluate new LLM models? Vibes? Or do you have some tests you like to run? Or do you just use them in your IDE/text interface for a bit and see how it feels? I know we could probably trust some of the more public benchmarks, but I'm curious about personal evaluation techniques. Thanks!