This is Jessica. The area of AI/ML research dubbed “safety” includes investigating LLMs’ ability to intentionally mislead people, for example by performing worse on benchmarks when they know they are being tested, or withholding information when prompted. With this come …