24
May
Because large language models operate using neuron-like structures that may link many different concepts and modalities together, it can be difficult for AI developers to adjust their models to change the models’ behavior. If you don’t know what neurons connect what concepts, you won’t know which neurons to change. On May 21, Anthropic created a remarkably detailed map of the inner workings of the fine-tuned version of its Claude 3 Sonnet 3.0 model. With this map, the researchers can explore how neuron-like data points, called features, affect a generative AI’s output. Otherwise, people are only able to see the output…