Automated LLM Evaluation
Running automated large language model (LLM) tests with Promptfoo.
Introduction
This page shows how to run automated LLM testing. If you are part of the NIEHS GitHub organization, you can see an example of this in the NIEHS/ToxPipe-Model-Comparisons repository.
Requirements
- Promptfoo installed (or Node.js, so it can be run via `npx` as shown below)
- A `.env` file with the following variables: `OPENAI_BASE_URL`, `OPENAI_API_KEY`, `OPENAI_API_VERSION`
- A `promptfooconfig.yaml` file
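A minimal `promptfooconfig.yaml` might look like the sketch below. The prompt, model, and assertion are placeholders for illustration, not the configuration used in the ToxPipe repository; Promptfoo's OpenAI provider picks up `OPENAI_API_KEY` and `OPENAI_BASE_URL` from the environment.

```yaml
# Minimal example config -- prompt, provider, and assertion are placeholders.
description: Example automated LLM evaluation

prompts:
  - "Summarize the following text in one sentence: {{text}}"

providers:
  # Reads OPENAI_API_KEY / OPENAI_BASE_URL from the environment (.env file)
  - openai:gpt-4o-mini

tests:
  - vars:
      text: "Promptfoo runs the same prompt against one or more models and scores the responses."
    assert:
      # Simple automated check on the model output
      - type: contains
        value: "Promptfoo"
```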
Running Tests
To run the tests, execute the following command from the directory containing the files above. If you run it from a different directory, you will need to pass relative paths to the `.env` and config files.
```shell
npx promptfoo@latest eval --env-file .env
```

The results of the test will be written to the same directory. This can be changed by modifying the `promptfooconfig.yaml` file or by using the `--output` flag when running the command.
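For example, results can be redirected to a specific file with the `--output` flag; the file name here is illustrative, and the format is inferred from the extension.

```shell
# Write evaluation results to a JSON file instead of the default location
npx promptfoo@latest eval --env-file .env --output results.json
```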