Automated LLM Evaluation

Running automated large language model (LLM) tests with Promptfoo.
Modified: July 9, 2024

Introduction

This page shows how to run automated LLM tests with Promptfoo. If you are a member of the NIEHS GitHub organization, you can see a working example in the NIEHS/ToxPipe-Model-Comparisons repository.

Requirements

  • Promptfoo installed (or run on demand with npx, as shown below)
  • A .env file with the following variables (see the example after this list):
    • OPENAI_BASE_URL
    • OPENAI_API_KEY
    • OPENAI_API_VERSION
  • A promptfooconfig.yaml file defining your prompts, providers, and test cases (see the sketch after this list)
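
A minimal .env might look like the following. The values shown are placeholders; OPENAI_API_VERSION is typically only needed for Azure-style endpoints:

OPENAI_BASE_URL=https://your-endpoint.example.com/v1
OPENAI_API_KEY=your-api-key
OPENAI_API_VERSION=2024-02-01

A minimal promptfooconfig.yaml might look like the sketch below. The prompt text, the {{chemical}} variable, the model, and the assertion are illustrative placeholders, not values taken from the NIEHS repository:

# Prompts to evaluate; {{chemical}} is a test variable
prompts:
  - "Summarize the known toxicity of {{chemical}} in two sentences."

# Model(s) to run each prompt against
providers:
  - openai:gpt-4o

# Test cases supplying variables and automated assertions
tests:
  - vars:
      chemical: benzene
    assert:
      - type: contains
        value: benzene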

Running Tests

To run the tests, execute the following command from the directory containing the files above. If you run it from a parent directory, you will need to pass relative paths to the files instead.

npx promptfoo@latest eval --env-file .env

The test results are written to the same directory by default. This can be changed by modifying the promptfooconfig.yaml file or by using the --output flag when running the command.
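
For example, to write the results to a JSON file instead (the filename here is just an illustration):

npx promptfoo@latest eval --env-file .env --output results.json

Promptfoo also includes a local web viewer for browsing results after an eval run:

npx promptfoo@latest view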
