Compare Fine-tunes

When your model finishes fine-tuning, Entry Point AI will automatically start evaluating it against your validation examples.

Validation examples are processed with the temperature set to 0. This makes the outputs as repeatable as possible, so differences between fine-tunes reflect the models rather than random sampling.

Entry Point AI will also score the resulting outputs using the Scoring Method selected under the project's Evaluation settings.

For classifiers, Exact Match is typically a good choice and will automatically determine if the correct classification was chosen.
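To illustrate the idea behind Exact Match, here is a minimal Python sketch. It is not Entry Point AI's implementation: the function names are hypothetical, and ignoring surrounding whitespace and letter case is an assumption made for this example.

```python
def exact_match(expected, actual):
    """Return True when the model output equals the expected label."""
    # Assumption: whitespace and letter case are ignored before comparing.
    return expected.strip().lower() == actual.strip().lower()


def exact_match_score(pairs):
    """Return the percentage of (expected, actual) pairs that match."""
    if not pairs:
        return 0.0
    hits = sum(exact_match(expected, actual) for expected, actual in pairs)
    return 100.0 * hits / len(pairs)


# Example: two of three classifications are correct.
results = [("spam", "spam"), ("ham", "Ham "), ("spam", "ham")]
score = exact_match_score(results)
```

Because every output is simply right or wrong, this kind of scoring suits classifiers and needs no human judgment.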

For generative outputs, choose Manual or Predictive.

To review the outputs and scores, open the fine-tuned model from the Models page.

Then, scroll down to the Evaluation section.

With manual scoring, you go through each output and choose the rating that best reflects its quality. When you finish, you are presented with an overall percentage score that can be compared to other templated or fine-tuned models.
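As a rough sketch of how individual ratings could roll up into one percentage, consider the Python below. The rating labels, their weights, and the averaging rule are all assumptions made for illustration; the scale Entry Point AI actually uses may differ.

```python
# Hypothetical rating scale: each label maps to a share of full credit.
RATING_POINTS = {"poor": 0.0, "okay": 0.5, "great": 1.0}


def overall_score(ratings):
    """Average the points for each rating and express it as a percentage."""
    if not ratings:
        return 0.0
    total = sum(RATING_POINTS[rating] for rating in ratings)
    return 100.0 * total / len(ratings)


# Example: four outputs rated by hand.
model_a = overall_score(["great", "okay", "poor", "great"])
model_b = overall_score(["great", "great", "okay", "great"])
```

Reducing the ratings to a single number is what makes two models directly comparable, as long as both were rated on the same validation examples.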

Predictive scoring uses an LLM to score the outputs automatically, which is much faster than rating each one by hand. It's still a good idea to have a human review the predicted scores.

