Testing RAG + Gemini
Measuring Mathematical Problem Solving With the MATH Dataset
This is duplicated from prod for testing
Another really slow repo on prod