test-one-shot
qa_test.parquet
text → text
is_correct
Are these two lists the same? Answer with true or false, one word, all lowercase. List 1: {answer_spans} List 2: {prediction}
Oct 17, 2024, 11:52 PM UTC
5 row sample
264 tokens
5 rows processed, 264 tokens used
Sample Results completed
7 columns, 1-5 of 100 rows
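
The prompt template above contains two placeholders, {answer_spans} and {prediction}, and asks for a one-word true/false reply. The following is a minimal sketch of how such a template could be filled for one row and how the reply could be mapped to a boolean for is_correct. It assumes the placeholders correspond to columns of the same name in qa_test.parquet and that is_correct holds the parsed reply; the helper names build_prompt and parse_is_correct are hypothetical and are not taken from the tool that produced this run.

```python
from typing import Optional

# Template copied from the job configuration above.
PROMPT_TEMPLATE = (
    "Are these two lists the same? Answer with true or false, one word, "
    "all lowercase. List 1: {answer_spans} List 2: {prediction}"
)


def build_prompt(row: dict) -> str:
    """Fill the template from one row (hypothetical helper).

    Assumes the row has 'answer_spans' and 'prediction' keys.
    """
    return PROMPT_TEMPLATE.format(
        answer_spans=row["answer_spans"],
        prediction=row["prediction"],
    )


def parse_is_correct(reply: str) -> Optional[bool]:
    """Map a model reply to a boolean (hypothetical helper).

    Returns None when the reply is neither 'true' nor 'false',
    so malformed answers are not silently counted as wrong.
    """
    cleaned = reply.strip().lower()
    if cleaned == "true":
        return True
    if cleaned == "false":
        return False
    return None


if __name__ == "__main__":
    # Illustrative row, not taken from the dataset.
    example_row = {"answer_spans": ["Paris"], "prediction": ["Paris"]}
    print(build_prompt(example_row))
    print(parse_is_correct(" True\n"))
```

Returning None for unexpected replies keeps formatting failures separate from genuine "false" judgments, which can matter when a small 5 row sample is used to sanity-check the prompt before a full run.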