If you use DuckDB to run the tests, you can reference those files as if they were tables (select * from 'in.parquet'), and the tests run extremely fast.
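Roughly what that looks like (the file and column names here are just made up for illustration):

    import duckdb

    # DuckDB treats the Parquet file as a table - no loading step needed
    result_df = duckdb.sql("""
        select customer_id, sum(amount) as total_amount
        from 'in.parquet'
        group by customer_id
    """).df()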
One challenge if you're using Spark is that tests can be frustratingly slow to run. One possible solution (that I use myself) is to run most tests using DuckDB, and only run e.g. the overall end-to-end test using Spark SQL.
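In pytest terms, one way to do that is to mark the slow Spark test so you can exclude it from the fast loop (this assumes a 'spark' marker registered in your pytest config):

    import pytest

    def test_metric_in_duckdb():
        ...  # fast: run the SQL under test against DuckDB

    @pytest.mark.spark
    def test_pipeline_end_to_end():
        ...  # slow: run the full job through Spark SQL

    # Fast local loop:       pytest -m "not spark"
    # Full run (e.g. on CI): pytest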
I've used the above strategy with pytest, but conceptually I don't think it's particularly sensitive to the programming language or test runner you use.
Also I have no idea whether this is good practice - it's just something that seemed to work well for me.
The approach with CSVs can be nice because your customers can review those files for correctness (they may be the owners of the metric) without needing to be coders. They just need to confirm that in.csv should result in expected_out.csv.
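As a rough sketch (the file names, column names and metric SQL are all placeholders), such a test might look like:

    import duckdb
    import pandas as pd
    from pandas.testing import assert_frame_equal

    METRIC_SQL = """
        select customer_id, sum(amount) as total_amount
        from in_data
        group by customer_id
        order by customer_id
    """

    def test_metric_matches_reviewed_fixture():
        con = duckdb.connect()
        # Expose the reviewed input file under the name the metric SQL expects
        con.execute("create view in_data as select * from 'in.csv'")
        actual = con.execute(METRIC_SQL).df()
        expected = pd.read_csv("expected_out.csv")
        # CSVs carry no type information, so be lenient about dtypes
        assert_frame_equal(actual, expected, check_dtype=False)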
If it makes it more readable, you can also inline the 'in' and 'expected_out' data, e.g. as a list of dicts, and pass it into DuckDB as a pandas DataFrame.
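Something along these lines (again, made-up columns) - DuckDB can query pandas DataFrames that are in scope by name:

    import duckdb
    import pandas as pd
    from pandas.testing import assert_frame_equal

    in_df = pd.DataFrame([
        {"customer_id": 1, "amount": 10.0},
        {"customer_id": 1, "amount": 5.0},
        {"customer_id": 2, "amount": 7.5},
    ])
    expected_out_df = pd.DataFrame([
        {"customer_id": 1, "total_amount": 15.0},
        {"customer_id": 2, "total_amount": 7.5},
    ])

    def test_metric_with_inline_data():
        # DuckDB resolves 'in_df' to the DataFrame above via replacement scans
        actual = duckdb.sql("""
            select customer_id, sum(amount) as total_amount
            from in_df
            group by customer_id
            order by customer_id
        """).df()
        assert_frame_equal(actual, expected_out_df)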
One gotcha is that SQL does not guarantee row order, so you need to sort somewhere or otherwise make sure your tests are robust to this.
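E.g. either put an explicit order by in the query under test, or sort both sides before comparing - a small helper along these lines (names are just illustrative) does the job:

    import pandas as pd
    from pandas.testing import assert_frame_equal

    def assert_same_rows(actual: pd.DataFrame, expected: pd.DataFrame, key_cols: list[str]) -> None:
        # Sort on the key columns so the comparison ignores row order
        assert_frame_equal(
            actual.sort_values(key_cols).reset_index(drop=True),
            expected.sort_values(key_cols).reset_index(drop=True),
            check_dtype=False,
        )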
https://www.goodreads.com/book/show/34890015-factfulness
I can highly recommend it.