This isn't true. Pandas uses NumPy to store columns of data. There are quite a few technical errors in the article.
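You can check this yourself. A minimal sketch, assuming pandas with its default (non-Arrow) dtypes; the frame and column names are made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical two-column frame, built with pandas' default dtypes.
df = pd.DataFrame({"x": [1, 2, 3], "y": [0.5, 1.5, 2.5]})

# With default settings, a numeric column reports a plain NumPy dtype...
x_dtype = df["x"].dtype

# ...and to_numpy() hands back an ordinary NumPy ndarray of its values.
x_values = df["x"].to_numpy()

is_numpy_backed = isinstance(x_dtype, np.dtype) and isinstance(x_values, np.ndarray)
print(x_dtype, type(x_values).__name__, is_numpy_backed)
```

If you opt into Arrow-backed dtypes explicitly, the dtype check no longer holds, which is the actual change the article seems to be getting at.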
The comparison of NumPy reading CSV to Arrow reading Parquet is completely bizarre and totally misses the point of switching out the underlying data format.
> Reading in that CSV file into memory would take Python 55.8 seconds, but PyArrow did the work in 11.8 seconds.
It's later clarified that PyArrow does load CSV here, though the numbers don't fully add up. The format change is also explained.
> the format is much favored by AI frameworks such as TensorFlow and PyCharm.