[Vaex] Converting big data (.csv) to HDF5

2020. 9. 2. 13:53 | Analysis Python/Vaex

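import vaex

# Read the CSV in chunks of 100,000 rows so the whole file never has to fit in memory
for i, df in enumerate(vaex.from_csv('taxi.csv', chunk_size=100_000)):
    # Keep only rows with fewer than 6 passengers
    df = df[df.passenger_count < 6]
    # Write each filtered chunk to its own file: taxi_00.hdf5, taxi_01.hdf5, ...
    df.export_hdf5(f'taxi_{i:02}.hdf5')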

vaex.readthedocs.io/en/latest/api.html#vaex.open
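
Once the chunks are written, they can be read back as one DataFrame. The following is only a minimal sketch, assuming the `taxi_XX.hdf5` files from the loop above are in the current directory. The helper names `list_chunks` and `open_chunks` are made up for illustration and are not part of vaex; `vaex.open_many` is the library call that concatenates a list of files.

```python
import glob


def list_chunks(pattern="taxi_*.hdf5"):
    """Return the chunk files matching `pattern`, sorted by name.

    The zero-padded index (taxi_00, taxi_01, ...) keeps name order equal
    to chunk order for up to 100 chunks.
    """
    return sorted(glob.glob(pattern))


def open_chunks(pattern="taxi_*.hdf5"):
    """Open all matching chunk files as a single vaex DataFrame.

    Hypothetical helper: it only wraps vaex.open_many.
    """
    import vaex  # imported here so list_chunks works without vaex installed

    paths = list_chunks(pattern)
    if not paths:
        raise FileNotFoundError(f"no files match {pattern!r}")
    # open_many concatenates the DataFrames from all files in the given order
    return vaex.open_many(paths)
```

Calling `df = open_chunks()` should give one DataFrame spanning every chunk; `df.export_hdf5('taxi.hdf5')` could then write it out as a single file if that is preferred.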

 
