Today I received a 45Mb CSV file for importing into a database… Needless to say the application we were importing to didn’t seem to like the size of the file, for what ever reason… So I knocked up a quite bash script to create smaller ‘chunks’ defined as a number of lines, to make importing simpler.
I’m sure there’s many way in which is can be simplified, so if you know any I’d like the contributions!
It’s run like this:
$ ./csv-chunk.sh large-data.csv 5000
The first argument being the filename and the second argument the maximum number of lines for each ‘chunk’. From that 45Mb megalith, 38 files of around 1.2Mb were produced which didn’t seem to break the other end!