Randomly select lines from a text file

Post date: Jun 29, 2014 10:09:05 AM

This technique is useful when you want to downsample your data. The one-liner below keeps each line independently with probability 0.01, so it selects roughly 1% of all lines (the exact count varies from run to run). Because it reads the file one line at a time, it avoids loading the whole file into memory:

perl -ne 'print if (rand() < .01)' your_file.txt 

Further reading:

Randomly selecting 1% of the lines from a text file:

http://stackoverflow.com/questions/692312/randomly-pick-lines-from-a-file-without-slurping-it-with-unix

Get even/odd lines from a text file:

http://stackoverflow.com/questions/21309020/remove-odd-or-even-lines-from-text-file-in-terminal-in-linux