Generating Random Text Files for Testing

Submitted on September 16, 2021 – 11:31 am

Sometimes you need dummy folder structures populated with random data for testing your various scripts and processes: backups, file transfers, encryption, compression, etc. Every time I need something like this, I end up writing a little script from scratch.

I’ve written many of these little loops that I’ve lost and forgotten. But here’s the final, definitive version. No more, I promise.

This nested loop will create two top-level folders (test_set_01 and test_set_02), each populated with between 2 and 12 subfolders. Every subfolder will contain between 2 and 120 files. Each file will be between 512 bytes and 256KB in size and will consist of lines of random alphanumeric characters; the line width is picked per file, anywhere between 60 and 280 characters. So we are talking about 720MB tops (2 × 12 × 120 × 256KB). It’s a good random data set for running various tests.

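set +o history   # pause history while the loop runs (interactive shells)
pids=()          # PIDs of the background generators
for k in 1 2
do
  mkdir -p "test_set_0${k}"
  # 2 to 12 subfolders, zero-padded: dir_01 .. dir_12
  for i in $(seq -w 01 $(shuf -i 02-12 -n 1))
  do
    mkdir -p "./test_set_0${k}/dir_${i}"
    echo "Populating ./test_set_0${k}/dir_${i}"
    # 2 to 120 files, zero-padded: file_001 .. file_120
    for j in $(seq -w 001 $(shuf -i 002-120 -n 1))
    do
      # Random alphanumerics, wrapped at 60-280 characters, cut off at 512 bytes to 256KB
      { tr -dc '[:alnum:]' </dev/urandom | fold -w $(shuf -i 60-280 -n 1) | head -c $(shuf -i 512-262144 -n 1) > "./test_set_0${k}/dir_${i}/file_${j}" & } 2>/dev/null 1>&2
      pids+=($!)
    done
  done
done
wait "${pids[@]}"   # block until every file has been written
set -o history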
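
The 720MB ceiling quoted earlier is just the product of the upper bounds passed to shuf. Here is a quick sanity check of that arithmetic, plus a way to see what a particular run actually produced. The variable names are mine, not part of the loop, and the find/du step assumes you run it from the directory that holds the test sets.

```shell
#!/usr/bin/env bash
# Worst case implied by the loop: every shuf call returns its upper bound.
sets=2; max_dirs=12; max_files=120; max_bytes=262144   # 262144 bytes = 256KB

max_total=$(( sets * max_dirs * max_files * max_bytes ))
echo "upper bound: $(( max_total / 1024 / 1024 )) MB"   # prints "upper bound: 720 MB"

# Best case: every shuf call returns its lower bound (2 dirs, 2 files, 512 bytes).
min_total=$(( sets * 2 * 2 * 512 ))
echo "lower bound: ${min_total} bytes"                  # prints "lower bound: 4096 bytes"

# What one run actually produced (only meaningful once the loop has finished).
if [ -d test_set_01 ] && [ -d test_set_02 ]; then
  echo "files: $(find test_set_01 test_set_02 -type f | wc -l)"
  du -sh test_set_01 test_set_02
fi
```

Since every count and size is drawn at random, a typical run lands well under the ceiling.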

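Now, if you’re wondering about the set +/-o history lines: in an interactive shell, and depending on your bash version, the ! in pids+=($!) can be treated as a history expansion, and turning history off around the loop avoids that. If you save the loop as a script, you don’t need them, because history is off in non-interactive shells by default. As you can see, each file is written by its own pipeline running in the background, and $! records that job’s process ID in the pids array. Waiting on those PIDs (wait "${pids[@]}") is how you know every file is complete. Running everything in the background greatly speeds things up (at the expense of your CPUs, of course).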
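
To see the background-and-wait pattern in isolation, here is a minimal sketch meant to be saved and run as a script. The make_random_file helper, the demo_dir variable, and the fixed width and size are my own illustration, not part of the loop above. I have also added LC_ALL=C so that tr stays byte-oriented and [:alnum:] means plain ASCII letters and digits.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the original loop.
# Usage: make_random_file PATH LINE_WIDTH TOTAL_BYTES
make_random_file() {
  local path=$1 width=$2 bytes=$3
  # tr filters the random stream down to alphanumerics, fold wraps it into
  # lines of LINE_WIDTH characters, head stops after TOTAL_BYTES bytes.
  LC_ALL=C tr -dc '[:alnum:]' </dev/urandom 2>/dev/null |
    fold -w "$width" |
    head -c "$bytes" >"$path"
}

demo_dir=$(mktemp -d)   # scratch directory for the demo
pids=()
for n in 1 2 3 4
do
  make_random_file "$demo_dir/file_$n" 80 4096 &   # run in the background
  pids+=("$!")                                     # remember its PID
done
wait "${pids[@]}"       # returns once all four files are complete
ls -l "$demo_dir"
```

Because head counts the newlines that fold inserts, the file size is exact, but the last line is usually shorter than the rest.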
