Real-life example
In order to illustrate this IO resource starvation problem, we are going to run some basic one-liner commands on a public NASDAQ data set from www.nasdaqtrader.com: the 2005 data, which can be downloaded from their FTP server.
Emptying the buff/cache
First of all, in order to establish a baseline, the buffer cache should be emptied. After cleaning the buff/cache we still have 1.6G of memory used for buff/cache:
root@mediacenter:~# sync; echo 3 > /proc/sys/vm/drop_caches
root@mediacenter:~# free -h
total used free shared buff/cache available
Mem: 15Gi 8,2Gi 5,9Gi 715Mi 1,6Gi 6,5Gi
Swap: 0B 0B 0B
Getting the “real” Data
The following command uses GNU parallel to download the 2005 data set.
The computer used for these tests has 4 cores and 16G of RAM, so parallel will run the wget commands in batches of four.
parallel wget ftp://ftp.nasdaqtrader.com/symboldirectory/regshopilot/NASDAQsh2005{1}.zip ::: {01..12}
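By default GNU parallel starts one job per CPU core, which on this 4-core machine is what produces the batches of four mentioned above. If you want to make the batch size explicit (or limit it), the -j option can be used; the line below is just an illustrative variant of the same download, not an extra step:

# same download, with the batch size of four made explicit
parallel -j4 wget ftp://ftp.nasdaqtrader.com/symboldirectory/regshopilot/NASDAQsh2005{1}.zip ::: {01..12}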
We will end up with 12 zip files totalling 916M:
juan@mediacenter:~/tmp/post$ ls -lh *.zip
-rw-r--r-- 1 juan juan 71M may 30 22:51 NASDAQsh200501.zip
-rw-r--r-- 1 juan juan 69M may 30 22:52 NASDAQsh200502.zip
-rw-r--r-- 1 juan juan 77M may 30 22:54 NASDAQsh200503.zip
-rw-r--r-- 1 juan juan 76M may 30 22:55 NASDAQsh200504.zip
-rw-r--r-- 1 juan juan 75M may 30 22:57 NASDAQsh200505.zip
-rw-r--r-- 1 juan juan 78M may 30 22:58 NASDAQsh200506.zip
-rw-r--r-- 1 juan juan 75M may 30 23:00 NASDAQsh200507.zip
-rw-r--r-- 1 juan juan 81M may 30 23:01 NASDAQsh200508.zip
-rw-r--r-- 1 juan juan 75M may 30 23:03 NASDAQsh200509.zip
-rw-r--r-- 1 juan juan 84M may 30 23:05 NASDAQsh200510.zip
-rw-r--r-- 1 juan juan 82M may 30 23:06 NASDAQsh200511.zip
-rw-r--r-- 1 juan juan 78M may 30 23:08 NASDAQsh200512.zip
juan@mediacenter:~/tmp/post$ du -sh
916M .
Please note that while we download the files locally, the buff/cache is also being filled. So, until the buff/cache is cleaned or overwritten with newer data, access to these files will be quicker.
juan@mediacenter:~/tmp/post$ free -h
total used free shared buff/cache available
Mem: 15Gi 8,4Gi 4,8Gi 782Mi 2,5Gi 6,3Gi
Swap: 0B 0B 0B
If we do some math: the 2.5G of buff/cache used after downloading the files, minus the previous value of 1.6G, is 0.9G, which is the size of all the files.
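If you want to double-check which of these files are actually resident in the page cache, a third-party tool such as vmtouch (assuming it is installed; fincore from util-linux is an alternative) can report the number of resident pages per file:

# show how much of each zip file is currently held in the page cache
vmtouch *.zip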
Simulating getting the data with cat
In order to simulate getting the data from the real source, but without consuming www.nasdaqtrader.com's bandwidth, we are going to use cat to read the files from the local filesystem instead of from the remote FTP server, putting them into the buff/cache. Then we will check, using the time command, that the subsequent accesses take less time.
juan@mediacenter:~/tmp/post$ free -h
total used free shared buff/cache available
Mem: 15Gi 8,4Gi 5,7Gi 782Mi 1,6Gi 6,3Gi
Swap: 0B 0B 0B
juan@mediacenter:~/tmp/post$ time cat *.zip >/dev/null
real 0m6,649s
user 0m0,014s
sys 0m0,738s
juan@mediacenter:~/tmp/post$ time cat *.zip >/dev/null
real 0m0,239s
user 0m0,001s
sys 0m0,237s
juan@mediacenter:~/tmp/post$ time cat *.zip >/dev/null
real 0m0,211s
user 0m0,001s
sys 0m0,210s
juan@mediacenter:~/tmp/post$ free -h
total used free shared buff/cache available
Mem: 15Gi 8,4Gi 4,8Gi 782Mi 2,5Gi 6,3Gi
Swap: 0B 0B 0B
The first cat command took 6.649s, while the second and third attempts took 0.239s or less. Let's do the math again: reading the files through the buff/cache was almost 28 times faster!
The buff/cache grows by the same amount: 2.5G – 1.6G = 0.9G, meaning that the cat procedure perfectly simulates the download of the files at the buff/cache usage level.
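Before each of the runs below, the buff/cache is presumably dropped again so that every test starts from a known state (the free output shows it back near the baseline each time). A minimal helper for that, reusing the same command as in the baseline and assuming a root shell, could look like this:

# drop the page cache, dentries and inodes so the next test starts cold (run as root)
drop_caches() {
    sync
    echo 3 > /proc/sys/vm/drop_caches
}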
Serial unzip with buff/cache vs parallel unzip without buff/cache
Now that we are comfortable working with the buff/cache, we are going to test two cases and see how they perform.
Serial unzip with buff/cache
juan@mediacenter:~/tmp/post$ free -h
total used free shared buff/cache available
Mem: 15Gi 8,5Gi 5,4Gi 785Mi 1,8Gi 6,1Gi
Swap: 0B 0B 0B
juan@mediacenter:~/tmp/post$ time cat *.zip >/dev/null
real 0m6,113s
user 0m0,012s
sys 0m0,732s
juan@mediacenter:~/tmp/post$ free -h
total used free shared buff/cache available
Mem: 15Gi 8,5Gi 4,5Gi 785Mi 2,7Gi 6,1Gi
Swap: 0B 0B 0B
juan@mediacenter:~/tmp/post$ time for i in `ls *.zip`; do unzip -d uncompressed $i; done
Archive: NASDAQsh200501.zip
inflating: uncompressed/NASDAQsh20050128.txt
inflating: uncompressed/NASDAQsh20050103.txt
[....]
inflating: uncompressed/NASDAQsh20051219.txt
inflating: uncompressed/NASDAQsh20051216.txt
inflating: uncompressed/NASDAQsh20051215.txt
inflating: uncompressed/NASDAQsh20051214.txt
real 0m53,171s
user 0m32,330s
sys 0m5,866s
Unzipping the 12 files in a serial loop using the buff/cache took 53.171s.
Parallel unzip without buff/cache
juan@mediacenter:~/tmp/post$ free -h
total used free shared buff/cache available
Mem: 15Gi 8,6Gi 5,7Gi 785Mi 1,3Gi 6,0Gi
Swap: 0B 0B 0B
juan@mediacenter:~/tmp/post$ time ls *.zip |parallel unzip -d uncompressed {}
Archive: NASDAQsh200501.zip
checkdir: cannot create extraction directory: uncompressed
File exists
Archive: NASDAQsh200502.zip
[...]
inflating: uncompressed/NASDAQsh20051216.txt
inflating: uncompressed/NASDAQsh20051215.txt
inflating: uncompressed/NASDAQsh20051214.txt
real 0m50,119s
user 0m34,300s
sys 0m5,715s
Unzipping the 12 files in parallel, reading all the data from disk, took 50.119s.
Summary
The performance of both examples is equivalent: 53.171s vs 50.119s. It seems that a parallel process reading from disk is equivalent to a serial process reading from the buff/cache.
The real performance booster should be parallelization combined with buff/cache usage. Let's see how it goes:
juan@mediacenter:~/tmp/post$ free -h
total used free shared buff/cache available
Mem: 15Gi 8,4Gi 6,1Gi 771Mi 1,2Gi 6,3Gi
Swap: 0B 0B 0B
juan@mediacenter:~/tmp/post$ time cat *.zip >/dev/null
real 0m8,276s
user 0m0,012s
sys 0m0,738s
juan@mediacenter:~/tmp/post$ free -h
total used free shared buff/cache available
Mem: 15Gi 8,4Gi 5,2Gi 771Mi 2,1Gi 6,3Gi
Swap: 0B 0B 0B
juan@mediacenter:~/tmp/post$ time ls *.zip |parallel unzip -d uncompressed {}
Archive: NASDAQsh200501.zip
inflating: uncompressed/NASDAQsh20050128.txt
inflating: uncompressed/NASDAQsh20050103.txt
[...]
inflating: uncompressed/NASDAQsh20051216.txt
inflating: uncompressed/NASDAQsh20051215.txt
inflating: uncompressed/NASDAQsh20051214.txt
real 0m45,837s
user 0m43,599s
sys 0m7,291s
Unzipping the 12 files in parallel using the buff/cache took 45.837s. It is only a little bit better. Disappointing, right?
What do all three runs have in common? They write to the hard disk, the slowest component in the whole pipeline.
Is it really necessary to write the files to disk? Maybe not: usually what we want is the data inside the zip files. In these cases it is better to unzip the files, do the processing needed, and skip writing to the hard disk.
Let’s see how it goes by just counting the lines of the unzipped files and not writing anything to disk:
Parallel
juan@mediacenter:~/tmp/post$ time ls *.zip |parallel unzip -c {} |wc -l
165414981
real 0m24,252s
user 0m41,877s
sys 0m15,888s
Serial
juan@mediacenter:~/tmp/post$ time for i in `ls *.zip`; do unzip -c $i; done |wc -l
165414981
real 0m39,785s
user 0m37,365s
sys 0m6,174s
These results look far better. 🙂 The parallel job performs better than the serial one in this situation.
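For completeness, the natural next step would be to combine both optimizations: warm the buff/cache first and then stream-process in parallel. This is only a sketch reusing the commands above; the actual timings will depend on the machine and were not measured here:

# warm the buff/cache with a sequential read, then stream-unzip and count lines in parallel
time cat *.zip >/dev/null
time ls *.zip | parallel unzip -c {} | wc -l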
Conclusion
Avoiding disk IO operations as much as possible should be a priority in all cases.
The buff/cache can help reduce IO if we are able to group in time the read operation (network or file) and the processing to be done on its content, as we will be increasing the odds that the Linux kernel finds the required data in the buff/cache (RAM) and not on the network or on the disk.
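As an illustration of "grouping in time", the download and the processing can be put into the same parallel job, so each file is consumed while its data is still hot in the buff/cache. This is only a sketch built from the commands used in this post, not a measured result:

# hypothetical sketch: download each month and count its lines immediately,
# while the freshly written zip is still in the buff/cache
parallel 'wget -q ftp://ftp.nasdaqtrader.com/symboldirectory/regshopilot/NASDAQsh2005{1}.zip && unzip -c NASDAQsh2005{1}.zip | wc -l' ::: {01..12}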
Homework
Imagine two big ETL processes that have to download TBs of data for processing.
- One that downloads all the data (overwriting the buff/cache) and process all the data as it comes
- One that has a queue system and downloads each piece of data only moments before it is processed
Which one will be quicker? And cheaper?
To be continued…
In the coming post we will try to create a very simple queue system using parallel, and use the usual Linux capacity tools (sar, iotop, dstat, iftop, htop, etc.) to find bottlenecks and build performance reports.