top -M or when rounding errors get serious

We all know that a megabyte in the binary system is not the same as one million bytes (in the decimal system). But have you actually cared much about it? I have to admit I haven't. I know there is a small rounding error, but by and large I always treated 2^10 = 1 kB = 1024 bytes and 10^3 = 1 kB = 1000 bytes as the same thing. (Update: The opening sentence was edited to remove the units MB and MiB, since it seems even I managed to use them backwards! The math in this article is correct. The rest of the article uses MB, GB and TB mostly to refer to binary magnitudes, which is apparently incorrect. See the comments for Wikipedia links and discussion.)

More importantly, when you move into larger numbers, rounding errors usually become even less important. Unfortunately, in this case they become bigger:

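| magnitude | binary (bytes) | decimal (bytes) | % error | absolute error | binary value in kB (1024 bytes) | % error of kB value |
|---|---|---|---|---|---|---|
| 1 kB | 1024 | 1000 | 2.4% | 24 bytes | 1 | 0.0% |
| 1 MB | 1048576 | 1000000 | 4.9% | 48.6 kB | 1024 | 2.4% |
| 1 GB | 1073741824 | 1000000000 | 7.4% | 73.74 MB | 1048576 | 4.9% |
| 100 GB | 107374182400 | 100000000000 | 7.4% | 7.4 GB | 104857600 | 4.9% |
| 1 TB | 1099511627776 | 1000000000000 | 10.0% | 100 GB | 1073741824 | 7.4% |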
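
The percentages in the table can be reproduced with a few lines of Python. This is only a sketch of the arithmetic; the function name `prefix_error` is made up for illustration.

```python
# Gap between binary (1024-based) and decimal (1000-based) prefixes.
PREFIXES = ["k", "M", "G", "T"]

def prefix_error(power):
    """Return (binary_bytes, decimal_bytes, percent_error) for 1024**power vs 1000**power."""
    binary = 1024 ** power
    decimal = 1000 ** power
    return binary, decimal, (binary - decimal) / decimal * 100

for power, name in enumerate(PREFIXES, start=1):
    binary, decimal, pct = prefix_error(power)
    print(f"1 {name}B: {binary} vs {decimal} bytes, "
          f"error {pct:.1f}% ({binary - decimal} bytes)")
```

The error grows by roughly 2.4 percentage points with every prefix step, which is why it was easy to ignore at the kB level and is hard to ignore at the TB level.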

The above table makes it clear why these rounding errors are becoming a problem even if we could ignore them previously. When we had megabytes or even a few gigabytes of RAM, it didn't matter. In the 1 GB region the error is 74 MB, but I don't want to use up the last 100 MB of RAM anyway, so the rounding error is smaller than anything I care about.

But now I'm working with a server that has more than 100GB of memory. Suddenly the difference in getting your numbers right makes 7 and a half GB worth of difference! 7+ gigs of RAM costs a lot, so of course I will not be leaving that much unused but will try to use it up until the last gigabyte. But to use that RAM, I need to know how much precisely there is to use!

And if that didn't convince you, in a year we'll see the first terabyte level systems come out. At that point the rounding error is 100GB!

And then there is top. I use top because that's what everyone else does too. It has a lot of numbers that are fun to watch. I watch them like everyone else does too. But I don't really understand what they mean.

The reason is that top wants to show me all values in kB (kilobytes). I believe WTF! is an appropriate expression...

I could learn to live with the discrepancies between binary and decimal magnitudes. But this idiot utility chooses to show me values that are neither. I can learn a few numbers by heart: 107 billion is the same as 100 GB, or 137 billion is actually 128 GB, or 1.1 trillion is 1 TB. But 104 something...? What is that????

I'll tell you what it is. It is about 104 million kB, that is 104*1000*1000*1024 bytes, which is about 100 GB.
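
To make the mixed arithmetic concrete, here is a small sketch that converts a kB figure of the kind top prints into both binary and decimal gigabytes. It assumes top's "kB" means 1024 bytes; the helper names are hypothetical.

```python
KIB = 1024  # assumption: one "kB" in top's output is 1024 bytes

def kb_to_binary_gb(kb):
    """Convert kB (1024-byte units) to binary gigabytes (2**30 bytes)."""
    return kb * KIB / 1024 ** 3

def kb_to_decimal_gb(kb):
    """Convert kB (1024-byte units) to decimal gigabytes (10**9 bytes)."""
    return kb * KIB / 1000 ** 3

kb_value = 104857600  # the "104 something" figure for 100 GB of RAM
print(kb_to_binary_gb(kb_value))   # 100.0
print(kb_to_decimal_gb(kb_value))  # about 107.37
```

So the same amount of memory reads as 100, 104.9 or 107.4 depending on whether you count in binary gigabytes, millions of kB, or decimal gigabytes.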

I think top is doing us a great disservice with this approach. It uses numbers that are in their own alternative universe, so it's an additional burden to try to remember what they really mean. It mixes binary (1024) and decimal (1000) arithmetic in the same value, which is nonsensical. And worst of all, it makes the rounding error smaller, so it is more tempting to continue to ignore it. But an error of 5 GB is not much better than an error of 7.4 GB, so you can't really ignore it after all!

But wait! you say. GNU utilities always support the -h option to output "human readable" values for cases like this. Guess what: top doesn't :-(

But if you really read the man page to the end, and make an enlightened guess at what some cryptic sentences mean, it turns out that top -M is what you want! (I believe the "M" is for "Megabytes", but it also shows gigabytes just fine.)


# top -M
top - 08:13:04 up 19 days, 19:44, 1 user, load average: 0.37, 0.44, 0.38
Tasks: 245 total, 1 running, 244 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.4%us, 0.2%sy, 0.0%ni, 99.1%id, 0.2%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 125.974G total, 125.568G used, 416.125M free, 140.949M buffers
Swap: 1023.992M total, 256.000k used, 1023.742M free, 15.407G cached
 
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
8002 mysql 15 0 108g 108g 6412 S 7.9 85.9 2091:36 /usr/sbin/mysqld --basedir=/usr --datadir=/data/mysql/var --plugin-dir=/usr/lib64/mysql/plugin --user=mysql --log-err
25576 root 15 0 12876 1204 808 R 1.0 0.0 0:00.03 top -M
1 root 15 0 10352 684 576 S 0.0 0.0 0:02.38 init [3]

Excellent!

Except, there is more...

From the above you'd think there is a MySQL server that is consuming 108 GB of memory. Unfortunately that's not true :-(


# cat /proc/8002/status
Name: mysqld
...
VmPeak: 115257328 kB
VmSize: 114234120 kB
VmLck: 0 kB
VmHWM: 113439968 kB
VmRSS: 113439968 kB
...

Stupid fscking kB values everywhere!

If I've googled this correctly, then VmSize should correspond to the VIRT column in top, and VmRSS to the RES column. These are the total memory allocated (virtual memory) and "resident memory", which is the physical amount of RAM consumed. VmPeak is the maximum amount of memory ever allocated at a single point in time, meaning some memory has been released compared to that peak value.

So let's do the math just this one time: 114234120 kB = (114234120 * 1024)/(1024 * 1024 * 1024) GB = 108.9421463013 GB.
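
If you'd rather not do that math by hand every time, something like the following sketch could do it for all the Vm* lines at once. It works on a pasted copy of the output above rather than reading /proc directly, it assumes kB means 1024 bytes, and `parse_vm_fields` is a made-up helper name.

```python
# Sample text copied from the /proc/8002/status output above (elided lines left out).
SAMPLE_STATUS = """\
Name: mysqld
VmPeak: 115257328 kB
VmSize: 114234120 kB
VmLck: 0 kB
VmHWM: 113439968 kB
VmRSS: 113439968 kB
"""

def parse_vm_fields(status_text):
    """Return {field_name: size_in_binary_GB} for every 'Vm*: <n> kB' line."""
    sizes = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if key.startswith("Vm") and len(parts) == 2 and parts[1] == "kB":
            # kB -> bytes -> binary GB; kB is assumed to be 1024 bytes
            sizes[key] = int(parts[0]) * 1024 / 1024 ** 3
    return sizes

for name, gb in parse_vm_fields(SAMPLE_STATUS).items():
    print(f"{name}: {gb:.2f} GB")
```

On a live system you would pass in the contents of /proc/<pid>/status instead of the sample string.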

Looks more like 109 GB than 108 GB to me. I've been monitoring this for a while, and unfortunately it seems that at least for the values in the process list, top simply cuts the number rather than properly rounding it. So this is another place where you can simply lose another gigabyte if you make the mistake of trusting top.
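
The difference between cutting and rounding is easy to show. I haven't read top's source, so the sketch below only mimics the behaviour I observed; it is not a claim about how top actually formats its numbers, and it again assumes kB means 1024 bytes.

```python
import math

KB_PER_GB = 1024 ** 2  # assumption: kB is 1024 bytes, GB is 2**30 bytes

def truncated_gb(kb):
    """Drop the fractional part, as the process list in top appears to do."""
    return math.floor(kb / KB_PER_GB)

def rounded_gb(kb):
    """Round to the nearest whole gigabyte instead."""
    return round(kb / KB_PER_GB)

vm_size_kb = 114234120  # VmSize from the /proc output above
print(truncated_gb(vm_size_kb), rounded_gb(vm_size_kb))  # prints: 108 109
```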

For this latter rounding error there seems to be no cure other than simply not using top if you want gigabyte precision (yeah, sounds like a lot to ask...). Or you could mentally add +1 to all the numbers to be on the safe side, which is rather silly.

http://en.wikipedia.org/wiki/Mebibyte

Megabyte is now assumed to mean the base10 version.
Mebibyte is the base2 version.

Some systems will actually count in MiB, GiB etc. (and note the units correctly), but I find these are few and far between. I have yet to find a mechanism better than intuition for guessing which units they really mean when they say MB, GB etc., although typically it is base2 (except obviously in the case of hard drives and related data counters).

http://lxr.linux.no/linux+v2.6.39/Documentation/filesystems/proc.txt

Nothing there mentions explicitly the units as KiB or MiB, but again this is the kind of thing where there is a lot of legacy documentation buildup and you just have to go with intuition.

You could of course go into the source code and verify it precisely, but since the kernel will be dealing with processor-oriented register and page sizes it would be extremely odd to expect power of 10 units.

So in this context kB would mean 1024 bytes.
