Gigabyte: Decimal vs. Binary

Everyone who has ever purchased a hard drive finds out the hard way that there are two ways to define a gigabyte.  

When you buy a “500 Gigabyte” hard drive, the vendor defines it using the decimal powers of ten definition of the “Giga” prefix.

500 * 10^9 bytes = 500,000,000,000 bytes = 500 Gigabytes

But the operating system determines the size of the drive using the computer’s binary powers of two definition of the “Giga” prefix:

465 * 2^30 bytes = 499,289,948,160 bytes = 465 Gigabytes

If you’re wondering where 35 Gigabytes of your 500 Gigabyte drive just disappeared to, you’re not alone. It’s an old trick perpetuated by hard drive makers: they intentionally use the official SI definitions of the Giga prefix so they can inflate the sizes of their hard drives, at least on paper. This was always an annoyance, but now it’s much more difficult to ignore, as it results in large discrepancies with today’s enormous hard drives. When is a Terabyte hard drive not a Terabyte? When it’s 931 GB.
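If you want to check the arithmetic yourself, here's a minimal Python sketch. The function name `reported_binary_gb` is my own invention for illustration, and it assumes the operating system simply divides the raw byte count by 2^30 (ignoring any space lost to formatting or partitioning).

```python
GIB = 2**30  # bytes in one binary "gigabyte"


def reported_binary_gb(advertised_gb: int) -> float:
    """Convert a capacity advertised in decimal gigabytes (10^9 bytes)
    into the binary gigabytes (2^30 bytes) an OS would typically show."""
    total_bytes = advertised_gb * 10**9
    return total_bytes / GIB


for advertised in (500, 1000):
    shown = reported_binary_gb(advertised)
    print(f"{advertised} GB advertised -> {shown:.2f} binary GB")
# 500 GB advertised -> 465.66 binary GB
# 1000 GB advertised -> 931.32 binary GB
```

Truncate those results to whole numbers and you get the 465 and 931 figures quoted above.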

As Ned Batchelder notes, the hard drive manufacturers are technically conforming to the letter of the SI prefix definitions. It’s us computer science types who are abusing the official prefix designations:

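| Prefix | Symbol | Year Approved | Official Definition | Informal Meaning | Difference | Derived From |
|--------|--------|---------------|---------------------|------------------|------------|--------------|
| giga | GB | 1960 | 10^9 | 2^30 | 7% | Greek root for giant |
| tera | TB | 1960 | 10^12 | 2^40 | 10% | Greek root for monster |
| peta | PB | 1975 | 10^15 | 2^50 | 13% | Greek root for five, "penta" |
| exa | EB | 1975 | 10^18 | 2^60 | 15% | Greek root for six, "hexa" |
| zetta | ZB | 1991 | 10^21 | 2^70 | 18% | Latin root for seven, "septem", p dropped, first letter changed to Z to avoid confusion with other SI symbols |
| yotta | YB | 1991 | 10^24 | 2^80 | 21% | Greek root for eight, "octo", c dropped, y added to avoid having symbol of zero-like letter O |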

As the size of the prefix grows, so does the gap between the official and informal meaning of the prefix. And yes, there are larger official SI prefixes beyond these, just in case someone needs more than 1000 yottabytes. Ned noted that one of the SI proposals is for the prefix “luma,” representing 10^63.
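The widening gap in the table above is easy to reproduce. This is just a rough sketch; `gap_percent` and the `step` numbering (kilo = 1, mega = 2, giga = 3, and so on) are my own naming, not anything official.

```python
PREFIXES = ["giga", "tera", "peta", "exa", "zetta", "yotta"]


def gap_percent(step: int) -> float:
    """Percentage by which the binary meaning 2^(10*step)
    exceeds the decimal meaning 10^(3*step)."""
    binary = 2 ** (10 * step)
    decimal = 10 ** (3 * step)
    return (binary / decimal - 1) * 100


# giga is the third step up from the base unit, yotta the eighth.
for step, name in enumerate(PREFIXES, start=3):
    print(f"{name:>5}: {gap_percent(step):.0f}%")
```

Each step multiplies the ratio by another 1024/1000, so the error compounds by roughly 2.4% per prefix.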

Speaking of impossibly large numbers, if you’re like most people reading this article, then you probably arrived here through Google. Google is a tragically but forever misspelled version of Googol:

googol is 10^100, i.e. a 1 followed by 100 zeros. In official SI prefix terms, a googol is approximately a yotta squared, squared. Even larger is the googolplex, which is equal to 10 to the power of a googol (10^googol); this number is about the same size as the number of possible games of chess. Even larger numbers have been defined, such as Skewes’ number, Graham’s number, and the Moser, which I won’t even try to describe.

But I digress. When we use gigabyte to mean 2^30 bytes, that’s an inaccurate and informal usage. Instead, we’re supposed to be using the more accurate and disambiguated IEC prefixes. They were introduced in 1998 and formalized with IEEE 1541 in 2002.

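| Name | Symbol | Value |
|------|--------|-------|
| kibibyte | KiB | 2^10 |
| mebibyte | MiB | 2^20 |
| gibibyte | GiB | 2^30 |
| tebibyte | TiB | 2^40 |
| pebibyte | PiB | 2^50 |
| exbibyte | EiB | 2^60 |
| zebibyte | ZiB | 2^70 |
| yobibyte | YiB | 2^80 |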
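For what it's worth, here's roughly what using both sets of prefixes side by side might look like in code. This is a sketch, not any particular library's API: `format_bytes`, its `binary` flag, and the two-decimal output format are all assumptions of mine.

```python
SI_UNITS = ["B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]
IEC_UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB", "YiB"]


def format_bytes(count: int, binary: bool = True) -> str:
    """Render a byte count with IEC (base 1024) or SI (base 1000) units."""
    base, units = (1024, IEC_UNITS) if binary else (1000, SI_UNITS)
    value = float(count)
    index = 0
    # Step up one unit at a time until the value fits, or units run out.
    while value >= base and index < len(units) - 1:
        value /= base
        index += 1
    return f"{value:.2f} {units[index]}"


drive = 500 * 10**9  # the "500 Gigabyte" drive from earlier
print(format_bytes(drive, binary=False))  # 500.00 GB
print(format_bytes(drive, binary=True))   # 465.66 GiB
```

Same drive, same number of bytes, two honest labels. The ambiguity only shows up when the binary number gets printed with the decimal unit.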

You occasionally see these more correct prefixes used in software, but adoption has been slow at best. There are several problems:

  1. They sound ridiculous. I hear the metric system used more often in the United States than I hear the words “kibibyte” or “mebibyte” uttered by anyone with a straight face. Which is to say, never.
  2. Hard drive manufacturers won’t use them. Drive manufacturers don’t care about being correct. What they do care about is consumers buying their drives because they have the largest possible number plastered on the front of the box. If a big lawsuit wasn’t enough to get them to mend their ways, I seriously doubt that the recommendation of an international standards body is going to sway them.
  3. Tradition rules. It’s hard to give up on the rich binary history of kilobytes, megabytes, and gigabytes, particularly when the alternatives are so questionable.

It’s good to keep in mind the discrepancy between the decimal and binary meanings of the SI prefixes. The difference can bite you if you’re not careful. But I think we’re stuck with contextual, dual-use meanings of the SI prefixes for the foreseeable future. Or perhaps we’re all overthinking this, as Alan Green notes:

Whenever I try to discuss [this] with my friends, they say, “Yotta getta life.”

