When is a kilobyte a kibibyte? And an MB an MiB?
Quite often is the short answer. But what are kibibytes,
and indeed mebibytes, gibibytes, tebibytes, pebibytes and exbibytes?
The answers are all in IEC 60027-2, developed by TC
25 (Quantities, units and their letter symbols), published in
November 2000 and now gradually being adopted in the IT world. Essential
details of the new units, their derivations, symbols and approximate
relationship to commonly, if sometimes incorrectly, used metric
equivalents in the Système International (SI), are shown
in the accompanying table.
How do these new standardized units differ from those that have
become so familiar during the last two or three decades’ explosion
in personal computing? And does it really matter? After all, most
people probably think they know quite as much as they need to about
kilobytes or megabytes when they start running out of memory, resources
or hard-disk capacity on their PC. Or when the numbers, time and
therefore cost don’t seem to add up when downloading files
over a modem via their Internet Service Provider (ISP).
The fact is that, while it may not have mattered much to the average
PC user until the last few years, a kilobyte is not necessarily
the 10³ or 1 000 bytes that its SI prefix ‘kilo’
would seem to indicate. SI is a decimal (base ten) system, but computers
essentially only recognize whether an electrical signal is on or
off, represented by a 1 or a 0. Mathematically speaking, they are
binary (base two) systems. When it comes to scientists and engineers
in the IT and telecommunications industries, such sources of confusion
and potential incompatibility certainly do matter, and increasingly
so as the numbers that computers crunch get ever bigger.
The second edition of IEC 60027-2 (Letter symbols to be used in
electrical technology – Part 2: Telecommunications and electronics,
to give it its full title) was developed specifically to meet industry’s
expressed needs in data processing and data transmission. It eliminates
confusion by setting out the prefixes and symbols for the binary,
as opposed to decimal, multiples that most often apply in these
fields.
Bits and bytes
A ‘bit’ is a binary digit and a ‘byte’
is a group of bits, usually eight (hence, incidentally, the
French ‘octet’ for a byte). Years ago, at a time
when entire computer capacities barely matched the few tens
of kilobytes represented by this single page of web text,
computer engineers noticed that the binary 2¹⁰
(1 024) was very nearly equal to the decimal 10³
(1 000) and, purely as a matter of convenience, they
began referring to 1 024 bytes as a kilobyte. It was,
after all, only a 2,4 percent difference and all the professionals
generally knew what they were talking about among themselves.
Despite its inaccuracy and the inappropriate use of the decimal
SI prefix, the term was also easy for salesmen and shops to
use, and it caught on with the public. Take, for example,
the ubiquitous and so-called 3,5 inch floppy disk, which is
said to have a capacity of 1,44 megabytes (MB). This is wrong
on at least three counts: first, the word floppy no longer
really applies as it did to the 5,25 inch predecessor; secondly,
the physical size is 90 mm, not 3,5 inches; but more significantly,
the capacity, originally described as 1 440 kilobytes
before being “translated” to 1,44 MB, is in fact
a little over two percent inaccurate because of the double
misuse of a decimal prefix.
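The floppy-disk arithmetic can be checked in a few lines. The following is a minimal sketch in Python, not taken from the standard; the constant names are illustrative only, and it assumes the 1 440 "kilobytes" are units of 1 024 bytes, as described above.

```python
KIBIBYTE = 2 ** 10   # 1 024 bytes
MEBIBYTE = 2 ** 20   # 1 048 576 bytes
MEGABYTE = 10 ** 6   # 1 000 000 bytes

# Assumption: the disk holds 1 440 units of 1 024 bytes each.
capacity_bytes = 1440 * KIBIBYTE

print(capacity_bytes)                             # 1474560
# The marketing figure mixes both conventions: 1 000 x 1 024 bytes per "MB".
print(capacity_bytes / (1000 * KIBIBYTE))         # 1.44
print(round(capacity_bytes / MEGABYTE, 3))        # 1.475 (decimal megabytes)
print(round(capacity_bytes / MEBIBYTE, 3))        # 1.406 (mebibytes)
```

Neither the strictly decimal nor the strictly binary reading gives 1,44; the familiar figure only appears when the two conventions are mixed.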
As time has passed, kilobytes have grown into megabytes and megabytes
into gigabytes. Within a few years, ordinary PC or laptop data storage
could well be measured in terabytes and very large industrial or
scientific systems in peta- or even exabytes. The problem is that,
even at the SI tera-scale (10¹²), the discrepancy with
the binary equivalent (2⁴⁰) is not the 2,4 percent at
kilo-scale but rather approaching 10 percent. At exa-scale (10¹⁸
and 2⁶⁰), it is over 15 percent. The niceties of mathematics
dictate that the bigger the number of bytes, the bigger the differential,
so the inaccuracies – for engineers, marketing staff and public
alike – are set to grow more and more significant. This is
one good reason for the IEC to have standardized prefixes for binary
multiples.
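How quickly the gap widens can be computed directly from the powers involved. The following is a minimal Python sketch, not part of IEC 60027-2; the names PREFIX_PAIRS and discrepancy_percent are illustrative only.

```python
# SI prefix name paired with its IEC binary counterpart, kilo/kibi to exa/exbi.
PREFIX_PAIRS = [
    ("kilo", "kibi"),
    ("mega", "mebi"),
    ("giga", "gibi"),
    ("tera", "tebi"),
    ("peta", "pebi"),
    ("exa", "exbi"),
]


def discrepancy_percent(n: int) -> float:
    """Percentage by which (2**10)**n exceeds (10**3)**n."""
    binary = (2 ** 10) ** n
    decimal = (10 ** 3) ** n
    return (binary - decimal) / decimal * 100


# Print one line per prefix pair, from the first power to the sixth.
for n, (si_name, iec_name) in enumerate(PREFIX_PAIRS, start=1):
    print(f"{iec_name} vs {si_name}: {discrepancy_percent(n):.1f} percent")
```

Each step up the prefix ladder multiplies the ratio by a further 1,024, which is why the percentage keeps climbing.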
The other primary reason is that different parts of the IT industry
had started to confuse themselves. In the computing world, for example,
the major disk-drive manufacturers tend to mean what they say in
kilobytes, megabytes, gigabytes and so on of storage, i.e. precisely
1 000, 1 000 000 and 1 000 000 000
bytes respectively, according to the decimal prefix. Memory, on
the other hand, is described using the decimal prefix but actually
supplied in binary quantities, so 512 MB of RAM bought on the high
street generally means 536 870 912 bytes and, as shown
in the table, should more properly be described as 512 MiB (mebibytes)
or 537 MB.
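The RAM example can be reproduced with a small formatting helper. This is a hedged sketch in Python: format_bytes and the two symbol lists are hypothetical names introduced here for illustration, not an API defined by the standard.

```python
IEC_SYMBOLS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]
SI_SYMBOLS = ["B", "kB", "MB", "GB", "TB", "PB", "EB"]


def format_bytes(count: int, binary: bool) -> str:
    """Format a byte count with binary (IEC) or decimal (SI) prefixes."""
    base, symbols = (1024, IEC_SYMBOLS) if binary else (1000, SI_SYMBOLS)
    value = float(count)
    index = 0
    # Step up one prefix at a time while a larger unit is still available.
    while value >= base and index < len(symbols) - 1:
        value /= base
        index += 1
    return f"{value:.1f} {symbols[index]}"


ram_bytes = 512 * 2 ** 20
print(ram_bytes)                              # 536870912
print(format_bytes(ram_bytes, binary=True))   # 512.0 MiB
print(format_bytes(ram_bytes, binary=False))  # 536.9 MB
```

The same byte count yields two different numbers, and only the unit symbol tells the reader which convention is in use.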
To make matters worse, there has traditionally been inconsistency
among operating systems and system applications as to how they actually
treat the prefixes, leading to apparent anomalies and incompatibilities.
Similar confusions have arisen between the computing and the telecommunications
sectors of the IT world, where data transmission rates have grown
enormously over the past few years. Network designers have generally
used megabits per second (Mb/s) to mean 1 048 576 bits
per second, while telecommunications engineers have traditionally
used the same term to mean a million bits per second. Even the usually
stated bandwidth of a PCI bus, 133,3 MB/s based on it being four
bytes wide and running at 33,3 MHz, is inaccurate because the M
in MHz means a million while the M in MB means 1 048 576.
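The PCI figure can be worked through as follows. This is a minimal Python sketch under one stated assumption: the clock quoted as 33,3 MHz is taken to be exactly 100/3 MHz, which is what makes the decimal result come out at 133,3.

```python
BUS_WIDTH_BYTES = 4            # four bytes transferred per clock cycle
CLOCK_HZ = 100_000_000 / 3     # assumption: 33,3 MHz taken as exactly 100/3 MHz

bytes_per_second = BUS_WIDTH_BYTES * CLOCK_HZ

# With M meaning one million (as in MHz), the familiar figure appears.
print(round(bytes_per_second / 10 ** 6, 1))   # 133.3
# With M meaning 1 048 576, the same rate is a smaller number.
print(round(bytes_per_second / 2 ** 20, 1))   # 127.2
```

On this reading, the quoted 133,3 is correct only as a decimal figure; expressed in binary multiples the same bandwidth is about 127,2 MiB/s.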
As noted above, mathematics dictates that the disparities resulting
from mixed and incorrect use of decimal prefixes will become increasingly
significant as capacities and data rates continue to grow. In IEC
60027-2, all branches of the IT industry now have a tool with which
to iron out inconsistency and achieve mathematical clarity as never
before.
Binary multiples:
The new prefixes and symbols for binary multiples
standardized in IEC 60027-2 are not part of the SI metric system
of units. But as the table below shows, they are related to the
SI prefixes and symbols for positive powers of ten in a simple way
so that they are easy to remember and use.
| Factor | Name | Symbol | Origin             | SI derivation |
|--------|------|--------|--------------------|---------------|
| 2¹⁰    | kibi | Ki     | kilobinary: (2¹⁰)¹ | kilo: (10³)¹  |
| 2²⁰    | mebi | Mi     | megabinary: (2¹⁰)² | mega: (10³)²  |
| 2³⁰    | gibi | Gi     | gigabinary: (2¹⁰)³ | giga: (10³)³  |
| 2⁴⁰    | tebi | Ti     | terabinary: (2¹⁰)⁴ | tera: (10³)⁴  |
| 2⁵⁰    | pebi | Pi     | petabinary: (2¹⁰)⁵ | peta: (10³)⁵  |
| 2⁶⁰    | exbi | Ei     | exabinary: (2¹⁰)⁶  | exa: (10³)⁶   |
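The regular structure of the table lends itself to a simple lookup. The sketch below, in Python, is one possible way to turn a quantity with either kind of prefix into a byte count; to_bytes and the two dictionaries are hypothetical names chosen for illustration and are not defined by IEC 60027-2.

```python
# Binary prefix symbols from the table: symbol -> power of 2**10.
BINARY_PREFIXES = {"Ki": 1, "Mi": 2, "Gi": 3, "Ti": 4, "Pi": 5, "Ei": 6}
# Corresponding SI prefix symbols: symbol -> power of 10**3.
SI_PREFIXES = {"k": 1, "M": 2, "G": 3, "T": 4, "P": 5, "E": 6}


def to_bytes(value: int, unit: str) -> int:
    """Convert a quantity such as (512, "MiB") or (512, "MB") to bytes."""
    if not unit.endswith("B"):
        raise ValueError(f"not a byte unit: {unit!r}")
    prefix = unit[:-1]
    if prefix == "":
        return value
    if prefix in BINARY_PREFIXES:
        return value * (2 ** 10) ** BINARY_PREFIXES[prefix]
    if prefix in SI_PREFIXES:
        return value * (10 ** 3) ** SI_PREFIXES[prefix]
    raise ValueError(f"unknown prefix: {prefix!r}")


print(to_bytes(512, "MiB"))   # 536870912
print(to_bytes(512, "MB"))    # 512000000
```

Keeping the two prefix families in separate tables means a unit string is never silently interpreted in the wrong base.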
The standard:
IEC
60027-2 Ed. 2.0 (2000-11)
Letter symbols to be used in electrical technology - Part 2: Telecommunications
and electronics
ICS code: 01.060 - TC 25 - 66 pages - CHF 112,00