How Do You Know How Many Bits Characters Need
Data Structures And Number Systems
© Copyright Brian Brown, 1984-1999. All rights reserved.
Part 1
Reference Books:
- Program Design : P Juliff
- IBM Microcomputer Assembly Language : J Godfrey
- Programmers Craft : R Weiland
- Data Storage in a computer : CIT
- Microcomputer Software Design : S Campbell
DATA STRUCTURES
Just as learning to design programs is important, so is the understanding of the correct format and usage of data. All programs use some form of data. To design programs which work correctly, a good understanding of how data is structured will be required.
This module introduces you to the various forms of data used by programs. We shall investigate how the data is stored, accessed and its typical usage within programs.
A computer stores information in binary format. Binary is a number system which uses BITS to store data.
BITS
A bit is the smallest element of information used by a computer. A bit holds ONE of TWO possible values, 0 (OFF) or 1 (ON).
A bit which is OFF is also considered to be FALSE or NOT SET; a bit which is ON is also considered to be TRUE or SET.
Because a single bit can only store two values, bits are combined together into larger units in order to hold a greater range of values.
NIBBLE
A nibble is a group of FOUR bits. This gives a maximum number of 16 possible different values.
    2 ** 4 = 16 (2 to the power of the number of bits)
It is useful, when dealing with groups of bits, to determine which bit of the group has the least value, and which bit has the most or greatest value.
The Least Significant Bit AND The Most Significant Bit
The least significant bit is deemed to be bit 0, and is always drawn at the extreme right; it is the bit with the least value. The most significant bit is always shown on the extreme left, and is the bit with the greatest value.
The diagram below shows a NIBBLE, and each bit's position and decimal weight value (for more information, consult the module on Number Systems).
      3   2   1   0     Bit Number
    +---+---+---+---+
    |   |   |   |   |
    +---+---+---+---+
      8   4   2   1     Decimal Weighting Value
     MSB         LSB
Let's consider an example of converting binary values into decimal.
    Bit      3   2   1   0
    Value    1   0   1   1
    Bit 3 is set, so it has a decimal weight value of 8
    Bit 2 is not set, so it has a decimal weight value of 0
    Bit 1 is set, so it has a decimal weight value of 2
    Bit 0 is set, so it has a decimal weight value of 1
    Adding up all the decimal weight values for each bit = 11
So 1011 in binary is 11 in decimal! For more examples, consult the module on Number Systems.
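The weighting procedure above can be sketched in a few lines of Python. The function name binary_to_decimal is an illustrative choice, not something defined in this module.

```python
def binary_to_decimal(bits):
    """Add up the decimal weight of every bit that is set."""
    total = 0
    # Bit 0 is the rightmost character, so walk the string in reverse.
    for position, bit in enumerate(reversed(bits)):
        if bit == "1":
            total += 2 ** position  # decimal weight of this bit position
    return total


print(binary_to_decimal("1011"))  # 8 + 0 + 2 + 1
```

Each set bit contributes 2 raised to its bit number, which is exactly the decimal weighting value shown in the nibble diagram.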
BYTES
Bytes are a group of 8 bits. This comprises TWO nibbles, as shown below.
      7   6   5   4   3   2   1   0     Bit Number
    +---+---+---+---+---+---+---+---+
    |   |   |   |   |   |   |   |   |
    +---+---+---+---+---+---+---+---+
    128  64  32  16   8   4   2   1     Decimal Weighting Value
     MSB                         LSB
Bytes are often used to store CHARACTERS. They can also be used to store numeric values,
    0 to 255 (unsigned)
    -128 to +127 (signed, two's complement)
Binary Coded Decimal [BCD]
Binary coded decimal digits (0-9) are represented using FOUR bits. The valid combinations of bits and their respective values are
| Binary Value | Digit |
| 0000 | 0 |
| 0001 | 1 |
| 0010 | 2 |
| 0011 | 3 |
| 0100 | 4 |
| 0101 | 5 |
| 0110 | 6 |
| 0111 | 7 |
| 1000 | 8 |
| 1001 | 9 |
The binary combinations 1010 to 1111 are invalid and are not used.
If the computer stores one BCD digit per byte, it's called normal BCD. The unused nibble may be either all 0's or all 1's.
If two BCD digits are stored per byte, it's called Packed BCD. This occurs in data transmission where numbers are being transmitted over a communications link. Packed BCD reduces the amount of time spent transmitting the numbers, as each data byte transmitted results in the sending of two BCD digits.
Consider the storing of the digits 56 in Packed BCD format.
      7   6   5   4   3   2   1   0     Bit Number
    +---+---+---+---+---+---+---+---+
    | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 0 |
    +---+---+---+---+---+---+---+---+
     MSB                         LSB
The UPPER nibble holds the value 5, whilst the LOWER nibble holds the value 6.
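As a minimal sketch of Packed BCD, the Python below places one digit in each nibble of a byte and takes them apart again. The names pack_bcd and unpack_bcd are hypothetical helpers introduced only for this illustration.

```python
def pack_bcd(upper_digit, lower_digit):
    """Store two decimal digits in one byte, one digit per nibble."""
    for digit in (upper_digit, lower_digit):
        if not 0 <= digit <= 9:
            raise ValueError("BCD digits must be in the range 0 to 9")
    # The first digit goes in the upper nibble, the second in the lower.
    return (upper_digit << 4) | lower_digit


def unpack_bcd(byte):
    """Recover the two digits from a packed BCD byte."""
    return (byte >> 4) & 0x0F, byte & 0x0F


packed = pack_bcd(5, 6)
print(format(packed, "08b"))  # bit pattern of the packed byte
print(unpack_bcd(packed))
```

Shifting by four bits moves a digit into the upper nibble; masking with 0F hexadecimal isolates a single nibble.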
Status And Boolean Variables
BOOLEAN variables use a single bit to hold their value, so can only assume one of two possible states. This is either 0 (considered to be FALSE), or 1 (considered to be TRUE).
The computer handles each boolean variable as a single bit. If the bit is TRUE, then it has a value of 1. If the bit is FALSE, then it has the value 0.
When a group of bits are grouped together to form a limited range of values, this is known as a STATUS variable.
Consider the example in a program where we need to keep track of the number of minutes a phone line is busy for (within the limited range of 0 to 60). This does not require the use of a full integer, so some programming languages allow you to specify the number of bits allocated to variables with limited ranges.
The advantage of this approach is that the storage space of status variables can be combined together into a single 16 or 32 bits, resulting in a saving of space.
Consider where a computer allocates 16 bits of storage per status variable. If we had 3 status variables, the space consumed would be 48 bits. BUT, if all the status variables could be combined and fitted into a single 16 bits of storage, we could save 32 bits of memory. This is very important in real-time systems where memory space is at a premium.
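The saving described above can be sketched with shifts and masks. The field layout used here (two flags in bits 0 and 1, a six-bit minutes field in bits 2 to 7, and a further flag in bit 8) is an assumption made purely for illustration; six bits are chosen because they cover 0 to 63, enough for the 0 to 60 minute range.

```python
# Hypothetical layout of one 16-bit storage word (an assumption for this sketch).
TOLL_BIT = 0          # 1 = toll call, 0 = local call
BUSY_BIT = 1          # 1 = extension busy, 0 = free
MINUTES_SHIFT = 2     # minutes occupy bits 2 to 7
MINUTES_MASK = 0x3F   # six bits hold values 0 to 63
DIVERTED_BIT = 8      # 1 = extension diverted


def pack_status(toll, busy, minutes, diverted):
    """Combine several small variables into a single 16-bit value."""
    if not 0 <= minutes <= 60:
        raise ValueError("minutes must be in the range 0 to 60")
    return ((int(toll) << TOLL_BIT)
            | (int(busy) << BUSY_BIT)
            | (minutes << MINUTES_SHIFT)
            | (int(diverted) << DIVERTED_BIT))


def unpack_status(word):
    """Split the 16-bit value back into its separate variables."""
    toll = bool((word >> TOLL_BIT) & 1)
    busy = bool((word >> BUSY_BIT) & 1)
    minutes = (word >> MINUTES_SHIFT) & MINUTES_MASK
    diverted = bool((word >> DIVERTED_BIT) & 1)
    return toll, busy, minutes, diverted


word = pack_status(False, True, 45, True)
print(format(word, "016b"))
print(unpack_status(word))
```

All four variables fit in nine bits of one 16-bit word, where separate 16-bit variables would have consumed 64 bits.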
Consider the following diagram, which illustrates the packing of boolean and status variables together into a single byte.
      7   6   5   4   3   2   1   0     Bit Number
    +---+---+---+---+---+---+---+---+
    | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 0 |
    +---+---+---+---+---+---+---+---+
              |   |           |   |
              |   |           |   +--- local call/Toll call (bit 0)
              |   |           +------- extension busy/free (bit 1)
              |   +------------------- minutes (bits 2-4)
              +----------------------- extension diverted/TRUE/FALSE
The American Standard Code for Information Interchange
ASCII is a computer code which uses 128 different encoding combinations of a group of seven bits (2^7 = 128) to represent,
- characters A to Z, both upper and lower case
- special characters, < . ? : etc
- numbers 0 to 9
- special control codes used for device control
Let's now look at the encoding method. The table below shows the bit combinations required for each character. The code for a character is the row value plus the column value, in hexadecimal.
|    | 00 | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 0A | 0B | 0C | 0D | 0E | 0F |
| 00 | NUL | SOH | STX | ETX | EOT | ENQ | ACK | BEL | BS | TAB | LF | VT | FF | CR | SO | SI |
| 10 | DLE | DC1 | DC2 | DC3 | DC4 | NAK | SYN | ETB | CAN | EM | SUB | ESC | FS | GS | RS | US |
| 20 | SP | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / |
| 30 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
| 40 | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
| 50 | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ^ | _ |
| 60 | ` | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o |
| 70 | p | q | r | s | t | u | v | w | x | y | z | { | \| | } | ~ | DEL |
A computer usually stores data in 8 bits. The eighth bit is unused in ASCII, thus is normally set to 0. Some systems may use the eighth bit to implement graphics or different language symbols, e.g., Greek characters.
Control codes are used in communications and printers. They may be generated from an ASCII keyboard by holding down the CTRL (control) key and pressing another key (A to Z, plus [, \, ], ^, _).
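On traditional ASCII keyboards and terminals, the CTRL key is usually described as keeping only the low five bits of the code of the key pressed with it. Assuming that convention, the control code for a key can be sketched as follows; control_code is a hypothetical name used only here.

```python
def control_code(key):
    """Return the ASCII control code for CTRL + key (low five bits kept)."""
    # Upper and lower case letters give the same result, as the mask
    # discards the bits that distinguish them.
    return ord(key.upper()) & 0x1F


for key in ("G", "M", "["):
    print(key, format(control_code(key), "02X"))
```

Under this assumption CTRL with G gives 07 (BEL), CTRL with M gives 0D (CR) and CTRL with [ gives 1B (ESC), matching the first two rows of the table.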
Example
Code the text string 'Hello.' in ASCII using hexadecimal digits.
    H = 48
    e = 65
    l = 6C
    l = 6C
    o = 6F
    . = 2E
thus the string is represented by the byte sequence
    48 65 6C 6C 6F 2E
CHARACTERS
Characters are non-numeric symbols used to convey language and meaning. In English, they are combined with other characters to form words. Examples of characters are;
    a b c d e f g h i j k l m n o p q r s t u v w x y z
    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
    0 1 2 3 4 5 6 7 8 9
    ! @ # $ % ^ & * ( ) _ - = + | \ ` , . / ; ' [ ] { } : " < > ?
A computer system normally stores characters using the ASCII code. Each character is stored using eight bits of information, giving a total number of 256 different characters (2**8 = 256).
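To see the stored codes for a piece of text, the short Python sketch below encodes a string as ASCII and prints each byte in hexadecimal. The helper name ascii_hex is an illustrative choice.

```python
def ascii_hex(text):
    """Return the ASCII code of each character as two hexadecimal digits."""
    # encode() fails for any character outside the 7-bit ASCII set.
    return " ".join(format(byte, "02X") for byte in text.encode("ascii"))


print(ascii_hex("Hello."))
```

Running it on 'Hello.' reproduces the byte sequence 48 65 6C 6C 6F 2E.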
In the high level language Pascal, characters are defined and used as follows,
    var   plus_symbol : char;
    begin
          plus_symbol := '+';
    end.
Variables used in a Pascal program are declared after the keyword var. The above example declares the variable plus_symbol to be a character type, thus eight bits of memory storage are allocated to store its value (as yet undetermined).
Within the main body of the program, after the keyword begin, the statement shown assigns the symbol + to the character variable plus_symbol. This is equivalent to storing the ASCII value 2B hexadecimal in the eight bits of memory allocated to the variable plus_symbol.
TEXT STRINGS
Text strings are a sequence of characters (i.e., words or multi-character symbols). Each character is stored one after the other, each occupying eight bits of memory storage.
The text string Hello would be stored as follows
    +------+
    |  48  |
    +------+
    |  65  |
    +------+
    |  6C  |
    +------+
    |  6C  |
    +------+
    |  6F  |
    +------+
In Turbo Pascal, text strings are defined and used as follows,
    var   text_message : string[23];
    begin
          text_message := 'Welcome to text strings';
    end.
The above example declares the variable text_message to be a string type of up to 23 characters long (but no more!). Eight bits of memory storage are allocated to store each character in the string (a total of 23 bytes, plus one further byte in which Turbo Pascal records the current length), with the value in each byte as yet undetermined.
Within the main body of the program, after the keyword begin, the statement shown assigns the message Welcome to text strings to the string variable text_message. This stores the ASCII value of each character into each successive byte of memory allocated to the variable text_message.
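The declared capacity of a string matters, because the message 'Welcome to text strings' is 23 characters long. The Python sketch below stores a message one byte per character in a fixed capacity and, as an assumption modelled on Turbo Pascal's behaviour, silently drops any characters that do not fit. The name store_string is hypothetical.

```python
def store_string(text, capacity):
    """Return the bytes kept when text is stored in a fixed-capacity string."""
    data = text.encode("ascii")   # one byte per character
    return data[:capacity]        # assumed: excess characters are dropped


message = "Welcome to text strings"
print(len(message))                            # characters in the message
print(store_string(message, 22))               # one character is lost
print(list(store_string("Hello", 22)))         # decimal codes of each byte
```

With a capacity of 22 the final s is lost, whereas a capacity of 23 holds the whole message.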
INTEGERS
Numeric information cannot efficiently be stored using the ASCII format. Imagine storing the number 123,769 using ASCII. This would consume 6 bytes, and it would be difficult to tell if the number was positive or negative (though we could precede it with the character + or -).
A more efficient way of storing numeric data is to use a different encoding scheme. The encoding scheme most commonly used is shown below,
     Bits  15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
          +--+--------------------------------------------+
          | S|          Numeric Value in Binary           |
          +--+--------------------------------------------+
          S = Sign Bit
Integers store whole numbers only! They do not contain fractional parts. Consider the examples below,
    Valid      Invalid
    123        .987
    0          0.0
    278903     123.09
The sign bit (which is bit 15) indicates whether the number is positive or negative. A logic 1 indicates negative, a logic 0 indicates positive.
The number is converted to binary and stored in bits 0 to 14 of the two bytes.
Example
Store the value +263 as an integer value.
    1) The sign bit is 0.
    2) The decimal value +263 is 100000111 in binary.
Thus the storage in memory of the integer +263 looks like,
     Bits  15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
          +--+--------------------------------------------+
          | 0| 0  0  0  0  0  0  1  0  0  0  0  0  1  1  1|
          +--+--------------------------------------------+
When storing negative numbers, the number is stored using the two's complement format.
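A minimal sketch of the 16-bit format, including two's complement for negative values, is given below. The function names to_int16_bits and from_int16_bits are illustrative only.

```python
def to_int16_bits(value):
    """Return the 16-bit two's complement pattern of value as a string."""
    if not -32768 <= value <= 32767:
        raise ValueError("value does not fit in 16 bits")
    # Masking with FFFF hexadecimal yields the two's complement pattern
    # for negative values and leaves positive values unchanged.
    return format(value & 0xFFFF, "016b")


def from_int16_bits(bits):
    """Convert a 16-bit pattern back into a signed integer."""
    raw = int(bits, 2)
    # If the sign bit (bit 15) is set, the pattern represents raw - 65536.
    return raw - 0x10000 if raw & 0x8000 else raw


print(to_int16_bits(263))
print(to_int16_bits(-263))
print(from_int16_bits(to_int16_bits(-263)))
```

For +263 the pattern matches the diagram above; for -263 every bit is inverted and one is added, which sets the sign bit to 1.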
In Pascal, integers are defined and used as follows,
    var   whole_number : integer;
    begin
          whole_number := 1267;
    end.
The example declares the variable whole_number to be an integer type, thus sixteen bits of memory storage are allocated to store its value (as yet undetermined).
Within the main body of the program, after the keyword begin, the statement shown assigns the numeric value 1267 to the integer variable whole_number. This is equivalent to storing the bit combination 0000010011110011 in the sixteen bits of memory allocated to the variable whole_number.
Signed integers using 16 bits have a number range of,
    -32768 to +32767 (-2^15 to +2^15 - 1)
To store larger integer values would require more bits. Some systems and languages also support the use of unsigned integers, which are deemed to be positive only.
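The range available for a given number of bits follows directly from the powers of two. The sketch below assumes two's complement for signed values; signed_range and unsigned_range are hypothetical helper names.

```python
def signed_range(bits):
    """Smallest and largest two's complement values for a bit width."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def unsigned_range(bits):
    """Smallest and largest unsigned values for a bit width."""
    return 0, 2 ** bits - 1


for width in (8, 16, 32):
    print(width, signed_range(width), unsigned_range(width))
```

Each extra bit doubles the number of values that can be stored, which is why larger integers require more bits.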
FLOATING POINT NUMBERS
There are two problems with integers; they cannot express fractions, and the range of the number is limited by the number of bits used. An efficient way of storing fractions is called the floating point method, which involves splitting the number into two parts, an exponent and a mantissa.
The exponent represents the power to which two is raised.
The mantissa represents a fractional value between 0 and 1.
Consider the number
    12.50
The number is first converted into the format
    2^n * 0.xxxxxx
where n represents the exponent and 0.xxxxxx is the mantissa.
The computer industry agreed upon a standard for the storage of floating point numbers. It is called the IEEE 754 standard, and uses 32 bits of memory (for single precision), or 64 bits (for double precision accuracy). The single precision format looks like,
     Bits  31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
          +--+-----------------------+--------------------------------------------------------------------+
          | S|    Exponent value     |                           Mantissa value                           |
          +--+-----------------------+--------------------------------------------------------------------+
          S = Sign Bit
The sign bit is 1 for a negative mantissa, and 0 for a positive mantissa.
The exponent uses a bias of 127.
The mantissa is stored as a binary value using an encoding technique.
Working out the FP bit patterns
The number we have is
    12.5
which expressed as a fraction to the power of two is,
    12.5   / 2 = 6.25
    6.25   / 2 = 3.125
    3.125  / 2 = 1.5625
    1.5625 / 2 = 0.78125
NOTE: Continue dividing by 2 till a fraction between 0 and 1 results. The fraction is the mantissa value, the number of divisions is the exponent value.
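The repeated division can be sketched in Python as below. The function split_fraction_exponent is a hypothetical helper that handles only positive values of 1 or more, as in the worked example; the standard library function math.frexp performs the same split in general.

```python
import math


def split_fraction_exponent(value):
    """Divide by 2 until a fraction below 1 results, counting the divisions."""
    fraction = value
    exponent = 0
    while fraction >= 1.0:
        fraction /= 2
        exponent += 1   # one more division by two
    return fraction, exponent


print(split_fraction_exponent(12.5))
print(math.frexp(12.5))   # library equivalent, for comparison
```

Both calls report the same fraction and exponent for 12.5, so the hand calculation can be checked against the library.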
thus our values now are,
    0.78125 * 2^4
The exponent bit pattern is stored using an excess of 127. This means that this value is added to the exponent when storing (and subtracted when removing).
The exponent bit pattern to store is,
    4 + 127 = 131 = '10000011'
As the mantissa is a positive value, the sign bit is 0.
The mantissa is a little more complicated to work out. Each bit represents two to the power of a negative number. It looks like,
    1st bit of mantissa = 0.5
    2nd                 = 0.25
    3rd                 = 0.125
    4th                 = 0.0625
    5th                 = 0.03125 etc
The mantissa value we have is 0.78125, which in binary is
    11001000000000000000000 (0.5 + 0.25 + 0.03125)
However, to make matters even more complicated, the mantissa is normalized, by moving the bit pattern to the left (each shift subtracts one from the exponent value) till the first 1 drops off.
The resulting pattern is then stored.
The mantissa now becomes
    10010000000000000000000
and the exponent is adjusted to become
    131 - 1 = 130 = '10000010'
The final assembled format is,
     Bits  31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
          +--+-----------------------+--------------------------------------------------------------------+
          | 0| 1  0  0  0  0  0  1  0| 1  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0|
          +--+-----------------------+--------------------------------------------------------------------+
            S        Exponent                                   Mantissa
Now let's convert the following storage format back into a decimal number.
     Bits  31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
          +--+-----------------------+--------------------------------------------------------------------+
          | 1| 1  0  0  0  0  0  1  1| 1  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0|
          +--+-----------------------+--------------------------------------------------------------------+
            S        Exponent                                   Mantissa
This gives a negative number with an exponent value of,
    131 - 127 = 4
and a mantissa value of,
    1.0 + 0.5 + 0.0625 = 1.5625
(the 1.0 comes from the bit which was shifted off when the mantissa was normalized), thus the number is,
    -1.5625 * 2^4 = -25.0
The numeric range for normalized floating point numbers using the IEEE 754 method, using 32 bits, is
    +- 1.0 * 2^-126 to +- (2 - 2^-23) * 2^127
or, in decimal, approximately
    1.1755 * 10^-38 to 3.4028 * 10^38
In Pascal, floating point numbers are defined and used as follows,
    var   fp_number : real;
    begin
          fp_number := 12.50;
    end.
The example declares the variable fp_number to be a floating-point type. Assuming the compiler stores real values in IEEE 754 single precision, thirty-two bits of memory storage are allocated to store its value (as yet undetermined).
Within the main body of the program, after the keyword begin, the statement shown assigns the numeric value 12.50 to the real variable fp_number. This is equivalent to storing the bit combination 01000001010010000000000000000000 in the thirty-two bits of memory allocated to the variable fp_number.
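The hand-worked bit patterns can be checked with Python's standard struct module, which converts values to and from IEEE 754 single precision. The helper names float_to_bits, bits_to_float and split_fields are assumptions introduced for this sketch.

```python
import struct


def float_to_bits(value):
    """Return the 32-bit single precision pattern of value as a string."""
    (raw,) = struct.unpack(">I", struct.pack(">f", value))
    return format(raw, "032b")


def bits_to_float(bits):
    """Interpret a 32-character bit string as a single precision number."""
    (value,) = struct.unpack(">f", struct.pack(">I", int(bits, 2)))
    return value


def split_fields(bits):
    """Split a pattern into sign (1 bit), exponent (8) and mantissa (23)."""
    return bits[0], bits[1:9], bits[9:]


pattern = float_to_bits(12.5)
print(split_fields(pattern))
print(bits_to_float("1" + "10000011" + "1001" + "0" * 19))
```

The first print shows the sign, exponent and mantissa fields stored for 12.5; the second decodes the pattern with sign bit 1, exponent 10000011 and mantissa 1001 followed by zeros.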
The advantages of storing floating point numbers in this fashion are,
- multiplication is performed by adding the exponents and multiplying the mantissas
- division is performed by subtracting the exponents and dividing the mantissas
- it is easy to compare two numbers to see which is the greater or lesser
- large number ranges are stored using relatively few bits
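A product can be formed from the separate parts of its operands: the exponents are added and the mantissas are multiplied. The sketch below uses math.frexp and math.ldexp from the Python standard library to split and reassemble the values; multiply_via_parts is a hypothetical name, and real hardware works on the stored bit fields rather than on Python floats.

```python
import math


def multiply_via_parts(a, b):
    """Multiply two numbers using their mantissa and exponent parts."""
    mantissa_a, exponent_a = math.frexp(a)
    mantissa_b, exponent_b = math.frexp(b)
    # Mantissas are multiplied, exponents are added, then recombined.
    return math.ldexp(mantissa_a * mantissa_b, exponent_a + exponent_b)


print(multiply_via_parts(12.5, 3.0))
print(12.5 * 3.0)   # direct multiplication, for comparison
```

Division follows the same pattern with the mantissas divided and the exponents subtracted.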
The disadvantages of the storage format are,
- errors are created by moving the mantissa bits (bits shifted out of the mantissa are lost)
- conversion backwards and forwards takes time
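The loss of mantissa bits can be observed by converting a value to 32-bit storage and back. This sketch uses Python's struct module for the conversion; round_to_single is an illustrative helper name.

```python
import struct


def round_to_single(value):
    """Store value in 32-bit single precision and read it back."""
    return struct.unpack(">f", struct.pack(">f", value))[0]


print(round_to_single(12.5) == 12.5)   # 12.5 needs only a few mantissa bits
print(round_to_single(0.1) == 0.1)     # 0.1 has no exact binary fraction
print(round_to_single(0.1) - 0.1)      # size of the stored error
```

Values such as 12.5 survive unchanged, while 0.1 comes back slightly altered because its binary fraction does not fit in 23 mantissa bits.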
Source: https://www6.uniovi.es/datas/data1.htm