1.2.25

ASCII & Unicode

American Standard Code for Information Interchange

The American Standard Code for Information Interchange (ASCII) is one of the most widely used character sets.

7-bit

  • Initially, each character in ASCII was represented by a seven-bit binary code.
  • That means there was a maximum of 128 characters (2^7 = 128).
    • This was enough to include all commonly used letters and symbols in the English language.
  • When each character is represented by seven bits, an 8-bit system can use the extra bit as a check (parity) bit for error detection.
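
As a minimal sketch of how the spare eighth bit could be used for checking, the Python below adds an even-parity bit above a 7-bit code. The function name `add_even_parity` and the choice of even parity are assumptions made for illustration; the notes only say that the extra bit can be used for checking.

```python
def add_even_parity(code: int) -> int:
    """Return an 8-bit value: the 7-bit code with an even-parity bit on top."""
    if not 0 <= code < 2 ** 7:
        raise ValueError("not a 7-bit code")
    ones = bin(code).count("1")   # how many 1s are in the 7-bit code
    parity = ones % 2             # 1 only when the count of 1s is odd
    return (parity << 7) | code   # place the parity bit in the eighth position


print(2 ** 7)                                      # 128 possible 7-bit codes
print(format(add_even_parity(ord("A")), "08b"))    # 01000001
print(format(add_even_parity(ord("C")), "08b"))    # 11000011
```

With even parity, every 8-bit pattern ends up with an even number of 1s, so a single flipped bit can be detected.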

8-bit

  • Extended ASCII uses all 8 bits.
  • The additional bit allows an extra 128 characters to be represented, giving 256 in total.
  • With that extension, most Western languages can use the same character set.
  • Note: In your exam, the representation of ASCII will use 8 bits.

Unicode

Unicode is a character set which was released because of the need to standardise character sets internationally.

Unicode

  • Unicode aims to represent every possible character in the world.
  • The most common encoding of Unicode is UTF-8, which uses between 8 and 32 bits (one to four bytes) to represent each character.
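
To see the variable length of UTF-8 in practice, the following sketch encodes a few characters with Python's built-in `str.encode` and counts the bits used. The helper name `utf8_bits` and the sample characters are chosen for illustration only.

```python
def utf8_bits(ch: str) -> int:
    """Return the number of bits UTF-8 uses to store one character."""
    return len(ch.encode("utf-8")) * 8   # each encoded byte is 8 bits


for ch in ["A", "é", "€", "😀"]:
    print(ch, utf8_bits(ch))
# A 8
# é 16
# € 24
# 😀 32
```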

Compatibility

  • The first 128 characters in Unicode are identical to ASCII, and the first 256 match the Latin-1 extended character set, which makes it backwards compatible with documents encoded using older character sets.
  • Characters may not be recognised or displayed correctly if the computer reading a document uses a different character set from the computer that created it.
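
The sketch below illustrates both points under some assumptions: the word "café" is an arbitrary example, and Latin-1 stands in for "a different character set". Text saved as UTF-8 but read as Latin-1 shows the wrong characters, while plain ASCII letters come out the same either way.

```python
original = "café"
stored = original.encode("utf-8")     # bytes as written by the first computer

right = stored.decode("utf-8")        # read back with the same character set
wrong = stored.decode("latin-1")      # read back with a different character set

print(right)   # café
print(wrong)   # cafÃ©

# ASCII characters have the same byte value in ASCII and in UTF-8.
print("A".encode("ascii") == "A".encode("utf-8"))   # True
```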

Types of characters

  • Unicode represents characters from all major alphabets of the world.
  • Unicode is also used to represent emojis!

Logical Ordering

Both ASCII and Unicode store characters in a logical, numerical order.

ASCII - example

  • Denary
    • 'A' = 65
    • 'B' = 66
    • 'a' = 97
    • 'b' = 98
  • Hex
    • 'A' = 41
    • 'B' = 42
    • 'a' = 61
    • 'b' = 62
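
These values can be checked with Python's built-in `ord` and `chr` functions. The helper `codes` below is a hypothetical name used only for this sketch; it returns the denary code and its hexadecimal form.

```python
def codes(ch: str) -> tuple[int, str]:
    """Return the character code in denary and as a hexadecimal string."""
    code = ord(ch)                   # character -> denary code
    return code, format(code, "X")   # the same number written in hex


for ch in "ABab":
    print(ch, *codes(ch))
# A 65 41
# B 66 42
# a 97 61
# b 98 62

print(chr(66))   # B  (denary code -> character)
```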

Unicode - examples

  • 'A' = U+0041
  • 'B' = U+0042
  • 'a' = U+0061
  • 'b' = U+0062
  • Greek 'Α' (alpha) = U+0391
  • Greek 'Β' (beta) = U+0392

Notes

  • The codes for uppercase letters are different from the codes for lowercase letters.
  • The character code for 'B' is one more than the character code for 'A', and so on.
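
Because the codes run in order, simple arithmetic on them moves through the alphabet. The sketch below assumes plain English letters only; the function names `next_letter` and `to_lower` are illustrative, and in real programs Python's built-in `str.lower` would normally be used instead.

```python
def next_letter(ch: str) -> str:
    """Return the character whose code is one more than ch's code."""
    return chr(ord(ch) + 1)


def to_lower(ch: str) -> str:
    """Convert an uppercase English letter to lowercase using its code."""
    if "A" <= ch <= "Z":
        return chr(ord(ch) + (ord("a") - ord("A")))   # the gap is 32
    return ch


print(next_letter("A"))       # B
print(ord("a") - ord("A"))    # 32
print(to_lower("B"))          # b
```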
