4.1.10
Character Sets
Text data is made up of characters. Character sets allow us to store characters digitally.


Character sets
- Text data is made up of characters.
- Each character is assigned its own character code.
- A character set is a collection of all the characters that a computer recognises, along with their binary codes.
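To make the idea of a character code concrete, here is a minimal Python sketch. It relies on the built-in `ord` and `chr` functions; the helper names `char_to_code` and `code_to_char` are made up for this illustration.

```python
def char_to_code(ch: str) -> int:
    """Return the numeric character code for a single character."""
    return ord(ch)


def code_to_char(code: int) -> str:
    """Return the character that a numeric code stands for."""
    return chr(code)


# Show each character with its code in decimal and as an 8-bit binary string.
for ch in "Az9":
    code = char_to_code(ch)
    print(ch, code, format(code, "08b"))
```

Each character maps to exactly one code, and the code maps back to the same character.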


What's in a character set?
- Character sets include:
- Alphanumeric characters and symbols, e.g. letters, digits, and punctuation.
- Special (control) characters, e.g. new line.
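The categories above can be told apart by looking at the character code. The sketch below assumes ASCII input, where codes below 32 (and code 127) are control characters; the function name `describe` is hypothetical.

```python
def describe(ch: str) -> str:
    """Classify a single ASCII character as control, alphanumeric, or symbol."""
    code = ord(ch)
    if code < 32 or code == 127:
        return "control"       # e.g. new line (code 10) or tab (code 9)
    if ch.isalnum():
        return "alphanumeric"  # letters and digits
    return "symbol"            # punctuation and other printable characters


for ch in ["a", "7", "?", "\n"]:
    print(repr(ch), ord(ch), describe(ch))
```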


Examples of character sets
- There are two main character sets in use:
- American Standard Code for Information Interchange.
- Unicode.
American Standard Code for Information Interchange
The American Standard Code for Information Interchange (ASCII) is one of the oldest and most widely used character sets.


ASCII
- Each character in ASCII is represented by a seven-bit binary code.
- That means there is a maximum of 128 characters.
- ASCII includes all commonly used letters and symbols in the English language.
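The limit of 128 follows from the number of patterns seven bits can hold (2 to the power 7). A small sketch of that arithmetic, with `to_seven_bits` as an illustrative helper name:

```python
BITS = 7
MAX_CHARS = 2 ** BITS  # number of distinct 7-bit patterns


def to_seven_bits(ch: str) -> str:
    """Return the 7-bit binary code for an ASCII character."""
    code = ord(ch)
    if code >= MAX_CHARS:
        raise ValueError(f"{ch!r} is not an ASCII character")
    return format(code, "07b")


print(MAX_CHARS)
for ch in "Hi!":
    print(ch, to_seven_bits(ch))
```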


7-bit letters?
- Each letter is represented by seven bits.
- This is useful because, in an 8-bit system, the spare eighth bit can be used as a parity bit to help detect transmission errors.
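One common use of the spare eighth bit is a parity bit. The sketch below assumes even parity (the total number of 1s in the byte must be even) and places the parity bit at the front; both choices and the function names are assumptions for illustration.

```python
def add_even_parity(ch: str) -> str:
    """Return an 8-bit string: one even-parity bit followed by the 7-bit code."""
    code = ord(ch)
    if code >= 128:
        raise ValueError(f"{ch!r} is not an ASCII character")
    bits = format(code, "07b")
    parity = bits.count("1") % 2  # 1 only if the 7-bit code has an odd number of 1s
    return str(parity) + bits


def check_even_parity(byte: str) -> bool:
    """Return True if the byte contains an even number of 1s."""
    return byte.count("1") % 2 == 0


sent = add_even_parity("C")
print(sent, check_even_parity(sent))
```

If a single bit is flipped in transit, the count of 1s becomes odd and the check fails. A parity bit cannot say which bit changed, and it misses errors that flip two bits.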


Limitations of ASCII
- 128 characters are enough for the English language, but they leave no space for characters from other languages.
- Extended ASCII sets were released which used all eight bits, giving 256 characters, but this was still not enough.
- This led to the release of Unicode.
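These limits can be seen by trying to encode text with Python's built-in codecs. The sketch uses `"latin-1"` as one example of an extended ASCII set (there were several); `can_encode` is a hypothetical helper.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character in text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


samples = ["hello", "caf\u00e9", "\u65e5\u672c\u8a9e"]  # English, French, Japanese
for text in samples:
    for encoding in ("ascii", "latin-1", "utf-8"):
        print(text, encoding, can_encode(text, encoding))
```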
Unicode
Unicode is a character set which was released because of the need to standardise character sets internationally.


Unicode
- Unicode aims to represent every possible character in the world.
- The most common Unicode encoding is UTF-8, which uses between 8 and 32 bits (one to four bytes) to represent each character.
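The variable length of UTF-8 can be checked by encoding single characters and counting the bits. This is a small sketch; `utf8_bits` is an illustrative name.

```python
def utf8_bits(ch: str) -> int:
    """Return how many bits UTF-8 uses to store one character."""
    return len(ch.encode("utf-8")) * 8


# A Latin letter, an accented letter, the euro sign, and an emoji.
for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
    print(ch, utf8_bits(ch))
```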


Compatibility with ASCII
- The first 128 characters in Unicode are identical to ASCII, and the first 256 match the Latin-1 extended ASCII set, which makes it backwards compatible with documents encoded using older character sets.
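Backwards compatibility can be checked directly. The sketch below compares the bytes produced by the ASCII and UTF-8 codecs for plain English text, and compares the first 256 Unicode code points against Latin-1, used here as one example of an extended ASCII set.

```python
text = "Hello"

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")

# Plain ASCII text produces exactly the same bytes under UTF-8.
same_bytes = ascii_bytes == utf8_bytes

# An old ASCII file can therefore be read as UTF-8 without conversion.
decoded = ascii_bytes.decode("utf-8")

# Unicode code points 0 to 255 carry the same numbers as Latin-1 codes.
latin1_matches = all(chr(i).encode("latin-1") == bytes([i]) for i in range(256))

print(same_bytes, decoded, latin1_matches)
```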


Types of characters
- Unicode represents characters from all major alphabets of the world.
- Unicode is also used to represent emojis!
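As a quick illustration, Python's standard `unicodedata` module can report the official name of a character, whether it comes from an alphabet or is an emoji. The helper `code_point_label` is a made-up name for this sketch.

```python
import unicodedata


def code_point_label(ch: str) -> str:
    """Return the code point in the usual U+XXXX notation."""
    return f"U+{ord(ch):04X}"


# Latin, Greek, Cyrillic, Chinese, and an emoji.
for ch in ["A", "\u03a9", "\u0416", "\u4e2d", "\U0001F600"]:
    print(ch, code_point_label(ch), unicodedata.name(ch))
```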