Encoding and Decoding Bytes Explained
A computer can only store bytes.
This means that if we want to store anything at all in a computer, we must first convert it to bytes, or encode it.
What’s an encoding?
Different types of data have different available encodings:
To store any of the data above, we must first encode this data using any of its respective encodings.
For instance, to store an image, we must first encode it using one of the encodings available for images.
ASCII and all the others listed above are also examples of encodings.
As we can see, an encoding is a format to represent images, video, audio, text, etc. in bytes.
It’s all just bytes?
This all means that all data on our disk is just a bunch of bytes. The bytes could represent a string of text, a video, or an image; we can't tell from the bytes alone.
And we won’t know until we know what encoding this data is in.
A string of bytes is pretty much useless to us unless we know its encoding.
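As a minimal sketch of this point, the snippet below reads the same two bytes in three different ways. The byte values and the variable names are chosen purely for illustration and do not come from any particular file format.

```python
# Two raw bytes with no encoding attached (values picked for illustration).
raw = bytes([72, 105])

# Read as ASCII text, the bytes spell a short word.
as_ascii = raw.decode('ascii')                 # 'Hi'

# Read as UTF-16 (little-endian), the same bytes form one CJK character.
as_utf16 = raw.decode('utf-16-le')             # '\u6948'

# Read as a big-endian unsigned integer, they are just a number.
as_int = int.from_bytes(raw, byteorder='big')  # 72 * 256 + 105 = 18537

print(as_ascii)          # Hi
print(ascii(as_utf16))   # '\u6948'
print(as_int)            # 18537
```

None of the three readings is more "correct" than the others until we know which encoding produced the bytes.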
We can see encoding and decoding at work in Python.
To encode a string, that is, to convert it to a byte string, we can call its encode(format) method, format being the encoding we want to use.
bytestring = 'Random string'.encode('utf-8')
print(bytestring)  # b'Random string'
Here, we are converting 'Random string' to its byte representation using the UTF-8 encoding.
When we print this out, we'll get b'Random string'. The b prefix is Python's way of denoting a byte string.
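However, note that what we see is not the raw bytes in bytestring. The only reason it says b'Random string' and not some byte-gibberish is that, when printing a byte string, Python shows every byte in the printable ASCII range as its ASCII character, and UTF-8 encodes those characters with the same byte values as ASCII. Bytes outside that range are shown as escapes such as \xc3. We only know it's a byte string from the b prefix.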
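To see what a printed byte string looks like once the text leaves the ASCII range, here is a small sketch. The sample word 'café' is an assumption picked for illustration; any string with a non-ASCII character would do.

```python
# A string with one non-ASCII character (the sample word is illustrative).
text = 'café'
encoded = text.encode('utf-8')

# ASCII-range bytes print as characters; the two bytes for 'é' print as escapes.
print(encoded)        # b'caf\xc3\xa9'

# The underlying values are plain integers between 0 and 255.
print(list(encoded))  # [99, 97, 102, 195, 169]

# Four characters became five bytes, because 'é' takes two bytes in UTF-8.
print(len(text), len(encoded))  # 4 5
```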
Given the encoding, we can decode a byte string in Python by calling its decode(format) method:
bytestring.decode('utf-8') # 'Random string'
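Decoding only gives back the original text when we use the encoding the bytes were written in. The sketch below, again using an illustrative sample word, decodes the same bytes with the right encoding and with two wrong ones.

```python
# Bytes produced with UTF-8 (the sample word is illustrative).
data = 'café'.encode('utf-8')

# Decoding with the same encoding restores the original string.
right = data.decode('utf-8')

# Latin-1 maps every byte to a character, so this succeeds but yields garbled text.
wrong = data.decode('latin-1')

# ASCII has no meaning for bytes above 127, so this raises an error instead.
try:
    data.decode('ascii')
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

print(right == 'café')  # True
print(ascii(wrong))     # 'caf\xc3\xa9', which displays as 'cafÃ©'
print(ascii_failed)     # True
```

A wrong encoding can therefore fail in two ways: loudly, with an exception, or silently, with text that looks plausible but is not what was stored.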