Why can the alphabet be represented in numbers in base 256

Problem Detail: 

This is in the context of hashing strings. I'm not sure why a string like CS could be represented as CS = 'c'*256 + 's'.

Does anyone know about this?

Asked By : John Swoon

Answered By : David Richerby

A number in base ten is just a sequence of digits 0–9, with the string $d_n\dots d_2 d_1 d_0$ representing the number $10^nd_n + \dots + 10^2d_2 + 10^1d_1 + 10^0d_0$. Similarly, a character in an 8-bit character set can be considered to be a "digit" between 0 and 255, so a sequence $d_n\dots d_1d_0$ of these "digits" represents the number $256^nd_n + \dots + 256^2d_2 + 256^1d_1 + 256^0d_0$.
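As a minimal sketch of the formula above (the function name `string_to_number` is just illustrative, and Latin-1 is assumed as the 8-bit character set), the following Python treats each character as one base-256 "digit" and accumulates the value the same way one would read a decimal number left to right:

```python
def string_to_number(text, base=256):
    """Interpret each character of `text` as one digit in the given base."""
    value = 0
    # Assumption: an 8-bit character set (Latin-1), so every digit is 0..255.
    for digit in text.encode("latin-1"):
        value = value * base + digit  # shift left by one digit, then add
    return value


cs = string_to_number("cs")
print(cs)                              # 25459
print(ord("c") * 256 + ord("s"))       # 99*256 + 115 = 25459
print(int.from_bytes(b"cs", "big"))    # same value via the built-in
```

So for the two-character string in the question, 'c'*256 + 's' is simply the two-digit base-256 number whose digits are the character codes 99 and 115.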

Another way to see this is to write the number out in binary. For example, a 32-bit binary number can be considered as having 32 binary digits (bits) or four base-256 digits, obtained by collecting the bits into groups of 8. In the same way, an 8-digit decimal number can be considered as a 4-digit number in base 100. For example,

$$38572856$$

can be considered as

$$38\,57\,28\,56 = 38\times 100^3 + 57\times 100^2 + 28\times 100^1 + 56\times 100^0 = 38572856\,.$$
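The regrouping can be checked mechanically. The sketch below (the helper name `digits_in_base` is an assumption for illustration) splits a number into its digits in any base by repeated division, and applies it both to the base-100 example above and to a 32-bit value split into four base-256 digits:

```python
from typing import List


def digits_in_base(number: int, base: int) -> List[int]:
    """Return the digits of a non-negative integer, most significant first."""
    if number == 0:
        return [0]
    digits = []
    while number > 0:
        number, digit = divmod(number, base)  # peel off the lowest digit
        digits.append(digit)
    return digits[::-1]


print(digits_in_base(38572856, 100))    # [38, 57, 28, 56]
print(digits_in_base(0x63730A21, 256))  # [99, 115, 10, 33], i.e. the four bytes
```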

For the Latin alphabet, base 26 would be the most natural representation, since the letters can be treated as 26 different "digits". You may also have heard of Base-64 encoding, which uses the 64 characters A...Za...z0...9+/ in that order as 64 "digits". The advantage there is that all 64 of the characters can easily be included in a text file and, by using Base-64, you can code three bytes of binary data (24 bits) into four characters (4×6 bits), which is an acceptable level of inefficiency.
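To illustrate the three-bytes-to-four-characters grouping, here is a small sketch that reads three bytes as one 24-bit number, splits it into four 6-bit "digits", and looks each one up in the 64-character alphabet. It handles exactly three bytes and ignores padding, so it is not a full encoder; Python's standard `base64` module is used only to compare the result.

```python
import base64
import string

# The 64 "digits" in order: A..Z, a..z, 0..9, +, /
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

data = b"Man"                              # exactly three bytes = 24 bits
number = int.from_bytes(data, "big")       # the three base-256 digits as one number

# Split the 24 bits into four 6-bit digits, most significant first.
digits = [(number >> shift) & 0b111111 for shift in (18, 12, 6, 0)]
manual = "".join(ALPHABET[d] for d in digits)

print(digits)                              # [19, 22, 5, 46]
print(manual)                              # TWFu
print(base64.b64encode(data).decode())     # TWFu
```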

Best Answer from Stack Exchange

Question Source : http://cs.stackexchange.com/questions/32392
