Binary to Text Converter

Two-way binary to text converter.


ASCII · UTF-8 · bidirectional · 8-bit byte groups

Instructions — Binary to Text Converter

1. Pick a direction

Binary → Text decodes 0s and 1s into readable characters. Text → Binary encodes any text into 8-bit binary. The default sample shows “Hello” in binary — click it once to see what 40 bits of UTF-8 look like.

2. Choose encoding

ASCII covers the original 128 characters (0-127): A-Z, a-z, 0-9, punctuation, and control codes. UTF-8 is a Unicode encoding and a superset of ASCII; it handles every modern script, accented letters, and emoji using 1 to 4 bytes per character. UTF-8 is the default for ~98% of the web (W3Techs, 2024).

3. Set the separator

Space (01001000 01101001) is the most readable. Comma (01001000,01101001) is sometimes used in CSV data. None (0100100001101001) is compact — the decoder auto-chunks long strings into 8-bit groups when the total length is a multiple of 8.

Decode tip: The converter accepts mixed input. A binary string of any length is split on whitespace and commas, and every resulting token must be exactly 8 bits or the converter shows an error. The one exception is a single long blob with no separators: if its length is a multiple of 8, the converter automatically slices it into 8-bit groups.
UTF-8 vs ASCII: Plain English text encodes identically in ASCII and UTF-8 because UTF-8 is backwards-compatible with ASCII. Accented characters (é, ü), Cyrillic, Greek, Asian scripts, and emoji require UTF-8’s multi-byte sequences.

Formulas

Computers store every character as a number. The binary representation is just that number written in base 2 using 8 bits per byte. Decoding binary to text reads those 8-bit groups, looks each number up in the chosen encoding table, and concatenates the resulting characters.

Binary to decimal (one byte)
$$ D = \sum_{i=0}^{7} b_i \times 2^i $$
Each bit b_i (0 or 1) is multiplied by its place value 2^i. Bit 7 is the most significant (value 128), bit 0 the least (value 1). One byte covers 0-255.
Decimal to binary
$$ b_i = \left\lfloor \frac{D}{2^i} \right\rfloor \bmod 2 $$
Repeatedly divide by 2 and keep the remainders. For D = 72 (the letter H): 72 = 64 + 8 = 2^6 + 2^3 = 01001000.
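The two formulas above can be sketched in a few lines of Python. The function names `bits_to_decimal` and `decimal_to_bits` are illustrative only and are not taken from the converter itself.

```python
def bits_to_decimal(bits: str) -> int:
    """Sum b_i * 2**i for one 8-character string of 0s and 1s."""
    if len(bits) != 8 or set(bits) - {"0", "1"}:
        raise ValueError(f"expected 8 binary digits, got {bits!r}")
    # The leftmost character is bit 7, so reverse the string to index from bit 0.
    return sum(int(b) * 2**i for i, b in enumerate(reversed(bits)))


def decimal_to_bits(value: int) -> str:
    """Compute b_i = floor(D / 2**i) mod 2 for i = 7 down to 0."""
    if not 0 <= value <= 255:
        raise ValueError(f"{value} does not fit in one byte")
    return "".join(str((value // 2**i) % 2) for i in range(7, -1, -1))


print(bits_to_decimal("01001000"))  # 72
print(decimal_to_bits(72))          # 01001000
```

Writing the bits from i = 7 down to 0 puts the most significant bit first, which matches how the byte is normally written.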
ASCII range
$$ 0 \le \text{code} \le 127 $$
Standard ASCII uses 7 bits; the 8th bit is zero. Extended ASCII (128-255) is an 8-bit superset whose mapping varies by region (Windows-1252, ISO 8859-1, etc.).
UTF-8 byte counts
$$ \text{bytes} = \begin{cases} 1 & U \le 0\text{x7F} \\ 2 & U \le 0\text{x7FF} \\ 3 & U \le 0\text{xFFFF} \\ 4 & U \le 0\text{x10FFFF} \end{cases} $$
Where U is the Unicode code point. ASCII characters use 1 byte; most European accented letters use 2; CJK ideographs use 3; most emoji and everything else in the supplementary planes (U+10000 and above) use 4.
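As a rough sketch, the byte-count rule translates directly into a chain of comparisons. The name `utf8_length` is hypothetical; the loop cross-checks it against Python's built-in UTF-8 encoder.

```python
def utf8_length(code_point: int) -> int:
    """Return how many bytes UTF-8 needs for one Unicode code point."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("outside the Unicode range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogates cannot be encoded in UTF-8")
    if code_point <= 0x7F:
        return 1
    if code_point <= 0x7FF:
        return 2
    if code_point <= 0xFFFF:
        return 3
    return 4


# A, e-acute, a CJK ideograph, and the grinning-face emoji.
for cp in (0x41, 0xE9, 0x4E2D, 0x1F600):
    assert utf8_length(cp) == len(chr(cp).encode("utf-8"))
    print(f"U+{cp:04X} -> {utf8_length(cp)} byte(s)")
```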
UTF-8 leading byte mask
$$ \begin{array}{ll} \text{1 byte:} & \texttt{0xxxxxxx} \\ \text{2 bytes:} & \texttt{110xxxxx 10xxxxxx} \\ \text{3 bytes:} & \texttt{1110xxxx 10xxxxxx 10xxxxxx} \\ \text{4 bytes:} & \texttt{11110xxx 10xxxxxx 10xxxxxx 10xxxxxx} \end{array} $$
Continuation bytes always begin with 10. The leading byte’s prefix tells the decoder how many bytes follow.
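A decoder can read the prefix with a few bit masks. The sketch below uses the hypothetical names `sequence_length` and `is_continuation`, and it only tests the prefix: it does not reject bytes such as 11000000 that have a valid-looking prefix but are forbidden in strict UTF-8.

```python
def is_continuation(byte: int) -> bool:
    """True for bytes of the form 10xxxxxx."""
    return (byte & 0b1100_0000) == 0b1000_0000


def sequence_length(lead: int) -> int:
    """Return the total length of the UTF-8 sequence that starts with `lead`."""
    if (lead & 0b1000_0000) == 0b0000_0000:   # 0xxxxxxx
        return 1
    if (lead & 0b1110_0000) == 0b1100_0000:   # 110xxxxx
        return 2
    if (lead & 0b1111_0000) == 0b1110_0000:   # 1110xxxx
        return 3
    if (lead & 0b1111_1000) == 0b1111_0000:   # 11110xxx
        return 4
    raise ValueError(f"{lead:08b} is not a leading byte")


print(sequence_length(0b11000011))   # 2
print(is_continuation(0b10101111))   # True
```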
Total bits
$$ \text{bits} = \text{bytes} \times 8 $$
A 5-character English word in ASCII or UTF-8 is 40 bits. The word “naïve” in UTF-8 is 6 bytes (48 bits) because the ï takes 2 bytes.
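The character count and the bit count diverge as soon as a multi-byte character appears. A minimal sketch, with `bit_count` as an illustrative helper name:

```python
def bit_count(text: str, encoding: str = "utf-8") -> int:
    """Number of bits needed to store `text` in the given encoding."""
    return len(text.encode(encoding)) * 8


# "na\u00efve" spells naive with a precomposed i-diaeresis (U+00EF).
for word in ("Hello", "na\u00efve"):
    print(len(word), "characters,", bit_count(word), "bits")
# 5 characters, 40 bits
# 5 characters, 48 bits
```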

Reference

Common ASCII characters in binary
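| Char | Decimal | Binary (8-bit) | Hex |
|---|---|---|---|
| (space) | 32 | 00100000 | 20 |
| ! | 33 | 00100001 | 21 |
| 0 | 48 | 00110000 | 30 |
| 9 | 57 | 00111001 | 39 |
| A | 65 | 01000001 | 41 |
| H | 72 | 01001000 | 48 |
| Z | 90 | 01011010 | 5A |
| a | 97 | 01100001 | 61 |
| h | 104 | 01101000 | 68 |
| z | 122 | 01111010 | 7A |
| ? | 63 | 00111111 | 3F |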

How “Hello” looks in binary

Each letter is one 8-bit byte. The same string is identical in ASCII and in UTF-8.

Letter-by-letter
| Letter | Decimal | Binary |
|---|---|---|
| H | 72 | 01001000 |
| e | 101 | 01100101 |
| l | 108 | 01101100 |
| l | 108 | 01101100 |
| o | 111 | 01101111 |
UTF-8 byte sizes
| Character | Bytes | Example |
|---|---|---|
| A-Z, a-z, 0-9 | 1 | A = 01000001 |
| é, ü, ñ, © | 2 | é = 11000011 10101001 |
| €, ™ | 3 | € = 11100010 10000010 10101100 |
| CJK (中文) | 3 | 中 = 11100100 10111000 10101101 |
| Emoji | 4 | 😀 = 11110000 10011111 10011000 10000000 |

ASCII was published on 17 June 1963 by the American Standards Association (now ANSI). UTF-8 was designed by Ken Thompson and Rob Pike in 1992 and is now the dominant character encoding on the web.

Article — Binary to Text Converter

A binary to text converter maps 8-bit bytes to printable characters using a standard encoding (ASCII or UTF-8). One byte holds 256 possible values (0–255); ASCII assigns letters, digits and punctuation to codes 0–127. UTF-8 extends this to all 1,112,064 valid Unicode code points (every code point except the surrogates) using 1–4 bytes per character.

The math is elementary. Each binary digit (bit) is a power of two; eight of them stacked give a number 0–255; look the number up in a table and you have a character. Everything else is bookkeeping.

What a binary to text converter does

It reads a string of 0s and 1s, splits the string into 8-bit byte groups, and looks up each byte in the chosen encoding table. The result is human-readable text. The reverse process — text to binary — encodes each character as a byte (or a multi-byte UTF-8 sequence) and concatenates the binary.
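Both directions fit in a few lines of Python. This is a minimal sketch, not the converter's actual code: the names `text_to_binary` and `binary_to_text` are hypothetical, and the decoder assumes whitespace-separated 8-bit groups.

```python
def text_to_binary(text: str, encoding: str = "utf-8", sep: str = " ") -> str:
    """Encode text and write each resulting byte as 8 binary digits."""
    return sep.join(f"{byte:08b}" for byte in text.encode(encoding))


def binary_to_text(binary: str, encoding: str = "utf-8") -> str:
    """Turn whitespace-separated 8-bit groups back into text."""
    data = bytes(int(group, 2) for group in binary.split())
    return data.decode(encoding)


encoded = text_to_binary("Hello")
print(encoded)                  # 01001000 01100101 01101100 01101100 01101111
print(binary_to_text(encoded))  # Hello
```

The lookup in the encoding table is delegated to `bytes.decode`, so the same function handles single-byte ASCII and multi-byte UTF-8 sequences.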

Binary itself is a positional numeral system in base 2: every digit is either 0 or 1, and the place values are powers of 2 (1, 2, 4, 8, 16, 32, 64, 128 for an 8-bit byte). Computers use binary because transistors are reliable two-state devices. The 8-bit byte became standard with IBM’s System/360 in 1964 and has been the foundation of every modern computer architecture since.

Did you know

The ASCII standard was published on 17 June 1963 by the American Standards Association (now ANSI). A 1969 federal procurement rule required all US government computers to support ASCII, which forced the entire industry to adopt it within a decade. Before ASCII, IBM, Burroughs, and other manufacturers each used their own incompatible character codes.

How binary to text decoding works

To decode a binary string by hand: write the string in groups of 8 (the converter accepts spaces, commas, or no separator at all). For each group, add up the place values of the 1 bits. For 01001000 the 1s are in positions 6 and 3, so the value is 2^6 + 2^3 = 64 + 8 = 72. Look 72 up in the ASCII table: H.

The text-to-binary direction works in reverse. Take each character’s numeric code and convert it to base 2, padding with leading zeros to fill 8 bits. The letter e is decimal 101, which is 64 + 32 + 4 + 1, so the bits in positions 6, 5, 2, 0 are 1: 01100101.

Place values inside one byte
bit 7 128
bit 6 64
bit 5 32
bit 4 16
bit 3 8
bit 2 4
bit 1 2
bit 0 1
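The hand method described above, adding up the place values of the 1 bits, can be sketched as follows. `set_bit_values` is an illustrative name, and the function assumes its input is already a valid 8-bit string.

```python
def set_bit_values(bits: str) -> list:
    """Return the place values of the 1 bits in an 8-bit string, high to low."""
    # Position 0 in the string is bit 7, so its place value is 2**(7 - 0) = 128.
    return [2 ** (7 - pos) for pos, bit in enumerate(bits) if bit == "1"]


values = set_bit_values("01001000")
print(values, sum(values), chr(sum(values)))  # [64, 8] 72 H
```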

ASCII: the original binary to text map

ASCII assigns numbers 0–127 to characters. Codes 0–31 and 127 are control characters — tab, line feed, carriage return, escape — not visible glyphs. Codes 32–126 are the printable set: space and ! at 32–33, digits 0–9 at 48–57, uppercase A–Z at 65–90, lowercase a–z at 97–122, common punctuation in between.

One quirk worth remembering: uppercase and lowercase versions of the same letter differ by exactly 32 (which is bit 5). That is why early systems could change case with one bitwise operation. Standard ASCII uses only 7 bits, but bytes are 8 bits wide, so the high bit is always zero. “Extended ASCII” encodings (Windows-1252, ISO 8859-1, MacRoman) use that 8th bit for accented letters, but the mappings disagree.
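The one-operation case change is an XOR with 32. A small sketch, with `toggle_case` as a hypothetical helper that leaves non-letters alone:

```python
CASE_BIT = 0b0010_0000  # bit 5, place value 32


def toggle_case(letter: str) -> str:
    """Flip bit 5 of an ASCII letter; anything else is returned unchanged."""
    if len(letter) == 1 and letter.isascii() and letter.isalpha():
        return chr(ord(letter) ^ CASE_BIT)
    return letter


print(toggle_case("A"), toggle_case("h"))           # a H
print(f"{ord('A'):08b}", f"{ord('a'):08b}")         # 01000001 01100001
```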

“Extended ASCII” is not one thing

ASCII does not define codes 128–255, and no single standard fills them. Windows-1252 puts the Euro sign at 0x80; ISO 8859-1 leaves that slot undefined. A file written in one extended ASCII and read in another produces mojibake (garbled characters). UTF-8 was designed to end this fragmentation by giving every character one unambiguous binary representation.
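The disagreement is easy to reproduce with Python's standard codecs. One assumption to note: Python's `latin-1` codec maps byte 0x80 to the control code U+0080 rather than treating the slot as undefined.

```python
raw = bytes([0x80])

as_cp1252 = raw.decode("cp1252")    # Windows-1252: the euro sign
as_latin1 = raw.decode("latin-1")   # Python's ISO 8859-1 codec: a control code

print(f"cp1252:  U+{ord(as_cp1252):04X}")   # U+20AC
print(f"latin-1: U+{ord(as_latin1):04X}")   # U+0080

# Classic mojibake: the UTF-8 bytes of e-acute read back as Windows-1252.
garbled = "\u00e9".encode("utf-8").decode("cp1252")
print(ascii(garbled))  # '\xc3\xa9', two characters instead of one
```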

UTF-8: binary to text for the whole world

UTF-8 was designed by Ken Thompson and Rob Pike in 1992 and is now used by roughly 98% of websites (W3Techs, 2024). It is a variable-length encoding: ASCII characters use 1 byte, most European accented letters use 2 bytes, CJK ideographs use 3 bytes, emoji and other supplementary characters use 4 bytes. The leading byte’s top bits announce how many bytes follow.

The genius of UTF-8 is backwards compatibility: every valid ASCII file is also a valid UTF-8 file. The leading bit of an ASCII byte is always 0, and UTF-8 reserves that pattern for single-byte (i.e., ASCII) characters. Every byte of a multi-byte UTF-8 sequence, leading or continuation, has a high bit of 1, so it cannot be mistaken for an ASCII character. Existing ASCII data therefore needed no conversion, which made UTF-8 deployable at almost no migration cost.

  • 1 byte — U+0000 to U+007F, exactly the ASCII set
  • 2 bytes — U+0080 to U+07FF, covers most Latin extensions, Greek, Cyrillic, Arabic, Hebrew
  • 3 bytes — U+0800 to U+FFFF, covers CJK ideographs and most other living scripts
  • 4 bytes — U+10000 to U+10FFFF, covers emoji and rarer scripts
  • Leading byte — 0xxxxxxx (1B), 110xxxxx (2B), 1110xxxx (3B), 11110xxx (4B)
  • Continuation byte — always 10xxxxxx
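To see how the ranges and bit patterns in the list fit together, here is a hand-rolled encoder for a single code point. It is a sketch for illustration (`encode_code_point` is a hypothetical name); in practice `str.encode("utf-8")` does this work.

```python
def encode_code_point(cp: int) -> bytes:
    """Pack one code point into UTF-8 by hand, following the bit layouts above."""
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"U+{cp:04X} is not an encodable code point")
    if cp <= 0x7F:
        return bytes([cp])                                   # 0xxxxxxx
    if cp <= 0x7FF:
        return bytes([0b1100_0000 | (cp >> 6),               # 110xxxxx
                      0b1000_0000 | (cp & 0b11_1111)])       # 10xxxxxx
    if cp <= 0xFFFF:
        return bytes([0b1110_0000 | (cp >> 12),              # 1110xxxx
                      0b1000_0000 | ((cp >> 6) & 0b11_1111),
                      0b1000_0000 | (cp & 0b11_1111)])
    return bytes([0b1111_0000 | (cp >> 18),                  # 11110xxx
                  0b1000_0000 | ((cp >> 12) & 0b11_1111),
                  0b1000_0000 | ((cp >> 6) & 0b11_1111),
                  0b1000_0000 | (cp & 0b11_1111)])


print(" ".join(f"{b:08b}" for b in encode_code_point(0x1F600)))
# 11110000 10011111 10011000 10000000
```

Each continuation byte carries 6 payload bits, which is why the shifts step down by 6.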

Common binary to text mistakes

The first error is byte length. Every byte must be exactly 8 bits. A token like 0110010 (7 bits) is not a valid byte and produces an error. If you copy binary from another source and the spacing is wrong, the converter cannot guess where one byte ends and the next begins.
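The byte-length check might look something like the sketch below. It is an assumption about the converter's behaviour based on the description on this page, not its actual code, and `split_into_bytes` is a hypothetical name.

```python
import re


def split_into_bytes(binary: str) -> list:
    """Split on whitespace and commas; slice one unseparated run into 8-bit groups."""
    tokens = [t for t in re.split(r"[\s,]+", binary.strip()) if t]
    # A single long token is auto-chunked only if its length is a multiple of 8.
    if len(tokens) == 1 and len(tokens[0]) % 8 == 0:
        blob = tokens[0]
        tokens = [blob[i:i + 8] for i in range(0, len(blob), 8)]
    for token in tokens:
        if len(token) != 8 or set(token) - {"0", "1"}:
            raise ValueError(f"not an 8-bit byte: {token!r}")
    return tokens


print(split_into_bytes("0100100001101001"))   # ['01001000', '01101001']
print(split_into_bytes("01001000,01101001"))  # ['01001000', '01101001']
```

A 7-bit token such as `0110010` raises `ValueError` rather than being padded, since padding on the wrong side would silently change the value.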

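The second error is encoding choice. A byte above 127 is meaningless in standard ASCII. If your binary contains values 128 or higher and you select ASCII, the decoder fails. Switch to UTF-8 and the same bytes are read as parts of multi-byte sequences: bytes beginning 11 are leading bytes and bytes beginning 10 are continuation bytes. They decode only if together they form a valid sequence, so a stray high byte still fails in UTF-8. The converter above flags the byte that triggered the failure.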
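In Python, the offending byte can be located from the exception the decoder raises. This is a sketch of the idea only; `decode_or_report` is a hypothetical helper and the message format is made up for illustration.

```python
def decode_or_report(data: bytes, encoding: str) -> str:
    """Decode bytes, or describe the first byte the codec rejects."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as err:
        bad = data[err.start]  # err.start is the index of the first bad byte
        return f"byte {err.start} ({bad:08b}, decimal {bad}) is not valid {encoding}"


# The six UTF-8 bytes of "naive" spelled with an i-diaeresis.
data = bytes([0x6E, 0x61, 0xC3, 0xAF, 0x76, 0x65])
print(decode_or_report(data, "ascii"))
# byte 2 (11000011, decimal 195) is not valid ascii
print(ascii(decode_or_report(data, "utf-8")))
# 'na\xefve'
```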

Binary to text worked examples

The classic: “Hello” in binary. Five characters, five bytes, 40 bits: 01001000 01100101 01101100 01101100 01101111. Each byte stands for one letter (H=72, e=101, l=108, l=108, o=111). The string is identical in ASCII and UTF-8 because all five letters fall in the 0–127 range.

An accented word: “naïve” in UTF-8. Five characters but 6 bytes (48 bits), because ï takes 2 bytes: 01101110 01100001 11000011 10101111 01110110 01100101. The third and fourth bytes (11000011 10101111) together encode U+00EF, the ï character.

An emoji: the smiley face. The grinning face emoji has Unicode code point U+1F600 (decimal 128,512). In UTF-8 this requires 4 bytes: 11110000 10011111 10011000 10000000. ASCII cannot represent it at all.

Practical uses of binary to text

Binary to text conversion is taught in introductory computer-science courses to demystify how text becomes data. Beyond pedagogy, the conversion shows up in protocol debugging, low-level file inspection, capture-the-flag puzzles, steganography, and any context where a hex dump is too dense to skim. It is also a staple of escape-room and treasure-hunt puzzles.

In real production code, you almost never convert by hand — the operating system and language runtime do it for you. But knowing the mapping helps when something goes wrong: a mojibake file, an unexpected byte order mark, a stray byte that breaks a parser. The converter above is the fastest way to ground-truth a single string when debugging.

FAQ

To convert binary to text by hand, split the binary into 8-bit groups, convert each group to decimal, then look up the decimal in the ASCII table. Example: 01001000 = 64 + 8 = 72 = the letter H. 01100101 = 64 + 32 + 4 + 1 = 101 = the letter e. The string 01001000 01100101 01101100 01101100 01101111 decodes to “Hello”.
ASCII covers 128 characters (codes 0-127); UTF-8 covers all 1,112,064 valid Unicode code points. Plain English text encodes identically in both because UTF-8 is backwards-compatible with ASCII for codes 0-127. Accented letters, Cyrillic, Greek, CJK ideographs, and emoji require UTF-8’s multi-byte sequences (2-4 bytes per character).
A byte is exactly 8 bits. One byte can represent 256 distinct values (0 through 255, or binary 00000000 through 11111111). ASCII uses the lower half (0-127); extended 8-bit encodings such as Windows-1252 and ISO 8859-1 use all 256 values. UTF-8 uses 1, 2, 3, or 4 bytes per character.
“Hello” in binary is 01001000 01100101 01101100 01101100 01101111: 5 bytes, 40 bits total. Each byte stands for one letter: H=72, e=101, l=108, l=108, o=111. The same binary works in ASCII and UTF-8 because all five characters are below code point 128.
To write binary without spaces, concatenate the 8-bit bytes: 0100100001100101011011000110110001101111 is the same as 01001000 01100101 01101100 01101100 01101111. A decoder splits the string back into 8-bit groups from left to right. The total length must be a multiple of 8.
Emoji are usually stored as a 4-byte UTF-8 sequence, because most emoji have Unicode code points above U+FFFF. The grinning face (code point U+1F600) is 11110000 10011111 10011000 10000000. ASCII cannot represent emoji because ASCII tops out at code 127.
Binary text is written in groups of 8 because the byte (8 bits) is the standard storage unit on modern computers. The 8-bit byte became standard with IBM’s System/360 in 1964. Eight bits give 256 values, enough to cover the full Latin alphabet, digits, punctuation, and control characters with room to spare. Networks, files, and CPU registers are all designed around byte-sized units.
ASCII was published on 17 June 1963 by the American Standards Association (now ANSI). ASCII (American Standard Code for Information Interchange) replaced dozens of incompatible proprietary encodings used by IBM, Burroughs, and other manufacturers. A 1969 federal procurement rule required US government computers to support ASCII, which cemented its adoption.