Compression ratio calculator: file size reduction explained
Compression ratio is the original file size divided by the compressed file size, expressed as X:1. A 100 MB file compressed to 20 MB has a 5:1 ratio, equivalent to 80% space savings. Modern video codecs routinely achieve 50:1 or higher; lossless archivers like ZIP typically land between 2:1 and 3:1.
The math is simple but the implications are wide: storage cost, network transfer time, streaming quality, and battery life on mobile devices all depend on compression ratios. Understanding what a given ratio means — and what it cannot mean — is essential for evaluating codec choices and storage budgets.
What is compression ratio?
Compression ratio is a dimensionless number that measures how much smaller the compressed output is compared to the original input. A ratio of 1:1 means no compression occurred; a ratio of 10:1 means the output is one-tenth the input size.
The IEEE digital signal processing literature uses the same definition consistently. The ratio can be expressed as either a fraction (original/compressed) or as a percent of space saved, and the two are interconvertible: a 4:1 ratio equals 75% space saved, computed as (1 − 1/4) × 100.
The DEFLATE algorithm — used inside ZIP, gzip, and PNG — is documented in IETF RFC 1951. It combines LZ77 dictionary coding with Huffman entropy coding, the same two techniques that underpin most general-purpose compressors today.
The compression ratio formula
Five formulas cover almost every question.
- Compression ratio = original_size / compressed_size
- Space savings % = (1 − compressed/original) × 100
- Savings from ratio = (1 − 1/CR) × 100
- Bits per byte (after compression) = (compressed/original) × 8
- Inverse: target compressed size = original / desired_ratio
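The formulas above translate directly into code. The following is a minimal sketch; the function names are illustrative choices, not part of any standard library, and sizes can be in any unit as long as both arguments use the same one.

```python
def compression_ratio(original_size: float, compressed_size: float) -> float:
    """Return the X in X:1 (original size divided by compressed size)."""
    if original_size <= 0 or compressed_size <= 0:
        raise ValueError("sizes must be positive")
    return original_size / compressed_size


def space_savings_percent(original_size: float, compressed_size: float) -> float:
    """Percent of the original size removed by compression."""
    return (1 - compressed_size / original_size) * 100


def savings_from_ratio(ratio: float) -> float:
    """Convert a ratio such as 4 (meaning 4:1) into percent saved."""
    return (1 - 1 / ratio) * 100


def bits_per_byte(original_size: float, compressed_size: float) -> float:
    """Average compressed bits spent per original byte."""
    return (compressed_size / original_size) * 8


def target_compressed_size(original_size: float, desired_ratio: float) -> float:
    """Size the output must reach to hit the desired ratio."""
    return original_size / desired_ratio


if __name__ == "__main__":
    original_mb, compressed_mb = 100, 20  # the 100 MB -> 20 MB example
    print(f"ratio:   {compression_ratio(original_mb, compressed_mb):.1f}:1")
    print(f"savings: {space_savings_percent(original_mb, compressed_mb):.1f}%")
    print(f"4:1 saves {savings_from_ratio(4):.1f}%")
    print(f"bits per byte: {bits_per_byte(original_mb, compressed_mb):.2f}")
    print(f"target size at 5:1: {target_compressed_size(original_mb, 5):.1f} MB")
```

Keeping the ratio and the savings percentage as separate functions avoids the common mix-up between "5:1" and "80%", which describe the same result in different units.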
Quick reference:
- 2:1 = 50% saved
- 4:1 = 75% saved
- 10:1 = 90% saved
- 50:1 = 98% saved
- 100:1 = 99% saved
Lossless vs lossy compression
Lossless compression reconstructs the original bytes exactly. The decompressed file is bit-for-bit identical to the input. ZIP, gzip, PNG, and FLAC are all lossless. Typical ratios are 1.5:1 to 3:1 for general data, higher for highly redundant text or sparse images.
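The bit-for-bit guarantee is easy to check with Python's built-in zlib module, which implements DEFLATE. This is a small sketch: the sample text is made up for illustration and is far more repetitive than typical data, so the ratio it produces is much higher than the general-data range quoted above.

```python
import zlib


def round_trip(data: bytes, level: int = 6) -> bytes:
    """Compress with DEFLATE (zlib), then decompress again."""
    return zlib.decompress(zlib.compress(data, level))


def deflate_ratio(data: bytes, level: int = 6) -> float:
    """Original size divided by zlib-compressed size."""
    return len(data) / len(zlib.compress(data, level))


# Hypothetical sample: one sentence repeated many times (highly redundant).
sample = b"the quick brown fox jumps over the lazy dog\n" * 1000

if __name__ == "__main__":
    restored = round_trip(sample)
    print("identical after round trip:", restored == sample)
    print(f"ratio on this redundant sample: {deflate_ratio(sample):.0f}:1")
```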
Lossy compression deliberately discards information judged perceptually unimportant. JPEG, MP3, H.264, and AAC are all lossy. They achieve much higher ratios — often 10:1 to 100:1 — at the cost of some quality loss that may or may not be visible to humans.
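The trade-off can be illustrated with a toy example. The sketch below is not how JPEG, MP3, or H.264 work; it simply throws away the low-order bits of synthetic 8-bit samples before running a lossless compressor, which shows the general pattern: less information kept, smaller output, inexact reconstruction. The signal, the noise level, and the function names are all assumptions made for the illustration.

```python
import math
import random
import zlib


def make_noisy_samples(count: int, seed: int = 0) -> bytes:
    """Synthetic 8-bit signal: a slow sine wave plus small random noise."""
    rng = random.Random(seed)
    samples = bytearray()
    for i in range(count):
        tone = 128 + 100 * math.sin(2 * math.pi * i / 200)
        noise = rng.uniform(-8, 8)
        samples.append(max(0, min(255, round(tone + noise))))
    return bytes(samples)


def drop_low_bits(samples: bytes, bits_to_drop: int) -> bytes:
    """Toy 'lossy' step: zero the lowest bits of every sample."""
    mask = 0xFF & ~((1 << bits_to_drop) - 1)
    return bytes(s & mask for s in samples)


if __name__ == "__main__":
    raw = make_noisy_samples(50_000)
    coarse = drop_low_bits(raw, 4)

    lossless_size = len(zlib.compress(raw, 9))
    lossy_size = len(zlib.compress(coarse, 9))
    worst_error = max(abs(a - b) for a, b in zip(raw, coarse))

    print(f"lossless ratio: {len(raw) / lossless_size:.2f}:1")
    print(f"toy lossy ratio: {len(raw) / lossy_size:.2f}:1")
    print(f"largest per-sample error after the lossy step: {worst_error}")
```

Real lossy codecs decide what to discard using perceptual models rather than a fixed bit mask, which is why they reach far higher ratios at acceptable quality.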
Typical compression ratios by format
Ratios depend on content, codec settings, and algorithm. Approximate norms:
- ZIP / gzip (DEFLATE) = 2:1 to 3:1 on text; near 1:1 on already-compressed media
- PNG (lossless image) = 2:1 to 3:1 on photographs, much higher on flat graphics
- FLAC (lossless audio) = 1.5:1 to 2:1 on CD-quality PCM
- JPEG (lossy image) = ~10:1 at default quality; 4:1 high quality; 20:1 aggressive
- MP3 (lossy audio) = ~11:1 versus uncompressed 16-bit PCM
- H.264 (lossy video) = 50:1 to 200:1 depending on resolution and quality target
- H.265 / HEVC = roughly 2× H.264 efficiency for similar perceived quality
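Ratios for audio and video are usually derived from bitrates rather than file sizes. The sketch below works through two cases. The encoded bitrates (128 kbit/s for MP3, 8 Mbit/s for 1080p H.264) and the 24 bits per pixel for raw video are illustrative assumptions, not figures taken from any specification.

```python
def raw_pcm_bitrate(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Bits per second of uncompressed PCM audio."""
    return sample_rate_hz * bits_per_sample * channels


def raw_video_bitrate(width: int, height: int, bits_per_pixel: int, fps: int) -> int:
    """Bits per second of uncompressed video frames."""
    return width * height * bits_per_pixel * fps


# CD-quality stereo audio: 44.1 kHz, 16-bit, 2 channels.
cd_bitrate = raw_pcm_bitrate(44_100, 16, 2)

# Assumed MP3 bitrate of 128 kbit/s.
mp3_ratio = cd_bitrate / 128_000

# 1080p at 30 frames per second, assuming 24 bits per pixel uncompressed.
video_bitrate = raw_video_bitrate(1920, 1080, 24, 30)

# Assumed H.264 bitrate of 8 Mbit/s.
h264_ratio = video_bitrate / 8_000_000

if __name__ == "__main__":
    print(f"CD PCM: {cd_bitrate} bit/s, MP3 ratio about {mp3_ratio:.1f}:1")
    print(f"Raw 1080p30: {video_bitrate} bit/s, H.264 ratio about {h264_ratio:.0f}:1")
```

Changing either assumed bitrate changes the ratio proportionally, which is why quoted video ratios span such a wide range.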
Compression ratio vs engine ratio
The phrase "compression ratio" appears in two unrelated contexts. In data compression, it measures file size reduction. In automotive engineering, it measures the volume of an engine cylinder before and after compression of the air-fuel mixture — a mechanical efficiency metric.
A modern naturally aspirated gasoline engine runs at roughly 10:1 to 13:1 mechanical compression. A diesel engine, which relies on compression heat to ignite fuel, runs at 14:1 to 23:1. The numbers look similar to data compression ratios but the physics is completely different.
Engine compression ratio describes a physical volume change inside a metal cylinder. Data compression ratio describes a numerical size reduction in a digital file. The math notation is the same; the underlying systems are not comparable.
Why some files cannot be compressed
Shannon's source coding theorem (1948) puts a hard floor on lossless compression: the average compressed length cannot fall below the entropy of the source. Truly random data has maximum entropy and cannot be compressed at all. Already-compressed files (a JPEG, a ZIP, an MP4) are close to maximum entropy from the perspective of a second compressor; running them through gzip afterward typically grows them slightly due to header overhead.
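A rough way to see the entropy limit is to measure the byte-frequency entropy of some data and compare it with what a real compressor achieves. The sketch below is a simplification: it estimates entropy from single-byte frequencies only, so it ignores repeated phrases and other structure that a compressor like DEFLATE can exploit. The seeded pseudo-random bytes stand in for truly random data.

```python
import math
import random
import zlib
from collections import Counter


def byte_entropy_bits(data: bytes) -> float:
    """Shannon entropy in bits per byte, from single-byte frequencies only."""
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def pseudo_random_bytes(count: int, seed: int = 0) -> bytes:
    """Seeded pseudo-random bytes (a stand-in for truly random data)."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(count))


if __name__ == "__main__":
    noise = pseudo_random_bytes(100_000)
    compressed = zlib.compress(noise, 9)
    print(f"entropy estimate: {byte_entropy_bits(noise):.3f} bits per byte (max is 8)")
    print(f"zlib ratio on the same bytes: {len(noise) / len(compressed):.4f}:1")
```

An estimate close to 8 bits per byte means there is almost nothing left to remove, and the measured ratio sits at or slightly below 1:1.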
That is also why "re-zipping" a file rarely helps. If the compressor was efficient the first time, the residual redundancy is near zero. Some files actually grow when compressed, because the format's headers and metadata take more bytes than the algorithm saves.
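The effect of compressing twice can be measured directly. This sketch uses Python's gzip module on made-up text built from a small, hypothetical vocabulary; the exact sizes depend on the input and the zlib version, so it prints the numbers rather than asserting a fixed result.

```python
import gzip
import random


def make_sample_text(word_count: int, seed: int = 0) -> bytes:
    """Pseudo-text built from a small vocabulary (illustrative data only)."""
    vocabulary = [
        "compression", "ratio", "file", "size", "codec", "entropy",
        "lossless", "lossy", "archive", "stream", "header", "block",
        "storage", "network", "quality", "bitrate",
    ]
    rng = random.Random(seed)
    words = (rng.choice(vocabulary) for _ in range(word_count))
    return " ".join(words).encode("ascii")


def compress_twice(data: bytes) -> tuple:
    """Return (original, once-compressed, twice-compressed) sizes in bytes."""
    once = gzip.compress(data)
    twice = gzip.compress(once)
    return len(data), len(once), len(twice)


if __name__ == "__main__":
    original, once, twice = compress_twice(make_sample_text(50_000))
    print(f"original:         {original} bytes")
    print(f"gzip once:        {once} bytes ({original / once:.2f}:1)")
    print(f"gzip of the gzip: {twice} bytes ({once / twice:.2f}:1 on the second pass)")
```

On input like this the second pass typically lands at or just under 1:1, because the first pass has already removed the redundancy and the second only adds its own header and trailer.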
For archives that mix many small text files, ZIP and tar+gzip both work well. For media that is already lossy-compressed (photos, music, video), bundle the files without recompression, for example with plain tar or 7z in store mode: you get one convenient container, but no meaningful reduction in file size.
Compression ratio mistakes to avoid
Three common errors come up around compression ratio:
- Comparing different content types. Text and video are not comparable, since their underlying entropy differs by orders of magnitude.
- Ignoring codec settings. A JPEG at quality 95 and a JPEG at quality 50 produce wildly different ratios on the same image.
- Treating a high ratio as automatically better. High lossy ratios mean more quality loss, not necessarily better compression. The right metric is quality per bit, not raw ratio.
A subtler trap is comparing compression ratios produced by tools with different default settings. Two ZIP utilities may report identical filename and metadata but use different DEFLATE compression levels, producing files that differ by 10-15% in size. When benchmarking compressors, always specify the level or setting alongside the ratio — otherwise the comparison is meaningless.
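The dependence on level is easy to reproduce with zlib, which exposes DEFLATE levels from 1 (fastest) to 9 (smallest). The sketch below uses made-up text from a small, hypothetical vocabulary; the size of the gap between levels varies with the input, so the 10-15% figure should be read as typical rather than guaranteed.

```python
import random
import zlib


def make_sample_text(word_count: int, seed: int = 0) -> bytes:
    """Pseudo-text built from a small vocabulary (illustrative data only)."""
    vocabulary = [
        "compression", "ratio", "file", "size", "codec", "entropy",
        "lossless", "lossy", "archive", "stream", "header", "block",
        "storage", "network", "quality", "bitrate",
    ]
    rng = random.Random(seed)
    words = (rng.choice(vocabulary) for _ in range(word_count))
    return " ".join(words).encode("ascii")


def ratios_by_level(data: bytes, levels=(1, 6, 9)) -> dict:
    """Map each DEFLATE level to the ratio it achieves on the same input."""
    return {level: len(data) / len(zlib.compress(data, level)) for level in levels}


if __name__ == "__main__":
    sample = make_sample_text(50_000)
    for level, ratio in ratios_by_level(sample).items():
        print(f"zlib level {level}: {ratio:.2f}:1")
```

Reporting the level next to each ratio, as this loop does, is what makes two benchmark results comparable.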
Modern operating systems can compress filesystem data transparently. APFS on macOS, NTFS on Windows, and ZFS (available on Linux, among other systems) all support it, although whether it is active depends on how the volume or folder is configured. Where it is active, the disk usage reported for a file may be its compressed footprint rather than its logical size, so comparing the two numbers gives a quick estimate of the ratio actually achieved.
One last practical note: compression and encryption interact badly. Encrypted data has near-maximum entropy and resists compression. If you need both, compress first and then encrypt. Compressing already-encrypted data wastes time and produces near-zero space savings, since the entropy is already at the Shannon limit.
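The ordering effect can be demonstrated without a cryptography library by using a toy stream cipher. The cipher below (XOR with a SHAKE-256 keystream) is an illustration only and must not be used to protect real data; it is here because its output looks random to a compressor, as the output of a real cipher would. The key and sample text are made up.

```python
import hashlib
import zlib


def toy_stream_cipher(data: bytes, key: bytes) -> bytes:
    """XOR data with a SHAKE-256 keystream. Illustration only, NOT secure.

    Applying the function twice with the same key restores the input.
    """
    keystream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, keystream))


def compress_then_encrypt(data: bytes, key: bytes) -> bytes:
    return toy_stream_cipher(zlib.compress(data, 9), key)


def encrypt_then_compress(data: bytes, key: bytes) -> bytes:
    return zlib.compress(toy_stream_cipher(data, key), 9)


if __name__ == "__main__":
    key = b"example key"  # hypothetical key for the demo
    text = b"compress first, then encrypt\n" * 2000

    good = compress_then_encrypt(text, key)
    bad = encrypt_then_compress(text, key)

    print(f"original:              {len(text)} bytes")
    print(f"compress then encrypt: {len(good)} bytes")
    print(f"encrypt then compress: {len(bad)} bytes")

    # Decrypting and then decompressing recovers the original text.
    print("round trip ok:", zlib.decompress(toy_stream_cipher(good, key)) == text)
```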