What Is a Hash Function?
Hash functions turn any input into a fixed-length fingerprint. Understanding what they guarantee — and what they don't — is the foundation for everything else in this course.
A hash function takes any input — a single character, a gigabyte of video, a JSON object — and produces a fixed-length output called a digest. SHA-256 always produces 64 hex characters. SHA-512 always produces 128. The input size doesn't matter.
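The fixed-length claim is easy to check for yourself. The sketch below uses Python's standard hashlib module; the three sample inputs are arbitrary choices for illustration, not part of the lesson's tooling.

```python
import hashlib

# Three inputs of very different sizes: one character, a small JSON
# string, and about one megabyte of zero bytes (illustrative values).
inputs = [b"a", b'{"user": "alice", "id": 42}', b"\x00" * 1_000_000]

for data in inputs:
    sha256_hex = hashlib.sha256(data).hexdigest()
    sha512_hex = hashlib.sha512(data).hexdigest()
    # Columns: input size in bytes, SHA-256 hex length, SHA-512 hex length.
    print(len(data), len(sha256_hex), len(sha512_hex))
```

Every row reports 64 and 128 in the last two columns, whatever the input size.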
Three properties that make hashing useful
Deterministic. The same input always produces the same output. Hash "hello" a million times and you get the same 64 characters every time. This is what makes hashes useful for verification.
One-way. Given only a hash output, there is no practical way to compute the original input. There's no "unhash" function. The best anyone can do is guess an input, hash it, and compare.
Avalanche effect. Change one bit of the input and roughly half the output bits flip. "hello" and "hellp" differ by a single character, yet they produce completely different hashes with no resemblance to each other. This means a known hash tells you nothing useful about the hashes of nearby inputs.
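Determinism and the avalanche effect can both be measured directly. The following is a minimal sketch using Python's hashlib; the helper names (sha256_int, differing_bits, flip_lowest_bit_of_last_byte) are made up for this illustration. It counts how many of the 256 output bits differ between two digests.

```python
import hashlib

def sha256_int(data: bytes) -> int:
    """Return the SHA-256 digest of data as a 256-bit integer."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the output bits that differ between the digests of a and b."""
    return bin(sha256_int(a) ^ sha256_int(b)).count("1")

def flip_lowest_bit_of_last_byte(data: bytes) -> bytes:
    """Return a copy of data with exactly one input bit changed."""
    return data[:-1] + bytes([data[-1] ^ 0x01])

original = b"hello"
one_bit_off = flip_lowest_bit_of_last_byte(original)  # b"helln"
one_char_off = b"hellp"

# Deterministic: hashing the same input again changes nothing.
print(differing_bits(original, original))      # 0
# Avalanche: expect a count somewhere near 128 of 256 bits in both cases.
print(differing_bits(original, one_bit_off))
print(differing_bits(original, one_char_off))
```

The exact counts depend on the inputs, but for a well-behaved hash they cluster around half of the output bits.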
What hashes actually guarantee
Hashes guarantee integrity, not secrecy. If you download a file and its SHA-256 hash matches the one the publisher posted, the file you received is the file they hashed, provided you obtained the published hash from a source you trust. Detecting changes like this is the use case hash functions were designed for.
What they don't guarantee is confidentiality. The algorithm is public and takes no secret key, so anyone can compute SHA-256 of any input. If you hash the word "password" and store that hash, anyone who knows you used SHA-256 can hash "password" themselves and see if the outputs match. This is the core problem with using plain SHA-256 for passwords, and we'll dig into it in the next lesson.
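The guess-and-compare problem described above can be sketched in a few lines. This is a toy illustration using Python's hashlib; the stored hash, the guess list, and the helper names are all hypothetical, and a real attacker would use a far larger list.

```python
import hashlib
from typing import Optional

def sha256_hex(text: str) -> str:
    """Hash a string with plain, unsalted SHA-256."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# Hypothetical: what a site would store if it hashed "password" directly.
stored_hash = sha256_hex("password")

# Hypothetical list of common guesses.
candidate_guesses = ["123456", "qwerty", "password", "letmein"]

def recover(stored: str, guesses: list[str]) -> Optional[str]:
    """Return the first guess whose hash matches stored, or None."""
    for guess in guesses:
        if sha256_hex(guess) == stored:
            return guess
    return None

print(recover(stored_hash, candidate_guesses))  # prints "password"
```

Nothing here reverses the hash. The one-way property still holds, but it does not help when the input is easy to guess.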
Collision resistance
No hash function is collision-free. There are unlimited possible inputs and only a fixed number of possible outputs, so different inputs that produce the same output must exist. What varies is how hard it is to find one on purpose.
MD5 and SHA-1 have known collision attacks: researchers can deliberately engineer two different inputs with the same hash. This is why you shouldn't rely on them for integrity checks where someone might tamper with the data on purpose. SHA-256 has no known practical collision attacks. The birthday attack bound (finding any collision, not a targeted one) for SHA-256 is around 2¹²⁸ operations, which is not achievable with any foreseeable hardware.
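The birthday bound says that for an n-bit hash, a collision is expected after roughly 2^(n/2) attempts. That is out of reach for the full 256 bits, but it can be observed on a deliberately weakened hash. The sketch below is an illustration under stated assumptions: it keeps only the first 24 bits of SHA-256 (so a collision should turn up after a few thousand tries), and the "message-N" inputs are arbitrary.

```python
import hashlib

TRUNCATED_BYTES = 3  # keep only 24 bits of the digest (toy setting)

def truncated_sha256(data: bytes) -> bytes:
    """Return the first TRUNCATED_BYTES bytes of the SHA-256 digest."""
    return hashlib.sha256(data).digest()[:TRUNCATED_BYTES]

def find_truncated_collision() -> tuple[bytes, bytes, int]:
    """Hash distinct messages until two share a truncated digest.

    Returns the two colliding messages and the number of messages hashed.
    """
    seen: dict[bytes, bytes] = {}
    counter = 0
    while True:
        message = f"message-{counter}".encode("ascii")
        tag = truncated_sha256(message)
        if tag in seen:
            return seen[tag], message, counter + 1
        seen[tag] = message
        counter += 1

first, second, attempts = find_truncated_collision()
print(first, second, attempts)
```

Each extra bit of output multiplies the expected work by about 1.4, which is how 24 bits takes a moment and 256 bits takes around 2¹²⁸ operations.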
Where hashes show up in practice
Git uses SHA-1 (transitioning to SHA-256) to identify every commit, tree, and blob. When you run git log, those 40-character hex strings are SHA-1 hashes. Git treats two objects with the same hash as the same object, so two commits with the same hash but different content would be a successful collision attack against Git.
TLS certificates carry a digital signature from a certificate authority, and that signature is computed over a hash of the certificate contents. Your browser hashes the contents itself and checks the signature against that hash to confirm the certificate wasn't altered.
File downloads: many projects that distribute binaries publish SHA-256 checksums alongside them. After downloading, you hash the file locally and compare. If the two values match, your copy is byte-for-byte the file the publisher hashed.
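To make the Git case concrete, here is a minimal sketch of how a blob ID is derived in a repository that uses the SHA-1 object format: Git hashes a short header ("blob", the content length, and a NUL byte) followed by the file content. The function name git_blob_id is made up for this example.

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Compute the SHA-1 object ID Git assigns to a blob with this content."""
    # Header format: the object type, a space, the byte length, then NUL.
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()

# The empty blob has a well-known ID in SHA-1 repositories.
print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

You can compare the result for any file against git hash-object, which prints the ID Git would assign to that file.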
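The download check can be sketched as follows. This is one possible approach using only Python's standard library; the function names and the 64 KiB chunk size are illustrative choices. The file is read in chunks so that large downloads don't need to fit in memory.

```python
import hashlib
import hmac

def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published_checksum(path: str, published_hex: str) -> bool:
    """Compare the local file's digest with a published hex checksum."""
    actual = sha256_of_file(path)
    # Normalize case and whitespace, since checksum files vary in format.
    expected = published_hex.strip().lower()
    return hmac.compare_digest(actual, expected)
```

A call such as matches_published_checksum("installer.bin", published_value), where both arguments are placeholders, returns True only when the local file hashes to the published value.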
Try it
Hash the same string twice in the tool below — notice the output is identical. Then change one character and observe how completely different the output becomes. That's the avalanche effect in action.