How the Web Actually Works: HTTP from the Ground Up

I’ve been going through Jim Kurose’s networking lectures lately, and I kept finding myself pausing to re-read the same sections. Not because they were confusing - because things I’d been using for years were finally clicking into place.
So here’s me writing down what I learned, in the order it actually started making sense.
What even is HTTP?
HTTP stands for Hypertext Transfer Protocol. It’s the application layer protocol the web runs on.
Two sides: a client (your browser) and a server (some machine somewhere). The client requests stuff. The server sends it back.
What moves between them are called objects: an HTML file, a JPEG, a JavaScript bundle, a video. A single webpage is usually a base HTML file that references a bunch of other objects, each with its own URL.
Simple enough.
HTTP doesn’t handle its own connections. That’s where TCP comes in.
HTTP hands this job to a lower-level protocol called TCP. When you type in a URL, your browser opens a TCP connection to the server on port 80 (the default for plain HTTP; HTTPS uses 443), and then asks for the file.
Opening a TCP connection isn’t free. It takes a round trip - your machine says hello, the server says hello back, and then you can actually talk. That’s one full round trip time (RTT) just to shake hands. Before a single byte of your webpage arrives.
So every HTTP request on a fresh connection carries at least 2 RTTs of overhead: one for the handshake, one for the actual request and response.
Do that 20 times and you’ve spent 40 RTTs before the page renders.
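To make the "open a TCP connection, then ask for the file" sequence concrete, here's a minimal sketch in Python. It's not how a browser is implemented. To stay self-contained it talks to a throwaway server on my own machine instead of port 80 on a real host, and the names HelloHandler and fetch_once are made up for this example.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in server: answers every GET with a tiny page."""

    def do_GET(self):
        body = b"<html>hello</html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_once(host, port, path):
    """Open one TCP connection, send one GET, read until the server closes."""
    # create_connection performs the TCP handshake (the first round trip).
    with socket.create_connection((host, port), timeout=5) as sock:
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        sock.sendall(request.encode("ascii"))  # the second round trip starts here
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # empty read: the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


# Port 0 asks the OS for any free port, so this doesn't need port 80.
server = HTTPServer(("127.0.0.1", 0), HelloHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

raw = fetch_once("127.0.0.1", server.server_port, "/index.html")
print(raw.split(b"\r\n", 1)[0].decode())  # the status line of the response

server.shutdown()
server.server_close()
```

Each call to fetch_once pays for a fresh handshake, which is exactly the overhead described above.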
HTTP/1.0 vs 1.1: one tiny change that changed everything
Version 1.0 was simple but painful. Open a TCP connection, fetch one object, close the connection. Repeat for every single image, CSS file, or script on the page.
That’s 2 RTTs per object, every time.
Then came HTTP/1.1. And the change was simple: instead of closing the connection after each response, the server just leaves it open.
Open once, fetch everything you need, then close. That cuts subsequent requests from 2 RTTs to 1 RTT each. For a page with 20 objects, that’s not microseconds, that’s hundreds of milliseconds of real time saved.
And in the world of user experience, hundreds of milliseconds is the difference between “this feels snappy” and “this feels slow.”
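Here's the arithmetic as a small Python sketch. It's a back-of-the-envelope model, not a measurement: it ignores transmission time and parallel connections, and the 50 ms RTT is a number I picked for illustration.

```python
def rtts_non_persistent(num_objects: int) -> int:
    """HTTP/1.0 style: every object pays a handshake plus a request/response."""
    return 2 * num_objects


def rtts_persistent(num_objects: int) -> int:
    """HTTP/1.1 style: one handshake, then one RTT per object."""
    if num_objects == 0:
        return 0
    return 1 + num_objects


ASSUMED_RTT_MS = 50  # assumption for illustration only
objects = 20

old = rtts_non_persistent(objects)        # 40
new = rtts_persistent(objects)            # 21
saved_ms = (old - new) * ASSUMED_RTT_MS   # 950

print(f"{old} RTTs vs {new} RTTs, about {saved_ms} ms saved")
```

With those assumptions, a 20-object page drops from 40 RTTs to 21, which is where the "hundreds of milliseconds" comes from.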
What HTTP messages actually look like
Here’s something I love about HTTP: the messages are plain text. You can read them with your eyes. That was a deliberate design choice.
A request looks like this:
GET /index.html HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0
Connection: keep-alive
First line tells you everything: method, path, and version. Then headers. For POST requests (like when you submit a form), there’s a body after all this where the actual data lives.
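Because the format is plain text, you can assemble a request by hand. Here's a rough sketch in Python; build_request is a helper I made up, and the /login path and form field are placeholders. The detail the listing above can't show is that lines end with CRLF and an empty line separates the headers from the body.

```python
def build_request(method, path, host, headers=None, body=b""):
    """Serialize a request: request line, headers, blank line, optional body."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    if body:
        # The server needs to know how many body bytes to expect.
        lines.append(f"Content-Length: {len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"  # the blank line ends the headers
    return head.encode("ascii") + body


get_request = build_request(
    "GET",
    "/index.html",
    "www.example.com",
    {"User-Agent": "Mozilla/5.0", "Connection": "keep-alive"},
)

# Hypothetical form submission: the data travels in the body.
post_request = build_request(
    "POST",
    "/login",
    "www.example.com",
    {"Content-Type": "application/x-www-form-urlencoded"},
    b"user=alice",
)

print(get_request.decode("ascii"))
```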
A response looks like this:
HTTP/1.1 200 OK
Date: Sun, 14 Jun 2026 10:00:00 GMT
Content-Type: text/html
Content-Length: 4821

<html>...
Status line, headers, a blank line, then the actual content. Status codes tell you what happened: 200 means it worked. 301 means the page moved and your browser follows it silently, which is why you sometimes end up on a different URL than you typed. 404 means it’s not there. 500 means the server just gave up.
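Reading a response is just as mechanical. Here's a minimal sketch of a parser in Python, assuming the whole response is already in memory; parse_response is a name I made up, and real clients handle plenty of cases this skips (chunked bodies, repeated headers, and so on).

```python
def parse_response(raw: bytes):
    """Split a raw response into version, status code, reason, headers, body."""
    # Everything before the first blank line is the status line plus headers.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Status line: version, numeric code, then a free-text reason phrase.
    version, code, reason = lines[0].split(" ", 2)

    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # names are case-insensitive

    return version, int(code), reason, headers, body


raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Date: Sun, 14 Jun 2026 10:00:00 GMT\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"<html></html>"
)

version, code, reason, headers, body = parse_response(raw)
print(code, reason, headers["content-type"], len(body))
```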
One method worth knowing: HEAD. It works like GET, except the server sends back only the status line and headers, without the body.
HTTP is stateless. Cookies exist because of that.
The server forgets you the moment your request is done. Every. Single. Time.
That’s intentional. Stateless servers are simpler to build and easier to scale. But it creates an obvious problem: how does Amazon know you’re still logged in when you click from one page to another?
The answer is cookies.
When you first log in, the server generates a unique ID and sends it back to your browser:
Set-Cookie: user_id=8273645
Your browser saves this. On every future request to that domain, it automatically sends:
Cookie: user_id=8273645
The server looks up that ID in its own database and now remembers who you are.
The crucial insight here: the state lives on the server, in a database. The cookie is just the key. That’s why you can clear your cookies and log out - you’ve thrown away the key, even though the server still has the data.
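Here's a toy version of that flow in Python, just to show where the state lives. The SESSIONS dict stands in for the server's database, and log_in and who_is_this are names I invented. I generate a random ID instead of a short number like the one above, since a guessable ID would let anyone pretend to be you.

```python
import secrets
from http.cookies import SimpleCookie

# Server-side state: ID -> user data. A real server would use a database.
SESSIONS = {}


def log_in(username):
    """Create server-side state and return the Set-Cookie header line."""
    user_id = secrets.token_hex(16)  # random, hard-to-guess key
    SESSIONS[user_id] = {"username": username}
    return f"Set-Cookie: user_id={user_id}"


def who_is_this(cookie_header_value):
    """Look up the user from the value of an incoming Cookie header."""
    cookie = SimpleCookie()
    cookie.load(cookie_header_value)
    morsel = cookie.get("user_id")
    if morsel is None:
        return None  # no key presented, so the server can't recognize you
    session = SESSIONS.get(morsel.value)
    return session["username"] if session else None


set_cookie_line = log_in("alice")
cookie_value = set_cookie_line.split(": ", 1)[1]  # what the browser stores

print(who_is_this(cookie_value))  # the browser sent the key back
print(who_is_this(""))            # cookies cleared: the key is gone
print(len(SESSIONS))              # ...but the server still holds the data
```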
Four HTTP methods worth knowing
GET: The default. Fetch stuff. If you need to send data, it goes in the URL after a ?:
GET /search?q=http+tutorial HTTP/1.1
POST: Data goes in the body, not the URL. Cleaner for sensitive stuff, doesn’t get saved in browser history.
HEAD: Like GET but no body. Just the headers. Great for checking if a file exists, or for cache validation.
PUT: Upload a file. Replaces whatever’s at that URL with what you’re sending.
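The GET-versus-POST difference is easy to see with Python's standard urllib.parse module. This is a small sketch: the encoding is the same in both cases, and only the place the data travels changes. The /search path and the q parameter come from the example above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

params = {"q": "http tutorial"}
encoded = urlencode(params)  # spaces become "+": "q=http+tutorial"

# GET: the encoded data rides in the URL, after the "?".
get_line = f"GET /search?{encoded} HTTP/1.1"

# POST: the same encoded data would go in the request body instead.
post_body = encoded.encode("ascii")

# The receiving side reverses the encoding.
query = urlsplit("/search?q=http+tutorial").query
decoded = parse_qs(query)  # values come back as lists: {"q": ["http tutorial"]}

print(get_line)
print(post_body)
print(decoded)
```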
The thing about status codes
The status code sits on the first line of every server response, right after the HTTP version. Some you’ll see constantly:
Code: Meaning
200 OK: Everything worked, here’s your file
301 Moved Permanently: It’s somewhere else now, check the Location header
400 Bad Request: You sent something the server couldn’t understand
404 Not Found: It’s not here
505 HTTP Version Not Supported: You’re using a version this server doesn’t speak
301 is sneaky. It’s what happens when you click a link and end up somewhere totally different, and your browser just follows the new URL without telling you. I used to think that was magic. Turns out it’s just a status code and a header.
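To convince myself there's no magic, here's a sketch of the follow-the-redirect loop in Python. Nothing here touches the network: FAKE_SITE and fake_fetch are stand-ins I made up for a real server, and the hop limit is my own guard against redirect loops.

```python
# Hypothetical site: path -> (status code, headers, body).
FAKE_SITE = {
    "/old-page": (301, {"Location": "/new-page"}, ""),
    "/new-page": (200, {}, "<html>new home</html>"),
}


def fake_fetch(path):
    """Pretend to request a path; unknown paths get a 404."""
    return FAKE_SITE.get(path, (404, {}, ""))


def follow_redirects(path, max_hops=5):
    """Keep requesting the Location target until a non-301 response arrives."""
    for _ in range(max_hops + 1):
        status, headers, body = fake_fetch(path)
        if status == 301 and "Location" in headers:
            path = headers["Location"]  # silently switch to the new URL
            continue
        return path, status, body
    raise RuntimeError("too many redirects")


final_path, status, body = follow_redirects("/old-page")
print(final_path, status)  # you asked for /old-page, you got /new-page
```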
My actual takeaway
The thing that surprised me most was how simple the base protocol is. Request → response, stateless, over TCP. Everything else (HTTPS, cookies, HTTP/2 multiplexing, WebSockets) is built on top of this.
If you've been writing backend code without really thinking about what's underneath, it's worth a few hours here. It makes a lot of other things click.
I’m learning this stuff in public because writing it down helps it stick. If you spotted something wrong, call me out. That’s how we both get better.
I share the process on X.




