Networking for Linux Programmers
(I must apologize, all of this is going to be terribly condensed and omit large parts of history to make it fit into a westernized telling of events that ignores developments which may have occurred earlier or more popularly in other countries but do not form a cohesive English narrative)
Morse Code and the Teleprinter
Current Loops
The Morse system, more or less, forms the basis for all modern digital signaling protocols. As designed and refined in the mid 1800s, it's a very manual process.
The next step up from the Morse telegraph, invented in 1887ish, was the teleprinter/teletype machine. These still came long before computer systems, and were built as electro-mechanical systems (like pinball machines).
The teletype, which still lives on in Unix terminology as the TTY, was the first commercial method for automatically delivering text across a wire.
Fran Lab TTY Short Connecting TTY to Linux
The teletype/teleprinter machine came as a pair of two devices, an encoding machine with a typewriter-style keyboard (the teletype) and a decoding machine with a typewriter-style ribbon and paper (the teleprinter)
In its original popular form, the teleprinter system used a binary code with fixed-width signaling blocks called the Baudot code. Unlike Morse, where each character can have any number of parts (and more common characters get shorter sequences) separated by periods of silence / no voltage, the Baudot code used by teletype machines was a fixed 5-component binary code. Two machines with synchronized clockwork would check the signaling line at an agreed period to see whether it had voltage on it or not. If the line was powered with voltage, it would be considered a "1" for the binary code. If it was not powered, it would be considered a "0" for the binary code.
Imagine a teletype/teleprinter pair, where they've had their clockwork set up to synchronize to check the signaling line once per second. In reality, these systems were run to check multiple times per second, but once per second is an easy demonstration value. If I wanted to write "OK" in this system I could send the sequence of ones and zeros over a 25s period, one bit per second:
11011 - Figure Shift (to shift to the "figure" set of ribbon strikers)
00101 - BEL (to make an audible sound and alert the operator of my incoming message)
11111 - Letter Shift (to shift to the lettering set of ribbon strikers)
11000 - O
01111 - K
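If it helps to see the shifting logic spelled out, here's a minimal sketch in Python. The tiny code table is an assumption on my part: it uses the ITA2 assignments as I understand them (letter shift 11111, figure shift 11011, BEL on the figure side of 00101 as in the US variant), and it only knows the three symbols from the example. The names `encode`, `LETTERS` and `FIGURES` are made up for illustration.

```python
# A tiny subset of the 5-bit teleprinter code (ITA2 assignments, assumed).
LTRS = "11111"  # letter shift
FIGS = "11011"  # figure shift
LETTERS = {"O": "11000", "K": "01111"}
FIGURES = {"BEL": "00101"}  # BEL position follows the US variant (assumption)


def encode(symbols):
    """Return the 5-bit codes for symbols, adding a shift code when the bank changes."""
    out = []
    bank = None  # unknown at the start, so the first symbol always forces a shift
    for sym in symbols:
        if sym in LETTERS:
            wanted, code = "letters", LETTERS[sym]
        elif sym in FIGURES:
            wanted, code = "figures", FIGURES[sym]
        else:
            raise ValueError(f"symbol not in this demo table: {sym!r}")
        if bank != wanted:
            out.append(FIGS if wanted == "figures" else LTRS)
            bank = wanted
        out.append(code)
    return out


codes = encode(["BEL", "O", "K"])
bits = "".join(codes)
print(codes)
print(len(bits), "bits, so", len(bits), "seconds at one bit per second")
```

Five codes of five bits each is where the 25 second figure comes from.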
Baudot code was used for ages, alongside other less standardized signaling codes, but much later, in the 1960s, there was a concerted effort to replace the 5-bit code with a standard 7-bit code. This new code, developed by the American Standards Association, was named the American Standard Code for Information Interchange (ASCII). Its first commercial use was in the Teletype Model 33 and 35 released in 1963 and promoted by the Bell company.
The ASCII system, rather than shifting between two sets of character banks, had a set of characters where each was uniquely identified by a single 7-bit code. They added a bunch of new control characters (meant to more precisely control the teleprinter) and symbols.
As a standard system for sending text to paper that used digital signaling, the teleprinter was of high interest to the engineers who developed early digital computers in the 1950s.
The teleprinters were used basically unmodified, but the teletype machines had their keyboards replaced with computer interfaces so that the timed pulses for each symbol could be sent automatically by computer program (e.g. read by punch card) rather than by a human operator.
Through this method, computers were able to produce textual output to paper logs. This was a standard interface for computer operation for decades really until the invention and cost reduction of newer interfaces.
Unix systems were designed around the teletype machine. The first text editor developed for Unix systems, ed, was designed to be used with one of these teletype/teleprinter systems.
The teletype keyboard's signals were hooked into a computer system as a readable input, and the computer's timed and synchronized outputs were wired to a teleprinter at the desk. Generally, both the typing keyboard and printer were put into a single physical unit.
The ed editor was, in many senses, more of a computer protocol than a usable computer application. On the surface, ed is for document editing, but in reality ed defines an early computer signaling protocol. More friendly to computers than people, the ed software defined a series of terse, mostly single-character commands that could be used to encode modifications to a text document. Here's a sample from Wikipedia:
    a
    ed is the standard Unix text editor.
    This is line number two.
    .
    2i

    .
    ,l
    ed is the standard Unix text editor.$
    $
    This is line number two.$
    w text.txt
    63
    3s/two/three/
    ,l
    ed is the standard Unix text editor.$
    $
    This is line number three.$
    w text.txt
    65
    q
Where "a" marks the start of appending text into a blank document, "." on a line by itself ends the input, "2i" means to insert new content at line index 2, ",l" means to list out the contents of all lines of text in memory, "w text.txt" means to write the text from memory into a file, etc.
Notably, ed brought regular expressions (inherited from Ken Thompson's earlier QED editor) into Unix as a way to edit existing lines of text without having to rewrite them entirely, which would have been both slow and a waste of typewriter ribbon.
In the session above, 3s/two/three/ is the command to replace the text "two" with "three" on line 3 of the document in memory.
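To make the substitute command a bit more concrete, here's a rough Python equivalent of what 3s/two/three/ does to the text in memory. This is only a sketch: `substitute` is a made-up helper, and Python's `re` syntax is not identical to the regular expression dialect that ed uses.

```python
import re


def substitute(lines, line_no, pattern, replacement):
    """Mimic ed's `Ns/pattern/replacement/` on a list of lines (1-indexed).

    Like ed without the trailing `g` flag, only the first match on the
    line is replaced.
    """
    new_lines = list(lines)
    new_lines[line_no - 1] = re.sub(pattern, replacement, new_lines[line_no - 1], count=1)
    return new_lines


buffer = [
    "ed is the standard Unix text editor.",
    "",
    "This is line number two.",
]
edited = substitute(buffer, 3, r"two", "three")
print(edited[2])
```

Only the command itself has to travel down the wire, not the whole rewritten line, which is the ribbon-saving part.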
When you have one computer and a teletype machine with teleprinter, you can interface with the computer yourself. When you have two computers set up to use teletype machine interfaces, you can literally plug the output from one into the input of another. This allows one computer to, for example, operate ed on another computer.
Once you've plugged two computers into each other in this manner, you are still using a "TTY" interface, but there is no actual teletype machine in the connection. It's just computers talking to each other.
These kinds of interfaces are, more generically, called serial connections. They're named as such because they send data between computers "serially" in the sense that one bit of information is passed per unit of time, in some serial order to produce meaning.
Serial
Serial connections, at their most basic, only require exactly what a telegraph requires: just a loop of wire between the two computers. Maybe a second loop if you want bi-directional communication. Modern serial interfaces like RS-232 usually include a few extra wires that help to signal when computers want to talk to each other instead of blindly emitting data at each other, but those are by no means necessary. Many computer systems still use what's effectively a fancier version of the teleprinter's timed signaling, called UART (Universal Asynchronous Receiver-Transmitter).
For UART, two computer systems have to know these things in advance:
- What voltage level will be used to signal on/off. If you get this wrong, you fry the physical circuitry. With a telegraph, the worst you could do here is burn out a lightbulb or a buzzer, but with fancier computer systems you burn out your vacuum tubes or transistors.
- What the synchronization rate will be. Today, we call this the baud rate, named after Émile Baudot, the inventor of the Baudot code
- Data bits size (the original Baudot code used 5-bit data)
- Stop bits size (like the Morse code rest period between letter signals)
- Some kind of flow control agreement (to prevent a computer from sending data faster than the receiving computer can process it)
Here is an example UART serial message with 8 data bits:
The "pulse" line is synchronized between the two computers, either by them literally sharing an extra wire with a single clock on it, or by the receiving computer resetting its own internal clock when a serial message starts. The important part is that both sides need to have the same pulse frequency (typically, 9600 baud aka 9600 bits per second) so that they will send and sample at the right times to be understood.
Here, there are two "stop bits". During stop bits, the line is kept electrified "high". So while nothing is sending, the line is high all the time. The receiving computer knows it will begin receiving a message when it sees a "start" bit where the line goes "low" and no longer has voltage on it.
So it goes low for one 9600th of a second to signal the message will begin. Then, because the data bits size here is 8, it will for the next eight 9600ths of a second send the data by either applying or removing voltage from the line in timing with the pulse.
Then it will set the line back high with voltage for at least two 9600ths of a second because the stop bits size is 2 in this example.
To be clear, if the baud rate is low enough you could do this manually with a telegraph key. "High" means the key is pressed.
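Here's a small sketch of that framing in Python, in case seeing the line levels as a list helps. The function names are made up, and it assumes the data bits go out least significant bit first, which is the usual UART convention but isn't something the description above spells out.

```python
def uart_frame(byte, data_bits=8, stop_bits=2):
    """Return the line level (1 = high, 0 = low) for each bit period of one frame."""
    if not 0 <= byte < (1 << data_bits):
        raise ValueError("value does not fit in the configured data bits")
    start = [0]                                          # line drops low to announce the frame
    data = [(byte >> i) & 1 for i in range(data_bits)]   # least significant bit first (assumed)
    stop = [1] * stop_bits                               # line returns high for the stop bits
    return start + data + stop


def uart_decode(levels, data_bits=8):
    """Recover the byte from a frame produced by uart_frame."""
    if levels[0] != 0:
        raise ValueError("frame must begin with a low start bit")
    data = levels[1:1 + data_bits]
    return sum(bit << i for i, bit in enumerate(data))


frame = uart_frame(ord("K"))
print(frame)
print(chr(uart_decode(frame)))

# One start bit + eight data bits + two stop bits = 11 bit periods per byte.
BAUD = 9600
print(f"{len(frame) / BAUD * 1000:.3f} ms per byte, about {BAUD // len(frame)} bytes per second")
```

Notice that the framing overhead means 9600 baud moves well under 9600/8 bytes per second.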
So, by this UART method or some similar serial interface, two computers can send little data payloads to each other. I want to stress here, ASCII is a 7-bit standard and it was very common for these serial interfaces to be configured to have 7-bit payloads that match up 1:1 with ASCII encoding. Eight bits to the byte wasn't a given yet. IBM's System/360 popularized the 8-bit byte in the 1960s, and it only really became the default everywhere once 8-bit microprocessors like Intel's 8080 took off in the mid 1970s.
It wasn't always ASCII though. IBM had their own text encodings called BCDIC which used 6 bits per character, and then later EBCDIC which used 8 bits per character. These were fairly popular at the time too.
Video Terminals
As computer hardware developed and we were able to have things like vacuum tube displays (CRTs and the like), a device was invented called the video terminal.
The video terminal is a device like the teletype, but instead of printing to a literal sheet of paper it takes the characters sent to it and "print"s them onto a screen.
Serial video terminals like DEC's VT100 were incredibly popular.
It was the introduction of these video terminals, especially serial video terminals, that started to move computers away from paper. It saved a lot of paper and typewriter ribbon.
Computers were gigantic, and massively expensive. Mainframe computing was the popular choice because it literally was not possible to have a computer at every person's desk. Instead, one mainframe computer would have an entire room dedicated to it and it would have several of these serial terminals sprinkled around the office. They were the original Chromebooks/"thin clients" -- can't do anything on their own, but hook them into the mainframe and you had everything at your fingertips.
Just have to run the serial cables around the place.
Personal Computers and Cassette Tapes
AAAaaaanyways. It started that we had to use these "thin client" computer terminals because it was not possible to put a computer at every desk. Or any desk really, given how massive they were. But as computers became smaller, lighter, and less expensive to the point they could fit near a desk, we had the personal computing revolution.
The earliest personal computers had no networking of any kind really. The only input and output devices were the keyboard and the display.
Better personal computers supported a serial interface to cassette tape drives.
Cassette tapes, and all the magnetic tape used before them in more industrial settings, can encode electrical signals as the orientation of magnetic particles on the surface of a plastic tape. Mostly this is used with a microphone and speaker to encode analog audio data, but computers can deposit digital electrical data directly with no audio step in the middle.
Like in the UART example, the "high" spots would be put on the tape by strongly magnetizing the particles, and the "low" spots would be put on the tape by demagnetizing the particles or negatively magnetizing the particles.
After the electrical signal has been put onto the tape as magnetic fields, that very tape can be physically moved past a coil of wire to generate a current in that wire and reproduce the original signal. This process of encoding and later decoding the magnetic fields does result in a much lower output voltage, so an amplifier is generally used to bring the signal back up to the original voltage level so that (for cassette audio anyway) it will be audible over a speaker set.
Did you ever see the 1983 movie WarGames?
It's a good movie which would contextualize a lot of what I'm talking about, but for the purposes of explaining magnetic tape and computer systems I'll just describe the scene I'm thinking of.
A 1980s cassette deck is, in principle, a device that records electrical signals (e.g. from a microphone) and then reproduces those electrical signals (e.g. into a speaker set). While the original telephones wired microphones directly to speakers, the invention of tape recording and its predecessors like wire recording (same idea, but used thin wire instead of the iron-on-plastic-tape that cassettes are known for) allowed the microphone signal to be captured and used again later. But the technology does not have to record audio sources, it can record any changing electrical signal and store it long-term as a magnetic field.
In the film, high schooler David Lightman finds himself having accidentally hacked a United States nuclear warhead control facility after connecting to it under the false belief that the system belonged to a computer games company. He was trying to get access to unreleased video games
At one point in the film, David finds himself locked in a secure room at the facility which has an electronic door access system. Using his pocket-sized portable cassette deck, he taps into the wiring of the electronic access system and uses the cassette to record the electrical signal that his captors send to unlock the doors.
Later, he reverses the wires so that instead of recording, the cassette deck allows him to play back the recording.
Because these systems were (and many systems actually still are) rather primitive, simply re-playing the same electrical signaling commands to the door system made it open right up. This kind of thing, called a "replay attack", allowed him to escape. These days, replay attacks are still used to do things like break into cars by replaying the radio signals sent by car key fobs.
Anyways, the point I'm trying to illustrate is that electrical signaling and audio systems are fundamentally compatible.
At least for cars, all cars made in the last like 30 years have mitigations against the attack. But some of them don't work properly, so people can use tools like the Flipper Zero (or any radio circuit really) to still do the attacks so long as they know how the car manufacturer did it wrong. That it's possible even in principle though is what led Canada to propose banning the Flipper Zero in 2024.
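Here's a toy sketch of why replaying works against a fixed signal, and of one way a mitigation can work. To be clear about what's assumed: real key fobs use their own proprietary schemes that I'm not reproducing here. This just shows the general idea of a counter plus a keyed hash so that a recorded message goes stale, and every class and function name in it is hypothetical.

```python
import hashlib
import hmac


class FixedCodeLock:
    """Opens for one fixed message, so a recording of it works forever."""

    def __init__(self, code):
        self.code = code

    def try_unlock(self, message):
        return hmac.compare_digest(message, self.code)


def fob_press(key, counter):
    """Hypothetical fob: sends its counter plus a keyed hash of that counter."""
    tag = hmac.new(key, counter.to_bytes(8, "big"), hashlib.sha256).digest()
    return counter, tag


class RollingCodeLock:
    """Accepts a message only if its counter is newer than the last one seen."""

    def __init__(self, key):
        self.key = key
        self.last_counter = 0

    def try_unlock(self, message):
        counter, tag = message
        expected = hmac.new(self.key, counter.to_bytes(8, "big"), hashlib.sha256).digest()
        if counter > self.last_counter and hmac.compare_digest(tag, expected):
            self.last_counter = counter
            return True
        return False


# The attacker records one legitimate unlock, then plays it back later.
fixed = FixedCodeLock(b"open the door")
recorded = b"open the door"
fixed_first = fixed.try_unlock(recorded)
fixed_replay = fixed.try_unlock(recorded)

key = b"secret shared by fob and car"
rolling = RollingCodeLock(key)
recorded = fob_press(key, 1)
rolling_first = rolling.try_unlock(recorded)
rolling_replay = rolling.try_unlock(recorded)

print("fixed code:  ", fixed_first, fixed_replay)
print("rolling code:", rolling_first, rolling_replay)
```

The fixed lock accepts the recording both times, while the rolling one accepts it once and then rejects the replay.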
Anyways, by the time that personal computing was widely available, we already had tons of ways to distribute sound. Television, radio, telephone, cassette tape, vinyl. All of these have been used to store and transmit computer data (even vinyl records).
There used to be a promotion where you could get free software for your personal computer through what was essentially a radio advertisement that ended with the sounds needed to reproduce the programming once it had been recorded to a cassette and then played back. Real nerdy stuff.
Telephone Modems
It was in the 1970s and later the 80s that a critical piece of technology (for computer networking) was popularized. That technology being the telephone modem. A modem, which is short for "Modulator Demodulator", has the purpose of taking an electrical signal and turning it into an audio signal that can survive transmission across whatever systems it might be sent over. The telephone modem was invented to shift the naive electrical signals around until they produced signals in the same frequency ranges as human voice. This is because telephone systems were optimized for those frequencies. With the use of a telephone modem, you can take your fancy personal computer's cassette deck connector and instead of plugging it into a cassette deck, you can send those signals all the way over a phone connection.
The earliest telephone modems were extremely simple. They had their own microphone and speaker in a "cradle", and you would set the handset from your old timey desktop or wall mounted telephone into that cradle. The modem would then make its audible sounds to the phone like it was actually talking (but mostly it just made really awful sounding screeching noises).
The scheme worked like this:
- You dial your friend's phone number
- You tell your friend "I want to send you some data" e.g. a text file with some ASCII art nudes
- You and your friend run a terminal program that speaks a file transfer protocol like ZMODEM and configure all the serial parameters (baud rate, data bits, etc)
- You and your friend both place your phones on your own telephone modems.
- You let the computers send the data serially, one byte at a time. For "large" files this could take hours.
- You hope the phone call doesn't get disconnected
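To put a number on "hours", here's a quick back-of-the-envelope calculation. It assumes one start bit, eight data bits and one stop bit per byte, one bit per baud, and no protocol overhead or retransmissions, so real transfers would have been somewhat slower. The function name is made up.

```python
def transfer_seconds(size_bytes, baud, data_bits=8, start_bits=1, stop_bits=1):
    """Time to push size_bytes down a serial link, counting the framing bits."""
    bits_per_byte = start_bits + data_bits + stop_bits
    return size_bytes * bits_per_byte / baud


for baud in (300, 1200, 9600):
    hours = transfer_seconds(1_000_000, baud) / 3600
    print(f"{baud:>5} baud: {hours:.1f} hours for one megabyte")
```

At 300 baud a single megabyte works out to a bit over nine hours of uninterrupted phone call.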
This setup doesn't use literally the same structure as UART but it follows the same principles. Your computer wiggles the voltage "high" and "low", the modem makes sure that signal survives the phone line, then the other computer watches it go "high" and "low" so it can decode the binary data
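Here's a rough sketch of how a modem can make a high/low signal "survive" the phone line: play one audible tone for a 1 and a different tone for a 0, then have the other side listen for which tone is louder during each bit period. The two tone frequencies are the ones I recall from the Bell 103 standard for the calling side, and the sample rate is picked purely to make the arithmetic tidy, so treat the constants and the function names as assumptions.

```python
import math

SAMPLE_RATE = 9600                       # audio samples per second (assumed for the demo)
BAUD = 300                               # bits per second
SPACE_HZ, MARK_HZ = 1070, 1270           # tone for a 0 and tone for a 1 (Bell 103 style)
SAMPLES_PER_BIT = SAMPLE_RATE // BAUD    # 32 samples for each bit period


def modulate(bits):
    """Turn bits into audio samples, switching tone per bit without phase jumps."""
    samples = []
    phase = 0.0
    for bit in bits:
        freq = MARK_HZ if bit else SPACE_HZ
        for _ in range(SAMPLES_PER_BIT):
            samples.append(math.sin(phase))
            phase += 2 * math.pi * freq / SAMPLE_RATE
    return samples


def tone_energy(chunk, freq):
    """How strongly chunk matches a tone at freq, regardless of its phase."""
    step = 2 * math.pi * freq / SAMPLE_RATE
    in_phase = sum(s * math.cos(step * n) for n, s in enumerate(chunk))
    quadrature = sum(s * math.sin(step * n) for n, s in enumerate(chunk))
    return in_phase ** 2 + quadrature ** 2


def demodulate(samples):
    """Decide each bit by comparing the energy at the two tone frequencies."""
    bits = []
    for start in range(0, len(samples), SAMPLES_PER_BIT):
        chunk = samples[start:start + SAMPLES_PER_BIT]
        bits.append(1 if tone_energy(chunk, MARK_HZ) > tone_energy(chunk, SPACE_HZ) else 0)
    return bits


sent = [0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1]
received = demodulate(modulate(sent))
print(received == sent)
```

This is a noise-free toy. A real modem also has to find where each bit period starts and cope with a noisy line.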
The usage of telephone modems wasn't limited exclusively to transmitting files. It was also used (and in fact this use case came much earlier) for transmitting live TTY data. You could take your fancy video terminal like the DEC VT100, make a phone call to a computer across town, set the phone into your modem's cradle, and allow the video terminal direct access to that computer. This was often done to access some type of shared system, like a mainframe or a Bulletin Board System (BBS). A Bulletin Board System was, more or less, a computer whose basic purpose was to allow people to sign in and post on the very first message boards. You could send a message to one of these boards, then someone else who called the BBS later could log in and read the message from that board.
OSI Model
Have you ever heard of the OSI model?
Right now, everything we've talked about is at layer 1: Physical
The phone lines carry bits that go high and low.
The serial line carries bits that go high and low.
It's how you arrange those bits that gets you to layer 2: Data Link
OSI Model Layer 2: Data Link
So far, we've been using serial to send raw characters back and forth over a wire. That's data, but it's not really much of a link.
Early computer networks had wild link designs. One of the most popular alternatives to Ethernet was called token ring.
A serial line makes it easy for two computers to talk to each other, but if you add a third computer it gets wildly complicated.
Token ring was one solution for how to control three or more computers that all had to talk to each other. The general premise was that each computer in the network had two lines that would carry serial data. One of those lines would go to the previous computer in the loop, and the other line would go into the next computer in the loop.
I'm not sure, does your culture have the concept of a "talking stick" used to moderate conversation in a group?
https://en.wikipedia.org/wiki/Talking_stick this thing
A talking stick, also called a speaker's staff, is an instrument of Indigenous democracy used by a number of Indigenous communities, especially those in the Pacific Northwest nations of North America. The talking stick may be passed around a group, as multiple people speak in turn, or used only by leaders as a symbol of their authority and right...
Token ring network architecture used a virtual talking stick, called the token, which was a signal that was passed around the ring. Whoever had the token was allowed to send a message, which would get passed along computer to computer until it reached the destination computer.
Once it was done talking, it would signal to give the token to the next computer in the loop.
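Here's a little simulation of the talking stick idea, in case it helps. It's heavily simplified and the details are my own assumptions: each station may send one message per turn with the token, and "hops" just counts how many links the message crosses going one way around the loop. `run_ring` is a made-up name.

```python
from collections import deque


def run_ring(stations, outbox, max_laps=10):
    """Pass the token around the ring and let each holder send one message.

    stations: station names in ring order.
    outbox: maps a sender name to a deque of (destination, text) messages.
    Returns the deliveries as (sender, destination, text, hops) in order.
    """
    delivered = []
    holder = 0  # index of the station that currently holds the token
    for _ in range(max_laps * len(stations)):
        if not any(outbox.values()):
            break  # nobody has anything left to say
        name = stations[holder]
        queue = outbox.get(name)
        if queue:
            dest, text = queue.popleft()
            # The message travels onward around the loop until it reaches dest.
            hops = (stations.index(dest) - holder) % len(stations)
            delivered.append((name, dest, text, hops))
        holder = (holder + 1) % len(stations)  # hand the token to the next station
    return delivered


stations = ["A", "B", "C", "D"]
outbox = {
    "A": deque([("B", "hi"), ("D", "yo")]),
    "C": deque([("A", "hello")]),
}
log = run_ring(stations, outbox)
for entry in log:
    print(entry)
```

Station A has two things to say but has to wait for the token to come all the way back around before sending the second one.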
This architecture was popular since it was easy to conceptualize, but it was hated because if any one segment of the loop was broken the entire network stopped working.
The joke has been made from time to time about people "dropping the token". It's funny because it's something that really happened and made everyone upset.
Ethernet
Because token ring was so awful to work with in the practical sense, the data link layer that won out was a different design, termed Ethernet. Literally, a network design where you shout your data into the aether. I'm not sure if that's an idiom used where you're from, but here it means you're just shouting and don't know whether anyone is listening or not.
Instead of connecting computers into a ring, Ethernet worked by connecting all the computers to a single really long cable. (Separate cables for transmission and reception didn't come until a little bit later in the story.)
Ethernet works like this:
Originally, Ethernet was a standard called 10BASE5, lovingly called Thicknet because the cable was thick. The 10 stands for 10Mbit/s bandwidth, the 5 stands for 500 meters in length. An Ethernet network was a 500 meter long thick coaxial cable (about 1 cm thick). Each computer wires into the single cable using a "vampire tap" transceiver, called that because it literally bites into the side of the cable to pierce the insulation and reach the center conductor.
Everyone both transmits and receives messages on the single cable, almost as if it was a serial device but with one major twist. The twist is that Ethernet introduced data framing and retransmission.
Data framing allows for multiple bytes of data to be transmitted at once, and those bytes are tagged with a source address and destination address. These addresses, Media Access Control addresses (MAC addresses) are uniquely stamped into the firmware of the ethernet cards on computers in the network. Two computers on a network must never share a MAC address. For the most part, no two computers manufactured by licensed companies can have the same MAC, but unlicensed hardware tends to reuse addresses or make them up entirely leading to conflicts. This doesn't happen often, but it has been known to occur when ordering equipment in bulk from unscrupulous manufacturers.
Retransmission is what lets everyone talk on the same wire -- Ethernet cards can detect if another computer on the line tried to talk while it was in the middle of talking. When that happens, both computers that were trying to talk take a time-out for a random length of time, then try again to send their message.
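Here's a sketch of that collide, wait, retry loop. Time is chopped into slots and every frame is assumed to take exactly one slot, which is my simplification. The waiting rule doubles the range of possible waits after each collision, which is how I understand real Ethernet does it (binary exponential backoff), but treat the exact numbers and the `simulate` name as assumptions.

```python
import random


def simulate(stations, rng, max_slots=1000):
    """Every station wants to send one frame starting at slot 0.

    Returns (slot, station) pairs in the order the frames got through.
    """
    collisions = {s: 0 for s in stations}
    ready_at = {s: 0 for s in stations}  # slot in which each station will next try
    sent = []
    for slot in range(max_slots):
        talkers = [s for s in ready_at if ready_at[s] == slot]
        if len(talkers) == 1:
            # Only one station talked, so its frame went through.
            sent.append((slot, talkers[0]))
            del ready_at[talkers[0]]
        elif len(talkers) > 1:
            # Collision: everyone involved backs off for a random number of slots.
            for s in talkers:
                collisions[s] += 1
                window = 2 ** min(collisions[s], 10)
                ready_at[s] = slot + 1 + rng.randrange(window)
        if not ready_at:
            break
    return sent


rng = random.Random(1)  # seeded so the run is repeatable
result = simulate(["A", "B", "C"], rng)
for slot, station in result:
    print(f"slot {slot}: {station} sent its frame")
```

The randomness is the whole trick: if everyone waited the same amount of time, they would just collide again.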
Here's a simple Ethernet frame format. When your computer tries to talk Ethernet, it follows this format. The first 62 bits synchronize the systems for the upcoming message. Then come two bits to mark the start of the message. 48 bits encode which MAC you're sending to. Another 48 encode which MAC you're sending from (so the destination device can reply). Then 16 bits encode the length, then comes a bunch of data, then finally a "check sequence" which contains a checksum (a 32-bit CRC) of the frame's contents to ensure that it wasn't corrupted in transmission.
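Here's a sketch of packing and checking a frame in that layout using Python's `struct` and `zlib` modules. Some things are assumptions on my part: the MAC addresses are invented, the check sequence is computed with `zlib.crc32` and stored low byte first (which is my understanding of how it's normally done), and real Ethernet also pads short payloads to a minimum size, which this skips.

```python
import struct
import zlib

# 0x55 and 0xD5 are sent lowest bit first on the wire, which gives 62 alternating
# bits followed by the two "start" bits.
PREAMBLE = bytes([0x55] * 7)
START = bytes([0xD5])


def mac_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' into its 6 raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def build_frame(dst, src, payload):
    header = mac_bytes(dst) + mac_bytes(src) + struct.pack("!H", len(payload))
    check = struct.pack("<I", zlib.crc32(header + payload))
    return PREAMBLE + START + header + payload + check


def parse_frame(frame):
    body = frame[len(PREAMBLE) + len(START):]
    header, payload, check = body[:14], body[14:-4], body[-4:]
    return {
        "dst": header[0:6].hex(":"),
        "src": header[6:12].hex(":"),
        "length": struct.unpack("!H", header[12:14])[0],
        "payload": payload,
        "intact": struct.pack("<I", zlib.crc32(header + payload)) == check,
    }


frame = build_frame("aa:bb:cc:dd:ee:ff", "11:22:33:44:55:66", b"hello")
parsed = parse_frame(frame)
print(parsed)

# Flip one bit in the payload to imitate corruption on the wire.
damaged = frame[:22] + bytes([frame[22] ^ 0x01]) + frame[23:]
print(parse_frame(damaged)["intact"])
```

The receiver recomputes the checksum over what it got and throws the frame away if it doesn't match.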
This is one of those ethernet vampire taps, which bites into the cable (here, a yellow jacketed cable is seen) and adapts to a common serial port.
It reminds me of how old ranchers used to hook telephones up to their barbed wire fences.
Since barbed wire is wire, it carries a phone connection just as well as a regular phone line. And since the fence goes all the way around the property, it connects you to your neighbors.
Anyone who wanted to be part of the barbed wire phone network just had to attach their fence to their neighbor's.
Granted, that formed a party line not a switched phone line. Anyone who picked up their handset was just in a group call with everyone else on the line.
That's the comparison to Ethernet though: with Ethernet, everyone was talking on the same line. Originally anyways
Ethernet is a lot like Serial. That 10BASE5 connection used 1 volt to signal high, 0V to signal low. All 10BASE5 devices communicated at a data synchronization rate of 10Mbit/s.
Ethernet, the data link OSI layer, went through several OSI physical layers before it settled on the cables we use today. After Thicknet came Thinnet, which used a thinner coaxial cable.
10BASE2: 10 because it's still 10Mbit/s, 2 because the thinner wire reduced the maximum length to about 200 meters. They called it Thinnet, or Cheapernet because it was cheaper than Thicknet.
After Thinnet came 10BASE-T, which is the precursor to what we use now. 10BASE-T is called Ethernet over Twisted Pair. It introduced two revolutionary shifts which were vital to increasing both the bandwidth of Ethernet, and the security of Ethernet.
The engineers who cooked up this stuff are smart as hell. Ethernet over Twisted Pair typically uses the 8p8c connector design (8p8c standing for 8 position 8 contact). With 8 positions/contacts, that makes 4 pairs. The pairs are twisted not only to avoid external interference, but are also themselves twisted at slightly different rates to avoid it interfering with itself. It's great
At the 10Mbit/s and 100Mbit/s data rates, Ethernet only actually utilizes two pairs (one for transmission, one for receiving). The other two pairs were originally meant to enable a single cable to carry both the network connection and phone connection for a desk at an office, but it was those extra pairs that eventually unlocked gigabit+ speeds.
That's the second revolutionary shift: With 10BASE-T, Ethernet now has separate transmit and receive lines instead of one line pulling double duty.
Unlike 10BASE5 and 10BASE2 where everyone just clamped onto a single shared line, 10BASE-T requires a "hub" which takes your transmit line, and connects it to everyone else's receive line. Similarly, it takes everyone else's transmit line, and connects it to your receive line.
By this method, they turned Ethernet from a big jumble of everyone on one wire, into a so-called "star topology" where everyone has to connect to a central hub in order to talk to each other.
Switching
The final bit of innovation that took us from what is fundamentally fancy Serial into modern networking was called switching.
How much do you know about telephone exchanges?
Have you ever heard the story of the guy who built the first automatic telephone exchange?
And the invention of the rotary phone?
Alright. So, back when all phones were landlines and they ran on what we called the "plain old telephone service" (POTS), telephone lines were in essence just a big loop of wire that the telephone provider company kept a 48 V carrying voltage on.
When you'd pick up your phone, the telephone provider had a physical board with a bunch of lights and sockets on it. Picking up the phone completes the loop of wire, making a light glow on the switchboard so the operator (a person trained in switch-board operation) could see the light, plug their own handset into the socket, and then talk to you.
They'd ask you who you wanted to talk to, and then use a phone book to look up their name and find the socket that goes to the loop of wire for their phone set
Then the operator would use an audio patch cord to jump the connection from your socket to their socket.
Now, I imagine that most people don't understand why an Ethernet switch is called that.
It's because of those telephone switch boards!
A light switch completes a circuit to a lightbulb. A telephone switch board completes a circuit between two telephone sets.
Now, the story goes a bit like this:
Funeral director Almon Strowger was having trouble with his phone service. It was his belief that this was interference by the competing funerary company in town. The owner of the rival company had a wife, and that wife worked at the telephone company as a switchboard operator. The accusation from Strowger was that the rival company's owner's wife was switching phone connections to her husband's company whenever the customers were asking to talk to Strowger's company. Now, Bell Telephone Company denies this was ever the case but that's neither here nor there.
In response to this perceived slight, Strowger swore he'd put his rival's wife out of a job by inventing a device to take over her job. That way, nobody could interfere with connections. After years of development, he patented a device that could automatically connect two phones by a series of buttons which sent signals down the phone line. He gave each phone a number 0000-9999. The original design had one button per digit column, so to call number 3251, you would have to press the first button three times, the second button twice, the third button five times, and the last button once. Then the automatic switchboard would connect your call to whatever socket was assigned that number.
This was long before computers, operating by some kind of electro-mechanical device that I don't really understand. But the basic operation was set. By sending a series of signals down a telephone line, people could now have their phone connections switched up to the right destinations automatically.
Not too long after the invention and implementation of the automatic switchboard, the rotary phone was invented to simplify the dialing process for any length of phone number, not just the original four digits.
The operation of the rotary phone is actually pretty similar to his original button design. When the user rotates the dial to a number, it causes the phone to pulse a signal down the line however many times for whatever number that you rotated it to. For example, by rotating the dial to the number 5, the rotary phone would send five pulses down the line as the dial automatically returned to its starting position by spring
So to dial that same 3251 number, it would pulse three times followed by a short pause, then you would have it pulse twice, then a short pause, then it would pulse five times, then a short pause, and then one final pulse. Then the phones would be automatically connected.
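The pulse scheme is simple enough to write down in a couple of lines. One detail that isn't covered above and that I'm adding as an assumption: on most systems the digit 0 was sent as ten pulses, since zero pulses would be indistinguishable from not dialing at all. The function names are made up.

```python
def dial_pulses(number):
    """Return how many pulses a rotary dial sends for each digit of number."""
    return [10 if digit == "0" else int(digit) for digit in number]


def pulse_train(number, pulse="_", pause=" "):
    """Draw the pulses as text, with a pause between digits."""
    return pause.join(pulse * count for count in dial_pulses(number))


print(dial_pulses("3251"))
print(pulse_train("3251"))
```

The exchange just counts pulses between pauses to work out each digit.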
The point is, ethernet networks borrowed the idea of a switch from telephone networks. Computer systems were already taking advantage of automatic telephone switchboards by way of serial telephone modems with an auto dialer feature. By using a telephone modem with an auto dialer, a computer could already connect itself to another computer as long as that other computer had a dedicated phone number and phone line assigned to it.
The analogy fits almost one to one. Where in the telephone switchboard, two devices are connected by phone number, in an ethernet switch, two devices are connected by MAC address.
Where an Ethernet hub is like a telephone party line, an ethernet switch is like a telephone switch board.
The key difference here is that while a telephone switchboard switches the entire connection, an Ethernet switch only switches one frame of data at a time.
So that would be the serial payload that has the synchronization preamble, the start bits, the destination MAC, the source MAC, the length, the data bytes, and the hash.
By switching just these messages, the ethernet switch prevents messages meant for one device from ever reaching another device.
This does improve the security of the network marginally, but importantly it also increases the performance of the system significantly because computers no longer have to filter out messages that were not addressed to them.
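Here's a sketch of how a switch can decide where a frame goes. The learning behaviour is an assumption based on how switches commonly work: the switch notes which port each source MAC showed up on, and if it hasn't seen the destination yet it falls back to sending the frame out of every other port, exactly like a hub would. The class and the short stand-in MAC strings are hypothetical.

```python
class Switch:
    """A toy learning switch that remembers which port each MAC lives on."""

    def __init__(self, port_count):
        self.ports = range(port_count)
        self.mac_table = {}  # MAC address -> port number

    def receive(self, in_port, src, dst):
        """Handle one frame and return the list of ports it is sent out of."""
        self.mac_table[src] = in_port  # learn where the sender is
        if dst in self.mac_table:
            out_port = self.mac_table[dst]
            # No need to send the frame back out of the port it arrived on.
            return [] if out_port == in_port else [out_port]
        # Unknown destination: behave like a hub and send it everywhere else.
        return [p for p in self.ports if p != in_port]


switch = Switch(4)
first = switch.receive(0, src="aa", dst="bb")   # "bb" not seen yet
reply = switch.receive(1, src="bb", dst="aa")   # "aa" was learned on port 0
again = switch.receive(0, src="aa", dst="bb")   # "bb" was learned on port 1
print(first, reply, again)
```

After the first exchange the two machines only ever see each other's frames, and the devices on the other ports see nothing.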
At the original ethernet data rate, even a slow switch didn't have a huge impact on performance because it was artificially limited by everyone's clocks having to stay in that slow synchronized speed
These days, ethernet switches are designed to switch in hardware using ASICs, so that there is no actual degradation to system performance. It operates transparently almost as if the switch wasn't there at all.
So, with the introduction of Ethernet switches we finally have something starting to resemble a modern computer network. Within one building, Ethernet can connect all of the computers efficiently and allow them to send whatever data they want arbitrarily. Across buildings, computers can utilize phone lines and modems to talk to each other. But there's still no solution for cross-building communication for entire Ethernet networks. At least there wasn't, until the introduction of Internet Protocol (IP).
Internet Protocol
Although it was possible to go from one computer network to another through a technique called bridging, the US government agency behind the Internet wanted something more robust that addressed the shortcomings of Ethernet alone.
Bridging is a technique where some computer ("the bridge") takes bits it receives on one interface and repeats them on another interface.
A Serial to ethernet bridge at your local University could allow you to take your personal computer, connect to the bridge by a phone connection, and then send Ethernet frames over the serial connection.
The bridge could take those frames from the dumb serial connection and repeat them onto the smarter Ethernet network, where they'll make it to their actual destination.
Then when the destination device replies back, the bridge will take those frames from the Ethernet network and repeat them back over the dumb Serial connection to you.
So like, it's possible to make arbitrarily large Ethernet networks by bridging. So why don't we? Why invent Internet Protocol?
Thicknet and Thinnet only supported a few dozen clients until the line would become too crowded and it became difficult for devices to find time to talk.
But switched Ethernet can support thousands of devices before any real issues occur (under normal operating conditions anyways; someone intentionally bogging down the network can still cause problems).
Switched Ethernet falls short in like three ways:
1. Switched Ethernet works by the Switch keeping track of which MAC addresses are available on which ports. To work effectively, every switch more or less has to keep a table of which MACs go where. One port can have several MACs associated with it when switches are plugged into each other.
The ARPA group tasked with developing the Internet imagined a network with Billions of devices, and it wasn't reasonable at the time for a switch to keep track of that many MACs.
2. Ethernet over Twisted Pair, whether Hub based or Switch based, is inherently star shaped. This means that there will always be a "center" node that, if removed, cuts the network into at least two halves that can't talk to each other any more.
To get around this issue, a network has to have redundant connections that can be used if the main connection goes down. In Ethernet, that is called a network loop, and network loops cause traffic to get stuck because Ethernet frames tend to just go around the loop forever instead of making it to the correct destination. Along with a few other nuances of Ethernet this tends to take down the entire network. So loops are bad.
Later standards, like Spanning Tree Protocol, have allowed smarter switches to cooperate and break loops automatically. However that's not enough on its own to make Ethernet work in place of Internet.
3. Ethernet doesn't tell you anything about where a device is, only who a device is.
MAC addresses are remarkably like people names. A MAC address has two or three parts. First is the vendor prefix, which is analogous to a family name. It is unique to the device manufacturer. Next, some manufacturers reserve a few bits of the MAC to designate the product line. This is sort of a middle name; not everyone has one. Then finally the rest of the MAC identifies the individual device. This would be like the given name.
With a MAC address, finding someone in the network is like checking census records. Census records are easy to check if you already know something about the target, like they are on a network with only 30 devices / come from a town of only 30 people. But if you're on a network with 5 million devices it suddenly becomes really unwieldy to manage the records.
Internet Protocol addresses are fundamentally different. They don't tell you who a device is, they tell you where a device is. They're much more like a home address than a name. Although they do correspond roughly to geographic locations, they are more about position in the network than physical position. Internet addresses are split rather arbitrarily, but as an example the first 8 bits could identify that the address goes to a network at University of Chicago. The next 8 bits could identify that it's in their Library network. The next 8 bits could identify that it's in the historic records department. And the final 8 bits could identify the specific record keeping machine. These categories are arbitrary, but the point is that at every step of the way the University's network has something to identify where to hand the message off to.
Phone networks had this problem as well. Originally, phone numbers only contained information about which phone it was, not where it was in the phone network. I could have phone 1234, for example. Later they added area codes, so that you could identify the number belongs to phone 1234 in area 456. Then later they added country codes. Who knows, we may get an even more extended phone number format that adds planetary codes one day. The point is, these area codes group together all the devices available in a single telephone exchange. So you know all the devices at area code 123 are reachable through Exchange 123 which is located in a single city, like Minneapolis or wherever
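Here's a small sketch of the "address as location" idea, splitting a 32-bit Internet address into four 8-bit parts using Python's standard ipaddress module. The labels are only the made-up categories from the University example; real networks split the bits wherever suits them.

```python
import ipaddress


def octets(address):
    """Split an IPv4 address into its four 8-bit parts, most significant first."""
    return list(ipaddress.IPv4Address(address).packed)


def describe(address, labels):
    """Pair each 8-bit part with a (purely illustrative) label."""
    return [
        f"{label}: {part} ({part:08b})"
        for label, part in zip(labels, octets(address))
    ]


# Hypothetical labels, following the University example.
LABELS = ["university", "library network", "records department", "machine"]

for line in describe("15.23.44.2", LABELS):
    print(line)
```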
Internet Routing
The United States Department of Defense was most concerned about the second point. They put together a task force under ARPA (Advanced Research Projects Agency) to develop a new kind of network that was highly redundant, because the US government was worried that as computer networks developed to be increasingly important to national security, foreign actors were going to make strategic attacks with traditional and nuclear weaponry that would cripple national communications.
Now, ARPA claims this was not the case, but it's such a compelling story and the government regularly lies about their intentions so who knows:
According to Charles Herzfeld, ARPA Director (1965-1967):
The ARPANET was not started to create a Command and Control System that would survive a nuclear attack, as many now claim. To build such a system was, clearly, a major military need, but it was not ARPA's mission to do this; in fact, we would have been severely criticized had we tried. Rather, the ARPANET came out of our frustration that there were only a limited number of large, powerful research computers in the country, and that many research investigators, who should have access to them, were geographically separated from them.
So out of all these factors, ARPANET was developed and later rebranded into Internet. The "Network of Networks".
The Internet works by creating a new message format for computers to talk to each other, that contained enough information for those messages to not just get sent across local Ethernet networks, but also make their way across networks without utilizing bridging.
Instead, Internet Protocol introduces the idea of routing. Unlike Ethernet where messages have one single defined path through a series of switches, and loops cause major network-crashing violations, Internet Protocol relies on these redundant connections and encourages multiple redundant paths.
Under Internet Protocol, dedicated computers called Routers keep a list of other routers, and a list of what IP address ranges those routers can deliver messages to, and how expensive it would be to utilize that link. Expense here is somewhat arbitrary, but for Internet Protocol specifically we tend to look at expense in terms of milliseconds until delivery (like ping latency).
Imagine three routers, all attached not in a star network, but in a triangular network.
Here, node A would keep a list of routes to C. It would know that it can reach C by link 2. But it would also know that it can reach C by link 3. It chooses link 2 by default because it knows that link 3 takes longer to reach C because it has to pass through B.
Each node here advertises a list of networks it is responsible for. So node A could advertise it is responsible for all addresses starting with the 7 bits 1010101 or 5 bits 11111. Node C could advertise it's responsible for addresses starting with the 8 bits 11101110.
So whenever node A would want to send a message to, for example, Internet Protocol address 11101110.11010101.01010110.10101110 it can check its list to see that node C is responsible for addresses that start the way this one does, and that both links 2 and 3 can reach C but link 2 is the least expensive to use.
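As a rough sketch of that lookup, here is node A's list written out in Python. The two cost values are invented latencies in milliseconds, and the tuple layout is just something I picked for the example. When several prefixes match, real routers prefer the longest (most specific) one before comparing cost, so the sketch does the same.

```python
import ipaddress

# Node A's view of the triangle. Costs are made-up latencies in milliseconds.
ROUTES = [
    # (advertised prefix, owner, outgoing link, cost)
    (ipaddress.ip_network("238.0.0.0/8"), "C", "link 2", 10),
    (ipaddress.ip_network("238.0.0.0/8"), "C", "link 3", 25),  # detour through B
]


def from_binary(dotted_bits):
    """Turn '11101110.11010101.01010110.10101110' into an IPv4Address."""
    return ipaddress.IPv4Address(
        bytes(int(part, 2) for part in dotted_bits.split("."))
    )


def pick_route(destination, routes):
    """Return the most specific, then cheapest, matching route (or None)."""
    candidates = [route for route in routes if destination in route[0]]
    return min(
        candidates,
        key=lambda route: (-route[0].prefixlen, route[3]),
        default=None,
    )


dest = from_binary("11101110.11010101.01010110.10101110")
print(dest, pick_route(dest, ROUTES))
```

The prefix 238.0.0.0/8 is just the 8 bits 11101110 written the usual decimal way.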
The core set of routers on the Internet, that Internet Service Providers hook into, build these lists by using path finding algorithms that are a lot like the ones used in video games. Just like in a video game where too many path finding elements can slow things down, on the Internet only so many routers are actually a part of this core network. We can't hook all the routers in the world in directly or else we end up with scaling issues like with Ethernet. So we treat the few routers very specially. Each ISP typically has just a couple, called Border Gateways since they form a border around each ISP's network and serve as a gateway between networks.
The networks inside border gateways are called Autonomous Systems, because they operate independently with respect to whatever address prefixes they advertise responsibility for. Traffic destined for addresses within the AS never leaves the AS or gets routed to another AS. There are about a hundred thousand Autonomous Systems currently in operation as the Internet has grown over time, each one advertising the network address prefixes it's responsible for.
Among the Border Gateways, path finding algorithms form all these destination lists and route expense lists automatically so that as the network changes because of outages, the lists can update in real time. These path finding algorithms are implemented by something called Border Gateway Protocol (BGP). BGP is complicated in operation, so it's best to just accept that it works and look the other way.
Here we have a sample Internet that contains only four Internet Service Providers, and only two computers. One is your computer, one is my computer. You're on an Ethernet network with a modem somewhere on the Ethernet network. My PC is attached directly to a modem over Serial. Both of these were relatively common configurations to use Dial-Up Internet with an ISP like AOL (America Online)
Let's say you want to send me something across "The Internet" where everything is set up as pictured here. You would wiggle the voltages on the Ethernet cable coming out of your PC's network card to represent bits in roughly this pattern. The Ethernet frame format is legit, but I fudged the format of the Internet Protocol packet a little because real IP packets have a bunch of extra bits of data that aren't really needed to explain the concepts
// Ethernet synchronization preamble
10101010 10101010 10101010 10101010 10101010 10101010 10101010
101010 11 // Ethernet message begin
// Destination MAC (ISP Gateway's MAC, 11:22:33:44:55:66)
00010001 00100010 00110011 01000100 01010101 01100110
//   1 1:     2 2:     3 3:     4 4:     5 5:     6 6
// Source MAC (Alisa's Computer, 10:20:30:40:50:60)
00010000 00100000 00110000 01000000 01010000 01100000
//   1 0:     2 0:     3 0:     4 0:     5 0:     6 0
// Length of data (0x0014 / 20 bytes)
00000000 00010100
// --- Ethernet Data Begin ---
// Contains an Internet Protocol message (“a packet”)
// Source Address (Alisa's IP Address)
00001111 00010111 00101100 00000001
//    15.      23.      44.       1
// Destination Address (g's IP Address)
00010010 00101011 01000000 00000001
//    18.      43.      64.       1
// Protocol Type: User Datagram Protocol (17 aka 0x11)
00000000 00010001
// Source Port (14459 - usually chosen at random)
00111000 01111011
// Destination Port (23 - common for “telnet” service)
00000000 00010111
// Data Length (2)
00000000 00000010
// UDP Data Hash
01101010 11011001
// Data (“hi”)
01101000 01101001
//     h        i
// --- Ethernet Data End ---
// Ethernet frame data hash
10010101 00100110
Aaaaanyways, the point here is that we're layering protocols to allow a message to traverse multiple network types and go from Internet-connected device Alisa's PC to Internet-connected device g's PC
We've stuffed an Internet Protocol packet inside an Ethernet Frame.
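Here's a minimal sketch of that stuffing in Python, using the same simplified packet layout as the example (not the real IP header). It leaves out the preamble and both hashes to keep things short. The function names are my own. On the wire, real Ethernet puts the destination MAC ahead of the source MAC, so that's the order used here.

```python
import socket
import struct


def mac_bytes(mac):
    """Convert '11:22:33:44:55:66' into 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_packet(src_ip, dst_ip, src_port, dst_port, data):
    """Simplified IP+UDP layout: addresses, protocol, ports, data length, data."""
    header = socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
    header += struct.pack("!HHHH", 17, src_port, dst_port, len(data))  # 17 = UDP
    return header + data


def build_frame(dst_mac, src_mac, payload):
    """Ethernet header (destination, source, length) wrapped around the payload."""
    header = mac_bytes(dst_mac) + mac_bytes(src_mac)
    header += struct.pack("!H", len(payload))
    return header + payload


packet = build_packet("15.23.44.1", "18.43.64.1", 14459, 23, b"hi")
frame = build_frame("11:22:33:44:55:66", "10:20:30:40:50:60", packet)
print(" ".join(f"{byte:08b}" for byte in frame))
```

The inner function knows nothing about MACs and the outer one knows nothing about IP addresses or ports. That separation is the whole point of layering.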
With this layering, we've now achieved the bottom 3 parts of the OSI networking model. Posting another OSI Model chart to refer to:
OSI Layer 4: Transport
So, my binary example showed a cut-down version of Internet Protocol's User Datagram Protocol. Where a Telegraph allows you to use Morse code to send Telegrams, Internet Protocol allows you to use UDP to send Datagrams. Basically, any data you want as long as it fits into a single packet, which is normally limited to about a kilobyte in size.
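On Linux you never build those bits by hand; you ask the kernel to do it through the socket API. Here's a small sketch that sends one datagram to itself over the loopback address 127.0.0.1, so it doesn't need a real network. The function name and the 5 second timeout are arbitrary choices of mine.

```python
import socket


def send_datagram(message):
    """Send one UDP datagram to ourselves over loopback and return what arrived."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(5)           # don't wait forever if it gets lost
        sender.sendto(message, receiver.getsockname())
        data, (src_ip, _src_port) = receiver.recvfrom(2048)
        return data, src_ip


print(send_datagram(b"hi"))
```

Notice the sender never picks its own source port. Like in the binary example, the OS chooses one.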
UDP sits at the 4th layer of the OSI model, Transport. So does its better-known sibling, the Transmission Control Protocol (TCP). TCP running on top of Internet Protocol is what gives the TCP/IP networking model its name. TCP is like UDP, but it adds some additional rules:
- Messages relate to each other, to form a single ongoing stream of data.
- Messages must be processed in a specific order without any missing parts. TCP packets include a sequence number that allows the receiver to make sure they get ordered correctly.
- After a message is sent, the receiving machine must send a message back called an acknowledgement (ACK)
- If an ACK is not received for one sent message, that message must be re-sent until the ACK is received
By this set of rules, instead of sending simple size-limited datagrams computers can use Internet Protocol to have an ongoing conversation of arbitrary length
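The rules above can be sketched as a tiny simulation. This is a "send one, wait for the ACK, resend if it never comes" loop, which is much simpler than what real TCP does (real TCP keeps many packets in flight and numbers bytes rather than messages). The function name and the drop_attempts parameter are invented for the example.

```python
def deliver(segments, drop_attempts):
    """Resend each segment until it is acknowledged.

    drop_attempts is a set of (sequence_number, attempt) pairs that the
    pretend network loses; everything else arrives.
    """
    received = {}      # sequence number -> data, as seen by the receiver
    transmissions = 0
    for seq, data in enumerate(segments):
        attempt = 0
        while True:
            attempt += 1
            transmissions += 1
            if (seq, attempt) in drop_attempts:
                continue          # no ACK came back, so send it again
            received[seq] = data  # receiver stores it and replies with an ACK
            break
    # The receiver reassembles the stream in sequence-number order.
    stream = b"".join(received[seq] for seq in sorted(received))
    return stream, transmissions


stream, sent = deliver([b"he", b"ll", b"o"], drop_attempts={(1, 1), (1, 2)})
print(stream, sent)
```

Three messages took five transmissions because the middle one was lost twice, but the receiver still ends up with the complete stream in order.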
I'm not going to go into how TCP formats its packets because if I'm being honest it's painfully boring information that we can gloss over without really impacting the takeaway
The point here is that TCP/IP takes an Internet connection, and allows you to have an ongoing conversation more or less exactly like a serial connection would. You can send some bits over, and the computer at the other end can send some bits back
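Here's a sketch of that back-and-forth using real TCP sockets over loopback. One thread plays the remote computer and shouts back whatever it receives in upper case. The function names are mine, and socket.create_server needs Python 3.8 or newer.

```python
import socket
import threading


def shout_back(server):
    """Accept one connection, read some bytes, reply with them upper-cased."""
    conn, _addr = server.accept()
    with conn:
        conn.sendall(conn.recv(1024).upper())


def converse(message):
    """Open a TCP connection to a local server, send bytes, return the reply."""
    with socket.create_server(("127.0.0.1", 0)) as server:  # port 0: OS picks
        port = server.getsockname()[1]
        worker = threading.Thread(target=shout_back, args=(server,), daemon=True)
        worker.start()
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            client.sendall(message)
            reply = client.recv(1024)
        worker.join(timeout=5)
        return reply


print(converse(b"hi"))
```

All the sequence numbers, ACKs and re-sending happen inside the kernel. The program just sees a stream of bytes going each way.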
This is what birthed telnet, the remote access utility which allowed you to access a computer's terminal over the Internet.
Where a Teletype machine would connect over Serial to send individual characters to a computer's TTY interface, telnet connects over TCP/IP to send individual characters (or groups of characters in order) to connect to a computer's virtual TTY interface.
The implementation of telnet directly addresses ARPA's concern:
Rather, the ARPANET came out of our frustration that there were only a limited number of large, powerful research computers in the country, and that many research investigators, who should have access to them, were geographically separated from them.
When you had to use a telephone modem to dial directly into a research computer's Serial TTY, that phone line and serial port were tied up for your exclusive use. Nobody else could use that connection while you were on the line. Anyone trying to would just get a telephone busy signal.
With telnet though, because it operates over networks where switching happens per-packet instead of per-connection, the research computer can have one link into the network but then serve dozens of connections over that link. Each connection going to its own Virtual TTY.
When you take your TCP connection and use telnet to get a virtual TTY on a remote computer, that connection makes up a session. The 5th layer of the OSI model.
Your terminal emulator which is hosting the telnet session generally handles the 6th layer Presentation, taking the bytes that go across the session and rendering them onto a virtual teleprint video terminal.
And then whatever application is running over that telnet session makes up the 7th layer, Application. For example, sh or bash.
Objectively, the best application to use over telnet was the one hosted at towel.blinkenlights.nl: https://fossbytes.com/watch-star-wars-command-prompt-via-telnet/
Going back to this, it's time for some Internet trivia
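Although you can't physically see the routers along the way of this connection, you can virtually see them. The Ping protocol, which is IP protocol 0x01 "ICMP" (as opposed to UDP 0x11 and TCP 0x06), can be combined with a counter carried in every IP packet called Time To Live (TTL) to effectively say "I'm measuring round-trip time but I want the Nth router in the connection to respond instead of the actual destination device". Each router subtracts one from the TTL, and the router that brings it down to zero throws the packet away and sends an ICMP error message back to you.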
By pinging and incrementing that "N" value one by one until you reach the actual destination, you can reconstruct the entire path.
This technique is called "trace route", and is implemented on Windows by the tracert tool. On Linux, the most popular tool is mtr "My Trace Route"
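Sending real probes like this needs special privileges on most systems, so here's a simulation of the idea instead. Each pretend router subtracts one from a hop counter (in real IP packets this is the Time To Live field) and whoever brings it to zero answers. All the router names except dns.google are invented.

```python
def probe(path, ttl):
    """Send one pretend packet down the path with the given time-to-live.

    Each hop subtracts 1 from the TTL; whoever brings it to 0 answers.
    The last entry in path is the destination, which always answers.
    """
    for position, hop in enumerate(path):
        ttl -= 1
        if ttl == 0 or position == len(path) - 1:
            return hop
    return None


def trace_route(path, max_hops=30):
    """Probe with TTL 1, 2, 3... until the destination itself replies."""
    hops = []
    for ttl in range(1, max_hops + 1):
        responder = probe(path, ttl)
        hops.append(responder)
        if responder == path[-1]:
            break
    return hops


# Hypothetical path; only the final name comes from the example above.
PATH = ["home-router", "isp-gateway", "isp-border", "google-border", "dns.google"]
print(trace_route(PATH))
```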
E.g., from where I am right now there are 8 routers between me and IP Address 8.8.8.8 (shown here as name dns.google)
You can see the "Avg" time listed for each router gets progressively longer the further down the list you go. Some intermediate connections will report shorter/longer times depending on how quickly they want to respond to ping requests, but it generally trends upwards
In this sample traceroute I sent, we can guess that the first 4 routers belong to my ISP, and then the next 4 routers belong to Google's ISP (which is just Google because they run their own ISP). You can see, the ping latency jumps up significantly between routers 4 and 5.
That's just a guess though. We could measure precisely by looking up who owns the IP address assigned to each of the routers, which is public information, but to be honest I can't be bothered right now.
Tomorrow we can talk about the few remaining technologies that led to the World Wide Web, and eventually VPNs
ARP and DNS
More immediately important to talk about is two related technologies, called ARP and DNS.
The first of these is Address Resolution Protocol (ARP), which is a technology used on Ethernet networks to convert IP Addresses into MAC addresses. By using ARP, your computer can automatically discover that (for example) IP Address 15.23.44.0 corresponds to MAC 11:22:33:44:55:66 on your local network. This is important to make Internet Protocol work effortlessly on Ethernet networks.
Referring back to the network diagram I shared earlier, ARP allows "Alisa's Personal Computer" to configure its IP settings without even thinking about Ethernet MACs. In this network, Alisa's PC would have its IP settings configured so that the "default gateway" was 15.23.44.0. As the default gateway, Alisa's computer would know that any messages not addressed to devices inside the Ethernet network should be sent to that Gateway instead. Alisa's PC's IP settings mark 15.23.44.0 as the default gateway, not 11:22:33:44:55:66. So the MAC must be discovered before traffic can be sent to the gateway, because the Ethernet Switch only works in terms of MAC addresses not IP addresses. (Much later, "layer 3" switches were introduced to have switches that are actually aware of IP addresses, but even now these are expensive and only provide value in niche situations).
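The "is this inside my network or does it go to the gateway?" decision can be sketched like this. The /24 size of Alisa's network (meaning the first 24 bits are shared by every local device) is an assumption of mine; the gateway address is the one from the example.

```python
import ipaddress

LOCAL_NETWORK = ipaddress.ip_network("15.23.44.0/24")  # assumed network size
DEFAULT_GATEWAY = ipaddress.ip_address("15.23.44.0")


def next_hop(destination):
    """Return the IP address whose MAC the Ethernet frame should be addressed to."""
    destination = ipaddress.ip_address(destination)
    if destination in LOCAL_NETWORK:
        return destination   # same Ethernet network: deliver directly
    return DEFAULT_GATEWAY   # anything else gets handed to the gateway


print(next_hop("15.23.44.2"), next_hop("18.43.64.1"))
```

Either way the result is still an IP address, so the computer has one more step to do before it can fill in the destination MAC of the frame.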
This discovery, powered by ARP, works like this roughly:
- Alisa's computer wants to reach address 15.23.44.0 but it doesn't know what MAC that corresponds to
- Alisa's computer sends an ARP packet in an Ethernet frame addressed to the special MAC FF:FF:FF:FF:FF:FF (called the broadcast MAC)
- The Ethernet Switch forwards this packet to all ports on the switch. If your network is set up with a hub instead of a switch, this is the normal behavior.
- Whatever device is using IP address 15.23.44.0 will send an ARP response packet back to whatever Ethernet MAC address broadcasted the request. In this case, the ISP gateway attached to the switch by a modem is the one that is using the IP address 15.23.44.0. The response says "I'm MAC 11:22:33:44:55:66 and I'm the one using IP address 15.23.44.0 :)"
- Alisa's computer receives the response and caches the pair 15.23.44.0 -> 11:22:33:44:55:66 in a list it keeps (called the ARP table). Next time, it just checks this table to find the MAC instead of sending a broadcast across the whole Ethernet network.
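The steps above can be sketched as a little simulation. The DEVICES dictionary stands in for "everything plugged into the switch", and the Host class and its counter are made up so we can see when a broadcast actually happens.

```python
# Devices on the pretend Ethernet network: MAC -> IP address in use.
DEVICES = {
    "11:22:33:44:55:66": "15.23.44.0",  # ISP gateway
    "12:34:56:AB:CD:EF": "15.23.44.2",  # printer
}


class Host:
    """A computer that keeps an ARP table and counts its broadcasts."""

    def __init__(self):
        self.arp_table = {}       # IP address -> MAC address
        self.broadcasts_sent = 0

    def resolve(self, ip):
        """Return the MAC for ip, broadcasting a request only on a cache miss."""
        if ip not in self.arp_table:
            # Pretend to send a frame to FF:FF:FF:FF:FF:FF asking who has ip.
            self.broadcasts_sent += 1
            for mac, owner_ip in DEVICES.items():
                if owner_ip == ip:  # only the owner of the address answers
                    self.arp_table[ip] = mac
        return self.arp_table.get(ip)


pc = Host()
print(pc.resolve("15.23.44.0"), pc.resolve("15.23.44.0"), pc.broadcasts_sent)
```

The second lookup is answered straight from the table, so only one broadcast is ever sent for that address.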
So let's say we have Alisa's house with this equipment inside. Alisa wants to print a document to her printer using Internet Printing Protocol. So she points her document program to print to IPP device 15.23.44.2. Alisa hasn't printed to this device before, so her computer will send an Ethernet broadcast with an ARP request asking "what Ethernet device here has IP Address 15.23.44.2?" and then the printer replies "I'm 12:34:56:AB:CD:EF and I have that IP Address!" Then Alisa's Computer sends the IPP packets as serial-style data across the Ethernet cable, packaging them in Ethernet frames addressed to MAC 12:34:56:AB:CD:EF. The switch takes those serial messages, checks the frames' destination MACs, and then switches the messages to go out of the port that the printer is attached to. Then the printer receives the messages and tries to print the document they contain.
Alright so like, ARP and DNS do not work anything like each other. But they serve a similar purpose. ARP translates IP addresses into MAC addresses. DNS translates human readable addresses into IP addresses.
Ever see The Lion King?
DNS stands for "Domain Name System". The word "Domain" is used in the same sense as that scene in the Lion King, where Simba's dad Mufasa tells him "everything the light touches is our kingdom". That kingdom is Simba's Domain.
The Domain Name System allows organizations to assign names to all the devices within their domain, and for those names to be unique globally.
If you set up some records in the domain name system, you'd likely consider the contents of your house's network as your "domain".
For many practical reasons, the DNS system can't work like ARP where the computer trying to look up an address just broadcasts to the whole network. ARP broadcasts on an Ethernet network, which has like a few hundred members at most. DNS contains hundreds of millions of members.
So the organization ICANN keeps a list of so-called "top level domains". Each entry on this list specifies the IP address of a server which can help you look up the IP for addresses under that level. The list originally had just a handful of entries:
- com - for Commercial organizations
- org - for non-commercial organizations
- net - for ISPs and related organizations
- edu - for universities
- gov - for US government organizations
- mil - for the US Military
Later this list was expanded to include a bunch more entries. Notably, they added a bunch of two-letter country codes. E.g. us and ru.
The servers for each of those country codes are owned by the governments of each country. Typically, the government delegates management of that server to a dedicated organization.
This was a huge deal for some countries, like Tuvalu who got tv and makes a significant portion of their countrywide revenue from it. It is similar with the British Indian Ocean Territory that got io.
DNS is a hierarchical naming system, so the "top level domains" (TLDs) each can have any number of names "below" them in the hierarchy. For example, "com" has "discord" under it. Bizarrely, the group who came up with the standards for DNS decided that instead of going from left to right like a directory path, DNS names should go from right to left. So instead of having com.discord we get names like discord.com. This "fully qualified" name has TLD name com and second-level domain name discord. Fully qualified meaning it shows the entire name up to the top-level.
Although domain names are now mostly used for websites, the domain name system pre-dates the invention of websites by like 6 years. For the longest time, it was standard to have the third-level domain "www" to identify the web server separately from the rest of the domain. E.g. www.discord.com would identify the primary web server on the discord second-level domain under the com commercial TLD.
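To see the right-to-left hierarchy, you can just split a name on its dots and flip the result around. This is only a sketch of how to read a name, not how DNS software stores it, and the function name is mine.

```python
def hierarchy(name):
    """List the labels of a domain name from the top-level domain downward."""
    return name.rstrip(".").split(".")[::-1]


print(hierarchy("www.discord.com"))  # ['com', 'discord', 'www']
```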
So here's the scenario: "back in the day" documents were sent around by Fax machine. But now you have this fancy Internet printer. I want to send you a document, for example some pages of a sensitive document. I ask you for the address of your Internet printer, and you tell me it's 15.23.44.2. I forget that immediately. To help with the forgetting problem, you register a domain name system entry.
Let's say you want to name your domain cheesebags. You might approach an Internet company that contracts with the Russian government (or the government's delegated management organization). You pay that Internet company $10 a year to keep that entry valid. Some of that money goes to the Internet company in exchange for hosting your entry. Some of that money goes to the government in exchange for hosting the ru TLD. And some of that money goes to ICANN for hosting the list of TLDs.
As the proud owner of domain name cheesebags.ru, you now set up the third-level domain printer.cheesebags.ru. You pay that Internet company to keep the record that printer.cheesebags.ru is associated with IP address 15.23.44.2.
So, now, when I ask you what the address of your Internet Printer is, you can tell me the address is printer.cheesebags.ru. Now that's a name to remember.
When I put that address into my computer, it checks my computer's network settings for the IP address of a "DNS Server" to use. This used to have to be entered manually according to guidelines from the ISP, though these days the server is usually picked automatically. So, because this is the 1990s, my PC will go to the server my ISP told me to use and ask it "what is the IP address for printer.cheesebags.ru"? Then my ISP's server will check the ICANN list of TLDs and see that ru is managed by Russia and it will go ask Russia "what is the IP address for printer.cheesebags.ru?". Russia will say that this domain is managed by the Internet Company that Alisa is a customer of. So my ISP's server will check with that Internet Company to ask "what is the IP address for printer.cheesebags.ru?" and that Internet Company will check its records and reply that "printer.cheesebags.ru is associated with 15.23.44.2". Then the ISP-owned DNS server will forward that reply back to my computer.
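Here's that chain of questions as a small simulation. The dictionaries stand in for the servers, and their names ("ru-registry", "internet-company") are invented; a real lookup sends network requests at each step and remembers the answers for a while so it doesn't have to ask again.

```python
# The pretend list of top-level domains: TLD -> server that manages it.
TLD_LIST = {"ru": "ru-registry"}

# Pretend servers. Each maps a name to either another server or an address.
SERVERS = {
    "ru-registry": {"cheesebags.ru": "internet-company"},        # a referral
    "internet-company": {"printer.cheesebags.ru": "15.23.44.2"},  # the record
}


def resolve(name):
    """Follow the chain of referrals and return (address, servers asked)."""
    labels = name.split(".")
    asked = []
    server = TLD_LIST[labels[-1]]                    # who runs the TLD?
    asked.append(server)
    server = SERVERS[server][".".join(labels[-2:])]  # who runs the domain?
    asked.append(server)
    address = SERVERS[server][name]                  # the actual record
    return address, asked


print(resolve("printer.cheesebags.ru"))
```

In an actual program you would normally just call something like socket.getaddrinfo and let the operating system and the DNS server do all of this walking for you.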
Very notably, a lot of government-mandated website blocks work at this level. The government forces all the ISPs in the country to reject requests for DNS addresses that the government says should be blocked.
Once my PC has discovered the IP address of your internet printer, me printing on your printer is as simple as sending some IP packets with source address 18.43.64.1 and destination 15.23.44.2:
- Across the serial wire to my serial modem
- Across the phone network, into the ISP's modem
- To my ISP's customer-facing gateway router
- To my ISP's internet-facing BGP router
- Across the internet mesh to your ISP's internet-facing BGP router
- To your ISP's customer-facing gateway router
- To your ISP's modem
- Across the phone network into your Ethernet modem
- Across an Ethernet cable into your Ethernet switch
- Across an Ethernet cable into your printer.
My computer does not know the MAC addresses you use on your Ethernet network, but it also does not need to. When the IP packet arrives at your ISP's customer-facing gateway router, that router (which is attached to your Ethernet network by your Ethernet Modem) will itself use ARP to discover the appropriate MAC to use. Then the router will encapsulate the IP packet in an Ethernet frame with the appropriate source and destination MACs. That IP packet inside an ethernet frame is what gets passed the rest of the way of the journey.
Firewalls
This is great but it's a little too great. Because I can print to your printer when you want, but I can also print to your printer when you don't want. And anyone who has your IP address can print to your printer when you don't want. Printer ink and toner are expensive, paper is expensive.
With Fax machines there was some idea of a black fax attack where a malicious person could take a fax machine with a continuous document feeder, tape a couple of all-black pages end to end, and then run these pages in an endless loop through the fax so that the victim fax machine wastes all its paper and ink.
With this new internet printer, an attacker doesn't even need a continuous document feeder. They can just use a computer program that sends a new print job every couple seconds.
On a telephone, there's usually a feature somewhere to make the phone automatically reject calls coming in from unknown phone numbers. This allows people to avoid annoyance from spam calls, and fax machines to avoid malicious faxes coming in from bad actors.
On the Internet, a very similar effect is achieved using a device called a firewall.
The original firewalls were physical boxes that did just firewall work. These days, firewalls are mostly just programs that run on basically everything that has a network interface. You can still buy physical firewall boxes, but they're usually only seen in companies big enough to have dedicated network engineer staff.
A Firewall's job is pretty simple. It acts a lot like a bridge, where it takes messages that come in on one side, and just repeats them onto the other side. But instead of blindly repeating all messages, it has a list of rules. In this scenario, the rule list would look something like this:
- Allow traffic from the internet side that's coming from 18.43.64.1 (g's PC) and going to 15.23.44.2 (Alisa's printer)
- Allow traffic from Alisa's side that is coming from or going to anywhere
- Do not allow any other traffic
With these rules in place, my print jobs can pass through the firewall but everyone else's get rejected. Your print jobs can make it to the printer perfectly fine because they don't have to go across the firewall; you're inside the network that the firewall protects.
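A rule list like that can be sketched as "check each rule from the top, first match wins". The /24 size of Alisa's network is an assumption, and the tuple layout is just for the example. Real firewalls usually also match on ports and remember ongoing connections.

```python
import ipaddress

ANYWHERE = ipaddress.ip_network("0.0.0.0/0")
INSIDE = ipaddress.ip_network("15.23.44.0/24")  # assumed size of Alisa's network

# (action, source network, destination network); first matching rule wins.
RULES = [
    ("allow", ipaddress.ip_network("18.43.64.1/32"),
     ipaddress.ip_network("15.23.44.2/32")),   # g's PC -> Alisa's printer
    ("allow", INSIDE, ANYWHERE),               # anything from Alisa's side
    ("deny", ANYWHERE, ANYWHERE),              # everything else
]


def decide(src, dst, rules=RULES):
    """Return 'allow' or 'deny' for a packet going from src to dst."""
    src = ipaddress.ip_address(src)
    dst = ipaddress.ip_address(dst)
    for action, src_net, dst_net in rules:
        if src in src_net and dst in dst_net:
            return action
    return "deny"  # nothing matched: stay on the safe side


print(decide("18.43.64.1", "15.23.44.2"))  # my print job
print(decide("99.1.2.3", "15.23.44.2"))    # a stranger's print job
```

The order matters. If the "deny everything" rule were first, nothing would ever get through.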
Tangentially, even today not everyone puts a firewall between their printer and the Internet. See here: https://cybernews.com/security/we-hacked-28000-unsecured-printers-to-raise-awareness-of-printer-security-issues/ an estimated 500,000 printers are just out there, happy to print anything anybody sends from anywhere.
The Internet, broadly, worked this way for years. However, things got shook up quite a bit with the introduction of Wi-Fi
Wi-Fi
Wi-Fi is not a terribly complicated technology. It takes a modem (like we use to talk over the phone network) and instead of going over the phone network it transmits over radio. Here we can see I've upgraded my home network to include Wi-Fi. I've added a "Wireless Bridge" which takes the bits that come in over its interface to the Wi-Fi modem and bridges them onto its interface to the Ethernet network.
This new configuration of Modem -> Firewall -> Router -> Switch -> Wireless bridge is what most ISP customers use today. It's such a common and successful configuration that most ISPs give their customers a single box called a "combination wi-fi modem router" that includes all the circuitry and software for all five of those previously separate boxes. Here in the US, one of these combination boxes costs around $100-300. Usually the ISP will rent it to you for an extra $10 a month on your bill.
These wireless bridges go by another name sometimes, Wireless Access Points (WAP). Though Cardi B's song WAP has taken over the meaning of the acronym since it was released in 2020.
Anyways, the point here is that Wi-Fi created a big problem for network security.
Up until the introduction of Wi-Fi, almost all networks operated under the assumption that communication across the network was secure because the signals only passed through wires that were physically secure inside locked offices and houses. Although getting your network tapped on the outside was possible (think like a hollywood style "phone tap" you'd see in a spy movie), for most people it was fine to just assume that the connection wasn't tampered with and all the ISPs making up the Internet mesh were not seriously interested in spying on your connections.
Because Wi-Fi runs over radio, all communications sent over Wi-Fi get broadcasted in a way that can be listened in on from hundreds of meters around when using a strong enough antenna. This led to a need to develop an encryption code to prevent people from listening in on any data being sent across the Wi-Fi connection. The first attempt at this encryption code was termed Wired Equivalent Privacy (WEP), because it intended to allow you to shout your data all over the nearby radio spectrum while still giving you the privacy you would have had if you were using a wired connection.
WEP has been replaced by several newer encryption standards. Today, most networks use WPA2 (Wi-Fi Protected Access version 2) which utilizes AES encryption. This is not a talk about encryption, but the point is that when you connect to a password-protected Wi-Fi network all your data gets securely encrypted by your device (laptop or smartphone) before being blasted on the radio, and then it gets decrypted by the WAP/Wireless Bridge before the bridge forwards it on to the Ethernet network.
So with this Wi-Fi tech, if I want to send you "hi" it goes like this:
- My program sends "hi" to the Operating System's socket API.
- The OS puts "hi" into a UDP/IP packet format.
- The OS puts the UDP/IP packet into an Ethernet frame.
- The OS hands the Ethernet frame over to a Wi-Fi modem driver program. The driver program is specific to the hardware. Linux builds most drivers into the kernel, but some drivers can come as loadable kernel modules.
- The Driver encrypts the data using AES encryption.
- The Driver delivers the encrypted data as binary to the physical Wi-Fi modem inside your laptop, which is normally attached as an expansion card but there are also USB Wi-Fi modems you can buy. If you're really old-school, you can get a Serial Wi-Fi modem.
- The physical Wi-Fi modem inside your laptop or phone modulates the signal so it will go across radio properly, then pushes it as voltage over to an antenna.
- The antenna does what antennas do and causes the voltage to leave the antenna and splatter all around as radio waves.
- The wireless bridge's antenna catches some of those waves which turn back into voltage.
- The wireless bridge's modem demodulates the signal back into binary data.
- The wireless bridge decrypts the binary data back into an Ethernet frame.
- The wireless bridge forwards the Ethernet frame onto the Ethernet network.
- Then all the regular networking stuff we've already talked about happens.
It's a lot of work to send just 16 bits of binary data (01101000 (h) 01101001 (i)), but that's the price we pay for technology.
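The very first step of that list, and the 16 bits at the end of it, can be sketched in Python. This is only a minimal example: it assumes a receiver on the loopback address 127.0.0.1 standing in for "you", so no Wi-Fi hardware is actually involved. Everything after the socket call (packets, frames, encryption, radio) is the job of the OS and the driver, not the program.

```python
import socket

# A receiver on the loopback interface stands in for the other person.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))      # port 0: let the OS pick any free port
receiver.settimeout(2.0)

# "My program sends 'hi' to the Operating System's socket API."
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
message = b"hi"
sender.sendto(message, receiver.getsockname())

# The OS on the receiving side hands the payload back to a program.
received, _address = receiver.recvfrom(1024)

# Show the payload as the 16 bits that actually had to be delivered.
bits = " ".join(format(byte, "08b") for byte in received)
print(received, bits)

sender.close()
receiver.close()
```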
This is what a Wi-Fi modem expansion card for a desktop PC looks like.
This is what they look like for laptops (the white and black wire there are antennas, one to send and one to receive)
This is what a Serial wifi modem looks like
This is what USB ones look like. The tiny ones have really awful tiny antennas basically curled up inside them, but they do work. I use that design a lot on Raspberry Pi computers
Transport Layer Security
So, with AES-encrypted Wi-Fi all our computer security problems are solved, right?
Well, no. Mostly because people are terrible.
As the prices of Wi-Fi equipment came down, and computers became smaller and easier to carry around, people invented the idea of "public Wi-Fi". This causes two security problems that may not be immediately visible when just looking at / thinking about home wifi.
The first security problem is that, as I said before but without drawing attention to it, Wi-Fi is only encrypted on password-protected networks. When you're on an "open" network that has no password and is instead meant for public use, the connection is completely decodable by anyone nearby with a radio antenna.
Most "public" Wi-Fi networks are now configured with a simple password listed on a sign somewhere on the premises. Even if the company never changes the password, this still improves security tremendously because it enables the connection encryption.
The second security problem is that although a secure Wi-Fi connection ensures third-parties aren't listening from nearby, it does not ensure that the company you're connecting through is not listening. That is, I definitely do not trust the other people sitting at the Internet Cafe, but I probably also do not trust the Internet Cafe itself. This is a much more immediate concern than not trusting your ISP, though the tools invented to address this issue do protect you from ISP snooping as well.
Now, a lot of people are just completely apathetic about security. However, with the rising popularity of internet payment processing, internet banking, and electronic messaging, the companies who actually run the servers that their customers (and employees) connect to were not quite so apathetic. These companies broadly decided that relying on customers to never connect from public Wi-Fi was futile. Instead, over the past 20 years, basically everyone has adopted a new encryption layer that goes right on top of TCP, called Transport Layer Security (TLS).
TLS (and its predecessor SSL, Secure Sockets Layer) applies the same kind of encryption you would see in a secure Wi-Fi connection, but instead of getting decrypted by the Wi-Fi router, it gets decrypted by the server all the way at the other end of the TCP connection. Protocols that include this extra TLS layer generally end in "S" to stand for "Secure". For example:
- File Transfer Protocol (FTP) -> File Transfer Protocol Secure (FTPS)
- Hypertext Transfer Protocol (HTTP) -> Hypertext Transfer Protocol Secure (HTTPS)
- Internet Printing Protocol (IPP) -> Internet Printing Protocol Secure (IPPS)
When accessing a website using HTTPS, instead of the HTML being sent as raw ASCII-coded text over the wi-fi network, the HTML gets AES encrypted by the server's TLS layer, sent over TCP to your laptop, then decrypted by your web browser's TLS layer. This means that someone who's just listening on the radio can no longer see what you are looking at in the web browser*. This means that an evil Internet Cafe can't see what you are looking at in the web browser*. This means that an evil ISP or Government can't see what you're looking at in the web browser* (at least, not without a backdoor into either your PC or the Webserver itself. Many governments do have these kinds of backdoors. As I recall, this kind of backdoor is not just a secret in some countries like China, but is actually required by law to be inserted upon request. This is, in short, why the US does not allow Huawei phones to be imported).
* Generally, they can still see the IP address and Domain Name you connect to. Recently a new version of DNS has made it so (when used) they can't see the Domain Name you connect to, but they more or less have to be able to see the IP address because if the traffic was not tagged with an IP address then it would not be deliverable.
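For a rough idea of what "TLS on top of TCP" looks like from a program's side, here is a sketch using Python's standard ssl module. The function name fetch_status_line and the choice of a HEAD request are my own for illustration. The TCP connection is opened first, then wrapped in TLS, and only then does any HTTP get sent.

```python
import socket
import ssl

def make_tls_context() -> ssl.SSLContext:
    """Client-side TLS settings: verify the server's certificate and hostname."""
    return ssl.create_default_context()

def fetch_status_line(host: str, port: int = 443) -> str:
    """Open TCP, wrap it in TLS, send a tiny HTTP request, return the first reply line."""
    context = make_tls_context()
    with socket.create_connection((host, port), timeout=10) as tcp:
        # server_hostname is sent unencrypted during the handshake (SNI),
        # which is one way an observer can still learn the domain name.
        with context.wrap_socket(tcp, server_hostname=host) as tls:
            request = f"HEAD / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
            tls.sendall(request.encode("ascii"))   # encrypted before it hits TCP
            reply = tls.recv(4096)                 # decrypted after it leaves TCP
    return reply.split(b"\r\n", 1)[0].decode("ascii", "replace")
```

Calling fetch_status_line("example.com") needs Internet access, so it is left out of the block itself. Everything between the two machines only ever sees the encrypted bytes plus the IP addresses.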
So for our example scenario here, let's imagine I'm over here sitting in some Internet Cafe and I want to print some sensitive document pages over to your printer. I don't want the people in the Internet Cafe to snoop on what I'm sending you, so when I go to print I set up the printer as an IPPS printer instead of an IPP printer. Now, when I go to print, every other layer works exactly the same but with an extra step in the middle that encrypts the contents of each TCP packet before it leaves my computer. Now I can send you whatever I want and nobody can see what it is.
Some VPN companies make claims that their VPN software will protect you from hackers trying to steal your credit card information when making online purchases. That could have been true 15-20 years ago, but because of the widespread adoption of TLS and HTTPS this is really just not true anymore. You're already very decently secure against that. We'll talk later about what VPNs can actually do for you and secure you against.
Proxies
TLS improves security by providing personal privacy, but in the corporate world companies improve security through monitoring, which is the opposite of privacy and becomes impossible with the added privacy of TLS (unless the company tampers with the individual computers themselves). This monitoring is where proxies come into play:
When you request a website, you send your request pretty much directly from your computer to the web server, and then the web server replies directly back to your computer. A proxy shakes this up a bit: when using a proxy, the proxy server sits in the middle, recording and sometimes tampering with the request or response data.
In a company this is done for a few reasons:
First, it gives management a chance to apply request filtering. If your PC asks the proxy to request a web page, but that request is for a site that the workplace does not allow like a known gambling or adult content site, the proxy can reject the request.
Second, it allows for response filtering. If your PC asks the proxy to request a web page which would normally be allowed, but that response raises a flag with an antivirus scanner, the response can be rejected. This kind of response scanning improves network security by keeping known malware out.
Third (though not as common, especially at small companies), the proxy can scan uploaded content for matches against a database of company secrets. This is to prevent data exfiltration for the purpose of corporate espionage. I imagine our federal government does this as well, to prevent exfiltration of national secrets in the case of actual international espionage.
Fourth, caching proxies may keep a local copy of requested data so that if the data is requested multiple times, it only has to utilize the company's external bandwidth once. These used to be common to, for example, cache Windows update files.
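The request filtering and caching reasons above can be sketched in a few lines. This is a toy, not a real proxy server: ToyProxy, fake_fetch, and the example host names are all made up, and the "upstream" is a fake function so the sketch runs offline.

```python
from typing import Callable, Dict, Set, Tuple

class ToyProxy:
    """Clients ask the proxy for a page instead of asking the web server."""

    def __init__(self, upstream_fetch: Callable[[str, str], bytes],
                 blocked_hosts: Set[str]) -> None:
        self.upstream_fetch = upstream_fetch            # how the proxy reaches the real server
        self.blocked_hosts = blocked_hosts              # request filtering list
        self.cache: Dict[Tuple[str, str], bytes] = {}   # (host, path) -> response body
        self.upstream_calls = 0                         # how often external bandwidth was used

    def request(self, host: str, path: str) -> bytes:
        # Request filtering: refuse sites the workplace does not allow.
        if host in self.blocked_hosts:
            raise PermissionError(f"{host} is not allowed on this network")
        # Caching: only go out to the Internet the first time.
        key = (host, path)
        if key not in self.cache:
            self.upstream_calls += 1
            self.cache[key] = self.upstream_fetch(host, path)
        return self.cache[key]

def fake_fetch(host: str, path: str) -> bytes:
    """Stand-in for a real HTTP request."""
    return f"contents of {host}{path}".encode()

proxy = ToyProxy(fake_fetch, blocked_hosts={"casino.example"})
first = proxy.request("updates.example", "/patch.bin")
second = proxy.request("updates.example", "/patch.bin")   # served from the cache
print(proxy.upstream_calls)
```

Response filtering and upload scanning would slot into the same request method, inspecting the body before it is returned or sent.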
Here is a proxy, proxying time requests rather than web requests.
Companies often used what are called "transparent proxies", which are proxy servers that work alongside a firewall. The firewall intercepts traffic passing through it, hands that traffic to the proxy, lets the proxy apply any monitoring and filtering it likes, and then the firewall forwards on the request or response after the proxy has scanned or modified the data.
A firewall that does this is often referred to as a "deep packet inspection" firewall, because while a regular firewall only looks at the source and destination parts of an IP packet, a DPI firewall looks deeper into the data portion of the IP packet.
This is speculation on my part, but the widespread support for Internet proxy settings at the OS and Browser level seems to have originated from the fact that DPI firewalls are more expensive than regular firewalls. For a company with multiple external network links, they might not want to pay more for fancier firewalls for every link. Instead, they configure the computers to talk to a central proxy that handles all the traffic in one place, making the process cheaper and easier to manage.
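On Linux, a common form of those OS-level proxy settings is a pair of environment variables, http_proxy and https_proxy, which many programs and libraries read on their own. A small sketch with Python's urllib; the proxy address here is hypothetical, and normally the variables would be set by the administrator rather than by the program itself.

```python
import os
import urllib.request

# Hypothetical central proxy address, set here only so the example is self-contained.
os.environ["http_proxy"] = "http://proxy.corp.example:3128"
os.environ["https_proxy"] = "http://proxy.corp.example:3128"

# Ask the standard library which proxies the environment says to use.
proxies = urllib.request.getproxies()
print(proxies)

# An opener built this way sends its requests to the proxy instead of going direct.
opener = urllib.request.build_opener(urllib.request.ProxyHandler(proxies))
```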
With the introduction of TLS as an extra layer for common traffic types, these firewall based proxies tended to have an issue. The privacy that TLS gives people from spying by Internet Cafes and ISPs also generally prevents spying by DPI firewalls. This is where having dedicated standalone proxies instead of DPI firewalls found its real strength.
DPI firewalls apply filtering by intercepting requests, but standalone proxies work by making requests on your behalf. That gave proxy manufacturers an opportunity to create a new kind of proxy, the SSL Terminating Proxy. These proxies handle all of the SSL/TLS encryption themselves, so that they can inspect the unencrypted data before it leaves or enters the corporate network.
With an SSL Terminating Proxy, my computer will use an unencrypted request to the proxy, to ask the proxy to make a request to somewhere like yandex.ru. The proxy will then make an encrypted HTTPS request to Yandex on my behalf. Then it decrypts the response and forwards the decrypted version back to me.
Optionally, these SSL Terminating Proxies will apply their own encryption when talking to your PC, so that you're still protected from snoops on the wifi.
Free Proxies
As for why you'd use a proxy as someone who is not a corporation, mostly people use them because of how they make requests "on your behalf". By using a proxy server in Norway, requests you make through the proxy look like they originated in Norway. This helps people bypass firewalls and content geo-blocking.
"Free" proxies tend to work in one of two ways:
- They're offered for free because they are SSL/TLS Terminating Proxies and their goal is to inspect or tamper with your data, to steal your banking session data, crypto wallet, discord token, etc
- They work by having users themselves act as proxy servers in the network. So if you sign in from Russia and want to use a US proxy to access US Netflix, your data will pass through the computer belonging to someone in the US who might be using the same proxy service to (for example) access German Netflix via a German user's computer.
The first option here is problematic because you don't want your data stolen. The second option is problematic because, in general, you don't want people to be able to make requests "as you".
Imagine someone uses the second kind of free proxy to access content that is illegal in your country. The blame for that request is going to fall on you, generally.
There is a third type of free proxy service that is powered by a mix of curiosity and altruism, which is the category that proxy networks like Tor fall into. Tor ("the onion router") works by sending your request through a randomized sequence of proxy servers, with each server removing one layer of encryption on the way out and adding one back on the way home. Like layers of an onion 🧅 .
So with Tor, your computer picks the chain of proxy servers (normally three) and wraps your encrypted HTTP request (HTTPS) in one extra layer of encryption per server before it leaves your machine. The first proxy server decrypts the outermost layer and forwards what is left to the second proxy server. The second server decrypts a layer and forwards the result to the third proxy server. The third server decrypts the last layer to get back to the original HTTPS request, and sends that to the actual destination web server. Each server only learns which hop came before it and which hop comes after it. In theory, the only way to trace the request from the original requesting person to the destination server (or from the server back to the requesting person) is to own all of the proxies that were used.
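The onion layering is easier to see in a toy. In this sketch the sender wraps the message once per relay, and each relay can peel exactly one layer, learning only the name of the next hop. The toy_cipher function is a stand-in built from SHA-256 and XOR purely for illustration; it is not secure and is not what Tor uses, and the relay names and keys are invented.

```python
import hashlib
from typing import List, Tuple

def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 based keystream. Applying it twice undoes it. NOT secure."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def wrap(message: bytes, route: List[Tuple[bytes, bytes]]) -> bytes:
    """Build the onion from the inside out: the last relay's layer is applied first."""
    payload = message
    next_hop = b"destination"
    for name, key in reversed(route):
        payload = toy_cipher(key, next_hop + b"|" + payload)
        next_hop = name
    return payload

def peel(key: bytes, blob: bytes) -> Tuple[bytes, bytes]:
    """What one relay does: remove its own layer, read the next hop, pass the rest on."""
    plain = toy_cipher(key, blob)
    next_hop, _separator, inner = plain.partition(b"|")
    return next_hop, inner

route = [(b"relay1", b"key-one"), (b"relay2", b"key-two"), (b"relay3", b"key-three")]
onion = wrap(b"GET / HTTP/1.1", route)

hop1, blob1 = peel(b"key-one", onion)     # relay1 only learns "send to relay2"
hop2, blob2 = peel(b"key-two", blob1)     # relay2 only learns "send to relay3"
hop3, final = peel(b"key-three", blob2)   # relay3 recovers the original request
print(hop1, hop2, hop3, final)
```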
Tor servers are usually hosted by research universities, security companies, nonprofit organizations focusing on expanding global free speech, and government spy networks. These groups typically either want to have a proxy so they can do research on the metadata of traffic that passes through them, or in the case of governments they want to own as many of the proxies as possible to maximize the chances that they own the entire chain and can break the privacy. That way they can see who made requests to what servers and when.
Oooooh I forgot about the 4th kind of free proxy service, which is the botnet proxy 🙂 they offer you free proxy services, but in return your PC becomes part of a botnet that they will eventually use to do something else nefarious. This often overlaps with the 2nd type of proxy. Botnet proxies can either use the botnet as a way to attack other organizations, or they can just drain your own PC's resources with a bitcoin miner or something like that.
Tunnels
Alrighty, so after Proxies we get to Tunnels. Tunnels are when you use higher level networks to transmit data formatted for lower level networks. For example, sending Ethernet frames across TCP packets. Putting Ethernet frames as the data payloads of TCP packets is backwards according to the traditional model; usually, TCP packets go inside the data payload of an Ethernet frame. The backwards layering is more or less what makes it a tunnel.
Typically, you use this tunneling technique alongside bridging in order to reach into networks that you wouldn't be able to get to otherwise. In the same way that you might use a serial modem to call up a Bridge to access an Ethernet network somewhere, you could use Tunneling to make a TCP connection to a Bridge to access an Ethernet network somewhere.
Setting up a Tunnel is usually done to access devices which themselves either can't talk Internet Protocol, can talk Internet Protocol but don't have appropriate access controls (i.e. you can't set a password on it), or it can talk Internet Protocol but it is inside a firewalled network where the administrator does not want to set up access rules individually for every device on the network. Certainly, having a single password on the tunnel is much easier to administrate than trying to put passwords and firewall rules on every single individual device.
The model of giving every individual device/service its own credential system actually caught on during the COVID pandemic when they sent everyone to work from home. It had existed before in different forms, like requiring everyone to enter a username and password individually on each service, but it was a real PITA for users to have to keep signing in every time. Instead, a new management technique called "SSO" Single Sign-On was deployed for a ton of people, where they only have to sign into their company portal one time and then that authentication token becomes valid for all the other devices/services that they want to connect to so they don't have to sign in a second time.
For example, let's say I want to print to your printer from this Internet Cafe I'm sitting at. You originally secured your printer by adding a firewall rule that rejected everyone's IP address except for mine. However, because I'm at the Internet Cafe, my IP address is different. My laptop has connected to this public Wi-Fi and so my laptop has an IP address belonging to the Cafe now instead of an IP address belonging to me/my home network. When I go to print my sensitive document pages, I get a "connection request rejected" error from your firewall. Your printer doesn't support putting a password on the print queue, so you don't want to open up the firewall to let just anybody print. Instead, you install a Tunneling server on your PC, put a password on that, and give me the password.
Now if I want to print, I can connect by TCP to the tunnel, fill in the password you gave, and then use that link to connect to your Ethernet network like I was really there. My laptop will get a new IP address on a new "tunnel interface" which shows up in the network settings the same way its Wi-Fi adapter and Ethernet jack would look. Then I can use that interface to send an Ethernet frame across the tunnel and the bridge to the inside of your network. After the connection has been established and the passwords given and all that, the Ethernet frame could look like this:
- Ethernet Synchronization
- Ethernet message begin
- Ethernet Source: C0:FF:EE:C0:FF:EE (g's Wi-Fi interface MAC)
- Ethernet Destination: Whatever the coffee shop's internet gateway's MAC is.
- Ethernet Data Length
- IP Source: 38.62.12.5 (g's coffee shop IP address)
- IP Destination: 15.23.44.1 (Alisa's PC IP address)
- Protocol Type: TCP
- Source Port: Random
- Destination Port: 8000 (whatever Alisa's tunnel program listens on)
- IP Data Length
- IP Data Hash
  - Ethernet Synchronization
  - Ethernet message begin
  - Ethernet Source: BE:EF:BE:EF:BE:EF (g's laptop tunnel interface MAC)
  - Ethernet Destination: 12:34:56:AB:CD:EF (Alisa's printer)
  - Ethernet Data Length
  - IP Source Address: 15.23.44.3 (g's laptop tunnel interface)
  - IP Destination Address: 15.23.44.2 (Alisa's printer)
  - Protocol Type: TCP
  - Source Port: Random
  - Destination Port: 631 (well-known IPP port)
  - IP Data Length
  - IP Data Hash
  - IPP Data containing sensitive document pages
  - Ethernet frame data hash
- Ethernet frame data hash
The tunneling server running on Alisa's PC takes the tunneling IP packet and un-packages the inner Ethernet frame then bridges that frame over to the Ethernet interface (pushing the raw data onto the Ethernet network without modifying it).
Similarly, if Alisa's PC receives an Ethernet frame sent back by the printer, it will package that frame up as a TCP response packet back to g's PC's address at the Coffee Shop.
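The packaging and un-packaging the tunnel does can be sketched like this. Several things here are my assumptions rather than anything a particular tunnel program does: the 2-byte length prefix used to mark where each frame ends inside the stream, the helper names, and the use of socket.socketpair() as a local stand-in for the TCP connection between the laptop and Alisa's PC. The synchronization bits and the frame hash are left out for brevity.

```python
import socket
import struct

def mac_bytes(mac: str) -> bytes:
    """Turn 'BE:EF:BE:EF:BE:EF' into the 6 raw bytes that go in the frame."""
    return bytes(int(part, 16) for part in mac.split(":"))

def build_frame(dst_mac: str, src_mac: str, payload: bytes) -> bytes:
    """Destination MAC, source MAC, EtherType (0x0800 = IPv4), then the payload."""
    return mac_bytes(dst_mac) + mac_bytes(src_mac) + struct.pack("!H", 0x0800) + payload

def send_frame(stream: socket.socket, frame: bytes) -> None:
    """Laptop side: write the whole frame into the stream as ordinary data."""
    stream.sendall(struct.pack("!H", len(frame)) + frame)

def recv_exact(stream: socket.socket, count: int) -> bytes:
    """Streams can deliver data in pieces, so keep reading until we have it all."""
    data = b""
    while len(data) < count:
        chunk = stream.recv(count - len(data))
        if not chunk:
            raise ConnectionError("tunnel closed mid-frame")
        data += chunk
    return data

def recv_frame(stream: socket.socket) -> bytes:
    """Tunnel server side: un-package one frame, ready to bridge onto Ethernet."""
    (length,) = struct.unpack("!H", recv_exact(stream, 2))
    return recv_exact(stream, length)

laptop_end, server_end = socket.socketpair()   # stand-in for the TCP connection
inner = build_frame("12:34:56:AB:CD:EF", "BE:EF:BE:EF:BE:EF", b"IP packet with IPP data")
send_frame(laptop_end, inner)
bridged = recv_frame(server_end)
print(bridged == inner, bridged[:6].hex(":"))

laptop_end.close()
server_end.close()
```

The point is that the tunnel never interprets the inner frame. It arrives at the far end byte-for-byte identical, which is what lets the bridge push it onto the Ethernet network unmodified.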
Ports
A port is really just a number between 0 and 65535, used to allow multiple services to be served by a single IP address / multiple simultaneous connections to be made by a single client.
Some of the numbers have "well known" meanings, like everyone agrees 80 is for HTTP traffic, 443 is for HTTPS, 631 for IPP, 22 for SSH, 23 for Telnet, 25565 for Minecraft, etc.
Any service can be configured to communicate on any port. Those are just the defaults when you do not specify one.
When a client makes a request to a server, it sets a random source port so that when the server replies that reply is tagged with the unique response port and can be identified
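You can watch the OS hand out that source port with a couple of sockets on the loopback address. In this sketch the server also asks for "any free port" by binding to port 0, only so the example doesn't collide with a real service; a real server would bind its well-known number instead.

```python
import socket

# Server side: listen on some port of 127.0.0.1.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))            # port 0 asks the OS for any free port
server.listen(1)
server_port = server.getsockname()[1]

# Client side: we only say where we're going. The OS picks our source port.
client = socket.create_connection(("127.0.0.1", server_port))
client_port = client.getsockname()[1]

# The server sees the client's address AND source port, so replies can find their way back.
connection, peer = server.accept()
print("server port:", server_port)
print("client source port:", client_port)
print("server sees peer:", peer)

connection.close()
client.close()
server.close()
```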
And to be clear, "port" is meant in the same sense as an airport or a sea port. It identifies a destination for traffic
Granted, in the analogy of an airport it's more like a terminal / gate number than the whole port. Or a specific dock at a sea port. But I think the consideration is that if your computer was a country, it could service traffic at multiple ports simultaneously, where that traffic comes from or goes to any other location
Of course, "simultaneous" is in the sense of whole streams. Because Ethernet is a serial interface, your PC only receives individual packets one at a time. Really, really fast
I guess if you consider the computer as the country, the TCP/IP ports as airports, that would make individual terminals as socket descriptors, and the gates would be threads
Secure Tunneling and VPNs
Tunneling on its own is fine. But what really gets people excited about tunneling isn't sending Ethernet frames over TCP packets, it's sending Ethernet frames or IP packets over encrypted transports like TLS. That is, "secure tunnels"
Regular tunneling can help you put things like access controls onto a system that does not normally support them. Secure tunneling takes it a step forward, allowing you to put encryption on a system that does not normally support it.
Imagine the same tunneling setup that let you password-protect your printer, because your printer does not support setting a password on it. Your particular printer also does not support IPPS (Internet Printing Protocol Secure) because in all honesty, basically no consumer printers actually support IPPS. If I want to send my sensitive document over the regular tunnel to your printer using IPP, both the password for the tunnel and the sensitive document page get sent in a way that can be intercepted.
Using a Secure Tunnel instead of a regular tunnel allows the password to be encrypted, and the sensitive document page to be encrypted. A bad actor at the coffee shop won't be able to see either, even though I'm still technically printing with the insecure IPP protocol.
This idea of a Secure Tunnel forms the "Privacy" part of a VPN (Virtual Private Network)
The other half of a VPN, the "Virtual Network", isn't all that complicated. When I connected by tunnel to Alisa's PC in order to bridge onto your Ethernet network, that Ethernet network was a physical network with a physical Ethernet switch and other physically present Ethernet devices to talk to. On a virtual network, there is no real Ethernet network to bridge onto. Computers typically still send Ethernet frames on these virtual networks, but those frames are managed entirely by software and never make it to a real physical Ethernet network.
Traditionally, the VPN server is a program running on a company's Router. Clients will connect to the VPN by secure tunnel, and get attached to a virtual Ethernet switch. Then they can directly access other devices that are also connected to the VPN's virtual Ethernet network, or they can send IP packets that the router can route over to the Internet interface or the regular physical network interface.
So imagine you're on the Internet somewhere and you want to print something at the company you work at. You would make a secure tunnel to the VPN server running on your corporation's router, and you'd provide some ethernet MAC to talk on that network with. You'd be virtually attached to that Virtual Ethernet switch, meaning that when the VPN server decodes your encrypted packet and finds an ethernet frame, it forwards that ethernet frame to the virtual ethernet switch code.
The VPN Interface has an IP address (10.0.0.1) so it can be used as an Internet Protocol gateway, so you send your virtual Ethernet frame with your virtual MAC, your virtual IP address (let's say, 10.0.0.2), then the destination virtual MAC (CC:CC:CC:CC:CC:CC, discovered by ARPing the VPN Interface's IP address 10.0.0.1) and the destination IP address (15.23.44.2).
This corporate router would take the tunnel connection and have it get processed by the VPN server software. Then the VPN server software would decrypt it, and forward the virtual Ethernet frame to the virtual Ethernet switch. Then the virtual Ethernet switch would forward the frame to the virtual VPN interface. The Ethernet frame goes away, and the VPN Interface routes the IP packet by determining that the LAN Interface is the one that handles 15.23.44.* IP addresses, so the VPN Interface code hands the IP packet to the LAN Interface code. Then the LAN Interface code encodes a new Ethernet frame for the physical Ethernet network where the destination MAC is 12:34:56:AB:CD:EF and the source is AA:AA:AA:AA:AA:AA.
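The "which interface handles this address" decision is a lookup against a routing table, picking the most specific network that contains the destination. A minimal sketch with Python's ipaddress module; the table is hypothetical, with 15.23.44.* written as 15.23.44.0/24, and the size of the VPN network (10.0.0.0/24) is my assumption.

```python
import ipaddress

# Hypothetical routing table for the corporate router: (network, interface name).
ROUTES = [
    (ipaddress.ip_network("10.0.0.0/24"), "VPN Interface"),
    (ipaddress.ip_network("15.23.44.0/24"), "LAN Interface"),
    (ipaddress.ip_network("0.0.0.0/0"), "Internet Interface"),   # default route
]

def pick_interface(destination: str) -> str:
    """Return the interface whose network most specifically matches the destination."""
    address = ipaddress.ip_address(destination)
    matches = [(network, name) for network, name in ROUTES if address in network]
    # The longest prefix (largest prefixlen) is the most specific match.
    return max(matches, key=lambda match: match[0].prefixlen)[1]

print(pick_interface("15.23.44.2"))      # the printer
print(pick_interface("10.0.0.2"))        # another VPN client
print(pick_interface("93.184.216.34"))   # anything else goes out to the Internet
```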
That set of three "IP Packet Link" arrows is commonly implemented by the Linux networking stack. When you create iptables rules, you're actually writing the rules of how IP Packets sent through those buffers should be managed.
The "Virtual VPN Interface", on Linux, is generally given interface identifier tunN (like tun0 for the first one configured on a system), so-called "tun" because it's implemented as a tunnel. In the diagram, the "VPN Server Software" and "Virtual Interface" are separate boxes, but really both of those and the virtual ethernet switch are typically implemented by a single software package.
Here you can see I have lo which is a virtual interface to handle traffic on the 127.* network, wlp1s0 which is my physical wifi interface, tailscale0 which is a virtual interface for my Tailscale VPN, lxcbr0 which is apparently for talking to lxc virtual networks, and tun0 which is my OpenVPN virtual interface
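If you'd rather get that list from a program, Python's socket module can ask the kernel for the interface names. The names it prints depend entirely on the machine it runs on, so yours will differ from mine.

```python
import socket

# Each entry is (interface index, interface name), e.g. (1, "lo").
interfaces = socket.if_nameindex()
for index, name in interfaces:
    print(index, name)
```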
That brings us to the VPN services that get advertised on YouTube. These VPN services run servers like the corporate router pictured in the diagram, but generally they don't have any LAN interface. When you use something like NordVPN to secure "up to 6 devices" or whatever, it means you can open 6 tunnels at once to their VPN Server software to attach those devices to the virtual Ethernet switch. The devices connected to NordVPN can talk to each other (if the Virtual Ethernet Switch code allows it) as if they were connected to a physical ethernet network.
That's not the selling point of these services though. The selling point is that because this virtual Ethernet network has a routing interface with routes to the actual Internet, by connecting the tunnel into this virtual Ethernet network you can connect your computer to the Internet but from somewhere else. This achieves a similar effect to a Web Proxy even though the mechanism is so different. When you use the Internet over that virtual Ethernet connection, over the secure tunnel, once your traffic emerges on the Internet it appears to have come from wherever NordVPN's data center is. Your computer gets an IP address assignment on its tun0 interface of an IP address which belongs to NordVPN. Your traffic no longer appears to be coming from your genuine physical location or your genuine location in IP address space. Assuming you successfully made the initial tunnel connection, you don't have to worry about any weird firewalls blocking access to things (like China's "great firewall"), and because the tunnel is secured you don't have to worry about the people in the internet cafe spying on you, or your ISP spying on you.
That said, you do still have to worry about:
- The VPN service provider spying on you
- The VPN service provider's Internet provider spying on you (if they have one; good VPN service providers are their own ISPs)
- All the routers in the mesh between the VPN provider and the destination of your requests
- Including any governmental interference
The "governmental interference" part is where we start to circle back to why people use VPNs for tasks like media piracy. Under the standard connection model where you can pirate a movie, those piracy requests are tagged with your IP Address which your ISP knows is being used by you, but the media companies can only see that it belongs to the ISP. Government interference (a "court order") means the ISP can be forced to tell the media company that it was you who was using that address at the time the address made those piracy requests.
These VPN providers operate their servers in countries where they don't have to listen to the government of your own country. So, if the Warner Brothers media group gets the United States courts to issue an order to the Lithuanian NordVPN company that NordVPN must cooperate in catching the evil movie pirates using NordVPN IP addresses, NordVPN and the Lithuanian government pretty much tell the American courts to pound sand.
These companies usually do everything they can to not even keep any sort of connection or access logs, so that if they ever were compelled to provide information, they would not have any to provide. But then also some VPN providers claim they do this but in reality they do not, so it's important to contract with a reputable VPN company if you genuinely want to avoid spying.