is it web scale?



Kurt Neufeld


What we'll cover

  • how do computers even work?
  • what does async even mean?
  • does it make everything faster?

How do computers work?

Never mind Python for a moment: how do computers do two things at once?

Threads vs Cores

Back in the olden times, any computer you could afford had only one CPU, and that CPU had only one core.

But even back in Windows 3.11 days, you could run more than one program at a time. How was such black magic possible?

Threads vs Cores

Well, each running program is really called a process, and each process has at least one thread. It's this thread that actually runs on the CPU. "Thread" is short for thread of execution.

The OS is in charge of scheduling each thread. But how can a single core run the OS and multiple threads?

Threads vs Cores

The OS has hardware support: it sets a timer, and when the timer fires, an interrupt forces the CPU back into the OS so it can run again. This was a major difference between the 286 and the 386 back in the mid-'80s.

So basically, the OS just schedules threads to run. Those threads may belong to different processes, or to one process with lots of threads; it doesn't really matter.

Let's talk about steak for a minute.

Threads vs Cores

So with one core, only one thread is ever running at a time. With multiple cores, multiple threads can run at the same moment; this is called parallelism.

Switching from one thread to another is called a context switch. Rapidly alternating between threads so that they all make progress is called concurrency.
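The concurrency half of that can be seen with Python's threading module. A minimal sketch (the names worker and results are just for illustration): three threads all make progress, with the OS deciding who runs when.

```python
import threading

def worker(name, results):
    # Each thread runs this function; the OS schedules when.
    results.append(name)

results = []
threads = [threading.Thread(target=worker, args=(f"t{i}", results))
           for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait for all three to finish

print(sorted(results))  # ['t0', 't1', 't2'] -- all three ran, in OS-decided order
```

On a multi-core machine these threads may run in parallel; on one core they are merely interleaved, but the program looks the same either way.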

Speed of Light

System Event            Latency    Scaled
One CPU cycle           0.4 ns     1 s
Level 1 cache access    0.9 ns     2 s
Memory access           ~100 ns    4 min

Speed of Light

System Event     Latency      Scaled
SSD              50–150 μs    1.5–4 days
Spinning Rust    1–10 ms      1–9 months
SFO to NYC       65 ms        5 years
SFO to HK        141 ms       11 years

Speed of Light

As interesting as that was, the reason for it was...

System Event      Latency    Scaled
Context switch    ~30 μs     ~1 day

Red pill vs Blue pill

Red pill vs Blue pill

Whenever software interacts with the real world (the network, the hard drive, clocks, etc.), the application doesn't actually touch anything itself; it asks the operating system to do it on its behalf.

Your app took the blue pill, happy in its sandbox. The OS is the red pill.

Blocking vs Non-blocking

Calling into the OS is usually like calling any other function: the caller passes control to the callee, and when the callee finishes, the caller resumes.

That's a blocking call and hopefully the concept is familiar.

Calling into the OS is a context switch.

Blocking vs Non-blocking

But what if, instead of blocking and waiting for the OS (for days or years on our scaled clock), we keep running, and the OS just lets us know when it has done that thing for us?

That's a non-blocking call. Async is the name of a pattern that only makes non-blocking calls.
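In Python, asyncio is one way to program in this style. A minimal sketch: await marks the points where we hand control back to the event loop instead of blocking the thread, so two waits can overlap.

```python
import asyncio
import time

async def task(name, delay):
    # await yields control back to the event loop instead of blocking
    await asyncio.sleep(delay)
    return name

async def main():
    start = time.monotonic()
    # Both waits overlap because neither one blocks the thread
    results = await asyncio.gather(task("a", 0.1), task("b", 0.1))
    elapsed = time.monotonic() - start
    return results, elapsed

results, elapsed = asyncio.run(main())
print(results, round(elapsed, 1))  # two 0.1 s waits finish in ~0.1 s, not 0.2 s
```

Had task used the blocking time.sleep instead, the two waits would run back to back and take twice as long.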

Blocking vs Non-blocking

Let's see how far this rabbit hole goes.


So if we want to do lots of things at once, like handling web requests if we're a web server, then we need one thread per request.

That would totally work, but it'd be slow as hell because of all the context switching.

With optimizations, this is more or less what the Apache web server does.


But what if we write our program in such a way as to never do a context switch?

That's the heart of asynchronous programming.

This is essentially what the Nginx web server does, and it can handle hundreds of thousands of connections with a single thread.

What's the catch?

If async is so great why isn't everybody doing it?

Well a lot more people are. However...

  • Super confusing at first. Super confusing the rest of the time.
  • Callback hell.
  • The thread can't actually do any work.

No work?

We haven't talked about implementation yet, but going back to the Nginx example with 100,000 connections, there isn't a lot of time to do anything while you're servicing that many connections.

So if Nginx were to make a blocking call to that totally free service in China, the other 99,999 connections would have to wait until that very long round trip finished.

In an async program you're always racing back to the event loop.
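When a blocking call is truly unavoidable, the usual workaround in Python's asyncio is to push it onto a thread pool so the event loop itself never stops. A sketch, where slow_blocking_call is a made-up stand-in for that long round trip:

```python
import asyncio
import time

def slow_blocking_call():
    # Stand-in for a long round trip we must not make on the event loop
    time.sleep(0.1)
    return "reply"

async def main():
    loop = asyncio.get_running_loop()
    # Hand the blocking call to a thread pool; the event loop stays
    # free to keep servicing the other 99,999 connections.
    return await loop.run_in_executor(None, slow_blocking_call)

reply = asyncio.run(main())
print(reply)  # reply
```

The blocking work still happens, but on another thread, so the async code races back to the event loop immediately.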

Pseudo-Implementation aka Event Loop

      while True:
          tasks = os.wake_me_if_something_happens()  # the only blocking call

          for task in tasks:
              task.do_callback()  # this hogs the cpu

Examples of work: new incoming network connection, network packet to read, network packet successfully sent, timer went off, a file changed
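That pseudocode maps almost line for line onto Python's standard selectors module, which wraps select/epoll/kqueue. A toy version, using a socketpair in place of a real network connection and running a single turn of the loop:

```python
import selectors
import socket

sel = selectors.DefaultSelector()   # epoll/kqueue/select under the hood
a, b = socket.socketpair()          # stand-in for a real network connection
a.setblocking(False)
b.setblocking(False)

received = []

def on_readable(sock):
    # The callback must return quickly -- race back to the loop!
    received.append(sock.recv(1024))

sel.register(b, selectors.EVENT_READ, on_readable)
a.send(b"hello")

# One turn of the event loop: block until something happens,
# then run the callback for each ready event.
for key, _events in sel.select():
    key.data(key.fileobj)

sel.close(); a.close(); b.close()
print(received)  # [b'hello']
```

A real server would wrap that for loop in while True and register a listening socket too, but the shape is the same.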

Event Loop

The heart of any async program is the event loop. Lots of libraries implement an event loop that you then build your app on top of.

  • select/epoll (OS system calls)
  • libev (C)
  • ASIO (CPP)
  • Twisted (Python)
  • asyncio (Python3)
  • node.js (JS)

So is it fast?

Maybe but probably not. It depends.

Since work only ever happens in a thread, and we only have one, surely a program with two threads would be twice as fast? Only if there are two cores to run them on.

And a program with 1000 threads will spend so much time context switching that hardly any work ever gets done.

So is it fast?

If you're doing lots of work (CPU bound), use threads.

If you're doing waiting work (IO bound), use async.


Since Python has the GIL (Global Interpreter Lock), only one thread can execute Python bytecode at a time (even with lots of cores).

That's why Python is "slow" but totally kicks ass as a web server. Being CPU slow doesn't matter when network time completely dwarfs everything else.

Why is it called Async?

Because events can come in at any time, we can't know what order our code will run in. This is the opposite of synchronous code, hence asynchronous.
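A tiny asyncio demo of that unknown ordering: the tasks are started in the order a, b, c, but their code runs in whatever order the events (here, made-up timer delays) fire.

```python
import asyncio

order = []

async def task(name, delay):
    await asyncio.sleep(delay)
    order.append(name)

async def main():
    # Started in the order a, b, c...
    await asyncio.gather(task("a", 0.03), task("b", 0.01), task("c", 0.02))

asyncio.run(main())
print(order)  # ['b', 'c', 'a'] -- completion order, not call order
```

The order here is determined only by the delays; with real network events it would be determined by whenever replies happen to arrive.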


  • async programs run in unknown order
  • your functions don't run faster
  • your functions should never block
  • context switches are very expensive
  • converting a sync program to async is non-trivial
  • cores run threads

The End


Good stuff: YouTube: Node.js Is Bad Ass Rock Star Tech