I am apparently in a relatively small minority of humans who use a camera besides my phone for taking pictures. I do this a lot less than I used to, because the phones have gotten really good. But I still do it.

Most of my time with non-phone digital cameras has been spent using either Nikon DSLRs, which were mostly great but always too big, or Olympus (now “OM Digital Solutions”, but I’ll still use the old brand name) mirrorless cameras, which were great once you went through the pain of setting up the four modes you want to use them in, and also deliciously small by comparison to the Nikons.

Then Nikon got into the mirrorless game, making cameras almost as small as the Olympus, and generally about as great as the DSLRs. At this point I figured my time with Olympus was over, especially since the Olympus company sold their camera business to some nameless private equity firm which brought the brand back as “OM System”, which is a piece of branding no one can love. I concluded that given that Nikon was still making new things, and “OM System” was maybe not, eventually I’d be getting back into the Nikon stuff anyway. So I jumped back in.

The Nikon Z6II is a mirrorless camera body that feels almost exactly like a Nikon DSLR to use (which in turn felt almost exactly like a classic Nikon autofocus film body to use, only better). It can really do almost anything you want. And it unquestionably does two things much better than the Olympus body that I had been using for the last 5 or 6 years:

- The noise and sharpness in low light is a lot better.
- The autofocus system is a lot easier to control.

The tracking AF is what sold me. I don’t actually shoot a lot of moving things, but the convenience of being able to put the focus box on a thing, hit the focus button and then recompose however I want while the camera just stays locked on to the focus spot was just great. In most situations this gets you results that are identical to what you used to get by locking focus and recomposing. But if the thing in the box moved around at all, the camera would generally hold on to focus like magic. I loved this every time I used it.

So I spent a few months happily taking the Nikon and the kit lens (an eminently practical 24-70mm F4 zoom lens) around and plugging away with the focus box. Then I put the camera in a box and barely used it.

Later I started thinking about expanding the lens collection. I like zoom lenses. They are convenient and at this point in my life I am too old and lazy for the hassle of constantly changing single focal length lenses around. I generally want three zoom lenses to carry around, even though I usually only use two of them.

- Very wide to wide.
- Wide to normal.
- Long.

Most of the major camera lines have these lenses. But I hesitated getting the Nikon wide zoom and long zoom. To explain my hesitation, I need to explain some boring technical things.

When camera nerds talk about lenses they tend to refer to the lenses in terms of their focal length. The focal length of the lens is nominally the distance from where the light enters the lens to where it hits the capture plane (the sensor, say). Shorter focal lengths have wider fields of view. Longer ones have narrower fields of view.

The Nikon Z6II has a sensor in it that is the same size as a piece of 35mm film from the old days, so we can use the standard focal length terminology from that era. The ranges of focal lengths for the three zooms listed above would be:

- Very wide to wide: 14-16mm to 24-35mm
- Wide to midrange telephoto: 24-70mm. Or if you are being adventurous 24-120mm.
- Long: 70-200mm or 70-300mm.

The Olympus cameras that I use have a sensor that is smaller than a piece of 35mm film. In fact, it is strategically set up so that if you are using a lens with some given focal length, the field of view you see is the same as using a lens with double that focal length on a 35mm camera. This rule is easier to show with an example than to explain: a 12mm lens becomes 24mm, 50mm becomes 100mm, etc.
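For the nerds who like to see the arithmetic spelled out: the doubling rule is just multiplication by the sensor’s “crop factor” (2.0 for these Olympus bodies, with 35mm film as the 1.0 reference). A trivial sketch; the function name is mine, not any real camera API:

```python
# Equivalent focal length = actual focal length x crop factor.
# The Olympus (Micro Four Thirds) sensor has a crop factor of 2.0;
# a 35mm "full frame" sensor like the Z6II's is the 1.0 reference.
OLYMPUS_CROP = 2.0

def equivalent_focal_length(actual_mm: float, crop: float = OLYMPUS_CROP) -> float:
    """Field-of-view-equivalent focal length on a 35mm camera."""
    return actual_mm * crop

# The examples from the text: 12mm becomes 24mm, 50mm becomes 100mm.
print(equivalent_focal_length(12))  # 24.0
print(equivalent_focal_length(50))  # 100.0
```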

This means that the three lenses I want above are something like this:

- Very wide: 7-14mm (14-28mm equivalent) or 8-25mm (16-50mm equivalent).
- Wide to Midrange: 12-40mm or 12-100mm (24-80mm, or 24-200mm equivalent).
- Long: 40-150mm (80-300mm equivalent).

Because the “actual” focal lengths of the lenses are shorter than what you would build for a 35mm camera, the lenses themselves are inherently smaller for the same range of field of view. Plus, the lenses have to cover a much smaller chip area with a nice image, which lets you make them a lot smaller by volume.

Meanwhile, even though the Nikon mirrorless *body* is in fact a lot smaller than the old DSLRs, the *lenses* you put on them are basically the same size since they have to cover the same size sensor. In fact they are usually even a bit bigger than the old F-mount 35mm format lenses, because the Z-mount has grown in diameter for various technical reasons.

In any case, the result is that the Z lenses are *much* bigger than the Olympus stuff. So in the end the *lenses* become the limiting factor in size, weight, and volume when carrying the camera around.

If you compare the size of camera+lens for the two systems, the universal conclusion is that the Nikon lenses are longer, fatter, and heavier than the Olympus lenses that cover similar equivalent fields of view.

Here is each camera with a midrange zoom attached. The Olympus is a 12-40mm F2.8 (24-80mm equivalent) and the Nikon is the 24-70 F4.

As we observed above, the bodies are not that different in size (although the Olympus is a bit smaller in every dimension, which adds up). But the Olympus lens is a lot smaller than the Nikon even though it’s a stop faster, and this is one of the more compact Nikon lenses.

Next, the Nikon has the same zoom as before, and the Olympus has a zoom lens with twice the range as before (12-100mm F4 (24-200 equivalent)). The total Olympus package is still a lot smaller because the width of the camera body and the diameter of the lens is so much smaller.

Here is each camera wearing the telephoto zoom I’d use on them. The Nikon lens is a 70-300mm F4.5-5.6 F-mount lens with an adapter, and the Olympus lens is a 40-150mm (80-300mm equivalent) constant F4.

The Olympus zoom expands a bit when you actually use it, but is still less than half the size and much lighter. And it’s a stop faster at the long end to boot.

**Unrelated related note:** I also have the 40-150 F2.8 Olympus lens, which is an incredible lens. Even at two stops faster on the long end it’s still a bit smaller than the Nikon. It also has a bad retracting lens hood design that caused me to smash the filter ring on it a couple of years ago and I could not figure out how to get it fixed. This might have made me mad enough to try the Nikon stuff at the time, or it might be completely unrelated. I’m not sure.

At this point the reply-guys in the audience are telling me that real men shoot with prime (single focal length) lenses, and that surely I can find a small Nikon body/lens combination there to make me happy. To which I say … remember how small the *Olympus* primes are? You can’t win there either.

Here is a Nikon 35mm prime on the Z6 (with an adapter, because I don’t have the Z lens, which is a lot bigger than this one). The Olympus E-M5 body is smaller than my other Olympus, but is the one I would use with this lens because the colors match.

Here is a picture I copied from the internet that has the Z version of the lens on some anonymous Z body. Looks about the same size as my monstrosity above.

Conclusion: the prime lenses will lose too.

Of course, size isn’t everything. In theory you give up a lot going to a smaller sensor, especially in terms of sharpness and noise performance in bad light. In addition, Nikon, for all its faults, is pretty good at building autofocus systems and reasonably straightforward interfaces to run them. The tracking focus system that I described above is great. Olympus has never really been able to keep up. And, there is the overwhelming possibility that by the time I post this page, “OM System” will announce that it is disappearing into the trash bin of camera brand history.

And yet I dithered. I spent almost two years using the Nikon, and never felt like I wanted to keep going with it. It turned out that if you examine the technical advantages the Nikon allegedly has a bit more carefully, you can kind of tell they are just the kinds of ghosts that nerds like me chase in order to spend money.

Yes, the autofocus is great. But, I am *really bad* at actually taking advantage of the things that it is great at. Shooting good in-focus pictures of things that are in motion is a skill that is *not* easy to learn and certainly not easy to *keep* if you don’t practice a lot. I mostly shoot pictures of stuff that is either standing still or is close to it. So I never practice that stuff. So even if the camera were *perfect* I’d still fuck everything up because the framing, or the timing, or something else would be wrong.

Also yes, the bigger sensor is better. Especially in low light. But honestly I take most bad light pictures with my phone now. It’s easier.

So in the end it turned out that size actually *is* everything. The Nikon Z lenses are, simply put, really large. Even the prime lenses are big. And the wide/tele zoom lenses that I would have wanted to get are super large. I knew I would want to use them. But I also knew I would never want to carry them. So for almost two years I sat around paralyzed. I really liked the camera, but could not get myself to buy the lenses.

And finally, to end the story … during this time that I was dithering Olympus, sorry “OM Digital Solutions”, released not only a new body that is a bit better at the things my current body is bad at, but also a set of newer and even more deliciously small lenses: the small 40-150mm (80-300mm equivalent) telephoto zoom in the comparison photo above, and a really useful super wide to normal 8-25mm (16-50mm equivalent) zoom. So I picked up the lenses on sale, and will keep them in my bag for those situations where the 12-100mm won’t do the job.

Inevitably I’ll probably pick up the new OM body, even though it’s not *that* much better at this stuff, and the UI is still a trash fire. Hopefully the OM system keeps its head above water for as long as I need them to. Or at least long enough to sell me the one body that I’ll use until Apple figures out how to fold a 200mm lens into an iPhone. At that point I’ll well and truly give up on carrying cameras around for good.

**Late Appendix:**

Aside from the size and handling, in retrospect another reason I went back to the Olympus cameras is that with the Z bodies Nikon decided to standardize on *yet another* useless and idiotic card storage format, which makes me cart around *yet another* card reader instead of using the SD slot that’s built into my computer. The only saving grace is that the Z6II has a single stupid card slot plus a spare SD card slot for making backup copies, so I ended up using the “spare” slot all the time and completely ignored the expensive main card slot. I’d have rather paid $200 less for a camera that had two “inferior” SD card slots instead.

And yes, we are at a point in the camera industry where the storage cards that a body uses are probably the most interesting distinguishing characteristic between different brands. If I try full frame again I’ll probably defect to Canon just because they don’t use the stupid cards.

In Pittsburgh, *Chicken Latino* is a long-time favorite Peruvian style roast chicken joint that also serves a variety of other kinds of things, all in portions that are too large.

*Chicken Latino* is also, paradoxically, the home of the cheeseburger in Pittsburgh which is probably second on my list by overall objective “quality” but first on my list of emotional favorites. In these times when people can be remarkably pretentious and self-centered about getting burgers and fries made from only ingredients of the highest quality and correct origins, *Chicken Latino* takes frozen patties and “steak fries” that just fell off the Sysco truck and turns them into a burger and fries better than almost any other in town. The fries are the most puzzling part of this equation. They really should not be good, but they really are better than almost any other fries in this city where almost no one seems to know how to make fries.

Anyway, Latino has been open for about 15 years, but it wasn’t until about five years ago (I think) that they started serving a dream dish for anyone who loves Chinese food, Peruvian food, double starch, and Chinese/Peruvian/Pittsburgh fries fusion cuisine. I refer, of course, to *Lomo Saltado* which is basically a sort of beef stir fry served on top of yellow rice and french fries. Brilliant.

Here is what it looks like:

Who can’t love this?

If I’m honest, the beef stir fry part of this dish is its weakest aspect by far … but the fries, and most importantly, the rice mostly make up for it.

Of course, even something as perfect as this combination can be ruined if you try hard enough. I bring this up because at some point last year I was excited to finally sit down in a new place in town that a lot of local foodies appear to like that had an upscale version of this dish on their menu. So of course I ordered it.

In this implementation, the meat was a lot better than the cheaper cut you get at Latino, but overall the dish was bad. The dish was bad for two reasons. The first was that the fries were bad.

I will not rant here about the bad fries. Bad fries are just a fact of American life, I think. Whereas in (say) France there are places whose entire existence is dedicated to serving nothing but steak with *perfect* french fries, even fancy places run by fancy chefs in the U.S. will serve you sub-par french fries on a routine basis. This is kind of unforgivable, but I guess fries are also technically a tiny bit demanding to do well on a large scale. But really it’s still unforgivable.

The second thing that ruined the expensive plate was that the rice was unforgivably bad. *This* I will rant about. It tasted like day old rice that you have left partially uncovered in the fridge and then reheated in the microwave for about 30 seconds while forgetting to add a small dribble of water. You bite into it and the kernels break off in your mouth in a mealy semi-crunchy and tasteless mess. But, you are either too lazy, too tired, too hungover, or too hungry to fix it now, and just dump your food on top hoping the sauce from the food will finish the job of bringing the rice back to life. But it does not.

All this for $29 a plate.

This is, of course, not the first time I have gotten unforgivably bad rice in a restaurant. I sent the rice *back* at a fancy French place in Paris once, and they sent me back a perfectly great risotto, which for some reason they could make better than plain white rice. I also got the single worst bowl of white rice that I ever paid money for in my life at a fancy special dinner at a long standing and well loved local Asian fusion joint (it was Soba) in Pittsburgh. That rice was like the fridge rice above, except they hadn’t even really tried to reheat it, I think. This happened a long time ago, and people say I should be over it by now. But I’m not.

I am here to say that it does not have to be this way. Unlike fries, *rice* is completely trivial to cook well. Here is what you do:

- Buy a fucking rice cooker.
- Cook perfect rice every single time.

The cooker will even keep the perfect rice *warm* and perfect for the entire restaurant service. I doubt that there is any single food product that requires fewer brain cells to do well than perfect white rice in a rice cooker. And yet the evidence before us is that people care so little about rice that they won’t even do the bare minimum amount of work needed to make it decent.

At this point in the article at least 15 reply-boys (and girls) will stand up and declare that no functioning human being should need a *dedicated kitchen appliance* taking up their precious counter space for the sole purpose of making sure that the rice is good every time. I am here to say that these people are wrong, because their framing of the question is wrong. The question is not “can I cook OK rice on the stove (or more recently, in the microwave)?”. The question is: “can I push a single button and get perfect rice of any kind any time I want without looking at the cooker again until it’s done?” … and then also keep it warm and perfect for between 8 and 24 hours afterwards.

The second thing is what rice cookers do, and if rice is important enough to you that it will be the main starch in more than 15 out of 35 meals every week, then you will do the right thing and just buy the machine.

But this attitude about rice is rare here, which is why rice always sucks in the U.S. In the U.S. (and most of Europe, really) rice is at best a second tier auxiliary starch that is only used once in a while. In baseball terms, it’s not “an every day player”. So no one actually cares if it’s good or not.

Let us contrast this situation with Japan (and most other places in Asia). In Japan you can walk into *any* 7-11 store *anywhere* in the country and walk over to a cooler with *pre-made* sushi things in it and you will get better rice, even though it’s cold, than what is served in about 99% of all places that serve rice in the West. In particular the rice wrapped in sweet tofu skin is *always great*, from the middle of Tokyo to any random small town with a train station 7-11 no matter how few people live there. This is because they have a rice cooker and they give a shit.

I have often joked that one of my food dreams would be for a single Japanese 7-11 to open within a reasonable driving distance from my home. Not only would this greatly improve upon the quality of snack foods available in the area, it would also instantly become the best sushi restaurant in town and the best cheap East Asian fast food in town. But really this dream is more about just having a place somewhere that cares about the rice as much as you are supposed to.

The rice is important. In sushi, it’s as important as the fish (remember: sushi means rice). In Chinese food it’s as important as all the other dishes on the table, because all of them are improved when put on top of rice. Rice should not be an overlooked side dish that is little more than some extra food cost. We need for it to be elevated to the same level as potatoes, pasta, the fancy sourdough bread, and all those other hideous whole grain products that unlike rice don’t really taste that good. It should be a whole menu unto itself.

The rice is important.

- Chicken Latino.
- Rose Tea in Oakland.
- Cafe 33 in Squirrel Hill.
- Chengdu Gourmet/Chengdu 2 (could be better, but not bad).
- Mola. Decent sushi rice. Maybe as good as Chaya was, which was the standard back in the day.
- Penn Avenue Fish Company. Their rice is good enough for the rice bowls and stuff but the sushi rice is sub par.
- Salems.
- Turkish Kebab House.
- Most of the Indian restaurants, although Indian rice is a different aesthetic than East Asian rice (it’s not sticky, what the hell?).

Well, this aged well.

The last few years have been mostly down for the systems that we call “social media.” Once these systems were thought to be the pinnacle of the late-capitalist engine for consumer surveillance and arbitrarily profitable advertising. Then people *finally* seemed to be having second thoughts about using a system that records and broadcasts their every thought and action to the entire Internet at once, all to sell clicks and ads for someone else’s profit. They didn’t think *too hard* about not doing this anymore, but the thought did cross through the collective consciousness, if only for a microsecond. Then the kid pictures continued to go up and the “day in the life” videos continued to be posted. Ah well.

Then in a fun twist, a self-centered billionaire narcissist dipshit asshole made a joke about buying twitter for 44 *billion* dollars and then it turned out that, joke’s on him, he had actually made an offer that he could not back out of. And that went really well.

So now we are in a situation summed up by this message:

This same sentiment is true for most of the major social platforms. Certainly the “big three” (Facebook, Instagram, and Twitter) are now mostly brands shilling brands mixed with ads for other brands, and then if you are really lucky there will be a message once in a while written by someone you actually told the system you wanted to follow. Tiktok is also like this, but there the difference is most of the stuff is still fun.

I am not too sad about these things. I think these systems were bad. They were a bad idea. They were badly designed, badly architected, badly implemented, and badly managed. If in fact we somehow manage to make them die, the world will be better without them.

Consider that the pitch for these systems is as follows:

- Get a lot of people to sign up to provide you with all of your content *for free*.
- When one out of every several million “creators” gets popular, give them channels to make “deals” with “brands” so they can get paid a pittance, or maybe, if they are really lucky, find a real job in some other part of the industry where they don’t have to crank out a few minutes of content every single day to feed the beast.
- Convince your real customers to pay you money for ads to feed to the people while they scroll the free content.
- Implement all of this with some of the worst UI ever conceived by man.
- Wrap it in a gift box labeled “public town square” or “important intellectual engagement” to make people think they are doing something more than just scroll cat videos, sports highlights, and soft core porn.

It has always been puzzling to me that this pitch worked so well that no one can apparently do any kind of marketing without it. How did we go so wrong?

Now I’m going to pick on twitter some more, because it’s just that easy. Twitter, to me, reached its peak sometime between 2010 and 2015 when it was good for one single thing: watching online commentary on a live event that everyone you follow on twitter was watching at the same time. For me this was NFL football games and sometimes the NBA playoffs. I imagine for a lot of the rest of the world the coverage of European and World soccer had a similar feel. For the non-sports nerds maybe it was TV shows, although no one watches those “live” anymore.

What made this fun was:

- Fun interesting people were commenting on what you just saw.
- You could throw your dumb thoughts into the firehose and feel like someone might be seeing them.

That’s it. That’s what twitter was good for.

Twitter is bad at literally everything else. It’s not a good chat system (all the threads are upside down). It’s not a good way to manage your content streams (the single timeline makes it too easy to miss things). It’s not good for posting anything longer than a text message, and even if it were it would be a terrible place to read such things. Finally, its implicit broadcast structure makes it too easy, in fact almost inevitable, for any random post of yours to get seen by too many of the wrong people, with the result being that your account is irreparably destroyed by spam from every single asshole on the Internet.

So yeah, twitter sucked in all ways. But at least it was good at that one thing.

But now it’s not even good at that one thing. With the Chief Dipshit Officer in charge all your journalist “friends” (or their bosses) who used to watch football with you have finally noticed that it’s not good for *them* to be broadcasting their thoughts in this way so that entire river of takes has dried up. So now twitter is just bad.

But you might have heard of a new kind of system named after an elephant that the nerds have been toiling over for the last decade while everyone else mostly either ignored them or just pointed and laughed. Yes indeed this system exists, but don’t get too excited about the hype. They did fix one or two bad things about twitter. Let’s see if you can guess which ones.

- No, the threads are still upside down.
- No, the reading interface is still terrible.
- No, it’s still all timelines, so everything you might have wanted to see just gets washed away if you miss it.
- No, the UI for “conversations” there is still mostly the same level of painful.
- Yeah, they got rid of search, which you never once used on twitter ever.

The one thing they did do was break up the back end into a lot of separate servers run by independent individuals. So instead of broadcasting your precious thoughts to every asshole on the Internet everywhere, all you do now is broadcast them to *just the assholes on the server you picked to join*. To get more assholes to see it, users from the other servers have to follow you to open up a tube to get your posts from one server to the other.

This, the nerds say, will fix everything, and create a system that can truly fulfill the *great potential* for … something … that systems like … checks notes … twitter … could have reached.

I’m here to say that this is not true. The best case for these systems is the following:

- A bit less of the asshole wave from twitter. But if your account is popular, not that much less.
- None of the fun things, because the big media platforms have realized that social media is a dead end, and social media with deliberately limited audience reach is even more of a dead end.

So, no more watching TV with the whole Internet for you. Instead all that they have built is, best case, a local chat room with one of the worst interfaces for a chat room that you can possibly imagine building.

Here is the thing. And please believe me when I say I am 100% serious about this take:

- USENET was the first federated social network. And it was mostly better than what we have now.

That’s really all you need to know about this.

The most amazing thing about USENET is that even with only ASCII terminals to work with the news *reading* interface *is still better* than every Internet forum and shared media site that I have ever used. If you think about what devices then could do (ASCII *only*) versus what a phone can do now, the fact that 40 years later everything is *still* worse is a pretty damning condemnation of the intellectual capacity of the human race.

Anyway. My only deep thought about why social media systems suck is essentially the same thought as in my original piece. You can’t talk to the whole Internet at once and have a good time. It’s just not possible. So whatever interface we build for this is going to have to be cut up into smaller pieces organized by common interest, much like the old Internet Forums and USENET newsgroups (and these days, slack servers, and discords run by brands). Yes yes, this is what “federation” does, in theory. But again, this is the most obvious fact in the entire world, since even I thought it up.

It remains to be seen if humanity can come up with a better way to interact with itself online. Obviously I am skeptical that it can be done. But I’m not the one to build it anyway. I post words on the Internet that you can’t even write comments on, because comments are stupid.

A few years ago at “peak pandemic” I wrote a short blurb about fried rice where, among other things, I praised the tireless work that Uncle Roger had been doing to defend this staple of East/South Asian cuisine against a seemingly never-ending onslaught of stupidity, overthinking and general cluelessness.

Sadly, his work has not been enough. Even now in 2023 we who love this dish are still under a constant barrage of bad fried rice recipes. So I felt I had to act. Here I will repeat my long standing simple fried rice recipe, but with a few refinements to reflect the added insight about the dish that I have gained since I wrote that down more than 15 years ago. In addition I’ll provide some reference links to other places to look for good fried rice advice. With all of this material in hand you can now safely ignore any new suggestions for how to make fried rice coming from the mainstream food media of the damned (New York Times, Bon Appetit, etc) and just bookmark this page instead.

Fried rice is easy. Don’t overthink it. Almost every fried rice recipe that is posted on the internet is

- Too complicated.
- Uses too many different ingredients of different types.
- Uses too much of each one.

You do not have a gigantic 16-20 inch restaurant wok sitting on top of a rocket engine burner. You have (maybe) a 12 inch skillet or (if you have been listening to me) a 12 inch non-stick wok. Maybe you have a 14 inch wok. Great for you. They still tell you to put too much shit in that pan. This means

- You can’t move the food around without it falling out of the pan.
- The food does not cook.
- The food does not mix into a pleasurable arrangement of flavors, because you can’t mix it, because it keeps falling out of the pan.

A recent recipe in the NYT told you to pile *all* of the following material into your poor 12 inch skillet:

- 1 *entire pound* of ground meat.
- 4 cups of cooked rice.
- 2 eggs.
- Frozen peas, maybe?
- Something else I forgot.

And this is for “four servings”. NYT servings are *huge*.

Anyway, you are now doomed. It doesn’t matter what else you do. You will have a mess.

So, on this page we are going to start with the easiest recipe, with a volume of food that is manageable. Then I’ll tell you some cool variations, including one of the best fried rices ever that was published in the New York Times, of all places, more than 10 years ago. They have this recipe in the bag and still trot out all kinds of terrible bullshit anyway. I don’t get it. Anyway, here we go.

In its simplest form all you need for fried rice is this:

- Some day old cooked white rice. Let’s say about 3 cups. *Rule number one* is that this rice has to be pre-cooked and preferably at least a day old. You can’t make fried rice with freshly cooked rice. It has to at least be cold. Short or medium grain is best. Jasmine can work. If you are thinking about brown rice just stop reading and go away now.
- An egg or two.
- Some kind of onion/garlic/ginger or other aromatic. Let’s say we have 3 or 4 scallions.
- Salt, a little soy sauce, white pepper, MSG if you want to get fancy and dark soy sauce if you really want to get fancy.

Here is what you do.

First, dice the scallion into little pieces and put them in a bowl. If you are using garlic and ginger, mince that stuff too.

Second, heat your pan on medium-high heat. When it’s hot, add a teaspoon or two of oil (or if you want to live large, use lard) and crack the two eggs into the pan. Stir them around until they are 1/2 cooked.

Now add your onion/ginger/garlic. Mix.

Now add your rice, break it up into little pieces, and mix it up with everything else. Work really hard at this: you don’t want any big lumps of rice, but rather all separate kernels.

When the rice is good and broken up add salt to taste, a few sprinkles of white pepper and the MSG if you have it. Then toss a 1/2 to 1 teaspoon of soy around the side of the pan and mix that in. If you have dark soy add a tiny bit to get a deeper brown color.

Mix mix mix mix mix until it looks like fried rice.

You are done.

It will look like this

OK. The first variation is to add meat. Whatever you pick, you don’t need much. 3 or 4oz is usually enough. If you want something really meaty, you could go up to 6oz or maybe half a pound. You can add a lot of different kinds of meat:

- Ground pork/beef/lamb/whatever. If you hate yourself go all out with ground chicken/turkey. I’d rather not.
- Bacon.
- Hot Italian sausage.
- Chorizo.
- That sweet Chinese/Taiwanese cured sausage.
- Kielbasa or other European cured sausage.
- Hot dogs.

The game here is always the same. First fry/saute/brown off the meat so it’s completely cooked. Put it into a bowl. Then do the same thing as we did above, and at some point mix in the meat.

Next, do everything we just did. But at the end mix in frozen peas (maybe even peas and carrots). Classic Chinese American staple:

Next, we can do more interesting vegetables than just the scallions above. Shred up any sort of green veg. that cooks fast:

- Cabbage
- Chinese cabbage
- Leeks
- Bok choy
- Whatever

Saute the vegetable in the pan first, like you did with the meat. When the vegetable is done, do the whole egg fried rice thing above just piling everything on top of it. It will be great. Here we have put all these ideas together for Chinese sausage and cabbage fried rice, with a fried egg on top and chili crisp:

Here is another example with kielbasa and cabbage in it:

And now you might be wondering about the egg on top with the crispy nuggets of something.

This is one of my favorite versions which comes from Mark Bittman at the NYT, via the Jean-Georges restaurant in NYC. That a French person has one of the best fried rice recipes in the world is certainly … something.

Anyway, you can read the recipe here. Basically you take the minced ginger and garlic and brown them in a small pan until crispy. Then you do the fried rice above, but without the egg, and with leeks as the main vegetable. Then you assemble it by putting a fried egg and the crispy garlic on top. Fry the eggs in the oil you used to brown the garlic and ginger. Stupendous.

Bittman made a good video about it too. Watch that here.

There are a few more fancy techniques for incorporating eggs into fried rice that I have not gone over here, but the references below will show you how to do that. Especially Chef Wang. Go to town and have some fun.

For more fried rice insight start at these places:

Chef Wang. His channel has 4 or 5 great videos on this subject, including one on the fanciest most expensive fried rice ever. The egg strand technique in that video is incredible. I wish I could do it, but I’m too lazy.

Also, the Chinese Cooking Demystified people have their own insights, including how to make fried rice without waiting for the rice to sit overnight. Watch their stuff too.

Finally, here is a link to my dumb idea for a fusion fried rice food truck.

My plan today had been to just say “Happy March 1107, 2020”, as we have yet again passed one more go around the sun since that great stupidity started. But instead I have a different and unexpected grudge to finally let go of.

In 1981, which is, I guess, 42 years ago, I was in high school and the best movie of the year was *Raiders of the Lost Ark*. *Raiders* is still a pretty fun watch even today, if you can look past the somewhat primitive visual effects and some of the problematic cultural politics. But, the film did not get much love at the Oscars that year, because movies that get love at the Oscars have to be solemn and serious affairs, usually involving a lot of white people drama. So instead of an actual good time, all the awards went to a dour and boring British film about some Olympic athletes or something.

The most insulting aspect of this was handing the *Best Original Score* award to a collection of dour and boring electronic pap instead of John Williams. Be honest, when was the last time you thought about the theme to *Chariots of Fire* without falling asleep? Now run that trumpet fanfare from *Raiders* through your head … see? OK whatever.

Anyway. At the time teen high school me was *furious*, and concluded that the Oscars was a huge scam run by money and old people. And I have never changed my mind.

But, this year I feel like I have to forgive them. About a year ago when I first saw the trailers to *Everything Everywhere All At Once* I immediately went around calling it *The Best Film of 2022*. I did this more after actually seeing it (twice!). And finally last night the Oscars did the right thing and picked a movie with a fun and actually enjoyable energy over a lot of dour and depressing drama for best picture. More importantly, they finally gave Michelle Yeoh her statue after robbing her in 2000 (!!) and not nominating *Crouching Tiger* for any acting awards (you can’t act in Chinese, you see). Oh, and all the other winners from the movie were great too.

So, good job Oscars. You are off the hook now. I bet you feel a great sense of relief.

I had a bit of a forced break from most of the regular things I do. An amazing thing about the world is that doing a job, and hobbies, that mostly involve sitting around and typing at a keyboard can still open you up to severely crippling orthopedic injuries in your hands and wrist area. I have been lucky to mostly avoid issues like this for the last 30 years (knock wood) so of course the thing that took me down ends up being newborn mother’s thumb. I have had this before, and always escaped with just a cortisone shot. But I was not so lucky this time and had to have a “minor” surgery, which took months to recover from.

**Note**: I guess this can also happen to people who use gamepads too much. But I *don’t* use gamepads too much … I spread the thousands of hours of Fromsoft addiction over many weeks and months and am careful to play no more than an hour or two at a time. Oh well. Who knows.

What’s important is that the result was me sitting at home unable to type, cook, shower, and lots of other stuff with my right hand. So I was really fun to be with. Here are some thoughts on things I did instead.

I buy too much music from Mosaic Records. I have since the late 80s. The result is that I have a huge number of ripped tracks from various CD collections that they have put out over the years, but I have actually only listened to a fraction (maybe slightly more than half) of all the tracks.

So I made a playlist of unplayed Mosaic tracks and started shuffling my way through it while I sat in my house either rehabbing the hand or trying not to think about how uncomfortable the hand was.

**Note**: If this ever happens to you, and you end up needing surgery of any kind … sign up for the post-surgery PT and OT *before* the procedure even happens. Trust me on this. The therapy people know *so* much more about how shit will go after the surgeon is done it’s not even funny.

Since November I have shuffled through about 2,500 tracks, and I have about 1,000 to go. I have not listened to all of them deeply and seriously, but I have listened enough to know which I like more than others. Their bread and butter has always been the classic Blue Note and more modern material. And IMHO it still is. The more historical stuff from the Swing period is good, but not to the same level.

One interesting side effect of this exercise is that it turned the music listening part of my brain back on. It had been taking a bit of a vacation lately and I had not been really motivated to engage with the huge amount of recorded music in the world. Getting this engagement back has improved my life.

Another thing I engaged in more than I have in the past is soccer, which in this section of the page I will call football since it seems more appropriate.

The beginning of my break landed right on the beginning of the World Cup which happened in the winter this year because it was held in a part of the world where if you played football in the summer you would fall over dead on the field. For more on this look at this set of videos.

So anyway, the World Cup was cool, especially all the Messi brilliance, the entire world making fun of Ronaldo, and that craaayyyyyyzy final game. But the difference this year was that it came right in the middle of all the major English and European professional seasons. So, after a short break the curious could dive immediately into the Premier League rat hole. In the past I had thought about doing this, but it all seemed too complicated. This year I literally had nothing better to do. But it’s *still* too complicated.

The first thing you notice watching league, or “club” football, as they call it, is that the games are very different from the World Cup. This should not be surprising. The World Cup is a giant high pressure all star game where the happiness of entire nations depends on good performances from teams that have played together for a comparatively short time. So the games tend to be a bit, for lack of a better description, stiff and tight. While it’s an incredible event the quality of the football games is a bit variable. This is me saying what hundreds of millions of people have known for decades.

Premier League is actual teams doing their actual every day jobs, and the games are a lot more fun to watch. Especially the opening minutes of games. The ball goes up and down and back and forth. People run for their lives. There are early scores. Lots of trash talking and drama. It’s great.

Of course the Premier League is, in a literal sense, only the very tip of the top of the iceberg. If you are not careful you will also find yourself diving into the other Euro leagues (Ligue 1, Bundesliga, La Liga, Serie A, etc), the secondary English leagues, FA Cup, League Cup (which for sponsorship reasons is really the Carabao Cup right now), Champions League, Europa League, and who knows what else. One could retire and do nothing but watch English and European football 7 days a week, 10 hours a day, for the rest of one’s life. If I did that for 5 or 10 years I might finally figure out how offside works.

The best thing about these games is the timing. Because of the time zones involved there are no night games that go until midnight. Great! And, the games always finish on time. Wonderful! And, you can play almost two complete English football games in the time it takes to get through a regular season MLB or NFL game … or sometimes two and a half games for a playoff baseball or NFL game. I’m pretty sure if English football had been invented in the U.S. we’d get ad breaks before every corner and free kick. And the free kicks would have a sponsor (and next, the Visa Interest Free Free kick!). Instead we get these weird surreal 3-d cans on the field:

The next best thing about all these games is the songs. After that the best thing is the shirts with collars. But finally the actual best thing is how high variance they are. It seems fairly rare for things to actually go the way people would expect them to, given whatever the current league standings (er, table) are at the time. The bottom of the league will often outplay and destroy the top. Or, some team from the third or fourth tier league will play a team two or three tiers above it to an exciting draw. Exciting draws are not a thing I would have expected to either

- Acknowledge

or

- Enjoy

But there you go.

Finally, to answer the obvious question: I have not picked a team yet. I’m going to bandwagon the front runners (Arsenal, Man United) this year. And I like the Crystal Palace shirts and team name the best. *Wolverhampton Wanderers* and *Nottingham Forest* are also strong contenders in the name contest. Oh, and I like the bubble machines at West Ham.

Izola’s is a buffet style restaurant in Hinesville Georgia. It serves food that is best described as “Classic American Southern”. Early on in the great stupidity they were one of my early finds on TikTok. Every day they would pan a phone camera down the buffet while a friendly voice narrated the menu. Typical items included, say, chicken and rice, chicken and dumplings, fried chicken, baked chicken, panko crusted fried fish, BBQ meatballs, Swedish Meatballs, smothered pork chops, collard greens, green beans, fried cabbage, mashed potatoes, dressing, rice, 2 or 3 kinds of gravy, Mac and Cheese (“scoop that mac”), and so on.

During the pandemic we thought: “when this stupidity is over we should go down there”. Around the time my hand finally started feeling a bit normal we decided that while stupidity in the world would never end, we should go down there for some warmer weather and the food, so we did.

And it was just as glorious as in the videos.

You should go. It’s close to Savannah, which is also a neat place.

The voice input stuff in the Apple operating systems is just good enough to be really annoying when it misses, and not really good enough to use on a regular basis if you can actually type at the keyboard. Not surprisingly it’s especially bad with jargon and other domain specific language. I used it a lot when my hand was at its most useless. But now I’ve mostly stopped again.

Finally, once things improved a lot, I made chili for the Super Bowl game (and the two or three Premier League and related games that played before it on Sunday morning and afternoon). I am pretentiously snobby about my chili because I make my own chili powder and use that as a base.

This year I learned I should be using those dried peppers to make a chili pepper sludge, and then make chili out of that. Kenji does it here.

So I did this with most of the peppers. Then used two or three more to also make some powder to spread around in the meat, and kept the rest of my method the same. And it was great. I’ll do it this way from now on. At some point I’ll update the chili recipe page too so people are not misled into using the previous inferior method.

That’s all I got for now. It’s good to be back. And I’m gonna make that chili again.

**Note**: whatever you do, don’t use the recent NYT chili recipe. The fact that they would call the result of that recipe “spicy” is an insult to the word spicy.

See you next time. Oh! It’s football time!

In part 1 and part 2 we tried to set up enough of the mathematical formalism of quantum mechanics to be able to talk about quantum measurement in a reasonably precise way. If you were smart and skipped ahead to here you can now get the whole answer without reading through all that other tedious nonsense.

For reference, here are the rules that we currently know about quantum mechanics:

States are vectors in a Hilbert space, usually over \mathbb C.

Observables are self-adjoint linear operators on that space.

The possible values of observables are the eigenvalues of the corresponding operator, and the eigenvectors are the states that achieve those values. In addition, for the operators that represent observables, we can find eigenvectors that form an orthonormal basis of the underlying state space.

There is a special observable for the energy of the system whose operator we call H, for the Hamiltonian. Time evolution of states is then given by the Schrödinger equation.

Now we’ll finally talk about measurement.

As before, I am the furthest thing from an expert on this subject. I’m just trying to summarize some interesting stuff and hoping that I’m not too wrong. I’ll provide a list of more better sources at the end.

In quantum mechanics measurements are the connection between eigen-things and observables. We interpret the eigenvalues of the operator representing an observable as the values that we can see from that observable in experiments. In addition, if the system is in a state which is an eigenvector of the operator, then the value you get from the observable will always be the corresponding eigenvalue.

The simplest model of measurement in quantum systems is to say that a measurement is represented by acting with a single operator representing the observable on a single vector representing the state of the system. In this simple model we are doing “idealized” measurements (simple operators) on “pure” states (simple vectors). There are generalizations of both of these ideas that you can pursue if you are interested. See the further reading.

If we perform a measurement on a system that is in a state represented by an eigenvector of the operator, we always get absolutely determined and well defined answers.

For example let’s say we are in a system where the Hilbert space \cal H is two dimensional, so we can represent it as \mathbb C^2 and with scalars from \mathbb C. So, any basis that we define for the space needs only two vectors: | 0 \rangle = \begin{pmatrix}1\\ 0\end{pmatrix} and | 1 \rangle = \begin{pmatrix}0\\ 1\end{pmatrix}

Suppose we have some operator S such that | 0 \rangle and | 1 \rangle are its eigenvectors with eigenvalues \lambda_0 and \lambda_1. Then we know that if we measure either | 0 \rangle or | 1 \rangle with S we’ll get the corresponding eigenvalue with probability 100%.

That is:

S | 0 \rangle = \lambda_0 | 0 \rangle

and

S | 1 \rangle = \lambda_1 | 1 \rangle

But, quantum states come in Hilbert spaces, which are linear. This means that we also have to figure out what to do if our state vector is any linear combination of the eigenvectors. So what if we had a state like this:

c_0 | 0 \rangle + c_1 | 1 \rangle

where c_0 and c_1 are arbitrary constants? In this case the result of doing a measurement will then either be the eigenvalue \lambda_0 with some probability p_0 or \lambda_1 with some other probability p_1.

The Born rule then states that the probability of getting \lambda_0 is

p_0 = { |c_0|^2 \over |c_0|^2 + |c_1|^2 }

and the probability of getting \lambda_1 is

p_1 = { |c_1|^2 \over |c_0|^2 + |c_1|^2 } .

We have seen a version of this rule before, in part 1, but this time I normalized the probabilities like a good boy (so that they add up to 1).
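If you want to see the normalized Born rule in action, here is a quick numerical sanity check using numpy. The coefficients are arbitrary made-up numbers, chosen just for illustration:

```python
import numpy as np

# Arbitrary (unnormalized) coefficients for the state c0|0> + c1|1>.
c0 = 1 + 2j
c1 = 3 - 1j

norm = abs(c0)**2 + abs(c1)**2

# Born rule probabilities for the two eigenvalues.
p0 = abs(c0)**2 / norm
p1 = abs(c1)**2 / norm

print(p0, p1)   # the two probabilities
print(p0 + p1)  # sums to 1, up to float rounding
```

Because we divide by the norm, the probabilities always sum to 1 no matter how badly unnormalized the original coefficients were.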

One last puzzle that should be bothering you is the question of whether we can represent *any* state as a linear combination of the eigenvectors of the operator. It turns out we can, because we specified that observables are self-adjoint, so we can invoke the spectral theorem from part 2 which says that given an arbitrary state \psi \in \cal H we can always write the state as a linear combination of the eigenvectors.

In summary: given an arbitrary state vector \psi \in \cal H and an observable represented by an operator S you can calculate the behavior of S on \psi by first expressing \psi as a linear combination of eigenvectors of S (because you can find eigenvectors that form a basis) and then applying the Born rule.

So in our example above, where the operator S has eigenvectors | 0 \rangle and | 1 \rangle, we can first write \psi like this:

\psi = c_0 | 0 \rangle + c_1 | 1 \rangle

And then we use the Born rule to compute the measurement probabilities.

The most famous two-state system in the quantum mechanics literature is the so-called “spin 1\over 2” system. The behavior of these systems was first explored in the Stern-Gerlach experiment. In this experiment you shoot electrons (really atoms with a single free electron) through a non-uniform magnetic field, and see where they end up on a screen on the other side. You would expect them to end up in some continuous distribution of possible points, but it turns out they end up in only one of two points, which we will call “up” and “down”. We’re just going to take this result for granted rather than trying to explain it right now.

We can imagine spin as being like a little arrow over the top of the electron pointing either “up” or “down” along a certain spatial axis (e.g. x, y, or z). The Stern-Gerlach device determines the state of this “arrow” by measuring the behavior of the electron in a magnetic field. So it’s sort of like a magnet … but not really.

The state space for this system is just \mathbb C^2. Each one of the spin states is some linear combination of | 0 \rangle and | 1\rangle above.

It also turns out that there are four convenient operators that we can use as observables: the identity, and a spin operator for each spatial axis which we will call S_x, S_y and S_z. For all the details of where these come from, you can read about the Pauli matrices.

The Pauli matrices are called \sigma_1, \sigma_2 and \sigma_3. And the spin operators S_x, S_y, and S_z are defined as

S_x = {\sigma_1 \over 2}, \quad S_y = {\sigma_2 \over 2}, \quad S_z = {\sigma_3 \over 2} .
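You can check the basic properties of these operators numerically. Here is a small numpy sketch (in units where \hbar = 1, as below) verifying that each spin operator is self-adjoint with eigenvalues \pm 1/2:

```python
import numpy as np

# The three Pauli matrices.
sigma1 = np.array([[0, 1], [1, 0]], dtype=complex)
sigma2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
sigma3 = np.array([[1, 0], [0, -1]], dtype=complex)

# Spin operators, in units where hbar = 1.
Sx, Sy, Sz = sigma1 / 2, sigma2 / 2, sigma3 / 2

for S in (Sx, Sy, Sz):
    # Each spin operator is self-adjoint (a valid observable) ...
    assert np.allclose(S, S.conj().T)
    # ... with eigenvalues -1/2 and +1/2.
    print(np.linalg.eigvalsh(S))
```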

I can’t decide if it’s a deep mathematical fact or just a strange coincidence of nature that \mathbb C^2 should have exactly three operators for spin measurements, one in each direction that we need. It seems a bit spooky that it worked out that way.

Note: in all of the computations below I’m leaving out factors of \hbar. This is a standard trick in physics texts … you can use units where \hbar = 1 and then put it back later if you want.

We measure spin using a box with a magnetic field in it. So, imagine that we have some box with one hole on the left, and two holes on the right. We send an electron in the left hole and it comes out the top hole if the spin is up, and the bottom hole if the spin is down. We have three kinds of boxes that each measure the spin in a different direction (again: x, y or z).

So the S_z box looks like this:

We start with a beam of particles where each particle is in a completely random state. Electrons (say) go in the left hole and the spin up stuff is directed out the top right hole and the spin down stuff comes out the bottom right hole. We can then consider what happens if we take a bunch of devices like this, chain them together, and take sequential measurements.

First suppose we put another S_z box right after the first one so that all of the particles that enter the second box come out of the {\small +} hole of the first box. What will happen here is that 100% of this beam will come out the {\small +} hole of the second box. This seems very reasonable, since they were all z-spin up particles.

This behavior might make you think that z-spin is a property that we can attach to the electron, perhaps for all time, like classical properties, and that this box acts like a filter that just reads off the property and sends the particles the right way. Keep this thought in your brain.

Next, we can see that the relationship of S_z to S_x is also straightforward. A particle that has a definite z-spin still has an undefined x-spin:

So here when we put an S_x box right after the S_z box and send all the z-spin up particles through we will get x-spin up half the time and x-spin down half the time. If you study the material on the Pauli matrices above, this will make sense because it turns out that the eigenvectors of S_z can be written as a superposition of the S_x eigenvectors with coefficients that make these probabilities 1/2 (and vice versa). In particular:

|z_+\rangle = | 0 \rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \, {\rm and}\,\, |z_-\rangle = | 1 \rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}

|x_+\rangle = {1 \over \sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \, {\rm and}\,\, |x_-\rangle = {1 \over \sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix}

From this we can figure out that:

|x_+\rangle = {1 \over \sqrt{2}} (|z_+\rangle + |z_-\rangle)

and

|z_+\rangle = {1 \over \sqrt{2}} (|x_+\rangle + |x_-\rangle)

The Born rule then tells us that measuring the x-spin of a z-spin up particle will get you x-spin up half the time and x-down half the time. Similarly, measuring the z-spin of an x-spin up particle will get you z-spin up half the time and z-spin down half the time.
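These 1/2 probabilities are easy to verify with a few inner products in numpy (the little `prob` helper is mine, not standard):

```python
import numpy as np

z_plus  = np.array([1, 0], dtype=complex)
x_plus  = np.array([1, 1], dtype=complex) / np.sqrt(2)
x_minus = np.array([1, -1], dtype=complex) / np.sqrt(2)

# Born rule for normalized states: P(outcome b | state a) = |<b|a>|^2.
# Note that np.vdot conjugates its first argument, so vdot(b, a) is <b|a>.
def prob(b, a):
    return abs(np.vdot(b, a))**2

print(prob(x_plus, z_plus))   # about 0.5
print(prob(x_minus, z_plus))  # about 0.5
print(prob(z_plus, x_plus))   # about 0.5
```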

Relationships like this also happen to be true for all of the eigenvectors of all the spin operators. Some of the references at the end go into these details.

Finally, we can push on this idea a bit more by adding yet another S_z box on the end of the experiment above. When we do this we get a result that is somewhat surprising.

We might think that all of the particles coming out of the S_x box should be z-spin “up” since we had filtered for those using the first box. Sadly, this is not the case. Measuring the x-spin seems to wipe away whatever z-spin we saw before. This is surprising. Somehow going through the S_x box has made the z-spin undefined again, and we go back to 50/50 instead of 100% spin up.

So now our problem is this: what is going on in the last spin experiment?

We can interpret the first two experiments as behaving like sequential filters. The first z-spin box filters out just the particles with spin-up, and then we feed those to the second box (either z or x) and get the expected answer.

In order to make sense of the third experiment it seems like we need to posit that measurements in quantum mechanics have side effects on the systems that they measure. How can we account for the fact that the z-up property that the particles have before measuring the x-spin seems to disappear after we measure the x-spin?

The standard answer to this question goes something like this:

- We start with particles with some arbitrary spin state.
- But the particles that come out of the {\small +} hole of the first z-spin box have a definite spin of |z_+\rangle, or z-up.
- Thus if the second box measures z-spin again, as in the second experiment, all the particles are spin up, and they all come out of the z-up hole.
- But, if the second box is an x-spin box, as in the third experiment, then since |z_+\rangle = {1 \over \sqrt{2}} (|x_+\rangle + |x_-\rangle), the x-spin is indeterminate, and we go back to a 50/50 split.
- Finally, if we now believe that measuring the spin also resets the spin state of the particle, like in step 2 above, then the new state of the particle coming out of the {\small +} hole of the x-spin box will have x-spin up so their state will be |x_+\rangle = {1 \over \sqrt{2}} (|z_+\rangle + |z_-\rangle), which is why in the third and last box the z-spin is indeterminate again.
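This measure-and-collapse story can be sketched as a little Monte Carlo simulation. The `measure` helper below is my own toy implementation of a projective measurement (pick an outcome with Born-rule probability, then collapse the state onto the chosen eigenvector):

```python
import numpy as np

rng = np.random.default_rng(0)

z_plus  = np.array([1, 0], dtype=complex)
z_minus = np.array([0, 1], dtype=complex)
x_plus  = np.array([1, 1], dtype=complex) / np.sqrt(2)
x_minus = np.array([1, -1], dtype=complex) / np.sqrt(2)

def measure(state, basis):
    """Projective measurement: choose an outcome with Born-rule
    probability, then collapse onto the chosen eigenvector."""
    probs = np.array([abs(np.vdot(b, state))**2 for b in basis])
    k = rng.choice(len(basis), p=probs / probs.sum())
    return k, basis[k]

n = 10_000
final_up = 0
for _ in range(n):
    state = z_plus                                  # box 1: keep only the z-up output
    _, state = measure(state, [x_plus, x_minus])    # box 2: measure x-spin (collapses state)
    k, _ = measure(state, [z_plus, z_minus])        # box 3: measure z-spin again
    if k == 0:
        final_up += 1

print(final_up / n)  # close to 0.5, not 1.0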

Thus, we are led to ponder adding another rule to the four we already had for how quantum mechanics works:

Suppose we have a quantum system that is in some state \psi and we perform a measurement on the system for an observable O. Then the result of this measurement will be one of the eigenvalues \lambda of O with a probability determined by the Born rule. In addition, after the measurement the system will evolve to a new state \psi', which will be the eigenvector that corresponds to the eigenvalue that we obtained.

This is, of course, the (in)famous “collapse of the wave function”, and with the background that I have made you slog through it should really be bothering you now.

We seem to need this rule, along with the original rule about eigenvalues and eigenvectors, to make our formalism agree with the following general *experimental* fact:

Whenever we measure a quantum system we always get one definite answer, and if we measure the system again in the same way, we get the same single answer again.

The problem is that the collapse rule completely contradicts our existing time evolution rule, which says that everything evolves continuously and linearly via the Schrödinger equation:

i \hbar \frac{\partial}{\partial t} | \psi(t) \rangle = H | \psi(t) \rangle .

This equation can do a lot of things, but the one thing it cannot do is take a state like this

|ψ\rangle = c_1|ψ_1 \rangle + c_2|ψ_2 \rangle

and remove the superposition. With that equation we can only ever end up in another superposition state, like this:

|ψ'\rangle = c_1' |ψ_1'\rangle + c_2' |ψ_2'\rangle .

To bring this back to our example, suppose our S_x box is modeled as a simple quantum system with three states: |m_0\rangle for when the box is ready to measure something, |m_+\rangle for when it has measured spin up, and |m_-\rangle for when it has measured spin down. Here the m is for machine, or measurement.

In our experiment, at the second box, we start with a particle in the state

|z_+\rangle = {1 \over \sqrt{2}} (|x_+\rangle + |x_-\rangle)

and send it into the S_x box, which starts in the state |m_0\rangle. So the state of the composite system becomes the superposition:

{1 \over \sqrt{2}} (|x_+\rangle + |x_-\rangle)|m_0\rangle .

This state means “the particle is in a superposition of x-spin up and x-spin down, and the measuring device is ready to measure it.” ^{1}

If we believe that Schrödinger evolution is the only rule we have, then this state can only evolve like this:

{1 \over \sqrt{2}} (|x_+\rangle + |x_-\rangle)|m_0\rangle \quad \xrightarrow{\hspace 20pt} \quad {1 \over \sqrt{2}} ( |x_+\rangle|m_+\rangle + |x_-\rangle|m_-\rangle ) .

That is, the box and the particle must evolve to a superposition of “spin up” and “measured spin up” with “spin down” and “measured spin down”. The Schrödinger equation never removes the superposition.^{2}
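You can watch linearity force this outcome numerically. Here is a toy numpy construction of my own (not from any textbook): a unitary on the 6-dimensional composite space that sends |x_+\rangle|m_0\rangle to |x_+\rangle|m_+\rangle and |x_-\rangle|m_0\rangle to |x_-\rangle|m_-\rangle, and which therefore has no choice but to send the superposed input to the superposed (entangled) output:

```python
import numpy as np

x_plus  = np.array([1, 1], dtype=complex) / np.sqrt(2)
x_minus = np.array([1, -1], dtype=complex) / np.sqrt(2)

# Machine states |m0>, |m+>, |m-> as a 3-dimensional basis.
m0, m_plus, m_minus = np.eye(3, dtype=complex)

# A unitary acting as |x+>|m0> -> |x+>|m+> and |x->|m0> -> |x->|m->.
# It is built as a permutation of an orthonormal basis, so it is
# automatically unitary; the remaining basis vectors are just shuffled.
pairs = [
    (np.kron(x_plus,  m0),      np.kron(x_plus,  m_plus)),
    (np.kron(x_minus, m0),      np.kron(x_minus, m_minus)),
    (np.kron(x_plus,  m_plus),  np.kron(x_plus,  m0)),
    (np.kron(x_minus, m_minus), np.kron(x_minus, m0)),
    (np.kron(x_plus,  m_minus), np.kron(x_plus,  m_minus)),
    (np.kron(x_minus, m_plus),  np.kron(x_minus, m_plus)),
]
U = sum(np.outer(out, vin.conj()) for vin, out in pairs)  # U = sum |out><in|

# Linearity in action: the superposed input comes out superposed.
z_plus_m0 = (np.kron(x_plus, m0) + np.kron(x_minus, m0)) / np.sqrt(2)
entangled = (np.kron(x_plus, m_plus) + np.kron(x_minus, m_minus)) / np.sqrt(2)
print(np.allclose(U @ z_plus_m0, entangled))  # True
```

No unitary with the required behavior on the two eigenvector inputs can avoid this: the entangled output is forced by linearity alone.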

But we never see states like this. Particles go into measuring devices, and those devices give us a single answer with a single value. The world is not full of superposed Stern-Gerlach devices, or CCDs, or TV screens. Furthermore: cats, famously, are never both alive and dead.

Instead, the particle enters the device and we see a universe where the device tells us a single definitive answer: either spin up or spin down. That is, using our notation above, the real world time evolution seems to always look like this:

{1 \over \sqrt{2}} (|x_+\rangle + |x_-\rangle)|m_0\rangle \quad \xrightarrow{\hspace 20pt} \quad |x_+\rangle|m_+\rangle

or

{1 \over \sqrt{2}} (|x_+\rangle + |x_-\rangle)|m_0\rangle \quad \xrightarrow{\hspace 20pt} \quad |x_-\rangle|m_-\rangle

So we seem to have a fundamental conflict: the Schrödinger equation says we should see superpositions, but in our experiments we never see superpositions.

This, dear friends, is the measurement problem. It is a fundamental contradiction between the observed behavior of real systems in the world, and what the Schrödinger equation dictates.

The literature on the “interpretation of quantum mechanics” is of course full of deep thoughts about the questions that the measurement problem raises. I could not possibly do more than unfairly caricature the various possible stances that one could have about this question, so that’s what I will do. Here are some things we can do:

We can take the collapse rule as a postulate and, until we understand how measurement works, just use the rules and try to be happy. This view is often called the “Copenhagen” interpretation, although that’s not really right and the Copenhagen story is actually a lot more complicated than this. A better name for this view is the “standard” or “text book” viewpoint.

We can say that quantum states are mainly a tool for describing the statistical behavior of experiments. Ballentine’s book, which I referenced in part 2, has a careful exposition of one version of this line of thought where the wave function only describes statistical ensembles of systems. There are, of course, a spectrum of different opinions about whether quantum mechanics describes any physical reality at all, or just the behavior of experiments.

We can say that the collapse rule is either not needed or not contradictory because quantum states are not really things that exist in the world. Rather, the quantum state is just a way of describing what we, or some set of rational agents, believes about the world. The most recent version of this idea is probably QBism.

We can think that wave functions do not describe the entire state of the system. Instead, there is some other part of the state that gives systems definite measured properties. The most popular version of this idea is the “pilot wave” or “Bohmian” version of quantum mechanics.

We can decide that superpositions don’t actually collapse, we just can’t see the other branches. This is the Everett and/or the “Many Worlds” idea.

We can say that wave functions actually collapse through some random physical process, and we can use this fact to derive the measurement behavior (and perhaps the Born rule). The most famous theory like this is the GRW stuff.

There are dozens more ideas that I will not list here because I don’t understand them well enough to list them.

If forced to take a stance I would probably say that I am most sympathetic to the more “ontological” theories, like Bohm or Everett. My least favorite idea is probably QBism because I have a hard time being enthusiastic about a world where everything is just the knowledge and credences of rational actors. But, in between these two extremes I enjoy the careful and pragmatic thinking that’s been done about the nature of experiments and measurement in quantum theory. I used Ballentine’s book as an example of this, but there is a lot more where that came from (see Peres for example). I feel like what we really need to do is to attack the core question of what is really happening in quantum and quantum/classical interactions. Until we have a better understanding of that I think we’ll never figure out this puzzle.

When in doubt, I will just appeal to my favorite quantum computer nerd, Scott Aaronson, whose point of view seems right to me.

I left out a lot of important details related to the structure of Hilbert space. In the finite dimensional case they don’t matter too much but they are critical in the infinite dimensional case. Watch Schuller’s lectures on quantum mechanics to fill those in.

I really only covered the simplest possible models of quantum states, observables and measurements. Mixed states, density operators, POVMs and all that are missing. Schuller’s lectures or any of the more mathematical books that I listed cover this.

I left out the uncertainty principle, which is kind of a big part of the story to skip. You can talk about it in the context of the spin operators but it’s a lot of work and not directly related to the puzzle that I was trying to get to.

I left out the entire huge world of *entangled* states because I did not want to introduce any more formalism. Entanglement, Bell’s theorem and all that is also just too big a subject to mention and not go into, so I left it out. Maybe we’ll cover that in a future part 4.

I never mentioned decoherence. I am a bad person.

I played fast and loose with normalization when talking about quantum states and operators. I should have been much more careful, but I’m lazy.

I wish I could have talked about the two slit experiment. But, I’d have done a lousy job so go read Feynman instead.

Finally, you can do an experiment similar to the chained spin-box experiment with polarized light. Watch here.

Some more reading for you:

If you want to go all the way to the beginning with the original sources, both of the books by Dirac (or look at the Google Books link which is likely to be more reliable) and von Neumann are still pretty readable.

Travis Norsen’s Foundations of Quantum Mechanics is a great introduction to this material. A good combination of nuts and bolts physics and discussions of the conceptual issues.

David Albert’s Quantum Mechanics and Experience (also at amazon) has a nice abstracted description of the spin-box experiment that I have butchered above. This one goes well with Norsen.

Sakurai’s Modern Quantum Mechanics starts with a good discussion of the spin experiments I used as an example.

An older book, Quantum mechanics and the particles of nature, by Sudbery, goes at this from a point of view that I like. Hard to find though.

Hughes’ The Structure and Interpretation of Quantum Mechanics also starts with spin but is a more philosophical look at the material.

The Stanford Encyclopedia of Philosophy has a lot of material on quantum mechanics and its interpretation. Their summary page is a bit shorter, yet more detailed, than my effort here.

You should read this paper by Leifer just for the delicious pun in the title. But it’s also a great breakdown of the various ways that people talk about and interpret the quantum state.

This much more technical paper by Landsman also addresses the very complicated question of how classical and quantum states are related. He has an open access book that expands on these ideas, especially in the chapter on the measurement problem. I don’t really understand any of this, but it seems like the kind of work that needs to be done.

Those in the know will notice that I have not really explained what this notation for product states that I am using here means. I did not have the space to explain tensor products and entanglement, which is a shame because along with measurement entanglement is the second huge conceptual puzzle in quantum mechanics.↩︎

For those keeping track, this is the formula I’ve been trying to get to this whole time. Was the 9000 words worth it?↩︎

Almost every book or article about quantum mechanics seems to start with a passage like this:

> Quantum mechanics is arguably the most successful physical theory in the history of science but strangely, no one really seems to agree about how it works.

And now I’ve done it to you too. One of the main reasons people write this sentence over and over again is because of what is called *the measurement problem*. Here is a way to state the measurement problem, which I will then try to explain to you.

The measurement problem refers to the following facts, which seem to contradict each other:

On the one hand, when we measure quantum systems we always see one answer.

On the other hand, if you want to use the regular rules of time evolution in quantum mechanics to describe measurements, then there are states for which measurements should not give you one answer.

In particular, measuring states that describe a *superposition* (see below) can cause a lot of confusion.

In part 1 of this series I gave you a bit of the history and motivation behind the development of quantum mechanics. It followed the development of the theory the way a lot of physics text books do, with lots of differential equations and other scary math. We will now leave all that behind us.

My plan here is to describe the mathematical formalism of quantum mechanics in enough detail to express the measurement problem in a way that is relatively rigorous. This mostly boils down to a lot of tedious and basic facts about linear algebra, instead of all the scary differential equations from part 1. Personally I find the algebraic material a lot easier to understand than all that differential equation solving. It will still be an abstract slog, but I’ll try to leave out enough of the really boring details to keep it light.

As with my other technical expositions on subjects that are not about computers, I am the furthest thing from an expert on this subject, I’m just organizing what I think are the most interesting ideas about what is going on here, and hoping that I’m not too wrong. I’ll provide a list of more better sources at the end.

The rules of quantum mechanics are about *states* and *observables*. These are both described by objects from a fancy sort of linear algebra. This involves a lot of axioms that are interesting (not really) but not needed for our purposes. To try and keep this section a bit shorter and less tedious I link out to Wikipedia for many of the mathematical details, and just provide the highlights that we need here.

Quantum states live in a thing called a *Hilbert space*, which is a special kind of vector space. Observables are a particular kind of linear function, or *operator* on a Hilbert space.

The ingredients that make up a Hilbert space are:

- A set of *scalars*. In this case it’s always the complex numbers (\mathbb C).
- A set of *vectors*. Here the vectors are the wave functions.
- A long list of rules about how we can combine vectors and scalars together. In particular, vector spaces define a notion of addition (+) for vectors that obeys some nice rules (commutativity, associativity, blah blah blah), and a notion of multiplying vectors by scalars that also obeys some nice rules. For reference, you can find the rules here.

We denote Hilbert spaces with a script “H”, like this: \cal H, and we use greek letters, most popularly \psi to denote vectors in \cal H. For a reason named Paul Dirac, we will dress up vectors using a strange bracket notation like this: | \psi \rangle, or sometimes this way \langle \psi |. This is also how we wrote down the wave functions in part 1.

The most important thing about Hilbert spaces is that they are *linear*. What this means is that given any two vectors | \psi \rangle and | \phi \rangle and two scalars a and b, any expression like

a | \psi \rangle + b | \phi \rangle

is also a vector in \cal H.

This rule, it turns out, is the most important rule in Quantum Mechanics and is famously called the *superposition principle*. You will also see states that are written down this way called *superposition states*. But, this terminology is more magic sounding than it needs to be. This is just a linear combination of two states, and the fact that you always get another state is also a straightforward consequence of the form of the Schrödinger equation (it is what we call a *linear* differential equation). Linearity plays a big role in the eventual measurement puzzle, so store that away in your memory for later.
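To make the superposition principle concrete, here is a little Python sketch (my own toy, not anything from a physics library) where states are just lists of complex amplitudes. The only point is that a linear combination of vectors is again a vector, and that a linear operator can be moved in and out of such combinations:

```python
# Toy illustration: states as lists of complex amplitudes in C^2.

def combine(a, psi, b, phi):
    """Form the linear combination a|psi> + b|phi> component-wise."""
    return [a * x + b * y for x, y in zip(psi, phi)]

def apply_op(M, v):
    """Apply a linear operator (a matrix, given as a list of rows) to a vector."""
    return [sum(row[i] * v[i] for i in range(len(v))) for row in M]

psi = [1 + 0j, 0j]                  # |psi>
phi = [0j, 1 + 0j]                  # |phi>
a, b = 0.6 + 0j, 0.8j

sup = combine(a, psi, b, phi)       # a superposition is just another vector
M = [[0j, 1 + 0j], [1 + 0j, 0j]]    # some linear operator (here, a swap)

# Linearity: applying M to the combination equals combining the results.
lhs = apply_op(M, sup)
rhs = combine(a, apply_op(M, psi), b, apply_op(M, phi))
assert lhs == rhs
```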

The second most important thing about Hilbert spaces is that they define an *inner product* operation that allows us to define things like length and angle. We write this product this way:

\langle \psi | \phi \rangle

and its value is a complex number (which in important special cases turns out to be real).

Now we see a bit of the utility of this strange bracket notation. In Dirac’s terminology the | \psi \rangle is a “ket” or “ket vector” and the \langle \psi | is a “bra”. So you put them together and you get a “bra ket” or “braket”. So all of this silliness is in service of a bad pun. Those wacky physicists thought this joke was so funny that we’ve been stuck with this notation for almost a hundred years now.

There is also some subtle math that you have to do to make sure that the “bra” \langle \psi | is a thing that makes sense in this context, but let’s assume we have done that and it has all worked out.

As always, I refer you to wikipedia for the comprehensive list of important inner product facts.

We can use the inner product to define a notion of distance in a Hilbert space that is similar to the familiar “Euclidean” distance that they teach you in high school. For a given vector \psi the norm of \psi is written \lVert \psi \rVert and is defined as

\lVert \psi \rVert = \sqrt{\langle \psi | \psi \rangle}

Since \langle \psi | \psi \rangle is always real and non-negative, this is well-defined. You can also define the distance between two vectors in a Hilbert space as \lVert \psi - \phi \rVert.
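As a tiny illustration (again my own toy Python, using the physics convention that the inner product conjugates the “bra” side), here is what the inner product, norm and distance look like for vectors of complex numbers:

```python
import math

def inner(psi, phi):
    """<psi|phi>, with the physics convention of conjugating the bra side."""
    return sum(x.conjugate() * y for x, y in zip(psi, phi))

def norm(psi):
    """||psi|| = sqrt(<psi|psi>); <psi|psi> is always real and non-negative."""
    return math.sqrt(inner(psi, psi).real)

def distance(psi, phi):
    return norm([x - y for x, y in zip(psi, phi)])

psi = [1 + 1j, 0j]
print(norm(psi))                 # sqrt(2), about 1.4142
print(distance(psi, [0j, 0j]))   # same thing: the distance from the zero vector
```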

The inner product and the norm will form the basis for how we compute probabilities using the Born rule, which we saw in part 1.

All of this nonsense with Hilbert spaces and inner products is motivated by wanting to do calculus and mathematical analysis on objects that are *functions* rather than plain numbers (or vectors of numbers). This comes up because the big conceptual shift in quantum mechanics was moving from properties that had values which were real numbers to properties described by complex valued *functions* or *wave functions*. The issue was that we know how to do calculus over the reals, but calculus with function valued objects is a stranger thing. *Functional analysis* is the area of mathematics that studies this, and Hilbert spaces come from functional analysis. In the 30s von Neumann realized that functional analysis, Hilbert spaces, and operators were the right tools to use to build a unified basis for quantum mechanics. And that’s what he did in his famous book.

If we wanted to actually prove some of the things that I will later claim to be true about Hilbert spaces and operators we would need some of the more technical results from functional analysis. Doing such proofs is way above my pay grade so I’m mostly ignoring such things for now. But at the end of this whole story I’ll make a list of things that I left out.

After working out the mathematical basis for quantum theory, von Neumann went on to invent the dominant model that we still use to describe computers. So think about that next time you are feeling yourself after having written some clever piece of code.

The third important fact about Hilbert spaces that we will need is the idea of a *basis*. In a Hilbert space (really any vector space) a *basis* is a set of vectors that one can use to represent any other vector in the space using linear combinations. If this set is *finite*, meaning that you can count up the number of basis vectors you need with your fingers, then we say that the vector space is “finite dimensional”.

The most familiar example of a finite dimensional Hilbert space is \mathbb C^n, where the vectors are made up of n complex numbers, or \mathbb R^n, which is the same, but with real numbers. Here the basis that we all know about is the one made up of the unit vectors for each possible axis direction in the space. So, for n=3 the unit vectors are

\begin{pmatrix} 1 \\ 0 \\ 0 \\ \end{pmatrix}, \quad \begin{pmatrix} 0 \\ 1 \\ 0 \\ \end{pmatrix} \quad {\rm and} \quad \begin{pmatrix} 0 \\ 0 \\ 1 \\ \end{pmatrix}

To write down any vector v in the space all we need is three numbers, one to multiply each unit vector:

v = \begin{pmatrix} a \\ b \\ c \\ \end{pmatrix} = a\begin{pmatrix} 1 \\ 0 \\ 0 \\ \end{pmatrix} + b\begin{pmatrix} 0 \\ 1 \\ 0 \\ \end{pmatrix} + c \begin{pmatrix} 0 \\ 0 \\ 1 \\ \end{pmatrix}

By convention we write vectors in columns, which will make more sense in the next section.

And thus we have built the standard sort of coordinate system that we all know and love from 10th grade math.

This sort of basis for \mathbb C^n also has the property that it is *orthonormal*, meaning that with the standard inner product the basis vectors are mutually orthogonal (their pairwise inner products are zero) and each one has length one.
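Here is a small Python check of both properties (my own toy code, following the conventions above). The payoff of orthonormality is that the coefficients in the expansion of any vector are just inner products with the basis vectors:

```python
def inner(psi, phi):
    """<psi|phi> with the bra side conjugated."""
    return sum(x.conjugate() * y for x, y in zip(psi, phi))

# The standard basis of C^3 from the text.
e = [[1 + 0j, 0j, 0j],
     [0j, 1 + 0j, 0j],
     [0j, 0j, 1 + 0j]]

# Orthonormality: <e_i|e_j> is 1 when i == j and 0 otherwise.
for i in range(3):
    for j in range(3):
        assert inner(e[i], e[j]) == (1 if i == j else 0)

# For an orthonormal basis the expansion coefficients of any vector
# are just the inner products with the basis vectors.
v = [2 + 1j, -3j, 0.5 + 0j]
coeffs = [inner(ei, v) for ei in e]
assert coeffs == v
```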

In the rest of this piece we will assume that all of our Hilbert spaces have an *orthonormal* basis and that they are finite dimensional. Of course, the more famous state spaces in quantum mechanics (for position and momentum) are infinite dimensional, which is the other reason Hilbert spaces became a thing. But we will not deal with any of that complication here.

In classical mechanics we did not think about observables too much. They were just simple numbers or lists of numbers that in principle you can just read off of the mathematical model that you are working with.

But, in quantum mechanics, observables, like the states before them, become a more abstract thing, and that thing is what we call a *self-adjoint linear operator* on the Hilbert space \cal H. All this means is that for everything we want to observe we have to find a function from \cal H to \cal H that is *linear* and also obeys some more technical rules that I will sort of define below.

Linearity we have seen before. This just means that if you have an operator O that takes a vector \psi and maps it to another vector, then you can move O in and out of linear combinations of vectors. In particular

O(\alpha \psi) = \alpha O (\psi)

and

O(\psi + \phi) = O(\psi) + O(\phi)

The “self-adjoint” (or *Hermitian*) part of the definition of observables is more technical to explain.

As we all know from basic linear algebra, in finite dimensional vector spaces you can, once you fix a basis, write linear operators down as a matrix of numbers. Then the action of the operator on any given vector is a new vector where each component of the new vector is the dot product of the original vector with the appropriate row of the matrix.

So the easiest operator to write down is the identity (\bf 1)… which just looks like the unit basis vectors written next to one another

{\bf 1} = \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1\\ \end{pmatrix}

We can check that the application rule I outlined above works … here we write the vector we are acting on vertically for emphasis:

{\bf 1} \begin{pmatrix} a \\ b \\ c \\ \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1\\ \end{pmatrix} \begin{pmatrix} a \\ b \\ c \\ \end{pmatrix} = a\begin{pmatrix} 1 \\ 0 \\ 0 \\ \end{pmatrix} + b\begin{pmatrix} 0 \\ 1 \\ 0 \\ \end{pmatrix} + c \begin{pmatrix} 0 \\ 0 \\ 1 \\ \end{pmatrix}

So it works!

With this background in hand, we can define the *adjoint* of an operator A, which we write as A^* (math) or A^\dagger (physics). Anyway, the adjoint of A is an operator that obeys this rule:

\langle A \psi | \phi \rangle = \langle \psi | A^* \phi \rangle

for any two vectors \psi and \phi in \cal H.

In finite dimensional complex vector spaces (e.g. \mathbb C^n), where operators can be written down as matrices, you can visualize what the adjoint is by transposing the matrix representation and taking some complex conjugates. This is not the cleanest way to define this object since the matrix representation is dependent on a basis, and we can (and did!) define the notion of an adjoint without referencing a basis at all. But it’s not the end of the world.
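The conjugate-transpose recipe is easy to check numerically. Here is a toy Python sketch (my own made-up matrix and vectors) that verifies the defining property of the adjoint holds for it:

```python
def inner(psi, phi):
    return sum(x.conjugate() * y for x, y in zip(psi, phi))

def apply_op(M, v):
    return [sum(row[i] * v[i] for i in range(len(v))) for row in M]

def adjoint(M):
    """Conjugate transpose of a square matrix: (A*)_ij = conj(A_ji)."""
    n = len(M)
    return [[M[j][i].conjugate() for j in range(n)] for i in range(n)]

A = [[1 + 2j, 3j],
     [0j, -1 + 0j]]
psi = [1 + 1j, 2 + 0j]
phi = [0.5j, 1 - 1j]

# The defining property of the adjoint: <A psi | phi> == <psi | A* phi>
lhs = inner(apply_op(A, psi), phi)
rhs = inner(psi, apply_op(adjoint(A), phi))
assert abs(lhs - rhs) < 1e-12
```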

In infinite dimensional spaces and other more complicated situations finding the adjoint is more complicated. I’ll leave it at that.

A self-adjoint operator is just one whose adjoint is equal to itself. So it obeys the rule:

\langle A\, \psi | \phi \rangle = \langle \psi | A\, \phi \rangle .

We can remove the ^* because A = A^*.

In a lot of physics books you will also see self-adjoint operators referred to as *Hermitian* operators. In the finite dimensional complex case the two terms are equivalent.

Self-adjoint operators have some nice properties for physics. The reason why has to do with eigen-things.

Linear operators map vectors to vectors in a fairly constrained way. You have some freedom in how you transform the vector, but you don’t have *total* freedom since whatever you do has to preserve linear combinations.

But, for every operator there might be a special set of vectors that map to some scalar multiplied by themselves. That is, for some operator A and vector \psi you will have

A \psi = \alpha \psi

where \alpha is just a scalar. What this means, in some sense, is that the operator maps the vector to a multiple of itself. The only thing that changes is its length, or magnitude.

Vectors with this property are called *eigenvectors*, and the constants are called *eigenvalues*. Both words are derived from the German word “eigen” meaning “proper” or “characteristic”, but that doesn’t really matter. This is just one of those weird words that stuck around by habit.

Eigenvectors and eigenvalues come up in all kinds of contexts. They are important because they provide a way to characterize complicated transformations in a simpler way. If you have all the eigenvectors you can in principle switch to working in a basis where the transformation is a diagonal matrix, which is usually a simpler representation. The applications of this idea come up all over, from image processing to Google PageRank, to quantum mechanics.

The reason we wanted to have the operators that represent observables be self-adjoint above is that self-adjoint operators have two nice properties related to eigen-things.

- All the eigenvalues of a self-adjoint operator are real-valued (even though our state space is over the complex numbers).
- There is a famous theorem that says that every self-adjoint operator has a set of eigenvectors that form an *orthonormal basis* of the underlying Hilbert space. This theorem is called the *spectral* theorem, and the set of eigenvalues of the operator is called its *spectrum*. This is a very important result for quantum mechanics.
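Both properties are easy to see on a tiny example. Here is a toy Python sketch (my own made-up 2×2 self-adjoint matrix, with its eigenvectors written out by hand) checking that the eigenvalues are real and the eigenvectors are orthonormal:

```python
import math

def inner(psi, phi):
    return sum(x.conjugate() * y for x, y in zip(psi, phi))

def apply_op(M, v):
    return [sum(row[i] * v[i] for i in range(len(v))) for row in M]

# A self-adjoint operator on C^2 (it equals its own conjugate transpose):
A = [[0j, 1 + 0j],
     [1 + 0j, 0j]]

s = 1 / math.sqrt(2)
v_plus = [s + 0j, s + 0j]     # eigenvector with eigenvalue +1
v_minus = [s + 0j, -s + 0j]   # eigenvector with eigenvalue -1

# The eigenvalues are real: A v = lambda v with lambda = +1 and -1.
assert apply_op(A, v_plus) == [1 * x for x in v_plus]
assert apply_op(A, v_minus) == [-1 * x for x in v_minus]

# The eigenvectors are orthonormal, so they form a basis of C^2,
# exactly as the spectral theorem promises.
assert abs(inner(v_plus, v_minus)) < 1e-12
assert abs(inner(v_plus, v_plus) - 1) < 1e-12
```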

At this point you might be thinking to yourself, “I have seen this word *spectrum* before”. And you have. One of the earliest problems in quantum mechanics was to explain the spectral lines of the hydrogen atom. So you might be wondering, how do we get from these abstract quantum states and operators to energy? The answer is the next important rule of quantum mechanics, which we are already familiar with from part 1: there is a special observable for the energy of the system whose operator we call H, for the *Hamiltonian*. Time evolution of quantum states is then given by the Schrödinger equation:

i \hbar \frac{\partial}{\partial t} | \psi(t) \rangle = H | \psi(t) \rangle .

You will recall from part 1 that the wave functions, which we now know are the quantum states of a system, were all solutions to this equation.
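For an energy eigenstate the Schrödinger equation is easy to solve by hand: the state just gets multiplied by a rotating complex phase. Here is a toy Python check of that special case (my own made-up numbers, with units chosen so that \hbar = 1):

```python
import cmath

E = 2.0                       # a made-up energy eigenvalue
psi0 = [0.6 + 0j, 0.8 + 0j]   # a made-up eigenstate at t = 0

def evolve(psi, t):
    """|psi(t)> = exp(-i E t / hbar) |psi(0)>, with hbar = 1."""
    phase = cmath.exp(-1j * E * t)
    return [phase * a for a in psi]

psi_t = evolve(psi0, 1.5)

# The phase rotates, but every |amplitude|^2 is unchanged, which is why
# these solutions are called "stationary states".
assert all(abs(abs(a) ** 2 - abs(b) ** 2) < 1e-12 for a, b in zip(psi0, psi_t))
```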

Now, the trick to solving the hydrogen atom is first finding a Hamiltonian H that correctly describes the behavior of the electron in the atom. It turns out that when you do this H will be one of our coveted self-adjoint linear operators on the Hilbert space of wave functions. This means that there will be some set of states that obey this rule:

H | \psi \rangle = E | \psi \rangle

where E here is just a real number, rather than an operator. We use the letter E to stand for energy. These energies will be the energies that appear in the spectrum of the atom.

So here is why we were going on about eigen-things before (and linear operators before that, and vector spaces before that). The Hamiltonian for the hydrogen atom H is a self-adjoint operator whose eigenvalues are the energies in the spectrum of the atom. The eigenvectors are the electron wave functions that define the fixed energy levels at which we see spectral lines. And an amazing fact about the world is that you can actually set up a model of the hydrogen atom so that things work out in exactly this way. The setup is somewhat technical and complicated, so I don’t cover that here. I’ll use a simpler system to describe the rest of what I want to talk about.

Speaking of which.

At this point we have put together almost all of the formalism that we need. But this post has gone on too long, so I am going to make you read yet another part to get to the real point of this entire exercise. Meanwhile, here is a quick summary of what we have so far:

- States are vectors in a Hilbert space, usually over \mathbb C.
- Observables are self-adjoint linear operators on that space.
- The possible values of observables are the eigenvalues of the corresponding operator, and the eigenvectors are the states that achieve those values. In addition, for the operators that represent observables, we can find eigenvectors that form an orthonormal basis of the underlying state space. Which is really convenient.
- There is a special observable for the energy of the system whose operator we call H, for the Hamiltonian. Time evolution of states is then given by the Schrödinger equation.

Of course, I *still* have not said anything about measurement, and you should be furious with me. I promise I will in part 3.

Here are some things I like.

Isham’s Lectures on Quantum Theory is a nice treatment of the subject that is more mathematically rigorous than most.

Peres and Ballentine are more “physics oriented” books that start from the algebraic point of view. Weinberg also covers this material from a more traditional point of view; it’s a nice illustration of how the physics view and the algebraic view are related.

Scott Aaronson’s Quantum Computing since Democritus is a nice computer nerd’s view of the world.

Brian C. Hall’s book on Quantum Theory for Mathematicians covers a lot of the more technical details about Hilbert spaces and their operators in a more mathematically rigorous way.

Frederic Schuller’s lectures on quantum mechanics also give you a rigorous mathematical view of this material.

I got it into my head that I should try to explain part of the problem with quantum mechanics on this web site. I am, of course, no expert on this subject at all. But I wanted to do a relatively simple and shallow (but mostly correct) treatment, like my category theory tutorial. So, over the last few months I’ve taken a few different shots at it but never found a way to wind it up into a single coherent train of thought. I wanted to thread my way through the physical puzzles to the mathematical formalism and then end up at the particular formula that, in my mind, sums up at least one of the problems.

I finally realized that trying to fit the whole thing into a single stream of words is beyond my talents as a writer, or at least not a structure that fits well into a single page on this web site. So I decided to split it up: this first part is just about the move from “classical” mechanics to quantum problems … and then one or more future pages will be about the rest.

As with my other technical expositions on subjects that are not about computers, I am the furthest thing from an expert on this subject, I’m just organizing what I think are the most interesting ideas about what is going on here, and hoping that I’m not too wrong. I’ll provide a list of more better sources at the end.

To understand why quantum mechanics has puzzled people for so long we first have to go back to the mechanics that you might or might not have learned in high school or college physics. You remember …

F = ma,

all those stupid force diagrams with boxes and ramps and ropes and stuff.

It turns out that what all of this nonsense was hiding (which they tell you about sophomore year in college if you major in physics) is that every single one of these problems can be set up so you put some numbers into a single black box, turn a crank, and every answer that you ever needed falls out the other side. This magic box is a set of *differential equations* that describe how the system you have described evolves in time. I am not going to go into the details of how differential equations work, because honestly I don’t know them. But, for reference they look something like this:

\frac {d {x}}{dt} = \frac{\partial H}{\partial p}, \quad \frac {d {p}}{dt} = - \frac{\partial H}{\partial x}

Here, x represents position and p represents momentum (momentum is the mass of the object times its velocity … p = mv. For some reason this is a more convenient way to work than with the velocity directly). H is called the *Hamiltonian*, named after the mathematician who made it up: William Rowan Hamilton. It is a measure of the total energy in the system.

What the formula says, basically, is that if you have a thing and you can express the energy of the thing in the right way, then given any specification of an initial position and velocity of the thing, I can tell you exactly where the thing will be later and how fast it will be moving. All I need is a computer and the formula.

This basic set of mathematics is how we send probes millions of miles into space and have them hit a particular position over (say) Jupiter 5 years from now exactly when we think they will.
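To see the crank being turned, here is a tiny numerical sketch (my own toy, with made-up units) for the simplest interesting case, a mass on a spring, where H = p^2/(2m) + (1/2) k x^2:

```python
import math

# Harmonic oscillator: H = p^2/(2m) + (1/2) k x^2, so Hamilton's equations are
#   dx/dt = dH/dp = p/m    and    dp/dt = -dH/dx = -k x
m, k = 1.0, 1.0
x, p = 1.0, 0.0           # initial position and momentum
dt, steps = 0.001, 1000   # integrate from t = 0 to t = 1

for _ in range(steps):
    p -= k * x * dt       # semi-implicit Euler: update momentum first...
    x += (p / m) * dt     # ...then position, using the new momentum

# With m = k = 1 the exact answer is x(t) = cos(t), p(t) = -sin(t).
t = dt * steps
print(x, math.cos(t))     # the two agree to a few parts in a thousand
```

Put the numbers in, turn the crank, and the trajectory comes out: completely deterministic, matching the closed-form answer as finely as you care to step.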

We will not really concern ourselves with the mathematical details of all of this, but there are two important thoughts to keep in mind here:

- In the above model, every “thing” that we study carries a definite value for these two attributes that we are calling “position” and “momentum.” These two values completely define the behavior of the objects that we are studying using this framework. So, the objects move through space on smooth and *completely predictable* paths, and it seems like their current state (position and momentum) is absolutely determined by their past state.
- More importantly, the model above directly computes all possible values of x and p that could possibly exist. That is, when you put your numbers in and turn the crank the numbers that come out are always, within the limitations of experimental error, the numbers that you see when you look at the real world. So you can, for example, throw a ball in the air and carefully track its position and speed at all times, and it will match the formulas pretty much perfectly. Not a lot of mystery.

By the end of the nineteenth century physics had developed two very successful models for how the world works: mechanics and electromagnetism. Both fit into the mathematical and intellectual framework outlined above: behaviors determined by smooth and deterministic differential equations that compute values that are “real” in the actual world. Life was good.

The problem was that it didn’t work.

Quantum mechanics was originally born to describe the motion of atoms and things related to atoms. The development of the theory was driven by the experimental discovery of a host of behaviors that “classical” physics could not explain:

- The behavior of the so-called “black body” radiation.
- The photoelectric effect.
- The puzzle of why atoms were stable, when according to classical E&M they should immediately collapse.
- The appearance of spectral lines at discrete frequencies in the spectrum of an atom.
- “Spin” and all that.
- The famous two-slit experiment.

And so on. All of these experiments are related to the “motion” of atomic (very very small) particles and radiation. The puzzling thing about these experiments with atoms and light was that while we think of atoms and their constituents as “particles”, some of the behaviors that were observed only make sense if you model them as “waves”. On the other hand, classical E&M models light as a wave … but some of these experiments (the photoelectric effect) only made sense if light behaved more like a “particle”.

Over the first quarter of the 20th century various ad-hoc models and ideas were proposed to explain these things. But it wasn’t until the late 20s and early 30s that all of these ideas were codified into a more or less unified theory that we call quantum mechanics. The answer, it turned out, was to model material particles as waves, or “wave functions” that are solutions to a particular differential equation, the famous one from Schrödinger:

i \hbar \frac{\partial}{\partial t} | \psi(x, t) \rangle = H | \psi(x, t) \rangle .

Here the odd notation | \psi \rangle is used to denote the wave function of the quantum particle. I will go more into where this notation comes from in the next part.

The rest of the formula seems familiar enough on a surface level. H is again the Hamiltonian, and as before is related to the total energy of the system you are studying.

In fact, if you noodle around with this formula in just the right way you can come up with some mathematics that does a pretty good job computing the energy levels of the lines in the spectrum of the hydrogen atom. Recall that hydrogen is made up of a single proton with an electron whizzing around it. To explain the spectrum Bohr famously built a hypothetical model of the atom where electrons can only sit in a certain set of orbits that each have specific fixed energies. It turns out that when you set up the equations correctly you can find solutions to a version of the Schrödinger equation that give you wave functions at exactly these energies. You don’t get orbits, but you do get the so-called “stationary states” that are completely stable and match up with the spectral lines perfectly. So in some sense the electron is just sitting there waving around in some space in one of many possible fixed configurations. You’ve seen the pictures of the electron shells, right?

So what we have learned is that we can use Schrödinger’s equation and some smarts to tell us “where” the electron is in the atom. As ever, I will not go into the details. There are any number of books that will explain this to you. For example, Jim Baggot’s book is good for the physics point of view, while Stephanie Singer covers much the same material from a more mathematical viewpoint.

All of this makes you really want to believe that the wave function describes some sort of physical wave-like *thing* spread over all of space (x) and time (t) that will tell you something about the relationship between “where the particle is” and “what the energy is”. The fact that photons and even electrons create interference patterns that are very much like the ones you get from water waves in the two-slit experiment (see Feynman’s famous description here) makes you want to believe this even harder.

But, sadly, this is not so.

The waves in classical mechanics are an aggregate phenomenon created by the motion of lots of things (air molecules, water molecules, etc) at once. Even more abstract entities like electromagnetic waves still have a sometimes visible macroscopic manifestation (let there be light!). In addition, as I mentioned before, the classical equations, in some sense, describe behavior that you can *directly observe*. You know the waves are moving through space on a particular trajectory because you can look at (say) the sky and *see* the light shining down on you.

The quantum wave function is nothing like this. Those complex numbers that are waving around are doing so in a space completely disconnected from the real world. In particular, they don’t tell you where the photon or electron *is*. Instead all they tell you is something about the chance that you have of seeing it somewhere if you look there.

But they don’t even tell you this probability directly. Instead, to get probabilities you have to compute something called the *norm* of the wave function, which is a measure of its overall magnitude … like its length if it were a piece of string. We write the norm of the wave function like this: |\psi| or |\psi(x,t)|. If you know how to compute it then the probability of finding an electron (say) at point x in space would be

P(x,t) = |\psi(x,t)|^2 .

Computing this norm usually involves some kind of fancy integral. This interpretation of the wave function is called the *Born Rule*, and I’m not going to go into the particular details of how one computes these things here. I will say though that this formula explains the interference patterns that you get in the two slit experiment. This computation turns up in a lot of “beginner” books on quantum mechanics, including the one by Feynman that I linked to above.
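To make the rule a bit more concrete, here is a toy numerical sketch (my own example, not from any particular book) using the textbook ground state of a particle in a box, whose wave function happens to be real, so |\psi|^2 is easy to compute and the “fancy integral” is just a simple sum:

```python
import math

# Born Rule toy check. The ground state of a particle in a box of
# width L has the (real-valued) wave function
#   psi(x) = sqrt(2/L) * sin(pi * x / L)
# The Born Rule says P(x) = |psi(x)|^2, and the probability of finding
# the particle *somewhere* in the box must come out to 1.

L = 1.0  # box width (arbitrary units)

def psi(x):
    return math.sqrt(2.0 / L) * math.sin(math.pi * x / L)

def born_probability(a, b, n=10_000):
    """Probability of finding the particle in [a, b]: integral of |psi|^2."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h      # midpoint rule for the integral
        total += abs(psi(x)) ** 2 * h
    return total

print(born_probability(0, L))      # whole box: ~1.0 (normalization)
print(born_probability(0, L / 2))  # left half: ~0.5, by symmetry
```

Note that nothing here tells you where the particle *is*; it only tells you the odds of finding it in a region if you look.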

This rule feels like the luckiest in a series of lucky guesses. But it is undefeated in terms of experimental confirmation. Every experiment that has been done in quantum mechanics has amounted to thinking about a wave function, defining the right Hamiltonian, and then computing probabilities with the Born Rule, and the numbers are always right. Sometimes they are right to a ludicrous level of precision too.

In the famous double-slit experiment, for example, you send a beam of photons through one screen that has two very thin slits cut into it. Then you put a set of detectors some distance behind this screen. The mathematics of quantum mechanics will tell you that you should see an interference pattern on your detector array. It will even tell you the exact shape and configuration of the pattern. If you work hard enough you can probably compute this configuration with a stunning level of precision.
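As a cartoon of where the pattern comes from, here is a sketch (my own simplification: unit amplitudes, with a single phase difference standing in for all the path-length geometry) showing that the Born Rule applied to the *sum* of the two slit amplitudes produces bright and dark fringes, while adding the probabilities instead would give a flat result:

```python
import cmath

# Two-slit cartoon. Each slit contributes a complex amplitude at a
# given detector position; the phase difference between them is set by
# the difference in path lengths. The Born Rule probability uses the
# *sum of amplitudes*, and that is what creates interference fringes.

def amplitude(phase):
    # unit-magnitude complex amplitude exp(i * phase)
    return cmath.exp(1j * phase)

def intensity(phase_difference):
    psi = amplitude(0.0) + amplitude(phase_difference)  # psi1 + psi2
    return abs(psi) ** 2                                # Born Rule

# In phase: amplitudes reinforce, 4x the single-slit intensity (bright fringe).
print(intensity(0.0))        # ~4.0
# Out of phase: amplitudes cancel (dark fringe).
print(intensity(cmath.pi))   # ~0.0
# Adding *probabilities* instead (|psi1|^2 + |psi2|^2) gives 2.0 at
# every detector position -- no fringes -- which is what you see when
# you measure which slit each photon went through.
```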

But quantum mechanics can’t really tell you anything about “what happens” to any single particle while it travels between the slits and the detector wall. The theory says nothing about it.

This, I think, is the first great mystery of the theory. It’s not so much that you can only compute and predict probabilities; there are many physical processes for which that is true. The real puzzle is that while the mathematics that I have hinted at above gets all the right answers, it does not appear to provide any insight into any actual physical process from which those answers can be derived. That is, your experiments always work, but it’s never really clear what is “really going on” in the “real” world.

Worse, as is well known, if you try and figure out what happens for yourself by (say) *looking* at each one of the slits to see which way the photon goes … the whole experiment falls apart and you get no quantum interference. Instead the act of measuring the position of the photon in some way seems to lock you into a history where all the photons suddenly take a single well defined path to the detector array, rather than creating the wavy interference that we got before.

This is, as you can imagine, a very unsatisfactory situation. Physics is supposed to tell you *what happened* and *where things go*. Classical mechanics seems to do this perfectly, right down to having an exact and satisfying connection between the mathematical model and what you observe in the real world. We get none of that in quantum mechanics. It is more like a computer program that always spits out the right answer but for which you do not have the source code, so you can’t reason about the exact mechanism by which the answer was generated.

In addition, quantum mechanics seems to make you accept a world where the equations that tell you how systems evolve behave one way (the smooth Schrödinger equation) when you leave them alone and another way (no interference) when you look at them. This is one aspect of the so-called “measurement problem”, and a lot of people smarter than me have thought about it and still find themselves confused. I am also mostly confused about this, but it will take a few more details to get at the core of why.

See you later, in part 2.

If what I have written makes no sense or you want to figure it out for yourself, here are some better sources than this humble web page.

Travis Norsen’s Foundations of Quantum Mechanics is a great introduction to this material. A good combination of nuts and bolts physics and discussions of the conceptual issues.

Baggott’s Quantum Cookbook is a good semi-historical treatment of early QM.

Stephanie Singer’s algebraic treatment of the hydrogen atom is also enjoyable, but much more technical from a mathematical point of view.

This series of lectures from Allan Adams at MIT is very good.

Sean Carroll’s book is OK, as is Philip Ball’s book. They are both good non-technical explanations of the conceptual problems in the theory, to the extent that this is possible. Sabine Hossenfelder’s Youtube channel is also a good source for material at this level.

On a more technical level, this paper about “Quantum Myths” is a nice antidote to the sort of woo woo mysticism that too much of the writing on this subject indulges in.

You should also read all the John Bell stuff, and various things by David Mermin.

So about a month ago I had managed to get past this fight with a solo caster, which as all *Souls* veterans know is the easiest way to solo anything. Even so it took a week or so of noodling around and getting used to the rhythm of things, especially the big change-up in phase 2, where she becomes much more aggressive and your attack windows get much smaller.

My next goal was to try and get past the fight with minimal casting. To do this I had been running a strength/faith build, mostly using fire damage swords and such since I could not find a faith spell that really worked for general purpose combat.

Let’s see how that went. Yeah, it was going well.

Actually, this video makes it seem worse than it was. I spent a long time trying to work out what my approach was going to be, and died a ton while doing this “research” since I did not really practice any one approach very well. Once I picked something it took about a week to build up enough muscle memory to get through both phases.

I saw three possibilities:

- Parry strategy. This works great if you have the reflexes. Which I do not. More later.
- Blasphemous Blade strategy. This takes advantage of my faith/melee build to do maximum burst damage because she is weak to fire. It has other issues though. And, the weapon skill on this weapon is a bit too … “casty”. That is, you do a lot of damage from distance if you have enough time to get it off. But those windows are rarer than with the spell caster because the wind-up for the attack is too long.
- Something else.

Let’s go over these one by one.

Parrying Malenia has two main problems:

- You have to learn all the moves that you can parry. There are about six, which I will list below.
- In a new twist, *Elden Ring* makes you do *three* parries for every critical against the major bosses. This is to combat the “Gwyn Problem” where if you learn how to parry just well enough the fight is trivial. Needing to constantly get three in a row means that you have to be so good you hardly ever miss.

The main moves that Malenia does which you can parry are:

- The short forward dash followed by a wide swing.
- The “sword click” followed by a forward dash followed by three fast swings.
- The jump to *your* left followed by a swing.
- The jump to *your* right followed by a swing.
- Two immediate fast swings.
- The jump and quick twirl in the air, followed by a swing.

I should have cut together a video showing examples of each of these and me missing. But I forgot to do that before getting past the fight. This Reddit thread describes the moves in some more detail. And this video and this video show most of the moves. But the guy hits all the parries so it doesn’t help to teach you how this fails.

Of course she has other moves you can’t or shouldn’t parry. You can’t parry the spin kick that she uses to get into a lot of her other combos. You can’t parry Waterfowl. You also can’t parry some of the new stuff she does in phase 2. I think you *can* parry the move where she jumps up in the air and then dashes forward and stabs you to death (which she always does after you heal, because she’s a cheater) but it’s not worth it because the attack is so easy to punish anyway. Finally, you don’t need to parry the swing where she swoops up into the air and then lands and asks you to please hit her.

The problem with the parry strat for *me* is that I can only parry the first three of the six moves above with any reliability. God knows I tried. It seems to me that you not only need to get the timing *just right*, especially for the faster moves, you also need to get your spacing *just right* and your parries can fail if you are just a bit too far away, or too close, or not perfectly lined up left to right.

Worse, when my parry misses I don’t know why, because it feels like I did it just like before. If I could even have done five out of six I might have tried for this more, because then I would just dodge what I could not parry. Sadly there were two that I just could not learn: the two fast swings and the floaty jump to the right. I could get each of them about half the time, but could never figure out what I did wrong when I missed.

So I finally gave up.

I did the fight this way for a while, but didn’t like the feel of it. Later, I also tried dual wielding this sword and a second large sword (Claymore) with fire and doing jump attack spam. That worked pretty well too, but I found it hard to keep the right spacing. So let’s talk about spacing in this fight.

Spacing for this fight is not an issue for the first quarter of her health bar. But, once she is down to 60% or 70% health, she will bring out Waterfowl Dance seemingly at any time, even in windows when you used to think you were safe doing a big attack. Like at the end of this video:

Here she does that jump dash, which you can roll through and then punish. But *not this time*. Joke’s on me.

So, what you want for this fight is something that can poke her for damage from pretty far away, so you can keep your spacing to escape things like Waterfowl as long as you are careful.

The great swords like the Blasphemous Blade and the Claymore are great, but a bit short for this. So I went looking around for other ideas, while at the same time continuing to noodle with this combo.

Note: I realize that if you are good you can evade a point blank Waterfowl Dance using that move where you spin in place and then dodge out at just the right second … or you can use Bloodhound’s Step, or you can switch to a shield for the first flurry and then dodge out. All of these things will work for you if you have faster muscle memory than I do, but they don’t work for me because I don’t. I can only remember two or three moves at a time in a fight like this, and needing to switch up in a window even as relatively long and telegraphed as the wind up to WFD is hopeless for me. Believe me, I tried. It just never worked.

So the main other thing I tried was to use longer weapons with moderate bleed and/or frost to get my pokes or longer jump attacks. Obviously Rivers of Blood would also have worked here, but I’d already done that.

The real problem with this idea was that this character was a Faith/Strength build and all the moderate bleed weapons are Dex weapons. I tried to do some moderate respecs to get more Dex damage but ultimately didn’t have the patience to try and get to a 50/50/50 Faith/Strength/Dex build … which even if I had done it would have been the ultimate try hard move anyway.

Still, for those of you with Dex builds who don’t want to use Rivers … try cold katanas. These are nice because you can get two kinds of burst damage: one from the frost procs and one from the bleed. I used an Uchi and the Nagakiba with dual wielding. Both are great. They just did not hit hard enough with my build so the fight took too long.

I also tried the Bloody Helice for a while because I saw someone using it in a YouTube video. Again, this could have worked on a Dex build. But that’s not what I had.

At this point I had to take an enforced one week vacation from this job to visit some folks and eat Chinese food and stuff. While I was away, I found the way that I would ultimately beat this fight, and of course it was Emarrel who showed me what to do:

This person has been showing me how to play these games better for almost a decade now, and here they are using a giant strength-oriented sword to beat this fight.

Since I had this weapon already, I gave it a try when I got back. And after a week or so it finally worked.

The reason it worked is that the poke attack with the Greatsword (and all the other colossal swords) is super overpowered. It’s as fast as the katanas, faster than the medium sized swords, has fast recovery, and hits harder and from a longer distance than almost anything else. Finally, you can get the poke out of a roll *or* a crouch, so it’s easy to get into as well.

So here is the setup:

- Greatsword and 60 strength.
- Giant Hunt ash of war, which is also a sort of forward poke and has the added bonus of often throwing the boss up into the air and leaving her to land in a heap. Always fun. Also very good damage for not much FP use.
- Cold on the sword, for the occasional frostbite burst damage.
- Various other body buffs, the strength and stamina Physick for phase 2, and the Godrick rune. You can save scum to avoid running out of rune arcs and buff items if you want.

Now just keep your spacing, do an R1 poke for fast attacks, and L2 poke for bigger hits, and take the frost bonus when you get it.

If you are better than me you can also chain a bunch of pokes and maybe a jump attack or two to get a stagger for even more burst damage. But I’m not aggressive enough to get this to happen that much. When I try too hard for the stagger I mostly end up out of position and get Waterfowled to death.

Speaking of Waterfowl. If you watch the Emarrel video you will notice that you can dodge this attack if you have almost any distance between you and Malenia when she does the first hop up in the air. So the things to avoid are:

- Ending up directly under her when she starts the thing.
- Running away too slowly at the start and getting murdered by the first flurry.

Still, I have recently managed to survive this attack even being as badly out of position as in this video:

So maybe some day I’ll be able to avoid it even point blank. But I would not get my hopes up.

What I do instead is try to keep enough distance to be able to run away from the attack and use the long range of the sword pokes to get damage in. This is actually the hardest thing about phase 1 of the fight. She is so passive in phase 1 that to make any progress in the fight you have to walk about and try to hit her in the face. But if you do that you risk being right next to her when the WFD starts.

I ended up using Lightning Spear from range to try and bait out the Waterfowl when I know it’s likely coming anyway. This works remarkably well at least once or twice per run.

Once you are pretty good at phase 1 you just have to see phase 2 enough to learn how to parse the visual cues with all the wings and shit on the screen and get used to the much higher level of aggression. Where in Phase 1 she does a lot of slow walking followed by long combos, in Phase 2 she really doesn’t let up at all and you have to get a tactical poke in here and there to interrupt the flow of attacks. Otherwise you just lose control and die. Like here:

When you can put this all together, and get some good RNG, you will finally win, and it will feel pretty good.

I’m amazed that I only took one hit in that run. I wonder if I can duplicate the effort at level 1. Check back here in six months to see how I do.

After all this time with this fight I still mostly like it. The main obvious bullshit is the blatant input reading (dash attacks after every heal, Waterfowl whenever you over-commit and get too close) that the boss is doing. Phase 1 is also too slow, while phase 2 is too fast. But for me these are minor issues. It’s still the most interesting fight in the game, and I would not really mind going right back to it again right now.

Ten things I forgot to mention in my previous page on *Elden Ring*.

Soul, er, blood echoes, er, whatever farming. There is a great spot where, in the grand FromSoft tradition, you can do pretty fast farming on dumb enemies. You have to do the quest of Varre who you meet in the opening area of the game. It takes a bit of work, and you need to kill one great rune boss. But then you can get infinite levels for a while as long as you have a bow.

Just to be more explicit than I was in my long ramble: Leveling Dex, Faith, and Arcane, running the Uchi and some bleed stuff through the early and mid game, and then melting the late game with the mimic, *Rivers of Blood*, *Swarm of Flies*, and *Rotten Breath* is a pretty happy way to get through without too much fuss. Even Malenia is not *too* hard with this combo. Of course, get the *Pulley Bow* for the bow cheese!

Caelid and Volcano Manor are where all the early power items are. Of course, Caelid and Volcano Manor are both horrible places.

The single worst navigational path in the game is how you get to the other part of the Altus Plateau (where Volcano Manor is) from the part you see first. It’s like driving from Pittsburgh to Cleveland via Toronto.

I tried dual wielding the biggest swords in the game and it was kind of comical.

The jumping attack talisman is sort of over-powered. But of course you have to be good at landing jumping attacks.

One of the most interesting things about this game is all of the different stacking buff items (talismans, armor, physicks, spells) that increase your stats, or your damage, or both. The different combinations are a dizzying puzzle if you are a bit too much into min/maxing. It’s not too hard to find combos that double or triple the damage of specific attacks without necessarily needing to be in some low health “RTSR” mode like in the older games. In my magic run at Malenia the *Night Comet* spell I used started at about 500-600 damage in a vanilla setup. After the buffs (and a few more levels for a different staff) I got it to do 1300-1500 or more. Fun stuff.

Here is the setup: 65-ish INT, Starscourge Heirloom for +5 INT, Marika’s Soreseal for +5 INT and +5 Mind, Godfrey’s Icon for +15% damage on charged attacks, Staff of the Lost in the off hand for +30% damage on Night Sorceries (you only have to hold the staff to get the buff, handy!), and the Magic Shrouding Cracked Tear in the Physick potion for +20% magic damage (this lasts 3 minutes, so at the end of the fight it was done). I also used *Rennala’s Full Moon* for the 10% magic resist debuff, but this only lasts for the first minute of the fight. All the buffs multiply together to get to 2-3x of my original base damage. Super fun.

The crazy NPC quest with the talking pot is funny.
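The 2-3x buff claim checks out if you just multiply the listed percentages together. Here is a back-of-the-envelope sketch (my assumption, not something the game documents: each percentage buff acts as a simple multiplier and they all stack multiplicatively):

```python
# Back-of-the-envelope check of the buff stacking described above.
# Assumption (mine): each listed percentage is a plain multiplier and
# they stack multiplicatively.

buffs = {
    "Godfrey's Icon (charged attacks)":      1.15,
    "Staff of the Lost (Night Sorceries)":   1.30,
    "Magic Shrouding Cracked Tear":          1.20,
    "Rennala's Full Moon resist debuff":     1.10,
}

multiplier = 1.0
for name, factor in buffs.items():
    multiplier *= factor

print(round(multiplier, 2))  # ~1.97x from the percentage buffs alone
# The +10 INT from the talismans and the better staff raise the base
# damage on top of that, which is how 500-600 becomes 1300-1500+.
```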

The various stat buff items, and the Serpent Hunter spear in Volcano Manor, are what make me think I could possibly do a low level run of this game. Need to find some friendly spirits to use too. Maybe *Dung Eater*.

Speaking of which, the *Dung Eater* quest is pretty funny/tragic too. And what weird armor.

Bonus #11: Have to try the bubble horns on a faith build. They look stupidly powerful. Yes, a giant horn that shoots magic bubbles.

I have not written anything since February because of *Elden Ring*. So here we go.

*Elden Ring* is the new game from FromSoft. But you don’t need *me* to tell you that. *Elden Ring* is everywhere. Even my brother tried it. They have sold millions and millions of copies, resulting in millions and millions of twitch streams, youtube videos, tiktoks and no doubt instagram stories (or whatever) about all the strange FromSofty things in *Elden Ring* that various people are running into for the first time.

But, I have some thoughts, so I will write them down. As always, this overview will tell you things that you could have already learned from other places on the Internet weeks or months ago. And there will be spoilers. Even the title is a spoiler.

So, is it good?

Yes, it’s good. It’s probably the best game in the “Souls” framework (i.e. not *Bloodborne* or *Sekiro*) since *Dark Souls*. The love is back baby.

Where to start. Let me explain. No wait. There is too much. Let me sum up.

New names for souls and bonfires, which people will just call souls and bonfires: check.

Giant castles: check.

Giant swamps filled with poison: check.

Villages of the damned? Check.

NPCs and vendors who disappear for no reason, thus locking you out from seeing entire storylines or buying supplies later in the game? Check.

Crazy weapons with crazy move sets? Check.

Andre the Blacksmith? Check (pretty much).

Burn the world down in order to restore order to a fallen society? Check.

Mediocre Dragon fights. Check.

Bow Cheese! Check.

A giant maze of under-city sewer tunnels that all look the same and are populated with terrible giant curse frogs but the actual path through the maze is actually three steps long: check.

Of course, the new game adds a few new things to the mix:

Horse!

Jumping puzzles!

Vertical navigation! I never thought I’d live to see a video game that embodied that distinctly Pittsburgh driving feeling of seeing the place you want to be 200 feet below you and having no idea how to get there.

Crafting! Mostly useless. But kinda cool. I am of course famously against crafting.

So many dragon fights.

Giant underground space cities.

And, as usual, FromSoft have also streamlined and softened various aspects of their trademark difficulty engine:

There are almost no long boss runs except in some of the optional dungeons. And even when there are there is a secondary checkpoint system that allows you to avoid them if you want in most cases.

Weapon upgrades are a lot simpler, though still too complicated. There are only two kinds of upgrade stones, and there are a ton of them. And, the game now completely separates weapon upgrades from things like elemental (fire, lightning, holy, magic, bleed) damage affinity. The Ashes of War system lets you mix and match weapons with damage types as much as you want with no penalties for switching back and forth. I’ll have more to say about this later because it is one of the best things about the new game.

The spirit summoning mechanic is great for people who don’t want to play online, but also want a little help taking the edge off the intense aggression that modern FromSoft bosses tend to have. The spirits are often better than summoning other live players who don’t know the fights yet. They don’t increase the health of the boss and all that. And, if managed correctly they can basically beat the game for you if you want.

The “open world” aspect of the game allows you to distract yourself from the boss you can’t beat by running around and collecting flowers, pots, rocks and clearing mini-dungeons and mini-bosses until you feel like beating your head against the brick wall again. This ability to change the pace of the game is nice.

I had played the game for maybe 50 or 60 hours before realizing you can warp from anywhere you might be standing, and not just from checkpoint locations. The Stockholm syndrome is real.

As a whole I think this video game takes everything that the fans loved about the first *Dark Souls* and brings it up to a somewhat higher level of refinement and execution. I have more detailed thoughts about what they did particularly well below.

As far as overall PVE gameplay goes I think melee characters with utility casting are still probably the most straightforward and strongest for the whole game. Dex/Faith/Bleed in particular seems to be the OP build of choice. That said, you can play the whole game as a caster and have that nice secure caster feeling throughout if you do the setup right. This is a welcome change from the later Souls games where it seemed almost impossible to get enough power behind a magic run early in the game. But maybe I was just bad at it.

I have not dabbled in PVP yet, except for one quest-line that required it. I might have more thoughts on that later. Maybe.

OK. Now on to some more specific details.

For me the juicy meat center of the *Souls* games has always been the combat system and the wide variety of weapons and move sets available in the system. No other game series has a combat system that combines relative simplicity (only two kinds of attacks) with the ability to chain various moves into satisfying combos without having to memorize particular sequences of button mashes. Also, the true joy of the games, and the core of their almost infinite replayability, was always redoing stuff you’ve already done before, but with a different combat style and/or different weapons. In my thoughts on Dark Souls 3 I expressed a bit of disappointment in this area, having found no weapon more fun than the relatively lowly long sword with which to beat the game. So I went back and beat *Bloodborne* twice more.

You will find no such complaints from me with regard to *Elden Ring*. The magic is back … and the lightning, and the holy damage, and the bleed. My god the bleed.

I think this return to form comes from two main sources:

They have somehow figured out how to add some new twists to the already wide universe of move sets and weapon styles from the previous games (two words: bubble horn).

The Ash of War skills are a version 2.0 of the *weapon arts* mechanic from *Dark Souls 3*, something that I spent that entire game ignoring because I could find nothing interesting to do with it. In *Elden Ring* the Ashes of War serve two purposes. First, they add a “skill” or special attack to the weapon (on L2). Unlike the weapon arts from *Dark Souls 3* there are moves here that are incredibly fun to spam over and over again and also coincidentally melt even the hardest in-game enemies, including some of the hardest bosses. So win-win.

But Ashes of War *also* allow you to adjust how the damage on the weapon scales. So, instead of having to choose between 15 different upgrade paths each with their own special upgrade stones you can move a weapon instantly from strength, to dex, to int (magic and cold), to faith (fire and lightning), to bleed and then back again whenever you want by resting at a checkpoint. This way if there are no (say) faith or magic weapons with a move set that you like, no problem! Just take a (say) Claymore and then put it on whatever track you want.

But wait! There is more! Some of the ashes are not really even related to doing *damage* with the *weapon* directly. The most obvious of these are the buffs, especially the buffs involving bleed (oh my god the bleed). There are skills that mimic various spells that you would normally only be able to cast if you had built a caster. There are skills that are there just to stun lock enemies. Finally, there are some skills that are just better dodge rolls and have nothing to do with damage at all.

Over all the Ashes of War are a brilliant refinement of a mechanic that I honestly never gave a shit about. But now I’ll actually run around the game for half an hour to find some obscure skill just to play around with it.

A few favorites:

Taker’s Flames on the Blasphemous Blade (a boss weapon no less!), which does huge fire damage and also heals you from the damage.

Whatever the crazy gravity move is on Radahn’s Sword (another boss weapon!)

Everyone’s favorite L2-spam boss melter: Corpse Piler from the Rivers of Blood katana. The related, but much weaker Bloody Slash is also fun.

The Artorias flippy flippy attack that’s on the Claymore (Lion’s Claw).

Golden Vow, a 20% damage buff. No faith needed.

Square Off, the default on the Long Sword, does *ludicrous* damage in the early game.

I’m told that Bloodhound’s Step is great. I haven’t tried it … yet.

The first two items on this list remind me to cover one other aspect of the weapons in this game. The boss weapons are actually good! Well, at least three of them are (Rykard, Radahn, and Malenia). This is unheard of.

Finally, no discussion of the weapons is complete without cheering for the return of the giant pizza cutter wheel from *Bloodborne*. What a great meme weapon:

The video also shows one of the hilarious new status effects in the game: sleep. There is even a sword that deals “sleep damage” as its main thing. Usually sleep is just a softer stagger effect, but the asshole in the video actually goes to sleep and lets you murder him. I find this endlessly enjoyable because I am a child. You do the same thing later in the game when you have to fight both of these assholes at once:

I will never not love this.

Bow cheese gets its own section, because I love it. Any time there is a door that is too small for a large enemy, like fat ogres:

or a giant worm dragon:

or a giant dragon dragon:

Or anytime your foe walks slowly in straight lines:

Or is stuck somewhere and can’t path to you:

You are all set my friend. Just R1-R1-R1-R1 until they die. I will even circumnavigate a giant castle to bow cheese something:

Bows in this game are great. You can even get a move that apes the terrifying rain of arrows that one of the bosses uses against you. I do not make the best use of it in this video:

But with the right build, this move melts things.

Speaking of builds. The fairly open game world of *Elden Ring* makes it perhaps more straightforward than in the other From titles to run around the map picking up useful items and upgrade materials before getting on with the business of “progressing” the game. Of course, the main places to do this seem to also be the most hostile to low level characters (poisoned swamps, towns built on lava, that sort of thing) and as always some of these schemes involve dying on purpose. But, as long as you are good at running away, you can get yourself a pretty powerful setup fairly quickly.

The best things to go after, in order of easiness are:

The Radagon soreseal, which gives you +5 to health, endurance, strength and dex. The extra health and endurance are useful even for non-melee characters. And the strength and dex points will let you get to the minimum stats needed for most of the weapons usable early.

Somber smithing stones. It’s not too hard to get sombers up to 6 without killing much of anything. You used to be able to get up to 9, but I hear that they finally patched out the jump you need to make to get the 7. So you can’t do that anymore.

Various early golden seeds and sacred tears. For healing potions.

Regular smithing stones. You can get the stones up to 5s (and a few 6s) in various mines and tunnels. Just don’t fight the bosses until you really need to. Getting more 6s, and the 7s and 8s is harder. You get three normal weapon levels from each class of regular smithing stone, so +15/+18 normal is sort of like +5/+6 somber. But you need more regular stones to get there (12 for each set of 3 levels) … and in general it’s much more of a pain to get a regular weapon to high level than the somber weapons.

Lots of other buff items, mostly talismans, but also weapon skills and spells and such. Golden vow, the flame damage buff spell, and the physical damage (Strength and Dex) physicks are the best things to get here.

To get an idea of how to approach this watch some of the speed runs on youtube. Speed runners are the best at building powerful characters fairly quickly. Although sometimes the stuff they do to make this happen is impossible for mere humans to duplicate.

The one stat buff item aside, your main concern is leveling weapons. Weapon leveling is the key to getting powerful in FromSoft games, and *Elden Ring* is no different. Character leveling is mostly for health, and thus latitude for making mistakes. In general before the late game you can plan on getting enough stones to upgrade one or two somber and one or two regular weapons to close to max level. You can’t really do more than this, so keep this in mind when budgeting upgrade materials.

As I mentioned above, one happy development in *Elden Ring* is that it’s not too hard to run a mostly casting magic person through the whole game, without needing to spend most of the early game playing melee while your casting sucks. You have to be a bit picky about which staff you use, and getting a good setup depends a lot on various buffing strategies. But overall it’s pretty viable and fun to just sit back and R1-spam little blue bolts of death (or large purple rocks) and not have to think about learning dodge timings.

My favorite spells right now are:

Great Glintstone Shard because it hits harder than Pebble. Pebble is also a dumb name for a thing that is supposed to deal death.

Night Comet because certain bosses don’t dodge it.

Rock Sling for the stagger.

The giant Comet spell is useful sometimes, but hard to set up.

I guess I should try the flurry of stars spells (Star Shower? Stars of Ruin?) but I didn’t follow the right NPC quests to get them.

I also tried doing a more casting oriented Faith build, but it didn’t work out. Especially in the early/mid game I could not find an incantation that hit hard enough to use as a main thing. So I got the creepy fire sword from the snake guy instead.

Both Faith and Magic users seem to have a wide variety of small fast weapons (so many katanas) to choose from to supplement casting with melee. The variety of faith weapons seems more extensive and includes all the bleed stuff and all the light sabers.

The choice for casters who also want to use giant swords is a bit more limited. But you can do it. Especially if you use magic, just go get the Radahn Sword (the Starscourge Greatsword). It’s hilarious.

I like the “New Chalice Dungeons” (Tunnels and Catacombs) because they feel more connected to the game and less random than the Chalice Dungeons from Bloodborne. The fact that they are not randomly generated is also a win. The “Evergaol” (“Everjail”) mini-boss fights are also fun.

There have been complaints about the balance of enemy difficulty vs. reward. And I will echo these complaints. It’s odd to have a certain class of super tanky respawning regular enemy that can be harder to kill than many of the final bosses and gives you almost nothing in return. Not even a lot of souls (no wait … blood echoes … no wait … whatever).

There have been other complaints about recycled bosses. I am less sympathetic to this complaint. I need all the practice I can get learning these complicated boss move sets. So having multiple tries at it is a win for me.

Fuck that tree level man. It’s so mean.

Another combat mechanic that comes from the earlier games that I am still trying to like but mostly just suck at is dual wielding. You can do some stupid damage this way, especially, of course, with bleed and cold. I remain too uncoordinated to make this work, especially as I reflexively L1 to block and end up swinging instead. Maybe I’ll do a forced dual wield run.

I don’t like the new crystal lizards. Half of them drop nothing. Which stinks and is a waste of time. Do better FromSoft.

So many buffs. So many different kinds of buffs. I have not even talked about the Wondrous Physicks.

Oh. I can’t forget about the single most disturbing FromSoft enemy ever:

Luckily, burning them is hilarious:

OK. With all that out of the way let’s talk about Malenia.

Malenia is an optional boss at the end of the also completely optional “Haligtree” area of the game. This is one of the toughest areas of the game, even if you are at end game stats, so it is appropriate that Malenia is one of the toughest fights in any recent FromSoft game. On an overall difficulty scale she can be right up there with Isshin, the three phase final boss in *Sekiro*, and the Orphan of Kos from *Bloodborne*. The fight is mechanically rich, hard to learn and parse, and hard to control when you are in the middle of it. And yet, if you collect up enough help, luck, or memes it can also be on the trivial side. You can watch a youtube video of someone doing this fight and never actually duplicate their exact experience unless you know *exactly* what they were doing. It is this multiple nature that makes the fight so interesting.

The first time I beat this boss I did it with the mimic, brute force, and a lot of luck. That was fine, for the time, but I immediately felt sort of unsatisfied. So after I finished my first playthrough I immediately started two more, one with a caster and one with a more mixed melee and faith character, partly to get a first or second look at stuff I missed, and also to get back to this fight to learn it again.

My second time through the fight was with a caster, and I really wanted to do it without the summon. I was able to do this, though there are still parts of phase 2 that I can’t counter with 100% reliability. Phase 1 I mostly understand, and I am proud to say that I mostly learned to dodge her most famous move, the Waterfowl Dance multi-flurry attack. But I have to be far enough away from her to make it work. The trick is to run away from the first two bursts, and then dodge back into her and make her fly over you so she is too far away for the second two bursts to hit you. It’s like how you backstab the Orphan in the first part of that fight.

Getting through phase 2 without a friend is really tough because her aggression is pretty relentless, and some of the attacks in phase 2 come out looking really similar, so you do the counter for the wrong thing and then die.

Here is what it’s like when you lose control. This happens easily in Phase 2 but I’ve had plenty of this feeling in Phase 1 as well.

The next frontier is to do the fight solo and with mostly melee instead of mostly casting. I have not yet successfully done a solo melee oriented fight where I even got to phase 2. Mostly I can’t keep from getting hit, and the way the fight is designed she ramps up her aggression on every hit she lands until finally the whole thing gets out of your control and you die.

The next next frontier will be to try and do the fight (and the rest of the game) at low level. I think this game lends itself to low level runs because even low level characters can use super powerful weapons. But, perfectly avoiding all damage will, of course, be really hard.

This fight will keep me interested for a long time to come, much like Manus in *Dark Souls* or the Orphan in *Bloodborne*. It has a lot of different layers to it, which makes it fun. And, it really rides an almost perfect satisfaction curve as it progresses from looking completely impossible (Waterfowl Dance!?!?!?), to randomly deadly, to sort of controllable as you understand it better and better. You will never be happier and sadder at the same time to finally win a fight in a video game than when you first beat this one. I wonder whether they built this fight first, and then put the rest of the game around it. None of the other late bosses in *Elden Ring* are even half as interesting, so I like to think that this was the case. So of course in true FromSoft fashion they trolled us all by making it “optional”.

Here’s my one solo win. I’m stacking three or four different magic buffs to get that damage (the magic damage wondrous physick, a magic damage talisman, the charged attack talisman, the staff of the lost on the off hand that buffs night comet, the magic resist debuff from the full moon spell). I’d describe it all in more detail but this post is already too long. So maybe next time.

In my opinion this is the fight that makes the game. I really like it. Playing the game without doing this fight is almost like not having played the game at all. The weapon you get from winning is also great.

Long time readers of this site will know that I have spent a non-trivial portion of my adult life writing Objective-C code for the NeXT^H^H^H^H MacOS and iOS platforms at various stages of their development.

One quirk in the Objective-C runtime that every programmer needs to deal with is the following strange behavior: if you call a method on an object that is `nil`, the call just falls through like nothing happened (usually).

That is, if you make a call like this:

`result = [someObject someMethod:someArgument]`

and `someObject` is the value `nil`, so it points at nothing, the behavior of the Objective-C runtime is just to ignore the call like nothing happened and effectively return a value similar to zero or `nil`.

In the more modern versions of the runtime, the area that the runtime uses to write the result of the method call is zero’d out no matter what the type of the return value will be. In older versions of the runtime you could get into trouble because this “return 0” behavior only worked if the method returned something that was the size of a pointer, or integer, on the runtime platform. And on PowerPC if you called a method that returned a `float` or `double` you could get all kinds of undefined suffering.
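For flavor, here is a toy sketch in Python (not Objective-C, obviously) of what this fall-through semantics feels like: every method called on a nil receiver silently does nothing and hands back a zero-ish value.

```python
class Nil:
    """Toy model of Objective-C nil-messaging: any method called on
    nil is silently ignored and returns a falsy, nil-like value."""

    def __getattr__(self, name):
        # Every "method" on nil is a no-op that returns nil again,
        # so you can even chain calls without crashing.
        return lambda *args, **kwargs: self

    def __bool__(self):
        return False

nil = Nil()
result = nil.someMethod("someArgument")  # no exception, nothing happens
assert not result
assert not nil.foo().bar().baz()         # chained calls also fall through
```

The real runtime does this with a receiver check at the start of the message dispatch path, but the observable effect is about the same as this toy.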

Anyway, I was having a chat with a nerd friend of mine at work, and we both got curious if this behavior dated back to the *original* Objective-C runtime or if it was added at some point. With the entire Internet at our fingertips surely this could not be that hard to figure out.

So I poked around, and the earliest reference that I could find with straightforward searches was a reference to this behavior on USENET in 1994:

You can read that here.

I did manage to find a copy of the original Objective-C book on archive.org, but there is no mention of this behavior in that book.

I also found this historical article from the ACM last year but it also did not specifically talk about the `nil`-messaging behavior.

Then I realized I should search bitsavers.org but I really wasn’t sure what to look for and the site was loading slowly. Disappointed, and feeling lazy, I decided to see what twitter thought:

This, it turned out, was the perfect thing to use the giant nerd village for.

Within a few minutes there was confirmation that the behavior certainly existed and was documented by 1995.

Then later we got back to 1993.

There was runtime nerding by runtime nerds.

There was also funny runtime snarking by the same runtime nerds. This tweet, by the way, is true. I have seen such code and will leave it at that.

Then, I got this message which won the day with a reference back to a post in the Squeak forums, of all things. So now we have the following facts:

The long post verifies that the original Objective-C runtimes threw an error when told to send messages to `nil`, and that this was changed to the current fall-through behavior in a release of some software called “ICPack 201”. This package was released by a company called Stepstone, which originally developed and owned the language in the 80s. The only information about this company that I could find on the Internet was the wikipedia entry, which mentions “ICPack 201” but does not say when it was released.

But, it *does* say that the package was proposed to the Open Software Foundation when they did their Request for Technology for their window management system, the software that eventually became *Motif* (shudder). Now, the wikipedia entry for Motif says that the RFT for the OSF window manager happened in 1988, so this means that “ICPack 201” must have shipped sometime around 1988. Hooray!

Finally, a few hours later I got this reference to more NeXT documentation from 1990. Of course this manual is on bitsavers, like I figured it would be.

While in the end I could have found it myself, the great nerd convergence around this question was kinda fun.

But, don’t let this apparently heartwarming story change your mind about twitter. It’s still a cesspool that should mostly be avoided.

But once in a while it’s not too bad.

It turns out that I did find this stackoverflow thread in my first few searches, and it also has the reference to the Squeak post. But I missed it my first time through. So there you go. If I had had better eyes I would have missed out on a minor twitter storm.

I have this long standing problem with the iTunes/iTunes Store/Apple Music catalog system. I can summarize it in one screen shot.

Open up Apple Music on your Mac or iOS device, it does not matter, and in the search field try and search for “Martinů viola”. Martinů was an early 20th century Czech composer of modern classical music that I particularly enjoy. In particular he wrote a lot of interesting music for viola. Anyway, here are the results you will get:

Anyone who knows me has heard me complain about this behavior. My emotional state about it swings from amusement, to detached fatalism, and occasionally to unhinged anger.

But, I have a guess about why it happens.

The thing is, the iTunes (and thus Apple Music) catalog model is built on *songs*. So the search engine is, naturally, also oriented towards finding *songs*. So if the data you want is not in the *song*, and in particular the *song title* or the *main artist*, iTunes will not find it.

But, the classical music albums (and to some extent jazz, as well) defeat this assumption by not putting enough useful metadata in the songs. The titles do not contain the composer. The main artist is usually the performer, and not the composer. The composer is shuttled off to its own field of the song record, which I assume is either not indexed at all, or not weighted heavily when evaluating the relevance of the search results. Here is a modest example:

Here Martinů appears in the album title and in the composer fields, but never in the places that are important: song titles and the main artist field. The result is that when I search for “Martinů” in the general Apple Music catalog, Apple Music does not find high relevancy hits for the name “Martinů” anywhere, but it *does* find good hits for various “corrections” of the spelling of the name. So it thinks I am confused about the name and it shows me that stuff instead. The only way to find this record is to search on the performer’s name. But that assumes that I already knew that I wanted this specific performance, and is thus useless if I want to see all of the available performances of some particular thing.

When I search my *local* catalog for the same string, it does better. Partly this is because I have adjusted my local metadata, partly this is because my local catalog does not have as much music by popular artists whose name is a short edit distance away from “Martinů”.

The fix for this would be to either fix the search engine (unlikely) or fix the data (also unlikely). In particular, if we rewrote all of the Apple Music meta-data to follow this rule from my list of iTunes Rules, then things would be better:

The second rule is: the iTunes data model is fairly simple, so aggressively de-normalize the data. This is especially true for Classical music where the single artist single song model really breaks down. If you are not careful, you’ll go and browse albums or songs on the iPod and see 50,000 titles called “String Quartet XYZ in B Major” and so on. This is useless. The solution is to put the key artist or composer in every field of the database so they will show up in all major views in both iTunes and on the iPod. Of course, you have to do some work to be careful and keep your de-normalized formats as uniform as possible. Life is hard.

Basically, we should spam *all* of the database fields with the composer name … but especially the song titles and main artist fields. Those are the only bits that the search engine really seems to care about (which kinda makes sense), so we should tune the meta-data to take advantage of this.
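To make the point concrete, here is a toy model (entirely made up; not Apple’s actual engine) of a search that only consults the song title and main artist fields, and what de-normalizing the composer into the title buys you:

```python
# Toy catalog search that, like the guess above, only indexes the
# song title and main artist fields. The example record is made up.
catalog = [
    {"title": "Rhapsody-Concerto for Viola", "artist": "Some Violist",
     "composer": "Bohuslav Martinů"},
]

def search(songs, query, fields=("title", "artist")):
    """Naive substring search over only the indexed fields."""
    q = query.lower()
    return [s for s in songs if any(q in s[f].lower() for f in fields)]

assert search(catalog, "Martinů") == []      # composer field never consulted

# De-normalize: spam the composer into the title, per the rule above.
catalog[0]["title"] = "Martinů: Rhapsody-Concerto for Viola"
assert len(search(catalog, "Martinů")) == 1  # now the record is findable
```

Same data, same engine; only the placement of the metadata changed.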

Then maybe people would start complaining that they get some Martinů viola concerto album when they were searching for their favorite Martina McBride record. Or the Halo 2 soundtrack.

Two more notes:

Spotify seems to be better at this, as it appears to realize that the name “Martinů” is important in this context so it does not weight name corrected hits as highly. If you dig a bit deeper into their search hits you can still see some silliness, but it’s not as bad. This is why I sometimes keep Spotify around for random explorations of different areas of music. But for the most part I find their app annoying to use.

The best way to do these kinds of searches is to type, for example,

`Martinů site:music.apple.com`

into your favorite search engine, because just matching the literal text ends up working better than the various name correction heuristics in the music service search engines.

The James Webb Space Telescope finally took off into orbit today. In the lead up to the launch I had read up some of the background for this device, and let me just say that what they are trying to do with this telescope is super ultra bonkers.

The super short summary is: fold a telescope mirror that is four times the area of the Hubble *and* a giant foil sun-shield that is even larger into a (relatively) tiny little tube and shoot it into an orbit a million miles from the Earth where it unfolds by remote control. The sun-shield keeps the sun off of it so it can stay cold enough to collect the infrared light that its mirror is optimized for.

At this point my brain twitched a bit. Infrared? Why optimize a telescope for infrared?

Well. You may recall the story I told once about telescopes and spectral lines. See, it turns out that when you shine light through a cloud of glowing gas (say) and look at it with an instrument called a *spectrograph* that breaks the light up into a spectrum, the spectrum will be dotted with either dark or bright lines at very specific frequencies. Like this:

The mechanism that causes these lines to appear where they do had no explanation until the early 20th century, and was one of the founding problems of quantum mechanics. But that’s a story for a future article.

The spectral lines of hydrogen and other atoms have taught us more than you might be able to imagine about how the universe works and what it is made out of. As I also mentioned in my older article, one of the most important things they tell us is that the universe is expanding. Let us review.

Since light, in some ways, acts like a wave, it is subject to an effect called the Doppler shift. This is (sort of) the same physics that makes the pitch of a sound seem to shift up (higher frequencies) if the source is moving towards you and down (lower frequencies) if the source is moving away from you.

When you take the spectrum of an object that is emitting light, and that object is moving away from you very quickly, you will notice that all of the spectral lines shift towards the red part of the rainbow. In the 1920s Edwin Hubble used this fact to show that the universe is expanding by collecting light from far away galaxies and noting that the further away the object was, the stronger the red shift. This fact combined with other evidence like the existence of the cosmic microwave background radiation has led to all of our current cosmological models for the overall evolution of the universe.

With this context, the motivation for building a giant IR telescope is more clear. If you make the mirror big enough to see things even further away than the Hubble can see, then the red shifts will be even stronger and will eventually shift the light completely out of the visible and into the infrared. So, what the JWST will do is fly in an orbit a million miles from the Earth and sit behind a giant piece of tin foil blocking all the light and residual IR warmth coming from the Earth, the Moon and the Sun. Then it will point its giant mirror into the great void hoping to collect light that is more than 13 billion years old. This will give us a view of the oldest and most distant stuff that we can currently observe in the universe. These things are so far away and so old that you can’t really imagine it.
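The arithmetic behind “shifted completely out of the visible” is just a stretch factor of (1 + z). A quick sketch with rough numbers (the z = 10 example is mine, not from any mission spec):

```python
def observed_wavelength_nm(rest_nm, z):
    """Cosmological redshift stretches every wavelength by (1 + z)."""
    return rest_nm * (1 + z)

# Hydrogen's Lyman-alpha line sits at about 121.6 nm (deep ultraviolet).
# For light from a galaxy at redshift z = 10, roughly the early era
# this telescope is after:
shifted = observed_wavelength_nm(121.6, 10)
print(round(shifted, 1))  # 1337.6 nm, well past visible light (~400-700 nm)
```

So ultraviolet light from the early universe arrives here as infrared, and an infrared mirror is the only way to catch it.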

Here: the stuff is three or four times as old as the current age of the solar system, which is already older than you can possibly imagine. See?

Anyway, if all goes well over the next few months there will be a new robot telescope sitting in space continuing the grand tradition in modern science of being able to look at things that we can’t even begin to see or imagine in our every day existence. This is both the basic tool and the ultimate allure of science. I of course ended up working with computers, which are nothing if not strange and invisible puzzles. But they don’t tell you too much about the big questions, like how the universe began. At least not directly. I still like to try and keep up with astrophysics though. The ability of humans to look at and learn about things that are so far beyond what we can see around us in our immediate “real” world is the only reason to have any real hope for the future these days. So let’s hope it works.

So we’re coming close to the end of year *two* of the great dark stupidity. I don’t have a pithy end of year post in me this time. It requires too much energy to generate something like that, especially given the general weight of narcissistic self-interested bullshit that hangs so heavily over everything.

As I’ve said before, I am *personally* fine, but the world overall does not show many signs of macroscopic improvement. Though I would be negligent if I did not show at least a bit of appreciation for the few small tastes of real life we had in the summer and early fall. Concerts even. Those were good.

Here’s hoping the pieces get bigger over time.

I’m not optimistic though … let’s let tiktok, of all things, illustrate the point.

Today a few things that I’ve added, or improved, to my home kitchen repertoire since the great darkness began. No sourdough bread here, sorry.

OK. This is a bit of a cheat since I’ve already mentioned this in my ode to ma-po tofu earlier in 2020. Still, it deserves another mention. Make ma-po tofu, put it on tater tots.

That’s all.

I stole this idea from this guy, but tweaked it a bit for my own style.

You can do this in an Instant Pot (best), a rice cooker (also good) or on the stove (probably fine, but tedious). Put the following things in the pot:

1 rice cooker measuring cup of rice. These are smaller than an actual cup measure. But I don’t know by how much.

6 to 7 cups of water, or water and chicken stock.

A bit of “chicken powder”, or MSG, or both.

5 or 6 big slices of ginger.

1 4-6oz piece of frozen white fleshed fish fillet (cod or whatever). This is the one sort of fish I buy from Whole Foods.

Turn the pot on. For the Instant Pot I run a “porridge” cycle which is 10-15min at high pressure. You should run whatever cycle is appropriate for your device. When the pot is done, stir the rice around. Add some soy sauce, salt, white pepper, and green onion. Now you have hot breakfast for a week. Throw a soft egg on top for an extra bonus.

Gently poached chicken has been a staple of mine for years. But I thought I’d give it an extra shout out now for two reasons:

More people need to see this. This is so much easier and better (in some ways) than roast chicken. I don’t know why it’s not more of a thing.

This technique plays a big part in Dave Chang’s new cookbook. So I gotta ride that train too.

Here is what you do. First, buy a medium sized whole chicken. 3 to 4 pounds is ideal.

Now fill a pot with enough water to cover the chicken. Bring it to a soft boil and dunk the chicken in. If you troll around youtube you’ll see a technique involving multiple dunks. But you don’t need to do this. Just drop the chicken in the water and turn the heat to low. Eventually the water will come back to a gentle simmer. Cover the pot and let the chicken sit like that for, say, twenty minutes to an hour depending on how cooked you want the meat. Then turn the heat off and let the chicken sit some more.

When you are done you’ll have beautifully tender, soft and moist chicken meat. Pull it all off the carcass and put it in the bowl to drain and dry off. If some of the meat is not completely cooked, don’t worry. You can finish it off later when you use it in some other dish.

Now take the mostly bare bones and put them back in the pot with onion, celery, carrots and whatever else you want (or, for a more Chinese/East Asian style stock, just some celery, ginger and green onion) and simmer that for another couple of hours. Pour out into pint containers and stick them in the freezer.

Now you can make at least two meals in any number of ways.

For chicken rice, take some rice and put it in the rice cooker with the stock. Cook at normal, serve with the poached meat from above and chili sauce.

For chicken soup saute some onion, celery and carrots in a soup pot on medium heat until they get soft. Then add stock, water, white wine, your favorite greens, and some of the cut up meat. Bring up to a boil and then simmer it for a while. Finish with salt, white pepper, fish sauce or soy sauce and a bit of MSG.

For an extra umami bomb soup try the mushroom soup.

Add the cooked meat to your congee for extra protein.

For chicken and biscuits … start the same way as the chicken soup, but only add a cup or two of stock and then thicken the whole thing with a “bechamel” that you make from half a stick of butter, 4 tablespoons of flour and a cup of milk. Cook that until it’s nice. Serve on top of biscuits.

The possibilities are endless.

Extra note: the skin on a poached chicken seems to be a thing people disagree about. Chinese people seem to love it. Others, not so much. More for me.

This is the single best cooking idea I’ve had in ten years.

The NYC egg and cheese (or, more correctly, baconeggandcheese, one word) is a particular class of sandwich that is hard to find other places. A few places in Pittsburgh try to do breakfast sandwiches but mess them up by fussing too much over the bread, or by doing them on horrible biscuits, or committing any number of other sins. But, one local place has done this really well: The Pear and the Pickle. But, the pandemic has been hard on them, and it’s not clear whether they are going to come back next year.

So I have been experimenting with this dish at home, especially after noting that I could buy the same rolls that P&P use in their shop (the rolls are very important). So here is what I do.

You need: a bag of Mancini’s Egg Kaiser rolls, one egg, some bacon, and one slice of Kraft American cheese.

Cook some bacon however you like to. I make it crispy in a pan, or microwave.

Make a soft fried egg with a bit of salt and pepper. You don’t need too much salt, since the Kraft Single and the bacon will be salty.

While the egg is cooking (for 3 or 4 min) cut a roll in half, put the Kraft Single on one half and put both halves in your toaster oven and lightly toast it. It takes my toaster just about 4min to get the roll to the right state and melt the cheese a bit. You don’t want it all the way toasted … just a bit, to dry out the upper layers of the roll while keeping it a bit squishy.

Now lightly squish both halves of the roll and then put in the egg and bacon to construct the sandwich. Cut in half and enjoy.

Our old friends from the Thin Man sandwich shop did a popup today in the space that Black Radish uses as a catering kitchen. Thin Man closed in 2017, a casualty of skyrocketing rents in the Strip District. But while they were open they were easily doing some of the best sandwiches in town.

Their signature eponymous sandwich, the *Thin Man*, is a thick schmear of a great chicken liver mousse, endive, and bacon on a crusty roll:

Today they also had a spicy Vietnamese style beef meatball sub, that was great.

Anyway. I post this just as an extended shout out that does not fit well into our currently shitty social media channels. And also to point out that I realized at this event that in my earlier missive about perfect sandwiches I had left Thin Man off the list. And I should not have. Their stuff is great, and as good as I’ve ever had. Though the bread might be a bit too fancy. And don’t sleep on the chicken liver mousse. Get thee to the next popup and get some, if you can.

In a fit of nerd cliché, I spent the last month or two trying to understand the *Yoneda Lemma*. It turned out that what I really needed to do was to figure out how every different writer comes up with their own strange notation to write the result down. So of course I wrote a document explaining this to myself. In an equally predictable twist, to do this I made up my own notation for everything. But I list most of the others too, since that was the point.

Then I translated the \LaTeX into markdown (mostly with pandoc, I’m not an idiot) and added this blurb. So now you can read it here too. This page was the inevitable result of making a web site that can render \TeX. So I might as well own it.

But, the pdf looks much better: so you should read that instead.

**Note**: I am not a mathematician or a category theory expert. I just wrote this down trying to figure out the language. So everything in this document is probably wrong.

The Yoneda Lemma is a basic and beloved result in category theory. Even though it is called a “lemma”, a word usually used to describe a minor result that you prove on the way to the main event, the Yoneda lemma *is* a main event. It is a result that expresses one of the main goals of category theory: it characterizes universal facts about general abstract constructs.

Its statement is deceptively simple [9]:

Let \mathbf{C} be a locally small category. Let X be an object of \mathbf{C}, and let F: \mathbf{C}\to {\mathbf {Sets}} be a functor from \mathbf{C} to the category {\mathbf {Sets}}. Then there is an invertible mapping \mathop{\mathrm{\mathit{Hom}}}(\mathbf{C}(X, -),F) \cong FX that associates each natural transformation \alpha:\mathbf{C}(X,-) \Rightarrow F with the element \alpha_X(1_X) \in FX. Moreover, this correspondence is natural in both X and F.

But as Sean Carroll famously wrote about general relativity, “…, these statements are incomprehensible unless you sling the lingo” [1].

I am going to do the following dumb thing: having stated a version of the lemma above I’m going to define only the parts of the category theory needed to explain what the lingo means. There are five or six layers of abstraction that I will try to explain. As for the larger meaning of the result itself, you are on your own. I won’t explain that, or even really show you how the proof goes.

In the spirit of video game speedruns [6], we will skip entire interesting areas of category theory in the name of getting to the end of our “game” as fast as possible. Clearly this will be no substitute for really learning the subject. Any of the references listed at the end will be a good place to start to better understand the whole game.

**Note**: Again, I am not a mathematician or a category theory expert. I just wrote this down trying to figure out the language. So everything in this document is probably wrong.

Categories have a deliciously chewy multi-part definition.

**Definition 1**. A *category* \mathbf{C} consists of:

- A collection of *objects* that we will denote with upper case letters X, Y, Z, ..., and so on. We call this collection \mathop{\mathrm{\mathit{Objects}}}(\mathbf{C}). Traditionally people write just \mathbf{C} to mean \mathop{\mathrm{\mathit{Objects}}}(\mathbf{C}) when the context makes clear what is going on.

- A collection of *arrows* denoted with lower case letters f, g, h, ..., and so on. Other names for *arrows* include *mappings* or *functions* or *morphisms*. We will call this collection \mathop{\mathrm{\mathit{Arrows}}}(\mathbf{C}).

The objects and arrows of a category satisfy the following conditions:

- Each arrow f connects one object A \in \mathop{\mathrm{\mathit{Objects}}}(\mathbf{C}) to another object B \in \mathop{\mathrm{\mathit{Objects}}}(\mathbf{C}), and we denote this by writing f: A \to B. A is called the *source* (or *domain*) of f and B the *target* (or *codomain*). Source and target are somewhat more intuitive terms, but domain and codomain connect the language to functions in other areas of mathematics.

- For each pair of arrows f: A \to B and g : B \to C we can form a new arrow g \circ f: A \to C called the *composition* of f and g. This is also sometimes written gf.

- For each A \in \mathop{\mathrm{\mathit{Objects}}}(\mathbf{C}) there is an arrow 1_A: A \to A, called the *identity* at A, that maps A to itself. Sometimes this arrow is also written as \mathrm{id}_A.

Finally, we have the last two rules:

For any f: A \to B we have that 1_B \circ f and f \circ 1_A are both equal to f.

Given f: A \to B, g: B \to C, h: C\to D we have that (h \circ g) \circ f = h \circ (g \circ f), or alternatively (hg)f = h(gf). What this also means is that we can always just write hgf if we want.
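If you like to see definitions run, here is a minimal sketch of these rules in the category {\mathbf {Sets}}, where objects are sets, arrows are functions, and composition is ordinary function composition. Everything here (the sample arrows f, g, h) is my own invented example, not something from a library:

```python
# A sketch of Definition 1 in the category Sets: arrows are Python
# functions, composition applies the inner function first.

def compose(g, f):
    """The composite g . f: apply f first, then g."""
    return lambda x: g(f(x))

def identity(x):
    """The identity arrow 1_A at any set A."""
    return x

# Three sample arrows between (subsets of) the integers.
f = lambda x: x + 1
g = lambda x: 2 * x
h = lambda x: x - 3

# Rule 1: 1_B . f = f = f . 1_A.
assert compose(identity, f)(10) == f(10) == compose(f, identity)(10)

# Rule 2: associativity, (h . g) . f = h . (g . f), checked pointwise.
lhs = compose(compose(h, g), f)
rhs = compose(h, compose(g, f))
assert all(lhs(x) == rhs(x) for x in range(-5, 5))
```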

Given a category {\mathbf C} and objects A, B in {\mathbf C} we write \mathop{\mathrm{\mathit{Arrows}}}_{\mathbf C}(A,B) to mean the collection of all arrows from A to B in {\mathbf C} if we are being maximally careful. In practice we will usually write \mathop{\mathrm{\mathit{Arrows}}}(A,B) because the {\mathbf C} subscript is tedious and it’s usually clear what category A and B came from. People also write \mathop{\mathrm{\mathit{Hom}}}(A, B) or \mathop{\mathrm{\mathit{Hom}}}_{\mathbf{C}}(A,B), or \mathop{\mathrm{\mathit{hom}}}(A, B) or just \mathbf{C}(A,B) to mean \mathop{\mathrm{\mathit{Arrows}}}(A,B). Here “\mathop{\mathrm{\mathit{Hom}}}” stands for homomorphism, which is a standard word for mappings that preserve some kind of structure. Category theory, and the Yoneda lemma, it turns out, are mostly about the arrows.

I have broken with well established tradition in mathematical writing and mostly spelled out names for clarity rather than engaging in the strange and random abbreviations that I see in most category theory texts. The general fear of readable names in the mathematical literature is fascinating to me, having spent most of my life trying to think up readable names in program source code. Life is too short to deal with names like \mathop{\mathit{ob}}, or \mathbf{Htpy}, or \mathbf{Matr}. Luckily, for this note the only specific category that we will run into is the straightforwardly named {\mathbf {Sets}}, where the objects are sets and the arrows are mappings between sets.

Speaking of sets, in the definition of categories we were careful about not calling anything a *set*. This is because some categories involve collections of things that are too “large” to be called sets without getting into set theory trouble. Here are two more short definitions about this that we will need.

**Definition 2**. A category \mathbf{C} is called *small* if \mathop{\mathrm{\mathit{Arrows}}}(\mathbf{C}) is a set.

**Definition 3**. A category \mathbf{C} is called *locally small* if \mathop{\mathrm{\mathit{Arrows}}}_{\mathbf{C}}(A,B) is a set for every A, B \in \mathbf{C}.

For the rest of this note we will only deal with locally small categories, since in the setup for the lemma, we are given a category \mathbf{C} that is locally small.

Finally, one more notion that we’ll need later is the idea of an *isomorphism*.

**Definition 4**. An arrow f: X \to Y in a category \mathbf{C} is an *isomorphism* if there exists an arrow g: Y \to X such that gf = 1_X and fg = 1_Y. We say that the objects X and Y are *isomorphic* to each other whenever there exists an isomorphism between them. If two objects in a category are isomorphic to each other we write X \cong Y.

Note that in the category {\mathbf {Sets}} the isomorphisms are exactly the invertible mappings between sets. An invertible mapping is also called a *bijection* (because it’s injective and surjective, you see), so you will see that word sometimes.
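Here is a tiny sketch of Definition 4 in {\mathbf {Sets}} (the particular sets and mapping are my own made-up example): an invertible mapping f with its inverse g, witnessing that the two sets are isomorphic objects.

```python
# A bijection f: X -> Y in Sets, encoded as a lookup table, with its
# inverse g: Y -> X built by flipping the table.

f = {0: 'a', 1: 'b', 2: 'c'}           # f: {0,1,2} -> {'a','b','c'}
g = {v: k for k, v in f.items()}       # the inverse mapping

# g . f = 1_X and f . g = 1_Y, checked pointwise.
assert all(g[f[x]] == x for x in f)
assert all(f[g[y]] == y for y in g)
```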

As we navigate our way from basic categories up to the statement of the lemma we will travel through multiple layers of conceptual abstraction. At the base of this ladder are the categories, which are themselves already an abstraction of the many ways that we express “mathematical structures”. But we have much higher to climb. Functors are the first step up.

Functors are the *arrows between categories*. That is, if you were to define a category where the objects were all categories of some kind then the arrows would be functors.

**Definition 5**. Given two categories \mathbf{C} and \mathbf{D} a *functor* F : \mathbf{C}\to \mathbf{D} is defined by two sets of parallel rules. First:

- For each object X \in \mathbf{C} we assign an object F(X) \in \mathbf{D}.

- For each arrow f: X \to Y in \mathbf{C} we assign an arrow F(f): F(X) \to F(Y) in \mathbf{D}.

So F maps objects in \mathbf{C} to objects in \mathbf{D} and also arrows in \mathbf{C} to arrows in \mathbf{D} such that the sources and targets match up the right way. That is, the source of F(f) is F applied to the source of f, and the target of F(f) is F applied to the target of f. In addition the following must be true:

- If f:X \to Y and g: Y \to Z are arrows in \mathbf{C} then F(g \circ f) = F(g) \circ F(f) (or F(gf) = F(g)F(f)).

- For every X \in \mathbf{C} it is the case that F(1_X) = 1_{F(X)}.

Thus, the mappings that make up a functor preserve all of the structure of the source category in its target, namely the sources and targets of arrows, composition, and the identities.

If F: \mathbf{C}\to \mathbf{D} is a functor from a category \mathbf{C} to another category \mathbf{D}, X \in \mathbf{C} is an object in \mathbf{C}, and f: X \to Y is an arrow in \mathbf{C} we may write F X to mean F(X) and Ff to mean F(f). This is analogous to the more compact notation for composition of arrows above.

Functors can be notationally confusing because we are using one name to denote two mappings. So if F: \mathbf{C}\to \mathbf{D} and X \in \mathbf{C} then F(X) is the functor applied to the object, which will be an object in \mathbf{D}. On the other hand, if f : A \to B is an arrow in \mathbf{C} then we also write F(f) \in \mathbf{D} for the functor applied to the arrow. This makes sense but can be a little weird. Sometimes in proofs and calculations the notations will shift back and forth without enough context and can be disorienting.
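A standard concrete example, sketched here in Python (the “list functor” {\mathbf {Sets}}\to{\mathbf {Sets}} is a classic example; the particular arrows are my own): F sends a set X to lists of elements of X, and an arrow f to elementwise application. The two functor laws can then be checked directly.

```python
# The arrow part of the list functor: F(f) applies f to each element.

def F(f):
    """F(f): F(X) -> F(Y), given f: X -> Y."""
    return lambda xs: [f(x) for x in xs]

f = lambda x: x + 1      # f: X -> Y
g = lambda x: str(x)     # g: Y -> Z

xs = [1, 2, 3]

# F(g . f) = F(g) . F(f): mapping the composite equals composing the maps.
assert F(lambda x: g(f(x)))(xs) == F(g)(F(f)(xs))

# F(1_X) = 1_{F(X)}: mapping the identity leaves the list unchanged.
assert F(lambda x: x)(xs) == xs
```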

Natural transformations are the next step up the ladder. If functors are arrows between categories, then natural transformations are arrows between functors.

**Definition 6**. Let \mathbf{C} and \mathbf{D} be categories, and let F and G be functors \mathbf{C}\to \mathbf{D}. To define a *natural transformation* \alpha from F to G, we assign to each object X of \mathbf{C}, an arrow \alpha_X:FX\to GX in \mathbf{D}, called the *component* of \alpha at X.

In addition, for each arrow f:X\to Y of \mathbf{C}, the following diagram has to commute:

This is the first commutative diagram that I’ve tossed up. There is no magic here. The idea is that you get the same result no matter which way you travel through the diagram. So here \alpha_Y \circ F(f) and G(f) \circ \alpha_X must be equal.

We write natural transformations with double arrows, \alpha: F \Rightarrow G, to distinguish them in diagrams from functors (which are written with single arrows).

You might wonder to yourself: what makes natural transformations “natural”? The answer appears to be related to the fact that you can construct them from *only* what is given to you in the categories at hand. The natural transformation takes the action of F on \mathbf{C} and lines it up exactly with the action of G on \mathbf{C}. No other assumptions or conditions are needed. In this sense they define a relationship between functors that is just sitting there in the world no matter what, and thus “natural”. Another apt way of putting this is that natural transformations give a canonical way of moving between the images of two functors [3].
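A worked sketch may help here (my own example, not from any text): list reversal is a natural transformation from the list functor to itself, because its rule is built from nothing but what the category hands you. The naturality square \alpha_Y \circ F(f) = F(f) \circ \alpha_X becomes “map then reverse equals reverse then map”.

```python
# The list functor, and reversal as a natural transformation F => F.

def F(f):
    """Arrow part of the list functor: apply f elementwise."""
    return lambda xs: [f(x) for x in xs]

def alpha(xs):
    """The component alpha_X at every object X is the same rule."""
    return list(reversed(xs))

f = lambda x: x * 10     # any arrow f: X -> Y
xs = [1, 2, 3, 4]

# Naturality: alpha_Y . F(f) = F(f) . alpha_X.
assert alpha(F(f)(xs)) == F(f)(alpha(xs))
```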

As with arrows, it will be useful to define what an isomorphism means in the context of natural transformations:

**Definition 7**. A *natural isomorphism* is a natural transformation \alpha: F \Rightarrow G in which every component \alpha_X is an isomorphism. In this case, the natural isomorphism may be depicted as \alpha: F \cong G.

In the last two sections we have defined functors, and then the natural transformations. Given that functors and natural transformations look a lot like objects and arrows, the next obvious thing is to use them to make a new kind of category.

**Definition 8**. Let \mathbf{C} and \mathbf{D} be categories. The *functor category* from \mathbf{C} to \mathbf{D} is constructed as follows:

- The objects are functors F: \mathbf{C}\to \mathbf{D};

- The arrows are natural transformations \alpha:F\Rightarrow G.

Right now you should be wondering to yourself: “wait, does this definition actually work?” I have brazenly claimed without any justification that it’s OK to use the natural transformations as arrows. Luckily it’s fairly clear that this works out if you just do everything component-wise. So if we have all of these things:

- Three functors, F: \mathbf{C}\to \mathbf{D} and G: \mathbf{C}\to \mathbf{D} and H:\mathbf{C}\to \mathbf{D}.

- Two natural transformations \alpha: F \Rightarrow G and \beta: G \Rightarrow H.

- One object X \in \mathbf{C}.

Then you can define (\beta \circ \alpha)_X = \beta_X \circ \alpha_X and you get the right behavior. Similarly, the identity transformation 1_F can be defined component-wise: (1_F)_X = 1_{F(X)}.
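The component-wise composition can be sketched concretely (both transformations here are my own examples on the list functor: one reverses a list, the other repeats it; both are natural):

```python
# Component-wise composition of two natural transformations F => F,
# where F is the list functor.

def F(f):
    return lambda xs: [f(x) for x in xs]

alpha = lambda xs: list(reversed(xs))   # alpha: F => F (reverse)
beta  = lambda xs: xs + xs              # beta:  F => F (repeat)

def compose_nat(b, a):
    """(beta . alpha)_X = beta_X . alpha_X, the same rule at every X."""
    return lambda xs: b(a(xs))

gamma = compose_nat(beta, alpha)

# The composite is again natural: gamma_Y . F(f) = F(f) . gamma_X.
f = lambda x: x + 1
xs = [1, 2, 3]
assert gamma(F(f)(xs)) == F(f)(gamma(xs))
```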

There are a lot of standard notations for the functor category, none of which I really like. The most popular seems to be [\mathbf{C}, \mathbf{D}], but you also see \mathbf{D}^{\mathbf{C}}, and various abbreviations like \mathop{\mathit{Fun}}(\mathbf{C},\mathbf{D}) or \mathop{\mathit{Func}}(\mathbf{C},\mathbf{D}), or \mathop{\mathit{Funct}}(\mathbf{C},\mathbf{D}). I think we should just spell it out and use \mathop{\mathrm{\mathit{Functors}}}(\mathbf{C},\mathbf{D}). So there.

Now we can define this notation:

**Definition 9**. Let \mathbf{C} and \mathbf{D} be categories, and let F, G \in \mathop{\mathrm{\mathit{Functors}}}(\mathbf{C}, \mathbf{D}). Then we’ll write \mathop{\mathrm{\mathit{Natural}}}(F, G) for the set of all natural transformations from F to G, which in this context is the same as the arrows from F to G in the functor category.

You will also see people write \mathop{\mathrm{\mathit{Hom}}}(F, G), \mathop{\mathrm{\mathit{Hom}}}_{[\mathbf{C},\mathbf{D}]}(F,G), or [\mathbf{C},\mathbf{D}](F,G) for this. Or, if \mathbf{K} is a functor category then people will write \mathop{\mathrm{\mathit{Hom}}}_{\mathbf{K}}(F,G) or \mathbf{K}(F,G) for this.

The next conceptual step that we need is a way to relate *functors* to *objects*. The following definition is a natural way to do this once you see how it works but is also probably the most confusing definition in these notes.

**Definition 10**. Given a locally small category \mathbf{C} and an object X \in \mathbf{C} we define the functor
\mathop{\mathrm{\mathit{Arrows}}}(X,-) : \mathbf{C}\to {\mathbf {Sets}} using the following assignments:

- A mapping from \mathbf{C}\to {\mathbf {Sets}} that assigns to each Y \in \mathop{\mathrm{\mathit{Objects}}}(\mathbf{C}) the set \mathop{\mathrm{\mathit{Arrows}}}(X,Y).

- A mapping from \mathop{\mathrm{\mathit{Arrows}}}(\mathbf{C}) \to \mathop{\mathrm{\mathit{Arrows}}}({\mathbf {Sets}}) that assigns to each arrow f: A \to B a mapping f_* defined by f_*(g) = f\circ g for each arrow g: X \to A.

The notation \mathop{\mathrm{\mathit{Arrows}}}(X,-) needs a bit of explanation. Here the idea is that we have defined a mapping with two arguments, but then fixed the object X. Then we use the “-” symbol as a placeholder for the second argument. So \mathop{\mathrm{\mathit{Arrows}}}(X,Y) is the value of the mapping as we vary the second argument through all the other objects in \mathbf{C}. This is a bit of an abuse of notation since we are apparently using the symbol \mathop{\mathrm{\mathit{Arrows}}} to mean two different things (one is a set, the other a functor). Oh well.

The definition of the mapping for arrows also needs a bit of explanation. Given A,B \in \mathbf{C} and an arrow f: A \to B, it should be the case that \mathop{\mathrm{\mathit{Arrows}}}(X,-) applied to f is an arrow that maps \mathop{\mathrm{\mathit{Arrows}}}(X,A) \to \mathop{\mathrm{\mathit{Arrows}}}(X,B). We will call this arrow f_*. If g: X \to A is in \mathop{\mathrm{\mathit{Arrows}}}(X,A) then the value that we want for f_* at g is f_*(g) = (f \circ g): X \to B. This mapping is called the *post-composition* map of f since we apply f *after* g. You also see it written as f \circ -. The *pre-composition* map is then f^* or - \circ f.

Thus, we have worked out that the value of \mathop{\mathrm{\mathit{Arrows}}}(X,-) at f should be the arrow f \circ -. Sometimes you will see this written \mathop{\mathrm{\mathit{Arrows}}}(X, f) = f \circ -, which I find a bit odd because now we are overloading the kinds of things that can go into the “-” slot.

Check over this formula in your head, and note that there are *two* function applications (one for the functor, and one inside that for the post-composition arrow), and two different kinds of placeholder.

Other notations for this functor include \mathop{\mathrm{\mathit{Hom}}}(X, -), \mathop{\mathrm{\mathit{Hom}}}_\mathbf{C}(X, -), H^X, h^X, and just plain \mathbf{C}(X,-). In my notation we should have written this as \mathop{\mathrm{\mathit{Arrows}}}_{\mathbf{C}}(X, -), but I’m lazy. This kind of functor is also called a *hom-functor*.
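To make this less abstract, here is a sketch of \mathop{\mathrm{\mathit{Arrows}}}(X,-) in a tiny category of my own construction: the poset with objects 0, 1, 2, one arrow (a, b) whenever a \le b, and composition that just glues endpoints together. The arrow part of the functor is exactly the post-composition map f_*.

```python
# Arrows(X,-) in the poset category {0 <= 1 <= 2}.

def arrows(a, b):
    """Arrows(a, b): at most one arrow between objects in a poset."""
    return [(a, b)] if a <= b else []

def compose(outer, inner):
    """outer . inner: apply inner first; endpoints must match."""
    assert inner[1] == outer[0]
    return (inner[0], outer[1])

def post_compose(f):
    """Arrows(X,-) applied to f: A -> B gives f_* = (f . -)."""
    return lambda g: compose(f, g)

X = 0
f = (1, 2)                 # an arrow f: 1 -> 2
f_star = post_compose(f)

# f_* maps Arrows(X, 1) into Arrows(X, 2) by post-composing with f.
assert [f_star(g) for g in arrows(X, 1)] == [(0, 2)]
```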

Finally, we can give two more important definitions.

**Definition 11**. Given an object X \in \mathbf{C} we call the functor \mathop{\mathrm{\mathit{Arrows}}}(X,-) defined above the functor *represented* by X.

In addition, we can characterize another important relationship between objects and functors:

**Definition 12**. Let \mathbf{C} be a category. A functor F:\mathbf{C}\to{\mathbf {Sets}} is called *representable* if it is naturally isomorphic to the functor \mathop{\mathrm{\mathit{Arrows}}}_\mathbf{C}(X,-):\mathbf{C}\to{\mathbf {Sets}} for some object X of \mathbf{C}. In that case we call X the *representing object*.

Next we move a bit sideways. Duality in mathematics comes up in a lot of different ways. Covering it all is way beyond the scope of these notes. But the following definition is a basic part of category theory so it’s worth including.

**Definition 13**. Let \mathbf{C} be a category. Then we write \mathbf{C}^{\mathrm op} for the *opposite* or *dual* category of \mathbf{C}, and define it as follows:

- The objects of \mathbf{C}^{\mathrm op} are the same as the objects of \mathbf{C}.

- \mathop{\mathrm{\mathit{Arrows}}}(\mathbf{C}^{\mathrm op}) is defined by taking each arrow f :X \to Y in \mathop{\mathrm{\mathit{Arrows}}}(\mathbf{C}) and flipping its direction, so we put f': Y \to X into \mathop{\mathrm{\mathit{Arrows}}}(\mathbf{C}^{\mathrm op}). In particular for A, B \in \mathop{\mathrm{\mathit{Objects}}}(\mathbf{C}) we have \mathop{\mathrm{\mathit{Arrows}}}_{\mathbf{C}}(A, B) = \mathop{\mathrm{\mathit{Arrows}}}_{\mathbf{C}^{\mathrm op}}(B, A) (or \mathbf{C}(A, B) = \mathbf{C}^{\mathrm op}(B, A)).

- Composition of arrows is the same, but with the arguments reversed.

The *principle of duality* then says, informally, that every categorical definition, theorem and proof has a dual, obtained by reversing all the arrows.

Duality also applies to functors.

**Definition 14**. Given categories \mathbf{C} and \mathbf{D} a *contravariant* functor from \mathbf{C} to \mathbf{D} is a functor F: \mathbf{C}^{\mathrm op}\to \mathbf{D} where:

- We have an object F(X) \in \mathop{\mathrm{\mathit{Objects}}}(\mathbf{D}) for each X \in \mathop{\mathrm{\mathit{Objects}}}(\mathbf{C}).

- For each arrow f : X \to Y \in \mathop{\mathrm{\mathit{Arrows}}}(\mathbf{C}) we have an arrow F(f): FY \to FX in \mathop{\mathrm{\mathit{Arrows}}}(\mathbf{D}).

In addition:

- For any two arrows f, g \in \mathop{\mathrm{\mathit{Arrows}}}(\mathbf{C}) where g \circ f is defined we have F(f) \circ F(g) = F(g \circ f).

- For each X \in \mathop{\mathrm{\mathit{Objects}}}(\mathbf{C}) we have 1_{F(X)} = F(1_X).

Note how the arrows and composition go backwards when they need to. With this terminology in mind, we call regular functors from \mathbf{C}\to \mathbf{D} *covariant*.
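A quick sketch of a contravariant functor (again using a small poset category of my own construction, with objects 0, 1, 2 and an arrow (a, b) whenever a \le b): \mathop{\mathrm{\mathit{Arrows}}}(-, X), whose arrow part is the pre-composition map f^* = (- \circ f), which goes backwards.

```python
# Arrows(-, X) in the poset category {0 <= 1 <= 2}: contravariant,
# since an arrow f: A -> B induces f^*: Arrows(B, X) -> Arrows(A, X).

def arrows(a, b):
    """Arrows(a, b): at most one arrow between objects in a poset."""
    return [(a, b)] if a <= b else []

def pre_compose(f):
    """f^*(g) = g . f: precompose with f: A -> B."""
    def go(g):
        assert f[1] == g[0]      # g must start where f ends
        return (f[0], g[1])
    return go

X = 2
f = (0, 1)                       # an arrow f: 0 -> 1
f_upper_star = pre_compose(f)

# f^* maps Arrows(1, X) to Arrows(0, X): the direction is reversed.
assert [f_upper_star(g) for g in arrows(1, X)] == [(0, 2)]
```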

Now we have all the language we need to look at the statement of the lemma again. So, here is what we wrote down before, more verbosely, and in my notation.

**Lemma 1** (Yoneda). Let \mathbf{C} be a locally small category, F:\mathbf{C}\to {\mathbf {Sets}} a functor, and X \in \mathop{\mathrm{\mathit{Objects}}}(\mathbf{C}). We can define a mapping from \mathop{\mathrm{\mathit{Natural}}}(\mathop{\mathrm{\mathit{Arrows}}}(X, -),F) \to FX by assigning each transformation \alpha: \mathop{\mathrm{\mathit{Arrows}}}(X, -) \Rightarrow F the value \alpha_X(1_X) \in FX. This mapping is invertible and is natural in both F and X.

So now we can break it down:

In principle the natural transformations from \mathop{\mathrm{\mathit{Arrows}}}(X, -) \Rightarrow F could be a giant complicated thing.

But actually it can only be as large as FX. The fact that this mapping is invertible implies that \mathop{\mathrm{\mathit{Natural}}}(\mathop{\mathrm{\mathit{Arrows}}}(X, -),F) and FX are isomorphic (that is, \mathop{\mathrm{\mathit{Natural}}}(\mathop{\mathrm{\mathit{Arrows}}}(X, -),F) \cong FX).

In other words, every natural transformation from \mathop{\mathrm{\mathit{Arrows}}}(X, -) to F is the same as an element of the set FX. In particular, all we need to know is how \alpha_X(1_X) is defined to know how any of the natural transformations are defined.

Which is pretty amazing.
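You can actually watch this happen by brute force in a small enough category. Here is a sketch of my own construction: the one-object category given by the monoid (\mathbb{Z}_3, +), where the single object is *, the arrows are {0, 1, 2}, and composition is addition mod 3. Taking F = \mathop{\mathrm{\mathit{Arrows}}}(*, -) itself, the lemma predicts exactly |F(*)| = 3 natural transformations, picked out by where they send the identity arrow 0.

```python
# Brute-force check of the Yoneda bijection for the one-object
# category built from the monoid (Z_3, +).

from itertools import product

ARROWS = [0, 1, 2]          # Arrows(*, *), with identity 0

def act(a, s):
    """F(a): F(*) -> F(*); here F = Arrows(*,-), so F(a) is (a + -)."""
    return (s + a) % 3

# A candidate transformation is its single component, a function
# t: Arrows(*,*) -> F(*), encoded as a tuple (t(0), t(1), t(2)).
def is_natural(t):
    # Naturality: t(a . g) = F(a)(t(g)) for all arrows a and g.
    return all(t[(a + g) % 3] == act(a, t[g]) for a in ARROWS for g in ARROWS)

naturals = [t for t in product(ARROWS, repeat=3) if is_natural(t)]

# The Yoneda map alpha |-> alpha_*(1_*) sends t to t(0), and it is a
# bijection onto F(*) = {0, 1, 2}: three transformations, three values.
assert len(naturals) == 3
assert sorted(t[0] for t in naturals) == [0, 1, 2]
```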

To write this in the dual language, you just change \mathop{\mathrm{\mathit{Arrows}}}(X, -) to \mathop{\mathrm{\mathit{Arrows}}}(-, X), which switches the direction of all the arrows and the order of composition in the composition maps.

So with that, here are some other ways people write the result, and how their lingo translates to my notational scheme. As one last bit of terminology, in some of the definitions below the word *bijection* is used to mean an invertible mapping.

This statement is due to Tom Leinster [5], and uses the contravariant language.

**Lemma 2** (Yoneda). Let \mathbf{C} be a locally small category. Then
[\mathbf{C}^\mathrm{op},{\mathbf {Sets}}](H_X, F) \cong F(X)
naturally in X \in \mathbf{C} and F \in [\mathbf{C}^\mathrm{op},{\mathbf {Sets}}].

Here [\mathbf{C}^{\mathrm op}, {\mathbf {Sets}}] is the category of functors from \mathbf{C}^{\mathrm op} to {\mathbf {Sets}} and H_X means \mathop{\mathrm{\mathit{Arrows}}}(-,X). The notation [\mathbf{C}^\mathrm{op},{\mathbf {Sets}}](H_X, F) denotes the arrows in the functor category [\mathbf{C}^\mathrm{op},{\mathbf {Sets}}] between H_X and F, so it’s the same as \mathop{\mathrm{\mathit{Natural}}}(H_X, F).

Emily Riehl’s [9] version is what I used at the top:

**Lemma 3** (Yoneda). Let \mathbf{C} be a locally small category and X \in \mathbf{C}. Then for any functor F : \mathbf{C}\to {\mathbf {Sets}} there is a bijection
\mathop{\mathrm{\mathit{Hom}}}(\mathbf{C}(X,-), F) \cong FX
that associates each natural transformation \alpha:\mathbf{C}(X,-) \Rightarrow F with the element \alpha_X(1_X) \in FX. Moreover, this correspondence is natural in both X and F.

Here \mathop{\mathrm{\mathit{Hom}}}(\mathbf{C}(X,-), F) means \mathop{\mathrm{\mathit{Natural}}}(\mathop{\mathrm{\mathit{Arrows}}}(X,-), F). I think this is my favorite “standard” way of writing this.

Peter Smith [11] does this:

**Lemma 4** (Yoneda). For any locally small category \mathbf{C}, object X \in \mathbf{C}, and functor F:\mathbf{C}\to {\mathbf {Sets}} we have \mathop{\mathit{Nat}}(\mathbf{C}(X,-),F) \cong FX both naturally in X \in \mathbf{C} and F \in [\mathbf{C}, {\mathbf {Sets}}].

He uses the [\mathbf{C}, {\mathbf {Sets}}] notation for the functor category, and \mathop{\mathit{Nat}} where we use \mathop{\mathrm{\mathit{Natural}}}.

Paolo Perrone [8] writes the dual version, and uses the standard term "presheaf" to mean a functor from \mathbf{C}^{\mathrm op} to {\mathbf {Sets}}.

**Lemma 5** (Yoneda). Let \mathbf{C} be a category, let X be an object of \mathbf{C}, and let F:\mathbf{C}^\mathrm{op}\to{\mathbf {Sets}} be a presheaf on \mathbf{C}. Consider the map from \mathop{\mathrm{\mathit{Hom}}}_{[\mathbf{C}^\mathrm{op},{\mathbf {Sets}}]} \bigl(\mathop{\mathrm{\mathit{Hom}}}_\mathbf{C} (-,X) , F \bigr) \to FX assigning to a natural transformation \alpha:\mathop{\mathrm{\mathit{Hom}}}_\mathbf{C} (-,X)\Rightarrow F the element \alpha_X(\mathrm{id}_X)\in FX, which is the value of the component \alpha_X of \alpha on the identity at X.

This assignment is a bijection, and it is natural both in X and in F.

Here he writes \mathop{\mathrm{\mathit{Hom}}}_\mathbf{C} for \mathop{\mathrm{\mathit{Arrows}}}_\mathbf{C} and \mathop{\mathrm{\mathit{Hom}}}_{[\mathbf{C}^\mathrm{op},{\mathbf {Sets}}]} to mean the arrows in the functor category [\mathbf{C}^\mathrm{op},{\mathbf {Sets}}], which are the natural transformations.

Finally, Peter Johnstone [4] has my favorite, relatively concrete statement:

**Lemma 6** (Yoneda). Let \mathbf{C} be a locally small category, let X be an object of \mathbf{C} and let F:\mathbf{C}\to {\mathbf {Sets}} be a functor. Then

(i) there is a bijection between natural transformations \mathbf{C}(X, -) \Rightarrow F and elements of the set FX;

(ii) the bijection in (i) is natural in both F and X.

Now your reward for having climbed all the way up this abstraction ladder with me is yet another abstraction!

Suppose you are given an object Y and you apply the Yoneda lemma by substituting \mathop{\mathrm{\mathit{Arrows}}}(Y,-) for the functor F. Then

\mathop{\mathrm{\mathit{Natural}}}(\mathop{\mathrm{\mathit{Arrows}}}(X, -),\mathop{\mathrm{\mathit{Arrows}}}(Y,-)) \cong\mathop{\mathrm{\mathit{Arrows}}}(Y,-)(X) = \mathop{\mathrm{\mathit{Arrows}}}(Y,X)

Note the order of the arguments! We can also write:

\mathop{\mathrm{\mathit{Arrows}}}(X,Y) \cong\mathop{\mathrm{\mathit{Natural}}}(\mathop{\mathrm{\mathit{Arrows}}}(-, X),\mathop{\mathrm{\mathit{Arrows}}}(-,Y))

Each of the functors \mathop{\mathrm{\mathit{Arrows}}}(-, X) maps from \mathbf{C}^\mathrm{op}\to {\mathbf {Sets}} because that’s how we defined the represented functors.

So now let’s jump up one more level of abstraction. We define a functor that maps objects to the functors that they represent, and arrows to the natural transformations between those functors. Given an object Y\in\mathbf{C} define the functor \mathop{Y\!o}:\mathbf{C}\to \mathop{\mathrm{\mathit{Functors}}}(\mathbf{C}^\mathrm{op}, {\mathbf {Sets}}) by

\mathop{Y\!o}(Y) = \mathop{\mathrm{\mathit{Arrows}}}(-, Y) : \mathbf{C}^\mathrm{op}\to {\mathbf {Sets}}

and given an arrow f: A \to B with A,B \in \mathbf{C} define

\mathop{Y\!o}(f) = f_* = (f \circ -) : \mathop{\mathrm{\mathit{Arrows}}}(-,A) \Rightarrow\mathop{\mathrm{\mathit{Arrows}}}(-,B)

This definition has the same “shape” as the one for represented functors, but we have abstracted over all the objects and arrows. Also note that we could have defined this as \mathop{Y\!o}:\mathbf{C}^\mathrm{op}\to \mathop{\mathrm{\mathit{Functors}}}(\mathbf{C}, {\mathbf {Sets}}) using duality. All that changes is the order of the arguments in the functors.

The Yoneda lemma can now be used to prove that these mappings are invertible, so \mathop{Y\!o} is what is called an *embedding* of the category \mathbf{C} inside the functor category \mathop{\mathrm{\mathit{Functors}}}(\mathbf{C}^\mathrm{op}, {\mathbf {Sets}}). Thus \mathop{Y\!o} is called the *Yoneda embedding*, and you can read about the rest of the details in the references.

This construction tells us why people say things like, “Every object in a category can be understood by understanding the maps into (or out of) it.” This statement can be made precise:

**Corollary 7**. Let \mathbf{C}, X, and Y be given as above.

- X and Y are isomorphic if and only if for every object A \in \mathbf{C}, the sets \mathop{\mathrm{\mathit{Arrows}}}(X, A) and \mathop{\mathrm{\mathit{Arrows}}}(Y, A) are naturally isomorphic.

- X and Y are isomorphic if and only if the functors that they represent are naturally isomorphic. In particular, if X and Y represent the same functor then they must be isomorphic.
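Here is a small brute-force sketch of this idea in a poset category of my own construction (objects 0, 1, 2, an arrow (a, b) whenever a \le b). In a poset the only isomorphisms are identities, so distinct objects should be distinguishable purely by the sizes of their arrow sets \mathop{\mathrm{\mathit{Arrows}}}(X, A) as A varies:

```python
# "An object is determined by the maps out of it": in the poset
# {0 <= 1 <= 2}, no two distinct objects have the same arrow profile.

def arrows(a, b):
    """Arrows(a, b): at most one arrow between objects in a poset."""
    return [(a, b)] if a <= b else []

def profile(x):
    """The sizes of Arrows(x, A) for every object A."""
    return tuple(len(arrows(x, a)) for a in (0, 1, 2))

profiles = [profile(x) for x in (0, 1, 2)]

# All three profiles are distinct, matching the fact that no two
# distinct objects of a poset are isomorphic.
assert len(set(profiles)) == 3
```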

To close, a few final thoughts, and no more abstraction.

First, the modern internet is something of an endless treasure trove for the amateur category theory nerd. I have listed my favorite references at the end of this note, and it’s amazing that you can download almost all of them for free, and sometimes with source code! When trying to understand something with as deep an abstraction stack as this result it is very useful to be able to look at it from many different points of view. So, I am grateful for all of the sources.

Second, I wish I could have thought of a better notation for the represented functor than \mathop{\mathrm{\mathit{Arrows}}}(X,-) with all that placeholder nonsense. I don’t like how the placeholders can stand in for anything you want and how their meaning can shift and change in different contexts. But, even with those problems it’s better than hiding the definition behind yet another layer of naming (e.g. H_X), which is the only other obvious choice.

Third, you might have found my use of \mathop{Y\!o} for the Yoneda embedding to be frivolous, and perhaps childish. And I would have agreed. But then I read in multiple sources that the Yoneda embedding is sometimes denoted by よ, the hiragana kana for “Yo”.

Given this, how could I resist?

Finally, I need to shout out the excellent tutorial video by Emily Riehl that demonstrates how this result works in the specific category of matrices [10]. The whole Yoneda picture suddenly became clearer while I was watching this talk the second time. Her book, *Category Theory in Context*, is also excellent [9]. Recommended.

\mathbf{C}, \mathbf{C}^\mathrm{op} - Categories and opposite categories.

\mathop{\mathrm{\mathit{Objects}}}(\mathbf{C}) - Objects in a category \mathbf{C}. Often just written \mathbf{C}.

\mathop{\mathrm{\mathit{Arrows}}}(\mathbf{C}) - Arrows in a category.

\mathop{\mathrm{\mathit{Arrows}}}_\mathbf{C}(X,Y) - Arrows between two objects. Also written \mathop{\mathrm{\mathit{Arrows}}}(X,Y) or \mathop{\mathrm{\mathit{Hom}}}(X,Y) or \mathop{\mathrm{\mathit{Hom}}}_\mathbf{C}(X, Y) or just \mathbf{C}(X,Y).

f: X \to Y - An arrow from X to Y.

g \circ f, gf - Composition of arrows.

X \cong Y - Isomorphism.

F:\mathbf{C}\to\mathbf{D} - A functor from \mathbf{C} to \mathbf{D}.

\alpha: F \Rightarrow G - Natural transformation.

\mathop{\mathrm{\mathit{Functors}}}(\mathbf{C}, \mathbf{D}) - Functor category between \mathbf{C} and \mathbf{D}. Also [\mathbf{C},\mathbf{D}] or \mathbf{D}^\mathbf{C}.

\mathop{\mathrm{\mathit{Natural}}}(F, G) - The collection of natural transformations from F to G. Also written [\mathbf{C},\mathbf{D}](F,G), or \mathop{\mathit {Nat}}(F,G) or just \mathop{\mathrm{\mathit{Hom}}}(F,G).

\mathop{\mathrm{\mathit{Arrows}}}(X, -) - The represented or “arrow” functor for X. Also called the “hom” functor and written \mathbf{C}(X,-), H^X, \mathop{\mathit{hom}}, or \mathop{\mathrm{\mathit{Hom}}}(X,-).

f \circ -, - \circ f - Post- and pre-composition maps. Also written f_* and f^*.

\mathop{Y\!o} - Yoneda Embedding.

[1] Sean Carroll, *A No-Nonsense Introduction to General Relativity*, 2001.

[2] Eugenia Cheng, *The Joy Of Abstraction*, 2022.

[3] Julia Goedecke, *Category Theory Notes*, 2013.

[4] Peter Johnstone, *Category Theory*, notes written by David Mehrle, 2015.

[5] Tom Leinster, *Basic Category Theory*, 2016.

[6] LobosJr, *Dark Souls 1 Speedrun, Personal Best*, 2013.

[7] Saunders Mac Lane, *Categories for the Working Mathematician*, Second Edition, Springer, 1978.

[8] Paolo Perrone, *Notes on Category Theory with examples from basic mathematics*.

[9] Emily Riehl, *Category Theory in Context*, Dover, 2016.

[10] Emily Riehl, ACT 2020 Tutorial: *The Yoneda lemma in the category of matrices*.

[11] Peter Smith, *Category Theory: A Gentle Introduction*, 2019.

This one is so easy it’s almost cheating. This scheme is based on a recipe I have stolen from Marcella Hazan. Buy her book, it’s in there. But I’ve adjusted the flow a bit to make it easier to follow. For me.

First get out your pasta cooking pot. Fill it with 3-4 quarts of water and start it heating. Toss a few big pinches of salt into the water. You don’t want *too* much water because you want the noodles to be able to make the water starchy while they cook, so you can use the starchy water later. You can do the rest of the recipe while the water gets up to boiling and while the pasta cooks. This recipe is calibrated to about a pound of pasta, give or take.

Now chop up three or four cloves of garlic and 3 or 4 slices of bacon (or more, go for it). Put these into a medium-sized sauté pan with some oil and cook until the bacon is crispy and has fried in its own fat for a while. When it’s done, deglaze the pan with white wine and reduce for a few minutes, then turn the heat off.

At this point your pasta water should be ready for you to cook the spaghetti, so drop it in.

While the pasta cooks, crack three eggs into a bowl and beat them. Add salt and pepper to taste. This will not take that long, so when you are done grate about a cup or a bit more of a mix of Romano and Parmesan cheese. The dark secret here is that Romano is better for this dish, and if you want to you can just not use any Parmesan at all.

When the pasta is almost done, turn the heat on under the bacon for a bit to warm it up.

When the pasta is actually done take out one scoop of the water. Then drain the rest and dump the noodles back into the pot. Add the bacon and garlic and all the oil in that pan and mix it around. Then add the egg mixture and 3/4ths of the cheese and mix it around. If it’s too saucy, add cheese. If it’s too dry, add a bit of pasta water. When that is done, garnish with some fresh green herb if you want, but I never seem to do this anymore. Then add as much black pepper as you can stand to grind in without your hands cramping up.

Now you can gorge yourself.

Oh yeah, if you are nervous about raw eggs, be careful where you buy your eggs. Or you could always wuss out and get pasteurized eggs.

I love two things about this dish.

You never actually have to cook the sauce, per se. You just have to mix it together.

It’s really a bacon and eggs breakfast on top of pasta, with cheese. How brilliant is that?

Finally, if you are more efficient, you can probably do this with only one pot and the saute pan. But this flow is a bit easier.

**Notes from 2023**:

Yes I know I should be using guanciale. The stuff is hard to find locally though.

Also apparently I should not be using garlic. I’m surprised that Hazan would have a controversial stance on this. But there you go.