Binding Redirects

This isn’t part of the series on Stack Overflow’s architecture, but is a topic that has bitten us many times. Hopefully, some of this information helps you sort out issues you hit.

You’re probably here because of an error like this:

Could not load file or assembly ‘System.<…>, Version=4.x.x.x, Culture=neutral, PublicKeyToken=<…>’ or one of its dependencies. The system cannot find the file specified.

And you likely saw a build warning like this:

warning MSB3277: Found conflicts between different versions of “System.<…>” that could not be resolved.

Whelp, you’re not alone. We’re thinking about starting a survivors group. The most common troublemakers here are:

If you just want out of this fresh version of DLL Hell you’ve found yourself in…

The best fix

The best fix is “go to .NET Core”. Since .NET Framework (e.g. 4.5, 4.8, etc.) has a heavy backward compatibility burden, meaning that the assembly loader itself is basically made of unstable plutonium with a hair trigger coated in flesh eating bacteria behind a gate made of unobtanium above a moat of napalm filled with those jellyfish that kill you…that won’t ever really be fixed.

However, .NET Core’s simplified assembly loading means it just works. I’m not saying a migration to .NET Core is trivial, that depends on your situation, but it is generally the best long-term play. We’re almost done porting Stack Overflow to .NET Core, and this kind of pain is one of the things we’re very much looking forward to not fighting ever again.

The fix if you’re using .NET Framework

The fix for .NET Framework is “binding redirects”. When you are trying to load an assembly (and if it’s strongly named, like the ones in the framework and many libs are), it’ll try to load the specific version you specified. Unless it’s told to load another (likely newer) version. Think of it as “Hey little buddy. It’s okay! Don’t cry! That API you want is still available…it’s just over here”. I don’t want to replicate the official documentation here since it’ll be updated much better by the awesome people on Microsoft Docs these days. Your options to fix it are:

  • Enable <AutoGenerateBindingRedirects> (this doesn’t work for web projects - it doesn’t handle web.config)
    • Note: this isn’t always perfect. It doesn’t handle all transitive cases especially around multi-targeting and conditional references.
  • Build with Visual Studio and hope it has warnings and a click to fix (this usually works)
    • Note: this isn’t always perfect either, hence the “hope”.
  • Manually edit your *.config file.
    • This is the surest way (and what Visual Studio is doing above), but also the most manual and/or fun on upgrades.

Unfortunately, what that “manual editing” article doesn’t mention is the assembly versions are not the NuGet versions. For instance, System.Runtime.CompilerServices.Unsafe 4.7.0 on NuGet is assembly version 4.0.6.0. The assembly version is what matters. The easiest way I use to figure this out on Windows is the NuGet Package Explorer (Windows Store link - easiest install option) maintained by Claire Novotny. Either from within NPE’s feed browser or from NuGet.org (there’s a “Download package” option in the right sidebar), open the package. Select the framework you’re loading (they should all match really) and double click the DLL. It’ll have a section that says “Strong Name: Yes, version: 4.x.x.x

For example, if you had libraries wanting various versions of System.Numerics.Vectors, the most likely fix is to prefer the latest (as of writing this, 4.7.0 on NuGet which is assembly version 4.0.6.0 from earlier). Part of your config would look something like this:

<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <assemblyIdentity name="System.Runtime.CompilerServices.Unsafe" publicKeyToken="b03f5f7f11d50a3a" culture="neutral"/>
        <bindingRedirect oldVersion="0.0.0.0-4.0.6.0" newVersion="4.0.6.0"/>
      </dependentAssembly>
      <!-- ...and maybe some more... -->
    </assemblyBinding>
  </runtime>
</configuration>

This means “anyone asking for any version before 4.0.6.0, send them to that one…it’s what in my bin\ folder”. For a quick practical breakdown of these fields:

  • name: The name of the strongly named DLL
  • publicKeyToken: The public key of the strongly named DLL (this comes from key used to sign it)
  • culture: Pretty much always neutral
  • oldVersion: A range of versions, starting at 0.0.0.0 (almost always what you want) means “redirect everything from X to Y”
  • newVersion: The new version to “redirect” to, instead of any old one (usually matches end of the oldVersion range)

When do I need to do this?

Any of the above approaches to fixing it need to be revisited when a conflict arises again. This can happen when:

Note that binding redirects are to remedy differences in the reference chain (more on that below), so when all of your things reference the same version down their transitive chains, you don’t need one. So sometimes removing a binding redirect is the quick fix in an upgrade.

But seriously, .NET Core. Of all the reasons we’re migrating over at Stack Overflow, binding redirects are in my top 5. We’ve lost soooo many days to this over the years.

What’s going on? Why do I need this?

The overall problem is that you have two different libraries ultimately wanting two different versions of one of these (and potentially anything else — this is a general case the three above are just the most common). Since dependencies are transitive, this means you just reference a library and the tooling will get what it needs. (That’s what <PackageReference> does in your projects.) But what that means is you could have library A → B → System.Vectors (version 1.0), and another reference to something like D → E → F → System.Vectors (version 1.2).

Uh oh, we have two things wanting two different versions. You may be thinking (for Windows) “but…won’t they be the same file in the bin\ directory? How can you have both? How does that even work?” You’re not crazy. You’re hitting the error because one of those two versions won. It should be the newer one.

Okay, so it’s broken. The code wanting the other version is what’s erroring. But maybe not on startup! That’s another fun part. It happens the first time you touch a type that references types in that assembly. Bonus: if this is in a static constructor, that type you’re trying to touch will most likely disappear. Poof. Gone. What does the app do without it? WHO KNOWS?! But buckle up, because you can bet your butt it’ll be fun. Imagine the compiler said “okay” but that type wasn’t really there later…because that’s basically the situation you find yourself in.

So, we need to redirect things. In theory, the framework takes care of some of these indirections, but with some pieces deployed via NuGet it’s a bit of a mess and it’s far from perfect. That’s why you’re here. I hope this helped.

In some versions of the framework, things aren’t quite right in what shipped - the few libraries up top are troublemakers more than all the others for this reason. Those are the versions that were glitched in various versions of .NET Framework. “Why don’t you fix .NET 4.??" is a question Microsoft gets a lot. They did — it's called "the next version". Being unable to break people, that's the only real way to fix it and not cause *other* damage. Keep in mind, we're talking about over a billion computers to update here. So, when you update to .NET 4.7.2 (most of the redirects are fixed here), .NET 4.8, etc. — more and more of the cases do go away. But with NuGet and later versions...yeah, they can come back. Generally speaking though, the later version of .NET Framework you're targeting and running on, the better. There has been progress.

Anyway, I hope this helps some people out there. I should have written this up 5 years ago, and wish I had this info available to me 10 years ago. Good luck on fixing whatever brought you here!

Stack Overflow: How We Do App Caching - 2019 Edition

This is #5 in a very long series of posts on Stack Overflow’s architecture.
Previous post (#4): Stack Overflow: How We Do Monitoring - 2018 Edition

So…caching. What is it? It’s a way to get a quick payoff by not re-calculating or fetching things over and over, resulting in performance and cost wins. That’s even where the name comes from, it’s a short form of the “ca-ching!” cash register sound from the dark ages of 2014 when physical currency was still a thing, before Apple Pay. I’m a dad now, deal with it.

Let’s say we need to call an API or query a database server or just take a bajillion numbers (Google says that’s an actual word, I checked) and add them up. Those are all relatively crazy expensive. So we cache the result – we keep it handy for re-use.

Why Do We Cache?

I think it’s important here to discuss just how expensive some of the above things are. There are several layers of caching already in play in your modern computer.

Continue reading...

Stack Overflow: How We Do Monitoring - 2018 Edition

This is #4 in a very long series of posts on Stack Overflow’s architecture.
Previous post (#3): Stack Overflow: How We Do Deployment - 2016 Edition

What is monitoring? As far as I can tell, it means different things to different people. But we more or less agree on the concept. I think. Maybe. Let’s find out! When someone says monitoring, I think of:

You are being monitored!

…but evidently some people think of other things. Those people are obviously wrong, but let’s continue. When I’m not a walking zombie after reading a 10,000 word blog post some idiot wrote, I see monitoring as the process of keeping an eye on your stuff, like a security guard sitting at a desk full of cameras somewhere. Sometimes they fall asleep–that’s monitoring going down. Sometimes they’re distracted with a doughnut delivery–that’s an upgrade outage. Sometimes the camera is on a loop–I don’t know where I was going with that one, but someone’s probably robbing you. And then you have the fire alarm. You don’t need a human to trigger that. The same applies when a door gets opened, maybe that’s wired to a siren. Or maybe it’s not. Or maybe the siren broke in 1984.

I know what you’re thinking: Nick, what the hell?

Continue reading...

HTTPS on Stack Overflow: The End of a Long Road

Today, we deployed HTTPS by default on Stack Overflow. All traffic is now redirected to https:// and Google links will change over the next few weeks. The activation of this is quite literally flipping a switch (feature flag), but getting to that point has taken years of work. As of now, HTTPS is the default on all Q&A websites.

We’ve been rolling it out across the Stack Exchange network for the past 2 months. Stack Overflow is the last site, and by far the largest. This is a huge milestone for us, but by no means the end. There’s still more work to do, which we’ll get to. But the end is finally in sight, hooray!

Fair warning: This is the story of a long journey. Very long. As indicated by your scroll bar being very tiny right now. While Stack Exchange/Overflow is not unique in the problems we faced along the way, the combination of problems is fairly rare. I hope you find some details of our trials, tribulations, mistakes, victories, and even some open source projects that resulted along the way to be helpful. It’s hard to structure such an intricate dependency chain into a chronological post, so I’ll break this up by topic: infrastructure, application code, mistakes, etc.

Continue reading...

Stack Overflow: How We Do Deployment - 2016 Edition

This is #3 in a very long series of posts on Stack Overflow’s architecture.
Previous post (#2): Stack Overflow: The Hardware - 2016 Edition

We’ve talked about Stack Overflow’s architecture and the hardware behind it. The next most requested topic was Deployment. How do we get code a developer (or some random stranger) writes into production? Let’s break it down. Keep in mind that we’re talking about deploying Stack Overflow for the example, but most of our projects follow almost an identical pattern to deploy a website or a service.

I’m going ahead and inserting a set of section links here because this post got a bit long with all of the bits that need an explanation:

Continue reading...
View Archive (22 posts)