Windows Hello in Webauthn-rs

Recently I’ve been working again on webauthn-rs, as a member of the community wants to start using it in production for a service. So far the development of the library has been limited to the test devices that I own, but now this pushes me toward implementing true FIDO compliance.

A really major part of this, though, is that a lot of their users are on Windows, which means supporting Windows Hello.

A background on webauthn

Webauthn itself is not a specification for the cryptographic operations required to authenticate with an authenticator device, but a specification that wraps other techniques, allowing a variety of authenticators to be used while exposing their “native” features.

The authentication side of webauthn is reasonably simple in this way. The server stores a public key credential associated with a user. During authentication the server provides a challenge, which the authenticator signs using its private key. The server can then verify, using its copy of the challenge and the public key, that the authentication must have come from that credential. Of course, like anything, there is a little bit of magic in here around how authenticators store credentials that allows other properties to be asserted, but that’s beyond the scope of this post.
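As a rough sketch of that idea (not the webauthn-rs API; the function and parameter names are illustrative and it assumes the openssl crate), the server-side check boils down to verifying the authenticator’s signature over the returned authenticator data and a hash of the client data that embeds the challenge:

use openssl::hash::MessageDigest;
use openssl::pkey::{PKey, Public};
use openssl::sign::Verifier;

// A minimal sketch only: verify the authenticator's signature with the public
// key that was stored for this credential at registration time. SHA-256 here
// assumes a common algorithm such as ES256 or RS256.
fn verify_challenge(
    pubkey: &PKey<Public>,   // credential public key stored at registration
    signed_data: &[u8],      // authenticatorData || SHA-256(clientDataJSON)
    signature: &[u8],        // signature returned by the authenticator
) -> Result<bool, openssl::error::ErrorStack> {
    let mut verifier = Verifier::new(MessageDigest::sha256(), pubkey)?;
    verifier.update(signed_data)?;
    verifier.verify(signature)
}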

The majority of the webauthn specification is about the process of registering credentials and requesting specific properties to exist in those credentials. Some of these properties are optional hints (resident keys, authenticator attachment) and some are enforced (user verification, so that the credential is a true MFA). Beyond these there is also a process for the authenticator to provide information about its origin and trust. This process is attestation, and it has multiple formats and associated details.
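As a small aside on the “enforced” properties mentioned above (a sketch, not the webauthn-rs implementation): user verification is checked on the server by inspecting the flags byte that follows the 32 byte rpIdHash in the returned authenticator data, where 0x01 signals user presence and 0x04 signals user verification.

// Sketch: the flags byte sits directly after the 32 byte rpIdHash.
fn user_verified(authenticator_data: &[u8]) -> bool {
    authenticator_data
        .get(32)
        .map(|flags| flags & 0x04 != 0)
        .unwrap_or(false)
}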

It’s interesting to note that for most deployments of webauthn, attestation is not required by the attestation conveyance preference, and it generally provides little value to these deployments. For many sites you only need to know that a webauthn authenticator is in use. However, attestation allows environments with strict security requirements to verify and attest to the legitimacy, make, and model of the authenticators in use. (An interesting part of webauthn is how much of it seems to be Google and Microsoft internal requirements leaking into a specification, just saying.)

This leads to what is, effectively, most of the code in webauthn-rs: attestation.rs.

Windows Hello

Windows Hello is Microsoft’s equivalent of TouchID on iOS. Using a Trusted Platform Module (TPM) as a tamper-resistant secure element, it allows devices such as a Windows Surface to perform cryptographic operations. As Microsoft is attempting to move to a passwordless future (which, honestly, I’m on board with and want to support in Kanidm), they want to support Webauthn on as many of their devices as possible. Microsoft even defines in their hardware requirements for Windows 10 Home, Pro, Education and Enterprise that “as of July 28, 2016, all new device models, lines or series … a component which implements the TPM 2.0 must be present and enabled by default from this effective date”. This is pretty major, as it means MS have slowly been ensuring that all consumer and enterprise devices are steadily moving to a point where passwordless is a viable future. Microsoft state that they use TPMv2 for many reasons, but a defining one is that the TPM 1.2 spec only allows for the use of RSA and the SHA-1 hashing algorithm, which is now considered broken.

Of course, as you may have noticed, this means that TPMs are involved. Webauthn supports a TPM attestation path, and that means I have to implement it.

Once more into the abyss

Reading the Webauthn spec for TPM attestation pointed me to the TPMv2.0 specification: part 1, part 2 and part 3. I will spare you from these, as there is a sum total of 861 pages between the documents, and while the Webauthn spec only references a few parts, those parts manage to create a set of expanding references within these documents. To make it even more enjoyable, text search is mostly broken in these documents, meaning that trying to determine the field contents and types involves a lot of manual eyeball work.

TPM structures are packed C structs, which means they can be very tricky to parse. They use u16 identifiers to switch on unions, and other fun tricks that we love to see from C programs. These u16s often have defined constants that are the valid choices, such as TPM_ALG_ID, which selects which cryptographic algorithm is in use. Some stand-out parts of this section were as follows.

Unabashed optimism:

TPM_ALG_ERROR 0x0000 // Should not occur

Broken Crypto

TPM_ALG_SHA1 0x0004 // The SHA1 Algorithm

Being the boomer equivalent of JWT

TPM_ALG_NULL 0x0010 // The NULL Algorithm

And supporting the latest in modern cipher suites

TPM_ALG_XOR 0x000A // TCG TPM 2.0 library specification - the XOR encryption algorithm.

ThE XOR eNcRyPtIoN aLgoRitHm.

Some of the structures are even quite fun to implement, such as TPMT_SIGNATURE, which comes with a matrix describing how to switch on it: the first two bytes, interpreted as a u16, define a TPM_ALG_ID, but if those two bytes are not in the set of valid TPM_ALG_IDs, then the whole blob, including the leading two bytes, is actually just a blob of hash. It would certainly be unfortunate if, in the interest of saving two bytes, my hash accidentally emitted data where the first two bytes happened to be a valid TPM_ALG_ID, causing the parser to misinterpret the blob.
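A sketch of that decision (illustrative only, not the actual webauthn-rs parser; the constants shown are TPM_ALG_RSASSA, TPM_ALG_RSAPSS and TPM_ALG_ECDSA from the TPMv2.0 part 2 tables, and TPM integers are marshalled big-endian):

enum TpmtSignature<'a> {
    // The leading u16 matched a known signing TPM_ALG_ID: a structured signature.
    Structured { alg: u16, body: &'a [u8] },
    // The leading u16 matched nothing we know: the whole blob is a raw hash.
    RawHash(&'a [u8]),
}

const KNOWN_SIG_ALGS: [u16; 3] = [
    0x0014, // TPM_ALG_RSASSA
    0x0016, // TPM_ALG_RSAPSS
    0x0018, // TPM_ALG_ECDSA
];

fn parse_tpmt_signature(blob: &[u8]) -> TpmtSignature<'_> {
    if blob.len() >= 2 {
        let alg = u16::from_be_bytes([blob[0], blob[1]]);
        if KNOWN_SIG_ALGS.contains(&alg) {
            return TpmtSignature::Structured { alg, body: &blob[2..] };
        }
    }
    TpmtSignature::RawHash(blob)
}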

I think the cherry on top of all of this, though, is that despite Microsoft requiring TPMv2.0 to move away from RSA and SHA-1, when I checked the attestation signatures for a Windows Hello device I had to implement the following:

COSEContentType::INSECURE_RS1 => {
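    // Compute a SHA-1 digest of the input: RS1 (RSASSA-PKCS1-v1_5 with SHA-1)
    // only appears here to verify legacy signatures in the TPM attestation path.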
    hash::hash(hash::MessageDigest::sha1(), input)
        .map(|dbytes| Vec::from(dbytes.as_ref()))
        .map_err(|e| WebauthnError::OpenSSLError(e))
}

Conclusion

Saying all this, I’m happy that Windows Hello is now supported in webauthn-rs. The actual Webauthn authentication flows DO use secure algorithms (RSA2048 + SHA256 and better); it is only in the attestation path that some components are signed with SHA-1. So please use webauthn-rs, and do use Windows Hello with it!

User gesture is not detected - using iOS TouchID with webauthn-rs

I was recently contacted by a future user of webauthn-rs who indicated that the library may not currently support Windows Hello as an authenticator. This is due to the device being a platform-attached authenticator, and webauthn-rs at the time did not support attachment preferences.

As I have an iPad, and it’s not a primary computing device, I decided to upgrade to the iPadOS 14 beta to try out webauthn via touch (and handwriting support).

The Issue

After watching Jiewen’s WWDC presentation about using TouchID with webauthn, I had a better idea of some of the server-side requirements to work with this.

Once I started to test though, I would receive the following error in the Safari debug console.

User gesture is not detected. To use the platform authenticator,
call 'navigator.credentials.create' within user activated events.

I was quite confused by this error - a “user activated event” seems to be a bit of an unknown term, and other people I asked didn’t quite know what it meant either. My demo site was using a button input with an onclick event handler to call JavaScript similar to the following:

function register() {
  fetch(REG_CHALLENGE_URL + username, {method: "POST"})
    .then(res => {
      ... // error handling
    })
    .then(res => res.json())
    .then(challenge => {
      challenge.publicKey.challenge = fromBase64(challenge.publicKey.challenge);
      challenge.publicKey.user.id = fromBase64(challenge.publicKey.user.id);
      return navigator.credentials.create(challenge)
        .then(newCredential => {
          console.log("PublicKeyCredential Created");
          ....
          return fetch(REGISTER_URL + username, {
            method: "POST",
            body: JSON.stringify(cc),
            headers: {
              "Content-Type": "application/json",
            },
          });
        });
    });
}

This works happily in Firefox and Chrome, and on iPadOS it even works with my YubiKey 5Ci.

I investigated further to determine if the issue was in the way I was presenting the registration to the navigator.credentials.create function. Comparing to webauthn.io (which does work with TouchID on iPadOS 14 beta), I noticed some subtle differences but nothing that should cause an issue like this.

After much pacing, thinking and asking for help, I eventually gave in and went to the source of WebKit.

The Solution

Reading through the WebKit source, I noted that the check in the code was looking at how the event was initiated, using a context that is available within the browser. This got me thinking about the fact that the fetch API is async, and I realised at this point that webauthn.io was using the jQuery.ajax APIs. I altered my demo to use the same, and it began to work with TouchID. That meant that the user activation was being lost over the async boundary to the fetch API. (Note: it’s quite reasonable to require user interaction before navigator.credentials can be used, to prevent tricking or manipulating users into activating their webauthn devices.)

I emailed Jiewen, who responded overnight and informed me that this is an issue, and it’s being tracked in the WebKit Bugzilla. He assures me that it will be resolved in a future release. Many thanks to him for helping me with this issue!

At this point I now know that TouchID will work with webauthn-rs, and I can submit some updates to the library to help support this.

Notes on Webauthn with TouchID

It’s worth pointing out a few notes from the WWDC talk, and the differences I have observed with webauthn on real devices.

In the presentation it is recommended that your Credential Creation Options (must?) define the following options to work with TouchID:

authenticatorSelection: { authenticatorAttachment: "platform" },
attestation: "direct"

It’s worth pointing out that authenticatorAttachment is only a hint to the client about which credentials it should use. This allows your web page to streamline the UI flow (such as detecting a platform key and then using that to toggle the authenticatorAttachment), but it’s not an enforced security policy. There is no part of the attestation response that indicates the attachment method. The only way to determine that the authenticator is a platform authenticator would be, with attestation “direct”, to validate that the issuing CA or the device’s AAGUID matches the expectations you hold for which authenticators can be used within your organisation or site.
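For example, a server could keep an allow list of AAGUIDs and only accept registrations whose attested AAGUID is on it (a hypothetical helper for illustration, not part of webauthn-rs):

// Sketch: accept a registration only if the attested AAGUID is one we trust.
fn aaguid_permitted(attested_aaguid: &[u8; 16], allowed: &[[u8; 16]]) -> bool {
    allowed.iter().any(|candidate| candidate == attested_aaguid)
}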

Additionally, TouchID does work with no authenticatorAttachment hint (Safari prompts whether you want to use an external key or TouchID), and attestation: “none” also works. This means that a minimised and default set of Credential Creation Options will allow you to work with the majority of webauthn devices.

Finally, the WWDC video glosses over the server-side process. Be sure to follow the W3C standard for verifying attestations, or use a library that implements this standard (such as webauthn-rs or duo-labs go webauthn). I’m sure that other libraries exist, but it’s critical that they follow the W3C process, as webauthn is quite complex and fiddly to implement in a correct and secure manner.

docker buildx for multiarch builds

I have previously been building Kanidm with a plain docker build, but recently a community member wanted to be able to run Kanidm on arm64. That meant I needed to go down the rabbit hole of how to make this work …

What not to do …

There is a previous method of using manifest files to allow multiarch uploads. It’s pretty messy, but it works, so it is an option if you want to investigate it; I didn’t want to pursue it.

Buildx exists and I got it working on my Linux machine with the steps from here, but the build took more than 3 hours, so I don’t recommend it if you plan to do anything intensive or frequent.

Buildx cluster

Docker has a cross-platform building toolkit called buildx, which is currently tucked into the experimental features. It can be enabled on Docker for Mac in the settings (note: you only need experimental support on the coordinating machine, aka your workstation).

Rather than follow the official docs, this post will branch out. The reason is that buildx in the official docs uses qemu-aarch64 translation, which is very slow and energy hungry, taking a long time to produce builds. As mentioned already, I was seeing in excess of 3 hours for aarch64 builds on my builder VM or my mac.

Instead, in this configuration I will use my mac as a coordinator, and an x86_64 VM and a rock64pro as builder nodes, so that the builds are performed on native architecture machines.

First we need to configure our nodes. In /etc/docker/daemon.json we need to expose our docker socket to our mac. I have done this with the following:

{
  "hosts": ["unix:///var/run/docker.sock", "tcp://0.0.0.0:2376"]
}

WARNING: This configuration is HIGHLY INSECURE. It exposes your docker socket to the network with no authentication, which is equivalent to unauthenticated root access. I have done this because my builder nodes are on an isolated and authenticated VLAN of my home network. You should either do similar or use TLS authentication.
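As a sketch of the TLS option (the certificate paths are illustrative, and you need to generate the CA, server certificate and key yourself), the daemon.json on each builder would look something like:

{
  "hosts": ["unix:///var/run/docker.sock", "tcp://0.0.0.0:2376"],
  "tlsverify": true,
  "tlscacert": "/etc/docker/ca.pem",
  "tlscert": "/etc/docker/server-cert.pem",
  "tlskey": "/etc/docker/server-key.pem"
}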

NOTE: The ssh:// transport does not work for docker buildx. No idea why, but it doesn’t.

Once this is done restart docker on the two builder nodes.

Now we can configure our coordinator machine. We need to check buildx is present:

docker buildx --help

We then want to create a new builder instance and join our nodes to it. We can use the DOCKER_HOST environment variable for this:

DOCKER_HOST=tcp://x.x.x.x:2376 docker buildx create --name cluster
DOCKER_HOST=tcp://x.x.x.x:2376 docker buildx create --name cluster --append

We can then start up and bootstrap the required components with:

docker buildx use cluster
docker buildx inspect --bootstrap

We should see output like:

Name:   cluster
Driver: docker-container

Nodes:
Name:      cluster0
Endpoint:  tcp://...
Status:    running
Platforms: linux/amd64, linux/386

Name:      cluster1
Endpoint:  tcp://...
Status:    running
Platforms: linux/arm64, linux/arm/v7, linux/arm/v6

If we are happy with this, we can make it the default builder:

docker buildx use cluster --default

And you can now use it to build your images such as:

docker buildx build --push --platform linux/amd64,linux/arm64 -f Dockerfile -t <tag> .

Now I can build my multiarch images much more quickly and efficiently!

Developer Perspective on Docker

A good mate of mine, Ron Amosa, put a question up on Twitter about what developers think Docker brings to the table. I’m really keen to see what he has to say (his knowledge of CI/CD and Kubernetes is amazing, by the way!), but I thought I’d answer his question from my view as a software engineer.

Docker provides resource isolation and management to applications

Let’s break that down.

What is a resource? What is an application?

It doesn’t matter what kind of application we write: a Rust command line tool, an embedded database in C, or a webserver and website with JavaScript. Whatever the language, the program requires resources to run. Let’s focus on a Python webserver for this thought process.

Our webserver (which is an application) requires a lot of things to be functional! It needs access to a network to open listening sockets, and it needs access to a filesystem to read pages or write to a database (like sqlite). It needs CPU time to process requests, and memory for a stack/heap to work through those requests. But our application also needs to be separated and isolated from other programs, so that they can not disclose our data, and so that faults in our application do not affect other services. It probably needs a separate user and group, which is a key idea in unix process isolation and security. Maybe there are also things like SELinux or AppArmor that provide extra enhancements.

But why stop here? There are many more. We might need system controls (sysctls) that define networking stack behaviour, like how TCP performs. We may need specific versions of Python libraries for our application. Perhaps we also want to limit the system calls that our Python application can make to the OS.

I hope we can see that the resources involved really are more than simply CPU and memory! Every application is really quite involved.

A short view back to the past …

In the olden days, as developers we had to be responsible for these isolations ourselves. For example, we’d have to support selecting a bind address so that the application could be configured to listen on only a single network device. This not only meant that our applications had to support this behaviour, but that a person had to read our documentation and find out how to configure that behaviour to isolate the networking resource.

And of course there are many others. Limiting the amount of CPU or RAM that was available required you to configure ulimits for the user, and to select which user was going to run our application.

Many problems have also been seen with a language like Python, where libraries are not isolated and there are conflicts between the versions that different applications require. Is it the fault of Python? The application developer? It’s hard to say …

What about system calls? With an interpreted language like Python, you can’t just set capability flags or other hardening options, because they have to be set on the interpreter (python), which is used in many places. This is an example where the resource (python) is shared between many applications, preventing us from creating isolated Python runtimes.

Even things like SELinux and AppArmor required complex, hand-created profiles that were cryptic at best, or that led to these tools being disabled in the common case (it can’t be secure if it’s not usable! People will always take the easy option …).

And that’s even before we look at init scripts - bash scripts that had to be invoked in careful ways, and were all hand rolled, each adding different mistakes or issues. It was a time when to “package” an application and deploy it required huge amounts of knowledge across a broad range of topics.

In many cases, I have seen this manifest in another way. Rather than isolating applications (which was too hard), every application was installed on a dedicated virtual machine. The resource management then came as an artifact of every machine being separate and managed by a hypervisor.

Systemd

Along came systemd, though, and it got us much further. It provided consistent application launch tools, and it has done a lot of work to provide resource management such as cgroups (CPU, memory), dynamic users, some types of filesystem isolation, and more. Systemd as an init system has done some really good stuff.

But problems still exist. Applications still require custom SELinux or AppArmor profiles, and systemd can’t help with managing network interfaces (that still falls on the application).

It also still relies on you to put the files in place, or on a distribution package to get the file content onto the system.

Docker

Docker takes this even further. Docker manages and arbitrates every resource your application requires, even the filesystem and the install process. For example, a very complex topic like CPU or memory limits on Linux becomes quite simple in docker, which exposes CPU and memory tunables. Docker allows per-container sysctls. You assign and manage storage.

docker run -v db:/data/db -v config:/data/config --network private_net \
    --memory 1024M --shm-size 128M -p 80:8080 --user isolated \
    my/application:version

From this command we can see what resources are involved. We know that we mount two storage locations. We can see that we confine the network to a private network, and that we want to limit memory to 1024M. We can also see that we’ll be listening on port 80, which remaps to port 8080 inside the container. We even know which user we’ll run as, so we can assign permissions to the volumes. Our application and its version are also defined.

Not only can we see what resources we are using, there are a lot of other benefits. Docker can dynamically generate SELinux/AppArmor isolation profiles, so we get stronger process isolation between containers and host processes. We know that the filesystem of this container is isolated from others, so each container can bundle the correct versions of its dependencies. We know how to start, stop, and even monitor the application. It can even have health checks. Logs will (should?) go to stdout/stderr, which docker will forward to a log collector we can define. In the future each application may even be in its own virtual memory space (i.e. separate VMs).

Docker provides a level of isolation of resources that is still hard to achieve in systemd, and not only that, it makes very advanced or complex configurations easy to access and use. Accessibility of these features is vitally important to allow all people to create robust applications in their environments. But it also allows me as a developer to know what resources can exist in the container and how to interact with them in a way that will respect the wishes of the deploying administrator.

Docker Isn’t Security Isolation

It’s worth noting that while Docker can provide SELinux and AppArmor profiles, Docker is not an effective form of security isolation. It certainly makes the bar much, much higher than before, yes! And that’s great! And I hope that bar continues to rise. However, today we live in an age where there are still many local attacks on Linux kernels, and the fix delay for these is still long. We also still see CPU side channels, and these will never be resolved while we rely on speculative, out-of-order CPU behaviour.

If you have high value data, it is always best to have separate physical machines for these applications, and to always patch frequently, have a proper CI/CD pipeline, centralised logging, and much much more. Ask your security team! I’m sure they’d love to help :)

Conclusion

For me personally, docker is about resource management and isolation. It helps me to define an interface that an admin can consume and interact with, making very advanced concepts easy to use. It gives me trust that applications will run in a way that is isolated and known all the way from development and testing through to production under high load. By making this accessible, it means that anyone - from a single docker container to a giant kubernetes cluster, can have really clear knowledge of how their applications are operating.

virt-manager missing pci.ids usb.ids macos

I got the following error:

/usr/local/Cellar/libosinfo/1.8.0/share/libosinfo/pci.ids No such file or directory

This appears to be an issue in libosinfo from homebrew. Looking at the libosinfo source, there are some aux download files. You can fix this with:

mkdir -p /usr/local/Cellar/libosinfo/1.8.0/share/libosinfo/
cd /usr/local/Cellar/libosinfo/1.8.0/share/libosinfo/
wget -q -O pci.ids http://pciids.sourceforge.net/v2.2/pci.ids
wget -q -O usb.ids http://www.linux-usb.org/usb.ids

All is happy again with virt-manager.