Firstyear's blog-a-log Adventures of a professional nerd. en-us Mon, 04 Apr 2022 00:00:00 +1000 <![CDATA[Enable caBLE on your iPhone for testing]]> Enable caBLE on your iPhone for testing

caBLE allows a nearby device (such as your iPhone) to be used an a webauthn authenticator. Given my work on WebauthnRS I naturally wanted to test this! When I initially tried to test caBLE with webauthn via my iPhone, I recieved an error that the operation wasn’t available at this time. There was no other information available.


After some digging into, I found the log message from AuthenticationServicesAgent which stated:

“Syncing platform authenticator must be enabled to register a platform public key credential; this can be enabled in Settings > Developer.”

Enabling Developer Settings

Run on your mac. On your iPhone close and reopen settings. Then search for “developer”.

Inside of that menu enable Syncing Platform Authenticator and Additional Logging under PassKit.

After that you should be able to test caBLE!

Mon, 04 Apr 2022 00:00:00 +1000 <![CDATA[Documentation PR’s Welcome - Why Docs Are Not A Beginner Friendly Task]]> Documentation PR’s Welcome - Why Docs Are Not A Beginner Friendly Task

Recently I was reporting a usability issue with a library, mainly related to it’s confusing or absent documentation. A friend of mine saw the exchange and commented (quite accurately) that it went along the lines of:

  • Me: This library should improve it’s documentation
  • Project: PR’s welcome
  • Me: I can’t write the docs because I don’t know how this works without documentation

My friend also commented “[this] is probably the realest interaction I’ve seen in a while.” and “It kinda shakes up the idea that if people want something in OSS they should do it themselves, and that docs are the easiest way to contribute”.

I completely agree with these observations.

Documentation writing is not a task for a drive by contributor, or a beginner. Documentation writing is a skill in and of itself, that requires many things. From my perspective, I believe documentation writing requires:

  • Linguistic ability - to express ideas clearly to a diverse audience of readers.
  • Emotional intelligence - to empathise with their audience and to think “what would help them in these words”?
  • Technical knowledge - the understanding of the problem space to know how to write the correct documentation.

Generally, when someone is a beginner to a project they lack an understanding of the project audience so they aren’t able to empathise with “what could be useful for a reader”. They also tend to lack the depth and breath of technical knowledge to know what to write for the documentation since by definition, they are a newcommer to this project.

A great way to connect with a beginner is to listen to their frustrations and what challenges the encountered with your project. First, connect how that frustration could be either a failing of the user interface and it’s design (even an API is a user interface). Second, only once you have eliminated design limitations, then consider it to be a documentation problem. As the project contributor, you are in a better position to write documentation than a beginner.

Telling someone, especially a beginner, “docs PR’s welcome” is both belitting to the skill that is technical writing, but also generally considered in opensource to mean “I don’t give a shit about your problem, fuck off”.

Tue, 15 Mar 2022 00:00:00 +1000 <![CDATA[How CTAP2.0 made UserVerification even more confusing]]> How CTAP2.0 made UserVerification even more confusing

I have previously written about how Webauthn introduces a false sense of security with how it manages UserVerification (UV) by default. To summarise, when you request “preferred” which means “perform UV if possible”, it can be bypassed since relying parties’s (RP) do not check if UV was actually performed, and Webauthn makes no recommendations on how to store credentials in a manner that allows future checking to ensure UV is requested or validated correctly.

From this, in Webauthn-RS we made the recommendation that you use either “required” to enforce all credentials have performed UV, or “discouraged” to request that no UV is performed by credentials during authentication or registration.

At the same time, in the Webauthn-RS project we begun to store two important pieces of credential metadata beyond the Webauthn specification - the result of UV from registration, and the policy that was requested at the time of registration. We did this because we had noticed there were classes of credentials, that even in “discouraged” would always verify themself at registration and authentication. Because of this property, we would enforce that since UV was performed at registration, we could continue to enforce UV on a per credential basis to detect possible credential compromise, and to further strengthen the security of credentials used with Webauthn-RS.

This created 3 workflows:

  • Required - At registration and authentication UV is always required
  • Discouraged + no UV - At registration and authentication UV is never required
  • Discouraged + always UV - At registration and authentication UV is always required

A bug report …

We recieved a bug that an authenticator was failing to work with Webauthn-RS, because at registration it would always force UV, but during authentication it would never request UV. This was triggering our inconsistent credential detection, indicating the credential was possibly compromised.

In this situation, the authenticator used an open-source firmware, so I was able to look at the source and identify the programming issue. During registration UV is always required, but during “discouraged” in authentication it’s never required matching the reported bug.

The author of the library then directed me to the fact that in CTAP2.0 this behaviour is enshrined in the specification.

Why is this behaviour bad?

I performed a quick poll on twitter, and asked about 5 non-technical friends about this. The question I asked was:

You go to a website, and you’re asked to setup a yubikey. When you register the key you’re asked for a pin. Do you now expect the pin to be required when you authenticate to that website with that yubikey now?

From the 31 votes on twitter, the result was 60% (21 / 30) that “yes” this PIN will always be required. From the people I asked directly, they all responded “yes”. (This is in no way an official survey or significant numbers, but it’s an initial indication)

Humans expect things to behave in a consistent manner. When you take an action one time, something will always continue to behave in that way. The issue we are presented with in this situation is that CTAP2.0 fundamentally breaks this association by changing the behaviour between registration and authentication. It also is not communicated that the different is registration vs authentication, or even why this behaviour is changed.

As a result, this confuses users (“Why is my pin not always required?!”) and this can at worst cause users to be apathetic about the UV check, where it could be downgraded from “required/preferred” to “discouraged” and the user would not notice or care about “why is this different?”. Because RP’s that strictly follow the Webauthn specification are open to UV bypass, CTAP2.0 in this case has helped to open the door for users to be tricked into this.

The other issue is that for a library like Webauthn-RS we lose the ability to detect credential compromise or attempts to bypass UV when in discouraged, since now UV is not consistently enforced across all classes of authenticators.

Can it be fixed?

No. There are changes in CTAP2.1 that can set the token to be “always verified” and for extensions to be sent that always enforce UV of that credential, but none of these assist the CTAP2.0 case where none of these elements exist.

As an RP library author we have to assume and work out ways to interact with credentials that are CTAP2.0_pre, CTAP2.0, CTAP2.1, vendor developed and more. We have to find a way to use the elements at hand to create a consistent user interface, that also embed security elements that can not be bypassed or downgraded.

I spent a lot of time thinking about how to resolve this, but I can only conclude that CTAP2.0 has made “discouraged” less meaningful by adding in this confusing behaviour.

Is this the end of the world?

Not at all. But it does undermine users trust in the systems we are building, where people may end up believing that UV is pointless and never checked. There are a lot of smart bad-people out there and they may utilise this in attacks (especially when combined with the fact that RP’s who strictly follow the Webauthn standard are already open to UV bypass in many cases).

If the goal we have is to move to a passwordless world, we need people to trust their devices behave in a manner that is predictable and that they understand. By making UV sometimes there, sometimes not, it will be a much higher barrier to convince people they can trust these devices as a self contained multifactor authenticator.

I use an authentication provider, what can I do?

If possible, setup your authentication provider to have UV required. This will cause some credentials to no longer work in your environment, but it will ensure that every authenticator has a consistent experience. In most cases, your authentication provider is likely to be standards compliant, and will not perform the extended verification discussed below meaning that “preferred” is bypassable, and “discouraged” can have inconsistent UV requests from users.

What can an RP do?

Because of this change, there are really only three workflows now that are actually consistent for users where we can enforce UV properties are observed correctly.

  • Required - At registration and authentication UV is always required
  • Preferred + with UV - At registration and authentication UV is always required
  • Preferred + no UV - At registration and authentication UV should not be required

The way that this is achieved is an extension to the Webauthn specification. When you register a credential you must store the state of the UV boolean at registration, and you must store the policy that was requested at registration. During authentication the following is checked in place of the webauthn defined UV check:

if credential.registration_policy == required OR authentication.policy == required {
    assert(authentication.uv == true)
} else if credential.registration_policy == preferred AND credential.registration_uv == true {
    // We either sent authentication.policy preferred or discouraged, but the user registered
    // with UV so we enforce that behaviour.
    assert(authentication.uv == true)
} else {
    // Do not check uv.

There is a single edge case in this work flow - since we now send “preferred” it’s possible that a credential that registered without UV (IE via Firefox which doesn’t support CTAP2.0_pre or greater) will be moved to using a platform that does support CTAP2.0_pre or greater, and it will begin to request UV. It is however possible in this scenario that once the credential begins to provide UV we can then store the credential.uv as true and enforce that for future authentications.

The primary issue with this is that we will begin to ask for the user’s PIN more often with credentials which may lead to frustration. Biometrics this is less of a concern as the “touch” action is always required anyway. However I think this is acceptable since it’s more important for a consistent set of behaviours to exist.

Previously I have stated that “preferred” should not be used since it is bypassable, but with the extensions to Webauthn above where policy and uv at registration are stored and validated, preferred gains a proper meaning and can be checked and enforced.


In the scenarioes where “discouraged” and “preferred” may be used, UV is meaningless in the current definition of the Webauthn specification when paired with the various versions of CTAP. It’s merely a confusing annoyance that we present to users seemingly at random, that is trivially bypassed, adds little to no security value and at worst undermines user trust in the systems we are trying to build.

When we are building authentication systems, we must always think about and consider the humans who will be using these systems, and the security properties that we actually provide in these systems.

Wed, 19 Jan 2022 00:00:00 +1000 <![CDATA[Nextcloud - Unable to Open Photos Library]]> Nextcloud - Unable to Open Photos Library

I noticed since macos 11.6.2 that Nextcloud has been unable to sync my photos library. Looking into this error in I saw:

error       kernel  System Policy: Nextcloud(798) deny(1) file-read-data /Users/william/Pictures/Photos Library.photoslibrary

It seems that Nextcloud is not sandboxed which means that macos enforces stricter permissions on what this can or can not access, which is what prevented the photos library from syncing.

To resolve this you can go to System Preferences -> Security and Privacy -> Privacy -> Full Disk Access and then grant full disk access which will allow it to read the filesystem.

I tried to allow this via the files and folders access but I was unable to add/remove new items to this list.

Tue, 21 Dec 2021 00:00:00 +1000 <![CDATA[Transactional Operations in Rust]]> ]]> Sun, 14 Nov 2021 00:00:00 +1000 <![CDATA[Results from the OpenSUSE 2021 Rust Survey]]> Results from the OpenSUSE 2021 Rust Survey

From September the 8th to October the 7th, OpenSUSE has helped me host a survey on how developers are using Rust in their environments. As the maintainer of the Rust packages in SUSE and OpenSUSE it was important for me to get a better understanding of how people are using Rust so that we can make decisions that match how the community is working.

First, to every single one of the 1360 people who responded to this survey, thank you! This exceeded my expectations and it means a lot to have had so many people take the time to help with this.

All the data can be found here

What did you want to answer?

I had assistance from a psychology researcher at a local university to construct the survey and her help guided the structure and many of the questions. An important element of this was that the questions provided shouldn’t influence people into a certain answer, and that meant questions were built in a way to get a fair response that didn’t lead people into a certain outcome or response pattern. As a result, it’s likely that the reasons for the survey was not obvious to the participants.

What we wanted to determine from this survey:

  • How are developers installing rust toolchains so that we can attract them to OpenSUSE by reducing friction?
  • In what ways are people using distribution rust packages in their environments (contrast to rustup)?
  • Should our rust package include developer facing tools, or is it just another component of a build pipeline?
  • When people create or distribute rust software, how are they managing their dependencies, and do we need to provide tools to assist?
  • Based on the above, how can we make it easier for people to distribute rust software in packages as a distribution?
  • How do developers manage security issues in rust libraries, and how can this be integrated to reduce packaging friction?

Lets get to the data

As mentioned there were 1360 responses. Questions were broken into three broad categories.

  • Attitude
  • Developers
  • Distributors


This section was intended to be a gentle introduction to the survey, rather than answering any specific question. This section had 413 non-answers, which I will exclude for now.

We asked three questions:

  • Rust is important to my work or projects (1 disagree - 5 agree)
  • Rust will become more important in my work or projects in the future.  (1 disagree - 5 agree)
  • Rust will become more important to other developers and projects in the future (1 disagree - 5 agree)
../../../_images/1.png ../../../_images/2.png ../../../_images/3.png

From this there is strong support that rust is important to individuals today. It’s likely this is biased as the survey was distributed mainly in rust communities, however, we still had 202 responses that were less than 3. Once we look at the future questions we see strong belief that rust will become more important. Again this is likely to be biased due to the communities the survey was distributed within, but we still see small numbers of people responding that rust will not be important to others or themself in the future.

As this section was not intended to answer any questions, I have chosen not to use the responses of this section in other areas of the analysis.


This section was designed to help answer the following questions:

  • How are people installing rust toolchains so that we can attract them to OpenSUSE by reducing friction?
  • In what ways are people using distribution rust packages in their environments (contrast to rustup)?
  • Should our rust package include developer facing tools, or is it just another component of a build pipeline?

We asked the following questions:

  • As a developer, I use Rust on the following platforms while programming.
  • On your primary development platform, how did you install your Rust toolchain?
  • The following features or tools are important in my development environment (do not use 1 - use a lot 5)
    • Integrated Development Environments with Language Features (syntax highlight, errors, completion, type checking
    • Debugging tools (lldb, gdb)
    • Online Documentation (,
    • Offline Documentation (local)
    • Build Caching (sccache)

Generally we wanted to know what platforms people were using so that we could establish what people on linux were using today vs what people on other platforms were using, and then knowing what other platforms are doing we can make decisions about how to proceed.


There were 751 people who responded that they were a developer in this section. We can see Linux is the most popular platform used while programming, but for “Linux only” (derived by selecting responses that only chose Linux and no other platforms) this number is about equal to Mac and Windows. Given the prevalence of containers and other online linux environments it would make sense that developers access multiple platforms from their preferred OS, which is why there are many responses that selected multiple platforms for their work.


From the next question we see overwhelming support of rustup as the preferred method to install rust on most developer machines. As we did not ask “why” we can only speculate on the reasons for this decision.


When we isolate this to “Linux only”, we see a slight proportion increase in package manager installed rust environments, but there remains a strong tendancy for rustup to be the preferred method of installation.

This may indicate that even within Linux distros with their package manager capabilities, and even with distributions try to provide rapid rust toolchain updates, that developers still prefer to use rust from rustup. Again, we can only speculate to why this is, but it already starts to highlight that distribution packaged rust is unlikely to be used as a developer facing tool.

../../../_images/7.png ../../../_images/8.png ../../../_images/9.png

Once we start to look at features of rust that developers rely on we see a very interesting distribution. I have not included all charts here. Some features are strongly used (IDE rls, online docs) where others seem to be more distributed in attitude (debuggers, offline docs, build caching). From the strongly supported features when we filter this by linux users using distribution packaged rust, we see a similar (but not as strong) trend for importance of IDE features. The other features like debuggers, offline docs and build caching all remain very distributed. This shows that tools like rls for IDE integration are very important, but with only a small number of developers using packaged rust as developers versus rustup it may not be an important area to support with limited packaging resources and time. It’s very likely that developers who are on other distributions, mac or windows are more comfortable with a rustup based installation process.


This section was designed to help answer the following questions:

  • Should our rust package include developer facing tools, or is it just another component of a build pipeline?
  • When people create or distribute rust software, how are they managing their dependencies, and do we need to provide tools to assist?
  • Based on the above, how can we make it easier for people to distribute rust software in packages as a distribution?
  • How do developers manage security issues in rust libraries, and how can this be integrated to reduce packaging friction?

We asked the following questions:

  • Which platforms (operating systems) do you target for Rust software
  • How do you or your team/community build or provide Rust software for people to use?
  • In your release process, how do you manage your Rust dependencies?
  • In your ideal workflow, how would you prefer to manager your Rust dependencies?
  • How do you manage security updates in your Rust dependencies?

Our first question here really shows the popularity of Linux as a target platform for running rust with 570 out of 618 responses indicating they target Linux as a platform.


Once we look at the distribution methods, both building projects to packages and using distribution packaged rust in containers fall well behind the use of rustup in containers and locally installed rust tools. However if we observe container packaged rust and packaged rust binaries (which likely use the distro rust toolchains) we have 205 uses of the rust package out of 1280 uses, where we see 59 out of 680 from developers. This does indicate a tendancy that the rust package in a distribution is more likely to be used in a build pipeline over developer use - but rustup still remains most popular. I would speculate that this is because developers want to recreate the same process on their development systems as their target systems which would likely involve rustup as the method to ensure the identical toolchains are installed.

The next questions were focused on rust dependencies - as a staticly linked language, this changes the approach to how libraries can be managed. To answer how we as a distribution should support people in the way they want to manage libraries, we need to know how they use it today, and how they would ideally prefer to manage this in the future.

../../../_images/12.png ../../../_images/13.png

In both the current process and ideal processes we see a large tendancy to online library use from, and in both cases vendoring (pre-downloading) comes in second place. Between the current process and ideal process, we see a small reduction in online library use to the other options. As a distribution, since we can not provide online access to crates, we can safely assume most online crates users would move to vendoring if they had to work offline for packaging as it’s the most similar process available.

../../../_images/14.png ../../../_images/15.png

We can also look at some other relationships here. People who provide packages still tend to ideally prefer online crates usage, with distribution libraries coming in second place here. There is still significant momentum for packagers to want to use vendoring or online dependencies though. When we look at ideal management strategies for container builds, we see distribution packages being much less popular, and online libraries still remaining at the top.


Finally, when we look at how developers are managing their security updates, we see a really healthy statistic that many people are using tools like cargo audit and cargo outdated to proactively update their dependencies. Very few people rely on distribution packages for their updates however. But it remains that we see 126 responses from users who aren’t actively following security issues which again highlights a need for distributions who do provide rust packaged software to be proactive to detect issues that may exist.


By now we have looked at a lot of the survey and the results, so it’s time to answer our questions.

  • How are people installing rust toolchains so that we can attract them to OpenSUSE by reducing friction?

Developers are preferring the use of rustup over all other sources. Being what’s used on linux and other platforms, we should consider packaging and distributing rustup to give options to users (who may wish to avoid the curl | sh method.) I’ve already started the process to include this in OpenSUSE tumbleweed.

  • In what ways are people using distribution rust packages in their environments (contrast to rustup)?
  • Should our rust package include developer facing tools, or is it just another component of a build pipeline?

Generally developers tend strongly to rustup for their toolchains, where distribution rust seems to be used more in build pipelines. As a result of the emphasis on online docs and rustup, we can likely remove offline documentation and rls from the distribution packages as they are either not being used or have very few users and is not worth the distribution support cost and maintainer time. We would likely be better to encourage users to use rustup for developer facing needs instead.

To aid this argument, it appears that rls updates have been not functioning in OpenSUSE tumbleweed for a few weeks due to a packaging mistake, and no one has reported the issue - this means that the “scream test” failed. The lack of people noticing this again shows developer tools are not where our focus should be.

  • When people create or distribute rust software, how are they managing their dependencies, and do we need to provide tools to assist?
  • Based on the above, how can we make it easier for people to distribute rust software in packages as a distribution?

Distributors prefer cargo and it’s native tools, and this is likely an artifact of the tight knit tooling that exists in the rust community. Other options don’t seem to have made a lot of headway, and even within distribution packaging where you may expect stronger desire for packaged libraries, we see a high level of support for cargo directly to manage rust dependencies. From this I think it shows that efforts to package rust crates have not been effective to attract developers who are currently used to a very different workflow.

  • How do developers manage security issues in rust libraries, and how can this be integrated to reduce packaging friction?

Here we see that many people are proactive in updating their libraries, but there still exists many who don’t actively manage this. As a result, automating tools like cargo audit inside of build pipelines will likely help packagers, and also matches their existing and known tools. Given that many people will be performing frequent updates of their libraries or upstream releases, we’ll need to also ensure that the process to update and commit updates to packages is either fully automated or at least has a minimal hands on contact as possible. When combined with the majority of developers and distributors prefering online crates for dependencies, encouraging people to secure these existing workflows will likely be a smaller step for them. Since rust is staticly linked, we can also target our security efforts at leaf (consuming) packages rather than the libraries themself.


Again, thank you to everyone who answered the survey. It’s now time for me to go and start to do some work based on this data!

Fri, 08 Oct 2021 00:00:00 +1000 <![CDATA[Gnome 3 compare to MacOs]]> Gnome 3 compare to MacOs

An assertion I have made in the past is that to me “Gnome 3 feels like MacOs with rough edges”. After some discussions with others, I’m finally going to write this up with examples.

It’s worth pointing out that in my opinion, Gnome 3 is probably still the best desktop experience on Linux today for a variety of reasons - it’s just that for me, these rough edges really take away from that being a good experience for me.

High Level Structure Comparison

Here’s a pair of screenshots of MacOS 11.5.2 and Gnome 40.4. In both we have the settings menu open of the respective environment. Both are set to the resolution of 1680x1050, with the Mac using scaling from retina (2880x1800) to this size.

../../../_images/gnome-settings-1.png ../../../_images/macos-settings-1.png

From this view, we can already make some observations. Both of these have a really similar structure which when we look at appears like this:


The skeleton overall looks really similar, if not identical. We have a top bar that provides a system tray and status and a system context in the top left, as well as application context.

Now we can look at some of the details of each of the platforms at a high level from this skeleton.

We can see on the Mac that the “top menu bar” takes 2.6% of our vertical screen real-estate. Our system context is provided by the small Apple logo in the top left that opens to a menu of various platform options.

Next to that, we can see that our system preferences uses that top menu bar to provide our application context menus like edit, view, window and help. Further, on the right side of this we have a series of icons for our system - some of these from third party applications like nextcloud, and others coming from macos showing our backup status, keyboard, audio, battery, wifi time and more. This is using the space at the top of our screen really effectively, it doesn’t feel wasted, and adds context to what we are doing.

If we now look at Gnome we can see a different view. Our menu bar takes 3.5% of our vertical screen realestate, and the dark colour already feels like it is “dominating” visually. In that we have very little effective horizontal space use. The activities button (system context) takes us to our overview screen, and selecting the “settings” item which is our current application has no response or menu displayed.

The system tray doesn’t allow 3rd party applications, and the overview only shows our network and audio status and our clock (battery may be displayed on a laptop). To find more context about our system requires interaction with the single component at the top right, limiting our ability to interact with a specific element (network, audio etc) or understand our systems state quickly.

Already we can start to see some differences here.

  • UI elements in MacOS are smaller and consume less screen space.
  • Large amounts of non-functional dead space in Gnome
  • Elements are visually more apparently and able to be seen at a high level, where Gnome’s require interaction to find details

System Preferences vs Settings

Let’s compare the system preferences and Settings now. These are still similar, but not as close as our overall skeleton and this is where we start to see more about the different approaches to design in each.

The MacOS system preferences has all of it’s top level options displayed in a grid, with an easily accesible search function and forward and back navigation aides. This make it easy to find the relevant area that is required, and everything is immediately accessible and clear. Searching for items dims the application and begins to highlight elements that contain the relevant topic, helping to guide you to the location and establishing to the user where they can go in the future without the need to search. Inside any menu of the system preferences, search is always accesible and in the same consistent location of the application.


When we look at Gnome, in the settings application we see that not all available settings are displayed - the gutter column on the left is a scrollable UI element, but with no scroll bars present, this could be missed by a user that the functionality is present. Items like “Applications” which have a “>” present confusingly changes the gutter context to a list of applications rather than remaining at the top level when selected like all other items that don’t have the “>”. Breaking the users idea of consistency, when in these sub-gutters, the search icon is replaced with the “back” navigation icon, meaning you can not search when in a sub-gutter.

Finally, even visually we can see that the settings is physically larger as a window, with much larger fonts and the title bar containing much more dead space. The search icon (when present) requires interaction before the search text area appears adding extra clicks and interactions to achieve the task.

When we do search, the results are replaced into the gutter element. Screen lock here is actually in a sub-gutter menu for privacy, and not discoverable at the top level as an element. The use of nested gutters here adds confusion about where items are due to all the gutter content changes.


Again we are starting to see differences here:

  • MacOS search uses greater visual feedback to help guide users to where they need to be
  • Gnome hides many options in sub-menus, or with very few graphical guides which hinders discovery of items
  • Again, the use of dead space in Gnome vs the greater use of space in MacOS
  • Gnome requires more interactions to “get around” in general
  • Gnome applications visually are larger and take up more space of the screen
  • Gnome changes the UI and layout in subtle and inconsistent ways that rely on contextual knowledge of “where” you currently are in the application

Context Menus

Lets have a look at some of the menus that exist in the system tray area now. For now I’ll focus on audio, but these differences broadly apply to all of the various items here on MacOS and Gnome.

On MacOS when we select our audio icon in the system tray, we are presented with a menu that contains the current volume, the current audio output device (including options for network streaming) and a link to the system preferencs control panel for further audio settings that may exist. We aren’t overwhelmed with settings or choices, but we do have the ability to change our common options and shortcut links to get to the extended settings if needed.


A common trick in MacOS though is holding the option key during interactions. Often this can display power-user or extended capabilities. When done on the audio menu, we are also able to then control our input device selection.


On Gnome, in the system tray there is only a single element, that controls audio, power, network and more.


All we can do in this menu is control the volume - that’s it. There are no links to direct audio settings, device management, and there are no “hidden” shortcuts (like option) that allows greater context or control.

To summarise our differences:

  • MacOS provides topic-specific system tray menus, with greater functionality and links to further settings
  • Gnome has a combined menu, that is limited in functionality, and has only a generic link to settings
  • Gnome lacks the ability to gain extended options for power-users to view extra settings or details

File Browser

Finally lets look at the file browser. For fairness, I’ve changed Gnome’s default layout to “list” to match my own usage in finder.


We can already see a number of useful elements here. We have the ability to “tree” folders through the “>” icon, and rows of the browser alternate white/grey to help us visually identify lines horizontally. The rows are small and able to have (in this screenshot) 16 rows of content on the screen simultaneously. Finally, not shown here, but MacOS finder can use tabs for browsing different locations. And as before, we have our application context menu in the top bar with a large amount of actions available.


Gnomes rows are all white with extremely faint grey lines to delineate, making it hard to horizontally track items if the window was expanded. The icons are larger, and there is no ability to tree the files and folders. We can only see ~10 rows on screen despite the similar size of the windows presented here. Finally, the extended options are hidden in the “burger” menu next to the application close.

A theme should be apparent here:

  • Both MacOS and Gnome share a very similar skeleton of how this application is laid out
  • MacOS makes better use of visual elements to help your eye track across spaces to make connections
  • Gnome has a lot of dead space still and larger text and icons which takes greater amounts of screen space
  • Due to the application context and other higher level items, MacOS is “faster” to get to where you need to go

Keyboard Shortcuts

Keyboard shortcuts are something that aide powerusers to achieve tasks quicker, but the challenge is often finding what shortcuts exist to use them. Lets look at how MacOS and Gnome solve this.


Here in MacOS, anytime we open a menu, we can see the shortcut listed next to the menu item that is present, including disabled items (that are dimmed). Each shortcut’s symbols match the symbols of the keyboard allowing these to be cross-language and accessible. And since we are in a menu, we remain in the context of our Application and able to then immediately use the menu or shortcut.

In fact, even if we select the help menu and search a new topic, rather than take us away from menu’s, MacOS opens the menu and points us to where we are trying to go, allowing us to find the action we want and learn it’s shortcut!


This is great, because it means in the process of getting help, we are shown how to perform the action for future interactions. Because of the nature of MacOS human interface guidelines this pattern exists for all applications on the platform, including third party ones helping to improve accessibility of these features.

Gnome however takes a really different approach. Keyboard shortcuts are listed as a menu item from our burger menu.


When we select it, our applications context is taken away and replaced with a dictionary of keyboard shortcuts, spread over three pages.


I think the use of the keyboard icons here is excellent, but because we are now in a dictionary of shortcuts, it’s hard to find what we want to use, and we “taken away” from the context of the actions we are trying to perform in our application. Again, we have to perform more interactions to find the information that we are looking for in our applications, and we aren’t able to easily link the action to the shortcut in this style of presentation. We can’t transfer our knowledge of the “menus” into a shortcut that we can use without going through a reference manual.

Another issue here is this becomes the responsibility of each application to create these references and provide them, rather than being an automatically inherited feature through the adherence to human interface guidelines.


Honestly, I could probably keep making these comparisons all day. Gnome 3 and MacOS really do feel very similar to me. From style of keyboard shortcuts, layout of the UI, the structure of it’s applications and even it’s approach to windowing feels identical to MacOS. However while it looks similar on a surface level, there are many rough edges, excess interactions, poor use of screen space and visual elements.

MacOS certainly has it’s flaws, and makes it’s mistakes. But from a ease of use perspective, it tries to get out of the way and show you how to use the computer for yourself. MacOS takes a back seat to the usage of the computer.

Gnome however feels like it wants to be front and centre. It needs you to know all the time “you’re using Gnome!”. It takes you on a small adventure tour to complete simple actions or to discover new things. It even feels like Gnome has tried to reduce “complexity” so much that they have thrown away many rich features and interactions that could make a computer easier to use and interact with.

So for me, this is why I feel that Gnome is like MacOS with rough edges. There are many small, subtle and frustrating user interactions like this all through out the Gnome 3 experience that just aren’t present in MacOS.

Sun, 12 Sep 2021 00:00:00 +1000 <![CDATA[StartTLS in LDAP]]> StartTLS in LDAP

LDAP as a protocol is a binary protocol which uses ASN.1 BER encoded structures to communicate between a client and server, to query directory information (ie users, groups, locations, etc).

When this was created there was little consideration to security with regard to person-in-the-middle attacks (aka mitm: meddler in the middle, interception). As LDAP has become used not just as a directory service for accessing information, but now as an authentication and authorisation system it’s important that the content of these communications is secure from tampering or observation.

There have been a number of introduced methods to try and assist with this situation. These are:

  • StartTLS
  • SASL with encryption layers
  • LDAPS (LDAP over TLS)

Other protocols of a similar age also have used StartTLS such as SMTP and IMAP. However recent research has (again) shown issues with correct StartTLS handling, and recommends using SMTPS or IMAPS.

Today the same is true of LDAP - the only secure method of communication to an LDAP server is LDAPS. In this blog, I’ll be exploring the issues that exist with StartTLS (I will not cover SASL or GSSAPI).

How does StartTLS work?

StartTLS works by starting a plaintext (unencrypted) connection to the LDAP server, and then by upgrading that connection to begin TLS within the existing connection.

┌───────────┐                            ┌───────────┐
│           │                            │           │
│           │─────────open tcp 389──────▶│           │
│           │◀────────────ok─────────────│           │
│           │                            │           │
│           │                            │           │
│           │────────ldap starttls──────▶│           │
│           │◀──────────success──────────│           │
│           │                            │           │
│  Client   │                            │  Server   │
│           │──────tls client hello─────▶│           │
│           │◀─────tls server hello──────│           │
│           │────────tls key xchg───────▶│           │
│           │◀────────tls finish─────────│           │
│           │                            │           │
│           │──────TLS(ldap bind)───────▶│           │
│           │                            │           │
│           │                            │           │
└───────────┘                            └───────────┘

As we can see in LDAP StartTLS we establish a valid plaintext tcp connection, and then we send and LDAP message containing a StartTLS extended operation. If successful, we begin a TLS handshake over the connection, and when complete, our traffic is now encrypted.

This is contrast to LDAPS where TLS must be successfully established before the first LDAP message is exchanged.

It’s a good time to note that this is inefficent as it takes an extra round-trip to establish StartTLS like this contrast to LDAPS which increases latency for all communications. LDAP clients tend to open and close many connections, so this adds up quickly.

Security Issues

Client Misconfiguration

LDAP servers at the start of a connection will only accept two LDAP messages. Bind (authenticate) and StartTLS. Since StartTLS starts with a plaintext connection, if a client is misconfigured it is trivial for it to operate without StartTLS.

For example, consider the following commands.

# ldapwhoami -H ldap:// -x -D 'cn=Directory Manager' -W
Enter LDAP Password:
dn: cn=directory manager
# ldapwhoami -H ldap:// -x -Z -D 'cn=Directory Manager' -W
Enter LDAP Password:
dn: cn=directory manager

Notice that in both, the command succeeds and we authenticate. However, only in the second command are we using StartTLS. This means we trivially leaked our password. Forcing LDAPS to be the only protocol prevents this as every byte of the connection is always encrypted.

# ldapwhoami -H ldaps:// -x -D 'cn=Directory Manager' -W
Enter LDAP Password:
dn: cn=directory manager

Simply put this means that if you forget to add the command line flag for StartTLS, forget the checkbox in an admin console, or any other kind of possible human error (which happen!), then LDAP will silently continue without enforcing that StartTLS is present.

For a system to be secure we must prevent human error from being a factor by removing elements of risk in our systems.


A response to the above is to enforce MinSSF, or “Minimum Security Strength Factor”. This is an option on both OpenLDAP and 389-ds and is related to the integration of SASL. It represents that the bind method used must have “X number of bits” of security (however X is very arbitrary and not really representative of true security).

In the context of StartTLS or TLS, the provided SSF becomes the number of bits in the symmetric encryption used in the connection. Generally this is 128 due to the use of AES128.

Let us assume we have configured MinSSF=128 and we attempt to bind to our server.

┌───────────┐                            ┌───────────┐
│           │                            │           │
│           │─────────open tcp 389──────▶│           │
│           │◀────────────ok─────────────│           │
│  Client   │                            │  Server   │
│           │                            │           │
│           │──────────ldap bind────────▶│           │
│           │◀───────error - minssf──────│           │
│           │                            │           │
└───────────┘                            └───────────┘

The issue here is the minssf isn’t enforced until the bind message is sent. If we look at the LDAP rfc we see:

BindRequest ::= [APPLICATION 0] SEQUENCE {
     version                 INTEGER (1 ..  127),
     name                    LDAPDN,
     authentication          AuthenticationChoice }

AuthenticationChoice ::= CHOICE {
     simple                  [0] OCTET STRING,
                             -- 1 and 2 reserved
     sasl                    [3] SaslCredentials,
     ...  }

SaslCredentials ::= SEQUENCE {
     mechanism               LDAPString,
     credentials             OCTET STRING OPTIONAL }

Which means that in a simple bind (password) in the very first message we send our plaintext password. MinSSF only tells us after we already made the mistake, so this is not a suitable defence.

StartTLS can be disregarded

An interesting aspect of how StartTLS works with LDAP is that it’s possible to prevent it from being installed successfully. If we look at the RFC:

If the server is otherwise unwilling or unable to perform this
operation, the server is to return an appropriate result code
indicating the nature of the problem.  For example, if the TLS
subsystem is not presently available, the server may indicate this by
returning with the resultCode set to unavailable.  In cases where a
non-success result code is returned, the LDAP session is left without
a TLS layer.

What this means is it is up to the client and how they respond to this error to enforce a correct behaviour. An example of a client that disregards this error may proceed such as:

┌───────────┐                            ┌───────────┐
│           │                            │           │
│           │─────────open tcp 389──────▶│           │
│           │◀────────────ok─────────────│           │
│           │                            │           │
│  Client   │                            │  Server   │
│           │────────ldap starttls──────▶│           │
│           │◀───────starttls error──────│           │
│           │                            │           │
│           │─────────ldap bind─────────▶│           │
│           │                            │           │
└───────────┘                            └───────────┘

In this example, the ldap bind proceeds even though TLS is not active, again leaking our password in plaintext. A classic example of this is OpenLDAP’s own cli tools which in almost all examples of StartTLS online use the option ‘-Z’ to enable this.

# ldapwhoami -Z -H ldap:// -D 'cn=Directory Manager' -w password
ldap_start_tls: Protocol error (2)
dn: cn=Directory Manager

The quirk is that ‘-Z’ here only means to try StartTLS. If you want to fail when it’s not available you need ‘-ZZ’. This is a pretty easy mistake for any administrator to make when typing a command. There is no way to configure in ldap.conf that you always want StartTLS enforced either leaving it again to human error. Given the primary users of the ldap cli are directory admins, this makes it a high value credential open to potential human input error.

Within client applications a similar risk exists that the developers need to correctly enforce this behaviour. Thankfully for us, the all client applications that I tested handle this correctly:

  • SSSD
  • nslcd
  • ldapvi
  • python-ldap

However, I am sure there are many others that should be tested to ensure that they correctly handle errors during StartTLS.

Referral Injection

Referral’s are a feature of LDAP that allow responses to include extra locations where a client may look for the data they requested, or to extend the data they requested. Due to the design of LDAP and it’s response codes, referrals are valid in all response messages.

LDAP StartTLS does allow a referral as a valid response for the client to then follow - this may be due to the requested server being undermaintenance or similar.

Depending on the client implementation, this may allow an mitm to proceed. There are two possible scenarioes.

Assuming the client does do certificate validation, but is poorly coded, the following may occur:

┌───────────┐                            ┌───────────┐
│           │                            │           │
│           │─────────open tcp 389──────▶│           │
│           │◀────────────ok─────────────│           │
│           │                            │  Server   │
│           │                            │           │
│           │────────ldap starttls──────▶│           │
│           │◀──────────referral─────────│           │
│           │                            │           │
│           │                            └───────────┘
│  Client   │
│           │                            ┌───────────┐
│           │─────────ldap bind─────────▶│           │
│           │                            │           │
│           │                            │           │
│           │                            │ Malicious │
│           │                            │  Server   │
│           │                            │           │
│           │                            │           │
│           │                            │           │
└───────────┘                            └───────────┘

In this example our server sent a referral as a response to the StartTLS extended operation, which the client then followed - however the client did not attempt to install StartTLS again when contacting the malicious server. This would allow a bypass of certification validation by simply never letting TLS begin at all. Thankfully the clients I tested did not exhibt this behaviour, but it is possible.

If the client has configured certificate validation to never (tls_reqcert = never, which is a surprisingly common setting …) then the following is possible.

┌───────────┐                            ┌───────────┐
│           │                            │           │
│           │─────────open tcp 389──────▶│           │
│           │◀────────────ok─────────────│           │
│           │                            │  Server   │
│           │                            │           │
│           │────────ldap starttls──────▶│           │
│           │◀──────────referral─────────│           │
│           │                            │           │
│           │                            └───────────┘
│  Client   │
│           │                            ┌───────────┐
│           │────────ldap starttls──────▶│           │
│           │◀──────────success──────────│           │
│           │                            │           │
│           │◀──────TLS installed───────▶│ Malicious │
│           │                            │  Server   │
│           │───────TLS(ldap bind)──────▶│           │
│           │                            │           │
│           │                            │           │
└───────────┘                            └───────────┘

In this example the client follows the referral and then attempts to install StartTLS again. The malicious server may present any certificate it wishes and can then intercept traffic.

In my testing I found that this affected both SSSD and nslcd, however both of these when redirected to the malicous server would attempt to install StartTLS over an existing StartTLS channel, which caused the server to return an error condition. Potentially a modified malicious server in this case would be able to install two layers of TLS, or a response that would successfully trick these clients to divulging further information. I have not yet spent time to research this further.


While not as significant as the results found on “No Start TLS”, LDAP still is potentially exposed to risks related to StartTLS usage. To mitigate these LDAP server providers should disable plaintext LDAP ports and exclusively use LDAPS, with tls_reqcert set to “demand”.

Thu, 12 Aug 2021 00:00:00 +1000 <![CDATA[Getting started with Yew]]> ]]> Sun, 20 Jun 2021 00:00:00 +1000 <![CDATA[Compiler Bootstrapping - Can We Trust Rust?]]> Compiler Bootstrapping - Can We Trust Rust?

Recently I have been doing a lot of work for SUSE with how we package the Rust compiler. This process has been really interesting and challenging, but like anything it’s certainly provided a lot of time for thought while waiting for my packages to build.

The Rust package in OpenSUSE has two methods of building the compiler internally in it’s spec file.

    1. Use our previously packaged version of rustc from packages
    1. Bootstrap using the signed and prebuilt binaries provided by the rust project


There are many advocates of bootstrapping and then self sustaining a chain of compilers within a distribution. The roots of this come from Ken Thompsons Turing Award speech known as Reflections on trusting trust . This details the process in which a compiler can be backdoored, to produce future backdoored compilers. This has been replicated by Manish G. detailed in their blog, Reflections on Rusting Trust where they successfully create a self-hosting backdoored rust compiler.

The process can be visualised as:

┌──────────────┐              ┌──────────────┐
│  Backdoored  │              │   Trusted    │
│   Sources    │──────┐       │   Sources    │──────┐
│              │      │       │              │      │
└──────────────┘      │       └──────────────┘      │
                      │                             │
┌──────────────┐      │       ┌──────────────┐      │      ┌──────────────┐
│   Trusted    │      ▼       │  Backdoored  │      ▼      │  Backdoored  │
│ Interpreter  │──Produces───▶│    Binary    ├──Produces──▶│    Binary    │
│              │              │              │             │              │
└──────────────┘              └──────────────┘             └──────────────┘

We can see that in this attack, even with a set of trusted compiler sources, we can continue to produce a chain of backdoored binaries.

This has led to many people, and even groups such as Bootstrappable promoting work to be able to produce trusted chains from trusted sources, so that we can assert a level of trust in our produced compiler binaries.

┌──────────────┐              ┌──────────────┐
│   Trusted    │              │   Trusted    │
│   Sources    │──────┐       │   Sources    │──────┐
│              │      │       │              │      │
└──────────────┘      │       └──────────────┘      │
                      │                             │
┌──────────────┐      │       ┌──────────────┐      │      ┌──────────────┐
│   Trusted    │      ▼       │              │      ▼      │              │
│ Interpreter  │──Produces───▶│Trusted Binary├──Produces──▶│Trusted Binary│
│              │              │              │             │              │
└──────────────┘              └──────────────┘             └──────────────┘

This process would continue forever to the right, where each trusted binary is the result of trusted sources. This then ties into topics like reproducible builds which assert that you can separately rebuild the sources and attain the same binary, showing the process can not have been tampered with.

But does it really work like that?

Outside of thought exercises, there is little evidence of these attacks being carried out in reality.

Last year in 2020 we saw supply chain attacks such as the Solarwinds supply chain attacks which was reported by Fireeye as “Inserting malicious code into legitimate software updates for the Orion software that allow an attacker remote access into the victim’s environment”. What’s really interesting here was that no compiler was compromised in the process like our theoretical attack, but code was simply inserted and then subsequently was released.

Tavis Ormandy in his blog You don’t need reproducible builds covers supply chain security, and examines why reproducible builds are not effective in the promises and claims they present. Importantly, Tavis discusses how trivial it is to insert “bugdoors”, or pieces of code that are malicious and will not be found, and can potentially be waved off as human error.

Today, we don’t even need bugdoors, with Microsoft Security Response Centre reporting that 70% of vulnerabilities are memory safety issues.

No amount of reproducible builds or compiler bootstrapping chain can shield us from the reality that attackers today will target the softest area, and today that is security issues in our languages, and insecure configuration of supply chain infrastructure.

We don’t need backdoored compilers when we know that a security critical piece of software written in C is still exposed to the network.

But lets assume …

Okay, so lets assume that backdoored compilers are a real risk for a moment. We need to establish a few things first to create our secure bootstrapping environment, and these requirements generally are extremely difficult to meet.

We will need:

  • Trusted Interpreter
  • Trusted Sources

This is the foundation, having these two trusted entities that we can use to begin the process. But what is “trusted”? How can we define that these items are truly trusted?

One method could be to check the cryptographic signatures of the released source code, to validate that it is “what was released”, but this does not mean that the source code is free from backdoors/bugdoors which are the very thing we are attempting to shield ourselves from.

What would be truly required here is a detailed and complete audit of all of the source code to these compilers, which would be a monumental task in and of itself. So today instead, we do not perform source code audits, and we blindly trust the providers of the source code as legitimate and having provided us tamper-free source code. We assert that blind trust through the validation of those cryptographic signatures. We blindly trust that they have vetted every commit and line of code, and they have not had their own source code supply chain compromised in some way to provide us this “trusted source”. This gives us a relationship with the producers of that source, that they are trustworthy and have performed vetting of code and their members with privileges, that they will “do the right thing”™.

The second challenge is asserting trust in the interpreter. Where did this binary come from? How was it built? Were it’s sources trusted? As one can imagine, this becomes a very deep rabbit hole when we want to chase it, but in reality the approach taken by todays linux distributions is that “well we haven’t been compromised to this point, so I guess this one is okay” and we yolo build with it. We then create a root of trust in that one point in time, which then creates our bootstrapping chain of trust for future builds of subsequent trusted sources.

So what about Rust?

Rust is interesting compared to something like C (clang/gcc), as the rust project not only provides signed sources, they also provide signed static binaries of their compiler. This is because unlike clang/gcc which have very long release lifecycles, rust is released every six weeks and to build version N of the compiler, requires version N or N - 1. This allows people who have missed a version to easily skip ahead without needing to build every intermediate version of the compiler.

A frequent complaint is the difficulty to package rust because any time releases are missed, you must compile every intermediate version to adhere to the bootstrappable guidelines and principles to created a more “trusted” compiler.

But just like any other humans, in order to save time, when we miss a version, we can use the rust language’s provided signed binaries to reset the chain, allowing us to miss versions of rust, or to re-package older versions in some cases.

                        ┌──────────────┐             ┌──────────────┐
                 │      │   Trusted    │             │   Trusted    │
              Missed    │   Sources    │──────┐      │   Sources    │──────┐
             Version!   │              │      │      │              │      │
                 │      └──────────────┘      │      └──────────────┘      │
                 │                            │                            │
┌──────────────┐ │      ┌──────────────┐      │      ┌──────────────┐      │
│              │ │      │Trusted Binary│      ▼      │              │      ▼
│Trusted Binary│ │      │ (from rust)  ├──Produces──▶│Trusted Binary│──Produces───▶ ...
│              │ │      │              │             │              │
└──────────────┘ │      └──────────────┘             └──────────────┘

This process here is interesting because:

  • Using the signed binary from rust-lang is actually faster since we can skip one compiler rebuild cycle due to being the same version as the sources
  • It shows that the “bootstrappable” trust chain, does not actually matter since we frequently move our trust root to the released binary from rust, rather than building all intermediates

Given this process, we must ask, what value do we have from trying to adhere to the bootstrappable principles with rust? We already root our trust in the rust project, meaning that because we blindly trust the sources and the static compiler, why would our resultant compiler be any more “trustworthy” just because we were the ones who compiled it?

Beyond this the binaries that are issued by the rust project are used by thousands of people every day through tools like rustup. In reality, these have been proven time and time again that they are trusted to be able to run on mass deployments, and that the rust project has the ability and capability to respond to issues in their source code as well as the binaries they provide. They certainly have earned the trust of many people through this!

So why do we keep assuming both that we are somehow more trustworthy than the rust project, but simultaneously they are fully trusted in the artefacts they provide to us?


It is this contradiction that has made me rethink the process that we take to packaging rust in SUSE. I think we should bootstrap from upstream rust every release because the rust project are in a far better position to perform audits and respond to trust threats than part time package maintainers that are commonly part of Linux distributions.

│ ┌──────────────┐                              │ ┌──────────────┐
│ │   Trusted    │                              │ │   Trusted    │
│ │   Sources    │──────┐                       │ │   Sources    │──────┐
│ │              │      │                       │ │              │      │
│ └──────────────┘      │                       │ └──────────────┘      │
│                       │                       │                       │
│ ┌──────────────┐      │      ┌──────────────┐ │ ┌──────────────┐      │      ┌──────────────┐
│ │Trusted Binary│      ▼      │              │ │ │Trusted Binary│      ▼      │              │
│ │ (from rust)  ├──Produces──▶│Trusted Binary│ │ │ (from rust)  ├──Produces──▶│Trusted Binary│
│ │              │             │              │ │ │              │             │              │
│ └──────────────┘             └──────────────┘ │ └──────────────┘             └──────────────┘

We already fully trust the sources they release, and we already fully trust their binary compiler releases. We can simplify our build process (and speed it up!) by acknowledging this trust relationship exists, rather than trying to continue to convince ourselves that we are somehow “more trusted” than the rust project.

Also we must consider the reality of threats in the wild. Does all of this work and discussions of who is more trusted really pay off and defend us in reality? Or are we focused on these topics because they are something that we can control and have opinions over, rather than acknowledging the true complexity and dirtiness of security threats as they truly exist today?

Wed, 12 May 2021 00:00:00 +1000 <![CDATA[Open Source Enshrines the Wrong Privilege]]> Open Source Enshrines the Wrong Privilege

Within Open Source/Free Software, we repeatedly see a set of behaviours - hostile or toxic project owners, abusive relationships, aggression towards users, and complete disregard to users of the software. Some projects have risen above this and advanced the social behaviours in their communities, but these are still the minority of projects.

Many advocates for FLOSS have been trying to enhance adoption of these technologies in communities, but with the exception of limited non-technical audiences, this really hasn’t gained much ground.

It is my opinion that these community behaviours, and the low adoption of FLOSS technologies comes back to what our Open Source licenses enshrine - the very thing they embody and create.

The Origins of Free Software

The story of Free Software starts with an individual (later revealed as abusive), who was frustrated at not being able to access software on a printer so that he could alter it’s behaviour. This has been extended to the idea that Free Software “grants people control over their own lives and software”.

This however, is not correct.

What Free Software licenses protect is that individuals with time, resources, specialised technical knowledge and social standing have the possibility to alter that software’s behaviour.

When we consider that the majority of the world are not developers or software engineers, what is it that our Free Software is doing to protect and support these individuals? Should we truly expect individuals who are linguists, authors, scientists, retail staff, or social workers to be able to “alter the software to fix their own problems”?

Even as technical experts, we are frustrated when someone closes an issue with “PR’s welcome”. Imagine how these other people feel when they can’t even express or report the problem in the first place or get told they aren’t good enough, or that “they can fix it themselves if they want”.

This attitude also discounts the subject matter knowledge required to alter or contribute to any piece of software however. I may be a Senior Software Engineer, but I lack the knowledge and time to contribute to Gnome for example. Even with these “freedoms” I lack the ability to “control” the software on my own system.

Open Source is Selfish

These licenses that we have in FLOSS all enshrine selfish and privileged behaviours.

I have the rights to freely access this code so I can read it or alter it.

I can change this project to fix issues I have.

I have freedoms.

None of these statements from FLOSS describe other people - the people who consume our software (in some cases, without choice). People who are not subject matter experts and can’t contribute to “solve their own problems”. People who may not have the experience and language to describe the problems they face.

This lack of empathy, the lack of concern for others in FLOSS leads us to where we are now. Those who have the subject matter knowledge lead projects, and do what they want because they can fix it. They tell others “PR’s welcome” knowing full-well that the other person may never be able to contribute, that the barriers to contribution are so high (both in programming experience and domain knowledge). They design the software to work the way they want, because they understand it and it “works for me”.

This is reflected in our software. Software that not does not care for the needs, experiences or rights of others. Software that pretends to be accessible, all while creating gated communities of control. Software that is out of reach of people, the same people that we “claim” to be working for and supporting.

It leads to our communities that are selfish, and do not empathise with people. Communities that have placed negative behaviours on pedestals and turned these people into “leaders”. Software that does not account for the experiences of our users, believing that the “community knows best”.

One does not need to look far for FLOSS projects that speak one set of words, but their actions do not align.

What Can We Do?

In our projects we need to go beyond preserving the freedoms of ourselves, and begin to discuss the freedoms and interactions that others should have with our systems and projects. Here are some starting ideas that I have:

  • Have a code of conduct for all contributors (remember, opening an issue is a contribution).
  • Document your target users, and what kind of experience they should have. Expand this over time.
  • Promote empathy for those who aren’t direct contributors - indirect users without choice exist.
  • Remove dependencies on as many problematic software projects as possible.
  • Push for improvements to open licenses that enshrine the freedoms of others - not just developers.

As individual communities we can advance the state of software and how we act socially so that future projects and users are in a better place. No software exists in a vacuum, all software exists to support people. We need to always keep in mind the effects our software has on others.

Tue, 23 Mar 2021 00:00:00 +1000 <![CDATA[Time Machine on Samba with ZFS]]> Time Machine on Samba with ZFS

Time Machine is Apple’s in-built backup system for MacOS. It’s probably the best consumer backup option, which really achieves “set and forget” backups.

It can backup to an external hard disk on a dock, an Apple Time Machine (wireless access point), or a custom location based on SMB shares.

Since I have a fileserver at home, I use this as my Time Machine backup target. To make this work really smoothly there are a few setup steps.

MacOS Time Machine Performance

By default timemachine operates as a low priority process. You can set a sysctl to improve the performance of this:

sysctl -w debug.lowpri_throttle_enabled=0

You will need a launchd script to make this setting survive a reboot.


I’m using ZFS on my server, which is probably the best filesystem available. To make Time Machine work well on ZFS there are a number of tuning options that can help. As these backups write and read many small files, you should have a large amount of RAM for ARC (best) or a ZIL + L2ARC on nvme. RAID 10 will likely work better than RAIDZ here as you need better seek latency than write throughput due to the need to access many small files.

For the ZFS properties on the filesystem I have set:

atime: off
dnodesize: auto
xattr: sa
logbias: latency
recordsize: 32K
compression: zstd-10
quota: 3T
# optional
sync: disabled

The important ones here are the compression setting, which in my case gives a 1.3x compression ratio to save space, the quota to prevent the backups overusing space, the recordsize that helps to minimise write fragmentation.

You may optionally choose to disable sync. This is because Time Machine issues a sync after every single file write to the server, which can cause low performance with many small files. To mitigate the data loss risk here, I both snapshot the backups directory hourly, but I also have two stripes (an A/B backup target) so that if one of the stripes goes back, I can still access the other. This is another reason that compression is useful, to help offset the cost of the duplicated data.


Inside of the backups filessytem I have two folders:


In each of these you can add a PList that applies quota limits to the time machine stripes.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "">
<plist version="1.0">

The quota is in bytes. You may not need this if you use the smb fruit:time machine max size setting.


In smb.conf I offer two shares for the A and B stripe. These have identical configurations beside the paths.

comment = Time Machine
path = /var/data/backup/timemachine_b
browseable = yes
write list = timemachine
create mask = 0600
directory mask = 0700
spotlight = no
vfs objects = catia fruit streams_xattr
fruit:aapl = yes
fruit:time machine = yes
fruit:time machine max size = 1050G
durable handles = yes
kernel oplocks = no
kernel share modes = no
posix locking = no
# NOTE: Changing these will require a new initial backup cycle if you already have an existing
# timemachine share.
case sensitive = true
default case = lower
preserve case = no
short preserve case = no

The fruit settings are required to help Time Machine understand that this share is usable for it. Most of the durable settings are related to performance improvement to help minimise file locking and to improve throughput. These are “safe” only because we know that this volume is ALSO not accessed or manipulated by any other process or nfs at the same time.

I have also added a custom timemachine user to smbpasswd, and created a matching posix account who should own these files.


You can now add this to MacOS via system preferences. Alternately you can use the command line.

tmutil setdestination smb://timemachine:password@hostname/timemachine_a

If you intend to have stripes (A/B), MacOS is capable of mirroring between two strips alternately. You can append the second stripe with (note the -a).

tmutil setdestination -a smb://timemachine:password@hostname/timemachine_b
Mon, 22 Mar 2021 00:00:00 +1000 <![CDATA[Against Packaging Rust Crates]]> Against Packaging Rust Crates

Recently the discussion has once again come up around the notion of packaging Rust crates as libraries in distributions. For example, taking a library like serde and packaging it to an RPM. While I use RPM as the examples here it applies equally to other formats.

Proponents of crate packaging want all Rust applications to use the “distributions” versions of a crate. This is to prevent “vendoring” or “bundling”. This is where an application (such as 389 Directory Server) ships all of it’s sources, as well as the sources of it’s Rust dependencies in a single archive. These sources may differ in version from the bundled sources of other applications.

“Packaging crates is not reinventing Cargo”

This is a common claim by advocates of crate packaging. However it is easily disproved:

If packaging is not reinventing cargo, I am free to use all of Cargo’s features without conflicts to distribution packaging.

The reality is that packaging crates is reinventing Cargo - but without all it’s features. Common limitations are that Cargo’s exact version/less than requirements can not be used safely, or Cargo’s ability to apply patches or uses sources from specific git revisions can not be used at all.

As a result, this hinders upstreams from using all the rich features within Cargo to comply with distribution packaging limitations, or it will cause the package to hit exceptions in policy and necesitate vendoring anyway.

“You can vendor only in these exceptional cases …”

As noted, since packaging is reinventing Cargo, if you use features of Cargo that are unsupported then you may be allowed to vendor depending on the distributions policy. However, this raises some interesting issues itself.

Assume I have been using distribution crates for a period of time - then the upstream adds an exact version or git revision requirement to a project or a dependency in my project. I now need to change my spec file and tooling to use vendoring and all of the benefits of distribution crates no longer exists (because you can not have any dependency in your tree that has an exact version rule).

If the upstream ‘un-does’ that change, then I need to roll back to distribution crates since the project would no longer be covered by the exemption.

This will create review delays and large amounts of administrative overhead. It means pointless effort to swap between vendored and distribution crates based on small upstream changes. This may cause packagers to avoid certain versions or updates so that they do not need to swap between distribution methods.

It’s very likely that these “exceptional” cases will be very common, meaning that vendoring will be occuring. This necesitates supporting vendored applications in distribution packages.

“You don’t need to package the universe”

Many proponents say that they have “already packaged most things”. For example in 389 Directory Server of our 60 dependencies, only 2 were missing in Fedora (2021-02). However this overlooks the fact that I do not want to package those 2 other crates just to move forward. I want to support 389 Directory Server the application not all of it’s dependencies in a distribution.

This is also before we come to larger rust projects, such as Kanidm that has nearly 400 dependencies. The likelihood that many of them are missing is high.

So you will need to package the universe. Maybe not all of it. But still a lot of it. It’s already hard enough to contribute packages to a distribution. It becomes even harder when I need to submit 3, 10, or 100 more packages. It could be months before enough approvals were in place. It’s a staggering amount of administration and work, which will discourage many contributors.

People have already contacted me to say that if they had to package crates to distribution packages to contribute, they would give up and walk away. We’ve already lost future contributors.

Further to this Ruby, Python and many other languages today all recommend language native tools such as rvm or virtualenv to avoid using distribution packaged libraries.

Packages in distributions should exist as a vehicle to ship bundled applications that are created from their language native tools.

“We will update your dependencies for you”

A supposed benefit is that versions of crates in distributions will be updated in the background according to semver rules.

If we had an exact version requirement (that was satisfiable), a silent background update will cause this to no longer work - and will break the application from building. This would necesitate one of:

  • A change to the Cargo.toml to remove the equality requirement - a requirement that may exist for good reason.
  • It will force the application to temporarily swap to vendoring instead.
  • The application will remain broken and unable to be updated until upstream resolves the need for the equality requirement.

Background updates also ignore the state of your Cargo.lock file by removing it. A Cargo.lock file is recommended to be checked in with binary applications in Rust, as evidence that shows “here is an exact set of dependencies that upstream has tested and verified as building and working”.

To remove and ignore this file, means to remove the guarantees of quality from an upstream.

It is unlikely that packagers will run the entire test suite of an application to regain this confidence. They will “apply the patch and pray” method - as they already do with other languages.

We can already see how background updates can have significant negative consequences on application stability. FreeIPA has hundreds of dependencies, and it’s common that if any of them changes in small ways, it can cause FreeIPA to fall over. This is not the fault of FreeIPA - it’s the fault of relying on so many small moving parts that can change underneath your feet without warning. FreeIPA would strongly benefit from vendoring to improve it’s stability and quality.

Inversely, it can cause hesitation to updating libraries - since there is now a risk of breaking other applications that depend on them. We do not want people to be afraid of updates.

“We can respond to security issues”

On the surface this is a strong argument, but in reality it does not hold up. The security issues that face Rust are significantly different to that which affect C. In C it may be viable to patch and update a dynamic library to fix an issue. It saves time because you only need to update and change one library to fix everything.

Security issues are much rarer in Rust. When they occur, you will have to update and re-build all applications depending on the affected library.

Since this rebuilding work has to occur, where the security fix is applied is irrelevant. This frees us to apply the fixes in a different way to how we approach C.

It is better to apply the fixes in a consistent and universal manner. There will be applications that are vendored due to vendoring exceptions, there is now duplicated work and different processes to respond to both distribution crates, and vendored applications.

Instead all applications could be vendored, and tooling exists that would examine the Cargo.toml to check for insecure versions (RustSec/cargo-audit does this for example). The Cargo.toml’s can be patched, and applications tested and re-vendored. Even better is these changes could easily then be forwarded to upstreams, allowing every distribution and platform to benefit from the work.

In the cases that the upstream can not fix the issue, then Cargo’s native patching tooling can be used to supply fixes directly into vendored sources for rare situations requiring it.

“Patching 20 vulnerable crates doesn’t scale, we need to patch in one place!”

A common response to the previous section is that the above process won’t scale as we need to find and patch 20 locations compared to just one. It will take “more human effort”.

Today, when a security fix comes out, every distribution’s security teams will have to be made aware of this. That means - OpenSUSE, Fedora, Debian, Ubuntu, Gentoo, Arch, and many more groups all have to become aware and respond. Then each of these projects security teams will work with their maintainers to build and update these libraries. In the case of SUSE and Red Hat this means that multiple developers may be involved, quality engineering will be engaged to test these changes. Consumers of that library will re-test their applications in some cases to ensure there are no faults of the components they rely upon. This is all before we approach the fact that each of these distributions have many supported and released versions they likely need to maintain so this process may be repeated for patching and testing multiple versions in parallel.

In this process there are a few things to note:

  • There is a huge amount of human effort today to keep on top of security issues in our distributions.
  • Distributions tend to be isolated and can’t share the work to resolve these - the changes to the rpm specs in SUSE won’t help Debian for example.
  • Human error occurs in all of these layers causing security issues to go un-fixed or breaking a released application.

To suggest that rust and vendoring somehow makes this harder or more time consuming is discounting the huge amount of time, skill, and effort already put in by people to keep our C based distributions functioning today.

Vendored Rust won’t make this process easier or harder - it just changes the nature of the effort we have to apply as maintainers and distributions. It shifts our focus from “how do we ensure this library is secure” to “how do we ensure this application made from many libraries is secure”. It allows further collaboration with upstreams to be involved in the security update process, which ends up benefiting all distributions.

“It doesn’t duplicate effort”

It does. By the very nature of both distribution libraries and vendored applications needing to exist in a distribution, there will become duplicated but seperate processes and policies to manage these, inspect, and update these. This will create a need for tooling and supporting both methods, which consumes time for many people.

People have already done the work to package and release libraries to Tools already exist to provide our dependencies and include them in our applications. Why do we need to duplicate these features and behaviours in distribution packages when Cargo already does this correctly, and in a way that is universal and supported.

Don’t support distribution crates

I can’t be any clearer than that. They consume excessive amounts of contributor time, for little to no benefit, it detracts from simpler language-native solutions for managing dependencies, distracts from better language integration tooling being developed, it can introduce application instability and bugs, and it creates high barriers to entry for new contributors to distributions.

It doesn’t have to be like this.

We need to stop thinking that Rust is like C. We have to accept that language native tools are the interface people will want to use to manage their libraries and distribute whole applications. We must use our time more effectively as distributions.

If we focus on supporting vendored Rust applications, and developing our infrastructure and tooling to support this, we will attract new contributors by lowering barriers to entry, but we will also have a stronger ability to contribute back to upstreams, and we will simplify our building and packaging processes.

Today, tools like docker, podman, flatpak, snapd and others have proven how bundling/vendoring, and a focus an applications can advance the state of our ecosystems. We need to adopt the same ideas into distributions. Our package managers should become a method to ship applications - not libraries.

We need to focus our energy to supporting applications as self contained units - not supporting the libraries that make them up.


  • Released: 2021-02-16
  • EDIT: 2021-02-22 - improve clarity on some points, thanks to ftweedal.
  • EDIT: 2021-02-23 - due to a lot of comments regarding security updates, added an extra section to address how this scales.
Tue, 16 Feb 2021 00:00:00 +1000 <![CDATA[Getting Started Packaging A Rust CLI Tool in SUSE OBS]]> ]]> Mon, 15 Feb 2021 00:00:00 +1000 <![CDATA[Webauthn UserVerificationPolicy Curiosities]]> Webauthn UserVerificationPolicy Curiosities

Recently I received a pair of interesting bugs in Webauthn RS where certain types of authenticators would not work in Firefox, but did work in Chromium. This confused me, and I couldn’t reproduce the behaviour. So like any obsessed person I ordered myself one of the affected devices and waited for Australia Post to lose it, find it, lose it again, and then finally deliver the device 2 months later.

In the meantime I swapped browsers from Firefox to Edge and started to notice some odd behaviour when logging into my corporate account - my yubikey began to ask me for my pin on every authentication, even though the key was registered to the corp servers without a pin. Yet the key kept working on Edge with a pin - and confusingly without a pin on Firefox.

Some background on Webauthn

Before we dive into the issue, we need to understand some details about Webauthn. Webauthn is a standard that allows a client (commonly a web browser) to cryptographically authenticate to a server (commonly a web site). Webauthn defines how different types of hardware cryptographic authenticators may communicate between the client and the server.

An example of some types of authenticator devices are U2F tokens (yubikeys), TouchID (Apple Mac, iPhone, iPad), Trusted Platform Modules (Windows Hello) and many more. Webauthn has to account for differences in these hardware classes and how they communicate, but in the end each device performs a set of asymmetric cryptographic (public/private key) operations.

Webauthn defines the structures of how a client and server communicate to both register new authenticators and subsequently authenticate with those authenticators.

For the first step of registration, the server provides a registration challenge to the client. The structure of this (which is important for later) looks like:

dictionary PublicKeyCredentialCreationOptions {
    required PublicKeyCredentialRpEntity         rp;
    required PublicKeyCredentialUserEntity       user;

    required BufferSource                             challenge;
    required sequence<PublicKeyCredentialParameters>  pubKeyCredParams;

    unsigned long                                timeout;
    sequence<PublicKeyCredentialDescriptor>      excludeCredentials = [];
    AuthenticatorSelectionCriteria               authenticatorSelection = {
        AuthenticatorAttachment      authenticatorAttachment;
        boolean                      requireResidentKey = false;
        UserVerificationRequirement  userVerification = "preferred";
    AttestationConveyancePreference              attestation = "none";
    AuthenticationExtensionsClientInputs         extensions;

The client then takes this structure, and creates a number of hashes from it which the authenticator then signs. This signed data and options are returned to the server as a PublicKeyCredential containing an Authenticator Attestation Response.

Next is authentication. The server sends a challenge to the client which has the structure:

dictionary PublicKeyCredentialRequestOptions {
    required BufferSource                challenge;
    unsigned long                        timeout;
    USVString                            rpId;
    sequence<PublicKeyCredentialDescriptor> allowCredentials = [];
    UserVerificationRequirement          userVerification = "preferred";
    AuthenticationExtensionsClientInputs extensions;

Again, the client takes this structure, takes a number of hashes and the authenticator signs this to prove it is the holder of the private key. The signed response is sent to the server as a PublicKeyCredential containing an Authenticator Assertion Response.

Key to this discussion is the following field:

UserVerificationRequirement          userVerification = "preferred";

This is present in both PublicKeyCredentialRequestOptions and PublicKeyCredentialCreationOptions. This informs what level of interaction assurance should be provided during the signing process. These are discussed in NIST SP800-64b (5.1.7, 5.1.9) (which is just an excellent document anyway, so read it).

One aspect of these authenticators is that they must provide tamper proof evidence that a person is physically present and interacting with the device for the signature to proceed. This is important as it means that if someone is able to gain remote code execution on your system, they are unable to use your authenticator devices (even if it’s part of the device, like touch id) as they are not physically present at the machine.

Some authenticators are able to go beyond to strengthen this assurance, by verifying the identity of the person interacting with the authenticator. This means that the interaction also requires say a PIN (something you know), or a biometric (something you are). This allows the authenticator to assert not just that someone is present but that it is a specific person who is present.

All authenticators are capable of asserting user presence but only some are capable of asserting user verification. This creates two classes of authenticators as defined by NIST SP800-64b.

Single-Factor Cryptographic Devices (5.1.7) which only assert presence (the device becomes something you have) and Multi-Factor Cryptographic Devices (5.1.9) which assert the identity of the holder (something you have + something you know/are).

Webauthn is able to request the use of a Single-Factor Device or Multi-Factor Device through it’s UserVerificationRequirement option. The levels are:

  • Discouraged - Only use Single-Factor Devices
  • Required - Only use Multi-Factor Devices
  • Preferred - Request Multi-Factor if possible, but allow Single-Factor devices.

Back to the mystery …

When I initially saw these reports - of devices that did not work in Firefox but did in Chromium, and of devices asking for PINs on some browsers but not others - I was really confused. The breakthrough came as I was developing Webauthn Authenticator RS. This is the client half of Webauthn, so that I could have the Kanidm CLI tools use Webauthn for multi-factor authentication (MFA). In the process, I have been using the authenticator crate made by Mozilla and used by Firefox.

The authenticator crate is what communicates to authenticators by NFC, Bluetooth, or USB. Due to the different types of devices, there are multiple different protocols involved. For U2F devices, the protocol is CTAP over USB. There are two versions of the CTAP protocol - CTAP1, and CTAP2.

In the authenticator crate, only CTAP1 is supported. CTAP1 devices are unable to accept a PIN, so user verification must be performed internally to the device (such as a fingerprint reader built into the U2F device).

Chromium, however, is able to use CTAP2 - CTAP2 does allow a PIN to be provided from the host machine to the device as a user verification method.

Why would devices fail in Firefox?

Once I had learnt this about CTAP1/CTAP2, I realised that my example code in Webauthn RS was hardcoding Required as the user verification level. Since Firefox can only use CTAP1, it was unable to use PINs to U2F devices, so they would not respond to the challenge. But on Chromium with CTAP2 they are able to have PINs so Required can be satisfied and the devices work.

Okay but the corp account?

This one is subtle. The corp identity system uses user verification of ‘Preferred’. That meant that on Firefox, no PIN was requested since CTAP1 can’t provide them, but on Edge/Chromium a PIN can be provided as they use CTAP2.

What’s more curious is that the same authenticator device is flipping between Single Factor and Multi Factor, with the same Private/Public Key pair just based on what protocol is used! So even though the ‘Preferred’ request can be satisfied on Chromium/Edge, it’s not on Firefox. To further extend my confusion, the device was originally registered to the corp identity system in Firefox so it would have not had user verification available, but now that I use Edge it has gained this requirement during authentication.

That seems … wrong.

I agree. But Webauthn fully allows this. This is because user verification is a property of the request/response flow, not a property of the device.

This creates some interesting side effects that become an opportunity for user confusion. (I was confused about what the behaviour was and I write a webauthn server and client library - imagine how other people feel …).

Devices change behaviour

This means that during registration one policy can be requested (i.e. Required) but subsequently it may not be used (Preferred + Firefox + U2F, or Discouraged). Another example of a change in behaviour occurs when a device is used on Chromium with Preferred user verification is required, but when used on Firefox the device may not require verification. It also means that a site that implements Required can have devices that simply don’t work in other browsers.

Because this is changing behaviour it can confuse users. For examples:

  • Why do I need a PIN now but not before?
  • Why did I need a PIN before but not now?
  • Why does my authenticator work on this computer but not on another?

Preferred becomes Discouraged

This opens up a security risk where since Preferred “attempts” verification but allows it to not be present, a U2F device can be “downgraded” from Multi-Factor to Single-Factor by using it with CTAP1 instead of CTAP2. Since it’s also per request/response, a compromised client could also tamper with the communication to the authenticator removing the requested userverification parameter silently and the server would allow it.

This means that in reality, Preferred is policy and security wise equivalent to Discouraged, but with a more annoying UI/UX for users who have to conduct a verification that doesn’t actually help identify them.

Remember - if unspecified, ‘Preferred’ is the default user verification policy in Webauthn!

Lock Out / Abuse Vectors

There is also a potential abuse vector here. Many devices such as U2F tokens perform a “trust on first use” for their PIN setup. This means that the first time a user verification is requested you configure the pin at that point in time.

A potential abuse vector here is a token that is always used on Firefox, a malicious person could connect the device to Chromium, and setup the PIN without the knowledge of the owner. The owner could continue to use the device, and when Firefox eventually supports CTAP2, or they swap computer or browser, they would not know the PIN, and their token would effectively be unusable at that point. They would need to reset it, potentially causing them to be locked out from accounts, but more likely causing them to need to conduct a lot of password/credential resets.

Unable to implement Authenticator Policy

One of the greatest issues here though is that because user verification is part of the request/response flow and not per device attributes, authenticator policy, and mixed credentials are unable to exist in the current implementation of Webauthn.

Consider a user who has enrolled say their laptop’s U2F device + password, and their iPhone’s touchID to a server. Both of these are Multi Factor credentials. The U2F is a Single Factor Device and becomes Multi-Factor in combination with the password. The iPhone touchID is a Multi-Factor Device on it’s due to the biometric verification it is capable of.

We should be able to have a website request webauthn and based on the device used we can flow to the next required step. If the device was the iPhone, we would be authenticated as we have authenticated a Multi Factor credentials. If we saw the U2F device we would then move to request the password since we have only received a Single Factor. However Webauthn is unable to express this authentication flow.

If we requested Required, we would exclude the U2F device.

If we requested Discouraged, we would exclude the iPhone.

If we request Preferred, the U2F device could be used on a different browser with CTAP2, either:

  • bypassing the password, since the device is now a self contained Multi-Factor; or
  • the U2F device could prompt for the PIN needlessly and we progress to setting a password

The request to an iPhone could be tampered with, preventing the verification occurring and turning it into a single factor device (presence only).

Today, these mixed device scenarios can not exist in Webauthn. We are unable to create the policy around Single-Factor and Multi-Factor devices as defined by NIST because these require us to assert the verification requirements per credential, but Webauthn can not satisfy this.

We would need to pre-ask the user how they want to authenticate on that device and then only send a Webauthn challenge that can satisfy the authentication policy we have decided on for those credentials.

How to fix this

The solution here is to change PublicKeyCredentialDescriptor in the Webauthn standard to contain an optional UserVerificationRequirement field. This would allow a “global” default set by the server and then per-credential requirements to be defined. This would allow the user verification properties during registration to be associated to that credential, which can then be enforced by the server to guarantee the behaviour of a webauthn device. It would also allow the ‘Preferred’ option to have a valid and useful meaning during registration, where devices capable of verification can provide that or not, and then that verification boolean can be then transformed to a Discouraged or Required setting for that credential for future authentications.

The second change would be to disallow ‘Preferred’ as a valid value in the “global” default during authentications. The new “default” global value should be ‘Discouraged’ and then only credentials that registered with verification would indicate that in their PublicKeyCredentialDescriptor.

This would resolve the issues above by:

  • Making the use of an authenticator consistent after registration. For example, authenticators registered with CTAP1 would stay ‘Discouraged’ even when used with CTAP2
  • If PIN/Verification abuse occurred, the credentials registered on CTAP1 without verification would continue to be ‘presence only’ preventing the lockout
  • Allowing the server to proceed with the authentication flow based on which credential authenticated and provide logic about further factors if needed.
  • Allowing true Single Factor and Multi Factor device policies to be expressed in line with NIST SP800-63b, so users can have a mix of Single and Multi Factor devices associated with a single account.

I have since opened this issue with the webauthn specification about this, but early comments seem to be highly focused on the current expression of the standard rather than the issues around the user experience and ability for identity systems to accurately express credential policy.

In the meantime, I am going to make changes to Webauthn RS to help avoid some of these issues:

  • Preferred will be renamed to Preferred_Is_Equivalent_To_Discouraged (it will still emit ‘Preferred’ in the JSON, this only changes the Rust API enum)
  • Credential structures persisted by applications will contain the boolean of user-verification if it occurred during registration
  • During an authentication, if the set of credentials contains inconsistent user-verification booleans, an error will be raised
  • Authentication User Verification Policy is derived from the set of credentials having a consistent user-verification boolean

While not perfect, it will mean that it’s “hard to hold it wrong” with Webauthn RS.


Thanks to both @Charcol0x89 and @JuxhinDB for reviewing this post.

Sat, 21 Nov 2020 00:00:00 +1000 <![CDATA[Rust, SIMD and target-feature flags]]>

Rust, SIMD and target-feature flags

This year I’ve been working on concread and one of the ways that I have improved it is through the use of packed_simd for parallel key lookups in hashmaps. During testing I saw a ~10% speed up in Kanidm which heavily relies on concread, so great, pack it up, go home.


Or so I thought. Recently I was learning to use Ghidra with a friend, and as a thought exercise I wanted to see how Rust decompiled. I put the concread test suite into Ghidra and took a look. Looking at the version of concread with simd_support enabled, I saw this in the disassembly (truncated for readability).

  *                          FUNCTION                          *
  Simd<[packed_simd_2--masks--m64;8]> __stdcall eq(Simd<[p
100114510 55              PUSH       RBP
100114511 48 89 e5        MOV        RBP,RSP
100114514 48 83 e4 c0     AND        RSP,-0x40
100114518 48 81 ec        SUB        RSP,0x100
          00 01 00 00
10011451f 48 89 f8        MOV        RAX,__return_storage_ptr__
100114522 0f 28 06        MOVAPS     XMM0,xmmword ptr [self->__0.__0]
100114540 66 0f 76 c4     PCMPEQD    XMM0,XMM4
100114544 66 0f 70        PSHUFD     XMM4,XMM0,0xb1
          e0 b1
100114549 66 0f db c4     PAND       XMM0,XMM4
100114574 0f 29 9c        MOVAPS     xmmword ptr [RSP + local_90],XMM3
          24 b0 00
          00 00
1001145b4 48 89 7c        MOV        qword ptr [RSP + local_c8],__return_storage_pt
          24 78
1001145be 0f 29 44        MOVAPS     xmmword ptr [RSP + local_e0],XMM0
          24 60
1001145d2 48 8b 44        MOV        RAX,qword ptr [RSP + local_c8]
          24 78
1001145d7 0f 28 44        MOVAPS     XMM0,xmmword ptr [RSP + local_e0]
          24 60
1001145dc 0f 29 00        MOVAPS     xmmword ptr [RAX],XMM0
1001145ff 48 89 ec        MOV        RSP,RBP
100114602 5d              POP        RBP
100114603 c3              RET

Now, it’s been a long time since I’ve had to look at x86_64 asm, so I saw this and went “great, it’s not using a loop, those aren’t simple TEST/JNZ instructions, they have a lot of letters, awesome it’s using HW accel.

Time passes …

Coming back to this, I have been wondering how we could enable SIMD in concread at SUSE, since 389 Directory Server has just merged a change for 2.0.0 that uses concread as a cache. For this I needed to know what minimum CPU is supported at SUSE. After some chasing internally, knowing what we need I asked in the Rust Brisbane group about how you can define in packed_simd to only emit instructions that work on a minimum CPU level rather than my cpu or the builder cpu.

The response was “but that’s already how it works”.

I was helpfully directed to the packed_simd perf guide which discusses the use of target features and target cpu. At that point I realised that for this whole time I’ve only been using the default:

# rustc --print cfg | grep -i target_feature

The PCMPEQD is from sse2, but my cpu is much newer and should support AVX and AVX2. Retesting this, I can see my CPU has much more:

# rustc --print cfg -C target-cpu=native | grep -i target_feature

All this time, I haven’t been using my native features!

For local builds now, I have .cargo/config set with:

rustflags = "-C target-cpu=native"

I recompiled concread and I now see in Ghidra:

00198960 55              PUSH       RBP
00198961 48 89 e5        MOV        RBP,RSP
00198964 48 83 e4 c0     AND        RSP,-0x40
00198968 48 81 ec        SUB        RSP,0x100
         00 01 00 00
0019896f 48 89 f8        MOV        RAX,__return_storage_ptr__
00198972 c5 fc 28 06     VMOVAPS    YMM0,ymmword ptr [self->__0.__0]
00198976 c5 fc 28        VMOVAPS    YMM1,ymmword ptr [RSI + self->__0.__4]
         4e 20
0019897b c5 fc 28 12     VMOVAPS    YMM2,ymmword ptr [other->__0.__0]
0019897f c5 fc 28        VMOVAPS    YMM3,ymmword ptr [RDX + other->__0.__4]
         5a 20
00198984 c4 e2 7d        VPCMPEQQ   YMM0,YMM0,YMM2
         29 c2
00198989 c4 e2 75        VPCMPEQQ   YMM1,YMM1,YMM3
         29 cb
0019898e c5 fc 29        VMOVAPS    ymmword ptr [RSP + local_a0[0]],YMM1
         8c 24 a0
         00 00 00
001989e7 48 89 ec        MOV        RSP,RBP
001989ea 5d              POP        RBP
001989eb c5 f8 77        VZEROUPPER
001989ee c3              RET

VPCMPEQQ is the AVX2 compare instruction (You can tell it’s AVX2 due to the register YMM, AVX uses XMM). Which means now I’m getting the SIMD comparisons I wanted!

These can be enabled with RUSTFLAGS=’-C target-feature=+avx2,+avx’ for selected builds, or in your .cargo/config. It may be a good idea for just local development to do target-cpu=native.

Fri, 20 Nov 2020 00:00:00 +1000 <![CDATA[Deploying sccache on SUSE]]> Deploying sccache on SUSE

sccache is a ccache/icecc-like tool from Mozilla, which in addition to working with C and C++, is also able to help with Rust builds.

Adding the Repo

A submission to Factory (tumbleweed) has been made, so check if you can install from zypper:

zypper install sccache

If not, sccache is still part of devel:tools:building so you will need to add the repo to use sccache.

zypper ar -f obs://devel:tools:building devel:tools:building
zypper install sccache

It’s also important you do not have ccache installed. ccache intercepts the gcc command so you end up “double caching” potentially.

zypper rm ccache

Single Host

To use sccache on your host, you need to set the following environment variables:

export RUSTC_WRAPPER=sccache
export CC="sccache /usr/bin/gcc"
export CXX="sccache /usr/bin/g++"
# Optional: This can improve rust caching
# export CARGO_INCREMENTAL=false

This will allow sccache to wrap your compiler commands. You can show your current sccache status with:

sccache -s

There is more information about using cloud/remote storage for the cache on the sccache project site

Distributed Compiliation

sccache is also capable of distributed compilation, where a number of builder servers can compile items and return the artificats to your machine. This can save you time by allowing compilation over a cluster, using a faster remote builder, or just to keep your laptop cool.

Three components are needed to make this work. A scheduler that coordinates the activities, one or more builders that provide their CPU, and a client that submits compilation jobs.

The sccache package contains the required elements for all three parts.

Note that the client does not need to be the same version of SUSE or even the same distro as the scheduler or builder. This is because the client is able to bundle and submit it’s toolchains to the workers on the fly. Neat! sccache is capacble of also compiling for macos and windows, but in these cases the toolchains can-not be submitted on the fly and requires extra work to configure.


The scheduler is configured with /etc/sccache/scheduler.conf. You need to define the listening ip, client auth, and server (builder) auth methods. The example configuration is well commented to help with this:

# The socket address the scheduler will listen on. It's strongly recommended
# to listen on localhost and put a HTTPS server in front of it.
public_addr = ""
# public_addr = "[::1]:10600"

# This is how a client will authenticate to the scheduler.
# # sccache-dist auth generate-shared-token --help
type = "token"
token = "token here"
# type = "jwt_hs256"
# secret_key = ""

# sccache-dist auth --help
# To generate the secret_key:
# # sccache-dist auth generate-jwt-hs256-key
# To generate a key for a builder, use the command:
# # sccache-dist auth generate-jwt-hs256-server-token --config /etc/sccache/scheduler.conf --secret-key "..." --server "builderip:builderport"
type = "jwt_hs256"
secret_key = "my secret key"

You can start the scheduler with:

systemctl start sccache-dist-scheduler.service

If you have issues you can increase logging verbosity with:

# systemctl edit sccache-dist-scheduler.service


Similar to the scheduler, the builder is configured with /etc/sccache/builder.conf. Most of the defaults should be left “as is” but you will need to add the token generated from the comments in scheduler.conf - server_auth.

# This is where client toolchains will be stored.
# You should not need to change this as it is configured to work with systemd.
cache_dir = "/var/cache/sccache-builder/toolchains"
# The maximum size of the toolchain cache, in bytes.
# If unspecified the default is 10GB.
# toolchain_cache_size = 10737418240
# A public IP address and port that clients will use to connect to this builder.
public_addr = ""
# public_addr = "[::1]:10501"

# The URL used to connect to the scheduler (should use https, given an ideal
# setup of a HTTPS server in front of the scheduler)
scheduler_url = ""

type = "overlay"
# The directory under which a sandboxed filesystem will be created for builds.
# You should not need to change this as it is configured to work with systemd.
build_dir = "/var/cache/sccache-builder/tmp"
# The path to the bubblewrap version 0.3.0+ `bwrap` binary.
# You should not need to change this as it is configured for a default SUSE install.
bwrap_path = "/usr/bin/bwrap"

type = "jwt_token"
# This will be generated by the `generate-jwt-hs256-server-token` command or
# provided by an administrator of the sccache cluster. See /etc/sccache/scheduler.conf
token = "token goes here"

Again, you can start the builder with:

systemctl start sccache-dist-builder.service

If you have issues you can increase logging verbosity with:

# systemctl edit sccache-dist-builder.service

You can configure many hosts as builders, and compilation jobs will be distributed amongst them.


The client is the part that submits compilation work. You need to configure your machine the same as single host with regard to the environment variables.

Additionally you need to configure the file ~/.config/sccache/config. An example of this can be found in /etc/sccache/client.example.

# The URL used to connect to the scheduler (should use https, given an ideal
# setup of a HTTPS server in front of the scheduler)
scheduler_url = "http://x.x.x.x:10600"
# Used for mapping local toolchains to remote cross-compile toolchains. Empty in
# this example where the client and build server are both Linux.
toolchains = []
# Size of the local toolchain cache, in bytes (5GB here, 10GB if unspecified).
# toolchain_cache_size = 5368709120

cache_dir = "/tmp/toolchains"

type = "token"
# This should match the `client_auth` section of the scheduler config.
token = ""

You can check the status with:

sccache --stop-server
sccache --dist-status

If you have issues, you can increase the logging with:

sccache --stop-server
SCCACHE_NO_DAEMON=1 RUST_LOG=sccache=trace sccache --dist-status

Then begin a compilation job and you will get the extra logging. To undo this, run:

sccache --stop-server
sccache --dist-status

In addition, sccache even in distributed mode can still use cloud or remote storage for items, using it’s cache first, and the distributed complitation second. Anything that can’t be remotely complied will be run locally.


If you compile something from your client, you should see messages like this appear in journald in the builder/scheduler machine:

INFO 2020-11-19T22:23:46Z: sccache_dist: Job 140 created and will be assigned to server ServerId(V4(x.x.x.x:10501))
INFO 2020-11-19T22:23:46Z: sccache_dist: Job 140 successfully assigned and saved with state Ready
INFO 2020-11-19T22:23:46Z: sccache_dist: Job 140 updated state to Started
INFO 2020-11-19T22:23:46Z: sccache_dist: Job 140 updated state to Complete
Thu, 19 Nov 2020 00:00:00 +1000 <![CDATA[How a Search Query is Processed in Kanidm]]> ]]> Tue, 01 Sep 2020 00:00:00 +1000 <![CDATA[Using SUSE Leap Enterprise with Docker]]> Using SUSE Leap Enterprise with Docker

It’s a little bit annoying to connect up all the parts for this. If you have a SLE15 system then credentials for SCC are automatically passed into containers via secrets.

But if you are on a non-SLE base, like myself with MacOS or OpenSUSE you’ll need to provide these to the container in another way. The documentation is a bit tricky to search and connect up what you need but in summary:

  • Get /etc/SUSEConnect and /etc/zypp/credentials.d/SCCcredentials from an SLE install that has been registered. The SLE version does not matter.
  • Mount them into the image:
docker ... -v /scc/SUSEConnect:/etc/SUSEConnect \
    -v /scc/SCCcredentials:/etc/zypp/credentials.d/SCCcredentials \

Now you can use the images from the SUSE registry. For example docker pull and have working zypper within them.

If you want to add extra modules to your container (you can list what’s available with container-suseconnect from an existing SLE container of the same version), you can do this by adding environment variables at startup. For example, to add dev tools like gdb:

docker ... -e ADDITIONAL_MODULES=sle-module-development-tools \
    -v /scc/SUSEConnect:/etc/SUSEConnect \
    -v /scc/SCCcredentials:/etc/zypp/credentials.d/SCCcredentials \

This also works during builds to add extra modules.

HINT: SUSEConnect and SCCcredentials and not version dependent so will work in any image version.

Wed, 26 Aug 2020 00:00:00 +1000 <![CDATA[Windows Hello in Webauthn-rs]]> Windows Hello in Webauthn-rs

Recently I’ve been working again on webauthn-rs, as a member of the community wants to start using it in production for a service. So far the development of the library has been limited to the test devices that I own, but now this pushes me toward implementing true fido compliance.

A really major part of this though was that a lot of their consumers use windows, which means support windows hello.

A background on webauthn

Webauthn itself is not a specification for the cryptographic operations required for authentication using an authenticator device, but a specification that wraps other techniques to allow a variety of authenticators to be used exposing their “native” features.

The authentication side of webauthn is reasonably simple in this way. The server stores a public key credential associated to a user. During authentication the server provides a challenge which the authenticator signs using it’s private key. The server can then verify using it’s copy of the challenge, and the public key that the authentication must have come from that credentials. Of course like anything there is a little bit of magic in here around how the authenticators store credentials that allows other properties to be asserted, but that’s beyond the scope of this post.

The majority of the webauthn specification is around the process of registering credentials and requesting specific properties to exist in the credentials. Some of these properties are optional hints (resident keys, authenticator attachment) and some of these properties are enforced (user verification so that the credential is a true MFA). Beyond these there is also a process for the authenticator to provide information about it’s source and trust. This process is attestation and has multiple different formats and details associated.

It’s interesting to note that for most deployments of webauthn, attestation is not required by the attestation conveyance preference, and generally provides little value to these deployments. For many sites you only need to know that a webauthn authenticator is in use. However attestation allows environments with strict security requirements to verify and attest the legitimacy of, and make and model of authenticators in use. (An interesting part of webauthn is how much of it seems to be Google and Microsoft internal requirements leaking into a specification, just saying).

This leads to what is effectively, most of the code in webauthn-rs -

Windows Hello

Windows Hello is Microsoft’s equivalent to TouchID on iOS. Using a Trusted Platform Module (TPM) as a tamper-resistant secure element, it allows devices such as a Windows Surface to perform cryptographic operations. As Microsoft is attempting to move to a passwordless future (which honestly, I’m on board for and want to support in Kanidm), this means they want to support Webauthn on as many of their devices as possible. Microsoft even defines in their hardware requirements for Windows 10 Home, Pro, Education and Enterprise that as of July 28, 2016, all new device models, lines or series … a component which implements the TPM 2.0 must be present and enabled by default from this effective date.. This is pretty major as this means that slowly MS have been ensuring that all consumer and enterprise devices are steadily moving to a point where passwordless is a viable future. Microsoft state that they use TPMv2 for many reasons, but a defining one is: The TPM 1.2 spec only allows for the use of RSA and the SHA-1 hashing algorithm which is now considered broken.

Of course, if you have noticed this means that TPM’s are involved. Webauthn supports a TPM attestation path, and that means I have to implement it.

Once more into the abyss

Reading the Webauthn spec for TPM attestation it pointed me to the TPMv2.0 specification part1, part2 and part3. I will spare you from this as there is a sum total of 861 pages between these documents, and the Webauthn spec while it only references a few parts, manages to then create a set of expanding references within these documents. To make it even more enjoyable, text search is mostly broken in these documents, meaning that trying to determine the field contents and types involves a lot of manual-eyeball work.

TPM’s structures are packed C structs which means that they can be very tricky to parse. They use u16 identifiers to switch on unions, and other fun tricks that we love to see from C programs. These u16’s often have some defined constants which are valid choices, such as TPM_ALG_ID, which allows switching on which cryptographic algorithms are in use. Some stand out parts of this section were as follows.

Unabashed optimism:

TPM_ALG_ERROR 0x0000 // Should not occur

Broken Crypto

TPM_ALG_SHA1 0x0004 // The SHA1 Algorithm

Being the boomer equivalent of JWT

TPM_ALG_NULL 0x0010 // The NULL Algorithm

And supporting the latest in modern cipher suites

TPM_ALG_XOR 0x000A // TCG TPM 2.0 library specification - the XOR encryption algorithm.

ThE XOR eNcRyPtIoN aLgoRitHm.

Some of the structures are even quite fun to implement, such as TPMT_SIGNATURE, where a matrix of how to switch on it is present where the first two bytes when interpreted as a u16, define a TPM_ALG_ID where, if it the two bytes are not in a set of the TPM_ALG_ID then the whole blob including leading two bytes is actually just a blob of hash. It would certainly be unfortunate if in the interest of saving two bytes that my hash accidentally emited data where the first two bytes were accidentally a TPM_ALG_ID causing a parser to overflow.

I think the cherry on all of this though, is that despite Microsoft requiring TPMv2.0 to move away from RSA and SHA-1, that when I checked the attestation signatures for a Windows Hello device I had to implement the following:

COSEContentType::INSECURE_RS1 => {
    hash::hash(hash::MessageDigest::sha1(), input)
        .map(|dbytes| Vec::from(dbytes.as_ref()))
        .map_err(|e| WebauthnError::OpenSSLError(e))


Saying this, I’m happy that Windows Hello is now in Webauthn-rs. The actual Webauthn authentication flows DO use secure algorithms (RSA2048 + SHA256 and better), it is only in the attestation path that some components are signed by SHA1. So please use webauthn-rs, and do use Windows Hello with it!

Mon, 24 Aug 2020 00:00:00 +1000 <![CDATA[User gesture is not detected - using iOS TouchID with webauthn-rs]]> User gesture is not detected - using iOS TouchID with webauthn-rs

I was recently contacted by a future user of webauthn-rs who indicated that the library may not currently support Windows Hello as an authenticator. This is due to the nature of the device being a platform attached authenticator and that webauthn-rs at the time did not support attachment preferences.

As I have an ipad, and it’s not a primary computing device I decided to upgrade to iPadOS 14 beta to try out webauthn via touch (and handwriting support).

The Issue

After watching Jiewen’s WWDC presentation about using TouchID with webauthn, I had a better idea about some of the server side requirements to work with this.

Once I started to test though, I would recieve the following error in the safari debug console.

User gesture is not detected. To use the platform authenticator,
call 'navigator.credentials.create' within user activated events.

I was quite confused by this error - a user activated event seems to be a bit of an unknown term, and asking other people they also didn’t quite know what it meant. My demo site was using a button input with onclick event handlers to call javascript similar to the following:

function register() {
fetch(REG_CHALLENGE_URL + username, {method: "POST"})
   .then(res => {
      ... // error handling
   .then(res => res.json())
   .then(challenge => {
     challenge.publicKey.challenge = fromBase64(challenge.publicKey.challenge); = fromBase64(;
     return navigator.credentials.create(challenge)
       .then(newCredential => {
         console.log("PublicKeyCredential Created");
         return fetch(REGISTER_URL + username, {
           method: "POST",
           body: JSON.stringify(cc),
           headers: {
             "Content-Type": "application/json",

This works happily in Firefox and Chrome, and for iPadOS it event works with my yubikey 5ci.

I investigated further to determine if the issue was in the way I was presenting the registration to the navigator.credentials.create function. Comparing to (which does work with TouchID on iPadOS 14 beta), I noticed some subtle differences but nothing that should cause an issue like this.

After much pacing, thinking and asking for help I eventually gave in and went to the source of webkit

The Solution

Reading through the webkit source I noted that the check within the code was looking for association of how the event was initiated. This comes from a context that is available within the browser. This got me to think about the fact that the fetch api is async, and I realised at this point that was using the jQuery.ajax apis. I altered my demo to use the same, and it began to work with TouchID. That meant that the user activation was being lost over the async boundary to the fetch API. (note: it’s quite reasonable to expect user interaction to use navigator.credentials to prevent tricking or manipulating users into activating their webauthn devices).

I emailed Jiewen, who responded overnight and informed me that this is an issue, and it’s being tracked in the webkit bugzilla . He assures me that it will be resolved in a future release. Many thanks to him for helping me with this issue!

At this point I now know that TouchID will work with webauthn-rs, and I can submit some updates to the library to help support this.

Notes on Webauthn with TouchID

It’s worth pointing out a few notes from the WWDC talk, and the differences I have observed with webauthn on real devices.

In the presentation it is recommended that in your Credential Creation Options, that you (must?) define the options listed to work with TouchID

authenticatorSelection: { authenticatorAttachment: "platform" },
attestation: "direct"

It’s worth pointing out that authenticatorAttachment is only a hint to the client to which credentials it should use. This allows your web page to streamline the UI flow (such as detection of platform key and then using that to toggle the authenticatorAttachment), but it’s not an enforced security policy. There is no part of the attestation response that indicates the attachement method. The only way to determine that the authenticator is a platform authenticator would be in attestation “direct” to validate the issuing CA or the device’s AAGUID match the expectations you hold for what authenticators can be used within your organisation or site.

Additionally, TouchID does work with no authenticatorAttachment hint (safari prompts if you want to use an external key or TouchID), and that attestation: “none” also works. This means that a minimised and default set of Credential Creation Options will allow you to work with the majority of webauthn devices.

Finally, the WWDC video glosses over the server side process. Be sure to follow the w3c standard for verifying attestations, or use a library that implementes this standard (such as webauthn-rs or duo-labs go webauthn). I’m sure that other libraries exist, but it’s critical that they follow the w3c process as webauthn is quite complex and fiddly to implement in a correct and secure manner.

Wed, 12 Aug 2020 00:00:00 +1000 <![CDATA[docker buildx for multiarch builds]]> docker buildx for multiarch builds

I have been previously building Kanidm with plain docker build, but recently a community member wanted to be able to run kanidm on arm64. That meant that I needed to go down the rabbit hole of how to make this work …

What not to do …

There is a previous method of using manifest files to allow multiarch uploads. It’s pretty messy but it works, so this is an option if you want to investigate but I didn’t want to pursue it.

Bulidx exists and I got it working on my linux machine with the steps from here but the build took more than 3 hours, so I don’t recommend it if you plan to do anything intense or frequently.

Buildx cluster

Docker has a cross-platform building toolkit called buildx which is currently tucked into the experimental features. It can be enabled on docker for mac in the settings (note: you only need experimental support on the coordinating machine aka your workstation).

Rather than follow the official docs this will branch out. The reason is that buildx in the official docs uses qemu-aarch64 translation which is very slow and energy hungry, taking a long time to produce builds. As mentioned already I was seeing in excess of 3 hours for aarch64 on my builder VM or my mac.

Instead, in this configuration I will use my mac as a coordinator, and an x86_64 VM and a rock64pro as builder nodes, so that the builds are performed on native architecture machines.

First we need to configure our nodes. In /etc/docker/daemon.json we need to expose our docker socket to our mac. I have done this with the following:

  "hosts": ["unix:///var/run/docker.sock", "tcp://"]

WARNING: This configuration is HIGHLY INSECURE. This exposes your docker socket to the network with no authentication, which is equivalent to un-authenticated root access. I have done this because my builder nodes are on an isolated and authenticated VLAN of my home network. You should either do similar or use TLS authentication.

NOTE: The ssh:// transport does not work for docker buildx. No idea why but it don’t.

Once this is done restart docker on the two builder nodes.

Now we can configure our coordinator machine. We need to check buildx is present:

docker buildx --help

We then want to create a new builder instance and join our nodes to it. We can use the DOCKER_HOST environment variable for this:

DOCKER_HOST=tcp://x.x.x.x:2376 docker buildx create --name cluster
DOCKER_HOST=tcp://x.x.x.x:2376 docker buildx create --name cluster --append

We can then startup and bootstrap the required components with:

docker buildx use cluster
docker buildx inspect --bootstrap

We should see output like:

Name:   cluster
Driver: docker-container

Name:      cluster0
Endpoint:  tcp://...
Status:    running
Platforms: linux/amd64, linux/386

Name:      cluster1
Endpoint:  tcp://...
Status:    running
Platforms: linux/arm64, linux/arm/v7, linux/arm/v6

If we are happy with this we can make this the default builder.

docker buildx use cluster --default

And you can now use it to build your images such as:

docker buildx build --push --platform linux/amd64,linux/arm64 -f Dockerfile -t <tag> .

Now I can build my multiarch images much quicker and efficently!

Thu, 06 Aug 2020 00:00:00 +1000 <![CDATA[Developer Perspective on Docker]]> Developer Perspective on Docker

A good mate of mine Ron Amosa put a question up on twitter about what do developers think Docker brings to the table. I’m really keen to see what he has to say (his knowledge of CI/CD and Kubernetes is amazing by the way!), but I thought I’d answer his question from my view as a software engineer.

Docker provides resource isolation and management to applications

Lets break that down.

What is a resource? What is an application?

It doesn’t matter what kind of application we write: A Rust command line tool, an embedded database in C, or a webserver or website with Javascript. Every language and that program requires resources to run. Let’s focus on a Python webserver for this thought process.

Our webserver (which is an application) requires a lot of things to be functional! It needs to access a network to open listening sockets, it needs access to a filesystem to read pages or write to a database (like sqlite). It needs CPU time to process requests, and memory to create a stack/heap to work through those requests. But as well our application also needs to be seperated and isolated from other programs too, so that they can not disclose our data - but so that faults in our application do not affect other services. It probably needs a seperate user and group, which is a key idea in unix process isolation and security. Maybe also there are things like SELinux or AppArmor that also provide extra enhancements.

But why stop here there are many more. We might need system controls (sysctls) that define networking stack behaviour like how TCP performs. We may need specific versions of python libraries for our application. Perhaps we also want to limit the system calls that our python application can perform to our OS.

I hope we can see that the resources we have, really is more than simply CPU and Memory here! Every application is really quite involved.

A short view back to the past …

In the olden days, as developers we had to be responsible for these isolations. For example on a system, we’d have to select a bind address so that we could be configured to only use a single network device for listening on. This not only meant that our applications had to support this behaviour, but that a person had to read our documentation, and find out how to configure that behaviour to isolate the networking resource.

And of course many others. To limit the amount of CPU or RAM that was available required you to configure ulimits for the user, and to select which user was going to run our application.

Many problems have been seen too with a language like python where libraries are not isolated and there are conflicts between which version different applications require. Is it the fault of python? The application developer? It’s hard to say …

What about system calls? With an interpretted language like python, you can’t just set the capabilities flags or other hardening options because they have to be set on the interpretter (python) which is used in many places. An example where the resource (python) is shared between many applications preventing us from creating isolated python runtimes.

Even things like SELinux and AppArmor required complex, hand created profiles, that were cryptic at best, or led to being disabled in the common case (It can’t be secure if it’s not usable! People will always take the easy option …).

And that’s even before we look at init scripts - bash scripts that had to be invoked in careful ways, and were all hand rolled, each adding different mistakes or issues. It was a time where to “package” and application and deploy it, required huge amounts of knowledge of a broad range of topics.

In many cases, I have seen another way this manifested. Rather than isolated applications (which was too hard), every application was installed on a dedicated virtual machine. The resource management then came as an artifact of every machine being seperate and managed by a hypervisor.


Along came systemd though, and it got us much further. It provided consistent application launch tools, and has done a lot of work to manage and provide resource management such as cgroups (cpu, mem), dynamic users, some types of filesystem isolation and some more. Systemd as an init system has done some really good stuff.

But still problems exist. Applications still require custom SELinux or AppArmor profiles, and systemd can’t help managing network interfaces (that still falls on the application).

It also still relies on you to put the files in place, or a distribution package to get the file content into the system.


Docker takes this even further. Docker manages and arbitrates every resource your application requires, even the filesystem and install process. For example, a very complex topic like CPU or memory limit’s on Linux, becomes quite simple in docker which exposes CPU and memory tunables. Docker allows sysctls per container. You assign and manage storage.

docker run -v db:/data/db -v config:/data/config --network private_net \
    --memory 1024M --shm-size 128M -P 80:8080 --user isolated \

From this command we can see what resources are involved. We know that we mount two storage locations. We can see that we confine the network to a private network, and that we want to limit memory to 1024M. We also can see we’ll be listening on port 80 which remaps to the container internally. We even know what user we’ll run as so we can assign permissions to the volumes. Our application is also defined as is it’s version.

Not only can we see what resources we are using there are a lot of other benefits. Docker can dynamically generate selinux/apparmor isolation profiles, so we get stronger process isolation between containers and host processes. We know that the filesystem of this container is isolated from others, so they can have and bundle the correct versions of dependencies. We know how to start, stop, and even monitor the application. It can even have health checks. Logs will (should?) go to stdout/err which docker will forward to a log collector we can define. In the future each application may even be in it’s own virtual memory space (IE seperate vm’s).

Docker provides a level of isolation to resources that is still hard to achieve in systemd, and not only that it makes very advanced or complex configurations easy to access and use. Accesibility of these features is vitally important to allow all people to create robust applications in their environments. But it also allows me as a developer to know what resources can exist in the container and how to interact with these in a way that will respect the wishes of the deploying administrator.

Docker Isn’t Security Isolation

It’s worth noting that while Docker can provide SELinux and AppArmor profiles, Docker is not an effective form of security isolation. It certainly makes the bar much much higher than before yes! And that’s great! And I hope that bar continues to rise. However today we do live in an age where there are many attacks still locally on linux kernels and the fix delay in these is still long. We also still see CPU sidechannels, and these will never be resolved while we rely on asynchronous CPU behaviour.

If you have high value data, it is always best to have seperated physical machines for these applications, and to always patch frequently, have a proper CI/CD pipeline, and centralised logging, and much much more. Ask your security team! I’m sure they’d love to help :)


For me personally, docker is about resource management and isolation. It helps me to define an interface that an admin can consume and interact with, making very advanced concepts easy to use. It gives me trust that applications will run in a way that is isolated and known all the way from development and testing through to production under high load. By making this accessible, it means that anyone - from a single docker container to a giant kubernetes cluster, can have really clear knowledge of how their applications are operating.

Mon, 13 Jul 2020 00:00:00 +1000 <![CDATA[virt-manager missing pci.ids usb.ids macos]]> virt-manager missing pci.ids usb.ids macos

I got the following error:

/usr/local/Cellar/libosinfo/1.8.0/share/libosinfo/pci.ids No such file or directory

This appears to be an issue in libosinfo from homebrew. Looking at the libosinfo source, there are some aux download files. You can fix this with:

mkdir -p /usr/local/Cellar/libosinfo/1.8.0/share/libosinfo/
cd /usr/local/Cellar/libosinfo/1.8.0/share/libosinfo/
wget -q -O pci.ids
wget -q -O usb.ids

All is happy again with virt-manager

Mon, 15 Jun 2020 00:00:00 +1000 <![CDATA[Resolving AirPlayXPCHelper Perr NULL kCanceledErr with Apple TV and MacOS]]> Resolving AirPlayXPCHelper Perr NULL kCanceledErr with Apple TV and MacOS

I decided to finally get an Apple TV so that I could use my iPad and MacBook Pro to airplay to my projector. So far I’ve been really impressed by it and how well it works with modern amplifiers and my iPad.

Sadly though, when I tried to use my MacBook pro to airplay to the Apple TV I recieved an “Unable to connect” error, with no further description.

Initial Research

The first step was to look in at the local system logs. The following item stood out:

error 09:24:41.459722+1000 AirPlayXPCHelper ### Error: CID 0xACF10006, Peer NULL, -6723/0xFFFFE5BD kCanceledErr

I only found a single result on a search for this, and they resolved the problem by disabling their MacOS firewall - attempting this myself did not fix the issue. There are also reports of apple service staff disabling the firewall to resolve airplay problems too.

Time to Dig Further …

Now it was time to look more. To debug an Apple TV you need to connect a USB-C cable to it’s service port on the rear of the device, while you connect this to a Mac on the other side. will then show you the streamed logs from the device.

While looking on the Apple TV I noticed the following log item:

[AirPlay] ### [0x8F37] Set up session 16845584210140482044 with [<ipv6 address>:3378]:52762 failed: 61/0x3D ECONNREFUSED {
"timingProtocol" : "NTP",
"osName" : "Mac OS X",
"isScreenMirroringSession" : true,
"osVersion" : "10.15.4",
"timingPort" : 64880,

I have trimmed this log, as most details don’t matter. What is important is that it looks like the Apple TV is attempting to back-connect to the MacBook Pro, which has a connection refused. From iOS it appears that the video/timing channel is initiated from the iOS device, so no back-connection is required, but for AirPlay to work from the MacBook Pro to the Apple TV, the Apple TV must be able to connect back on high ports with new UDP/TCP sessions for NTP to synchronise clocks.

My Network

My MacBook pro is on a seperate VLAN to my Apple TV for security reasons, mainly because I don’t want most devices to access management consoles of various software that I have installed. I have used the Avahi reflector on my USG to enable cross VLAN discovery. This would appear to be issue, is that my firewall is not allowing the NTP traffic back to my MacBook pro.

To resolve this I allowed some high ports from the Apple TV to connect back to the VLAN my MacBook Pro is on, and I allowed built-in software to recieve connections.

Once this was done, I was able to AirPlay across VLANs to my Apple TV!

Sun, 03 May 2020 00:00:00 +1000 <![CDATA[Building containers on OBS]]> Building containers on OBS

My friend showed me how to build containers in OBS, the opensuse build service. It makes it really quite nice, as the service can parse your dockerfile, and automatically trigger rebuilds when any package dependency in the chain requires a rebuild.

The simplest way is to have a seperate project for your containers to make the repository setup a little easier.

When you edit the project metadata, if the project doesn’t already exist, a new one is created. So we can start by filling out the template from the command:

osc meta prj -e home:firstyear:containers

This will give you a template: We need to add some repository lines:

<project name="home:firstyear:apps">
  <title>Containers Demo</title>
  <description>Containers Demo</description>
  <person userid="firstyear" role="bugowner"/>
  <person userid="firstyear" role="maintainer"/>
  <!-- this repository -->
  <repository name="containers">
    <path project="openSUSE:Templates:Images:Tumbleweed" repository="containers"/>

Remember, to set the publist to “enable” if you want the docker images you build to be pushed to the registry!

Now that that’s done, we can check out the project, and create a new container package within.

osc co home:firstyear:containers
cd home:firstyear:containers
osc mkpac mycontainer

Now in the mycontainer folder you can start to build a container. Add your dockerfile:

#!BuildTag: mycontainer
# docker pull
#                                   ^projectname        ^repos     ^ build tag
FROM opensuse/tumbleweed:latest

# only one zypper ar command per line. only repositories inside the OBS are allowed
RUN zypper ar "home:firstyear:apps"
RUN zypper mr -p 97 "home:firstyear:apps"
RUN zypper --gpg-auto-import-keys ref
RUN zypper install -y vim-data vim python3-ipython shadow python3-praw
# Then the rest of your container as per usual ...

Then to finish up, you can commit this:

osc add Dockerfile
osc ci
osc results
Mon, 20 Apr 2020 00:00:00 +1000 <![CDATA[389ds in containers]]> 389ds in containers

I’ve spent a number of years working in the background to get 389-ds working in containers. I think it’s very close to production ready (one issue outstanding!) and I’m now using it at home for my production LDAP needs.

So here’s a run down on using 389ds in a container!

Getting it Started

The team provides an image for pre-release testing which you can get with docker pull:

docker pull 389ds/dirsrv:latest
# OR, if you want to be pinned to the 1.4 release series.
docker pull 389ds/dirsrv:1.4

The image can be run in an ephemeral mode (data will be lost on stop of the container) so you can test it:

docker run 389ds/dirsrv:1.4

Making it Persistent

To make your data persistent, you’ll need to add a volume, and bind it to the container.

docker volume create 389ds

You can run 389ds where the container instance is removed each time the container stops, but the data persists (I promise this is safe!) with:

docker run --rm -v 389ds:/data -p 3636:3636 389ds/dirsrv:latest

Check your instance is working with an ldapsearch:

LDAPTLS_REQCERT=never ldapsearch -H ldaps:// -x -b '' -s base vendorVersion

NOTE: Setting the environment variable `LDAPTLS_REQCERT` to `never` disables CA verification of the LDAPS connection. Only use this in testing environments!

If you want to make the container instance permanent (uses docker start/stop/restart etc) then you’ll need to do a docker create with similar arguments:

docker create  -v 389ds:/data -p 3636:3636 389ds/dirsrv:latest
docker ps -a
CONTAINER ID        IMAGE                 ...  NAMES
89b342c2e058        389ds/dirsrv:latest   ...  adoring_bartik

Remember, even if you rm the container instance, the volume stores all the data so you can re-pull the image and recreate the container and continue.

Administering the Instance

The best way is to the use the local LDAPI socket - by default the cn=Directory Manager password is randomised so that it can’t be accessed remotely.

To use the local LDAPI socket, you’ll use docker exec into the running instance.

docker start <container name>
docker exec -i -t <container name> /usr/sbin/dsconf localhost <cmd>
docker exec -i -t <container name> /usr/sbin/dsconf localhost backend suffix list
No backends

In a container, the instance is always named “localhost”. So lets add a database backend now to our instance:

docker exec -i -t <cn> /usr/sbin/dsconf localhost backend create --suffix dc=example,dc=com --be-name userRoot
The database was sucessfully created

You can even go ahead and populate your backend now. To make it easier, specify your basedn into the volume’s /data/config/container.inf. Once that’s done we can setup sample data (including access controls), and create some users and groups.

docker exec -i -t <cn> /bin/sh -c "echo -e '\nbasedn = dc=example,dc=com' >> /data/config/container.inf"
docker exec -i -t <cn> /usr/sbin/dsidm localhost initialise
docker exec -i -t <cn> /usr/sbin/dsidm localhost user create --uid william --cn william \
    --displayName William --uidNumber 1000 --gidNumber 1000 --homeDirectory /home/william
docker exec -i -t <cn> /usr/sbin/dsidm localhost group create --cn test_group
docker exec -i -t <cn> /usr/sbin/dsidm localhost group add_member test_group uid=william,ou=people,dc=example,dc=com
docker exec -i -t <cn> /usr/sbin/dsidm localhost account reset_password uid=william,ou=people,dc=example,dc=com
LDAPTLS_REQCERT=never ldapwhoami -H ldaps:// -x -D uid=william,ou=people,dc=example,dc=com -W
    Enter LDAP Password:
    dn: uid=william,ou=people,dc=example,dc=com

There is much more you can do with these tools of course, but it’s very easy to get started and working with an ldap server like this.

Further Configuration

Because this runs in a container, the approach to some configuration is a bit different. Some settings can be configured through either the content of the volume, or through environment variables.

You can reset the directory manager password on startup but use the environment variable DS_DM_PASSWORD. Of course, please use a better password than “password”. pwgen is a good tool for this! This password persists across restarts, so you should make sure it’s good.

docker run --rm -e DS_DM_PASSWORD=password -v 389ds:/data -p 3636:3636 389ds/dirsrv:latest
LDAPTLS_REQCERT=never ldapwhoami -H ldaps:// -x -D 'cn=Directory Manager' -w password
    dn: cn=directory manager

You can also configure certificates through pem files.


All the certs in /data/tls/ca/ will be imported as CA’s and the server key and crt will be used for the TLS server.

If for some reason you need to reindex your db at startup, you can use:

docker run --rm -e DS_REINDEX=true -v 389ds:/data -p 3636:3636 389ds/dirsrv:latest

After the reindex is complete the instance will start like normal.


389ds in a container is one of the easiest and quickest ways to get a working LDAP environment today. Please test it and let us know what you think!

Sat, 28 Mar 2020 00:00:00 +1000 <![CDATA[APFS (why is df showing me funny numbers?!)]]> APFS (why is df showing me funny numbers?!)

Apple’s APFS has been the default for MacOS since High Sierra, where SSD (flash) automatically would convert from HFS+. This is a god send, especially with HFS+’s history of destroying any folder that has a large number of inodes within it.

However, APFS behaves differently to previous filesystem technology. Let’s see if we can explain why df reports multiple 932Gi disks like this:

> df -h
Filesystem                             Size   Used  Avail Capacity    iused      ifree %iused  Mounted on
/dev/disk1s5                          932Gi   10Gi  380Gi     3%     484322 9767493838    0%   /
/dev/disk1s1                          932Gi  530Gi  380Gi    59%    2072480 9765905680    0%   /System/Volumes/Data

And if we can explain why when you delete large files, you don’t get any space back from df either.

How it looked with HFS+

With HFS+ it was pretty simple. You had a disk (a block device), which had partitions (slices of the space in the block device) and those partitions were formatted with a filesystem that knew how to store data in them. An example:

> diskutil list
/dev/disk2 (external, physical):
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:      GUID_partition_scheme                         *1.0 TB     disk2
   1:                        EFI EFI                     209.7 MB   disk2s1
   2:                  Apple_HFS tmachine                999.9 GB   disk2s2

We can see that disk2 is 1.0TB in size, and it contains two partitions, the first is 209.7MB for EFI (disk2s1) and the second has data and formatted as HFS+ (disk2s2).

Of course, this has some drawbacks - partitions don’t like being moved, and filesystem resizing is a costly process of time and IO cycles. It’s quite inflexible. If you wanted another partition here for read only data, well, you’d have to change a lot. Properties can only be applied to a filesystem as a whole, and they can’t share space. If you had a 1TB drive partitioned to 500GB each, and were running low on space on one of them, well … good luck! You have to move data manually, or change where applications store data.


APFS doesn’t quite follow this model though. APFS is what’s called a volume based filesystem. That means there is an intermediate layer in here. The layout looks like this in diskutil

> diskutil list
/dev/disk0 (internal, physical):
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:      GUID_partition_scheme                        *1.0 TB     disk0
   1:                        EFI EFI                     314.6 MB   disk0s1
   2:                 Apple_APFS Container disk1         1.0 TB     disk0s2

So our disk0 looks like before - an EFI partition, and a very large APFS container. However the container itself is NOT the filesystem. The contain is a pool of storage that APFS volumes are created into. We can see the volumes too.

> diskutil list
/dev/disk1 (synthesized):
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:      APFS Container Scheme -                      +1.0 TB     disk1
                                 Physical Store disk0s2
   1:                APFS Volume Macintosh HD — Data     569.4 GB   disk1s1
   2:                APFS Volume Preboot                 81.8 MB    disk1s2
   3:                APFS Volume Recovery                526.6 MB   disk1s3
   4:                APFS Volume VM                      10.8 GB    disk1s4
   5:                APFS Volume Macintosh HD            11.0 GB    disk1s5

Notice how /dev/disk1 is “synthesized”? It’s not real - it’s there to “trick” legacy tools into thinking that the container is a “block” device and the volumes are “partitions”.

Benefits of Volumes

One of the immediate benefits is that unlike partitions, in a volume filesystem, all the space of the underlying container (also known as: pool, volume group) is available to all volumes at anytime. Because the volumes are a flexible concept, they can have non-contiguous geometry on the disk (unlike a partition). That’s why in your df output you can see:

> df -h
Filesystem                             Size   Used  Avail Capacity    iused      ifree %iused  Mounted on
/dev/disk1s5                          932Gi   10Gi  380Gi     3%     484322 9767493838    0%   /
/dev/disk1s1                          932Gi  530Gi  380Gi    59%    2072480 9765905680    0%   /System/Volumes/Data

Both disk1s5 (Macintosh HD) and disk1s1 (Macintosh HD — Data) are APFS volumes. The container has 932Gi total space, and 380Gi available in the container which either volume could allocate. But you can also see the exact space reservation of each volume too: disk1s5 only has 10Gi in use, and disk1s1 has 530Gi in use.

It would be very possible for disk1s1 to grow to fill all the space, and then to contract, and then have disk1s5 grow to take all the space and contract - this is because the space is flexibly allocated from the container. Neat!

Each volume also can have different properties applied. For example, /dev/disk1s5 (Macintosh HD) in MacOS catalina is read-only:

/dev/disk1s5 on / (apfs, local, read-only, journaled)
/dev/disk1s1 on /System/Volumes/Data (apfs, local, journaled, nobrowse)

This is to prevent system tampering, and strengthen integrity of the system. There are a number of tricks to achieve this such as overlaying multiple volumes together. /Applications for example is actually a folder consitituted from the content of /System/Applications and /System/Volumes/Data/Applications. Anytime you “drag” and application to /Applications, you are actually putting it into /System/Volumes/Data/Applications. A very similar property holds for /Users (/System/Volumes/Data/Users), and even /Volumes.

Copy-on-Write, Snapshots

APFS is also a copy-on-write filesystem. This means whenever you write data, it’s actually written to newly allocated disk regions, and the pointers are atomicly flipped to it. The full write occurs or it does not. This is part of the reason why APFS is so much better than HFS+ - in a crash your data is either in a previous state, or the new state - never a half written or corrupted state.

This is the reason why APFS is only used on SSD (flash) devices - COW is very random IO write intensive, and on a rotational disk this would cause the head to “seek” randomly which would make both writes and reads very slow. SSD of course isn’t affected by this, so having a highly fragmented file does not impose a penalty in the same way.

Copy-on-Write however opens up some interesting behaviours. If you COW a file, but never remove the old version, you have a snapshot. This means you can have point-in-time views to how a filesystem was. This is actually used now by time machine during backups to ensure the content of a backup is stable before being written to the external backup media. It also allow time machine to perform “backups” while you are out-and-about, by snapshotting as you work. Because snapshots are just “not removing old data” they are low overhead to maintain and take snapshots.

You can see snapshots on your system with:

> tmutil listlocalsnapshots /
Snapshots for volume group containing disk /:

You can even take your own snapshots if you want!

> time tmutil localsnapshot
Created local snapshot with date: 2020-03-28-091943
tmutil localsnapshot  0.01s user 0.01s system 4% cpu 0.439 total

See how fast that is! Remember also because this is based on copy-on-write, the snapshots only take as much data as the differences, or what you are changing as you work.

Space Reclaim

This leads to the final point of confusion - when people delete files to clear space, but df reports no change. For example:

> df -h
Filesystem                             Size   Used  Avail Capacity    iused      ifree %iused  Mounted on
/dev/disk1s1                          932Gi  530Gi  380Gi    59%    2072480 9765905680    0%   /System/Volumes/Data
> ls -alh Downloads/Windows_Server_2016_Datacenter_EVAL_en-us_14393_refresh.ISO
-rwx------@ 1 william  staff   6.5G 10 Oct  2018 Downloads/Windows_Server_2016_Datacenter_EVAL_en-us_14393_refresh.ISO
> rm Downloads/Windows_Server_2016_Datacenter_EVAL_en-us_14393_refresh.ISO
> df -h
Filesystem                             Size   Used  Avail Capacity    iused      ifree %iused  Mounted on
/dev/disk1s1                          932Gi  530Gi  380Gi    59%    2072479 9765905681    0%   /System/Volumes/Data

Now I promise, I really did delete the file - check the “iused” and “ifree” columns. But also note that the “Used” space didn’t change? Surely we should expect to see this value drop to 523Gi since I removed a 6.5G file.

Remember that APFS is a voluming filesystem, with copy-on-write. Due to snapshots, the space used in a volume is the sum of active data and snapshotted data. This means that when you are removing a file you are removing it from the volume at this point in time, but it may still exist in snapshots that exist in the volume! That’s why there is a reduction in the iused/ifree (an inode pointer was removed) but no change in the space (the file still exists in a snapshot).

During normal operation, provided there is sufficent freespace, you won’t actually notice this behaviour. But when you say … have not a lot of space left (maybe 10G), and you delete some files to import something (say a 40G import), you try the copy again … and it fails! Drat! But you wait a bit and suddenly it works? What in heck happened?

In the background, MacOS has registered “okay, the user demands at least 30G more space to complete this task. Let’s clean snapshots until we have that much space available”. The snapshots are pruned so when you come back later, suddenly you have the space.

Again, you can actually do this yourself. tmutil has a command “thinlocalsnapshots” for this. An example usage would be:

> tmutil thinlocalsnapshots /System/Volumes/Data [bytes required]
Thinned local snapshots:

In my case I have a lot of space available, so no snapshots are pruned. But you may find that multiple snapshots are removed in this process!


APFS is actually a really cool piece of filesystem technology, and I think has made MacOS one of the most viable platforms for reliable daily use. It embraces many great ideas, and despite it’s youth, has done really well. But those new ideas conflict with legacy, and have some behaviours that are not always clearly exposed on shown to you, the user. Understanding those behaviours means we can see why our computers are behaving in certain - sometimes unexpected - ways.

Sat, 28 Mar 2020 00:00:00 +1000 <![CDATA[USG fixing avahi]]> USG fixing avahi

Sadly on the USG pro 4 avahi will regularly spiral out of control taking up 100% cpu. To fix this, we set an hourly restart:

sudo -s
crontab -e

Then add:

15 * * * * /usr/sbin/service avahi-daemon restart
Sun, 15 Mar 2020 00:00:00 +1000 <![CDATA[Fedora 32 Wallpaper Submission - Story]]> Fedora 32 Wallpaper Submission - Story

Fedora opens submissions for wallpapers to be submitted for the next version of the release. I used fedora for a long time, so I decided to submit this photo, and write this post to talk about it:


This was takeing on 2019-11-19 in my home city of Adelaide, South Australia. I had traveled to see some friends over Christmas. We went to Mount Osmond to take some photos, and I took this as we walked up to the lookout.

The next day, this area was a high risk location for a possible bushfire - and many bushfires have since devastated many regions of Australia, affecting many people that I know.

I really find that the Australian landscape is so different to Europe or Asia - many tones of subtle reds, browns, and more. A dry and dusty look. The palette is such a contrast to the lush greens of Europe. Australia is a really beautiful country, in a very distinct and striking manner.

Anyway, I hope you like the photo :)

Sat, 14 Mar 2020 00:00:00 +1000 <![CDATA[Fixing a MacBook Pro 8,2 with dead AMD GPU]]> Fixing a MacBook Pro 8,2 with dead AMD GPU

I’ve owned a MacBook Pro 8,2 late 2011 edition, which I used from 2011 to about 2018. It was a great piece of hardware, and honestly I’m surprised it lasted so long given how many MacOS and Fedora installs it’s seen.

I upgraded to a MacBook Pro 15,1, and I gave the 8,2 to a friend who was in need of a new computer so she could do her work. It worked really well for her until today when she’s messaged me that the machine is having a problem.

The Problem

The machine appeared to be in a bootloop, where just before swapping from the EFI GPU to the main display server, it would go black and then lock up/reboot. Booting to single user mode (boot holding cmd + s) showed the machine’s disk was intact with a clean apfs. The system.log showed corruption at the time of the fault, which didn’t instill confidence in me.

Attempting a recovery boot (boot holding cmd + r), this also yielded the bootloop. So we have potentially eliminated the installed copy of MacOS as the source of the issue.

I’ve then used the apple hardware test (boot while holding d), and it has passed the machine as a clear bill of health.

I have seen one of these machines give up in the past - my friends mother had one from the same generation and that died in almost the same way - could it be the same?

The 8,2’s cursed gpu stack

The 8,2 15” mbp has dual gpu’s - it has the on cpu Intel 3000, and an AMD radeon 6750M. The two pass through an LVDS graphics multiplexer to the main panel. The external display port however is not so clear - the DDC lines are passed through the GMUX, but the datalines directly attach to the the display port.

The machine is also able to boot with EFI rendering to either card. By default this is the AMD radeon. Which ever card is used at boot is also the first card MacOS attempts to use, but it will try to swap to the radeon later on.

This generation had a large number of the radeons develop faults in their 3d rendering capability so it would render the EFI buffer correctly, but on the initiation of 3d rendering it would fail. Sounds like what we have here!

To fix this …

Okay, so this is fixable. First, we need to tell EFI to boot primarily from the intel card. Boot to single user mode and then run.

nvram fa4ce28d-b62f-4c99-9cc3-6815686e30f9:gpu-power-prefs=%01%00%00%00

Now we need to prevent loading of the AMD drivers so that during the boot MacOS doesn’t attempt to swap from Intel to the Radeon. We can do this by hiding the drivers. System integrity protection will stop you, so you need to do this as part of recovery. Boot with cmd + r, which now works thanks to the EFI changes, then open terminal

cd /Volumes/Macintosh HD
sudo mkdir amdkext
sudo mv System/Library/Extensions/AMDRadeonX3000.kext amdkext/

Then reboot. You’ll notice the fans go crazy because the Radeon card can’t be disabled without the driver. We can post-boot load the driver to stop the fans to fix this up.

To achieve this we make a helper script:

# cat /usr/local/libexec/
/sbin/kextload /amdkext/AMDRadeonX3000.kext

And a launchctl daemon

# cat /Library/LaunchDaemons/
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "">
<plist version="1.0">

Now if you reboot, you’ll have a working mac, and the fans will stop properly. I’ve tested this with suspend and resume too and it works! The old beast continues to live :)

Tue, 04 Feb 2020 00:00:00 +1000 <![CDATA[There are no root causes]]> There are no root causes

At Gold Coast LCA2020 I gave a lightning talk on swiss cheese. Well, maybe not really swiss cheese. But it was about the swiss cheese failure model which was proposed at the university of manchester.

Please note this will cover some of the same topics as the talk, but in more detail, and with less jokes.

An example problem

So we’ll discuss the current issues behind modern CPU isolation attacks IE spectre. Spectre is an attack that uses timing of a CPU’s speculative execution unit to retrieve information from another running process on the same physical system.

Modern computers rely on hardware features in their CPU to isolate programs from each other. This could be isolating your web-browser from your slack client, or your sibling’s login from yours.

This isolation however has been compromised by attacks like Spectre, and it looks unlikely that it can be resolved.

What is speculative execution?

In order to be “fast” modern CPU’s are far more complex than most of us have been taught. Often we believe that a CPU thread/core is executing “one instruction/operation” at a time. However this isn’t how most CPU’s work. Most work by having a pipeline of instructions that are in various stages of execution. You could imagine it like this:

let mut x = 0
let mut y = 0
x = 15 * some_input;
y = 10 * other_input;
if x > y {
    return true;
} else {
    return false;

This is some made up code, but in a CPU, every part of this could be in the “pipeline” at once.

let mut x = 0                   <<-- at the head of the queue and "further" along completion
let mut y = 0                   <<-- it's executed part way, but not to completion
x = 15 * some_input;
y = 10 * other_input;           <<-- all of these are in pipeline, and partially complete
if x > y {                      <<-- what happens here?
    return true;
} else {
    return false;

So how does this “pipeline” handle the if statement? If the pipeline is looking ahead, how can we handle a choice like an if? Can we really predict the future?

Speculative execution

At the if statement, the CPU uses past measurements to make a prediction about which branch might be taken, and it then begins to execute that path, even though ‘x > y’ has not been executed or completed yet! At this point x or y may not have even finished being computed yet!

Let’s assume for now our branch predictor thinks that ‘x > y’ is false, so we’ll start to execute the “return false” or any other content in that branch.

Now the instructions ahead catch up, and we resolve “did we really predict correctly?”. If we did, great! We have been able to advance the program state asynchronously even without knowing the answer until we get there.

If not, ohh nooo. We have to unwind what we were doing, clear some of the pipeline and try to do the correct branch.

Of course this has an impact on timing of the program. Some people found you could write a program to manipulate this predictor and using specific addresses and content, they could use these timing variations to “access memory” they are not allowed to by letting the specualative executor contribute to code they are not allowed to access before the unroll occurs. They could time this, and retrieve the memory contents from areas they are not allowed to access, breaking isolation.

Owwww my brain

Yes. Mine too.

Community Reactions

Since this has been found, a large amount of the community reaction has been about the “root cause”. ‘Clearly’ the root cause is “Intel are bad at making CPU’s” and so everyone should buy AMD instead because they “weren’t affected quite as badly” (Narrators voice: They were absolutely just as bad). We’ve had some intel CPU updates and kernel/program fixes so all good right? We addressed the root cause.

Or … did we?

Our computers are still asynchronous, and contain many out-of-order parts. It’s hard to believe we have “found” every method of exploiting this. Indeed in the last year many more ways to bypass hardware isolation due to our systems async nature have been found.

Maybe the “root cause” wasn’t addressed. Maybe … there are no ….


To understand how we got to this situation we need to look at how CPU’s have evolved. This is not a complete history.

The PDP11 was a system owned at bell labs, where the C programing language was developed. Back then CPU’s were very simple - A CPU and memory, executing one instruction at a time.

The C programming language gained a lot of popularity as it was able to be “quickly” ported to other CPU models to allow software to be compiled on other platforms. This led to many systems being developed in C.

Intel introduced the 8086, and many C programs were ported to run on it. Intel then released the 80486 in 1989, which had the first pipeline and cache to improve performance. In order to continue to support C, this meant the memory model could not change from the PDP11 - the cache had to be transparent, and the pipeline could not expose state.

This has of course led to computers being more important in our lives and businesses, so we expected further performance, leading to increased frequencies and async behaviours.

The limits of frequencies were really hit in the Pentium 4 era, when about 4GHz was shown to be a barrier of stability for those systems. They had very deep pipelines to improve performance, but that also had issues when branch prediction failed causing pipeline stalls. Systems had to improve their async behaviours futher to squeeze every single piece of performance possible out.

Compiler developers also wanted more performance so they started to develop ways to transform C in ways that “took advantage” of x86_64 tricks, by manipulating the environment so the CPU is “hinted” into states we “hope” it gets into.

Many businesses also started to run servers to provide to consumers, and in order to keep costs low they would put many users onto single pieces of hardware so they could share or overcommit resources.

This has created a series of positive reinforcement loops - C is ‘abi stable’ so we keep developing it due to it’s universal nature. C code can’t be changed without breaking every existing system. We can’t change the CPU memory model without breaking C, which is hugely prevalent. We improve the CPU to make C faster, transparently so that users/businesses can run more C programs and users. And then we improve compilers to make C faster given quirks of the current CPU models that exist …

Swiss cheese model

It’s hard to look at the current state of systems security and simply say “it’s the cpu vendors fault”. There are many layers that have come together to cause this situation.

This is called the “swiss cheese model”. Imagine you take a stack of swiss cheese and rotate and rearrange the slices. You will not be able to see through it. but as you continue to rotate and rearrange, eventually you may see a tunnel through the cheese where all the holes line up.

This is what has happened here - we developed many layers socially and technically that all seemed reasonable over time, and only after enough time and re-arrangements of the layers, have we now arrived at a situation where a failure has occured that permeates all of computer hardware.

To address it, we need to look beyond just “blaming hardware makers” or “software patches”. We need to help developers move away from C to other languages that can be brought onto new memory models that have manual or other cache strategies. We need hardware vendors to implement different async models. We need to educate businesses on risk analysis and how hardware works to provide proper decision making capability. We need developers to alter there behaviour to work in environments with higher performance constraints. And probably much much more.

There are no root causes

It is a very pervasive attitude in IT that every issue has a root cause. However, looking above we can see it’s never quite so simple.

Saying an issue has a root cause, prevents us from examining the social, political, economic and human factors that all become contributing factors to failure. Because we are unable to examine them, we are unable to address the various layers that have contributed to our failures.

There are no root causes. Only contributing factors.

Mon, 20 Jan 2020 00:00:00 +1000 <![CDATA[Concurrency 2: Concurrently Readable Structures]]> Concurrency 2: Concurrently Readable Structures

In this post, I’ll discuss concurrently readable datastructures that exist, and ideas for future structures. Please note, this post is an inprogress design, and may be altered in the future.

Before you start, make sure you have read part 1

Concurrent Cell

The simplest form of concurrently readable structure is a concurrent cell. This is equivalent to a read-write lock, but has concurrently readable properties instead. The key mechanism to enable this is that when the writer begins, it clones the data before writing it. We trade more memory usage for a gain in concurrency.

To see an implementation, see my rust crate, concread

Concurrent Tree

The concurrent cell is good for small data, but a larger structure - like a tree - may take too long to clone on each write. A good estimate is that if your data in the cell is larger than about 512 bytes, you likely want a concurrent tree instead.

In a concurrent tree, only the branches involved in the operation are cloned. Imagine the following tree:


When we attempt to change a value in the 4th leaf we copy it before we begin, and all it’s parents to update their pointers.


In the process the pointers from the new root b to branch 1 are maintained. The new second branch also maintains a pointer to the original 3rd leaf.

This means that in this example only 3/7 nodes are copied, saving a lot of cloning. As your tree grows this saves a lot of work. Consider a tree with node-widths of 7 pointers and at height level 5. Assuming perfect layout, you only need to clone 5/~16000 nodes. A huge saving in memory copy!

The interesting part is a reader of root a, also is unaffected by the changes to root b - the tree from root a hasn’t been changed, as all it’s pointers and nodes are still valid.

When all readers of root a end, we clean up all the nodes it pointed to that no longer are needed by root b (this can be done with atomic reference counting, or garbage lists in transactions).


It is through this copy-on-write (also called multi view concurrency control) that we achieve concurrent readability in the tree.

This is really excellent for databases where you have in memory structures that work in parallel to the database transactions. In kanidm an example is the in-memory schema that is used at run time but loaded from the database. They require transactional behaviours to match the database, and ACID properties so that readers of a past transaction have the “matched” schema in memory.

Concurrent Cache (Updated 2020-05-13)

A design that I have thought about for a long time has finally come to reality. This is a concurrently readable transactional cache. One writer, multiple readers with consistent views of the data. Additionally due to the transactioal nature, rollbacks and commits are fulled supported.

For a more formal version of this design, please see my concurrent ARC draft paper.

This scheme should work with any cache type - LRU, LRU2Q, LFU. I have used ARC.

ARC was popularised by ZFS - ARC is not specific to ZFS, it’s a strategy for cache replacement, despite the comment association between the two.

ARC is a pair of LRU’s with a set of ghost lists and a weighting factor. When an entry is “missed” it’s inserted to the “recent” LRU. When it’s accessed from the LRU a second time, it moves to the “frequent” LRU.

When entries are evicted from their sets they are added to the ghost list. When a cache miss occurs, the ghost list is consulted. If the entry “would have been” in the “recent” LRU, but was not, the “recent” LRU grows and the “frequent” LRU shrinks. If the item “would have been” in the “frequent” LRU but was not, the “frequent” LRU is expanded, and the “recent” LRU shrunk.

This causes ARC to be self tuning to your workload, as well as balancing “high frequency” and “high locality” operations. It’s also resistant to many cache invalidation or busting patterns that can occur in other algorithms.

A major problem though is ARC is not designed for concurrency - LRU’s rely on double linked lists which is very much something that only a single thread can modify safely due to the number of pointers that are not aligned in a single cache line, prevent atomic changes.

How to make ARC concurrent

To make this concurrent, I think it’s important to specify the goals.

  • Readers must always have a correct “point in time” view of the cache and its data
  • Readers must be able to trigger cache inclusions
  • Readers must be able to track cache hits accurately
  • Readers are isolated from all other readers and writer actions
  • Writers must always have a correct “point in time” view
  • Writers must be able to rollback changes without penalty
  • Writers must be able to trigger cache inclusions
  • Writers must be able to track cache hits accurately
  • Writers are isolated from all readers
  • The cache must maintain correct temporal ordering of items in the cache
  • The cache must properly update hit and inclusions based on readers and writers
  • The cache must provide ARC semantics for management of items
  • The cache must be concurrently readable and transactional
  • The overhead compared to single thread ARC is minimal

There are a lot of places to draw inspiration from, and I don’t think I can list - or remember them all.

My current design uses a per-thread reader cache to allow inclusions, with a channel to asynchronously include and track hits to the write thread. The writer also maintains a local cache of items including markers of removed items. When the writer commits, the channel is drained to a time point T, and actions on the ARC taken.

This means the LRU’s are maintained only in a single write thread, but the readers changes are able to influence the caching decisions.

To maintain consistency, and extra set is added which is the haunted set, so that a key that has existed at some point can be tracked to identify it’s point in time of eviction and last update so that stale data can never be included by accident.

Limitations and Concerns

Cache missing is very expensive - multiple threads may load the value, the readers must queue the value, and the writer must then act on the queue. Sizing the cache to be large enough is critically important as eviction/missing will have a higher penalty than normal. Optimally the cache will be “as large or larger” than the working set.

But with a concurrent ARC we now have a cache where each reader thread has a thread local cache and the writer is communicated to by channels. This may make the cache’s memory limit baloon to a high amount over a normal cache. To help, an algorithm was developed based on expect cache behaviour for misses and communication to help size the caches of readers and writers.


This is a snapshot of some concurrently readable datastructures, and how they are implemented and useful in your projects. Using them in Kanidm we have already seen excellent performance and scaling of the server, with very little effort for tuning. We plan to adapt these for use in 389 Directory Server too. Stay tuned!

Sun, 29 Dec 2019 00:00:00 +1000 <![CDATA[Concurrency 1: Types of Concurrency]]> Concurrency 1: Types of Concurrency

I want to explain different types of concurrent datastructures, so that we can explore their properties and when or why they might be useful.

As our computer systems become increasingly parallel and asynchronous, it’s important that our applications are able to work in these environments effectively. Languages like Rust help us to ensure our concurrent structures are safe.

CPU Memory Model Crash Course

In no way is this a thorough, complete, or 100% accurate representation of CPU memory. My goal is to give you a quick brief on how it works. I highly recommend you read “what every programmer should know about memory” if you want to learn more.

In a CPU we have a view of a memory space. That could be in the order of KB to TB. But it’s a single coherent view of that space.

Of course, over time systems and people have demanded more and more performance. But we also have languages like C, that won’t change from their view of a system as a single memory space, or change how they work. Of course, it turns out C is not a low level language but we like to convince ourselves it is.

To keep working with C and others, CPU’s have acquired cache’s that are transparent to the operation of the memory. You have no control of what is - or is not - in the cache. It “just happens” asynchronously. This is exactly why spectre and meltdown happened (and will continue to happen) because these async behaviours will always have the observable effect of making your CPU faster. Who knew!

Anyway, for this to work, each CPU has multiple layers of cache. At L3 the cache is shared with all the cores on the die. At L1 it is “per cpu”.

Of course it’s a single view into memory. So if address 0xff is in the CPU cache of core 1, and also in cache of core 2, what happens? Well it’s supported! Caches between cores are kept in sync via a state machine called MESI. These states are:

  • Exclusive - The cache is the only owner of this value, and it is unchanged.
  • Modified - The cache is the only owner of this value, and it has been changed.
  • Invalid - The cache holds this value but another cache has changed it.
  • Shared - This cache and maybe others are viewing this valid value.

To gloss very heavily over this topic, we want to avoid invaild. Why? That means two cpus are contending for the value, causing many attempts to keep each other in check. These contentions cause CPU’s to slow down.

We want values to either be in E/M or S. In shared, many cpu’s are able to read the value at maximum speed, all the time. In E/M, we know only this cpu is changing the value.

This cache coherency is also why mutexes and locks exist - they issue the needed CPU commands to keep the caches in the correct states for the memory we are accessing.

Keep in mind Rust’s variables are immutable, and able to share between threads, or mutable and single thread only. Sound familar? Rust is helping with concurrency by keeping our variables in the fastest possible cache states.

Data Structures

We use data structures in programming to help improve behaviours of certain tasks. Maybe we need to find values quicker, sort contents, or search for things. Data Structures are a key element of modern computer performance.

However most data structures are not thread safe. This means only a single CPU can access or change them at a time. Why? Because if a second read them, due to cache-differences in content the second CPU may see an invalid datastructure, leading to undefined behaviour.

Mutexes can be used, but this causes other CPU’s to stall and wait for the mutex to be released - not really what we want on our system. We want every CPU to be able to process data without stopping!

Thread Safe Datastructures

There exist many types of thread safe datastructures that can work on parallel systems. They often avoid mutexes to try and keep CPU’s moving as fast as possible, relying on special atomic cpu operations to keep all the threads in sync.

Multiple classes of these structures exist, which have different properties.


I have mentioned these already, but it’s worth specifying the properties of a mutex. A mutex is a system where a single CPU exists in the mutex. It becomes one “reader/writer” and all other CPU’s must wait until the mutex is released by the current CPU holder.

Read Write Lock

Often called RWlock, these allow one writer OR multiple parallel readers. If a reader is reading then a writer request is delayed until the readers complete. If a writer is changing data, all new reads are blocked. All readers will always be reading the same data.

These are great for highly concurrent systems provided your data changes infrequently. If you have a writer changing data a lot, this causes your readers to be continually blocking. The delay on the writer is also high due to a potentially high amount of parallel readers that need to exit.

Lock Free

Lock free is a common (and popular) datastructue type. These are structures that don’t use a mutex at all, and can have multiple readers and multiple writers at the same time.

The most common and popular structure for lock free is queues, where many CPUs can append items and many can dequeue at the same time. There are also a number of lock free sets which can be updated in the same way.

An interesting part of lock free is that all CPU’s are working on the same set - if CPU 1 reads a value, then CPU 2 writes the same value, the next read from CPU 1 will show the new value. This is because these structures aren’t transactional - lock free, but not transactional. There are some times where this is really useful as a property when you need a single view of the world between all threads, and your program can tolerate data changing between reads.

Wait Free

This is a specialisation of lock free, where the reader/writer has guaranteed characteristics about the time they will wait to read or write data. This is very detailed and subtle, only affecting real time systems that have strict deadline and performance requirements.

Concurrently Readable

In between all of these is a type of structure called concurrently readable. A concurrently readable structure allows one writer and multiple parallel readers. An interesting property is that when the reader “begins” to read, the view for that reader is guaranteed not to change until the reader completes. This means that the structure is transactional.

An example being if CPU 1 reads a value, and CPU 2 writes to it, CPU 1 would NOT see the change from CPU 2 - it’s outside of the read transaction!

In this way there are a lot of read-only immutable data, and one writer mutating and changing things … sounds familar? It’s very close to how our CPU’s cache work!

These structures also naturally lend themself well to long processing or database systems where you need transactional (ACID) properties. In fact some databases use concurrent readable structures to achieve ACID semantics.

If it’s not obvious - concurrent readability is where my interest lies, and in the next post I’ll discuss some specific concurrently readable structures that exist today, and ideas for future structures.

Sun, 29 Dec 2019 00:00:00 +1000 <![CDATA[Packaging and the Security Proposition]]> Packaging and the Security Proposition

As a follow up to my post on distribution packaging, it was commented by Fraser Tweedale (@hackuador) that traditionally the “security” aspects of distribution packaging was a compelling reason to use distribution packages over “upstreams”. I want to dig into this further.

Why does C need “securing”

C as a language is unsafe in every meaning of the word. The best C programmers on the planet are incapable of writing a secure program. This is because to code in C you have to express a concurrent problem, into a language that is linearised, which is compiled relying on undefined behaviour, to be executed on an asynchronous concurrent out of order CPU. What could possibly go wrong?!

There is a lot you need to hold in mind to make C work. I can tell you now that I spend a majority of my development time thinking about the code to change rather than writing C because of this!

This has led to C based applications having just about every security issue known to man.

How is C “secured”

So as C is security swiss cheese, this means we have developed processes around the language to soften this issue - for example advice like patch and update continually as new changes are continually released to resolve issues.

Distribution packages have always been the “source” of updates for these libraries and applications. These packages are maintained by humans who need to update these packages. This means when a C project releases a fix, these maintainers would apply the patch to various versions, and then release the updates. These library updates due to C’s dynamic nature means when the machine is next rebooted (yes rebooted, not application restarted) that these fixes apply to all consumers who have linked to that library - change one, fix everything. Great!

But there are some (glaring) weaknesses to this model. C historically has little to poor application testing so many of these patches and their effects can’t be reproduced. Which also subsequently means that consuming applications also aren’t re-tested adequately. It can also have impacts where a change to a shared library can impact a consuming application in a way that was unforseen as the library changed.

The Dirty Secret

The dirty secret of many of these things is that “thoughts and prayers” is often the testing strategy of choice when patches are applied. It’s only because humans carefully think about and write tiny amounts of C that we have any reliability in our applications. And we already established that it’s nearly impossible for humans to write correct C …

Why Are We Doing This?

Because C linking and interfaces are so fragile, and due to the huge scope in which C can go wrong due to being a memory unsafe language, distributions and consumers have learnt to fear version changes. So instead we patch ancient C code stacks, barely test them, and hope that our castles of sand don’t fall over, all so we can keep “the same version” of a program to avoid changing it as much as possible. Ironically this makes those stacks even worse because we’ve developed infinite numbers of bespoke barely tested packages that people rely on daily.

To add more insult to this, most of this process is manual - humans monitor mailing lists, and have to know what code needs what patch, and when in what release streams. It’s a monumental amount of human time and labour involved to keep the sand castles standing. This manual involvement is what leads to information overload, and maintainers potentially missing security updates or releases that causes many distribution packages to be outdated, missing patches, or vulnerable more often than not. In other cases packages continue to be shipped that are unmaintained or have no upstream, so any issues that may exist are unknown or unresolved.

Distribution Security

This means all of platform and distribution security comes to one factor.

A lot of manual human labour.

It’s is only because distributions have so many volunteers or paid staff, that this entire system continues to progress to give the illusion of security and reliability. When it fails, it fails silently.

Heartbleed really dragged the poor state of C security into the open , and it’s still not been addressed.

When people say “how can we secure docker/flatpak/Rust” like we do with distributions, I say: “Do we really secure distributions at all?”. We only have a veneer of best effort masquerading as a secure supply chain.

A Different Model …

So let’s look briefly at Rust and how you package it today (against distribution maintainer advice).

Because it’s staticly linked, each application must be rebuilt if a library changes. Because the code comes from a central upstream, there are automated tools to find security issues (like cargo audit). The updates are pulled from the library as a whole working tested unit, and then built into our application to to recieve further testing and verification of the application as a whole singular functional unit.

These dependencies once can then be vendored to a tar (allowing offline builds and some aspects of reproducability). This vendor.tar.gz is placed into the source rpm along with the application source, and then built.

There is a much stronger pipeline of assurances here! And to further aid Rust’s cause, because it is a memory safe language, it eliminates most of the security issues that C is afflicted by, causing security updates to be far fewer, and to often affect higher level or esoteric situations. If you don’t believe me, look at the low frequency, and low severity commits for the rust advisory-db

People have worried that because Rust is staticly linked we’ll have to rebuild it and update it continually to keep it secure - I’d say because it’s Rust we’ll have stronger guarantees at build that security issues are less likely to exist and we won’t have to ship updates nearly as often as a C stack.

Another point to make is Rust libraries don’t release patches - because of Rust’s stronger guarantees at compile time and through integrated testing, people are less afraid of updates to versions. We are very unlikely to see Rust releasing patches, rather than just shipping “updates” to libraries and expecting you to update. Because these are staticly linked, we don’t have to worry about versions for other libraries on the platform, we only need to assure the application is currently working as intended. Because of the strong typing those interfaces of those libraries has stronger compile time guarantees at build time, meaning the issues around shared object versioning and symbol/version mismatching simply don’t exist - one of the key reasons people became version change averse in the first place.

So Why Not Package All The Things?

Many distribution packagers have been demanding a C-like model for Rust and others (remember, square peg, round hole). This means every single crate (library) is packaged, and then added to a set of buildrequires for the application. When a crate updates, it triggers the application to rebuild. When a security update for a library comes out, it rebuilds etc.

This should sound familiar … because it is. It’s reinventing Cargo in a clean-room.

RPM provides a way to manage dependencies. Cargo provides a way to manage dependencies.

RPM provides a way to offline build sources. Cargo provides a way to offline build sources.

RPM provides a way to patch sources. Cargo provides a way to update them inplace - and patch if needed.

RPM provides a way to … okay you get the point.

There is also a list of what we won’t get from distribution packages - remember distribution packages are the C language packaging system

We won’t get the same level of attention to detail, innovation and support as the upstream language tooling has. Simply put, users of the language just won’t use distribution packages (or toolchains, libraries …) in their workflows.

Distribution packages can’t offer is the integration into tools like cargo-audit for scanning for security issues - that needs still needs Cargo, not RPM, meaning the RPM will need to emulate what Cargo does exactly.

Using distribution packages means you have an untested pipeline that may add more risks now. Developers won’t use distribution packages - they’ll use cargo. Remember applications work best as they are tested and developed - outside of that environment they are an unknown.

Finally, the distribution maintainers security proposition is to secure our libraries - for distributions only. That’s acting in self interest. Cargo is offering a way to secure upstream so that everyone benefits. That means less effort and less manual labour all around. And secure libraries are not the full picture. Secure applications is what matters.

The large concerning factor is the sheer amount of human effort. We would spend hundreds if not thousands of hours to reinvent a functional tool in a disengaged manner, just so that we can do things as they have always been done in C - for the benefit of distributions individually rather than languages upstream.

What is the Point

Again - as a platform our role is to provide applications that people can trust. The way we provide these applications is never going to be one size fits all. Our objective isn’t to secure “this library” or “that library”, it’s to secure applications as a functional whole. That means that companies shipping those applications, should hire maintainers to work on those applications to secure their stacks.

Today I honestly think Rust has a better security and updating story than C packages ever has, powered by automation and upstream integration. Let’s lean on that, contribute to it, and focus on shipping applications instead of reinventing tools. We need to accept our current model is focused on C, that developers have moved around distribution packaging, and that we need to change our approach to eliminate the large human risk factor that currently exists.

We can’t keep looking to the models of the past, we need to start to invest in new methods for the future.

Today, distributions should focus on supporting and distributing _applications_ and work with native language supply chains to enable this.

Which is why I’ll keep using cargo’s tooling and auditing, and use distribution packages as a delievery mechanism for those applications.

What Could it Look Like?

We have a platform that updates as a whole (Fedora Atomic comes to mind …) with known snapshots that are tested and well known. This platform has methods to run applications, and those applications are isolated from each other, have their own libraries, and security audits.

And because there are now far fewer moving parts, quality is easier to assert, understand, and security updates are far easier and faster, less risky.

It certainly sounds a lot like what macOS and iOS have been doing with a read-only base, and self-contained applications within that system.

Thu, 19 Dec 2019 00:00:00 +1000 <![CDATA[Packaging, Vendoring, and How It’s Changing]]> Packaging, Vendoring, and How It’s Changing

In today’s thoughts, I was considering packaging for platforms like opensuse or other distributions and how that interacts with language based packaging tools. This is a complex and … difficult topic, so I’ll start with my summary:

Today, distributions should focus on supporting and distributing _applications_ and work with native language supply chains to enable this.

Distribution Packaging

Let’s start by clarifying what distribution packaging is. This is your linux or platforms method of distributing it’s programs libraries. For our discussion we really only care about linux so say suse or fedora here. How macOS or FreeBSD deal with this is quite different.

Now these distribution packages are built to support certain workflows and end goals. Many open source C projects release their source code in varying states, perhaps also patches to improve or fix issues. These code are then put into packages, dependencies between them established due to dynamic linking, they are signed for verification purposes and then shipped.

This process is really optimised for C applications. C has been the “system language” for many decades now, and so we can really see these features designed to promote - and fill in gaps - for these applications.

For example, C applications are dynamically linked. Because of this it encourages package maintainers to “split” applications into smaller units that can have shared elements. An example that I know is openldap which may be a single source tree, but often is packaged to multiple parts such as the, lmdb, openldap-client applications, it’s server binary, and probably others. The package maintainer is used to taking their scalpels and carefully slicing sources into elegant packages that can minimise how many things are installed to what is “just needed”.

We also see other behaviours where C shared objects have “versions”, which means you can install multiple versions of them at once and programs declare in their headers which library versions they want to consume. This means a distribution package can have many versions of the same thing installed!

This in mind, the linking is simplistic and naive. If a shared object symbol doesn’t exist, or you don’t give it the “right arguments” via a weak-compile time contract, it’s likely bad things (tm) will happen. So for this, distribution packaging provides the stronger assertions about “this program requires that library version”.

As well, in the past the internet was a more … wild place, where TLS wasn’t really widely used. This meant that to gain strong assertions about the source of a package and that it had not been tampered, tools like GPG were used.

What about Ruby or Python?

Ruby and Python are very different languages compared to C though. They don’t have information in their programs about what versions of software they require, and how they mesh together. Both languages are interpreted, and simply “import library” by name, searching a filesystem path for a library matching regardless of it’s version. Python then just loads in that library as source straight to the running vm.

It’s already apparent how we’ll run into issues here. What if we have a library “foo” that has a different function interface between version 1 and version 2? Python applications only request access to “foo”, not the version, so what happens if the wrong version is found? What if it’s not found?

Some features here are pretty useful from the “distribution package” though. Allowing these dynamic tools to have their dependencies requested from the “package”, and having the package integrity checked for them.

But overtime, conflicts started, and issues arose. A real turning point was ruby in debian/ubuntu where debian package maintainers (who are used to C) brought out the scalpel and attempted to slice ruby down to “parts” that could be reused form a C mindset. This led to a combinations of packages that didn’t make sense (rubygems minus TLS, but rubygems requires https), which really disrupted the communities.

Another issue was these languages as they grew in popularity had different projects requiring different versions of libraries - which as before we mentioned isn’t possible beside library search path manipulations which is frankly user hostile.

These issues (and more) eventually caused these communities as a whole to stop recommending distribution packages.

So put this history in context. We have Ruby (1995) and Python (1990), which both decided to avoid distribution packages with their own tools aka rubygems (2004) and pip (2011), as well as tools to manage multiple parallel environments (rvm, virtualenv) that were per-application.

New kids on the block

Since then I would say three languages have risen to importance and learnt from the experiences of Ruby - This is Javascript (npm/node), Go and Rust.

Rust went further than Ruby and Python and embedded distribution of libraries into it’s build tools from an early date with Cargo. As Rust is staticly linked (libraries are build into the final binary, rather than being dynamicly loaded), this moves all dependency management to build time - which prevents runtime library conflict. And because Cargo is involved and controls all the paths, it can do things such as having multiple versions available in a single build for different components and coordinating all these elements.

Now to hop back to npm/js. This ecosystem introduced a new concept - micro-dependencies. This happened because javascript doesn’t have dead code elimination. So if given a large utility library, and you call one function out of 100, you still have to ship all 99 unused ones. This means they needed a way to manage and distribute hundreds, if not thousands of tiny libraries, each that did “one thing” so that you pulled in “exactly” the minimum required (that’s not how it turned out … but it was the intent).

Rust also has inherited a similar culture - not to the same extreme as npm because Rust DOES have dead code elimination, but still enough that my concread library with 3 dependencies pulls in 32 dependencies, and kanidm from it’s 30 dependencies, pulls in 365 into it’s graph.

But in a way this also doesn’t matter - Rust enforces strong typing at compile time, so changes in libraries are detected before a release (not after like in C, or dynamic languages), and those versions at build are used in production due to the static linking.

This has led to a great challenge is distribution packaging for Rust - there are so many “libraries” that to package them all would be a monumental piece of human effort, time, and work.

But once again, we see the distribution maintainers, scalpel in hand, a shine in their eyes looking and thinking “excellent, time to package 365 libraries …”. In the name of a “supply chain” and adding “security”.

We have to ask though, is there really value of spending all this time to package 365 libraries when Rust functions so differently?

What are you getting at here?

To put it clearly - distribution packaging isn’t a “higher” form of distributing software. Distribution packages are not the one-true solution to distribute software. It doesn’t magically enable “security”. Distribution Packaging is the C language source and binary distribution mechanism - and for that it works great!

Now that we can frame it like this we can see why there are so many challenges when we attempt to package Rust, Python or friends in rpms.

Rust isn’t C. We can’t think about Rust like C. We can’t secure Rust like C.

Python isn’t C. We can’t think about Python like C. We can’t secure Python like C.

These languages all have their own quirks, behaviours, flaws, benefits, and goals. They need to be distributed in unique ways appropriate to those languages.

An example of the mismatch

To help drive this home, I want to bring up FreeIPA. FreeIPA has a lot of challenges in packaging due to it’s huge number of C, Python and Java dependencies. Recently on twitter it was annouced that “FreeIPA has been packaged for debian” as the last barrier (being dogtag/java) was overcome to package the hundreds of required dependencies.

The inevitable outcome of debian now packaging FreeIPA will be:

  • FreeIPA will break in some future event as one of the python or java libraries was changed in a way that was not expected by the developers or package maintainers.
  • Other applications may be “held back” from updating at risk/fear of breaking FreeIPA which stifles innovation in the java/python ecosystems surrounding.

It won’t be the fault of FreeIPA. It won’t be the fault of the debian maintainers. It will be that we are shoving square applications through round C shaped holes and hoping it works.

So what does matter?

It doesn’t matter if it’s Kanidm, FreeIPA, or 389-ds. End users want to consume applications. How that application is developed, built and distributed is a secondary concern, and many people will go their whole lives never knowing how this process works.

We need to stop focusing on packaging libraries and start to focus on how we distribute applications.

This is why projects like docker and flatpak have surprised traditional packaging advocates. These tools are about how we ship applications, and their build and supply chains are separated from these.

This is why I have really started to advocate and say:

Today, distributions should focus on supporting and distributing _applications_ and work with native language supply chains to enable this.

Only we accept this shift, we can start to find value in distributions again as sources of trusted applications, and how we see the distribution as an application platform rather than a collection of tiny libraries.

The risk of not doing this is alienating communities (again) from being involved in our platforms.

Follow Up

There have been some major comments since:

First, there is now a C package manager named conan . I have no experience with this tool, so at a distance I can only assume it works well for what it does. However it was noted it’s not gained much popularity, likely due to the fact that distro packages are the current C language distribution channels.

The second was about the security components of distribution packaging and this - that topic is so long I’ve written another post about the topic instead, to try and keep this post focused on the topic.

Finally, The Fedora Modularity effort is trying to deal with some of these issues - that modules, aka applications have different cadences and requirements, and those modules can move more freely from the base OS.

Some of the challenges have been explored by LWN and it’s worth reading. But I think the underlying issue is that again we are approaching things in a way that may not align with reality - people are looking at modules as libraries, not applications which is causing things to go sideways. And when those modules are installed, they aren’t isolated from each other , meaning we are back to square one, with a system designed only for C. People are starting to see that but the key point is continually missed - that modularity should be about applications and their isolation not about multiple library versions.

Wed, 18 Dec 2019 00:00:00 +1000 <![CDATA[Fixing opensuse virtual machines with resume]]> Fixing opensuse virtual machines with resume

Today I hit an unexpected issue - after changing a virtual machines root disk to scsi, I was unable to boot the machine.

The host is opensuse leap 15.1, and the vm is the same. What’s happening!

The first issue appears to be that opensuse 15.1 doesn’t support scsi disks from libvirt. I’m honestly not sure what’s wrong here.

The second is that by default opensuse leap configures suspend and resume to disk - by it uses the pci path instead of a swap volume UUID. So when you change the bus type, it renames the path making the volume inaccessible. This causes boot to fail.

To work around you can remove “resume=/disk/path” from the cli. Then to fix it permanently you need:

transactional-update shell
vim /etc/default/grub
# Edit this line to remove "resume"
GRUB_CMDLINE_LINUX_DEFAULT="console=ttyS0,115200 resume=/dev/disk/by-path/pci-0000:00:07.0-part3 splash=silent quiet showopts"

vim /etc/default/grub_installdevice
# Edit the path to the correct swap location as by ls -al /dev/disk/by-path

I have reported these issues, and I hope they are resolved.

Sun, 15 Dec 2019 00:00:00 +1000 <![CDATA[Password Quality and Badlisting in Kanidm]]> Password Quality and Badlisting in Kanidm

Passwords are still a required part of any IDM system. As much as I wish for Kanidm to only support webauthn and stronger authentication types, at the end of the day devices can be lost, destroyed, some people may not be able to afford them, some clients aren’t compatible with them and more.

This means the current state of the art is still multi-factor auth. Something you have and something you know.

Despite the presence of the multiple factors, it’s still important to quality check passwords. Microsoft’s Azure security team have written about passwords, and it really drives home the current situation. I would certainly trust these people at Microsoft to know what they are talking about given the scale of what they have to defend daily.

The most important take away is that trying to obscure the password from a bruteforce is a pointless exercise because passwords end up in password dumps, they get phished, keylogged, and more. MFA matters!

It’s important here to look at the “easily guessed” and “credential stuffing” category. That’s what we really want to defend against with password quality, and MFA protects us against keylogging, phising (only webauthn), and reuse.

Can we Avoid This?

Yes! Kanidm supports a “generated” password that is a long, high entropy password that should be stored in a password manager or similar tool to prevent human knowledge. This fits to our “device as authentication” philosophy. You authenticate to your device (phone, laptop etc), and then the devices stored passwords authenticate you from that point on. This has the benefit that devices and password managers generally perform better checking of the target where we enter the password, making phising less likely.

But sometimes we can’t rely on this, so we still need human-known passwords, so we still take steps to handle these.


In the darkages, we would decree that human known passwords had to have a number, symbol, roman numeral, no double letters, one capital letter, two kanji and no symbols that could be part of an sql injection because of that one legacy system we can’t patch.

This lead to people making horrid, un-rememberable passwords in leetspeak, or giving up altogether and making the excellent “Password1” (which passes AD minimum password requirements on server 2003).

What we really want is entropy, length, and memorability. Dropbox made a great library for this called zxcvbn, which has since been ported to rust . I highly recommend it.

This library is great because it focuses on entropy, and then if the password doesn’t meet these requirements, the library recommends ways to improve. This is excellent for human interaction and experience, guiding people to create better passwords that they can remember, rather than our outdated advice of the complex passwords as described above.


Badlisting is another great technique for improving password quality. It’s essentially a blocklist of passwords that people are not allowed to set. This way you can have corporate-specific breach lists, or the top 10k most used passwords badlisted to prevent users using them. For example, “correct horse battery staple” may be a strong password, but it’s well known thanks to xkcd.

It’s also good for preventing password reuse in your company if you are phished and the credentials privately notified to you as some of the regional CERT’s do, allowing you to block these without them being in a public breach list.

This is important as many bots will attempt to spam these passwords against accounts (rate limiting auth and soft-locking accounts also helps to delay these attack styles).

In Kanidm

In Kanidm, we chose to use both approaches. First we check the password with zxcvbn, then we ensure it’s not in a badlist.

In order to minimise the size of the badlist, the badlist uses case insensitive storage so that multiple variants of “password” and “PasSWOrD” are only listed once. We also preprocessed the badlist with zxcvbn to remove any passwords that it would have denied from being entered. The preprocessor tool will be shipped with kanidm so that administrators can preprocess their own lists before adding them to the badlist configuration.

Creating a Badlist

I decided to do some analysis on a well known set of passwords maintained on the seclists repository . Apparently this is what pentesters reach for when they want to bruteforce or credential stuff on a domain.

I analysed this in three ways. The first was as a full set of passwords (about 25 million) and a smaller but “popular” set in the “rockyou” files which is about 60,000 passwords. Finally I did an analysis of the rockyou + top 10 million most common (which combined was 1011327 unique passwords, so about 50k of the rockyou set is from top 10 million).

From the 25 million set I ran this through a preprocessor tool that I developed for kanidm. It eliminated anything less than a score of 3 and no length rule. This showed that zxcvbn was able to prevent 80% of these inputs from being allowed. If I was to ship this full list, this would contain 4.8 million badlisted passwords. It’s pretty amazing already that zxcvbn stops 80% of bad passwords that end up in breaches from being able to be used, with the remaining 20% likely to be complex passwords that just got dumped by other means.

However, for the badlist in Kanidm, I decided to start with “what’s popular” for now, and to allow sites to add extra content if they desire. This meant that I focused instead on the “rockyou” password set instead.

From the rockyou set I did more tests. zxcvbn has a concept of scores, and we can have policy to request a minimum score is present to allow the password. I did a score 3 test, a score 3 with min pw len 10 and a score 4 test. This showed the following results which has the % blocked by zxcvbn and the no. that passed which will required badlisting as zxcvbn can’t detect them (today).

TEST     | % blocked | no. passed
 s3      |  98.3%    |  1004
 s3 + 10 |  98.9%    |  637
 s4      |  99.7%    |  133

Personally, it’s quite hilarious that “2fast2furious” passed the score 3 check, and “30secondstomars” and “dracomalfoy” passed the score 4 check, but who am I to judge - that’s what bad lists are for.

More seriously, I found it interesting to see the effect of the check on length - not only was the preprocessor step faster, but alone that eliminated ~400 passwords that would have “passed” on score 3.

Finally, from the rockyou + 10m set, the results show the following in the same conditions.

TEST     | % blocked | no. passed
 s3      |  89.9%    |  101349
 s3 + 10 |  92.4%    |  76425
 s4      |  96.5%    |  34696

This shows that a very “easy” win is to enforce password length, in addition to entropy checkers like zxcvbn, which are effective to block 92% of the most common passwords in use on a broad set and 98% of what a pentester will look for (assuming rockyou lists). If you have a high security environment you should consider setting zxcvbn to request passwords of score 4 (the maximum), given that on the 10m set it had a 96.5% block rate.


You should use zxcvbn, it’s a great library, which quickly reduces a huge amount of risk from low quality passwords.

After that your next two strongest controls are password length, and being able to support badlisting.

Better yet, use MFA like Webauthn as well, and support server-side generated high-entropy passwords!

Sat, 07 Dec 2019 00:00:00 +1000 <![CDATA[Rust 2020 - helping to get rust deployed]]> Rust 2020 - helping to get rust deployed

This is my contribution to Rust 2020, where community members put forward ideas on what they thing Rust should aim to achieve in 2020.

In my view, Rust has had an amazing adoption by developers, and is great if you are in a position to deploy it in your own infrastructure, but we have yet to really see Rust make it to broad low-level components (IE in a linux distro or other infrastructure).

As someone who works on “enterprise” software (389-ds) and my own IDM project (kanidm), there is a need to have software packaged and distributed. We can not ask our consumers to build and compile these tools. One could view it as a chain, where I develop software in a language, it’s packaged for a company (like SUSE), and then consumed by a customer (could be anyone!) who provides a service to others (indirect users).

Rust however has always been modeled that there is no “middle” section. You have either a developer who’s intent is to develop for other developers. This is where Rust ideas like becomes involved. Alternately, you have a larger example in firefox, where developers build a project and can “bundle” everything into a whole unit that is then distributed directly to customers.

The major difference is that in the intermediate distribution case, we have to take on different responsibilities such as security auditing, building, ensuring dependencies exist etc.

So this leads me to:

1: Cargo Vendor Needs Some Love

Cargo vendor today is quite confusing in some scenarios, and it’s not clear how to have it work for projects that require offline builds. I have raised issues about this, but largely they have been un-acted upon.

2: Cargo is Difficult to Use in Mixed Language Projects

A large value proposition of Rust is the ability to use it with FFI and C. This is great if you say have cargo compile your C code for you.

But most major existing projects don’t. They use autotools, cmake, or maybe even meson or more esoteric, waf (looking at you samba). Cargo’s extreme opinionation in this area makes it extremely difficult to integrate Rust into an existing build system reliably. It’s hard to point to one fault, as much as a broader “lack of concern” in the space, and I think cargo needs to evolve improvements to help allow Rust to be used from other build tools.

3: Rust Moves Too Fast

A lot of “slower” enterprise companies want to move slowly, including compiler versions. Admittedly, this conservative behaviour is because of the historical instability of gcc versions and how it can change or affect your code between releases. Rust doesn’t suffer this, but people are still wary of fast version changes. This means Rustc + Cargo will get pinned to some version that may be 6 months old.

However crate authors don’t consider this - they will use the latest and greatest features from stable (and sometimes still nightly … grrr) in releases. Multiple times I have found that on my development environment even with a 3 month old compiler, dependencies won’t build.

Compounding this, doesn’t distinguish a security release from a feature one. Crates also encourages continuall version bumping, rather than maintenence of versioned branches. IE version 0.4.3 of a crate with a security fix will become 0.4.4, but then a feature update to include try_from may go to 0.4.5 as it “adds” to the api, or they use it internally as a cleanup.

Part of this issue is that Rust applications need to be treated closer to docker, as static whole units where only the resulting binary is supported rather than the toolchain that built it. But that only works on pure Rust applications - any mixed C + Rust application will hit this issue due to the difference between a system Rust version and what crate dependencies publish.

So I think that from this it leads to:

3.1: Crates need to indicate a minimum supported compiler version

Rust has “toyed” with the idea of editions, but within 2018 we’ve seen new features like maybeuninit and try_from land, which within an “edition” caused crates to stop worked on older compilers.

As a result, editions I think is “too broad” and people will fear incrementing it, and Rust will add features without changing edition anyway. Instead Rust needs to consider following up on the minimum supported rust version flag RFC. Rust has made it pretty clear the only “edition” flag that matters is the rust compiler version based on crate developers and what they are releasing.

3.2: Rust Needs to Think “What’s Our End Goal”

Rust is still moving incredibly fast, and I think in a way we need to ask … when will Rust be finished? When will it as a language go from continually rapid growth to stable and known feature sets? The idea of Rust editions acts as though this has happened (saying we change only every few years) when this is clearly not the case. Rust is evolving release-on-release, every 6 weeks.

4: Zero Cost Needs to Factor in Human Cost

My final wish for Rust is that sometimes we are so obsessed with the technical desire for zero cost abstraction, that we forget the high human cost and barriers that can exist as a result making feature adoption challenging. Rust has had a great community that treats people very well, and I think sometimes we need to extend that into feature development, to really consider the human cognitive cost of a feature.

Summarised - what’s the benefit of a zero cost abstraction if people can not work out how to use it?


I want to see Rust become a major part of operating systems and how we build computer systems, but I think that we need to pace ourselves, improve our tooling, and have some better ideas around what Rust should look like.

Thu, 28 Nov 2019 00:00:00 +1000 <![CDATA[Recovering LVM when a device is missing with a cache pool lv]]> Recovering LVM when a device is missing with a cache pool lv

I had a heartstopping moment today: my after running a command lvm proudly annouced it had removed an 8TB volume containing all of my virtual machine backing stores.

Everyone, A short view back to the past …

I have a home server, with the configured storage array of:

  • 2x 8TB SMR (Shingled Magnetic Recording) archive disks (backup target)
  • 2x 8TB disks (vm backing store)
  • 2x 1TB nvme SSD (os + cache)

The vm backing store also had a lvm cache segment via the nvme ssds in a raid 1 configuration. This means that the 2x8TB drives are in raid 1, and a partition on each of the nvme devices are in raid 1, then they are composed to allow the nvme to cache blocks from/to the 8TB array.

Two weeks ago I noticed one of the nvme drives was producing IO errors indicating a fault of the device. Not wanting to risk corruption or other issues from growing out of hand, I immediately shutdown the machine and identified the nvme disk with the error.

At this stage I took the precaution of imaging (dd) both the good and bad nvme devices to the archive array. Subsequently I completed a secure erase of the faulty nvme drive before returning it to the vendor for RMA.

I then left the server offline as I was away from my home for more than a week and would not need, and was unable to monitor if the other drives would produce further errors.

Returning home …

I decided to ignore William of the past (always a bad idea) and to “break” the raid on the remaining nvme device so that my server could operate allowing me some options for work related tasks.

This is an annoying process in lvm - you need to remove the missing device from the volume group as well as indicating to the array that it should no longer be in a raid state. This vgreduce is only for removing missing PV’s, it shouldn’t be doing anything else.

I initiated the raid break process on the home, root and swap devices. The steps are:

vgreduce --removemissing <vgname>
lvconvert -m0 <vgname>/<lvname>

This occured without fault due to being present on an isolated “system” volume group, so the partial lvs were untouched and left on the remaining pv in the vg.

When I then initiated this process on the “data” vg which contained the libvirt backing store, vgreduce gave me the terrifying message:

Removing logical volume "libvirt_t2".

Oh no ~

Recovery Attempts

When a logical volume is removed, it can be recovered as lvm stores backups of the LVM metadata state in /etc/lvm/archive.

My initial first reaction was that I was on a live disk, so I needed to backup this content else it would be lost on reboot. I chose to put this on the unaffected, and healthy SMR archive array.

mount /dev/mapper/archive-backup /archive
cp -a /etc/lvm /archive/lvm-backup

At this point I knew that randomly attempting commands would cause further damage and likely prevent any ability to recover.

The first step was to activate ssh so that I could work from my laptop - rather than the tty with keyboard and monitor on my floor. It also means you can copy paste, which reduces errors. Remember, I’m booted on a live usb, which is why I reset the password.

# Only needed in a live usb.
systemctl start sshd

I then formulated a plan and wrote it out. This helps to ensure that I’ve thought through the recovery process and the risks, it helps be to be fully aware of the situation.

vim recovery-plan.txt

Into this I laid out the commands I would follow. Here is the plan:

bytes 808934440960

dd if=/dev/zero of=/mnt/lv_temp bs=4096 count=197493760
losetup /dev/loop10 /mnt/lv_temp
pvcreate --restorefile /etc/lvm/archive/ --uuid iC4G41-PSFt-6vqp-GC0y-oN6T-NHnk-ivssmg /dev/loop10
vgcfgrestore data --test --file /etc/lvm/archive/

Now to explain this: The situation we are in is:

  • We have a removed data/libvirt_t2 logical volume
  • The VG data is missing a single PV (nvme0). It still has three PVs (nvme1, sda1, sdb1).
  • We can not restore metadata unless all devices are present as per the vgcfgrestore man page.

This means, we need to make a replacement device to replace into the array, and then to restore the metadata with that.

The “bytes” section you see, is the size of the missing nvme0 partition that was a member of this array - we need to create a loopback device of the same or greater size to allow us to restore the metadata. (dd, losetup)

Once the loopback is created, we can then recreate the pv on the loopback device with the same UUID as the missing device.

Once this is present, we can now restore the metadata as documented which should contain the logical volume.

I ran these steps and it was all great until vgcfgrestore. I can not remember the exact error but it was along the lines of:

Unable to restore metadata as PV was missing for VG when last modification was performed.

Yep, the vgreduce command has changed the VG state, triggering a metadata backup, but because a device was missing at the time, we can not restore this metadata.

Options …

At this point I had to consider alternate options. I conducted research into this topic as well to see if others had encountered this case (no one has ever not been able to restore their metadata apparently in this case …). The options that I arrived at:

    1. Restore the metadata from the nvme /root as it has older (but known) states - however I had recently expanded the libvirt_t2 volume from a live disk, meaning it may not have the correct part sizes.
    1. Attempt to extract the xfs filesystem with DD from the disk to begin a data recovery.
    1. Cry in a corner
    1. Use lvcreate with the “same paramaters” and hope that it aligns the start at the same location as the former data/libvirt_t2 allowing the xfs filesystem to be accessed.

All of these weren’t good - especially not 3.

I opted to attempt solution 1, and then if that failed, I would disconnect one of the 8TB disks, attempt solution 4, then if that ALSO failed, I would then attempt 2, finally all else lost I would begin solution 3. The major risk of relying on 4 and 2 is that LVM has dynamic geometry on disk, it does not always allocate contiguously. This means that attempting 4 with lvcreate may not create with the same geometry, and it may write to incorrect locations causing dataloss. The risk of 2 was again, due to the dynamic geometry what we recover may be re-arranged and corrupt.

This mean option 1 was the best way to proceed.

I mounted the /root volume of the host and using the lvm archive I was able to now restore the metadata.

vgcfgrestore data --test --file /system/etc/lvm/archive/

Once completed I performed an lvscan to refresh what block devices were available. I was then shown that every member of the VG data had conflicting seqno, and that the metadata was corrupt and unable to proceed.

Somehow we’d made it worse :(

Successful Procedure

At this point, faced with 3 options that were all terrible, I started to do more research. I finally discovered a post describing that the lvm metadata is stored on disk in the same format as the .vg files in the archive, and it’s a ring buffer. We may be able to restore from these.

To do so, you must dd out of the disk into a file, and then manipulate the file to only contain a single metadata entry.

Remember how I made images of my disks before I sent them back? This was their time to shine.

I did do a recovery plan with these commands too, but it was more evolving due to the paramaters involved so it changed frequently with the offsets. The plan was very similar to above - use a loop device as a stand in for the missing block device, restore the metadata, and then go from there.

We know that LVM metadata occurs in the first section of the disk, just after the partition start. So to work out where this is we use gdisk to show the partitions in the backup image.

# gdisk /mnt/mion.nvme0n1.img
GPT fdisk (gdisk) version 1.0.4

Command (? for help): p
Disk /mnt/mion.nvme0n1.img: 2000409264 sectors, 953.9 GiB
Sector size (logical): 512 bytes

Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048         1026047   500.0 MiB   EF00
   2         1026048       420456447   200.0 GiB   8E00
   3       420456448      2000409230   753.4 GiB   8E00

It’s important to note the sector size flag, as well as the fact the output is in sectors.

The LVM header occupies 255 sectors after the start of the partition. So this in mind we can now create a dd command to extract the needed information.

dd if=/mnt/mion.nvme0n1.img of=/tmp/lvmmeta bs=512 count=255 skip=420456448

bs sets the sector size to 512, count will read from the start up to 255 sectors of size ‘bs’, and skip says to start reading after ‘skip’ * ‘sector’.

At this point, we can now copy this and edit the file:

cp /tmp/lvmmeta /archive/lvm.meta.edit

Within this file you can see the ring buffer of lvm metadata. You need to find the highest seqno that is a complete record. For example, my seqno = 20 was partial (is the lvm meta longer than 255, please contact me if you know!), but seqno=19 was complete.

Here is the region:

# ^ more data above.
# Generated by LVM2 version 2.02.180(2) (2018-07-19): Mon Nov 11 18:05:45 2019

contents = "Text Format Volume Group"
version = 1

description = ""

creation_host = "linux-p21s"    # Linux linux-p21s 4.12.14-lp151.28.25-default #1 SMP Wed Oct 30 08:39:59 UTC 2019 (54d7657) x86_64
creation_time = 1573459545      # Mon Nov 11 18:05:45 2019

^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@data {
id = "4t86tq-3DEW-VATS-1Q5x-nLLy-41pR-zEWwnr"
seqno = 19
format = "lvm2"

So from there you remove everything above “contents = …”, and clean up the vgname header. It should look something like this.

contents = "Text Format Volume Group"
version = 1

description = ""

creation_host = "linux-p21s"    # Linux linux-p21s 4.12.14-lp151.28.25-default #1 SMP Wed Oct 30 08:39:59 UTC 2019 (54d7657) x86_64
creation_time = 1573459545      # Mon Nov 11 18:05:45 2019

data {
id = "4t86tq-3DEW-VATS-1Q5x-nLLy-41pR-zEWwnr"
seqno = 19
format = "lvm2"

Similar, you need to then find the bottom of the segment (look for the next highest seqno) and remove everything below the line: “# Generated by LVM2 …”

Now, you can import this metadata to the loop device for the missing device. Note I had to wipe the former lvm meta segment due to the previous corruption, which caused pvcreate to refuse to touch the device.

dd if=/dev/zero of=/dev/loop10 bs=512 count=255
pvcreate --restorefile lvmmeta.orig.nvme1.edited --uuid iC4G41-PSFt-6vqp-GC0y-oN6T-NHnk-ivssmg /dev/loop10

Now you can do a dry run:

vgcfgrestore --test -f lvmmeta.orig.nvme1.edited data

And the real thing:

vgcfgrestore -f lvmmeta.orig.nvme1.edited data

Hooray! We have volumes! Let’s check them, and ensure their filesystems are sane:

lvchange -ay data/libvirt_t2
xfs_repair -n /dev/mapper/data-libvirt_t2

If xfs_repair says no errors, then go ahead and mount!

At this point, lvm started to resync the raid, so I’ll leave that to complete before I take any further action to detach the loopback device.

How to Handle This Next Time

The cause of this issue really comes from vgreduce –removemissing removing the device when a cache member can’t be found. I plan to report this as a bug.

However another key challenge was the inability to restore the lvm metadata when the metadata archive reported a missing device. This is what stopped me from being able to restore the array in the first place, even though I had a “fake” replacement. This is also an issue I intend to raise.

Next time I would:

  • Activate the array as a partial
  • Remove the cache device first
  • Then stop the raid
  • Then perform the vgreduction

I really hope this doesn’t happen to you!

Tue, 26 Nov 2019 00:00:00 +1000 <![CDATA[Upgrading OpenSUSE 15.0 to 15.1]]> Upgrading OpenSUSE 15.0 to 15.1

It’s a little bit un-obvious how to do this. You have to edit the repo files to change the release version, then refresh + update.

sed -ri 's/15\.0/15.1/' /etc/zypp/repos.d/*.repo
zypper ref
zypper dup

Note this works on a transactional host too:

sed -ri 's/15\.0/15.1/' /etc/zypp/repos.d/*.repo
transactional-update dup

It would be nice if these was an upgrade tool that would attempt the upgrade and revert the repo files, or use temporary repo files for the upgrade though. It would be a bit nicer as a user experience than sed of the repo files.

Wed, 25 Sep 2019 00:00:00 +1000 <![CDATA[Announcing Kanidm - A new IDM project]]> Announcing Kanidm - A new IDM project

Today I’m starting to talk about my new project - Kanidm. Kanidm is an IDM project designed to be correct, simple and scalable. As an IDM project we should be able to store the identities and groups of people, authenticate them securely to various other infrastructure components and services, and much more.

You can find the source for kanidm on github.

For more details about what the project is planning to achieve, and what we have already implemented please see the github.

What about 389 Directory Server

I’m still part of the project, and working hard on making it the best LDAP server possible. Kanidm and 389-ds have different goals. 389 Directory Server is a globally scalable, distributed database, that can store huge amounts of data and process thousands of operations per second. 389-ds let’s you build a system ontop, in any way you want. If you want an authentication system today, use 389-ds. We are even working on a self-service web portal soon too (one of our most requested features!). Besides my self, no one on the (amazing) 389 DS team has any association with kanidm (yet?).

Kanidm is an opinionated IDM system, and has strong ideas about how authentication and users should be processed. We aim to be scalable, but that’s a long road ahead. We also want to have more web integrations, client tools and more. We’ll eventually write a kanidm to 389-ds sync tool.

Why not integrate something with 389? Why something new?

There are a lot of limitations with LDAP when it comes to modern web-focused auth processes such as webauthn. Because of this, I wanted to make something that didn’t have the same limitations, and had different ideas about data storage and apis. That’s why I wanted to make something new in parallel. It was a really hard decision to want to make something outside of 389 Directory Server (Because I really do love the project, and have great pride in the team), but I felt like it was going to be more productive to build in parallel, than ontop.

When will it be ready?

I think that a single-server deployment will be usable for small installations early 2020, and a fully fledged system with replication would be late 2020. It depends on how much time I have and what parts I implement in what order. Current rough work order (late 2019) is indexing, RADIUS integration, claims, and then self-service/web ui.

Wed, 18 Sep 2019 00:00:00 +1000 <![CDATA[OpenSUSE leap as a virtualisation host]]> OpenSUSE leap as a virtualisation host

I’ve been rebuilding my network to use SUSE from CentOS, and the final server was my hypervisor. Most of the reason for this is the change in my employment, so I feel it’s right to dogfood for my workplace.

What you will need

  • Some computer parts (assembaly may be required)
  • OpenSUSE LEAP 15.1 media (dd if=opensuse.iso of=/dev/a_usb_i_hope)

What are we aiming for?

My new machine has dual NVME and dual 8TB spinning disks. The intent is to have the OS on the NVME and to have a large part of the NVME act as LVM cache for the spinning disk. The host won’t run any applications beside libvirt, and has to connect a number of vlans over a link aggregation.

Useful commands

Through out this document I’ll assume some details about your devices and partitions. To find your own, and to always check and confirm what you are doing, some command will help:

lsblk  # Shows all block storage devices, and how (if) they are mounted.
lvs  # shows all active logical volumes
vgs  # shows all active volume groups
pvs  # shows all active physical volumes
dmidecode  # show hardware information
ls -al /dev/disk/by-<ID TYPE>/  # how to resolve disk path to a uuid etc.

I’m going to assume you have devices like:

/dev/nvme0  # the first nvme of your system that you install to.
/dev/nvme1  # the second nvme, used later
/dev/sda    # Two larger block storage devices.


Install and follow the prompts. Importantly when you install you install to a single NVME, and choose transactional server + lvm, then put btrfs on the /root in the lvm. You want to partition such that there is free space still in the NVME - I left about 400GB unpartitioned for this. In other words, the disk should be like:

[ /efi | pv + vg system               | pv (unused) ]
       | /root (btrfs), /boot, /home  |

Remember to leave about 1GB of freespace on the vg system to allow raid 1 metadata later!

Honestly, it may take you a try or two to get this right with YaST, and it was probably the trickiest part of the install.

You should also select that network management is via networkmanager, not wicked. You may want to enable ssh here. I disabled the firewall personally because there are no applications and it interfers with the bridging for the vms.


Because this is suse transactional we need to add packages and reboot each time. Here is what I used, but you may find you don’t need everything here:

transactional-update pkg install libvirt libvirt-daemon libvirt-daemon-qemu \
  sssd sssd-ad sssd-ldap sssd-tools docker zsh ipcalc python3-docker rdiff-backup \
  vim rsync iotop tmux fwupdate fwupdate-efi bridge-utils qemu-kvm apcupsd

Reboot, and you are ready to partition.

Partitioning - post install

First, copy your gpt from the first NVME to the second. You can do this by hand with:

gdisk /dev/nvme0

gdisk /dev/nvme1
<duplicate the parameters as required>

Now we’ll make your /efi at least a little redundant

mkfs.fat /dev/nvme1p0
ls -al /dev/disk/by-uuid/
# In the above, look for your new /efi fs, IE CE0A-2C1D -> ../../nvme1n1p1
# Now add a line to /etc/fstab like:
UUID=CE0A-2C1D    /boot/efi2              vfat   defaults                      0  0

Now to really make this work, because it’s transactional, you have to make a change to the /root, which is readonly! To do this run

transactional-update shell dup

This put’s you in a shell at the end. Run:

mkdir /boot/efi2

Now reboot. After the reboot your second efi should be mounted. rsync /boot/efi/* to /boot/efi2/. I leave it to the reader to decide how to sync this periodically.

Next you can setup the raid 1 mirror for /root and the system vg.

pvcreate /dev/nvme1p1
vgextend system /dev/nvme1p1

Now we have enough pvs to make a raid 1, so we convert all our volumes:

lvconvert --type raid1 --mirrors 1 system/home
lvconvert --type raid1 --mirrors 1 system/root
lvconvert --type raid1 --mirrors 1 system/boot

If this fails with “not enough space to alloc metadata”, it’s because you didn’t leave space on the vg during install. Honestly, I made this mistake twice due to confusion about things leading to two reinstalls …

Getting ready to cache

Now lets get ready to cache some data. We’ll make pvs and vgs for data:

pvcreate /dev/nvme0p2
pvcreate /dev/nvme1p2
pvcreate /dev/sda1
pvcreate /dev/sda2
vgcreate data /dev/nvme0p2 /dev/nvme1p2 /dev/sda1 /dev/sdb2

Create the larger volume

lvcreate --type raid1 --mirrors 1 -L7.5T -n libvirt_t2 data /dev/sda1 /dev/sdb1

Prepare the caches

lvcreate --type raid1 --mirrors 1 -L 4G -n libvirt_t2_meta data
lvcreate --type raid1 --mirrors 1 -L 400G -n libvirt_t2_cache data
lvconvert --type cache-pool --poolmetadata data/libvirt_t2_meta data/libvirt_t2_cache

Now put the caches in front of the disks. It’s important for you to check you have the correct cachemode at this point, because you can’t change it without removing and re-adding the cache. I choose writeback because my nvme devices are in a raid 1 mirror, and it helps to accelerate writes. You may err to use the default where the SSD’s are read cache only.

lvconvert --type cache --cachemode writeback --cachepool data/libvirt_t2_cache data/libvirt_t2
mkfs.xfs /dev/mapper/data-libvirt_t2

You can monitor the amount of “cached” data in the data column of lvs.

Now you can add this to /etc/fstab as any other xfs drive. I mounted it to /var/lib/libvirt/images.

Network Manager

Now, I have to assemble the network bridges. Network Manager has some specific steps to follow to achieve this. I have:

  • two intel gigabit ports
  • the ports are link aggregated by 802.3ad
  • there are multiple vlans ontop of the link agg
  • bridges must be built on top of the vlans

This requires a specific set of steps to layer this, because network manager sees the bridge and the lagg as seperate things that require the vlan to tie them together.

Configure the link agg, and add our two ethernet phys

nmcli conn add type bond con-name bond0 ifname bond0 mode 802.3ad ipv4.method disabled ipv6.method ignore
nmcli connection add type ethernet con-name bond0-eth1 ifname eth1 master bond0 slave-type bond
nmcli connection add type ethernet con-name bond0-eth2 ifname eth2 master bond0 slave-type bond

Add a bridge for a vlan:

nmcli connection add type bridge con-name net_18 ifname net_18 ipv4.method disabled ipv6.method ignore

Now tie together a vlan on the bond, to the bridge we created.

nmcli connection add type vlan con-name bond0.18 ifname bond0.18 dev bond0 id 18 master net_18 slave-type bridge

You will need to repeat these last two commands as required for the vlans you have.

House Keeping

Finally you need to do some house keeping. Transactional server will automatically reboot and update so you need to be ready for this. You may disable this with:

systemctl disable transactional-update.timer

You likely want to edit:


To be able to handle guest shutdown policy due to a UPS failure or a reboot.

Now you can enable and start libvirt:

systemctl enable libvirtd
systemctl start libvirtd

Finally you can build and import virtualmachines.

Mon, 02 Sep 2019 00:00:00 +1000 <![CDATA[LDAP Filter Syntax Validation]]> LDAP Filter Syntax Validation

Today I want to do a deep-dive into a change that will be released in 389 Directory Server 1.4.2. It’s a reasonably complicated change for our server, but it has a simple user interaction for admins and developers. I want to peel back some of the layers to explain what kind of experience, thought and team work goes into a change like this.

TL;DR - just keep upgrading your 389 Directory Server instance, and our ‘correct by default’ policy will apply, and you’ll keep having the best LDAP server we can make :)

LDAP Filters and How They Work

LDAP filters are one of the primary methods of expression in LDAP, and are used in almost every aspect of the system - from finding who you are when you login, to asserting you are member of a group or have other security attributes.

For the purposes of this discussion we’ll look at this filter:


In order to execute these queries quickly (LDAP is designed to handle thousands of operations per second) we heavily rely on indexing. Indexing is often a topic where people believe it to be some kind of “magic” but it’s reasonably simple: indexes are pre-computed partial result sets. So why do we need these?

We’ll imagine we have two entries (invalid, and truncated for brevity).

dn: cn=william,...
cn: william

dn: cn=claire,...
cn: claire

These entries both have entry-ids - these id’s are per-server in a replication group and are integers. You can show them by requesting entryid as an attribute in 389.

dn: cn=william,...
entryid: 1
cn: william

dn: cn=claire,...
entryid: 2
cn: claire

Our entries are stored in the main-entry database in /var/lib/dirsrv/slapd-standalone1/db/userRoot in the file “id2entry.db4”. This is a key-value database where the keys are the entryid, and the value is the serialised entry itself. Roughly, it’s:

[ ID ][ Entry             ]
  1     dn: cn=william,...
        cn: william

  2     dn: cn=claire,...
        cn: claire

Now, if we had NO indexes, to evaluate our filters we have to scan every entry of id2entry to determine if the filter matches. This algorithm is:

candidate_set = []
for id in id-min to id-max:
    entry = load_entry_by_id(id)
    if apply_filter(filter, entry):

For two entries, this may be fast, but when you have 1000, 10.000, or even millions, this is extremely slow. We call these searches full table scans or in 389 DS, ALLIDS searches.

To make our searches faster we have indexes. An index is a mapping of a partial query term to an id list (In 389 we call these IDLs). An IDL is a set of integers. Our index for these examples would likely be something like:

=william: [1, ]
=claire: [2, ]

These indexes are also stored in key-value databases in userRoot - you can see this as cn.db4.

So when we have an indexed term, to evaluate the query, we’ll load the indexes, then using mathematical set operations, we then produce a candidate_id_set, and we can then load the entries that only match.

For example in psuedo python code:

# Assume query is: (cn=william)

attr = filter.get_attr_name()
with open('%s.db' % attr) as index:
    idl = index.get('=william') # from the filter :)

for id in idl:
    ... # as before.

So we can see now that when we load the idl for cn index, this would give us the set [1, ]. Even if the database had 100 million entries, as our idl is a single value, we only need to load the one entry that matches. Neat!

When we have a more complex operation such as AND and OR, we can now manipulate the idl sets. For example:

   uid =claire -> idl [2, ]
   uid =william -> idl [1, ]

candidate_idl_set = union([2, ], [1, ])
# [1, 2]

This means again, even with millions of entries, we only need to load entry 1 and 2 to uphold the query provided to us.

So we finally know enough to understand how our example query is executed. PHEW!

Unindexed Attributes

However, it’s not always so easy. When we have an attribute that isn’t indexed, we have to handle this situation. In these cases, while we operate on the idl set, we may insert an idl with the value of ALLIDS (which as previously mentioned, is the “set of all entries”). This can have various effects.

If this was an AND query, we can annotate that the filter is partially resolved. This means that if we had:


Because an AND condition, both filter components must be satisfied, we have a partial candidate set from cn=william of [1, ]. We can load this partial candidate set, and then apply the filter test as in the full table scan case, but as we only apply it to a single entry this is really fast.

The real problem is OR queries. If we had:


Because OR means that both filter components could be satisfied, we have to turn unindexd into ALLIDS, and the result of the OR as a whole is ALLIDS. So even if we have 30 indexed values in the OR, a single ALLIDS (unindexed value) will always turn that OR into a full table scan. This is not good for performance!

Missing Attributes

So as a weirder case … what if the attribute doesn’t exist in schema at all? For example we could search for Microsoft AD attributes in 389 Directory Server, or we could submit bogus filters like “(whargarble=foo)”. What happens here?

Well, historically we treated these the same as unindexed queries. Which means that any term that is not in schema, would be treated as ALLIDS. This led to a “quitely known” denial of service attack again 389 Directory Server where you could emit a large number of queries for attributes that don’t exist, causing the server to attempt many ALLIDS scans. We have some defences like the allids limit (how many entries you can full table scan before giving up). But it can still cause entry cache churn and other performance issues.

I was first made aware of this issue in 2014 while working for University of Adelaide where our VMWare service would query LDAP for MS attributes, causing a large performance issue. We resolved this by adding the MS attributes to schema and indexing them so that they would create empty indexes - now we would call this in 389 Directory Server and “idl_alloc(0)” or “empty IDL”.

When initially hired by Red Hat in 2015 I knew this issue existed but I didn’t know enough about the server core to really fix it, so it went in the back of my mind … it was rare to have a customer raise this issue, but we had the work around and so I was able to advise support services on how to mitigate this.

In 2019 however, while investigating an issue related to filter optimisation, I was made aware of an issue with FreeIPA where they were doing certmap queries that requested MS Cert attributes. However it would cause large performance issues. We now had the issue again, and in a large widely installed product so it was time to tackle it.

How to handle this?

A major issue in this is “never breaking customers”. Because we had always supported this behaviour there is a risk that any solution would cause customer queries to “silently” begin to break if we issued a fix or change. More accurately, any change to how we execute the filters could cause results of the filters to change, which would disrupt customers.

Saying this, there is also precedent that 389 Directory Server was handling this incorrectly. From the RFC for LDAP it was noted:

Any assertion about the values of such an attribute is only defined if the AttributeType is known by the evaluating mechanism, the purported AttributeValue(s) conforms to the attribute syntax defined for that attribute type, the implied or indicated matching rule is applicable to that attribute type, and (when used) a presented matchValue conforms to the syntax defined for the indicated matching rules. When these conditions are not met, the FilterItem shall evaluate to the logical value UNDEFINED. An assertion which is defined by these conditions additionally evaluates to UNDEFINED if it relates to an attribute value and the attribute type is not present in an attribute against which the assertion is being tested. An assertion which is defined by these conditions and relates to the presence of an attribute type evaluates to FALSE.

Translation: If a filter component (IE nonexist=foo) is in a filter but NOT in the schema, the result of the filter’s evaluation is an empty-set aka undefined.

It was also clear that if an engaged and active consumer like FreeIPA is making this mistake, then it must be overlooked by many others without notice. So there is sometimes value in helping to raise the standard so that everyone benefits, and highlight mistakes quicker.

The Technical Solution

This is the easy part - we add a new configuration option with three states. “on”, “off”, “warn”. “On” would enable the strictest handling of filters, rejecting them an not continuing if any attribute requested was not in the schema. “Warn” would provide the rfc compliant behaviour, mapping to empty-set index, and notifying in the logs that this occured. Finally, “off” would be the previous “silently allow” behaviour.

This was easily achieved in filter parsing, by checking the attribute of each filter component against our schema hashmap. We then tag the filter element, and depending on the current setting level reject or continue.

In the query execution code, we now check the filter tag to understand if the attribute is schema present or not. If it’s flagged as “undefined”, then we immediately shortcut to return idl_alloc(0) instead of returning ALLIDS on the failure to find the relevant index db.

We can show the performance impact of this change:

Before with non-existant attribute

Average rate:    7.40/thr

After with “warn” enabled (map to empty set)

Average rate: 4808.70/thr

This is a huge improvement, and certainly shows the risk of DOS and how effective the solution was!

The Social Solution

Of course, this is the hard part - the 389 Directory Server team are all amazingly smart people, from many countries, and all live around the world. They all care that the server is the best possible, and that our standards as a team are high. That means when introducing a change that has a risk of affecting query result sets like this, they pay attention, and ask many difficult questions about how the feature will be implemented.

The first important justification - is a change like this worth while? We can see from the performance results that the risk of DOS is reasonable, so the answer there becomes Yes from a security view. But it’s also important to consider the cost on consumers - is this change going to benefit FreeIPA for example? As I am biased being the author I want to say “yes” - by notifying or rejecting invalid filters earlier, we can help FreeIPA developers improve their code quality, without expecting them to know LDAP inside and out.

The next major question is performance - before the feature was developed there is clearly a risk of DOS, but when we implement this we are performing additional locking on the schema. Is that a risk to our standalone performance or normal operating conditions. This had to also be discussed and assessed.

A really important point that was raised by Thierry is how we communicated these errors too. Previously we would use the “notes=” field of the access log. It looks like this:

conn=1 op=4 RESULT err=0 tag=101 nentries=13 etime=0.0003795424 notes=U

The challenge with the notes= field, is that it’s easy to overlook, and unless you are familar, hard to see what this is indicating. In this case, notes=U means partially unindexed query (one filter component but not all returned ALLIDS).

We can’t change the notes field due to the risk of breaking our own scripts like, support tools developed by RH or SUSE, or even integrations to platforms like splunk. But clearly we need a way to detail what is happening with your filter. So Thierry suggested an extension to have details about the provided notes. Now we get:

conn=1 op=4 RESULT err=0 tag=101 nentries=13 etime=0.0003795424 notes=U details="Partially Unindexed Filter"
conn=1 op=8 RESULT err=0 tag=101 nentries=0 etime=0.0001886208 notes=F details="Filter Element Missing From Schema"

So we have extended our log message, but without breaking existing integrations.

The final question is what our defaults should be. It’s one thing to have this feature, but what should we ship with? Do we strictly reject filters? Warn? Or disable, and expect people to turn this on.

This became a long discussion with Ludwig, Thierry and I - we discussed the risk of DOS in the first place, what the impact of the levels could be, how it could break legacy applications or sites using deprecated features or with weird data imports. Many different aspects were considered. We decided to default to “warn” (non-existant becomes empty-set), and we settled on communication with support to advise them of the upcoming change, but also we considered that our “back out” plan is to change the default and ship a patch if there is a large volume of negative feedback.


As of today, the PR is merged, and the code on it’s way to the next release. It’s a long process but the process exists to ensure we do what’s best for our users, while we try to balance many different aspects. We have a great team of people, with decades of experience from many backgrounds which means that these discussions can be long and detailed, but in the end, we hope to give what is the best product possible to our community.

It’s also valuable to share how much thought and effort goes into projects - in your life you may only interact with 1% of our work through our configuration and system, but we have an iceberg of decisions and design process that affects you every day, where we have to be responsible and considerate in our actions.

I hope you enjoyed this exploration of this change!



Thu, 29 Aug 2019 00:00:00 +1000 <![CDATA[Using ramdisks with Cargo]]> Using ramdisks with Cargo

I have a bit of a history of killing SSDs - probably because I do a bit too much compiling and management of thousands of tiny files. Plenty of developers have this problem! So while thinking one evening, I was curious if I could setup a ramdisk on my mac for my cargo work to output to.

Making the ramdisk

On Linux you’ll need to use tmpfs or some access to /dev/shm.

On OSX you need to run a script like the following:

diskutil partitionDisk $(hdiutil attach -nomount ram://4096000) 1 GPTFormat APFS 'ramdisk' '100%'

This creates and mounts a 4GB ramdisk to /Volumes/ramdisk. Make sure you have enough ram!

Asking cargo to use it

We probably don’t want to make our changes permant in Cargo.toml, so we’ll use the environment:

CARGO_TARGET_DIR=/Volumes/ramdisk/rs cargo ...

Does it work?


Disk Build (SSD, 2018MBP)

Finished dev [unoptimized + debuginfo] target(s) in 2m 29s

4 GB APFS ramdisk

Finished dev [unoptimized + debuginfo] target(s) in 1m 53s

For me it’s more valuable to try and save those precious SSD write cycles, so I think I’ll try to stick with this setup. You can see how much rust writes by doing a clean + build. My project used the following:

Filesystem                             Size   Used  Avail Capacity    iused               ifree %iused  Mounted on
/dev/disk110s1                        2.0Gi  1.2Gi  751Mi    63%       3910 9223372036854771897    0%   /Volumes/ramdisk

Make it permanent

Put the following in /Library/LaunchDaemons/

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "">
<plist version="1.0">

And the following into /usr/local/libexec/

diskutil partitionDisk $(hdiutil attach -nomount ram://4096000) 1 GPTFormat APFS 'ramdisk' '100%'

Finally put this in your cargo file of choice

target-dir = "/Volumes/ramdisk/rs"

Future william will need to work out if there are negative consequences to multiple cargo projects sharing the same target directory … hope not!

Launchctl tips

# Init the service
launchctl load /Library/LaunchDaemons/


Mon, 26 Aug 2019 00:00:00 +1000 <![CDATA[CPU atomics and orderings explained]]> CPU atomics and orderings explained

Sometimes the question comes up about how CPU memory orderings work, and what they do. I hope this post explains it in a really accessible way.

Short Version - I wanna code!

Summary - The memory model you commonly see is from C++ and it defines:

  • Relaxed
  • Acquire
  • Release
  • Acquire/Release (sometimes AcqRel)
  • SeqCst

There are memory orderings - every operation is “atomic”, so will work correctly, but there rules define how the memory and code around the atomic are influenced.

If in doubt - use SeqCst - it’s the strongest guarantee and prevents all re-ordering of operations and will do the right thing.

The summary is:

  • Relaxed - no ordering guarantees, just execute the atomic as is.
  • Acquire - all code after this atomic, will be executed after the atomic.
  • Release - all code before this atomic, will be executed before the atomic.
  • Acquire/Release - both Acquire and Release - ie code stays before and after.
  • SeqCst - Stronger consistency of Acquire/Release.

Long Version … let’s begin …

So why do we have memory and operation orderings at all? Let’s look at some code to explain:

let mut x = 0;
let mut y = 0;
x = x + 3;
y = y + 7;
x = x + 4;
x = y + x;

Really trivial example - now to us as a human, we read this and see a set of operations that are linear by time. That means, they execute from top to bottom, in order.

However, this is not how computers work. First, compilers will optimise your code, and optimisation means re-ordering of the operations to achieve better results. A compiler may optimise this to:

let mut x = 0;
let mut y = 0;
// Note removal of the x + 3 and x + 4, folded to a single operation.
x = x + 7
y = y + 7;
x = y + x;

Now there is a second element. Your CPU presents the illusion of running as a linear system, but it’s actually an asynchronous, out-of-order task execution engine. That means a CPU will reorder your instructions, and may even run them concurrently and asynchronously.

For example, your CPU will have both x + 7 and y + 7 in the pipeline, even though neither operation has completed - they are effectively running at the “same time” (concurrently).

When you write a single thread program, you generally won’t notice this behaviour. This is because a lot of smart people write compilers and CPU’s to give the illusion of linear ordering, even though both of them are operating very differently.

Now we want to write a multithreaded application. Suddenly this is the challenge:

We write a concurrent program, in a linear language, executed on a concurrent asynchronous machine.

This means there is a challenge is the translation between our mind (thinking about the concurrent problem), the program (which we have to express as a linear set of operations), which then runs on our CPU (an async concurrent device).

Phew. How do computers even work in this scenario?!

Why are CPU’s async?

CPU’s have to be async to be fast - remember spectre and meltdown? These are attacks based on measuring the side effects of CPU’s asynchronous behaviour. While computers are “fast” these attacks will always be possible, because to make a CPU synchronous is slow - and asynchronous behaviour will always have measurable side effects. Every modern CPU’s performance is an illusion of async forbidden magic.

A large portion of the async behaviour comes from the interaction of the CPU, cache, and memory.

In order to provide the “illusion” of a coherent synchronous memory interface there is no seperation of your programs cache and memory. When the cpu wants to access “memory” the CPU cache is utilised transparently and will handle the request, and only on a cache miss, will we retrieve the values from RAM.

(Aside: in almost all cases more CPU cache, not frequency will make your system perform better, because a cache miss will mean your task stalls waiting on RAM. Ohh no!)

CPU -> Cache -> RAM

When you have multiple CPU’s, each CPU has it’s own L1 cache:

CPU1 -> L1 Cache -> |              |
CPU2 -> L1 Cache -> | Shared L2/L3 | -> RAM
CPU3 -> L1 Cache -> |              |
CPU4 -> L1 Cache -> |              |

Ahhh! Suddenly we can see where problems can occur - each CPU has an L1 cache, which is transparent to memory but unique to the CPU. This means that each CPU can make a change to the same piece of memory in their L1 cache without the other CPU knowing. To help explain, let’s show a demo.

CPU just trash my variables

We’ll assume we now have two threads - my code is in rust again, and there is a good reason for the unsafes - this code really is unsafe!

// assume global x: usize = 0; y: usize = 0;

THREAD 1                        THREAD 2

if unsafe { *x == 1 } {          unsafe {
    unsafe { *y += 1 }              *y = 10;
}                                   *x = 1;

At the end of execution, what state will X and Y be in? The answer is “it depends”:

  • What order did the threads run?
  • The state of the L1 cache of each CPU
  • The possible interleavings of the operations.
  • Compiler re-ordering

In the end the result of x will always be 1 - because x is only mutated in one thread, the caches will “eventually” (explained soon) become consistent.

The real question is y. y could be:

  • 10
  • 11
  • 1

10 - This can occur because in thread 2, x = 1 is re-ordered above y = 10, causing the thread 1 “y += 1” to execute, followed by thread 2 assign 10 directly to y. It can also occur because the check for x == 1 occurs first, so y += 1 is skipped, then thread 2 is run, causing y = 10. Two ways to achieve the same result!

11 - This occurs in the “normal” execution path - all things considered it’s a miracle :)

1 - This is the most complex one - The y = 10 in thread 2 is applied, but the result is never sent to THREAD 1’s cache, so x = 1 occurs and is made available to THREAD 1 (yes, this is possible to have different values made available to each cpu …). Then thread 1 executes y (0) += 1, which is then sent back trampling the value of y = 10 from thread 2.

If you want to know more about this and many other horrors of CPU execution, Paul McKenny is an expert in this field and has many talks at LCA and others on the topic. He can be found on twitter and is super helpful if you have questions.

So how does a CPU work at all?

Obviously your system (likely a multicore system) works today - so it must be possible to write correct concurrent software. Cache’s are kept in sync via a protocol called MESI. This is a state machine describing the states of memory and cache, and how they can be synchronised. The states are:

  • Modified
  • Exclusive
  • Shared
  • Invalid

What’s interesting about MESI is that each cache line is maintaining it’s own state machine of the memory addresses - it’s not a global state machine. To coordinate CPU’s asynchronously message each other.

A CPU can be messaged via IPC (Inter-Processor-Communication) to say that another CPU wants to “claim” exclusive ownership of a memory address, or to indicate that it has changed the content of a memory address and you should discard your version. It’s important to understand these messages are asynchronous. When a CPU modifies an address it does not immediately send the invalidation message to all other CPU’s - and when a CPU recieves the invalidation request it does not immediately act upon that message.

If CPU’s did “synchronously” act on all these messages, they would be spending so much time handling IPC traffic, they would never get anything done!

As a result, it must be possible to indicate to a CPU that it’s time to send or acknowledge these invalidations in the cache line. This is where barriers, or the memory orderings come in.

  • Relaxed - No messages are sent or acknowledged.
  • Release - flush all pending invalidations to be sent to other CPUS
  • Acquire - Acknowledge and process all invalidation requests in my queue
  • Acquire/Release - flush all outgoing invalidations, and process my incomming queue
  • SeqCst - as AcqRel, but with some other guarantees around ordering that are beyond this discussion.

Understand a Mutex

With this knowledge in place, we are finally in a position to understand the operations of a Mutex

// Assume mutex: Mutex<usize> = Mutex::new(0);

THREAD 1                            THREAD 2

{                                   {
    let guard = mutex.lock()            let guard = mutex.lock()
    *guard += 1;                        println!(*guard)
}                                   }

We know very clearly that this will print 1 or 0 - it’s safe, no weird behaviours. Let’s explain this case though:


    let guard = mutex.lock()
    // Acquire here!
    // All invalidation handled, guard is 0.
    // Compiler is told "all following code must stay after .lock()".
    *guard += 1;
    // content of usize is changed, invalid req is queue
// Release here!
// Guard goes out of scope, invalidation reqs sent to all CPU's
// Compiler told all proceeding code must stay above this point.

            THREAD 2

                let guard = mutex.lock()
                // Acquire here!
                // All invalidations handled - previous cache of usize discarded
                // and read from THREAD 1 cache into S state.
                // Compiler is told "all following code must stay after .lock()".
            // Release here!
            // Guard goes out of scope, no invalidations sent due to
            // no modifications.
            // Compiler told all proceeding code must stay above this point.

And there we have it! How barriers allow us to define an ordering in code and a CPU, to ensure our caches and compiler outputs are correct and consistent.

Benefits of Rust

A nice benefit of Rust, and knowing these MESI states now, we can see that the best way to run a system is to minimise the number of invalidations being sent and acknowledged as this always causes a delay on CPU time. Rust variables are always mutable or immutable. These map almost directly to the E and S states of MESI. A mutable value is always exclusive to a single cache line, with no contention - and immutable values can be placed into the Shared state allowing each CPU to maintain a cache copy for higher performance.

This is one of the reasons for Rust’s amazing concurrency story is that the memory in your program map to cache states very clearly.

It’s also why it’s unsafe to mutate a pointer between two threads (a global) - because the cache of the two cpus’ won’t be coherent, and you may not cause a crash, but one threads work will absolutely be lost!

Finally, it’s important to see that this is why using the correct concurrency primitives matter - it can highly influence your cache behaviour in your program and how that affects cache line contention and performance.

For comments and more, please feel free to email me!

Shameless Plug

I’m the author and maintainer of Conc Read - a concurrently readable datastructure library for Rust. Check it out on!

Tue, 16 Jul 2019 00:00:00 +1000 <![CDATA[I no longer recommend FreeIPA]]> I no longer recommend FreeIPA

It’s probably taken me a few years to write this, but I can no longer recommend FreeIPA for IDM installations.

Why not?

The FreeIPA project focused on Kerberos and SSSD, with enough other parts glued on to look like a complete IDM project. Now that’s fine, but it means that concerns in other parts of the project are largely ignored. It creates design decisions that are not scalable or robust.

Due to these decisions IPA has stability issues and scaling issues that other products do not.

To be clear: security systems like IDM or LDAP can never go down. That’s not acceptable.

What do you recommend instead?

  • Samba with AD
  • AzureAD
  • 389 Directory Server

All of these projects are very reliable, secure, scalable. We have done a lot of work into 389 to improve our out of box IDM capabilities too, but there is more to be done too. The Samba AD team have done great things too, and deserve a lot of respect for what they have done.

Is there more detail than this?

Yes - buy me a drink and I’ll talk :)

Didn’t you help?

I tried and it was not taken on board.

So what now?

Hopefully in the next year we’ll see new IDM projects for opensource released that have totally different approachs to the legacy we currently ride upon.

Wed, 10 Jul 2019 00:00:00 +1000 <![CDATA[Using 389ds with docker]]> Using 389ds with docker

I’ve been wanting to containerise 389 Directory Server for a long time - it’s been a long road to get here, but I think that our container support is getting very close to a production ready and capable level. It took so long due to health issues and generally my obsession to do everything right.

Today, container support along with our new command line tools makes 389 a complete breeze to administer. So lets go through an example of a deployment now.

Please note: the container image here is a git-master build and is not production ready as of 2019-07, hopefully this changes soon.

Getting the Container

docker pull firstyear/389ds:latest

If you want to run an ephemeral instance (IE you will LOSE all your data on a restart)

docker run firstyear/389ds:latest

If you want your data to persist, you need to attach a volume at /data:

docker volume create 389ds_data
docker run -v 389ds_data:/data firstyear/389ds:latest

The image exposes ports 3389 and 3636, so you may want to consider publishing these if you want external access.

The container should now setup and run an instance! That’s it, LDAP has never been easier to deploy!

Actually Adding Some Data …

LDAP only really matters if we have some data! So we’ll create a new backend. You need to run these instructions inside the current container, so I prefix these with:

docker exec -i -t <name of container> <command>
docker exec -i -t 389inst dsconf ....

This uses the ldapi socket via /data, and authenticates you based on your process uid to map you to the LDAP administrator account - basically, it’s secure, host only admin access to your data.

Now you can choose any suffix you like, generally based on your dns name (IE I use dc=blackhats,dc=net,dc=au).

dsconf localhost backend create --suffix dc=example,dc=com --be-name userRoot
> The database was sucessfully created

Now fill in the suffix details into your configuration of the container. You’ll need to find where docker stores the volume on your host for this (docker inspect will help you). My location is listed here:

vim /var/lib/docker/volumes/389ds_data/_data/config/container.inf

--> change
# basedn ...
--> to
basedn = dc=example,dc=com

Now you can populate data into that: The dsidm command is our tool to manage users and groups of a backend, and it can provide initialised data which has best-practice aci’s, demo users and groups and starts as a great place for you to build an IDM system.

dsidm localhost initialise

That’s it! You can now see you have a user and a group!

dsidm localhost user list
> demo_user
dsidm localhost group list
> demo_group

You can create your own user:

dsidm localhost user create --uid william --cn William --displayName 'William Brown' --uidNumber 1000 --gidNumber 1000 --homeDirectory /home/william
> Successfully created william
dsidm localhost user get william

It’s trivial to add an ssh key to the user:

dsidm localhost user modify william add:nsSshPublicKey:AAA...
> Successfully modified uid=william,ou=people,dc=example,dc=com

Or to add them to a group:

dsidm localhost group add_member demo_group uid=william,ou=people,dc=example,dc=com
> added member: uid=william,ou=people,dc=example,dc=com
dsidm localhost group members demo_group
> dn: uid=william,ou=people,dc=example,dc=com

Finally, we can even generale config templates for your applications:

dsidm localhost client_config sssd.conf
dsidm localhost client_config ldap.conf
dsidm localhost client_config display

I’m happy to say, LDAP administration has never been easier - we plan to add more functionality to enabled broader ranges of administrative tasks, especially in the IDM area and management of the configuration. It’s honestly hard to beleve that in a shortlist of commands you can now have a fully functional LDAP IDM solution working.

Fri, 05 Jul 2019 00:00:00 +1000 <![CDATA[Implementing Webauthn - a series of complexities …]]> Implementing Webauthn - a series of complexities …

I have recently started to work on a rust webauthn library, to allow servers to be implemented. However, in this process I have noticed a few complexities to an API that should have so much promise for improving the state of authentication. So far I can say I have not found any cryptographic issues, but the design of the standard does raise questions about the ability for people to correctly implement Webauthn servers.

Odd structure decisions

Webauth is made up of multiple encoding standards. There is a good reason for this, which is that the json parts are for the webbrowser, and the cbor parts are for ctap and the authenticator device.

However, I quickly noticed an issue in the Attestation Object, as described here . Can you see the problem?

The problem is that the Authenticator Data relies on hand-parsing bytes, and has two structures that are concatenated with no length. This means:

  • You have to hand parse bytes 0 -> 36
  • You then have to CBOR deserialise the Attested Cred Data (if present)
  • You then need to serialise the ACD back to bytes and record that length (if your library doesn’t tell you how long the amount of data is parsed was).
  • Then you need to CBOR deserialise the Extensions.

What’s more insulting about this situation is that the Authenticator Data literally is part of the AttestationObject which is already provided as CBOR! There seems to be no obvious reason for this to require hand-parsing, as the Authenticator Data which will be signature checked, has it’s byte form checked, so you could have the AttestationObject store authDataBytes, then you can CBOR decode the nested structure (allowing the hashing of the bytes later).

There are many risks here because now you have requirements to length check all the parameters which people could get wrong - when CBOR would handle this correctly for you you and provides a good level of correctness that the structure is altered. I also trust the CBOR parser authors to do proper length checks too compared to my crappy byte parsing code!

Confusing Naming Conventions and Layout

The entire standard is full of various names and structures, which are complex, arbitrarily nested and hard to see why they are designed this way. Perhaps it’s a legacy compatability issue? More likely I think it’s object-oriented programming leaking into the specification, which is a paradigm that is not universally applicable.

Regardless, it would be good if the structures were flatter, and named better. There are many confusing structure names throughout the standard, and it can sometimes be hard to identify what you require and don’t require.

Additionally, naming of fields and their use, uses abbrivations to save bandwidth, but makes it hard to follow. I did honestly get confused about the difference between rp (the relying party name) and rp_id, where the challenge provides rp, and the browser response use rp_id.

It can be easy to point fingers and say “ohh William, you’re just not reading it properly and are stupid”. Am I? Or is it that humans find it really hard to parse data like this, and our brains are better suited to other tasks? Human factors are important to consider in specification design both in naming of values, consistency of their use, and appropriate communication as to how they are used properly. I’m finding this to be a barrier to correct implementation now (especially as the signature verification section is very fragmented and hard to follow …).

Crypto Steps seem complex or too static

There are a lot of possible choices here - there are 6 attestation formats and 5 attestation types. As some formats only do some types, there are then 11 verification paths you need to implement for all possible authenticators. I think this level of complexity will lead to mistakes over a large number of possible code branch paths, or lacking support for some device types which people may not have access to.

I think it may have been better to limit the attestation format to one, well defined format, and within that to limit the attestation types available to suit a more broad range of uses.

It feels a lot like these choice are part of some internal Google/MS/Other internal decisions for high security devices, or custom deviges, which will be internally used. It’s leaked into the spec and it raises questions about the ability for people to meaningfully implement the full specification for all possible devices, let alone correctly.

Some parts even omit details in a cryptographic operation, such as here in verification step 2, it doesn’t even list what format the bytes are. (Hint: it’s DER x509).

What would I change?

  • Be more specific

There should be no assumptions about format types, what is in bytes. Be verbose, detailed and without ambiguity.

  • Use type safe, length checked structures.

I would probably make the entire thing a single CBOR structure which contains other nested structures as required. We should never have to hand-parse bytes in 2019, especially when there is a great deal of evidence to show the risks of expecting people to do this.

  • Don’t assume object orientation

I think simpler, flatter structures in the json/cbor would have helped, and been clearer to implement, rather than the really complex maze of types currently involved.


Despite these concerns, I still think webauthn is a really good standard, and I really do think it will become the future of authentication. I’m hoping to help make that a reality in opensource and I hope that in the future I can contribute to further development and promotion of webauthn.

Sun, 28 Apr 2019 00:00:00 +1000 <![CDATA[The Case for Ethics in OpenSource]]> The Case for Ethics in OpenSource

For a long time there have been incidents in technology which have caused negative effects on people - from leaks of private data, to interfaces that are not accessible, to even issues like UI’s doing things that may try to subvert a persons intent. I’m sure there are many more: and we could be here all day listing the various issues that exist in technology, from small to great.

The theme however is that these issues continue to happen: we continue to make decisions in applications that can have consequences to humans.

Software is pointless without people. People create software, people deploy software, people interact with software, and even software indirectly can influence people’s lives. At every layer people exist, and all software will affect them in some ways.

I think that today, we have made a lot of progress in our communities around the deployment of code’s of conduct. These are great, and really help us to discuss the decisions and actions we take within our communities - with the people who create the software. I would like this to go further, where we can have a framework to discuss the effect of software on people that we write: the people that deploy, interact with and are influenced by our work.


I’m not a specialist in ethics or morality: I’m not a registered or certified engineer in the legal sense. Finally, like all humans I am a product of my experiences which causes all my view points to be biased through the lens of my experience.

Additionally, I specialise in Identity Management software, so many of the ideas and issues I have encountered are really specific to this domain - which means I may overlook the issues in other areas. I also have a “security” mindset which also factors into my decisions too.

Regardless I hope that this is a starting point to recieve further input and advice from others, and a place where we can begin to improve.

The Problem

TODO: Discuss data handling practices

Let’s consider some issues and possible solutions in work that I’m familiar with - identity management software. Lets list a few “features”. (Please don’t email me about how these are wrong, I know they are …)

  • Storing usernames as first and last name
  • Storing passwords in cleartext.
  • Deleting an account sets a flag to mark deletion
  • Names are used as the primary key
  • We request sex on signup
  • To change account details, you have to use a command line tool

Now “technically”, none of these decisions are incorrect at all. There is literally no bad technical decision here, and everything is “technically correct” (not always the best kind of correct).

What do we want to achieve?

There are lots of different issues here, but really want to prevent harm to a person. What is harm? Well that’s a complex topic. To me, it could be emotional harm, disrespect of their person, it could be a feeling of a lack of control.

I don’t believe it’s correct to dictate a set of rules that people should follow. People will be fatigued, and will find the process too hard. We need to trust that people can learn and want to improve. Instead I believe it’s important we provide important points that people should be able to consider in a discussion around the development of software. The same way we discuss technical implementation details, we should discuss potential human impact in every change we have. To realise this, we need a short list of important factors that relate to humans.

I think the following points are important to consider when designing software. These relate to general principles which I have learnt and researched.

People should be respected to have:

  • Informed consent
  • Choice over how they are identified
  • Ability to be forgotten
  • Individual Autonomy
  • Free from Harmful Discrimination
  • Privacy
  • Ability to meaningfully access and use software

There is already some evidence in research papers to show that there are strong reasons for moral positions in software. For example, to prevent harm to come to people, to respect peoples autonomy and to conform to privacy legislation ( source ).

Let’s apply these

Given our set of “features”, lets now discuss these with the above points in mind.

  • Storing usernames as first and last name

This point clearly is in violation of the ability to choose how people are identified - some people may only have a single name, some may have multiple family names. On a different level this also violates the harmful discrimination rule due to the potential to disrespect individuals with cultures that have different name schemes compared to western/English societies.

A better way to approach this is “displayName” as a freetext UTF8 case sensitive field, and to allow substring search over the content (rather than attempting to sort by first/last name which also has a stack of issues).

  • Storing passwords in cleartext.

This one is a violation of privacy, that we risk the exposure of a password which may have been reused (we can’t really stop password reuse, we need to respect human behaviour). Not only that some people may assume we DO hash these correctly, so we actually are violating informed consent as we didn’t disclose the method of how we store these details.

A better thing here is to hash the password, or at least to disclose how it will be stored and used.

  • Deleting an account sets a flag to mark deletion

This violates the ability to be forgotten, because we aren’t really deleting the account. It also breaks informed consent, because we are being “deceptive” about what our software is actually doing compared to the intent of the users request

A better thing is to just delete the account, or if not possible, delete all user data and leave a tombstone inplace that represents “an account was here, but no details associated”.

  • Names are used as the primary key

This violates choice over identification, especially for women who have a divorce, or individuals who are transitioning or just people who want to change their name in general. The reason for the name change doesn’t matter - what matters is we need to respect peoples right to identification.

A better idea is to use UUID/ID numbers as a primary key, and have name able to be changed at any point in time.

  • We request sex on signup

Violates a privacy as a first point - we probably have no need for the data unless we are a medical application, so we should never ask for this at all. We also need to disclose why we need this data to satisfy informed consent, and potentially to allow them to opt-out of providing the data. Finally (if we really require this), to not violate self identification, we need to allow this to be a free-text field rather than a Male/Female boolean. This is not just in respect of individuals who are LGBTQI+, but the reality that there are biologically people who medically are neither. We also need to allow this to be changed at any time in the future. This in mind Sex and Gender are different concepts, so we should be careful which we request - Sex is the medical term of a person’s genetics, and Gender is who the person identifies as.

Not only this, because this is a very personal piece of information, we must disclose how we protect this information from access, who can see it, and if or how we’ll ever share it with other systems or authorities.

Generally, we probably don’t need to know, so don’t ask for it at all.

  • To change account details, you have to use a command line tool

This violates a users ability to meaningfully access and use software - remember, people come from many walks of life and all have different skill sets, but using command line tools is not something we can universally expect.

A proper solution here is at minimum a web/graphical self management portal that is easy to access and follows proper UX/UI design rules, and for a business deploying, a service desk with humans involved that can support and help people change details on their account on their behalf if the person is unable to self-support via the web service.


I think that OpenSource should aim to have a code of ethics - the same way we have a code of conduct to guide our behaviour internally to a project, we should have a framework to promote discussion of people’s rights that use, interact with and are affected by our work. We should not focus on technical matters only, but should be promoting people at the core of all our work. Every decision we make is not just technical, but social.

I’m sure that there are more points that could be considere than what I have listed here: I’d love to hear feedback to william at Thanks!

Sun, 28 Apr 2019 00:00:00 +1000 <![CDATA[Using Rust Generics to Enforce DB Record State]]> Using Rust Generics to Enforce DB Record State

In a database, entries go through a lifecycle which represents what attributes they have have, db record keys, and if they have conformed to schema checking.

I’m currently working on a (private in 2019, public in july 2019) project which is a NoSQL database writting in Rust. To help us manage the correctness and lifecycle of database entries, I have been using advice from the Rust Embedded Group’s Book.

As I have mentioned in the past, state machines are a great way to design code, so let’s plot out the state machine we have for Entries:

Entry State Machine

The lifecyle is:

  • A new entry is submitted by the user for creation
  • We schema check that entry
  • If it passes schema, we commit it and assign internal ID’s
  • When we search the entry, we retrieve it by internal ID’s
  • When we modify the entry, we need to recheck it’s schema before we commit it back
  • When we delete, we just remove the entry.

This leads to a state machine of:

             (create operation)
            [ New + Invalid ] -(schema check)-> [ New + Valid ]
                                               (send to backend)
                                                      v    v-------------\
[Commited + Invalid] <-(modify operation)- [ Commited + Valid ]          |
          |                                          ^   \       (write to backend)
          \--------------(schema check)-------------/     ---------------/

This is a bit rough - The version on my whiteboard was better :)

The main observation is that we are focused only on the commitability and validty of entries - not about where they are or if the commit was a success.

Entry Structs

So to make these states work we have the following structs:

struct EntryNew;
struct EntryCommited;

struct EntryValid;
struct EntryInvalid;

struct Entry<STATE, VALID> {
    state: STATE,
    valid: VALID,
    // Other db junk goes here :)

We can then use these to establish the lifecycle with functions (similar) to this:

impl Entry<EntryNew, EntryInvalid> {
    fn new() -> Self {
        Entry {
            state: EntryNew,
            valid: EntryInvalid,


impl<STATE> Entry<STATE, EntryInvalid> {
    fn validate(self, schema: Schema) -> Result<Entry<STATE, EntryValid>, ()> {
        if schema.check(self) {
            Ok(Entry {
                state: self.state,
                valid: EntryValid,
        } else {

    fn modify(&mut self, ...) {
        // Perform any modifications on the entry you like, only works
        // on invalidated entries.

impl<STATE> Entry<STATE, EntryValid> {
    fn seal(self) -> Entry<EntryCommitted, EntryValid> {
        // Assign internal id's etc
        Entry {
            state: EntryCommited,
            valid: EntryValid,

    fn compare(&self, other: Entry<STATE, EntryValid>) -> ... {
        // Only allow compares on schema validated/normalised
        // entries, so that checks don't have to be schema aware
        // as the entries are already in a comparable state.

impl Entry<EntryCommited, EntryValid> {
    fn invalidate(self) -> Entry<EntryCommited, EntryInvalid> {
        // Invalidate an entry, to allow modifications to be performed
        // note that modifications can only be applied once an entry is created!
        Entry {
            state: self.state,
            valid: EntryInvalid,

What this allows us to do importantly is to control when we apply search terms, send entries to the backend for storage and more. Benefit is this is compile time checked, so you can never send an entry to a backend that is not schema checked, or run comparisons or searches on entries that aren’t schema checked, and you can even only modify or delete something once it’s created. For example other parts of the code now have:

impl BackendStorage {
    // Can only create if no db id's are assigned, IE it must be new.
    fn create(&self, ..., entry: Entry<EntryNew, EntryValid>) -> Result<...> {

    // Can only modify IF it has been created, and is validated.
    fn modify(&self, ..., entry: Entry<EntryCommited, EntryValid>) -> Result<...> {

    // Can only delete IF it has been created and committed.
    fn delete(&self, ..., entry: Entry<EntryCommited, EntryValid>) -> Result<...> {

impl Filter<STATE> {
    // Can only apply filters (searches) if the entry is schema checked. This has an
    // important behaviour, where we can schema normalise. Consider a case-insensitive
    // type, we can schema-normalise this on the entry, then our compare can simply
    // be a, because we assert both entries *must* have been through
    // the normalisation routines!
    fn apply_filter(&self, ..., entry: &Entry<STATE, EntryValid>) -> Result<bool, ...> {

Using this with Serde?

I have noticed that when we serialise the entry, that this causes the valid/state field to not be compiled away - because they have to be serialised, regardless of the empty content meaning the compiler can’t eliminate them.

A future cleanup will be to have a serialised DBEntry form such as the following:

struct DBEV1 {
    // entry data here

enum DBEntryVersion {

struct DBEntry {
    data: DBEntryVersion

impl From<Entry<EntryNew, EntryValid>> for DBEntry {
    fn from(e: Entry<EntryNew, EntryValid>) -> Self {
        // assign db id's, and return a serialisable entry.

impl From<Entry<EntryCommited, EntryValid>> for DBEntry {
    fn from(e: Entry<EntryCommited, EntryValid>) -> Self {
        // Just translate the entry to a serialisable form

This way we still have the zero-cost state on Entry, but we are able to move to a versioned seralised structure, and we minimise the run time cost.

Testing the Entry

To help with testing, I needed to be able to shortcut and move between anystate of the entry so I could quickly make fake entries, so I added some unsafe methods:

unsafe fn to_new_valid(self, Entry<EntryNew, EntryInvalid>) -> {
    Entry {
        state: EntryNew,
        valid: EntryValid

These allow me to setup and create small unit tests where I may not have a full backend or schema infrastructure, so I can test specific aspects of the entries and their lifecycle. It’s limited to test runs only, and marked unsafe. It’s not “technically” memory unsafe, but it’s unsafe from the view of “it could absolutely mess up your database consistency guarantees” so you have to really want it.


Using statemachines like this, really helped me to clean up my code, make stronger assertions about the correctness of what I was doing for entry lifecycles, and means that I have more faith when I and future-contributors will work on the code base that we’ll have compile time checks to ensure we are doing the right thing - to prevent data corruption and inconsistency.

Sat, 13 Apr 2019 00:00:00 +1000 <![CDATA[Debugging MacOS bluetooth audio stutter]]> Debugging MacOS bluetooth audio stutter

I was noticing that audio to my bluetooth headphones from my iPhone was always flawless, but I started to noticed stutter and drops from my MBP. After exhausting some basic ideas, I was stumped.

To the duck duck go machine, and I searched for issues with bluetooth known issues. Nothing appeared.

However, I then decided to debug the issue - thankfully there was plenty of advice on this matter. Press shift + option while clicking bluetooth in the menu-bar, and then you have a debug menu. You can also open and search for “bluetooth” to see all the bluetooth related logs.

I noticed that when the audio stutter occured that the following pattern was observed.

default     11:25:45.840532 +1000   wirelessproxd   About to scan for type: 9 - rssi: -90 - payload: <00000000 00000000 00000000 00000000 00000000 0000> - mask: <00000000 00000000 00000000 00000000 00000000 0000> - peers: 0
default     11:25:45.840878 +1000   wirelessproxd   Scan options changed: YES
error       11:25:46.225839 +1000   bluetoothaudiod Error sending audio packet: 0xe00002e8
error       11:25:46.225899 +1000   bluetoothaudiod Too many outstanding packets. Drop packet of 8 frames (total drops:451 total sent:60685 percentDropped:0.737700) Outstanding:17

There was always a scan, just before the stutter initiated. So what was scanning?

I searched for the error related to packets, and there were a lot of false leads. From weird apps to dodgy headphones. In this case I could eliminate both as the headphones worked with other devices, and I don’t have many apps installed.

So I went back and thought about what macOS services could be the problem, and I found that airdrop would scan periodically for other devices to send and recieve files. Disabling Airdrop from the sharing menu in System Prefrences cleared my audio right up.

UPDATE 2019-12-20: It looks like the Airdrop sharing option in system preferences has been removed in 10.15, so I don’t believe it’s possible to resove this issue now - audio stutter forever!

Mon, 08 Apr 2019 00:00:00 +1000 <![CDATA[GDB autoloads for 389 DS]]> GDB autoloads for 389 DS

I’ve been writing a set of extensions to help debug 389-ds a bit easier. Thanks to the magic of python, writing GDB extensions is really easy.

On OpenSUSE, when you start your DS instance under GDB, all of the extensions are automatically loaded. This will help make debugging a breeze.

zypper in 389-ds gdb
gdb /usr/sbin/ns-slapd
GNU gdb (GDB; openSUSE Tumbleweed) 8.2
(gdb) ds-
ds-access-log  ds-backtrace
(gdb) set args -d 0 -D /etc/dirsrv/slapd-<instance name>
(gdb) run

All the extensions are under the ds- namespace, so they are easy to find. There are some new ones on the way, which I’ll discuss here too:


As DS is a multithreaded process, it can be really hard to find the active thread involved in a problem. So we provided a command that knows how to fold duplicated stacks, and to highlight idle threads that you can (generally) skip over.

Thread 37 (LWP 70054))
Thread 36 (LWP 70053))
Thread 35 (LWP 70052))
Thread 34 (LWP 70051))
Thread 33 (LWP 70050))
Thread 32 (LWP 70049))
Thread 31 (LWP 70048))
Thread 30 (LWP 70047))
Thread 29 (LWP 70046))
Thread 28 (LWP 70045))
Thread 27 (LWP 70044))
Thread 26 (LWP 70043))
Thread 25 (LWP 70042))
Thread 24 (LWP 70041))
Thread 23 (LWP 70040))
Thread 22 (LWP 70039))
Thread 21 (LWP 70038))
Thread 20 (LWP 70037))
Thread 19 (LWP 70036))
Thread 18 (LWP 70035))
Thread 17 (LWP 70034))
Thread 16 (LWP 70033))
Thread 15 (LWP 70032))
Thread 14 (LWP 70031))
Thread 13 (LWP 70030))
Thread 12 (LWP 70029))
Thread 11 (LWP 70028))
Thread 10 (LWP 70027))
#0  0x00007ffff65db03c in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/
#1  0x00007ffff66318b0 in PR_WaitCondVar () at /usr/lib64/
#2  0x00000000004220e0 in [IDLE THREAD] connection_wait_for_new_work (pb=0x608000498020, interval=4294967295) at /home/william/development/389ds/ds/ldap/servers/slapd/connection.c:970
#3  0x0000000000425a31 in connection_threadmain () at /home/william/development/389ds/ds/ldap/servers/slapd/connection.c:1536
#4  0x00007ffff6637484 in None () at /usr/lib64/
#5  0x00007ffff65d4fab in start_thread () at /lib64/
#6  0x00007ffff6afc6af in clone () at /lib64/

This example shows that there are 17 idle threads (look at frame 2) here, that all share the same trace.


The access log is buffered before writing, so if you have a coredump, and want to see the last few events before they were written to disk, you can use this to display the content:

(gdb) ds-access-log
===== BEGIN ACCESS LOG =====
$2 = 0x7ffff3c3f800 "[03/Apr/2019:10:58:42.836246400 +1000] conn=1 fd=64 slot=64 connection from to
[03/Apr/2019:10:58:42.837199400 +1000] conn=1 op=0 BIND dn=\"\" method=128 version=3
[03/Apr/2019:10:58:42.837694800 +1000] conn=1 op=0 RESULT err=0 tag=97 nentries=0 etime=0.0001200300 dn=\"\"
[03/Apr/2019:10:58:42.838881800 +1000] conn=1 op=1 SRCH base=\"\" scope=2 filter=\"(objectClass=*)\" attrs=ALL
[03/Apr/2019:10:58:42.839107600 +1000] conn=1 op=1 RESULT err=32 tag=101 nentries=0 etime=0.0001070800
[03/Apr/2019:10:58:42.840687400 +1000] conn=1 op=2 UNBIND
[03/Apr/2019:10:58:42.840749500 +1000] conn=1 op=2 fd=64 closed - U1
", '\276' <repeats 3470 times>

At the end the line that repeats shows the log is “empty” in that segment of the buffer.


This command shows the in-memory entry. It can be common to see Slapi_Entry * pointers in the codebase, so being able to display these is really helpful to isolate what’s occuring on the entry. Your first argument should be the Slapi_Entry pointer.

(gdb) ds-entry-print ec
Display Slapi_Entry: cn=config
cn: config
objectClass: top
objectClass: extensibleObject
objectClass: nsslapdConfig
nsslapd-schemadir: /opt/dirsrv/etc/dirsrv/slapd-standalone1/schema
nsslapd-lockdir: /opt/dirsrv/var/lock/dirsrv/slapd-standalone1
nsslapd-tmpdir: /tmp
nsslapd-certdir: /opt/dirsrv/etc/dirsrv/slapd-standalone1
Wed, 03 Apr 2019 00:00:00 +1000 <![CDATA[Programming Lessons and Methods]]> Programming Lessons and Methods

Everyone has their own lessons and methods that they use when they approaching programming. These are the lessons that I have learnt, which I think are the most important when it comes to design, testing and communication.

Comments and Design

Programming is the art of writing human readable code, that a machine will eventually run. Your program needs to be reviewed, discussed and parsed by another human. That means you need to write your program in a way they can understand first.

Rather than rushing into code, and hacking until it works, I find it’s great to start with comments such as:

fn data_access(search: Search) -> Type {
    // First check the search is valid
    //  * No double terms
    //  * All schema is valid

    // Retrieve our data based on the search

    // if debug, do an un-indexed assert the search matches

    // Do any need transform

    // Return the data

After that, I walk away, think about the issue, come back, maybe tweak these comments. When I eventually fill in the code inbetween, I leave all the comments in place. This really helps my future self understand what I was thinking, but it also helps other people understand too.

State Machines

State machines are a way to design and reason about the states a program can be in. They allow exhaustive represenations of all possible outcomes of a function. A simple example is a microwave door.

  /----\            /----- close ----\          /-----\
  |     \          /                 v         v      |
  |    -------------                ---------------   |
open   | Door Open |                | Door Closed |  close
  |    -------------                ---------------   |
  |    ^          ^                  /          \     |
  \---/            \------ open ----/            \----/

When the door is open, opening it again does nothing. Only when the door is open, and we close the door (and event), does the door close (a transition). Once closed, the door can not be closed any more (event does nothing). It’s when we open the door now, that a state change can occur.

There is much more to state machines than this, but they allow us as humans to reason about our designs and model our programs to have all possible outcomes considered.

Zero, One and Infinite

In mathematics there are only three numbers that matter. Zero, One and Infinite. It turns out the same is true in a computer too.

When we are making a function, we can define limits in these terms. For example:

fn thing(argument: Type)

In this case, argument is “One” thing, and must be one thing.

fn thing(argument: Option<Type>)

Now we have argument as an option, so it’s “Zero” or “One”.

fn thing(argument: Vec<Type>)

Now we have argument as vec (array), so it’s “Zero” to “Infinite”.

When we think about this, our functions have to handle these cases properly. We don’t write functions that take a vec with only two items, we write a function with two arguments where each one must exist. It’s hard to handle “two” - it’s easy to handle two cases of “one”.

It also is a good guide for how to handle data sets, assuming they could always be infinite in size (or at least any arbitrary size).

You can then apply this to tests. In a test given a function of:

fn test_me(a: Option<Type>, b: Vec<Type>)

We know we need to test permutations of:

  • a is “Zero” or “One” (Some, None)
  • b is “Zero”, “One” or “Infinite” (.len() == 0, .len() == 1, .len() > 0)

Note: Most languages don’t have an array type that is “One to Infinite”, IE non-empty. If you want this condition (at least one item), you have to assert it yourself ontop of the type system.

Correct, Simple, Fast

Finally, we can put all these above tools together and apply a general philosophy. When writing a program, first make it correct, then simpify the program, then make it fast.

If you don’t do it in this order you will hit barriers - social and technical. For example, if you make something fast, simple, correct, you will likely have issues that can be fixed without making a decrease in performance. People don’t like it when you introduce a patch that drops performance, so as a result correctness is now sacrificed. (Spectre anyone?)

If you make something too simple, you may never be able to make it correctly handle all cases that exist in your application - likely facilitating a future rewrite to make it correct.

If you do correct, fast, simple, then your program will be correct, and fast, but hard for a human to understand. Because programming is the art of communicating intent to a person sacrificing simplicity in favour of fast will make it hard to involve new people and educate and mentor them into development of your project.

  • Correct: Does it behave correctly, handle all states and inputs correctly?
  • Simple: Is it easy to comprehend and follow for a human reader?
  • Fast: Is it performant?
Tue, 26 Feb 2019 00:00:00 +1000 <![CDATA[Meaningful 2fa on modern linux]]> Meaningful 2fa on modern linux

Recently I heard of someone asking the question:

“I have an AD environment connected with <product> IDM. I want to have 2fa/mfa to my linux machines for ssh, that works when the central servers are offline. What’s the best way to achieve this?”

Today I’m going to break this down - but the conclusion for the lazy is:

This is not realistically possible today: use ssh keys with ldap distribution, and mfa on the workstations, with full disk encryption.


So there are a few parts here. AD is for intents and purposes an LDAP server. The <product> is also an LDAP server, that syncs to AD. We don’t care if that’s 389-ds, freeipa or vendor solution. The results are basically the same.

Now the linux auth stack is, and will always use pam for the authentication, and nsswitch for user id lookups. Today, we assume that most people run sssd, but pam modules for different options are possible.

There are a stack of possible options, and they all have various flaws.

  • FreeIPA + 2fa
  • PAM TOTP modules
  • PAM radius to a TOTP server
  • Smartcards

FreeIPA + 2fa

Now this is the one most IDM people would throw out. The issue here is the person already has AD and a vendor product. They don’t need a third solution.

Next is the fact that FreeIPA stores the TOTP in the LDAP, which means FreeIPA has to be online for it to work. So this is eliminated by the “central servers offline” requirement.

PAM radius to TOTP server

Same as above: An extra product, and you have a source of truth that can go down.

PAM TOTP module on hosts

Okay, even if you can get this to scale, you need to send the private seed material of every TOTP device that could login to the machine, to every machine. That means any compromise, compromises every TOTP token on your network. Bad place to be in.


Are notoriously difficult to have functional, let alone with SSH. Don’t bother. (Where the Smartcard does TLS auth to the SSH server this is.)

Come on William, why are you so doom and gloom!

Lets back up for a second and think about what we we are trying to prevent by having mfa at all. We want to prevent single factor compromise from having a large impact and we want to prevent brute force attacks. (There are probably more reasons, but these are the ones I’ll focus on).

So the best answer: Use mfa on the workstation (password + totp), then use ssh keys to the hosts.

This means the target of the attack is small, and the workstation can be protected by things like full disk encryption and group policy. To sudo on the host you still need the password. This makes sudo MFA to root as you need something know, and something you have.

If you are extra conscious you can put your ssh keys on smartcards. This works on linux and osx workstations with yubikeys as I am aware. Apparently you can have ssh keys in TPM, which would give you tighter hardware binding, but I don’t know how to achieve this (yet).

To make all this better, you can distributed your ssh public keys in ldap, which means you gain the benefits of LDAP account locking/revocation, you can remove the keys instantly if they are breached, and you have very little admin overhead to configuration of this service on the linux server side. Think about how easy onboarding is if you only need to put your ssh key in one place and it works on every server! Let alone shutting down a compromised account: lock it in one place, and they are denied access to every server.

SSSD as the LDAP client on the server can also cache the passwords (hashed) and the ssh public keys, which means a disconnected client will still be able to be authenticated to.

At this point, because you have ssh key auth working, you could even deny password auth as an option in ssh altogether, eliminating an entire class of bruteforce vectors.

For bonus marks: You can use AD as the generic LDAP server that stores your SSH keys. No additional vendor products needed, you already have everything required today, for free. Everyone loves free.


If you want strong, offline capable, distributed mfa on linux servers, the only choice today is LDAP with SSH key distribution.

Want to know more? This blog contains how-tos on SSH key distribution for AD, SSH keys on smartcards, and how to configure SSSD to use SSH keys from LDAP.

Tue, 12 Feb 2019 00:00:00 +1000 <![CDATA[Using the latest 389-ds on OpenSUSE]]> Using the latest 389-ds on OpenSUSE

Thanks to some help from my friend who works on OBS, I’ve finally got a good package in review for submission to tumbleweed. However, if you are impatient and want to use the “latest” and greatest 389-ds version on OpenSUSE.

zypper ar obs://network:ldap network:ldap
zypper in 389-ds


docker run --rm -i -t

To make it persistent:

docker run -v 389ds_data:/data <your options here ...>

Then to run the admin tools:

docker exec -i -t <container name> /usr/sbin/dsconf ...
docker exec -i -t <container name> /usr/sbin/dsidm ...

Testing in docker?

If you are “testing” in docker (please don’t do this in production: for production see above) you’ll need to do some tweaks to get around the lack of systemd.

docker run -i -t opensuse/tumbleweed:latest
zypper ar obs://network:ldap network:ldap
zypper in 389-ds

vim /usr/share/dirsrv/inf/defaults.inf
# change the following to match:
with_systemd = 0

What next?

After this, you should now be able to follow our new quickstart guide on the 389-ds website.

If you followed the docker steps, skip to adding users and groups

The network:ldap repo and the container listed are updated when upstream makes releases so you’ll always get the latest 389-ds

EDIT: Updated 2019-04-03 to change repo as changes have progressed forward.

EDIT: Updated 2019-08-27 Improve clarity about when you need to do docker tweaks, and add docker image steps

Wed, 30 Jan 2019 00:00:00 +1000 <![CDATA[SUSE Open Build Service cheat sheet]]> SUSE Open Build Service cheat sheet

Part of starting at SUSE has meant that I get to learn about Open Build Service. I’ve known that the project existed for a long time but I have never had a chance to use it. So far I’m thoroughly impressed by how it works and the features it offers.

As A Consumer

The best part of OBS is that it’s trivial on OpenSUSE to consume content from it. Zypper can add projects with the command:

zypper ar obs://<project name> <repo nickname>
zypper ar obs://network:ldap network:ldap

I like to give the repo nickname (your choice) to be the same as the project name so I know what I have enabled. Once you run this you can easily consume content from OBS.

Package Management

As someone who has started to contribute to the suse 389-ds package, I’ve been slowly learning how this work flow works. OBS similar to GitHub/Lab allows a branching and request model.

On OpenSUSE you will want to use the osc tool for your workflow:

zypper in osc
# If you plan to use the "service" command
zypper in obs-service-tar obs-service-obs_scm obs-service-recompress obs-service-set_version obs-service-download_files python-xml obs-service-format_spec_file

You can branch from an existing project to make changes with:

osc branch <project> <package>
osc branch network:ldap 389-ds

This will branch the project to my home namespace. For me this will land in “home:firstyear:branches:network:ldap”. Now I can checkout the content on to my machine to work on it.

osc co <project>
osc co home:firstyear:branches:network:ldap

This will create the folder “home:…:ldap” in the current working directory.

From here you can now work on the project. Some useful commands are:

Add new files to the project (patches, new source tarballs etc).

osc add <path to file>
osc add feature.patch
osc add new-source.tar.xz

Edit the change log of the project (I think this is used in release notes?)

osc vc

To ammend your changes, use:

osc vc -e

Build your changes locally matching the system you are on. Packages normally build on all/most OpenSUSE versions and architectures, this will build just for your local system and arch.

osc build

Make sure you clean up files you aren’t using any more with:

osc rm <filename>
# This commands removes anything untracked by osc.
osc clean

Commit your changes to the OBS server, where a complete build will be triggered:

osc commit

View the results of the last commit:

osc results

Enable people to use your branch/project as a repository. You edit the project metadata and enable repo publishing:

osc meta prj -e <name of project>
osc meta prj -e home:firstyear:branches:network:ldap

# When your editor opens, change this section to enabled (disabled by default):
  <enabled />

NOTE: In some cases if you have the package already installed, and you add the repo/update it won’t install from your repo. This is because in SUSE packages have a notion of “vendoring”. They continue to update from the same repo as they were originally installed from. So if you want to change this you use:

zypper [d]up --from <repo name>

You can then create a “request” to merge your branch changes back to the project origin. This is:

osc sr

A helpful maintainer will then review your changes. You can see this with.

osc rq show <your request id>

If you change your request, to submit again, use:

osc sr

And it will ask if you want to replace (supercede) the previous request.

I was also helped by a friend to provie a “service” configuration that allows generation of tar balls from git. It’s not always appropriate to use this, but if the repo has a “_service” file, you can regenerate the tar with:

osc service ra

So far this is as far as I have gotten with OBS, but I already appreciate how great this work flow is for package maintainers, reviewers and consumers. It’s a pleasure to work with software this well built.

As an additional piece of information, it’s a good idea to read the OBS Packaging Guidelines
to be sure that you are doing the right thing!
# Acts as tail
osc bl osc r -v

# How to access the meta and docker stuff

osc meta pkg -e osc meta prj -e

(will add vim to the buildroot), then you can chroot (allow editing in the build root)

osc chroot osc build -x vim

-k <dir> keeps artifacts in directory dir IE rpm outputs

oscrc buildroot variable, mount tmpfs to that location.

docker privs SYS_ADMIN, SYS_CHROOT

Multiple spec: commit second spec to pkg then:

osc linkpac prj pkg prj new-link-pkg

rpm –eval ‘%{variable}’

%setup -n “name of what the directory unpacks to, not what to rename to”

When no link exists

osc submitpac destprj deskpkg
Sat, 19 Jan 2019 00:00:00 +1000 <![CDATA[Structuring Rust Transactions]]> Structuring Rust Transactions

I’ve been working on a database-related project in Rust recently, which takes advantage of my concurrently readable datastructures. However I ran into a problem of how to structure Read/Write transaction structures that shared the reader code, and container multiple inner read/write types.

Some Constraints

To be clear, there are some constraints. A “parent” write, will only ever contain write transaction guards, and a read will only ever contain read transaction guards. This means we aren’t going to hit any deadlocks in the code. Rust can’t protect us from mis-ording locks. An additional requirement is that readers and a single write must be able to proceed simultaneously - but having a rwlock style writer or readers behaviour would still work here.

Some Background

To simplify this, imagine we have two concurrently readable datastructures. We’ll call them db_a and db_b.

struct db_a { ... }

struct db_b { ... }

Now, each of db_a and db_b has their own way to protect their inner content, but they’ll return a DBWriteGuard or DBReadGuard when we call respectively.

impl db_a {
    pub fn read(&self) -> DBReadGuard {

    pub fn write(&self) -> DBWriteGuard {

Now we make a “parent” wrapper transaction such as:

struct server {
    a: db_a,
    b: db_b,

struct server_read {
    a: DBReadGuard,
    b: DBReadGuard,

struct server_write {
    a: DBWriteGuard,
    b: DBWriteGuard,

impl server {
    pub fn read(&self) -> server_read {
        server_read {

    pub fn write(&self) -> server_write {
        server_read {

The Problem

Now the problem is that on my server_read and server_write I want to implement a function for “search” that uses the same code. Search or a read or write should behave identically! I wanted to also avoid the use of macros as the can hide issues while stepping in a debugger like LLDB/GDB.

Often the answer with rust is “traits”, to create an interface that types adhere to. Rust also allows default trait implementations, which sounds like it could be a solution here.

pub trait server_read_trait {
    fn search(&self) -> SomeResult {
        let result_a =;
        let result_b =;
        SomeResult(result_a, result_b)

In this case, the issue is that &self in a trait is not aware of the fields in the struct - traits don’t define that fields must exist, so the compiler can’t assume they exist at all.

Second, the type of self.a/b is unknown to the trait - because in a read it’s a “a: DBReadGuard”, and for a write it’s “a: DBWriteGuard”.

The first problem can be solved by using a get_field type in the trait. Rust will also compile this out as an inline, so the correct thing for the type system is also the optimal thing at run time. So we’ll update this to:

pub trait server_read_trait {
    fn get_a(&self) -> ???;

    fn get_b(&self) -> ???;

    fn search(&self) -> SomeResult {
        let result_a = self.get_a().search(...); // note the change from self.a to self.get_a()
        let result_b = self.get_b().search(...);
        SomeResult(result_a, result_b)

impl server_read_trait for server_read {
    fn get_a(&self) -> &DBReadGuard {
    // get_b is similar, so ommitted

impl server_read_trait for server_write {
    fn get_a(&self) -> &DBWriteGuard {
    // get_b is similar, so ommitted

So now we have the second problem remaining: for the server_write we have DBWriteGuard, and read we have a DBReadGuard. There was a much longer experimentation process, but eventually the answer was simpler than I was expecting. Rust allows traits to have Self types that enforce trait bounds rather than a concrete type.

So provided that DBReadGuard and DBWriteGuard both implement “DBReadTrait”, then we can have the server_read_trait have a self type that enforces this. It looks something like:

pub trait DBReadTrait {
    fn search(&self) -> ...;

impl DBReadTrait for DBReadGuard {
    fn search(&self) -> ... { ... }

impl DBReadTrait for DBWriteGuard {
    fn search(&self) -> ... { ... }

pub trait server_read_trait {
    type GuardType: DBReadTrait; // Say that GuardType must implement DBReadTrait

    fn get_a(&self) -> &Self::GuardType; // implementors must return that type implementing the trait.

    fn get_b(&self) -> &Self::GuardType;

    fn search(&self) -> SomeResult {
        let result_a = self.get_a().search(...);
        let result_b = self.get_b().search(...);
        SomeResult(result_a, result_b)

impl server_read_trait for server_read {
    fn get_a(&self) -> &DBReadGuard {
    // get_b is similar, so ommitted

impl server_read_trait for server_write {
    fn get_a(&self) -> &DBWriteGuard {
    // get_b is similar, so ommitted

This works! We now have a way to write a single “search” type for our server read and write types. In my case, the DBReadTrait also uses a similar technique to define a search type shared between the DBReadGuard and DBWriteGuard.

Sat, 19 Jan 2019 00:00:00 +1000 <![CDATA[Useful USG pro 4 commands and hints]]> Useful USG pro 4 commands and hints

I’ve recently changed from a FreeBSD vm as my router to a Ubiquiti PRO USG4. It’s a solid device, with many great features, and I’m really impressed at how it “just works” in many cases. So far my only disappointment is lack of documentation about the CLI, especially for debugging and auditing what is occuring in the system, and for troubleshooting steps. This post will aggregate some of my knowledge about the topic.

Current config

Show the current config with:

mca-ctrl -t dump-cfg

You can show system status with the “show” command. Pressing ? will cause the current compeletion options to be displayed. For example:

# show <?>
arp              date             dhcpv6-pd        hardware


The following commands show the DNS statistics, the DNS configuration, and allow changing the cache-size. The cache-size is measured in number of records cached, rather than KB/MB. To make this permanent, you need to apply the change to config.json in your controllers sites folder.

show dns forwarding statistics
show system name-server
set service dns forwarding cache-size 10000
clear dns forwarding cache


You can see and aggregate of system logs with

show log

Note that when you set firewall rules to “log on block” they go to dmesg, not syslog, so as a result you need to check dmesg for these.

It’s a great idea to forward your logs in the controller to a syslog server as this allows you to aggregate and see all the events occuring in a single time series (great when I was diagnosing an issue recently).


To show the system interfaces

show interfaces

To restart your pppoe dhcp6c:

release dhcpv6-pd interface pppoe0
renew dhcpv6-pd interface pppoe0

There is a current issue where the firmware will start dhcp6c on eth2 and pppoe0, but the session on eth2 blocks the pppoe0 client. As a result, you need to release on eth2, then renew of pppoe0

If you are using a dynamic prefix rather than static, you may need to reset your dhcp6c duid.

delete dhcpv6-pd duid

To restart an interface with the vyatta tools:

disconnect interface pppoe
connect interface pppoe


I have setup customised OpenVPN tunnels. To show these:

show interfaces openvpn detail

These are configured in config.json with:

# Section: config.json - interfaces - openvpn
    "vtun0": {
            "encryption": "aes256",
            # This assigns the interface to the firewall zone relevant.
            "firewall": {
                    "in": {
                            "ipv6-name": "LANv6_IN",
                            "name": "LAN_IN"
                    "local": {
                            "ipv6-name": "LANv6_LOCAL",
                            "name": "LAN_LOCAL"
                    "out": {
                            "ipv6-name": "LANv6_OUT",
                            "name": "LAN_OUT"
            "mode": "server",
            # By default, ubnt adds a number of parameters to the CLI, which
            # you can see with ps | grep openvpn
            "openvpn-option": [
                    # If you are making site to site tunnels, you need the ccd
                    # directory, with hostname for the file name and
                    # definitions such as:
                    # iroute
                    "--client-config-dir /config/auth/openvpn/ccd",
                    "--keepalive 10 60",
                    "--user nobody",
                    "--group nogroup",
                    "--proto udp",
                    "--port 1195"
            "server": {
                    "push-route": [
                    "subnet": ""
            "tls": {
                    "ca-cert-file": "/config/auth/openvpn/vps/vps-ca.crt",
                    "cert-file": "/config/auth/openvpn/vps/vps-server.crt",
                    "dh-file": "/config/auth/openvpn/dh2048.pem",
                    "key-file": "/config/auth/openvpn/vps/vps-server.key"


Net flows allow a set of connection tracking data to be sent to a remote host for aggregation and analysis. Sadly this process was mostly undocumented, bar some useful forum commentors. Here is the process that I came up with. This is how you configure it live:

set system flow-accounting interface eth3.11
set system flow-accounting netflow server port 6500
set system flow-accounting netflow version 5
set system flow-accounting netflow sampling-rate 1
set system flow-accounting netflow timeout max-active-life 1

To make this persistent:

"system": {
            "flow-accounting": {
                    "interface": [
                    "netflow": {
                            "sampling-rate": "1",
                            "version": "5",
                            "server": {
                                    "": {
                                            "port": "6500"
                            "timeout": {
                                    "max-active-life": "1"

To show the current state of your flows:

show flow-accounting
Wed, 02 Jan 2019 00:00:00 +1000 <![CDATA[The idea of CI and Engineering]]> The idea of CI and Engineering

In software development I see and interesting trend and push towards continuous integration, continually testing, and testing in production. These techniques are designed to allow faster feedback on errors, use real data for application testing, and to deliver features and changes faster.

But is that really how people use software on devices? When we consider an operation like google or amazon, this always online technique may work, but what happens when we apply a continous integration and “we’ll patch it later” mindset to devices like phones or internet of things?

What happens in other disciplines?

In real engineering disciplines like aviation or construction, techniques like this don’t really work. We don’t continually build bridges, then fix them when they break or collapse. There are people who provide formal analysis of materials, their characteristics. Engineers consider careful designs, constraints, loads and situations that may occur. The structure is planned, reviewed and verified mathematically. Procedures and oversight is applied to ensure correct building of the structure. Lessons are learnt from past failures and incidents and are applied into every layer of the design and construction process. Communication between engineers and many other people is critical to the process. Concerns are always addressed and managed.

The first thing to note is that if we just built lots of scale-model bridges and continually broke them until we found their limits, this would waste many resources to do this. Bridges are carefully planned and proven.

So whats the point with software?

Today we still have a mindset that continually breaking and building is a reasonable path to follow. It’s not! It means that the only way to achieve quality is to have a large test suite (requires people and time to write), which has to be further derived from failures (and those failures can negatively affect real people), then we have to apply large amounts of electrical energy to continually run the tests. The test suites can’t even guarantee complete coverage of all situations and occurances!

This puts CI techniques out of reach of many application developers due to time and energy (translated to dollars) limits. Services like travis on github certainly helps to lower the energy requirement, but it doesn’t stop the time and test writing requirements.

No matter how many tests we have for a program, if that program is written in C or something else, we continually see faults and security/stability issues in that software.

What if we CI on … a phone?

Today we even have hardware devices that are approached as though they “test in production” is a reasonable thing. It’s not! People don’t patch, telcos don’t allow updates out to users, and those that are aware, have to do custom rom deployment. This creates an odd dichomtemy of “haves” and “haves not”, of those in technical know how who have a better experience, and the “haves not” who have to suffer potentially insecure devices. This is especially terrifying given how deeply personal phones are.

This is a reality of our world. People do not patch. They do not patch phones, laptops, network devices and more. Even enterprises will avoid patching if possible. Rather than trying to shift the entire culture of humans to “update always”, we need to write software that can cope in harsh conditions, for long term. We only need to look to software in aviation to see we can absolutely achieve this!

What should we do?

I believe that for software developers to properly become software engineers we should look to engineers in civil and aviation industries. We need to apply:

  • Regualation and ethics (Safety of people is always first)
  • Formal verification
  • Consider all software will run long term (5+ years)
  • Improve team work and collaboration on designs and development

The reality of our world is people are deploying devices (routers, networks, phones, lights, laptops more …) where they may never be updated or patched in their service life. Even I’m guilty (I have a modem that’s been unpatched for about 6 years but it’s pretty locked down …). As a result we need to rely on proof that the device can not fail at build time, rather than patch it later which may never occur! Putting formal verification first, and always considering user safety and rights first, shifts a large burden to us in terms of time. But many tools (Coq, fstar, rust …) all make formal verification more accessible to use in our industry. Verifying our software is a far stronger assertion of quality than “throw tests at it and hope it works”.

You’re crazy William, and also wrong

Am I? Looking at “critical” systems like iPhone encryption hardware, they are running the formally verified Sel4. We also heard at Kiwicon in 2018 that Microsoft and XBox are using formal verification to design their low levels of their system to prevent exploits from occuring in the first place.

Over time our industry will evolve, and it will become easier and more cost effective to formally verify than to operate and deploy CI. This doesn’t mean we don’t need tests - it means that the first line of quality should be in verification of correctness using formal techniques rather than using tests and CI to prove correct behaviour. Tests are certainly still required to assert further behavioural elements of software.

Today, if you want to do this, you should be looking at Coq and program extraction, fstar and the kremlin (project everest, a formally verified https stack), Rust (which has a subset of the safe language formally proven). I’m sure there are more, but these are the ones I know off the top of my head.


Over time our industry must evolve to put the safety of humans first. To achive this we must look to other safety driven cultures such as aviation and civil engineering. Only by learning from their strict disciplines and behaviours can we start to provide software that matches behavioural and quality expectations humans have for software.

Wed, 02 Jan 2019 00:00:00 +1000 <![CDATA[Nextcloud and badrequest filesize incorrect]]> Nextcloud and badrequest filesize incorrect

My friend came to my house and was trying to share some large files with my nextcloud instance. Part way through the upload an error occurred.

"Exception":"Sabre\\DAV\\Exception\\BadRequest","Message":"expected filesize 1768906752 got 1768554496"

It turns out this error can be caused by many sources. It could be timeouts, bad requests, network packet loss, incorrect nextcloud configuration or more.

We tried uploading larger files (by a factor of 10 times) and they worked. This eliminated timeouts as a cause, and probably network loss. Being on ethernet direct to the server generally also helps to eliminate packet loss as a cause compared to say internet.

We also knew that the server must not have been misconfigured because a larger file did upload, so no file or resource limits were being hit.

This also indicated that the client was likely doing the right thing because larger and smaller files would upload correctly. The symptom now only affected a single file.

At this point I realised, what if the client and server were both victims to a lower level issue? I asked my friend to ls the file and read me the number of bytes long. It was 1768906752, as expected in nextcloud.

Then I asked him to cat that file into a new file, and to tell me the length of the new file. Cat encountered an error, but ls on the new file indeed showed a size of 1768554496. That means filesystem corruption! What could have lead to this?


Apple’s legacy filesystem (and the reason I stopped using macs) is well known for silently eating files and corrupting content. Here we had yet another case of that damage occuring, and triggering errors elsewhere.

Bisecting these issues and eliminating possibilities through a scientific method is always the best way to resolve the cause, and it may come from surprising places!

Mon, 31 Dec 2018 00:00:00 +1000 <![CDATA[Identity ideas …]]> Identity ideas …

I’ve been meaning to write this post for a long time. Taking half a year away from the 389-ds team, and exploring a lot of ideas from other projects has led me to come up with some really interesting ideas about what we do well, and what we don’t. I feel like this blog could be divisive, as I really think that for our services to stay relevant we need to make changes that really change our own identity - so that we can better represent yours.

So strap in, this is going to be long …

What’s currently on the market

Right now the market for identity has two extremes. At one end we have the legacy “create your own” systems, that are build on technologies like LDAP and Kerberos. I’m thinking about things like 389 Directory Server, OpenLDAP, Active Directory, FreeIPA and more. These all happen to be constrained heavily by complexity, fragility, and administrative workload. You need to spend months to learn these and even still, you will make mistakes and there will be problems.

At the other end we have hosted “Identity as a Service” options like Azure AD and Auth0. These have very intelligently, unbound themself from legacy, and tend to offer HTTP apis, 2fa and other features that “just work”. But they are all in the cloud, and outside your control.

But there is nothing in the middle. There is no option that “just works”, supports modern standards, and is unhindered by legacy that you can self deploy with minimal administrative fuss - or years of experience.

What do I like from 389?

  • Replication

The replication system is extremely robust, and has passed many complex tests for cases of eventual consistency correctness. It’s very rare to hear of any kind of data corruption or loss within our replication system, and that’s testament to the great work of people who spent years looking at the topic.

  • Performance

We aren’t as fast as OpenLDAP is 1 vs 1 server, but our replication scalability is much higher, where in any size of MMR or read-only replica topology, we have higher horizontal scaling, nearly linear based on server additions. If you want to run a cloud scale replicated database, we scale to it (and people already do this!).

  • Stability

Our server stability is well known with administrators, and honestly is a huge selling point. We see servers that only go down when administrators are performing upgrades. Our work with sanitising tools and the careful eyes of the team has ensured our code base is reliable and solid. Having extensive tests and amazing dedicated quality engineers also goes a long way.

  • Feature rich

There are a lot of features I really like, and are really useful as an admin deploying this service. Things like memberof (which is actually a group resolution cache when you think about it …), automember, online backup, unique attribute enforcement, dereferencing, and more.

  • The team

We have a wonderful team of really smart people, all of whom are caring and want to advance the state of identity management. Not only do they want to keep up with technical changes and excellence, they are listening to and want to improve our social awareness of identity management.

Pain Points

  • C

Because DS is written in C, it’s risky and difficult to make changes. People constantly make mistakes that introduce unsafety (even myself), and worse. No amount of tooling or intelligence can take away the fact that C is just hard to use, and people need to be perfect (people are not perfect!) and today we have better tools. We can not spend our time chasing our tails on pointless issues that C creates, when we should be doing better things.

  • Everything about dynamic admin, config, and plugins is hard and can’t scale

Because we need to maintain consistency through operations from start to end but we also allow changing config, plugins, and more during the servers operation the current locking design just doesn’t scale. It’s also not 100% safe either as the values are changed by atomics, not managed by transactions. We could use copy-on-write for this, but why? Config should be managed by tools like ansible, but today our dynamic config and plugins is both a performance over head and an admin overhead because we exclude best practice tools and have to spend a large amount of time to maintain consistent data when we shouldn’t need to. Less features is less support overhead on us, and simpler to test and assert quality and correct behaviour.

  • Plugins to address shortfalls, but a bit odd.

We have all these features to address issues, but they all do it … kind of the odd way. Managed Entries creates user private groups on object creation. But the problem is “unix requires a private group” and “ldap schema doesn’t allow a user to be a group and user at the same time”. So the answer is actually to create a new objectClass that let’s a user ALSO be it’s own UPG, not “create an object that links to the user”. (Or have a client generate the group from user attributes but we shouldn’t shift responsibility to the client.)

Distributed Numeric Assignment is based on the AD rid model, but it’s all about “how can we assign a value to a user that’s unique?”. We already have a way to do this, in the UUID, so why not derive the UID/GID from the UUID. This means there is no complex inter-server communication, pooling, just simple isolated functionality.

We have lots of features that just are a bit complex, and could have been made simpler, that now we have to support, and can’t change to make them better. If we rolled a new “fixed” version, we would then have to support both because projects like FreeIPA aren’t going to just change over.

  • client tools are controlled by others and complex (sssd, openldap)

Every tool for dealing with ldap is really confusing and arcane. They all have wild (unhelpful) defaults, and generally this scares people off. I took months of work to get a working ldap server in the past. Why? It’s 2018, things need to “just work”. Our tools should “just work”. Why should I need to hand edit pam? Why do I need to set weird options in SSSD.conf? All of this makes the whole experience poor.

We are making client tools that can help (to an extent), but they are really limited to system administration and they aren’t “generic” tools for every possible configuration that exists. So at some point people will still find a limit where they have to touch ldap commands. A common request is a simple to use web portal for password resets, which today only really exists in FreeIPA, and that limits it’s application already.

  • hard to change legacy

It’s really hard to make code changes because our surface area is so broad and the many use cases means that we risk breakage every time we do. I have even broken customer deployments like this. It’s almost impossible to get away from, and that holds us back because it means we are scared to make changes because we have to support the 1 million existing work flows. To add another is more support risk.

Many deployments use legacy schema elements that holds us back, ranging from the inet types, schema that enforces a first/last name, schema that won’t express users + groups in a simple away. It’s hard to ask people to just up and migrate their data, and even if we wanted too, ldap allows too much freedom so we are more likely to break data, than migrate it correctly if we tried.

This holds us back from technical changes, and social representation changes. People are more likely to engage with a large migrational change, than an incremental change that disturbs their current workflow (IE moving from on prem to cloud, rather than invest in smaller iterative changes to make their local solutions better).

  • ACI’s are really complex

389’s access controls are good because they are in the tree and replicated, but bad because the syntax is awful, complex, and has lots of traps and complexity. Even I need to look up how to write them when I have to. This is not good for a project that has such deep security concerns, where your ACI’s can look correct but actually expose all your data to risks.

  • LDAP as a protocol is like an 90’s drug experience

LDAP may be the lingua franca of authentication, but it’s complex, hard to use and hard to write implementations for. That’s why in opensource we have a monoculture of using the openldap client libraries because no one can work out how to write a standalone library. Layer on top the complexity of the object and naming model, and we have a situation where no one wants to interact with LDAP and rather keeps it at arm length.

It’s going to be extremely hard to move forward here, because the community is so fragmented and small, and the working groups dispersed that the idea of LDAPv4 is a dream that no one should pursue, even though it’s desperately needed.

  • TLS

TLS is great. NSS databases and tools are not.


GSSAPI and Kerberos are a piece of legacy that we just can’t escape from. They are a curse almost, and one we need to break away from as it’s completely unusable (even if it what it promises is amazing). We need to do better.

That and SSO allows loads of attacks to proceed, where we actually want isolated token auth with limited access scopes …

What could we offer

  • Web application as a first class consumer.

People want web portals for their clients, and they want to be able to use web applications as the consumer of authentication. The HTTP protocols must be the first class integration point for anything in identity management today. This means using things like OAUTH/OIDC.

  • Systems security as a first class consumer.

Administrators still need to SSH to machines, and people still need their systems to have identities running on them. Having pam/nsswitch modules is a very major requirement, where those modules have to be fast, simple, and work correctly. Users should “imply” a private group, and UID/GID should by dynamic from UUID (or admins can override it).

  • 2FA/u2f/TOTP.

Multi-factor auth is here (not coming, here), and we are behind the game. We already have Apple and MS pushing for webauthn in their devices. We need to be there for these standards to work, and to support the next authentication tool after that.

  • Good RADIUS integration.

RADIUS is not going away, and is important in education providers and business networks, so RADIUS must “just work”. Importantly, this means mschapv2 which is the universal default for all clients to operate with, which means nthash.

However, we can make the nthash unlinked from your normal password, so you can then have wifi password and a seperate loging password. We could even generate an NTHash containing the TOTP token for more high security environments.

  • better data structure (flat, defined by object types).

The tree structure of LDAP is confusing, but a flatter structure is easier to manage and understand. We can use ideas from kubernetes like tags/labels which can be used to provide certain controls and filtering capabilities for searches and access profiles to apply to.

  • structured logging, with in built performance profiling.

Being able to diagnose why an operation is slow is critical and having structured logs with profiling information is key to allowing admins and developers to resolve performance issues at scale. It’s also critical to have auditing of every single change made in the system, including internal changes that occur during operations.

  • access profiles with auditing capability.

Access profiles that express what you can access, and how. Easier to audit, generate, and should be tightly linked to group membership for real RBAC style capabilities.

  • transactions by allowing batch operations.

LDAP wants to provide a transaction system over a set of operations, but that may cause performance issues on write paths. Instead, why not allow submission of batches of changes that all must occur “at the same time” or “none”. This is faster network wise, protocol wise, and simpler for a server to implement.

What’s next then …

Instead of fixing what we have, why not take the best of what we have, and offer something new in parallel? Start a new front end that speaks in an accessible way, that has modern structures, and has learnt from the lessons of the past? We can build it to standalone, or proxy from the robust core of 389 Directory Server allowing migration paths, but eschew the pain of trying to bring people to the modern world. We can offer something unique, an open source identity system that’s easy to use, fast, secure, that you can run on your terms, or in the cloud.

This parallel project seems like a good idea … I wonder what to name it …

Fri, 21 Dec 2018 00:00:00 +1000 <![CDATA[Work around docker exec bug]]> Work around docker exec bug

There is currently a docker exec bug in Centos/RHEL 7 that causes errors such as:

# docker exec -i -t instance /bin/sh
rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused "process_linux.go:110: decoding init error from pipe caused \"read parent: connection reset by peer\""

As a work around you can use nsenter instead:

PID=docker inspect --format {{.State.Pid}} <name of container>
nsenter --target $PID --mount --uts --ipc --net --pid /bin/sh

For more information, you can see the bugreport here.

Sun, 09 Dec 2018 00:00:00 +1000 <![CDATA[High Available RADVD on Linux]]> High Available RADVD on Linux

Recently I was experimenting again with high availability router configurations so that in the cause of an outage or a failover the other router will take over and traffic is still served.

This is usually done through protocols like VRRP to allow virtual ips to exist that can be failed between. However with ipv6 one needs to still allow clients to find the router, and in the cause of a failure, the router advertisments still must continue for client renewals.

To achieve this we need two parts. A shared Link Local address, and a special RADVD configuration.

Because of howe ipv6 routers work, all traffic (even global) is still sent to your link local router. We can use an address like:


This doesn’t clash with any reserved or special ipv6 addresses, and it’s easy to remember. Because of how link local works, we can put this on many interfaces of the router (many vlans) with no conflict.

So now to the two components.


Keepalived is a VRRP implementation for linux. It has extensive documentation and sometimes uses some implementation specific language, but it works well for what it does.

Our configuration looks like:

#  /etc/keepalived/keepalived.conf
global_defs {
  vrrp_version 3

vrrp_sync_group G1 {
 group {

vrrp_instance ipv6_ens256 {
   interface ens256
   virtual_router_id 62
   priority 50
   advert_int 1.0
   virtual_ipaddress {
   garp_master_delay 1

Note that we provide both a global address and an LL address for the failover. This is important for services and DNS for the router to have the global, but you could omit this. The LL address however is critical to this configuration and must be present.

Now you can start up vrrp, and you should see one of your two linux machines pick up the address.


For RADVD to work, a feature of the 2.x series is required. Packaging this for el7 is out of scope of this post, but fedora ships the version required.

The feature is that RADVD can be configured to specify which address it advertises for the router, rather than assuming the interface LL autoconf address is the address to advertise. The configuration appears as:

# /etc/radvd.conf
interface ens256
    AdvSendAdvert on;
    MinRtrAdvInterval 30;
    MaxRtrAdvInterval 100;
    AdvRASrcAddress {
    prefix 2001:db8::/64
        AdvOnLink on;
        AdvAutonomous on;
        AdvRouterAddr off;

Note the AdvRASrcAddress parameter? This defines a priority list of address to advertise that could be available on the interface.

Now start up radvd on your two routers, and try failing over between them while you ping from your client. Remember to ping LL from a client you need something like:

ping6 fe80::1:1%en1

Where the outgoing interface of your client traffic is denoted after the ‘%’.

Happy failover routing!

Thu, 01 Nov 2018 00:00:00 +1000 <![CDATA[Rust RwLock and Mutex Performance Oddities]]> Rust RwLock and Mutex Performance Oddities

Recently I have been working on Rust datastructures once again. In the process I wanted to test how my work performed compared to a standard library RwLock and Mutex. On my home laptop the RwLock was 5 times faster, the Mutex 2 times faster than my work.

So checking out my code on my workplace workstation and running my bench marks I noticed the Mutex was the same - 2 times faster. However, the RwLock was 4000 times slower.

What’s a RwLock and Mutex anyway?

In a multithreaded application, it’s important that data that needs to be shared between threads is consistent when accessed. This consistency is not just logical consistency of the data, but affects hardware consistency of the memory in cache. As a simple example, let’s examine an update to a bank account done by two threads:

acc = 10
deposit = 3
withdrawl = 5

[ Thread A ]            [ Thread B ]
acc = load_balance()    acc = load_balance()
acc = acc + deposit     acc = acc - withdrawl
store_balance(acc)      store_balance(acc)

What will the account balance be at the end? The answer is “it depends”. Because threads are working in parallel these operations could happen:

  • At the same time
  • Interleaved (various possibilities)
  • Sequentially

This isn’t very healthy for our bank account. We could lose our deposit, or have invalid data. Valid outcomes at the end are that acc could be 13, 5, 8. Only one of these is correct.

A mutex protects our data in multiple ways. It provides hardware consistency operations so that our cpus cache state is valid. It also allows only a single thread inside of the mutex at a time so we can linearise operations. Mutex comes from the word “Mutual Exclusion” after all.

So our example with a mutex now becomes:

acc = 10
deposit = 3
withdrawl = 5

[ Thread A ]            [ Thread B ]
mutex.lock()            mutex.lock()
acc = load_balance()    acc = load_balance()
acc = acc + deposit     acc = acc - withdrawl
store_balance(acc)      store_balance(acc)
mutex.unlock()          mutex.unlock()

Now only one thread will access our account at a time: The other thread will block until the mutex is released.

A RwLock is a special extension to this pattern. Where a mutex guarantees single access to the data in a read and write form, a RwLock (Read Write Lock) allows multiple read-only views OR single read and writer access. Importantly when a writer wants to access the lock, all readers must complete their work and “drain”. Once the write is complete readers can begin again. So you can imagine it as:

Time ->

T1: -- read --> x
T3:     -- read --> x                x -- read -->
T3:     -- read --> x                x -- read -->
T4:                   | -- write -- |
T5:                                  x -- read -->

Test Case for the RwLock

My test case is simple. Given a set of 12 threads, we spawn:

  • 8 readers. Take a read lock, read the value, release the read lock. If the value == target then stop the thread.
  • 4 writers. Take a write lock, read the value. Add one and write. Continue until value == target then stop.

Other conditions:

  • The test code is identical between Mutex/RwLock (beside the locking costruct)
  • –release is used for compiler optimisations
  • The test hardware is as close as possible (i7 quad core)
  • The tests are run multiple time to construct averages of the performance

The idea being that X target number of writes must occur, while many readers contend as fast as possible on the read. We are pressuring the system of choice between “many readers getting to read fast” or “writers getting priority to drain/block readers”.

On OSX given a target of 500 writes, this was able to complete in 0.01 second for the RwLock. (MBP 2011, 2.8GHz)

On Linux given a target of 500 writes, this completed in 42 seconds. This is a 4000 times difference. (i7-7700 CPU @ 3.60GHz)

All things considered the Linux machine should have an advantage - it’s a desktop processor, of a newer generation, and much faster clock speed. So why is the RwLock performance so much different on Linux?

To the source code!

Examining the Rust source code , many OS primitives come from libc. This is because they require OS support to function. RwLock is an example of this as is mutex and many more. The unix implementation for Rust consumes the pthread_rwlock primitive. This means we need to read man pages to understand the details of each.

OSX uses FreeBSD userland components, so we can assume they follow the BSD man pages. In the FreeBSD man page for pthread_rwlock_rdlock we see:


 To prevent writer starvation, writers are favored over readers.

Linux however, uses different constructs. Looking at the Linux man page:

  This is the default.  A thread may hold multiple read locks;
  that is, read locks are recursive.  According to The Single
  Unix Specification, the behavior is unspecified when a reader
  tries to place a lock, and there is no write lock but writers
  are waiting.  Giving preference to the reader, as is set by
  PTHREAD_RWLOCK_PREFER_READER_NP, implies that the reader will
  receive the requested lock, even if a writer is waiting.  As
  long as there are readers, the writer will be starved.

Reader vs Writer Preferences?

Due to the policy of a RwLock having multiple readers OR a single writer, a preference is given to one or the other. The preference basically boils down to the choice of:

  • Do you respond to write requests and have new readers block?
  • Do you favour readers but let writers block until reads are complete?

The difference is that on a read heavy workload, a write will continue to be delayed so that readers can begin and complete (up until some threshold of time). However, on a writer focused workload, you allow readers to stall so that writes can complete sooner.

On Linux, they choose a reader preference. On OSX/BSD they choose a writer preference.

Because our test is about how fast can a target of write operations complete, the writer preference of BSD/OSX causes this test to be much faster. Our readers still “read” but are giving way to writers, which completes our test sooner.

However, the linux “reader favour” policy means that our readers (designed for creating conteniton) are allowed to skip the queue and block writers. This causes our writers to starve. Because the test is only concerned with writer completion, the result is (correctly) showing our writers are heavily delayed - even though many more readers are completing.

If we were to track the number of reads that completed, I am sure we would see a large factor of difference where Linux has allow many more readers to complete than the OSX version.

Linux pthread_rwlock does allow you to change this policy (PTHREAD_RWLOCK_PREFER_WRITER_NP) but this isn’t exposed via Rust. This means today, you accept (and trust) the OS default. Rust is just unaware at compile time and run time that such a different policy exists.


Rust like any language consumes operating system primitives. Every OS implements these differently and these differences in OS policy can cause real performance differences in applications between development and production.

It’s well worth understanding the constructions used in programming languages and how they affect the performance of your application - and the decisions behind those tradeoffs.

This isn’t meant to say “don’t use RwLock in Rust on Linux”. This is meant to say “choose it when it makes sense - on read heavy loads, understanding writers will delay”. For my project (A copy on write cell) I will likely conditionally compile rwlock on osx, but mutex on linux as I require a writer favoured behaviour. There are certainly applications that will benefit from the reader priority in linux (especially if there is low writer volume and low penalty to delayed writes).

Fri, 19 Oct 2018 00:00:00 +1000 <![CDATA[Photography - Why You Should Use JPG (not RAW)]]> Photography - Why You Should Use JPG (not RAW)

When I started my modern journey into photography, I simply shot in JPG. I was happy with the results, and the images I was able to produce. It was only later that I was introduced to a now good friend and he said: “You should always shoot RAW! You can edit so much more if you do.”. It’s not hard to find many ‘beginner’ videos all touting the value of RAW for post editing, and how it’s the step from beginner to serious photographer (and editor).

Today, I would like to explore why I have turned off RAW on my camera bodies for good. This is a deeply personal decision, and I hope that my experience helps you to think about your own creative choices. If you want to stay shooting RAW and editing - good on you. If this encourages you to try turning back to JPG - good on you too.

There are two primary reasons for why I turned off RAW:

  • Colour reproduction of in body JPG is better to the eye today.
  • Photography is about composing an image from what you have infront of you.

Colour is about experts (and detail)

I have always been unhappy with the colour output of my editing software when processing RAW images. As someone who is colour blind I did not know if it was just my perception, or if real issues existed. No one else complained so it must just be me right!

Eventually I stumbled on an article about how to develop real colour and extract camera film simulations for my editor. I was interested in both the ability to get true reflections of colour in my images, but also to use the film simulations in post (the black and white of my camera body is beautiful and soft, but my editor is harsh).

I spent a solid week testing and profiling both of my cameras. I quickly realised a great deal about what was occuring in my editor, but also my camera body.

The editor I have, is attempting to generalise over the entire set of sensors that a manufacturer has created. They are also attempting to create a true colour output profile, that is as reflective of reality as possible. So when I was exporting RAWs to JPG, I was seeing the differences between what my camera hardware is, vs the editors profiles. (This was particularly bad on my older body, so I suspect the RAW profiles are designed for the newer sensor).

I then created film simulations and quickly noticed the subtle changes. Blacks were blacker, but retained more fine detail with the simulation. Skin tone was softer. Exposure was more even across a variety of image types. How? RAW and my editor is meant to create the best image possible? Why is a film-simulation I have “extracted” creating better images?

As any good engineer would do I created sample images. A/B testing. I would provide the RAW processed by my editor, and a RAW processed with my film simulation. I would vary the left/right of the image, exposure, subject, and more. After about 10 tests across 5 people, only on one occasion did someone prefer the RAW from my editor.

At this point I realised that my camera manufacturer is hiring experts who build, live and breath colour technology. They have tested and examined everything about the body I have, and likely calibrated it individually in the process to make produce exact reproductions as they see in a lab. They are developing colour profiles that are not just broadly applicable, but also pleasing to look at (even if not accurate reproductions).

So how can my film simulations I extracted and built in a week, measure up to the experts? I decided to find out. I shot test images in JPG and RAW and began to provide A/B/C tests to people.

If the editor RAW was washed out compared to the RAW with my film simulation, the JPG from the body made them pale in comparison. Every detail was better, across a range of conditions. The features in my camera body are better than my editor. Noise reduction, dynamic range, sharpening, softening, colour saturation. I was holding in my hands a device that has thousands of hours of expert design, that could eclipse anything I built on a weekend for fun to achieve the same.

It was then I came to think about and realise …

Composition (and effects) is about you

Photography is a complex skill. It’s not having a fancy camera and just clicking the shutter, or zooming in. Photography is about taking that camera and putting it in a position to take a well composed image based on many rules (and exceptions) that I am still continually learning.

When you stop to look at an image you should always think “how can I compose the best image possible?”.

So why shoot in RAW? RAW is all about enabling editing in post. After you have already composed and taken the image. There are valid times and useful functions of editing. For example whitebalance correction and minor cropping in some cases. Both of these are easily conducted with JPG with no loss in quality compared to the RAW. I still commonly do both of these.

However RAW allows you to recover mistakes during composition (to a point). For example, the powerful base-curve fusion module allows dynamic range “after the fact”. You may even add high or low pass filters, or mask areas to filter and affect the colour to make things pop, or want that RAW data to make your vibrance control as perfect as possible. You may change the perspective, or even add filters and more. Maybe you want to optimise de-noise to make smooth high ISO images. There are so many options!

But all these things are you composing after. Today, many of these functions are in your camera - and better performing. So while I’m composing I can enable dynamic range for the darker elements of the frame. I can compose and add my colour saturation (or remove it). I can sharpen, soften. I can move my own body to change perspective. All at the time I am building the image in my mind, as I compose, I am able to decide on the creative effects I want to place in that image. I’m not longer just composing within a frame, but a canvas of potential effects.

To me this was an important distinction. I always found I was editing poorly-composed images in an attempt to “fix” them to something acceptable. Instead I should have been looking at how to compose them from the start to be great, using the tool in my hand - my camera.

Really, this is a decision that is yours. Do you spend more time now to make the image you want? Or do you spend it later editing to achieve what you want?


Photography is a creative process. You will have your own ideas of how that process should look, and how you want to work with it. Great! This was my experience and how I have arrived at a creative process that I am satisfied with. I hope that it provides you an alternate perspective to the generally accepted “RAW is imperative” line that many people advertise.

Mon, 06 Aug 2018 00:00:00 +1000 <![CDATA[Extracting Formally Verified C with FStar and KreMLin]]> Extracting Formally Verified C with FStar and KreMLin

As software engineering has progressed, the correctness of software has become a major issue. However the tools that exist today to help us create correct programs have been lacking. Human’s continue to make mistakes that cause harm to others (even I do!).

Instead, tools have now been developed that allow the verification of programs and algorithms as correct. These have not gained widespread adoption due to the complexities of their tool chains or other social and cultural issues.

The Project Everest research has aimed to create a formally verified webserver and cryptography library. To achieve this they have developed a language called F* (FStar) and KreMLin as an extraction tool. This allows an FStar verified algorithm to be extracted to a working set of C source code - C source code that can be easily added to existing projects.

Setting up a FStar and KreMLin environment

Today there are a number of undocumented gotchas with opam - the OCaml package manager. Most of these are silent errors. I used the following steps to get to a working environment.

# It's important to have bzip2 here else opam silently fails!
dnf install -y rsync git patch opam bzip2 which gmp gmp-devel m4 \
        hg unzip pkgconfig redhat-rpm-config

opam init
# You need 4.02.3 else wasm will not compile.
opam switch create 4.02.3
opam switch 4.02.3
echo ". /home/Work/.opam/opam-init/ > /dev/null 2> /dev/null || true" >> .bashrc
opam install batteries fileutils yojson ppx_deriving_yojson zarith fix pprint menhir process stdint ulex wasm

PATH "~/z3/bin:~/FStar/bin:~/kremlin:$PATH"
# You can get the "correct" z3 version from
mv z3- z3

# You will need a "stable" release of FStar
mv FStar-stable Fstar
cd ~/FStar
opam config exec -- make -C src/ocaml-output -j
opam config exec -- make -C ulib/ml

# You need a master branch of kremlin
cd ~
mv kremlin-master kremlin
cd kremlin
opam config exec -- make
opam config exec -- make test

Your first FStar extraction

Mon, 30 Apr 2018 00:00:00 +1000 <![CDATA[AD directory admins group setup]]> AD directory admins group setup

Recently I have been reading many of the Microsoft Active Directory best practices for security and hardening. These are great resources, and very well written. The major theme of the articles is “least privilege”, where accounts like Administrators or Domain Admins are over used and lead to further compromise.

A suggestion that is put forward by the author is to have a group that has no other permissions but to manage the directory service. This should be used to temporarily make a user an admin, then after a period of time they should be removed from the group.

This way you have no Administrators or Domain Admins, but you have an AD only group that can temporarily grant these permissions when required.

I want to explore how to create this and configure the correct access controls to enable this scheme.

Create our group

First, lets create a “Directory Admins” group which will contain our members that have the rights to modify or grant other privileges.

# /usr/local/samba/bin/samba-tool group add 'Directory Admins'
Added group Directory Admins

It’s a really good idea to add this to the “Denied RODC Password Replication Group” to limit the risk of these accounts being compromised during an attack. Additionally, you probably want to make your “admin storage” group also a member of this, but I’ll leave that to you.

# /usr/local/samba/bin/samba-tool group addmembers "Denied RODC Password Replication Group" "Directory Admins"

Now that we have this, lets add a member to it. I strongly advise you create special accounts just for the purpose of directory administration - don’t use your daily account for this!

# /usr/local/samba/bin/samba-tool user create da_william
User 'da_william' created successfully
# /usr/local/samba/bin/samba-tool group addmembers 'Directory Admins' da_william
Added members to group Directory Admins

Configure the permissions

Now we need to configure the correct dsacls to allow Directory Admins full control over directory objects. It could be possible to constrain this to only modification of the cn=builtin and cn=users container however, as directory admins might not need so much control for things like dns modification.

If you want to constrain these permissions, only apply the following to cn=builtins instead - or even just the target groups like Domain Admins.

First we need the objectSID of our Directory Admins group so we can build the ACE.

# /usr/local/samba/bin/samba-tool group show 'directory admins' --attributes=cn,objectsid
dn: CN=Directory Admins,CN=Users,DC=adt,DC=blackhats,DC=net,DC=au
cn: Directory Admins
objectSid: S-1-5-21-2488910578-3334016764-1009705076-1104

Now with this we can construct the ACE.


This permission grants:

  • RP: read property
  • WP: write property
  • LC: list child objects
  • LO: list objects
  • RC: read control

It could be possible to expand these rights: it depends if you want directory admins to be able to do “day to day” ad control jobs, or if you just use them for granting of privileges. That’s up to you. An expanded ACE might be:

# Same as Enterprise Admins

Now lets actually apply this and do a test:

# /usr/local/samba/bin/samba-tool dsacl set --sddl='(A;CI;RPWPLCLORC;;;S-1-5-21-2488910578-3334016764-1009705076-1104)' --objectdn='dc=adt,dc=blackhats,dc=net,dc=au'
# /usr/local/samba/bin/samba-tool group addmembers 'directory admins' administrator -U 'da_william%...'
Added members to group directory admins
# /usr/local/samba/bin/samba-tool group listmembers 'directory admins' -U 'da_william%...'

After we have completed our tasks, we remove da_william from the directory admins group as we no longer required the privileges. You can self-remove, or have the Administrator account do the removal.

# /usr/local/samba/bin/samba-tool group removemembers 'directory admins' da_william -U 'da_william%...'
Removed members from group directory admins

# /usr/local/samba/bin/samba-tool group removemembers 'directory admins' da_william -U 'Administrator'
Removed members from group directory admins

Finally check that da_william is no longer in the group.

# /usr/local/samba/bin/samba-tool group listmembers 'directory admins' -U 'da_william%...'


With these steps we have created a secure account that has limited admin rights, able to temporarily promote users with privileges for administrative work - and able to remove it once the work is complete.

Thu, 26 Apr 2018 00:00:00 +1000 <![CDATA[Understanding AD Access Control Entries]]> Understanding AD Access Control Entries

A few days ago I set out to work on making samba 4 my default LDAP server. In the process I was forced to learn about Active Directory Access controls. I found that while there was significant documentation around the syntax of these structures, very little existed explaining how to use them effectively.

What’s in an ACE?

If you look at the the ACL of an entry in AD you’ll see something like:


This seems very confusing and complex (and someone should write a tool to explain these … maybe me). But once you can see the structure it starts to make sense.

Most of the access controls you are viewing here are DACLs or Discrestionary Access Control Lists. These make up the majority of the output after ‘O:DAG:DAD:AI’. TODO: What does ‘O:DAG:DAD:AI’ mean completely?

After that there are many ACEs defined in SDDL or ???. The structure is as follows:


Each of these fields can take varies types. These interact to form the access control rules that allow or deny access. Thankfully, you don’t need to adjust many fields to make useful ACE entries.

MS maintains a document of these field values here.

They also maintain a list of wellknown SID values here

I want to cover some common values you may see though:


Most of the types you’ll see are “A” and “OA”. These mean the ACE allows an access by the SID.


These change the behaviour of the ACE. Common values you may want to set are CI and OI. These determine that the ACE should be inherited to child objects. As far as the MS docs say, these behave the same way.

If you see ID in this field it means the ACE has been inherited from a parent object. In this case the inherit_object_guid field will be set to the guid of the parent that set the ACE. This is great, as it allows you to backtrace the origin of access controls!


This is the important part of the ACE - it determines what access the SID has over this object. The MS docs are very comprehensive of what this does, but common values are:

  • RP: read property
  • WP: write property
  • CR: control rights
  • CC: child create (create new objects)
  • DC: delete child
  • LC: list child objects
  • LO: list objects
  • RC: read control
  • WO: write owner (change the owner of an object)
  • WD: write dac (allow writing ACE)
  • SW: self write
  • SD: standard delete
  • DT: delete tree

I’m not 100% sure of all the subtle behaviours of these, because they are not documented that well. If someone can help explain these to me, it would be great.


We will skip some fields and go straight to SID. This is the SID of the object that is allowed the rights from the rights field. This field can take a GUID of the object, or it can take a “well known” value of the SID. For example ‘AN’ means “anonymous users”, or ‘AU’ meaning authenticated users.


I won’t claim to be an AD ACE expert, but I did find the docs hard to interpret at first. Having a breakdown and explanation of the behaviour of the fields can help others, and I really want to hear from people who know more about this topic on me so that I can expand this resource to help others really understand how AD ACE’s work.

Fri, 20 Apr 2018 00:00:00 +1000 <![CDATA[Making Samba 4 the default LDAP server]]> Making Samba 4 the default LDAP server

Earlier this year Andrew Bartlett set me the challenge: how could we make Samba 4 the default LDAP server in use for Linux and UNIX systems? I’ve finally decided to tackle this, and write up some simple changes we can make, and decide on some long term goals to make this a reality.

What makes a unix directory anyway?

Great question - this is such a broad topic, even I don’t know if I can single out what it means. For the purposes of this exercise I’ll treat it as “what would we need from my previous workplace”. My previous workplace had a dedicated set of 389 Directory Server machines that served lookups mainly for email routing, application authentication and more. The didn’t really process a great deal of login traffic as the majority of the workstations were Windows - thus connected to AD.

What it did show was that Linux clients and applications:

  • Want to use anonymous binds and searchs - Applications and clients are NOT domain members - they just want to do searches
  • The content of anonymous lookups should be “public safe” information. (IE nothing private)
  • LDAPS is a must for binds
  • MemberOf and group filtering is very important for access control
  • sshPublicKey and userCertificate;binary is important for 2fa/secure logins

This seems like a pretty simple list - but it’s not the model Samba 4 or AD ship with.

You’ll also want to harden a few default settings. These include:

  • Disable Guest
  • Disable 10 machine join policy

AD works under the assumption that all clients are authenticated via kerberos, and that kerberos is the primary authentication and trust provider. As a result, AD often ships with:

  • Disabled anonymous binds - All clients are domain members or service accounts
  • No anonymous content available to search
  • No LDAPS (GSSAPI is used instead)
  • no sshPublicKey or userCertificates (pkinit instead via krb)
  • Access control is much more complex topic than just “matching an ldap filter”.

As a result, it takes a bit of effort to change Samba 4 to work in a way that suits both, securely.

Isn’t anonymous binding insecure?

Let’s get this one out the way - no it’s not. In every pen test I have seen if you can get access to a domain joined machine, you probably have a good chance of taking over the domain in various ways. Domain joined systems and krb allows lateral movement and other issues that are beyond the scope of this document.

The lack of anonymous lookup is more about preventing information disclosure - security via obscurity. But it doesn’t take long to realise that this is trivially defeated (get one user account, guest account, domain member and you can search …).

As a result, in some cases it may be better to allow anonymous lookups because then you don’t have spurious service accounts, you have a clear understanding of what is and is not accessible as readable data, and you don’t need every machine on the network to be domain joined - you prevent a possible foothold of lateral movement.

So anonymous binding is just fine, as the unix world has shown for a long time. That’s why I have very few concerns about enabling it. Your safety is in the access controls for searches, not in blocking anonymous reads outright.

Installing your DC

As I run fedora, you will need to build and install samba for source so you can access the heimdal kerberos functions. Fedora’s samba 4 ships ADDC support now, but lacks some features like RODC that you may want. In the future I expect this will change though.

These documents will help guide you:


build steps

install a domain

I strongly advise you use options similar to:

/usr/local/samba/bin/samba-tool domain provision --server-role=dc --use-rfc2307 --dns-backend=SAMBA_INTERNAL --realm=SAMDOM.EXAMPLE.COM --domain=SAMDOM --adminpass=Passw0rd

Allow anonymous binds and searches

Now that you have a working domain controller, we should test you have working ldap:

/usr/local/samba/bin/samba-tool forest directory_service dsheuristics 0000002 -H ldaps://localhost --simple-bind-dn=''
ldapsearch -b DC=samdom,DC=example,DC=com -H ldaps://localhost -x

You can see the domain object but nothing else. Many other blogs and sites recommend a blanket “anonymous read all” access control, but I think that’s too broad. A better approach is to add the anonymous read to only the few containers that require it.

/usr/local/samba/bin/samba-tool dsacl set --objectdn=DC=samdom,DC=example,DC=com --sddl='(A;;RPLCLORC;;;AN)' --simple-bind-dn="" --password=Passw0rd
/usr/local/samba/bin/samba-tool dsacl set --objectdn=CN=Users,DC=samdom,DC=example,DC=com --sddl='(A;CI;RPLCLORC;;;AN)' --simple-bind-dn="" --password=Passw0rd
/usr/local/samba/bin/samba-tool dsacl set --objectdn=CN=Builtin,DC=samdom,DC=example,DC=com --sddl='(A;CI;RPLCLORC;;;AN)' --simple-bind-dn="" --password=Passw0rd

In AD groups and users are found in cn=users, and some groups are in cn=builtin. So we allow read to the root domain object, then we set a read on cn=users and cn=builtin that inherits to it’s child objects. The attribute policies are derived elsewhere, so we can assume that things like kerberos data and password material are safe with these simple changes.

Configuring LDAPS

This is a reasonable simple exercise. Given a ca cert, key and cert we can place these in the correct locations samba expects. By default this is the private directory. In a custom install, that’s /usr/local/samba/private/tls/, but for distros I think it’s /var/lib/samba/private. Simply replace ca.pem, cert.pem and key.pem with your files and restart.

Adding schema

To allow adding schema to samba 4 you need to reconfigure the dsdb config on the schema master. To show the current schema master you can use:

/usr/local/samba/bin/samba-tool fsmo show -H ldaps://localhost --simple-bind-dn='' --password=Password1

Look for the value:

SchemaMasterRole owner: CN=NTDS Settings,CN=LDAPKDC,CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au

And note the CN=ldapkdc = that’s the hostname of the current schema master.

On the schema master we need to adjust the smb.conf. The change you need to make is:

    dsdb:schema update allowed = yes

Now restart the instance and we can update the schema. The following LDIF should work if you replace ${DOMAINDN} with your namingContext. You can apply it with ldapmodify

dn: CN=sshPublicKey,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au
changetype: add
objectClass: top
objectClass: attributeSchema
cn: sshPublicKey
name: sshPublicKey
lDAPDisplayName: sshPublicKey
description: MANDATORY: OpenSSH Public key
oMSyntax: 4
isSingleValued: FALSE
searchFlags: 8

dn: CN=ldapPublicKey,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au
changetype: add
objectClass: top
objectClass: classSchema
cn: ldapPublicKey
name: ldapPublicKey
description: MANDATORY: OpenSSH LPK objectclass
lDAPDisplayName: ldapPublicKey
subClassOf: top
objectClassCategory: 3
defaultObjectCategory: CN=ldapPublicKey,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au
mayContain: sshPublicKey

dn: CN=User,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au
changetype: modify
replace: auxiliaryClass
auxiliaryClass: ldapPublicKey
auxiliaryClass: posixAccount
auxiliaryClass: shadowAccount
sudo ldapmodify -f sshpubkey.ldif -D '' -w Password1 -H ldaps://localhost
adding new entry "CN=sshPublicKey,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au"

adding new entry "CN=ldapPublicKey,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au"

modifying entry "CN=User,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au"

To my surprise, userCertificate already exists! The reason I missed it is a subtle ad schema behaviour I missed. The ldap attribute name is stored in the lDAPDisplayName and may not be the same as the CN of the schema element. As a result, you can find this with:

ldapsearch -H ldaps://localhost -b CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au -x -D '' -W '(attributeId='

This doesn’t solve my issues: Because I am a long time user of 389-ds, that means I need some ns compat attributes. Here I add the nsUniqueId value so that I can keep some compatability.

dn: CN=nsUniqueId,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au
changetype: add
objectClass: top
objectClass: attributeSchema
attributeID: 2.16.840.1.113730.3.1.542
cn: nsUniqueId
name: nsUniqueId
lDAPDisplayName: nsUniqueId
description: MANDATORY: nsUniqueId compatability
oMSyntax: 4
isSingleValued: TRUE
searchFlags: 9

dn: CN=nsOrgPerson,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au
changetype: add
objectClass: top
objectClass: classSchema
governsID: 2.16.840.1.113730.3.2.334
cn: nsOrgPerson
name: nsOrgPerson
description: MANDATORY: Netscape DS compat person
lDAPDisplayName: nsOrgPerson
subClassOf: top
objectClassCategory: 3
defaultObjectCategory: CN=nsOrgPerson,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au
mayContain: nsUniqueId

dn: CN=User,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au
changetype: modify
replace: auxiliaryClass
auxiliaryClass: ldapPublicKey
auxiliaryClass: posixAccount
auxiliaryClass: shadowAccount
auxiliaryClass: nsOrgPerson

Now with this you can extend your users with the required data for SSH, certificates and maybe 389-ds compatability.

/usr/local/samba/bin/samba-tool user edit william  -H ldaps://localhost --simple-bind-dn=''


Out of the box a number of the unix attributes are not indexed by Active Directory. To fix this you need to update the search flags in the schema.

Again, temporarily allow changes:

    dsdb:schema update allowed = yes

Now we need to add some indexes for common types. Note that in the nsUniqueId schema I already added the search flags. We also want to set that these values should be preserved if they become tombstones so we can recove them.

/usr/local/samba/bin/samba-tool schema attribute modify uid --searchflags=9
/usr/local/samba/bin/samba-tool schema attribute modify nsUniqueId --searchflags=9
/usr/local/samba/bin/samba-tool schema attribute modify uidnumber --searchflags=9
/usr/local/samba/bin/samba-tool schema attribute modify gidnumber --searchflags=9
# Preserve on tombstone but don't index
/usr/local/samba/bin/samba-tool schema attribute modify x509-cert --searchflags=8
/usr/local/samba/bin/samba-tool schema attribute modify sshPublicKey --searchflags=8
/usr/local/samba/bin/samba-tool schema attribute modify gecos --searchflags=8
/usr/local/samba/bin/samba-tool schema attribute modify loginShell --searchflags=8
/usr/local/samba/bin/samba-tool schema attribute modify home-directory --searchflags=24

AD Hardening

We want to harden a few default settings that could be considered insecure. First, let’s stop “any user from being able to domain join machines”.

/usr/local/samba/bin/samba-tool domain settings account_machine_join_quota 0 -H ldaps://localhost --simple-bind-dn=''

Now let’s disable the Guest account

/usr/local/samba/bin/samba-tool user disable Guest -H ldaps://localhost --simple-bind-dn=''

I plan to write a more complete samba-tool extension for auditing these and more options, so stay tuned!

SSSD configuration

Now that our directory service is configured, we need to configure our clients to utilise it correctly.

Here is my SSSD configuration, that supports sshPublicKey distribution, userCertificate authentication on workstations and SID -> uid mapping. In the future I want to explore sudo rules in LDAP with AD, and maybe even HBAC rules rather than GPO.

Please refer to my other blog posts on configuration of the userCertificates and sshKey distribution.

ignore_group_members = False

# There is a bug in SSSD where this actually means "ipv6 only".
# lookup_family_order=ipv6_first
cache_credentials = True
id_provider = ldap
auth_provider = ldap
access_provider = ldap
chpass_provider = ldap
ldap_search_base = dc=blackhats,dc=net,dc=au

# This prevents an infinite referral loop.
ldap_referrals = False
ldap_id_mapping = True
ldap_schema = ad
# Rather that being in domain users group, create a user private group
# automatically on login.
# This is very important as a security setting on unix!!!
# See this bug if it doesn't work correctly.
auto_private_groups = true

ldap_uri = ldaps://
ldap_tls_reqcert = demand
ldap_tls_cacert = /etc/pki/tls/certs/bh_ldap.crt

# Workstation access
ldap_access_filter = (memberOf=CN=Workstation Operators,CN=Users,DC=blackhats,DC=net,DC=au)

ldap_user_member_of = memberof
ldap_user_gecos = cn
ldap_user_uuid = objectGUID
ldap_group_uuid = objectGUID
# This is really important as it allows SSSD to respect nsAccountLock
ldap_account_expire_policy = ad
ldap_access_order = filter, expire
# Setup for ssh keys
ldap_user_ssh_public_key = sshPublicKey
# This does not require ;binary tag with AD.
ldap_user_certificate = userCertificate
# This is required for the homeDirectory to be looked up in the sssd schema
ldap_user_home_directory = homeDirectory

services = nss, pam, ssh, sudo
config_file_version = 2
certificate_verification = no_verification

domains =
homedir_substring = /home

pam_cert_auth = True







With these simple changes we can easily make samba 4 able to perform the roles of other unix focused LDAP servers. This allows stateless clients, secure ssh key authentication, certificate authentication and more.

Some future goals to improve this include:

  • Ship samba 4 with schema templates that can be used
  • Schema querying (what objectclass takes this attribute?)
  • Group editing (same as samba-tool user edit)
  • Security auditing tools
  • user/group modification commands
  • Refactor and improve the cli tools python to be api driven - move the logic from netcmd into samdb so that samdb can be an API that python can consume easier. Prevent duplication of logic.

The goal is so that an admin never has to see an LDIF ever again.

Wed, 18 Apr 2018 00:00:00 +1000 <![CDATA[Smartcards and You - How To Make Them Work on Fedora/RHEL]]> Smartcards and You - How To Make Them Work on Fedora/RHEL

Smartcards are a great way to authenticate users. They have a device (something you have) and a pin (something you know). They prevent password transmission, use strong crypto and they even come in a variety of formats. From your “card” shapes to yubikeys.

So why aren’t they used more? It’s the classic issue of usability - the setup for them is undocumented, complex, and hard to discover. Today I hope to change this.

The Goal

To authenticate a user with a smartcard to a physical linux system, backed onto LDAP. The public cert in LDAP is validated, as is the chain to the CA.

You Will Need

I’ll be focusing on the yubikey because that’s what I own.

Preparing the Smartcard

First we need to make the smartcard hold our certificate. Because of a crypto issue in yubikey firmware, it’s best to generate certificates for these externally.

I’ve documented this before in another post, but for accesibility here it is again.

Create an NSS DB, and generate a certificate signing request:

certutil -d . -N -f pwdfile.txt
certutil -d . -R -a -o user.csr -f pwdfile.txt -g 4096 -Z SHA256 -v 24 \
--keyUsage digitalSignature,nonRepudiation,keyEncipherment,dataEncipherment --nsCertType sslClient --extKeyUsage clientAuth \
-s "CN=username,O=Testing,L=example,ST=Queensland,C=AU"

Once the request is signed, and your certificate is in “user.crt”, import this to the database.

certutil -A -d . -f pwdfile.txt -i user.crt -a -n TLS -t ",,"
certutil -A -d . -f pwdfile.txt -i ca.crt -a -n TLS -t "CT,,"

Now export that as a p12 bundle for the yubikey to import.

pk12util -o user.p12 -d . -k pwdfile.txt -n TLS

Now import this to the yubikey - remember to use slot 9a this time! As well make sure you set the touch policy NOW, because you can’t change it later!

yubico-piv-tool -s9a -i user.p12 -K PKCS12 -aimport-key -aimport-certificate -k --touch-policy=always

Setting up your LDAP user

First setup your system to work with LDAP via SSSD. You’ve done that? Good! Now it’s time to get our user ready.

Take our user.crt and convert it to DER:

openssl x509 -inform PEM -outform DER -in user.crt -out user.der

Now you need to transform that into something that LDAP can understand. In the future I’ll be adding a tool to 389-ds to make this “automatic”, but for now you can use python:

>>> import base64
>>> with open('user.der', 'r') as f:
>>>    print(base64.b64encode(

That should output a long base64 string on one line. Add this to your ldap user with ldapvi:

userCertificate;binary:: <BASE64>

Note that ‘;binary’ tag has an important meaning here for certificate data, and the ‘::’ tells ldap that this is b64 encoded, so it will decode on addition.

Setting up the system

Now that you have done that, you need to teach SSSD how to intepret that attribute.

In your various SSSD sections you’ll need to make the following changes:

auth_provider = ldap
ldap_user_certificate = userCertificate;binary

# This controls OCSP checks, you probably want this enabled!
# certificate_verification = no_verification

pam_cert_auth = True

Now the TRICK is letting SSSD know to use certificates. You need to run:

sudo touch /var/lib/sss/pubconf/pam_preauth_available

With out this, SSSD won’t even try to process CCID authentication!

Add your ca.crt to the system trusted CA store for SSSD to verify:

certutil -A -d /etc/pki/nssdb -i ca.crt -n USER_CA -t "CT,,"

Add coolkey to the database so it can find smartcards:

modutil -dbdir /etc/pki/nssdb -add "coolkey" -libfile /usr/lib64/

Check that SSSD can find the certs now:

# sudo /usr/libexec/sssd/p11_child --pre --nssdb=/etc/pki/nssdb
PIN for william
CAC ID Certificate

If you get no output here you are missing something! If this doesn’t work, nothing will!

Finally, you need to tweak PAM to make sure that pam_unix isn’t getting in the way. I use the following configuration.

auth        required
# This skips pam_unix if the given uid is not local (IE it's from SSSD)
auth        [default=1 ignore=ignore success=ok]
auth        sufficient nullok try_first_pass
auth        requisite uid >= 1000 quiet_success
auth        sufficient prompt_always ignore_unknown_user
auth        required

account     required
account     sufficient
account     sufficient uid < 1000 quiet
account     [default=bad success=ok user_unknown=ignore]
account     required

password    requisite try_first_pass local_users_only retry=3 authtok_type=
password    sufficient sha512 shadow try_first_pass use_authtok
password    sufficient use_authtok
password    required

session     optional revoke
session     required
-session    optional
session     [success=1 default=ignore] service in crond quiet use_uid
session     required
session     optional

That’s it! Restart SSSD, and you should be good to go.

Finally, you may find SELinux isn’t allowing authentication. This is really sad that smartcards don’t work with SELinux out of the box and I have raised a number of bugs, but check this just in case.

Happy authentication!

Tue, 27 Feb 2018 00:00:00 +1000 <![CDATA[Using b43 firmware on Fedora Atomic Workstation]]> Using b43 firmware on Fedora Atomic Workstation

My Macbook Pro has a broadcom b43 wireless chipset. This is notorious for being one of the most annoying wireless adapters on linux. When you first install Fedora you don’t even see “wifi” as an option, and unless you poke around in dmesg, you won’t find how to enable b43 to work on your platform.


The b43 driver requires proprietary firmware to be loaded else the wifi chip will not run. There are a number of steps for this process found on the linux wireless page . You’ll note that one of the steps is:

export FIRMWARE_INSTALL_DIR="/lib/firmware"
sudo b43-fwcutter -w "$FIRMWARE_INSTALL_DIR" broadcom-wl-5.100.138/linux/wl_apsta.o

So we need to be able to write and extract our firmware to /usr/lib/firmware, and then reboot and out wifi works.

Fedora Atomic Workstation

Atomic WS is similar to atomic server, that it’s a read-only ostree based deployment of fedora. This comes with a number of unique challenges and quirks but for this issue:

sudo touch /usr/lib/firmware/test
/bin/touch: cannot touch '/usr/lib/firmware/test': Read-only file system

So we can’t extract our firmware!

Normally linux also supports reading from /usr/local/lib/firmware (which on atomic IS writeable …) but for some reason fedora doesn’t allow this path.

Solution: Layered RPMs

Atomic has support for “rpm layering”. Ontop of the ostree image (which is composed of rpms) you can supply a supplemental list of packages that are “installed” at rpm-ostree update time.

This way you still have an atomic base platform, with read-only behaviours, but you gain the ability to customise your system. To achive it, it must be possible to write to locations in /usr during rpm install.

New method - install rpmfusion tainted

As I have now learnt, the b43 data is provided as part of rpmfusion nonfree. To enable this, you need to access the tainted repo. I a file such as “/etc/yum.repos.d/rpmfusion-nonfree-tainted.repo” add the content:


Now, you should be able to run:

atomic host install b43-firmware

You should have a working wifi chipset!

Custom RPM - old method

This means our problem has a simple solution: Create a b43 rpm package. Note, that you can make this for yourself privately, but you can’t distribute it for legal reasons.

Get setup on atomic to build the packages:

rpm-ostree install rpm-build createrepo

RPM specfile:

%define debug_package %{nil}
Summary: Allow b43 fw to install on ostree installs due to bz1512452
Name: b43-fw
Version: 1.0.0
Release: 1
Group: System Environment/Kernel

BuildRequires: b43-fwcutter


Broadcom firmware for b43 chips.

%setup -q -n broadcom-wl-5.100.138


mkdir -p %{buildroot}/usr/lib/firmware
b43-fwcutter -w %{buildroot}/usr/lib/firmware linux/wl_apsta.o

%dir %{_prefix}/lib/firmware/b43

* Fri Dec 22 2017 William Brown <william at> - 1.0.0
- Initial version

Now you can put this into a folder like so:

mkdir -p ~/rpmbuild/{SPECS,SOURCES}
<editor> ~/rpmbuild/SPECS/b43-fw.spec
wget -O ~/rpmbuild/SOURCES/broadcom-wl-5.100.138.tar.bz2

We are now ready to build!

rpmbuild -bb ~/rpmbuild/SPECS/b43-fw.spec
createrepo ~/rpmbuild/RPMS/x86_64/

Finally, we can install this. Create a yum repos file:

baseurl=file:///home/<YOUR USERNAME HERE>/rpmbuild/RPMS/x86_64
rpm-ostree install b43-fw

Now reboot and enjoy wifi on your Fedora Atomic Macbook Pro!

Sat, 23 Dec 2017 00:00:00 +1000 <![CDATA[Creating yubikey SSH and TLS certificates]]> Creating yubikey SSH and TLS certificates

Recently yubikeys were shown to have a hardware flaw in the way the generated private keys. This affects the use of them to provide PIV identies or SSH keys.

However, you can generate the keys externally, and load them to the key to prevent this issue.


First, we’ll create a new NSS DB on an airgapped secure machine (with disk encryption or in memory storage!)

certutil -N -d . -f pwdfile.txt

Now into this, we’ll create a self-signed cert valid for 10 years.

certutil -S -f pwdfile.txt -d . -t "C,," -x -n "SSH" -g 2048 -s "cn=william,O=ssh,L=Brisbane,ST=Queensland,C=AU" -v 120

We export this now to PKCS12 for our key to import.

pk12util -o ssh.p12 -d . -k pwdfile.txt -n SSH

Next we import the key and cert to the hardware in slot 9c

yubico-piv-tool -s9c -i ssh.p12 -K PKCS12 -aimport-key -aimport-certificate -k

Finally, we can display the ssh-key from the token.

ssh-keygen -D /usr/lib64/ -e

Note, we can make this always used by ssh client by adding the following into .ssh/config:

PKCS11Provider /usr/lib64/

TLS Identities

The process is almost identical for user certificates.

First, create the request:

certutil -d . -R -a -o user.csr -f pwdfile.txt -g 4096 -Z SHA256 -v 24 \
--keyUsage digitalSignature,nonRepudiation,keyEncipherment,dataEncipherment --nsCertType sslClient --extKeyUsage clientAuth \
-s "CN=username,O=Testing,L=example,ST=Queensland,C=AU"

Once the request is signed, we should have a user.crt back. Import that to our database:

certutil -A -d . -f pwdfile.txt -i user.crt -a -n TLS -t ",,"

Import our CA certificate also. Next export this to p12:

pk12util -o user.p12 -d . -k pwdfile.txt -n TLS

Now import this to the yubikey - remember to use slot 9a this time!

yubico-piv-tool -s9a -i user.p12 -K PKCS12 -aimport-key -aimport-certificate -k --touch-policy=always


Sat, 11 Nov 2017 00:00:00 +1000 <![CDATA[What’s the problem with NUMA anyway?]]> What’s the problem with NUMA anyway?

What is NUMA?

Non-Uniform Memory Architecture is a method of seperating ram and memory management units to be associated with CPU sockets. The reason for this is performance - if multiple sockets shared a MMU, they will cause each other to block, delaying your CPU.

To improve this, each NUMA region has it’s own MMU and RAM associated. If a CPU can access it’s local MMU and RAM, this is very fast, and does not prevent another CPU from accessing it’s own. For example:

CPU 0   <-- QPI --> CPU 1
  |                   |
  v                   v
MMU 0               MMU 1
  |                   |
  v                   v
RAM 1               RAM 2

For example, on the following system, we can see 1 numa region:

# numactl --hardware
available: 1 nodes (0)
node 0 cpus: 0 1 2 3
node 0 size: 12188 MB
node 0 free: 458 MB
node distances:
node   0
  0:  10

On this system, we can see two:

# numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 24 25 26 27 28 29 30 31 32 33 34 35
node 0 size: 32733 MB
node 0 free: 245 MB
node 1 cpus: 12 13 14 15 16 17 18 19 20 21 22 23 36 37 38 39 40 41 42 43 44 45 46 47
node 1 size: 32767 MB
node 1 free: 22793 MB
node distances:
node   0   1
  0:  10  20
  1:  20  10

This means that on the second system there is 32GB of ram per NUMA region which is accessible, but the system has total 64GB.

The problem

The problem arises when a process running on NUMA region 0 has to access memory from another NUMA region. Because there is no direct connection between CPU 0 and RAM 1, we must communicate with our neighbour CPU 1 to do this for us. IE:

CPU 0 --> CPU 1 --> MMU 1 --> RAM 1

Not only do we pay a time delay price for the QPI communication between CPU 0 and CPU 1, but now CPU 1’s processes are waiting on the MMU 1 because we are retrieving memory on behalf of CPU 0. This is very slow (and can be seen by the node distances in the numactl –hardware output).

Today’s work around

The work around today is to limit your Directory Server instance to a single NUMA region. So for our example above, we would limit the instance to NUMA region 0 or 1, and treat the instance as though it only has access to 32GB of local memory.

It’s possible to run two instances of DS on a single server, pinning them to their own regions and using replication between them to provide synchronisation. You’ll need a load balancer to fix up the TCP port changes, or you need multiple addresses on the system for listening.

The future

In the future, we’ll be adding support for better copy-on-write techniques that allow the cores to better cache content after a QPI negotiation - but we still have to pay the transit cost. We can minimise this as much as possible, but there is no way today to avoid this penalty. To use all your hardware on a single instance, there will always be a NUMA cost somewhere.

The best solution is as above: run an instance per NUMA region, and internally provide replication for them. Perhaps we’ll support an automatic configuration of this in the future.

Tue, 07 Nov 2017 00:00:00 +1000 <![CDATA[GSoC 2017 - Mentor Report from 389 Project]]> GSoC 2017 - Mentor Report from 389 Project

This year I have had the pleasure of being a mentor for the Google Summer of Code program, as part of the Fedora Project organisation. I was representing the 389 Directory Server Project and offered students the oppurtunity to work on our command line tools written in python.


From the start we have a large number of really talented students apply to the project. This was one of the hardest parts of the process was to choose a student, given that I wanted to mentor all of them. Sadly I only have so many hours in the day, so we chose Ilias, a student from Greece. What really stood out was his interest in learning about the project, and his desire to really be part of the community after the project concluded.

The project

The project was very deliberately “loose” in it’s specification. Rather than giving Ilias a fixed goal of you will implement X, Y and Z, I chose to set a “broad and vague” task. Initially I asked him to investigate a single area of the code (the MemberOf plugin). As he investigated this, he started to learn more about the server, ask questions, and open doors for himself to the next tasks of the project. As these smaller questions and self discoveries stacked up, I found myself watching Ilias start to become a really complete developer, who could be called a true part of our community.

Ilias’ work was exceptional, and he has documented it in his final report here .

Since his work is complete, he is now free to work on any task that takes his interest, and he has picked a good one! He has now started to dive deep into the server internals, looking at part of our backend internals and how we dump databases from id2entry to various output formats.

What next?

I will be participating next year - Sadly, I think the python project oppurtunities may be more limited as we have to finish many of these tasks to release our new CLI toolset. This is almost a shame as the python components are a great place to start as they ease a new contributor into the broader concepts of LDAP and the project structure as a whole.

Next year I really want to give this oppurtunity to an under-represented group in tech (female, poc, etc). I personally have been really inspired by Noriko and I hope to have the oppurtunity to pass on her lessons to another aspiring student. We need more engineers like her in the world, and I want to help create that future.

Advice for future mentors

Mentoring is not for everyone. It’s not a task which you can just send a couple of emails and be done every day.

Mentoring is a process that requires engagement with the student, and communication and the relationship is key to this. What worked well was meeting early in the project, and working out what community worked best for us. We found that email questions and responses worked (given we are on nearly opposite sides of the Earth) worked well, along with irc conversations to help fix up any other questions. It would not be uncommon for me to spend at least 1 or 2 hours a day working through emails from Ilias and discussions on IRC.

A really important aspect of this communication is how you do it. You have to balance positive communication and encouragement, along with critcism that is constructive and helpful. Empathy is a super important part of this equation.

My number one piece of advice would be that you need to create an environment where questions are encouraged and welcome. You can never be dismissive of questions. If ever you dismiss a question as “silly” or “dumb”, you will hinder a student from wanting to ask more questions. If you can’t answer the question immediately, send a response saying “hey I know this is important, but I’m really busy, I’ll answer you as soon as I can”.

Over time you can use these questions to help teach lessons for the student to make their own discoveries. For example, when Ilias would ask how something worked, I would send my response structured in the way I approached the problem. I would send back links to code, my thoughts, and how I arrived at the conclusion. This not only answered the question but gave a subtle lesson in how to research our codebase to arrive at your own solutions. After a few of these emails, I’m sure that Ilias has now become self sufficent in his research of the code base.

Another valuable skill is that overtime you can help to build confidence through these questions. To start with Ilias would ask “how to implement” something, and I would answer. Over time, he would start to provide ideas on how to implement a solution, and I would say “X is the right one”. As time went on I started to answer his question with “What do you think is the right solution and why?”. These exchanges and justifications have (I hope) helped him to become more confident in his ideas, the presentation of them, and justification of his solutions. It’s led to this excellent exchange on our mailing lists, where Ilias is discussing the solutions to a problem with the broader community, and working to a really great answer.

Final thoughts

This has been a great experience for myself and Ilias, and I really look forward to helping another student next year. I’m sure that Ilias will go on to do great things, and I’m happy to have been part of his journey.

Thu, 24 Aug 2017 00:00:00 +1000 <![CDATA[So you want to script gdb with python …]]> So you want to script gdb with python …

Gdb provides a python scripting interface. However the documentation is highly technical and not at a level that is easily accessible.

This post should read as a tutorial, to help you understand the interface and work toward creating your own python debuging tools to help make gdb usage somewhat “less” painful.

The problem

I have created a problem program called “naughty”. You can find it here .

You can compile this with the following command:

gcc -g -lpthread -o naughty naughty.c

When you run this program, your screen should be filled with:

thread ...
thread ...
thread ...
thread ...
thread ...
thread ...

It looks like we have a bug! Now, we could easily see the issue if we looked at the C code, but that’s not the point here - lets try to solve this with gdb.

gdb ./naughty
(gdb) run
[New Thread 0x7fffb9792700 (LWP 14467)]
thread ...

Uh oh! We have threads being created here. We need to find the problem thread. Lets look at all the threads backtraces then.

Thread 129 (Thread 0x7fffb3786700 (LWP 14616)):
#0  0x00007ffff7bc38eb in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/
#1  0x00000000004007bc in lazy_thread (arg=0x7fffffffdfb0) at naughty.c:19
#2  0x00007ffff7bbd3a9 in start_thread () from /lib64/
#3  0x00007ffff78e936f in clone () from /lib64/

Thread 128 (Thread 0x7fffb3f87700 (LWP 14615)):
#0  0x00007ffff7bc38eb in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/
#1  0x00000000004007bc in lazy_thread (arg=0x7fffffffdfb0) at naughty.c:19
#2  0x00007ffff7bbd3a9 in start_thread () from /lib64/
#3  0x00007ffff78e936f in clone () from /lib64/

Thread 127 (Thread 0x7fffb4788700 (LWP 14614)):
#0  0x00007ffff7bc38eb in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/
#1  0x00000000004007bc in lazy_thread (arg=0x7fffffffdfb0) at naughty.c:19
#2  0x00007ffff7bbd3a9 in start_thread () from /lib64/
#3  0x00007ffff78e936f in clone () from /lib64/


We have 129 threads! Anyone of them could be the problem. We could just read these traces forever, but that’s a waste of time. Let’s try and script this with python to make our lives a bit easier.

Python in gdb

Python in gdb works by bringing in a copy of the python and injecting a special “gdb” module into the python run time. You can only access the gdb module from within python if you are using gdb. You can not have this work from a standard interpretter session.

We can access a dynamic python runtime from within gdb by simply calling python.

(gdb) python
>print("hello world")
>hello world

The python code only runs when you press Control D.

Another way to run your script is to import them as “new gdb commands”. This is the most useful way to use python for gdb, but it does require some boilerplate to start.

import gdb

class SimpleCommand(gdb.Command):
    def __init__(self):
        # This registers our class as "simple_command"
        super(SimpleCommand, self).__init__("simple_command", gdb.COMMAND_DATA)

    def invoke(self, arg, from_tty):
        # When we call "simple_command" from gdb, this is the method
        # that will be called.
        print("Hello from simple_command!")

# This registers our class to the gdb runtime at "source" time.

We can run the command as follows:

(gdb) source
(gdb) simple_command
Hello from simple_command!

Solving the problem with python

So we need a way to find the “idle threads”. We want to fold all the threads with the same frame signature into one, so that we can view anomalies.

First, let’s make a “stackfold” command, and get it to list the current program.

class StackFold(gdb.Command):
def __init__(self):
    super(StackFold, self).__init__("stackfold", gdb.COMMAND_DATA)

def invoke(self, arg, from_tty):
    # An inferior is the 'currently running applications'. In this case we only
    # have one.
    inferiors = gdb.inferiors()
    for inferior in inferiors:


To reload this in the gdb runtime, just run “source” again. Try running this: Note that we dumped a heap of output? Python has a neat trick that dir and help can both return strings for printing. This will help us to explore gdb’s internals inside of our program.

We can see from the inferiors that we have threads available for us to interact with:

class Inferior(builtins.object)
 |  GDB inferior object
 |  threads(...)
 |      Return all the threads of this inferior.

Given we want to fold the stacks from all our threads, we probably need to look at this! So lets get one thread from this, and have a look at it’s help.

inferiors = gdb.inferiors()
for inferior in inferiors:
    thread_iter = iter(inferior.threads())
    head_thread = next(thread_iter)

Now we can run this by re-running “source” on our script, and calling stackfold again, we see help for our threads in the system.

At this point it get’s a little bit less obvious. Gdb’s python integration relates closely to how a human would interact with gdb. In order to access the content of a thread, we need to change the gdb context to access the backtrace. If we were doing this by hand it would look like this:

(gdb) thread 121
[Switching to thread 121 (Thread 0x7fffb778e700 (LWP 14608))]
#0  0x00007ffff7bc38eb in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/
(gdb) bt
#0  0x00007ffff7bc38eb in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/
#1  0x00000000004007bc in lazy_thread (arg=0x7fffffffdfb0) at naughty.c:19
#2  0x00007ffff7bbd3a9 in start_thread () from /lib64/
#3  0x00007ffff78e936f in clone () from /lib64/

We need to emulate this behaviour with our python calls. We can swap to the thread’s context with:

class InferiorThread(builtins.object)
 |  GDB thread object
 |  switch(...)
 |      switch ()
 |      Makes this the GDB selected thread.

Then once we are in the context, we need to take a different approach to explore the stack frames. We need to explore the “gdb” modules raw context.

inferiors = gdb.inferiors()
for inferior in inferiors:
    thread_iter = iter(inferior.threads())
    head_thread = next(thread_iter)
    # Move our gdb context to the selected thread here.

Now that we have selected our thread’s context, we can start to explore here. gdb can do a lot within the selected context - as a result, the help output from this call is really large, but it’s worth reading so you can understand what is possible to achieve. In our case we need to start to look at the stack frames.

To look through the frames we need to tell gdb to rewind to the “newest” frame (ie, frame 0). We can then step down through progressively older frames until we exhaust. From this we can print a rudimentary trace:


# Reset the gdb frame context to the "latest" frame.
# Now, work down the frames.
cur_frame = gdb.selected_frame()
while cur_frame is not None:
    # get the next frame down ....
    cur_frame = cur_frame.older()
(gdb) stackfold

Great! Now we just need some extra metadata from the thread to know what thread id it is so the user can go to the correct thread context. So lets display that too:


# These are the OS pid references.
(tpid, lwpid, tid) = head_thread.ptid
# This is the gdb thread number
gtid = head_thread.num
print("tpid %s, lwpid %s, tid %s, gtid %s" % (tpid, lwpid, tid, gtid))
# Reset the gdb frame context to the "latest" frame.
(gdb) stackfold
tpid 14485, lwpid 14616, tid 0, gtid 129

At this point we have enough information to fold identical stacks. We’ll iterate over every thread, and if we have seen the “pattern” before, we’ll just add the gdb thread id to the list. If we haven’t seen the pattern yet, we’ll add it. The final command looks like:

def invoke(self, arg, from_tty):
    # An inferior is the 'currently running applications'. In this case we only
    # have one.
    stack_maps = {}
    # This creates a dict where each element is keyed by backtrace.
    # Then each backtrace contains an array of "frames"
    inferiors = gdb.inferiors()
    for inferior in inferiors:
        for thread in inferior.threads():
            # Change to our threads context
            # Get the thread IDS
            (tpid, lwpid, tid) = thread.ptid
            gtid = thread.num
            # Take a human readable copy of the backtrace, we'll need this for display later.
            o = gdb.execute('bt', to_string=True)
            # Build the backtrace for comparison
            backtrace = []
            cur_frame = gdb.selected_frame()
            while cur_frame is not None:
                cur_frame = cur_frame.older()
            # Now we have a backtrace like ['pthread_cond_wait@@GLIBC_2.3.2', 'lazy_thread', 'start_thread', 'clone']
            # dicts can't use lists as keys because they are non-hashable, so we turn this into a string.
            # Remember, C functions can't have spaces in them ...
            s_backtrace = ' '.join(backtrace)
            # Let's see if it exists in the stack_maps
            if s_backtrace not in stack_maps:
                stack_maps[s_backtrace] = []
            # Now lets add this thread to the map.
            stack_maps[s_backtrace].append({'gtid': gtid, 'tpid' : tpid, 'bt': o} )
    # Now at this point we have a dict of traces, and each trace has a "list" of pids that match. Let's display them
    for smap in stack_maps:
        # Get our human readable form out.
        o = stack_maps[smap][0]['bt']
        for t in stack_maps[smap]:
            # For each thread we recorded
            print("Thread %s (LWP %s))" % (t['gtid'], t['tpid']))

Here is the final output.

(gdb) stackfold
Thread 129 (LWP 14485))
Thread 128 (LWP 14485))
Thread 127 (LWP 14485))
Thread 10 (LWP 14485))
Thread 9 (LWP 14485))
Thread 8 (LWP 14485))
Thread 7 (LWP 14485))
Thread 6 (LWP 14485))
Thread 5 (LWP 14485))
Thread 4 (LWP 14485))
Thread 3 (LWP 14485))
#0  0x00007ffff7bc38eb in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/
#1  0x00000000004007bc in lazy_thread (arg=0x7fffffffdfb0) at naughty.c:19
#2  0x00007ffff7bbd3a9 in start_thread () from /lib64/
#3  0x00007ffff78e936f in clone () from /lib64/

Thread 2 (LWP 14485))
#0  0x00007ffff78d835b in write () from /lib64/
#1  0x00007ffff78524fd in _IO_new_file_write () from /lib64/
#2  0x00007ffff7854271 in __GI__IO_do_write () from /lib64/
#3  0x00007ffff7854723 in __GI__IO_file_overflow () from /lib64/
#4  0x00007ffff7847fa2 in puts () from /lib64/
#5  0x00000000004007e9 in naughty_thread (arg=0x0) at naughty.c:27
#6  0x00007ffff7bbd3a9 in start_thread () from /lib64/
#7  0x00007ffff78e936f in clone () from /lib64/

Thread 1 (LWP 14485))
#0  0x00007ffff7bbe90d in pthread_join () from /lib64/
#1  0x00000000004008d1 in main (argc=1, argv=0x7fffffffe508) at naughty.c:51

With our stackfold command we can easily see that threads 129 through 3 have the same stack, and are idle. We can see that tread 1 is the main process waiting on the threads to join, and finally we can see that thread 2 is the culprit writing to our display.

My solution

You can find my solution to this problem as a reference implementation here .

Fri, 04 Aug 2017 00:00:00 +1000 <![CDATA[Time safety and Rust]]> Time safety and Rust

Recently I have had the great fortune to work on this ticket . This was an issue that stemmed from an attempt to make clock performance faster. Previously, a call to time or clock_gettime would involve a context switch an a system call (think solaris etc). On linux we have VDSO instead, so we can easily just swap to the use of raw time calls.

The problem

So what was the problem? And how did the engineers of the past try and solve it?

DS heavily relies on time. As a result, we call time() a lot in the codebase. But this would mean context switches.

So a wrapper was made called “current_time()”, which would cache a recent output of time(), and then provide that to the caller instead of making the costly context switch. So the code had the following:

static time_t   currenttime;
static int      currenttime_set = 0;

    if ( !currenttime_set ) {
        currenttime_set = 1;

    time( &currenttime );
    return( currenttime );

current_time( void )
    if ( currenttime_set ) {
        return( currenttime );
    } else {
        return( time( (time_t *)0 ));

In another thread, we would poll this every second to update the currenttime value:

void *
time_thread(void *nothing __attribute__((unused)))
    PRIntervalTime    interval;

    interval = PR_SecondsToInterval(1);

    while(!time_shutdown) {
        csngen_update_time ();


So what is the problem here

Besides the fact that we may not poll accurately (meaning we miss seconds but always advance), this is not thread safe. The reason is that CPU’s have register and buffers that may cache both stores and writes until a series of other operations (barriers + atomics) occur to flush back out to cache. This means the time polling thread could update the clock and unless the POLLING thread issues a lock or a barrier+atomic, there is no guarantee the new value of currenttime will be seen in any other thread. This means that the only way this worked was by luck, and no one noticing that time would jump about or often just be wrong.

Clearly this is a broken design, but this is C - we can do anything.

What if this was Rust?

Rust touts mulithread safety high on it’s list. So lets try and recreate this in rust.

First, the exact same way:

use std::time::{SystemTime, Duration};
use std::thread;

static mut currenttime: Option<SystemTime> = None;

fn read_thread() {
    let interval = Duration::from_secs(1);

    for x in 0..10 {
        let c_time = currenttime.unwrap();
        println!("reading time {:?}", c_time);

fn poll_thread() {
    let interval = Duration::from_secs(1);

    for x in 0..10 {
        currenttime = Some(SystemTime::now());
        println!("polling time");

fn main() {
    let poll = thread::spawn(poll_thread);
    let read = thread::spawn(read_thread);

Rust will not compile this code.

> rustc
error[E0133]: use of mutable static requires unsafe function or block
13 |         let c_time = currenttime.unwrap();
   |                      ^^^^^^^^^^^ use of mutable static

error[E0133]: use of mutable static requires unsafe function or block
22 |         currenttime = Some(SystemTime::now());
   |         ^^^^^^^^^^^ use of mutable static

error: aborting due to 2 previous errors

Rust has told us that this action is unsafe, and that we shouldn’t be modifying a global static like this.

This alone is a great reason and demonstration of why we need a language like Rust instead of C - the compiler can tell us when actions are dangerous at compile time, rather than being allowed to sit in production code for years.

For bonus marks, because Rust is stricter about types than C, we don’t have issues like:

int c_time = time();

Which is a 2038 problem in the making :)

Wed, 12 Jul 2017 00:00:00 +1000 <![CDATA[indexed search performance for ds - the mystery of the and query]]> indexed search performance for ds - the mystery of the and query

Directory Server is heavily based on set mathematics - one of the few topics I enjoyed during university. Our filters really boil down to set queries:


This filter describes the intersection of sets of objects containing “attr=val1” and “attr=val2”.

One of the properties of sets is that operations on them are commutative - the sets to a union or intersection may be supplied in any order with the same results. As a result, these are equivalent:


In the past I noticed an odd behaviour: that the order of filter terms in an ldapsearch query would drastically change the performance of the search. For example:


The later query may significantly outperform the former - but 10% or greater. I have never understood the reason why though. I toyed with ideas of re-arranging queries in the optimise step to put the terms in a better order, but I didn’t know what factors affected this behaviour.

Over time I realised that if you put the “more specific” filters first over the general filters, you would see a performance increase.

What was going on?

Recently I was asked to investigate a full table scan issue with range queries. This led me into an exploration of our search internals, and yielded the answer to the issue above.

Inside of directory server, our indexes are maintained as “pre-baked” searches. Rather than trying to search every object to see if a filter matches, our indexes contain a list of entries that match a term. For example:

uid=mark: 1, 2
uid=william: 3
uid=noriko: 4

From each indexed term we construct an IDList, which is the set of entries matching some term.

On a complex query we would need to intersect these. So the algorithm would iteratively apply this:

t1 = (a, b)
t2 = (c, t1)
t3 = (d, t2)

In addition, the intersection would allocate a new IDList to insert the results into.

What would happen is that if your first terms were large, we would allocate large IDLists, and do many copies into it. This would also affect later filters as we would need to check large ID spaces to perform the final intersection.

In the above example, consider a, b, c all have 10,000 candidates. This would mean t1, t2 is at least 10,000 IDs, and we need to do at least 20,000 comparisons. If d were only 3 candidates, this means that we then throw away the majority of work and allocations when we get to t3 = (d, t2).

What is the fix?

We now wrap each term in an idl_set processing api. When we get the IDList from each AVA, we insert it to the idl_set. This tracks the “minimum” IDList, and begins our intersection from the smallest matching IDList. This means that we have the quickest reduction in set size, and results in the smallest possible IDList allocation for the results. In my tests I have seen up to 10% improvement on complex queries.

For the example above, this means we procees d first, to reduce t1 to the smallest possible candidate set we can.

t1 = (d, a)
t2 = (b, t1)
t3 = (c, t2)

This means to create t2, t3, we will do an allocation that is bounded by the size of d (aka 3, rather than 10,000), we only need to perform fewer queries to reach this point.

A benefit of this strategy is that it means if on the first operation we find t1 is empty set, we can return immediately because no other intersection will have an impact on the operation.

What is next?

I still have not improved union performance - this is still somewhat affected by the ordering of terms in a filter. However, I have a number of ideas related to either bitmask indexes or disjoin set structures that can be used to improve this performance.

Stay tuned ….

Mon, 26 Jun 2017 00:00:00 +1000 <![CDATA[TLS Authentication and FreeRADIUS]]> TLS Authentication and FreeRADIUS

In a push to try and limit the amount of passwords sent on my network, I’m changing my wireless to use TLS certificates for authentication.


Thu, 25 May 2017 00:00:00 +1000 <![CDATA[Kerberos - why the world moved on]]> Kerberos - why the world moved on

For a long time I have tried to integrate and improve authentication technologies in my own environments. I have advocated the use of GSSAPI, IPA, AD, and others. However, the more I have learnt, the further I have seen the world moving away. I want to explore some of my personal experiences and views as to why this occured, and what we can do.


Tue, 23 May 2017 00:00:00 +1000 <![CDATA[Custom OSTree images]]> Custom OSTree images

Project Atomic is in my view, one of the most promising changes to come to linux distributions in a long time. It boasts the ability to atomicupgrade and alter your OS by maintaining A/B roots of the filesystem. It is currently focused on docker and k8s runtimes, but we can use atomic in other locations.


Mon, 22 May 2017 00:00:00 +1000 <![CDATA[Your Code Has Impact]]> Your Code Has Impact

As an engineer, sometimes it’s easy to forget why we are writing programs. Deep in a bug hunt, or designing a new feature it’s really easy to focus so hard on these small things you forget the bigger picture. I’ve even been there and made this mistake.


Fri, 10 Mar 2017 00:00:00 +1000 <![CDATA[CVE-2017-2591 - DoS via OOB heap read]]> CVE-2017-2591 - DoS via OOB heap read

On 18 of Jan 2017, the following email found it’s way to my notifications .

This is to disclose the following CVE:

CVE-2017-2591 389 Directory Server: DoS via OOB heap read

Description :

The "attribute uniqueness" plugin did not properly NULL-terminate an array
when building up its configuration, if a so called 'old-style'
configuration, was being used (Using nsslapd-pluginarg<X> parameters) .

A attacker, authenticated, but possibly also unauthenticated, could
possibly force the plugin to read beyond allocated memory and trigger a

The crash could also possibly be triggered accidentally.

Upstream patch :
Affected versions : from

Fixed version : 1.3.6

Impact: Low
CVSS3 scoring : 3.7 -- CVSS:3.0/AV:N/AC:H/PR:N/UI:N/S:U/C:N/I:N/A:L

Upstream bug report :

So I decided to pull this apart: Given I found the issue and wrote the fix, I didn’t deem it security worthy, so why was a CVE raised?


Wed, 22 Feb 2017 00:00:00 +1000 <![CDATA[The next year of Directory Server]]> The next year of Directory Server

Last year I wrote a post about the vision behind Directory Server and what I wanted to achieve in the project personally. My key aims were:

  • We need to modernise our tooling, and installers.
  • Setting up replication groups and masters needs to be simpler.
  • We need to get away from long lived static masters.
  • During updates, we need to start to enable smarter choices by default.
  • Out of the box we need smarter settings.
  • Web Based authentication


Mon, 23 Jan 2017 00:00:00 +1000 <![CDATA[Usability of software: The challenges facing projects]]> Usability of software: The challenges facing projects

I have always desired the usability of software like Directory Server to improve. As a former system administrator, usabilty and documentation are very important for me. Improvements to usability can eliminate load on documentation, support services and more.

Consider a microwave. No one reads the user manual. They unbox it, plug it in, and turn it on. You punch in a time and expect it to “make cold things hot”. You only consult the manual if it blows up.

Many of these principles are rooted in the field of design. Design is an important and often over looked part of software development - All the way from the design of an API to the configuration, and even the user interface of software.


Mon, 23 Jan 2017 00:00:00 +1000 <![CDATA[LCA2017 - Getting Into the Rusty Bucket]]> LCA2017 - Getting Into the Rusty Bucket

I spoke at Linux Conf Australia 2017 recently. I presented techniques and lessons about integrating Rust with existing C code bases. This is related to my work on Directory Server.

The recording of the talk can be found on youtube and on the Linux Australia Mirror .

You can find the git repository for the project on github .

The slides can be viewed on .

I have already had a lot of feedback on improvements to make to this system including the use of struct pointers instead of c_void, and the use of bindgen in certain places.

Mon, 23 Jan 2017 00:00:00 +1000 <![CDATA[State of the 389 ds port, 2017]]> State of the 389 ds port, 2017

Previously I have written about my efforts to port 389 ds to FreeBSD.

A great deal of progress has been made in the last few weeks (owing to my taking time off work).

I have now ported nunc-stans to freebsd, which is important as it’s our new connection management system. It has an issue with long lived events, but I will resolve this soon.

The majority of patches for 389 ds have merged, with a single patch remaining to be reviewed.

Finally, I have build a freebsd makefile for the devel root to make it easier for people to install and test from source.

Once the freebsd nunc-stans and final DS patch are accepted, I’ll be able to start building the portfiles.

Fri, 06 Jan 2017 00:00:00 +1000 <![CDATA[Openshift cluster administration]]> Openshift cluster administration

Over the last 6 months I have administered a three node openshift v3 cluster in my lab environment.

The summary of this expirence is that openshift is a great idea, but not ready for production. As an administrator you will find this a frustrating, difficult experience.


Mon, 02 Jan 2017 00:00:00 +1000 <![CDATA[The minssf trap]]> The minssf trap

In directory server, we often use the concept of a “minssf”, or the “minimum security strength factor”. This is derived from cyrus sasl. However, there are some issues that will catch you out!


Wed, 23 Nov 2016 00:00:00 +1000 <![CDATA[What’s new in 389 Directory Server 1.3.5 (unofficial)]]> What’s new in 389 Directory Server 1.3.5 (unofficial)

As a member of the 389 Directory Server (389DS) core team, I am always excited about our new releases. We have some really great features in 1.3.5. However, our changelogs are always large so I want to just touch on a few of my favourites.

389 Directory Server is an LDAPv3 compliant server, used around the world for Identity Management, Authentication, Authorisation and much more. It is the foundation of the FreeIPA project’s server. As a result, it’s not something we often think about or even get excited for: but every day many of us rely on 389 Directory Server to be correct, secure and fast behind the scenes.


Wed, 21 Sep 2016 00:00:00 +1000 <![CDATA[The mysterious crashing of my laptop]]> The mysterious crashing of my laptop

Recently I have grown unhappy with Fedora. The updates to the i915 graphics driver have caused my laptop to kernel panic just connecting and removing external displays: unacceptable to someone who moves their laptop around as much as I do.


Wed, 21 Sep 2016 00:00:00 +1000 <![CDATA[Block Chain for Identity Management]]>

Block Chain for Identity Management

On Sunday evening I was posed with a question and view of someone interested in Block Chain.

“What do you think of block chain for authentication”

A very heated debate ensued. I want to discuss this topic at length.

When you roll X …

We’ve heard this. “When writing cryptography, don’t. If you have to, hire a cryptographer”.

This statement is true for Authentication and Authorisation systems

“When writing Authentication and Authorisation systems, don’t. If you have to, hire an Authentication expert”.

Guess what. Authentication and Authorisation are hard. Very hard. This is not something for the kids to play with; this is a security critical, business critical, sensitive, scrutinised and highly conservative area of technology.


Mon, 18 Jul 2016 00:00:00 +1000 <![CDATA[Can I cycle through operators in C?]]> Can I cycle through operators in C?

A friend of mine who is learning to program asked me the following:

“”” How do i cycle through operators in c? If I want to try every version of + - * / on an equation, so printf(“1 %operator 1”, oneOfTheOperators); Like I’d stick them in an array and iterate over the array incrementing on each iteration, but do you define an operator as? They aren’t ints, floats etc “””

He’s certainly on the way to the correct answer already. There are three key barriers to an answer.


Sat, 16 Jul 2016 00:00:00 +1000 <![CDATA[tracking down insane memory leaks]]> tracking down insane memory leaks

One of the best parts of AddressSanitizer is the built in leak sanitiser. However, sometimes it’s not as clear as you might wish!

I0> /opt/dirsrv/bin/pwdhash hello

==388==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 72 byte(s) in 1 object(s) allocated from:
    #0 0x7f5f5f94dfd0 in calloc (/lib64/
    #1 0x7f5f5d7f72ae  (/lib64/

SUMMARY: AddressSanitizer: 72 byte(s) leaked in 1 allocation(s).

“Where is /lib64/” and what can I do with it? I have debuginfo and devel info installed, but I can’t seem to see what line that’s at.


Wed, 13 Jul 2016 00:00:00 +1000 <![CDATA[LDAP Guide Part 2: Searching]]> LDAP Guide Part 2: Searching

In the first part, we discussed how and LDAP tree is laid out, and why it’s called a “tree”.

In this part, we will discuss the most important and fundamental component of ldap: Searching.


Tue, 05 Jul 2016 00:00:00 +1000 <![CDATA[LDAP Guide Part 1: Foundations]]> LDAP Guide Part 1: Foundations

To understand LDAP we must understand a number of concepts of datastructures: Specifically graphs.


In computer science, a set of nodes, connected by some set of edges is called a graph. Here we can see a basic example of a graph.


Viewing this graph, we can see that it has a number of properties. It has 4 nodes, and 4 edges. As this is undirected we can assume the link A to B is as valid as B to A.

We also have a cycle: That is a loop between nodes. We can see this in B, C, D. If any edge between the set of B, D or B, C, or C, D were removed, this graph would no longer have cycles.


Mon, 20 Jun 2016 00:00:00 +1000 <![CDATA[GDB: Using memory watch points]]> GDB: Using memory watch points

While programming, we’ve all seen it.

“Why the hell is that variable set to 1? It should be X!”

A lot of programmers would stack print statements around till the find the issue. Others, might look at function calls.

But in the arsenal of the programmer, is the debugger. Normally, the debugger, is really overkill, and too complex to really solve a lot of issues. But while trying to find an issue like this, it shines.

All the code we are about to discuss is in the liblfdb git


Sat, 11 Jun 2016 00:00:00 +1000 <![CDATA[lock free database]]> lock free database

While discussing some ideas with the owner of liblfds I was thinking about some of the issues in the database of Directory Server, and other ldap products. What slows us down?

Why are locks slow?

It’s a good idea to read this article to understand Memory Barriers in a cpu.

When you think about the way a mutex has to work, it takes advantage of these primitives to create a full barrier, and do the compare and exchange to set the value of the lock, and to guarantee the other memory is synced to our core. This is pretty full on for cpu time, and in reverse, to unlock you have to basically do the same again. That’s a lot of operations! (NOTE: You do a load barrier on the way in to the lock, and a store barrier on the unlock. The end result is the full barrier over the set of operations as a whole.)


Tue, 07 Jun 2016 00:00:00 +1000 <![CDATA[Zero Outage Migration Of Directory Server Infrastructure]]> Zero Outage Migration Of Directory Server Infrastructure

In my previous job I used to manage the Directory Servers for a University. People used to challenge me that while doing migrations, downtime was needed.

They are all wrong

It is very possible, and achievable to have zero outage migrations of Directory Servers. All it takes is some thought, planning, dedication and testing.


Fri, 03 Jun 2016 00:00:00 +1000 <![CDATA[Acis for group creation and delegataion in DS]]> Acis for group creation and delegataion in DS

Something I get asked about frequently is ACI’s in Directory Server. They are a little bit of an art, and they have a lot of edge cases that you can hit.

I am asked about “how do I give access to uid=X to create groups in some ou=Y”.

TL;DR: You want the ACI at the end of the post. All the others are insecure in some way.

So lets do it.

First, I have my user:

dn: uid=test,ou=People,dc=example,dc=com
objectClass: top
objectClass: account
objectClass: simpleSecurityObject
uid: test
userPassword: {SSHA}LQKDZWFI1cw6EnnYtv74v622aPeNZ9cxXc/QIA==

And I have the ou I want them to edit:

dn: ou=custom,ou=Groups,dc=example,dc=com
objectClass: top
objectClass: organizationalUnit
ou: custom

So I would put the aci onto ou=custom,ou=Groups,dc=example,dc=com, and away I go:

aci: (target = "ldap:///ou=custom,ou=groups,dc=example,dc=com")
     (version 3.0; acl "example"; allow (read, search, write, add)
        (userdn = "ldap:///uid=test,ou=People,dc=example,dc=com");

Great! Now I can add a group under ou=custom,ou=Groups,dc=example,dc=com!


First, it allows uid=test,ou=People,dc=example,dc=com write access to the ou=custom itself, which means that it can alter the aci, and potentially grant further rights. That’s bad.

So lets tighten that up.

aci: (target = "ldap:///cn=*,ou=custom,ou=groups,dc=example,dc=com")
     (version 3.0; acl "example"; allow (read, search, write, add)
        (userdn = "ldap:///uid=test,ou=People,dc=example,dc=com");

Better! Now we can only create objects with cn=* under that ou, and we can’t edit the ou or it’s aci’s itself. But this is still insecure! Imagine, that I made:

dn: cn=fake_root,ou=custom,ou=groups,dc=example,dc=com
uid: xxxx
userClass: secure
memberOf: cn=some_privileged_group,....

Many sites often have their pam_ldap/nslcd/sssd set to search from the root IE dc=example,dc=com. Because ldap doesn’t define a sort order of responses, this entry may over-ride an exist admin user, or it could be a new user that matches authorisation filters. This just granted someone in your org access to all your servers!

But we can prevent this.

aci: (target = "ldap:///cn=*,ou=custom,ou=groups,dc=example,dc=com")
     (version 3.0; acl "example"; allow (read, search, write, add)
        (userdn = "ldap:///uid=test,ou=People,dc=example,dc=com");

Looks better! Now we can only create objects with objectClass top, and groupOfUniqueNames.

Then again ….

dn: cn=bar,ou=custom,ou=Groups,dc=example,dc=com
objectClass: top
objectClass: groupOfUniqueNames
objectClass: simpleSecurityObject
cn: bar
userPassword: {SSHA}UYVTFfPFZrN01puFXYJM3nUcn8lQcVSWtJmQIw==

Just because we say it has to have top and groupOfUniqueNames DOESN’T exclude adding more objectClasses!

Finally, if we make the delegation aci:

# This is the secure aci
aci: (target = "ldap:///cn=*,ou=custom,ou=groups,dc=example,dc=com")
     (targetattr="cn || uniqueMember || objectclass")
     (version 3.0; acl "example"; allow (read, search, write, add)
        (userdn = "ldap:///uid=test,ou=People,dc=example,dc=com");

This aci limits creation to only groups of unique names and top, and limits the attributes to only what can be made in those objectClasses. Finally we have a secure aci. Even though we can add other objectClasses, we can never actually add the attributes to satisfy them, so we effectively limit this to the types show. Even if we add other objectClasses that take “may” as the attribute, we can never fill in those attributes either.

Summary: Acis are hard.

Wed, 25 May 2016 00:00:00 +1000 <![CDATA[systemd is not monolithic]]> systemd is not monolithic

Go ahead. Please read this post by Lennart about systemd myths. I’ll wait.

Done? Great. You noticed the first point. “Systemd is monolithic”. Which is carefully “debunked”.

So this morning while building Ds, I noticed my compile failed:

configure: checking for Systemd...
checking for --with-systemd... using systemd native features
checking for --with-journald... using journald logging: WARNING, this may cause system instability
checking for pkg-config... (cached) /usr/bin/pkg-config
checking for Systemd with pkg-config... configure: error: no Systemd / Journald pkg-config files
Makefile:84: recipe for target 'ds-configure' failed

I hadn’t changed this part of the code, and it’s been reliably compiling for me … What changed?

Well on RHEL7 here is the layout of the system libraries:


They also each come with their own very nice pkg-config file so you can find them.


Sure these are big libraries, but it’s pretty modular. And it’s nice they are seperate out.

But today, I compiled on rawhide. What’s changed:



I almost thought this was an error. Surely they put libsystemd-journald into another package.

No. No they did not.

I0> readelf -Ws /usr/lib64/ | grep -i journal_print
   297: 00000000000248c0   177 FUNC    GLOBAL DEFAULT   12 sd_journal_print@@LIBSYSTEMD_209
   328: 0000000000024680   564 FUNC    GLOBAL DEFAULT   12 sd_journal_printv@@LIBSYSTEMD_209
   352: 0000000000023d80   788 FUNC    GLOBAL DEFAULT   12 sd_journal_printv_with_location@@LIBSYSTEMD_209
   399: 00000000000240a0   162 FUNC    GLOBAL DEFAULT   12 sd_journal_print_with_location@@LIBSYSTEMD_209

So we went from these small modular libraries:

-rwxr-xr-x. 1 root root  26K May 12 14:29 /usr/lib64/
-rwxr-xr-x. 1 root root  21K May 12 14:29 /usr/lib64/
-rwxr-xr-x. 1 root root 129K May 12 14:29 /usr/lib64/
-rwxr-xr-x. 1 root root  56K May 12 14:29 /usr/lib64/
-rwxr-xr-x. 1 root root 159K May 12 14:29 /usr/lib64/

To this monolithic library:

-rwxr-xr-x. 1 root root 556K May 22 14:09 /usr/lib64/

“Systemd is not monolithic”.

Mon, 23 May 2016 00:00:00 +1000 <![CDATA[389ds on freebsd update]]> 389ds on freebsd update

A few months ago I posted an how to build 389-ds for freebsd. This was to start my porting effort.

I have now finished the port. There are still issues that the perl installer does not work, but this will be resolved soon in other ways.

For now here are the steps for 389-ds on freebsd. You may need to wait for a few days for the relevant patches to be in git.

You will need to install these deps:


You will need to use pip for these python dependencies.

sudo python3.4 -m ensurepip
sudo pip3.4 install six pyasn1 pyasn1-modules

You will need to install svrcore.

tar -xvzf svrcore-4.1.1.tar.bz2
cd svrcore-4.1.1
CFLAGS="-fPIC "./configure --prefix=/opt/svrcore
sudo make install

You will need the following python tools checked out:

git clone
git clone
cd pyldap
python3.4 build
sudo python3.4 install

Now you can clone ds and try to build it:

git clone
cd ds
./configure --prefix=/opt/dirsrv --with-openldap=/usr/local --with-db --with-db-inc=/usr/local/include/db5/ --with-db-lib=/usr/local/lib/db5/ --with-sasl --with-sasl-inc=/usr/local/include/sasl/ --with-sasl-lib=/usr/local/lib/sasl2/ --with-svrcore-inc=/opt/svrcore/include/ --with-svrcore-lib=/opt/svrcore/lib/ --with-netsnmp=/usr/local
sudo gmake install

Go back to the lib389 directory:

sudo pw user add dirsrv
sudo PYTHONPATH=`pwd` python3.4 lib389/clitools/ -f ~/setup-ds-admin.inf -v
sudo chown -R dirsrv:dirsrv /opt/dirsrv/var/{run,lock,log,lib}/dirsrv
sudo chmod 775 /opt/dirsrv/var
sudo chmod 775 /opt/dirsrv/var/*
sudo /opt/dirsrv/sbin/ns-slapd -d 0 -D /opt/dirsrv/etc/dirsrv/slapd-localhost

This is a really minimal setup routine right now. If it all worked, you can now run your instance. Here is my output belowe: setup with verbose inf from /home/william/setup-ds-admin.inf ['general', 'slapd', 'rest', 'backend-userRoot'] general {'selinux': False, 'full_machine_name': 'localhost.localdomain', 'config_version': 2, 'strict_host_checking': True} slapd {'secure_port': 636, 'root_password': 'password', 'port': 389, 'cert_dir': '/opt/dirsrv/etc/dirsrv/slapd-localhost/', 'lock_dir': '/opt/dirsrv/var/lock/dirsrv/slapd-localhost', 'ldif_dir': '/opt/dirsrv/var/lib/dirsrv/slapd-localhost/ldif', 'backup_dir': '/opt/dirsrv/var/lib/dirsrv/slapd-localhost/bak', 'prefix': '/opt/dirsrv', 'instance_name': 'localhost', 'bin_dir': '/opt/dirsrv/bin/', 'data_dir': '/opt/dirsrv/share/', 'local_state_dir': '/opt/dirsrv/var', 'run_dir': '/opt/dirsrv/var/run/dirsrv', 'schema_dir': '/opt/dirsrv/etc/dirsrv/slapd-localhost/schema', 'config_dir': '/opt/dirsrv/etc/dirsrv/slapd-localhost/', 'root_dn': 'cn=Directory Manager', 'log_dir': '/opt/dirsrv/var/log/dirsrv/slapd-localhost', 'tmp_dir': '/tmp', 'user': 'dirsrv', 'group': 'dirsrv', 'db_dir': '/opt/dirsrv/var/lib/dirsrv/slapd-localhost/db', 'sbin_dir': '/opt/dirsrv/sbin', 'sysconf_dir': '/opt/dirsrv/etc', 'defaults': '1.3.5'} backends [{'name': 'userRoot', 'sample_entries': True, 'suffix': 'dc=example,dc=com'}] user / group checking Hostname strict checking prefix checking
INFO:lib389:dir (sys) : /opt/dirsrv/etc/sysconfig instance checking root user checking network avaliability checking beginning installation creating /opt/dirsrv/var/lib/dirsrv/slapd-localhost/bak creating /opt/dirsrv/etc/dirsrv/slapd-localhost/ creating /opt/dirsrv/etc/dirsrv/slapd-localhost/ creating /opt/dirsrv/var/lib/dirsrv/slapd-localhost/db creating /opt/dirsrv/var/lib/dirsrv/slapd-localhost/ldif creating /opt/dirsrv/var/lock/dirsrv/slapd-localhost creating /opt/dirsrv/var/log/dirsrv/slapd-localhost creating /opt/dirsrv/var/run/dirsrv Creating certificate database is /opt/dirsrv/etc/dirsrv/slapd-localhost/ Creating dse.ldif completed installation
Sucessfully created instance
[17/Apr/2016:14:44:21.030683607 +1000] could not open config file "/opt/dirsrv/etc/dirsrv/slapd-localhost//slapd-collations.conf" - absolute path?
[17/Apr/2016:14:44:21.122087994 +1000] 389-Directory/ B2016.108.412 starting up
[17/Apr/2016:14:44:21.460033554 +1000] convert_pbe_des_to_aes:  Checking for DES passwords to convert to AES...
[17/Apr/2016:14:44:21.461012440 +1000] convert_pbe_des_to_aes - No DES passwords found to convert.
[17/Apr/2016:14:44:21.462712083 +1000] slapd started.  Listening on All Interfaces port 389 for LDAP requests

If we do an ldapsearch:

fbsd-389-port# uname -r -s
fbsd-389-port# ldapsearch -h localhost -b '' -s base -x +
# extended LDIF
# LDAPv3
# base <> with scope baseObject
# filter: (objectclass=*)
# requesting: +

creatorsName: cn=server,cn=plugins,cn=config
modifiersName: cn=server,cn=plugins,cn=config
createTimestamp: 20160417044112Z
modifyTimestamp: 20160417044112Z
subschemaSubentry: cn=schema
supportedExtension: 2.16.840.1.113730.3.5.7
supportedExtension: 2.16.840.1.113730.3.5.8
supportedControl: 2.16.840.1.113730.3.4.2
supportedControl: 2.16.840.1.113730.3.4.3
supportedControl: 2.16.840.1.113730.3.4.4
supportedControl: 2.16.840.1.113730.3.4.5
supportedControl: 1.2.840.113556.1.4.473
supportedControl: 2.16.840.1.113730.3.4.9
supportedControl: 2.16.840.1.113730.3.4.16
supportedControl: 2.16.840.1.113730.3.4.15
supportedControl: 2.16.840.1.113730.3.4.17
supportedControl: 2.16.840.1.113730.3.4.19
supportedControl: 1.2.840.113556.1.4.319
supportedControl: 2.16.840.1.113730.3.4.14
supportedControl: 2.16.840.1.113730.3.4.20
supportedControl: 2.16.840.1.113730.3.4.12
supportedControl: 2.16.840.1.113730.3.4.18
supportedSASLMechanisms: EXTERNAL
supportedLDAPVersion: 2
supportedLDAPVersion: 3
vendorName: 389 Project
vendorVersion: 389-Directory/ B2016.108.412
Sun, 17 Apr 2016 00:00:00 +1000 <![CDATA[The future vision of 389-ds]]> The future vision of 389-ds

Disclaimer: This is my vision and analysis of 389-ds and it’s future. It is nothing about Red Hat’s future plans or goals. Like all predictions, they may not even come true.

As I have said before I’m part of the 389-ds core team. I really do have a passion for authentication and identity management: I’m sure some of my friends would like to tell me to shut up about it sometimes.

389-ds, or rather, ns-slapd has a lot of history. Originally from the umich code base, it has moved through Netscape, SUN, Aol and, finally to Red Hat. It’s quite something to find myself working on code that was written in 1996. In 1996 I was on a playground in Primary School, where my biggest life concerns was if I could see the next episode of [anime of choice] the next day at before school care. What I’m saying, is ns-slapd is old. Very old. There are many dark, untrodden paths in that code base.

You would get your nice big iron machine from SUN, you would setup the ns-slapd instance once. You would then probably setup one other ns-slapd master, then you would run them in production for the next 4 to 5 years with minimal changes. Business policy would ask developers to integrate with the LDAP server. Everything was great.

But it’s not 1996 anymore. I have managed to complete schooling and acquire a degree in this time. ns-slapd has had many improvements, but the work flow and way that ns-slapd in managed really hasn’t changed a lot.

While ns-slapd has stayed roughly the same, the world has evolved. We now have latte sipping code hipsters, sitting in trendy Melbourne cafes programming in go and whatever js framework of this week. They deploy to AWS, to GCE. (But not Azure, that’s not cool enough). These developers have a certain mindset and the benefits of centralised business authentication isn’t one of them. They want to push things to cloud, but no system administrator would let a corporate LDAP be avaliable on the internet. CIO’s are all drinking the “cloud” “disruption” kool aid. The future of many technologies is certainly in question.

To me, there is no doubt that ns-slapd is still a great technology: Like the unix philosohpy, tools should do “one thing” and “one thing well”. When it comes to secure authentication, user identification, and authorisation, LDAP is still king. So why are people not deploying it in their new fancy containers and cloud disruption train that they are all aboard?

ns-slapd is old. Our systems and installers, such as are really designed for the “pet” mentality of servers. They are hard to automate to replica groups, and they inist on having certain types of information avaliable before they can run. They also don’t work with automation, and are unable to accept certain types of ldifs as part of the inf file that drives the install. You have to have at least a few years experience with ns-slapd before you could probably get this process “right”.

Another, well, LDAP is … well, hard. It’s not json (which is apparently the only thing developers understand now). Developers also don’t care about identifying users. That’s just not cool. Why would we try and use some “hard” LDAP system, when I can just keep some json in a mongodb that tracks your password and groups you are in?

So what can we do? Where do I see 389-ds going in the future?

  • We need to modernise our tooling, and installers. It needs to be easier than ever to setup an LDAP instance. Our administration needs to move away from applying ldifs, into robust, command line tools.
  • Setting up replication groups and masters needs to be simpler. Replication topologies should be “self managing” (to an extent). Ie I should say “here is a new ldap server, join this replication group”. The administration layer then determines all the needed replication agreements for robust and avaliable service.
  • We need to get away from long lived static masters, and be able to have rapidly deployed, and destroyed, masters. With the changes above, this will lend itself to faster and easier deployment into containers and other such systems.
  • During updates, we need to start to enable smarter choices by default: but still allow people to fix their systems on certain configurations to guarantee stability. For example, we add new options and improvements to DS all the time: but we cannot always enable them by default. This makes our system look dated, when really a few configurations would really modernise and help improve deployments. Having mechanisms to push the updates to clients who want it, and enable them by default on new installs will go a long way.
  • Out of the box we need smarter settings: The default install should need almost no changes to be a strong, working LDAP system. It should not require massive changes or huge amounts of indepth knowledge to deploy. I’m the LDAP expert: You’re the coffee sipping developer. You should be able to trust the defaults we give you, and know that they will be well engineered and carefully considered.
  • Finally I think what is also going to be really important is Web Based authentication. We need to provide ways to setup and provision SAML and OAuth systems that “just work” with our LDAP. With improvements on the way like this draft rfc will even allow fail over between token systems, backed by the high security and performance guarantees of LDAP.

This is my vision of the future for 389-ds: Simplification of setup. Polish of the configuration. Ability to automate and tools to empower administrators. Integration with what developers want.

Lets see how much myself and the team can achieve by the end of 2016.

Sat, 16 Apr 2016 00:00:00 +1000 <![CDATA[Enabling the 389 ds nightly builds]]> Enabling the 389 ds nightly builds

I maintain a copr repo which I try to keep update with “nightly” builds of 389-ds.

You can use the following to enable them for EL7:

sudo -s
cd /etc/yum.repos.d
yum install python-lib389 python-rest389 389-ds-base
Thu, 14 Apr 2016 00:00:00 +1000 <![CDATA[Disabling journald support]]> Disabling journald support

Some people may have noticed that there is a feature open for Directory Server to support journald.

As of April 13th, we have decided to disable support for this in Directory Server.

This isn’t because anyone necesarrily hates or dislikes systemd. All too often people discount systemd due to a hate reflex.

This decision came about due to known, hard, technical limitations of journald. This is not hand waving opinion, this is based on testing, numbers, business requirements and experience.

So lets step back for a second. Directory Server is an LDAP server. On a network, LDAP is deployed typically to be responsible for authentication and authorisation of users to services. This is a highly security sensitive role. This leads to an important facet of security being auditability. For example, the need to track when and who has authenticated to a network. The ability to audit what permisions were requested and granted. Further more, the ability to audit and identify changes to Directory Server data which may represent a compromise or change of user credential or authorisation rights.

Being able to audit these is of vital importance, from small buisinesses to large enterprise. As a security system, this audit trail must have guarantees of correctness and avaliablility. Often a business will have internal rules around the length of time auditing information must be retained for. In other businesses there are legal requirements for auditing information to be retained for long periods. Often a business will keep in excess of 2 weeks of authentication and authorisation data for the purposes of auditing.

Directory Server provides this auditing capability through it’s logging functions. Directory Server is configured to produce 3 log types.

  • errors - Contains Directory Server operations, plugin data, changes. This is used by administrators to identify service behaviour and issues.
  • access - Contains a log of all search and bind (authentication) operations.
  • audit - Contains a log of all modifications, additions and deletions of objects within the Directory Server.

For the purpose of auditing in a security context the access and audit logs are of vital importance, as is their retention times.

So why is journald not fit for purpose in this context? It seems to be fine for many other systems?

Out of the box, journald has a hardcoded limit on the maximum capacity of logs. This is 4GB of on disk capacity. Once this is exceeded, the journal rotates, and begins to overwrite entries at the beginging of the log. Think circular buffer. After testing and identifying the behaviours of Directory Server, and the size of journald messages, I determined that a medium to large site will cause the journal to begin a rotation in 3 hours or less during high traffic periods.

3 hours is a far smaller number than the “weeks” of retention that is required for auditing purposes of most businesses.

Additionally, by default journald is configured to drop events if they enter the log to rapidly. This is a “performance” enhancement. However, during my tests I found that 85% of Directory Server events were being dropped. This violates the need for correct and complete audit logs in a security system.

This can be reconfigured, but the question should be asked. Why are log events dropped at all? On a system, log events are the basis of auditing and accountability, forming a historical account of evidence for an Administrator or Security personel to trace in the event of an incident. Dropping events from Directory Server is unacceptable. As I stated, this can be reconfigured.

But it does begin to expose the third point. Performance. Journald is slow, and caused an increase of 15% cpu and higher IO on my testing environments. For a system such as Directory Server, this overhead is unacceptable. We consider performance impacts of 2% to be signifigant: We cannot accept 15%.

As an API journald is quite nice, and has many useful features. However, we as a team cannot support journald with these three limitations above.

If journald support is to be taken seriously by security and performance sensitive applications the following changes are recomended.

  • Remove the 4G log size limit. It can either be configurable by a user, or there should be no limits.
  • Log events should either not be dropped by default, or a method to have per systemd unit file overrides to prevent dropping of certain services events should be added.
  • The performance of journald should be improved as to reduce the impact upon applications consuming the journald api.

I hope that this explains why we have decided to remove systemd’s journald support from Directory Server at this time.

Before I am asked: No I will not reverse my stance on this matter, and I will continue to advise my team of the same. Systemd needs to come to the table and improve their api before we can consider it for use.

The upstream issue can be seen here 389 ds trac 47968. All of my calculations are in this thread too.

Thu, 14 Apr 2016 00:00:00 +1000 <![CDATA[389 ds aci linting tool]]> 389 ds aci linting tool

In the past I have discussed aci’s and their management in directory server.

It’s a very complex topic, and there are issues that can arise.

I have now created an aci linting tool which can connect to a directory server and detect common mistakes in acis, along with explinations of how to correct them.

This will be in a release of lib389 in the future. For now, it’s under review and hopefully will be accepted soon!

Here is sample output below.

Directory Server Aci Lint Error: DSALE0001
Severity: HIGH

Affected Acis:
(targetattr!="userPassword")(version 3.0; acl "Enable anonymous access"; allow (read, search, compare) userdn="ldap:///anyone";)
(targetattr !="cn || sn || uid")(targetfilter ="(ou=Accounting)")(version 3.0;acl "Accounting Managers Group Permissions";allow (write)(groupdn = "ldap:///cn=Accounting Managers,ou=groups,dc=example,dc=com");)
(targetattr !="cn || sn || uid")(targetfilter ="(ou=Human Resources)")(version 3.0;acl "HR Group Permissions";allow (write)(groupdn = "ldap:///cn=HR Managers,ou=groups,dc=example,dc=com");)
(targetattr !="cn ||sn || uid")(targetfilter ="(ou=Product Testing)")(version 3.0;acl "QA Group Permissions";allow (write)(groupdn = "ldap:///cn=QA Managers,ou=groups,dc=example,dc=com");)
(targetattr !="cn || sn || uid")(targetfilter ="(ou=Product Development)")(version 3.0;acl "Engineering Group Permissions";allow (write)(groupdn = "ldap:///cn=PD Managers,ou=groups,dc=example,dc=com");)

An aci of the form "(targetAttr!="attr")" exists on your system. This aci
will internally be expanded to mean "all possible attributes including system,
excluding the listed attributes".

This may allow access to a bound user or anonymous to read more data about
directory internals, including aci state or user limits. In the case of write
acis it may allow a dn to set their own resource limits, unlock passwords or
their own aci.

The ability to change the aci on the object may lead to privilege escalation in
some cases.

Convert the aci to the form "(targetAttr="x || y || z")".

Directory Server Aci Lint Error: DSALE0002
Severity: HIGH

Affected Acis:
ou=People,dc=example,dc=com (targetattr !="cn || sn || uid")(targetfilter ="(ou=Accounting)")(version 3.0;acl "Accounting Managers Group Permissions";allow (write)(groupdn = "ldap:///cn=Accounting Managers,ou=groups,dc=example,dc=com");)
|- ou=People,dc=example,dc=com (targetattr !="cn || sn || uid")(targetfilter ="(ou=Human Resources)")(version 3.0;acl "HR Group Permissions";allow (write)(groupdn = "ldap:///cn=HR Managers,ou=groups,dc=example,dc=com");)
|- ou=People,dc=example,dc=com (targetattr !="cn ||sn || uid")(targetfilter ="(ou=Product Testing)")(version 3.0;acl "QA Group Permissions";allow (write)(groupdn = "ldap:///cn=QA Managers,ou=groups,dc=example,dc=com");)
|- ou=People,dc=example,dc=com (targetattr !="cn || sn || uid")(targetfilter ="(ou=Product Development)")(version 3.0;acl "Engineering Group Permissions";allow (write)(groupdn = "ldap:///cn=PD Managers,ou=groups,dc=example,dc=com");)

ou=People,dc=example,dc=com (targetattr !="cn || sn || uid")(targetfilter ="(ou=Human Resources)")(version 3.0;acl "HR Group Permissions";allow (write)(groupdn = "ldap:///cn=HR Managers,ou=groups,dc=example,dc=com");)
|- ou=People,dc=example,dc=com (targetattr !="cn || sn || uid")(targetfilter ="(ou=Accounting)")(version 3.0;acl "Accounting Managers Group Permissions";allow (write)(groupdn = "ldap:///cn=Accounting Managers,ou=groups,dc=example,dc=com");)
|- ou=People,dc=example,dc=com (targetattr !="cn ||sn || uid")(targetfilter ="(ou=Product Testing)")(version 3.0;acl "QA Group Permissions";allow (write)(groupdn = "ldap:///cn=QA Managers,ou=groups,dc=example,dc=com");)
|- ou=People,dc=example,dc=com (targetattr !="cn || sn || uid")(targetfilter ="(ou=Product Development)")(version 3.0;acl "Engineering Group Permissions";allow (write)(groupdn = "ldap:///cn=PD Managers,ou=groups,dc=example,dc=com");)

ou=People,dc=example,dc=com (targetattr !="cn ||sn || uid")(targetfilter ="(ou=Product Testing)")(version 3.0;acl "QA Group Permissions";allow (write)(groupdn = "ldap:///cn=QA Managers,ou=groups,dc=example,dc=com");)
|- ou=People,dc=example,dc=com (targetattr !="cn || sn || uid")(targetfilter ="(ou=Accounting)")(version 3.0;acl "Accounting Managers Group Permissions";allow (write)(groupdn = "ldap:///cn=Accounting Managers,ou=groups,dc=example,dc=com");)
|- ou=People,dc=example,dc=com (targetattr !="cn || sn || uid")(targetfilter ="(ou=Human Resources)")(version 3.0;acl "HR Group Permissions";allow (write)(groupdn = "ldap:///cn=HR Managers,ou=groups,dc=example,dc=com");)
|- ou=People,dc=example,dc=com (targetattr !="cn || sn || uid")(targetfilter ="(ou=Product Development)")(version 3.0;acl "Engineering Group Permissions";allow (write)(groupdn = "ldap:///cn=PD Managers,ou=groups,dc=example,dc=com");)

ou=People,dc=example,dc=com (targetattr !="cn || sn || uid")(targetfilter ="(ou=Product Development)")(version 3.0;acl "Engineering Group Permissions";allow (write)(groupdn = "ldap:///cn=PD Managers,ou=groups,dc=example,dc=com");)
|- ou=People,dc=example,dc=com (targetattr !="cn || sn || uid")(targetfilter ="(ou=Accounting)")(version 3.0;acl "Accounting Managers Group Permissions";allow (write)(groupdn = "ldap:///cn=Accounting Managers,ou=groups,dc=example,dc=com");)
|- ou=People,dc=example,dc=com (targetattr !="cn || sn || uid")(targetfilter ="(ou=Human Resources)")(version 3.0;acl "HR Group Permissions";allow (write)(groupdn = "ldap:///cn=HR Managers,ou=groups,dc=example,dc=com");)
|- ou=People,dc=example,dc=com (targetattr !="cn ||sn || uid")(targetfilter ="(ou=Product Testing)")(version 3.0;acl "QA Group Permissions";allow (write)(groupdn = "ldap:///cn=QA Managers,ou=groups,dc=example,dc=com");)

Acis on your system exist which are both not equals targetattr, and overlap in

The way that directory server processes these, is to invert them to to white
lists, then union the results.

As a result, these acis *may* allow access to the attributes you want them to


aci: (targetattr !="cn")(version 3.0;acl "Self write all but cn";allow (write)
    (userdn = "ldap:///self");)
aci: (targetattr !="sn")(version 3.0;acl "Self write all but sn";allow (write)
    (userdn = "ldap:///self");)

This combination allows self write to *all* attributes within the subtree.

In cases where the target is members of a group, it may allow a member who is
within two groups to have elevated privilege.

Convert the aci to the form "(targetAttr="x || y || z")".

Prevent the acis from overlapping, and have them on unique subtrees.

Fri, 01 Apr 2016 00:00:00 +1000 <![CDATA[Trick to debug single files in ds]]> Trick to debug single files in ds

I’ve been debugging thread deadlocks in directory server. When you turn on detailed tracing with

ns-slapd -d 1

You slow the server down so much that you can barely function.

A trick is that defines in the local .c file, override from the .h. Copy paste this to the file you want to debug. This allows the logs from this file to be emitted at -d 0, but without turning it on everywhere, so you don’t grind the server to a halt.

/* Do this so we can get the messages at standard log levels. */
#define SLAPI_LOG_FATAL         0
#define SLAPI_LOG_TRACE         0
#define SLAPI_LOG_PACKETS       0
#define SLAPI_LOG_ARGS          0
#define SLAPI_LOG_CONNS         0
#define SLAPI_LOG_BER           0
#define SLAPI_LOG_FILTER        0
#define SLAPI_LOG_CONFIG        0
#define SLAPI_LOG_ACL           0
#define SLAPI_LOG_SHELL         0
#define SLAPI_LOG_PARSE         0
#define SLAPI_LOG_HOUSE         0
#define SLAPI_LOG_REPL          0
#define SLAPI_LOG_CACHE         0
#define SLAPI_LOG_PLUGIN        0
#define SLAPI_LOG_TIMING        0
#define SLAPI_LOG_BACKLDBM      0
Wed, 16 Mar 2016 00:00:00 +1000 <![CDATA[Blog migration]]> Blog migration

I’ve migrated my blog from django to tinkerer. I’ve also created a number of helper pages to preserve all the links to old pages.

Please let me know if anything is wrong using my contact details on the about page.


Wed, 09 Mar 2016 00:00:00 +1000 <![CDATA[ldctl to generate test objects]]> ldctl to generate test objects

I was told by some coworkers today at Red Hat that I can infact use ldctl to generate my databases for load testing with 389-ds.

First, create a template.ldif

objectClass: top
objectclass: person
objectClass: organizationalPerson
objectClass: inetorgperson
objectClass: posixAccount
objectClass: shadowAccount
sn: testnew[A]
cn: testnew[A]
uid: testnew[A]
givenName: testnew[A]
description: description[A]
userPassword: testnew[A]
mail: testnew[A]
uidNumber: 3[A]
gidNumber: 4[A]
shadowMin: 0
shadowMax: 99999
shadowInactive: 30
shadowWarning: 7
homeDirectory: /home/uid[A]

Now you can use ldctl to actually load in the data:

ldclt -h localhost -p 389 -D "cn=Directory Manager" -w password -b "ou=people,dc=example,dc=com" \
-I 68 -e add,commoncounter -e "object=/tmp/template.ldif,rdn=uid:[A=INCRNNOLOOP(0;3999;5)]"

Thanks to vashirov and spichugi for their advice and this example!

Tue, 23 Feb 2016 00:00:00 +1000 <![CDATA[“Patches Welcome”]]> “Patches Welcome”

“Patches Welcome”. We’ve all seen it in the Open Source community. Nothing makes me angrier than these two words.

Often this is said by people who are too busy, or too lazy to implement features that just aren’t of interest to them. This isn’t the response you get when you submit a bad idea, or something technically unfeasible. It’s the response that speaks of an apathy to your software’s users.

I get that we all have time limits for development. I know that we have to prioritise. I know that it may not be of import to the business right now. Even at the least, reach out, say you’ll create a ticket on their behalf if they cannot. Help them work through the design, then implement it in the future.

But do not ever consider yourself so high and mighty that the request of a user “isn’t good enough for you”. These are your customers, supporters, advocates, bug reporters, testers, and users. They are what build the community. A community is not just the developers of the software. It’s the users of it too, and their skills are separate from those of the the developer.

Often people ask for features, but do not have the expertise, or domain knowledge to implement them. That does not invalidate the worth of the feature, if anything speaks to it’s value as a real customer will benefit from this, and your project as a whole will improve. Telling them “Patches Welcome” is like saying “I know you aren’t capable of implementing this yourself. I don’t care to help you at all, and I don’t want to waste my time on you. Go away”.

As is obvious from this blog, I’m part of the 389 Directory Server Team.

I will never tell a user that “patches welcome”. I will always support them to design their idea. I will ask them to lodge a ticket, or I’ll do it for them if they cannot. If a user can and wants to try to implement the software of their choice, I will help them and teach them. If they cannot, I will make sure that at some time in the future, we can deliver it to them, or if we cannot, a real, honest explanation of why.

That’s the community in 389 I am proud to be a part of.

Wed, 17 Feb 2016 00:00:00 +1000 <![CDATA[Securing IPA]]> Securing IPA

I no longer recommend using FreeIPA - Read more here!

By default IPA has some weak security around TLS and anonymous binds.

We can improve this by changing the following options.

nsslapd-minssf-exclude-rootdse: on
nsslapd-minssf: 56
nsslapd-require-secure-binds: on

The last one you may want to change is:

nsslapd-allow-anonymous-access: on

I think this is important to have on, as it allows non-domain members to use ipa, but there are arguments to disabling anon reads too.

Tue, 09 Feb 2016 00:00:00 +1000 <![CDATA[Failed to delete old semaphore for stats file]]> Failed to delete old semaphore for stats file

Today I was getting this error:

[09/Feb/2016:12:21:26 +101800] - 389-Directory/1.3.5 B2016.040.145 starting up
[09/Feb/2016:12:21:26 +101800] - Failed to delete old semaphore for stats file (/opt/dirsrv/var/run/dirsrv/slapd-localhost.stats). Error 13 (Permission denied).

But when you check:

/opt# ls -al /opt/dirsrv/var/run/dirsrv/slapd-localhost.stats
ls: cannot access /opt/dirsrv/var/run/dirsrv/slapd-localhost.stats: No such file or directory

Turns out on linux this isn’t actually where the file is. You need to remove:


A bug will be opened shortly ….

Tue, 09 Feb 2016 00:00:00 +1000 <![CDATA[389 on freebsd]]> 389 on freebsd

I’ve decided to start porting 389-ds to freebsd.

So tonight I took the first steps. Let’s see if we can get it to build in a dev environment like I would use normally.

You will need to install these deps:


You then need to install svrcore. I’ll likely add a port for this too.

tar -xvjf svrcore-4.0.4.tar.bz2
cd svrcore-4.0.4
CFLAGS="-fPIC "./configure --prefix=/opt/svrcore
sudo make install

Now you can clone ds and try to build it:

git clone
cd ds
./configure --prefix=/opt/dirsrv --with-openldap=/usr/local --with-db --with-db-inc=/usr/local/include/db5/ --with-db-lib=/usr/local/lib/db5/ --with-sasl --with-sasl-inc=/usr/local/include/sasl/ --with-sasl-lib=/usr/local/lib/sasl2/ --with-svrcore-inc=/opt/svrcore/include/ --with-svrcore-lib=/opt/svrcore/lib/ --with-netsnmp=/usr/local

If it’s like me you get the following:

make: "/usr/home/admin_local/ds/Makefile" line 10765: warning: duplicate script for target "%/dirsrv" ignored
make: "/usr/home/admin_local/ds/Makefile" line 10762: warning: using previous script for "%/dirsrv" defined here
make: "/usr/home/admin_local/ds/Makefile" line 10767: warning: duplicate script for target "%/dirsrv" ignored
make: "/usr/home/admin_local/ds/Makefile" line 10762: warning: using previous script for "%/dirsrv" defined here
make: "/usr/home/admin_local/ds/Makefile" line 10768: warning: duplicate script for target "%/dirsrv" ignored
make: "/usr/home/admin_local/ds/Makefile" line 10762: warning: using previous script for "%/dirsrv" defined here
perl ./ldap/servers/slapd/ -i /usr/local/include/db5/ -o .
make  all-am
make[1]: "/usr/home/admin_local/ds/Makefile" line 10765: warning: duplicate script for target "%/dirsrv" ignored
make[1]: "/usr/home/admin_local/ds/Makefile" line 10762: warning: using previous script for "%/dirsrv" defined here
make[1]: "/usr/home/admin_local/ds/Makefile" line 10767: warning: duplicate script for target "%/dirsrv" ignored
make[1]: "/usr/home/admin_local/ds/Makefile" line 10762: warning: using previous script for "%/dirsrv" defined here
make[1]: "/usr/home/admin_local/ds/Makefile" line 10768: warning: duplicate script for target "%/dirsrv" ignored
make[1]: "/usr/home/admin_local/ds/Makefile" line 10762: warning: using previous script for "%/dirsrv" defined here
depbase=`echo ldap/libraries/libavl/avl.o | sed 's|[^/]*$|.deps/&|;s|\.o$||'`; cc -DHAVE_CONFIG_H -I.     -DBUILD_NUM= -DVENDOR="\"389 Project\"" -DBRAND="\"389\"" -DCAPBRAND="\"389\""  -UPACKAGE_VERSION -UPACKAGE_TARNAME -UPACKAGE_STRING -UPACKAGE_BUGREPORT -I./ldap/include -I./ldap/servers/slapd -I./include -I.  -DLOCALSTATEDIR="\"/opt/dirsrv/var\"" -DSYSCONFDIR="\"/opt/dirsrv/etc\""  -DLIBDIR="\"/opt/dirsrv/lib\"" -DBINDIR="\"/opt/dirsrv/bin\""  -DDATADIR="\"/opt/dirsrv/share\"" -DDOCDIR="\"/opt/dirsrv/share/doc/389-ds-base\""  -DSBINDIR="\"/opt/dirsrv/sbin\"" -DPLUGINDIR="\"/opt/dirsrv/lib/dirsrv/plugins\"" -DTEMPLATEDIR="\"/opt/dirsrv/share/dirsrv/data\""     -g -O2 -MT ldap/libraries/libavl/avl.o -MD -MP -MF $depbase.Tpo -c -o ldap/libraries/libavl/avl.o ldap/libraries/libavl/avl.c && mv -f $depbase.Tpo $depbase.Po
rm -f libavl.a
ar cru libavl.a ldap/libraries/libavl/avl.o
ranlib libavl.a
cc -DHAVE_CONFIG_H -I.     -DBUILD_NUM= -DVENDOR="\"389 Project\"" -DBRAND="\"389\"" -DCAPBRAND="\"389\""  -UPACKAGE_VERSION -UPACKAGE_TARNAME -UPACKAGE_STRING -UPACKAGE_BUGREPORT -I./ldap/include -I./ldap/servers/slapd -I./include -I.  -DLOCALSTATEDIR="\"/opt/dirsrv/var\"" -DSYSCONFDIR="\"/opt/dirsrv/etc\""  -DLIBDIR="\"/opt/dirsrv/lib\"" -DBINDIR="\"/opt/dirsrv/bin\""  -DDATADIR="\"/opt/dirsrv/share\"" -DDOCDIR="\"/opt/dirsrv/share/doc/389-ds-base\""  -DSBINDIR="\"/opt/dirsrv/sbin\"" -DPLUGINDIR="\"/opt/dirsrv/lib/dirsrv/plugins\"" -DTEMPLATEDIR="\"/opt/dirsrv/share/dirsrv/data\""  -I./lib/ldaputil -I/usr/local/include  -I/usr/local/include/nss -I/usr/local/include/nss/nss -I/usr/local/include/nspr   -I/usr/local/include/nspr   -g -O2 -MT lib/ldaputil/libldaputil_a-cert.o -MD -MP -MF lib/ldaputil/.deps/libldaputil_a-cert.Tpo -c -o lib/ldaputil/libldaputil_a-cert.o `test -f 'lib/ldaputil/cert.c' || echo './'`lib/ldaputil/cert.c
In file included from lib/ldaputil/cert.c:16:
/usr/include/malloc.h:3:2: error: "<malloc.h> has been replaced by <stdlib.h>"
#error "<malloc.h> has been replaced by <stdlib.h>"
1 error generated.
*** Error code 1

make[1]: stopped in /usr/home/admin_local/ds
*** Error code 1

make: stopped in /usr/home/admin_local/ds

Time to start looking at including some #ifdef __FREEBSD__ macros.

Thu, 28 Jan 2016 00:00:00 +1000 <![CDATA[Renaming ovirt storage targets]]> Renaming ovirt storage targets

I run an ovirt server, and sometimes like a tinker that I am, I like to rename things due to new hardware or other ideas that come up.

Ovirt makes it quite hard to change the nfs target or name of a storage volume. Although it’s not supported, I’m more than happy to dig through the database.

NOTE: Take a backup before you start, this is some serious unsupported magic here.

First, we need to look at the main tables that are involved in nfs storage:

engine=# select id,storage,storage_name from storage_domain_static;
                  id                  |               storage                |   storage_name
 6bffd537-badb-43c9-91b2-a922cf847533 | 842add9e-ffef-44d9-bf6d-4f8231b375eb | def_t2_nfs_import
 c3aa02d8-02fd-4a16-bfe6-59f9348a0b1e | 5b8ba182-7d05-44e4-9d64-2a1bb529b797 | def_t2_nfs_iso
 a8ac8bd0-cf40-45ae-9f39-b376c16b7fec | d2fd5e4b-c3de-4829-9f4a-d56246f5454b | def_t2_nfs_lcs
 d719e5f2-f59d-434d-863e-3c9c31e4c02f | e2ba769c-e5a3-4652-b75d-b68959369b55 | def_t1_nfs_master
 a085aca5-112c-49bf-aa91-fbf59e8bde0b | f5be3009-4c84-4d59-9cfe-a1bcedac4038 | def_t1_nfs_sas

engine=# select id,connection from storage_server_connections;
                  id                  |                           connection
 842add9e-ffef-44d9-bf6d-4f8231b375eb |
 5b8ba182-7d05-44e4-9d64-2a1bb529b797 |
 d2fd5e4b-c3de-4829-9f4a-d56246f5454b |
 e2ba769c-e5a3-4652-b75d-b68959369b55 |
 f5be3009-4c84-4d59-9cfe-a1bcedac4038 |

So we are going to rename the def_t2_nfs_* targets to def_t3_nfs. First we need to update the mount point:

update storage_server_connections set connection='' where id='842add9e-ffef-44d9-bf6d-4f8231b375eb';

update storage_server_connections set connection='' where id='5b8ba182-7d05-44e4-9d64-2a1bb529b797';

update storage_server_connections set connection='' where id='d2fd5e4b-c3de-4829-9f4a-d56246f5454b';

Next we are going to replace the name in the storage_domain_static table.

update storage_domain_static set storage_name='def_t3_nfs_lcs' where storage='d2fd5e4b-c3de-4829-9f4a-d56246f5454b';

update storage_domain_static set storage_name='def_t3_nfs_iso' where storage='5b8ba182-7d05-44e4-9d64-2a1bb529b797';

update storage_domain_static set storage_name='def_t3_nfs_import' where storage='842add9e-ffef-44d9-bf6d-4f8231b375eb';

That’s it! Now check it all looks correct and restart.

engine=# select id,storage,storage_name from storage_domain_static;
                  id                  |               storage                |   storage_name
 a8ac8bd0-cf40-45ae-9f39-b376c16b7fec | d2fd5e4b-c3de-4829-9f4a-d56246f5454b | def_t3_nfs_lcs
 c3aa02d8-02fd-4a16-bfe6-59f9348a0b1e | 5b8ba182-7d05-44e4-9d64-2a1bb529b797 | def_t3_nfs_iso
 6bffd537-badb-43c9-91b2-a922cf847533 | 842add9e-ffef-44d9-bf6d-4f8231b375eb | def_t3_nfs_import
 d719e5f2-f59d-434d-863e-3c9c31e4c02f | e2ba769c-e5a3-4652-b75d-b68959369b55 | def_t1_nfs_master
 a085aca5-112c-49bf-aa91-fbf59e8bde0b | f5be3009-4c84-4d59-9cfe-a1bcedac4038 | def_t1_nfs_sas
(5 rows)

engine=# select id,connection from storage_server_connections;
                  id                  |                           connection
 e2ba769c-e5a3-4652-b75d-b68959369b55 |
 f5be3009-4c84-4d59-9cfe-a1bcedac4038 |
 842add9e-ffef-44d9-bf6d-4f8231b375eb |
 5b8ba182-7d05-44e4-9d64-2a1bb529b797 |
 d2fd5e4b-c3de-4829-9f4a-d56246f5454b |
(5 rows)
Sat, 16 Jan 2016 00:00:00 +1000 <![CDATA[Running your own mailserver: Mailbox rollover]]> Running your own mailserver: Mailbox rollover

UPDATE 2019: Don’t run your own! Use fastmail instead :D!

I go to a lot of effort to run my own email server. I don’t like google, and I want to keep them away from my messages. While it incurs both financial, and administrative cost, sometimes the benefits are fantastic.

I like to sort my mail to folders based on server side filters (which are fantastic, server side filtering is the way to go). I also like to keep my mailboxes in yearly fashion, so they don’t grow tooo large. I keep every email I ever receive, and it’s saved my arse a few times.

Rolling over year to year for most people would be a pain: You need to move all the emails from one folder (mailbox) to another, which incurs a huge time / download / effort cost.

Running your own mailserver though, you don’t have this issue. It takes a few seconds to complete a year rollover. You can even script it like I did.


export MAILUSER='email address here'
export LASTYEAR='2015'
export THISYEAR='2016'

# Stop postfix first. this way server side filters aren't being used and mails routed while we fiddle around.
systemctl stop postfix

# Now we can fiddle with mailboxes

# First, we want to make the new archive.

doveadm mailbox create -u ${MAILUSER} archive.${THISYEAR}

# Create a list of mailboxes.

export MAILBOXES=`doveadm mailbox list -u ${MAILUSER} 'INBOX.*' | awk -F '.' '{print $2}'`

# Now move the directories to archive.
# Create the new equivalents

    doveadm mailbox rename -u ${MAILUSER} INBOX.${MAILBOX} archive.${LASTYEAR}.${MAILBOX}
    doveadm mailbox subscribe -u ${MAILUSER} archive.${LASTYEAR}.${MAILBOX}
    doveadm mailbox create -u ${MAILUSER} INBOX.${MAILBOX}

doveadm mailbox list -u ${MAILUSER}

# Start postfix back up

systemctl start postfix

Now I have clean, shiny mailboxes, all my filters still work, and my previous year’s emails are tucked away for safe keeping and posterity.

The only catch with my script is you need to run it on January 1st, else you get 2016 mails in the 2015 archive. You also still need to move the inbox contents from 2015 manually to the archive. But it’s not nearly the same hassle as moving thousands of mailing list messages around.

Fri, 15 Jan 2016 00:00:00 +1000 <![CDATA[FreeRADIUS: Using mschapv2 with freeipa]]> FreeRADIUS: Using mschapv2 with freeipa

I no longer recommend using FreeIPA - Read more here!

Wireless and radius is pretty much useless without mschapv2 and peap. This is because iPhones, androids, even linux have fundamental issues with ttls or other 802.1x modes. mschapv2 “just works”, yet it’s one of the most obscure to get working in some cases without AD.

If you have an active directory environment, it’s pretty well a painless process. But when you want to use anything else, you are in a tight spot.

The FreeRADIUS team go on a lot about how mschapv2 doesn’t work with ldap: and they are correct. mschapv2 is a challenge response protocol, and you can’t do that in conjunction with an ldap bind.

However it IS possible to use mschapv2 with an ldap server: It’s just not obvious or straight forwards.

The way that this works is you need freeradius to look up a user to an ldap dn, then you read (not bind) the nthash of the user from their dn. From there, the FreeRADIUS server is able to conduct the challenge response component.

So the main things here to note:

  • nthash are pretty much an md4. They are broken and terrible. But you need to use them, so you need to secure the access to these.
  • Because you need to secure these, you need to be sure your access controls are correct.

We can pretty easily make this setup work with freeipa in fact.

First, follow the contents of my previous blog post on how to setup the adtrust components and the access controls.

You don’t actually need to complete the trust with AD, you just need to run the setup util, as this triggers IPA to generate and store nthashes in ipaNTHash on the user account.

Now armed with your service account that can read these hashes, and the password, we need to configure FreeRADIUS.


Thankfully, the developers provide an excellent default configuration that should only need minimal tweaks to make this configuration work.

first, symlink ldap to mods-enabled

cd /etc/raddb/mods-enabled
ln -s ../mods-available/ldap ./ldap

Now, edit the ldap config in mods-available (That way if a swap file is made, it’s not put into mods-enabled where it may do damage)

You need to change the parameters to match your site, however the most important setting is:

identity = krbprincipalname=radius/,cn=services,cn=accounts,dc=ipa,dc=example,dc=net,dc=au


update {
      control:NT-Password··   := 'ipaNTHash'

 .....snip ....

user {
       base_dn = "cn=users,cn=accounts,dc=ipa,dc=example,dc=net,dc=au"
       filter = "(uid=%{%{Stripped-User-Name}:-%{User-Name}})"

Next, you want to edit the mods-available/eap

you want to change the value of default_eap_type to:

default_eap_type = mschapv2

Finally, you need to update your sites-available, most likely inner-tunnel and default to make sure that they contain:

authorize {

      ....snip .....


That’s it! Now you should be able to test an ldap account with radtest, using the default NAS configured in /etc/raddb/clients.conf.

radtest -t mschap william password 0 testing123
    User-Name = 'william'
    NAS-IP-Address =
    NAS-Port = 0
    Message-Authenticator = 0x00
    MS-CHAP-Challenge = 0x642690f62148e238
        MS-CHAP-Response = ....
Received Access-Accept Id 130 from to length 84
    MS-CHAP-MPPE-Keys = 0x
    MS-MPPE-Encryption-Policy = Encryption-Allowed
    MS-MPPE-Encryption-Types = RC4-40or128-bit-Allowed

Why not use KRB?

I was asked in IRC about using KRB keytabs for authenticating the service account. Now the configuration is quite easy - but I won’t put it hear.

The issue is that it opens up a number of weaknesses. Between FreeRADIUS and LDAP you have communication. Now FreeIPA/389DS doesn’t allow GSSAPI over LDAPS/StartTLS. When you are doing an MSCHAPv2 authentication this isn’t so bad: FreeRADIUS authenticates with GSSAPI with encryption layers, then reads the NTHash. The NTHash is used inside FreeRADIUS to generate the challenge, and the 802.1x authentication suceeds or fails.

Now what happens when we use PAP instead? FreeRADIUS can either read the NTHash and do a comparison (as above), or it can directly bind to the LDAP server. This means in the direct bind case, that the transport may not be encrypted due to the keytab. See, the keytab when used for the service account, will install encryption, but when the simple bind occurs, we don’t have GSSAPI material, so we would send this clear text.

Which one will occur … Who knows! FreeRADIUS is a complex piece of software, as is LDAP. Unless you are willing to test all the different possibilities of 802.1x types and LDAP i