Weekly Meeting: 2015-01-14

Agenda

"Mirage OS" or "MirageOS"?
Pull OCaml runtime/bindings into packages / OCaml 4.02.1 support
TLS on Xen
Error handling in Mirage
Planning for 3.0

Attendees: Amir Chaudhry (chair), Thomas Gazagnaire, David Kaloper, Thomas Leonard, Jon Ludlam, Anil Madhavapeddy, Hannes Mehnert, Richard Mortier, Dave Scott and Magnus Skjegstad

Notes

"Mirage OS" or "MirageOS"?

We need to settle on how we refer to the project and then be consistent in its use. In the early days it was referred to variously as 'Mirage', 'Open Mirage', 'Mirage OS' and 'MirageOS'. When we became a Xen Incubated project, we had a brief discussion about the 'official' name (for press releases etc) and we settled on 'Mirage OS'. However, we haven't always been consistent with the way we use the term. So far, all the official materials, such as new releases, have referred to 'Mirage OS' (with a space), whereas on Twitter and our slides we see 'MirageOS' (without a space) being used. We should stick to only one of these.

Why does this matter? - As we attract more users and raise more awareness, being consistent in how we represent ourselves will begin to matter more. The name is just one of these aspects and even though it may seem trivial, it is the name of the project we're discussing here. At this stage, there should be no confusion about how it's written.

After a brief discussion, it's clear that most of the team are less concerned about the exact choice than about being consistent. Only Amir and Anil had opinions to share and they both proposed that we use 'MirageOS' from now on. The reasons are that users have independently used this form in tweets and posts, it is easier to search for, and it's better to go with the flow while we still can.

Overall, we unanimously decided on two things: (1) to move forward with 'MirageOS' as the official name of the project and (2) to be much more consistent in using it in our written material.

In terms of practical steps, Amir will update things as he sees them on the website and written materials. Anil suggested that we may start using a different domain for the site too (mirageos.org).

OCaml 4.02.1 support

Anil's working on this (mirage/mirage-platform#115) and thanks ThomasL for his patches. He hasn't had time to merge yet but will do soon. He might not rebase it and hopes that won't be too much of a problem for anyone. In general we shouldn't do that, but this one would take a while to rebase and he would prefer to merge quickly. The upcoming functor inline patch will have a big performance boost, but we should clearly state that we're not dropping support for 4.01. The patch applies cleanly on trunk (it's currently in a branch) and the work already done to port Mirage to 4.02.1 should mean that any additional changes we need to make are trivial.

In addition, we should be careful to maintain support for OPAM 1.1 or, at the very least, warn the user if something won't work with v1.1. Many of the upstream distros are still on this version and silent failures result in confusion and frustration. There is also an increasing number of libraries going for 4.02.1 support only, but it would be better to try and maintain support for 4.01 as long as possible (for the same reasons as above).

TLS on Xen

ThomasL started an email thread about getting the TLS stack working on Xen. He's submitted a number of patches for review to different libraries and the changes needed are:

Add generic error handling for FLOWs, so we can propagate errors reliably.
Fix the page alignment requirements for Netif.
Add TLS support to conduit.

TLS now works if you apply lots of patches and works pretty well (half the speed of not using TLS — which is good!). Issue now is how to merge without breaking things as certain module signatures change. There was some discussion of functorising and the knock-on effect this would have but overall it seems like we can keep this as an interface-level problem.

There's still the question of getting entropy in Xen. We need several ways to do this and xentropyd is the one way we currently have if dom0 cooperates. This is fine for our infrastructure as we control dom0 (on cubieboards and x86). However, we also need to be able to gather entropy in other ways for deployments to third-party infrastructure. The actual code for getting entropy is straightforward but how it is then composed needs to be considered.

In general, there's reluctance to release the TLS stuff until there's an entropy story so DavidK may take a look at the options. In the meantime, we're happy to have an intermediate patch that only uses xentropyd and we can then use it for websites hosted on our own infrastructure.

Error handling in Mirage

ThomasL has been working on this. To make conduit FLOW wrap the TLS flows we need a way to propagate the errors sensibly, so he has added a type to FLOW that can do this. It might be nice to have an algebraic data type to describe an error type, or if we wait a few months until we switch to 4.02.1, then we can just use Open Types (see additional comment from ThomasL below). The nice thing with the current patch is that you don't lose the error type. It works well but there's an issue regarding merge order, e.g. TLS and conduit, where they use it but also provide it. The easiest way would be to adjust the dependencies and push things to mirage dev and then co-ordinate the releases. It then becomes straightforward to get things merged into the main OPAM repository.
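To make the idea concrete, here is a minimal sketch of what "a type on FLOW plus a way to describe it" could look like. It is not the actual patch: the signature is a simplified, synchronous stand-in (the real Mirage signatures use Lwt and Cstruct.t), and the names Fake_tcp, Wrap, Describe, Handshake_failed and Underlying are all made up for illustration.

```ocaml
(* Simplified FLOW-like signature: each implementation keeps its own
   error type but must be able to describe it as a string.
   Assumption: synchronous, string-based reads for brevity. *)
module type FLOW = sig
  type flow
  type error
  val error_message : error -> string
  val read : flow -> [ `Ok of string | `Error of error ]
end

(* Hypothetical TCP-like flow that either holds data or is broken. *)
module Fake_tcp = struct
  type error = Refused | Timeout
  type flow = Data of string | Broken of error

  let error_message = function
    | Refused -> "connection refused"
    | Timeout -> "connection timed out"

  let read = function
    | Data s -> `Ok s
    | Broken e -> `Error e
end

(* Hypothetical wrapper (think TLS over some flow): it adds its own
   errors and carries the underlying error along, so nothing is lost. *)
module Wrap (F : FLOW) = struct
  type flow = F.flow
  type error = Handshake_failed | Underlying of F.error

  let error_message = function
    | Handshake_failed -> "handshake failed"
    | Underlying e -> "underlying flow: " ^ F.error_message e

  let read flow =
    match F.read flow with
    | `Ok s -> `Ok s
    | `Error e -> `Error (Underlying e)
end

(* Generic code that knows nothing about concrete errors can still
   produce a diagnostic via error_message. *)
module Describe (F : FLOW) = struct
  let read_or_report flow =
    match F.read flow with
    | `Ok s -> "read: " ^ s
    | `Error e -> "error: " ^ F.error_message e
end

module Tls_over_tcp = Wrap (Fake_tcp)
module Report = Describe (Tls_over_tcp)

let () =
  print_endline (Report.read_or_report (Fake_tcp.Broken Fake_tcp.Refused))
```

Callers that care can still match on Underlying Refused, while generic code such as Describe only needs error_message.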

Another comment was that it might make sense to have the error handling in DEVICE and have it everywhere, but that would touch about 20 repos, so we're not likely to do that just yet.

[Thomas Leonard additionally sent the following comment:]

"We already have algebraic data types for errors, which is good if you want to match on a particular error from a particular concrete implementation (e.g. FAT_index_corrupted or something). In theory, this is slightly more correct than using exceptions because error returns don't propagate automatically. e.g. if you get Refused from TCP then you know it was the TCP connection itself that was refused, not some other refusal from another function TCP called that happened to throw the same exception. In practice, this doesn't seem to be a problem except when using very generic exceptions like Not_found.

I guess using open types means we can list some common errors in the interfaces but have implementations provide more, which may be useful but doesn't solve my problem: when you don't want to handle specific exceptions we need a way to get a human-readable string for display (and this is the common case).

So:

Replacing error codes with exceptions (which I proposed originally) would work most of the time and would simplify the code greatly, although it's slightly less strict than the current system. On the other hand, the current system encourages people to throw away errors (look at any main.ml for examples!) so I'm sure it would be an overall improvement. It would require API changes everywhere, however.
Adding an error_message function (which my current patches do) means that people can still handle errors as before when needed, but can also always get a human-readable string too, so there's never any excuse for throwing away diagnostic information. You do still need to write error handling code everywhere when you just want to propagate exceptions, though, which clutters the code, and old code will continue to discard useful information for some time to come."
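For reference, the open-types option mentioned above could look roughly like the sketch below, using the extensible variant types introduced in OCaml 4.02. This is an illustration only, not something proposed on the call: the printer registry (modelled loosely on Printexc.register_printer) is one assumed way of addressing the human-readable string concern, and the names register_printer, Fat and Index_corrupted are hypothetical.

```ocaml
(* An open error type: interfaces can list common cases up front... *)
type error = ..
type error += Refused | Timeout

(* ...and a registry of printers lets any case be turned into a string.
   Assumption: printers return None for errors they do not recognise. *)
let printers : (error -> string option) list ref = ref []

let register_printer p = printers := p :: !printers

let error_message e =
  let rec try_all = function
    | [] -> "unknown error"
    | p :: rest ->
      (match p e with
       | Some msg -> msg
       | None -> try_all rest)
  in
  try_all !printers

let () =
  register_printer (function
    | Refused -> Some "connection refused"
    | Timeout -> Some "connection timed out"
    | _ -> None)

(* A concrete implementation extends the same type with its own case
   and registers a printer for it. *)
module Fat = struct
  type error += Index_corrupted of int

  let () =
    register_printer (function
      | Index_corrupted sector ->
        Some (Printf.sprintf "FAT index corrupted at sector %d" sector)
      | _ -> None)
end

let () = print_endline (error_message (Fat.Index_corrupted 7))
```

Note the trade-off: matching on an open type always needs a catch-all case, so the compiler can no longer check that every error is handled.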

Planning for 3.0

Amir would like to start putting together plans for a 3.0 release but there wasn't time to begin discussions on this call. Will schedule it for next time and keep it as a recurring item once every other call (i.e. once a month).

AoB

Need to get Jitsu ready in the next couple of weeks. Need libxl patches from DavidS/JonL and they're probably ok (will submit as an RFC). There's also a desire to calculate timings from HTTP request to serving the response, so we shared some ideas about how best to do this. Amir also has a use-case he wants to deploy and write about.

The next call is scheduled for Wednesday, 28th January. Please add any agenda items you wish to discuss in advance and refer to the mailing list for actual details a day or so in advance.
