Welcome to Gentoo Universe, an aggregation of weblog articles on all topics written by Gentoo developers. For a more refined aggregation of Gentoo-related topics only, you might be interested in Planet Gentoo.

Views expressed in the content published here do not necessarily represent the views of Gentoo Linux or the Gentoo Foundation.
February 23, 2020
Alice Ferrazzi a.k.a. alicef (homepage, bugs)
Searx and Gentoo wiki search (February 23, 2020, 15:00 UTC)

Two years ago I started to get interested in self-hosting services. I began moving away from proprietary services and towards self-hosting, mainly because the proprietary services kept disabling features that I liked, and I had no way to contribute or to see how they worked.
That is what made me look into https://old.reddit.com/r/selfhosted/ and https://www.privacytools.io/ That is when I discovered searx; as the GitHub page says, searx is a "Privacy-respecting metasearch engine".
As with any self-hosted service, you can easily install it on your own server or on your local computer.
For the installation instructions, go here or use the searx-docker project.
As I use self-hosted services partly because I like to contribute back,
after a quick look I decided to add a meta-engine to searx.
Specifically, a Gentoo wiki search meta-engine: Pull request #1368

Gentoo wiki search is usually enabled by default in the IT tab :)


Gentoo wiki search can also be used via the searx shortcut system (the same as bangs in DuckDuckGo, if you are familiar with those).

The Gentoo wiki search shortcut is !ge
To conclude: have fun with searx and Gentoo!

I also highly recommend running your own searx instance, but you can also play with the public instances.

February 21, 2020
Michał Górny a.k.a. mgorny (homepage, bugs)
Gentoo Python Guide (February 21, 2020, 10:35 UTC)

Gentoo provides one of the best frameworks among operating systems for Python support in packages. This includes support for running multiple versions of Python (while most other distributions avoid going beyond simultaneous support for Python 2 and one version of Python 3), alternative implementations of Python, reliable tests, and deep QA checks. While we aim to keep things simple, this is not always possible.

At the same time, the available documentation is limited and not always up-to-date. Both the built-in eclass documentation and the Python project wiki page provide bits of documentation, but they are mostly in reference form and not very suitable for beginners or for people who do not actively follow the developments within the ecosystem. This results in suboptimal ebuilds, improper dependencies, and missing tests.

The Gentoo Python Guide aims to fill the gap by providing good, complete, topic-oriented (rather than reference-style) documentation for the Python ecosystem in Gentoo and the relevant eclasses. Combined with examples, it should help you write good ebuilds and solve common problems as simply as possible.

Gentoo Python Guide sources are available on GitHub. Suggestions and improvements are welcome.

February 10, 2020
Michał Górny a.k.a. mgorny (homepage, bugs)
No more PYTHON_TARGETS in single-r1 (February 10, 2020, 07:39 UTC)

Since its inception in 2012, python-single-r1 has been haunting users with two sets of USE flags: PYTHON_TARGETS and PYTHON_SINGLE_TARGET. While this initially seemed a necessary part of the grand design, today I know we could have done better. Today this chimera is disappearing for real, and python-single-r1 packages are going to use PYTHON_SINGLE_TARGET flags only.

I would like to take this opportunity to explain why the eclass has been designed this way in the first place, and what has been done to change that.


Why did we need a second variable in the first place? After all, we could probably get away with using PYTHON_TARGETS everywhere, and adding an appropriate REQUIRED_USE constraint.

Back in the day, we established that for users’ convenience we need to default to enabling one version of Python 2 and one version of Python 3. If we enabled only one of them, users would end up having to enable the other for a lot of packages. On the other hand, if we combined both defaults with using PT for single-r1 packages, users would have to disable the extra implementation for a lot of them. Neither option was good.

The primary purpose of PYTHON_SINGLE_TARGET was to provide a parallel sensible setting for those packages. It was not only to make the default work out of the box but also to let users change it in one step.

Today, with the demise of Python 2 and the effort to remove Python 2 from default PT, it may seem less important to keep the distinction. Nevertheless, a number of developers and at least some users keep multiple versions of Python in PT to test their packages. Having PST is still helpful to them.
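In configuration terms, the PT/PST split described above looks like the following in /etc/portage/make.conf (the exact target values here are illustrative for that era):

```shell
# Multi-impl packages are built for every target listed here...
PYTHON_TARGETS="python2_7 python3_6"
# ...while single-r1 packages use only the single preferred implementation.
PYTHON_SINGLE_TARGET="python3_6"
```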

Why additional PYTHON_TARGETS then?

PST is only half of the story. What I explained above does not justify having PYTHON_TARGETS on those packages as well, along with a REQUIRED_USE constraint making them a superset of the enabled PST. Why did we need to have two flag sets then?

The answer is: PYTHON_USEDEP. The initial design goal was that both python-r1 eclasses would use the same approach to declaring USE dependencies between packages. This also meant that this variable had to work alike on both multi-impl and single-r1 dependencies. In the end, this meant a gross hack.

Without getting into details, the currently available USE dependency syntax does not permit directly depending on PT flags based on PST-based conditions. This needs to be done using the more verbose expanded syntax:

pst2_7? ( foo[pt2_7] )
pst3_7? ( foo[pt3_7] )

While this was doable back in the day, it was not possible with the PYTHON_USEDEP-based approach. Hence, all single-r1 packages gained an additional set of flags merely to construct dependencies conveniently.

What is the problem with that?

I suppose some of you see the problems already. Nevertheless, let’s list them explicitly.

Firstly, enabling additional implementations is inconvenient. Whenever you need to do that, you need to add both PST and PT flags.

Secondly, the PT flags are entirely redundant and meaningless for the package in question. Whenever your value of PT changes, all single-r1 packages trigger rebuilds even if their PST value stays the same.

Thirdly, the PT flags overspecify dependencies. If your PT flags specify multiple implementations (which is normally the case), all dependencies will also have to be built for those interpreters even though PST requires only one of them.

The solution

The user-visible part of the solution is that PYTHON_TARGETS are disappearing from single-r1 packages. From now on, only PYTHON_SINGLE_TARGET will be necessary. Furthermore, PT enforcement on dependencies (if necessary) will be limited to the single implementation selected by PST rather than all of PT.

The developer-oriented part is that PYTHON_USEDEP is no longer valid in single-r1 packages. Instead, PYTHON_SINGLE_USEDEP is provided for dependencies on other single-r1 packages, and the PYTHON_MULTI_USEDEP placeholder is used for multi-impl packages. The former is available as a global variable, the latter only as a placeholder in python_gen_cond_dep (the name is a bit of a misnomer now but I’ve decided not to introduce an additional function).

All existing uses have been converted, and the eclasses will now fail if someone tries to use the old logic. The conversion of existing ebuilds is rather simple:

  1. Replace all ${PYTHON_USEDEP}s with ${PYTHON_SINGLE_USEDEP} when the dep is single-r1, or with ${PYTHON_MULTI_USEDEP} otherwise.
  2. Wrap all dependencies containing ${PYTHON_MULTI_USEDEP} in a python_gen_cond_dep. Remember that the variable must be a literal placeholder, i.e. use single quotes.

An example of the new logic follows (the package names are illustrative):

  RDEPEND="
    dev-python/foo[${PYTHON_SINGLE_USEDEP}]
    $(python_gen_cond_dep '
      dev-python/bar[${PYTHON_MULTI_USEDEP}]
    ')"

If you get the dependency type wrong, repoman/pkgcheck will complain about a bad dependency.

January 31, 2020
Thomas Raschbacher a.k.a. lordvan (homepage, bugs)

Recently I have heard from several people that, even here in Austria, there keep being calls supposedly from Microsoft "about my computer" ...
Ultimately they want to connect remotely - I presume to install malware on your PC or some such crap ... the last one I couldn't bear to listen to any longer, because his English (reading off a sheet/screen as far as I can tell) was so bad that it practically hurt my ears .. O.o

I will see if I can find somewhere to report this.

Here are the numbers in case someone gets a similar call from them:
+44 20 8491 1893
+44 20 8044 7563

If anyone knows where to report them, please let me know. (My email is not hard to find ^^)

January 03, 2020
FOSDEM 2020 (January 03, 2020, 00:00 UTC)


It’s FOSDEM time again! Join us at Université libre de Bruxelles, Campus du Solbosch, in Brussels, Belgium. This year’s FOSDEM 2020 will be held on February 1st and 2nd.

Our developers will be happy to greet all open source enthusiasts at our Gentoo stand in building K where we will also celebrate 20 years compiling! Visit this year’s wiki page to see who’s coming.

December 28, 2019
Alexys Jacob a.k.a. ultrabug (homepage, bugs)
Scylla Summit 2019 (December 28, 2019, 19:04 UTC)

I’ve had the pleasure to attend again and present at the Scylla Summit in San Francisco and the honor to be awarded the Most innovative use case of Scylla.

It was a great event, full of friendly people and passionate conversations. Peter did a great full write-up of it already so I wanted to share some of my notes instead…

This is a curated set of topics that I happened to question or discuss in depth, so this post is not meant to be taken as full coverage of the conference.

Scylla Manager version 2

The upcoming version of scylla-manager is dropping its dependency on SSH setup which will be replaced by an agent, most likely shipped as a separate package.

On the features side, I was a bit puzzled by the fact that ScyllaDB is advertising that its manager will provide a repair scheduling window so that you can control when it’s running or not.

Why did it strike me, you ask?

Because MongoDB does the same thing within its balancer process and I always thought of this as a patch to a feature that the database should be able to cope with by itself.

And that database-does-it-better-than-you motto is exactly one of the promises of Scylla, the boring database, so smart at handling workload impacts on performance that you shouldn’t have to start playing tricks to mitigate them… I don’t want this time window feature of scylla-manager to be a trojan horse leading to the demise of that promise!


They almost arrived late on this one, but they are working hard to play well with Kubernetes, the new toy of every tech shop around the world. Helm charts are also being worked on!

The community-developed scylla operator by Yannis is now being worked on and backed by ScyllaDB. It can deploy a cluster and scale it up and down.

Few things to note:

  • it’s using a configmap to store the scylla config
  • no TLS support yet
  • no RBAC support yet
  • kubernetes networking takes a lighter toll on network performance than the hit that was seen with Docker
  • use placement strategies to dedicate kubernetes nodes to scylla!

Change Data Capture

Oh boy this one was awaited… but it’s now coming soon!

I inquired about its performance impact, since every operation will be written to a table. Clearly my questioning was a bit alpha since CDC is still being worked on.

I had the chance to discuss ideas with Kamil, Tzach and Dor: one of the things my colleague Julien asked for was the ability for CDC to generate an event when a tombstone is written, so we could actually know when a specific piece of data expired!

I want to stress a few other things too:

  • default TTL on CDC table is 24H
  • expect I/O impact (logical)
  • TTL tombstones can have a hidden disk space cost and nobody was able to tell me if the CDC table was going to be configured with a lower gc_grace_period than the default 10 days so that’s something we need to keep in mind and check for
  • there was no plan to add user information that would allow us to know who actually did the operation, so that’s something I asked for because it could be used as a cheap and open source way to get auditing!

LightWeight Transactions

Another so long awaited feature is also coming from the amazing work and knowledge of Konstantin. We had a great conversation about the differences between the currently worked on Paxos based LWT implementation and the maybe later Raft one.

So yes, the first LWT implementation will be using Paxos as a consensus algorithm. This will make the LWT feature very consistent while having it slower than what could be achieved using Raft. That’s why ScyllaDB has plans for another implementation that could be faster, with fewer data consistency guarantees.

User Defined Functions / Aggregations

This one is bringing the Lua language inside Scylla!

To be precise, it will be a Lua JIT, as its footprint is low and Lua can be cooperative enough, but the ScyllaDB people made sure to monitor its violations (when it should yield but does not) and act strongly upon them.

I got into implementation details with Avi, this is what I noted:

  • lua function return type is not checked at creation but at execution, so expect runtime errors if your lua code is bad
  • since lua is lightweight, there’s no need to assign a core to lua execution
  • I found UDA examples, like top-k rows, to be very similar to the Map/Reduce logic
  • UDF will allow simpler token range full table scans thanks to syntax sugar
  • there will be memory limits applied to result sets from UDA, and they will be tunable

Text search

Dejan is the text search guy at ScyllaDB and the one who kindly implemented the LIKE feature we asked for and that will be released in the upcoming 3.2 version.

We discussed ideas and projected use cases to make sure that what’s going to be worked on will be used!

Redis API

I’ve always been frustrated about Redis because while I love the technology I never trusted its clustering and scaling capabilities.

What if you could scale your Redis like Scylla without giving up on performance? That’s what the implementation of the Redis API backed by Scylla will get us!

I’m desperately looking forward to seeing this happen!

December 24, 2019
Michał Górny a.k.a. mgorny (homepage, bugs)

So far, the majority of Python packages have either used distutils, or a build system built upon it. Most frequently, this was setuptools. All those solutions provided a setup.py script with a semi-standard interface, and we were able to handle them reliably within distutils-r1.eclass. PEP 517 changed that.

Instead of a setup script, packages now only need to supply declarative project information in a pyproject.toml file (fun fact: a TOML parser is not even part of the Python stdlib yet). The build system used is specified as a combination of a package requirement and a backend object to use. The backends are expected to provide a very narrow API: it’s limited to building wheel packages and source distribution tarballs.
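For reference, the build system declaration in pyproject.toml takes only a couple of lines; this is what a typical flit-based project declares (version constraints omitted):

```toml
[build-system]
requires = ["flit_core"]
build-backend = "flit_core.buildapi"
```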

The new build systems built around this concept are troublesome to Gentoo. They are more focused on being standalone package managers than build systems. They lack the APIs matching our needs. They have large dependency trees, including circular dependencies. Hence, we’ve decided to try an alternate route.

Instead of trying to tame the new build systems, or work around their deficiencies (i.e. by making them build wheel packages, then unpacking and repackaging them), we’ve explored the possibility of converting the pyproject.toml files into setup.py scripts. Since the new formats are declarative, this should not be that hard.

We’ve found poetry-setup project which seemed to have a similar goal. However, it was already discontinued at the time in favor of dephell. The latter project looked pretty powerful but the name was pretty ominous. We did not need most of the functions, and it was hell to package.

Finally, I’ve managed to dedicate some time to building an in-house solution instead. pyproject2setuppy is a smallish (<100 SLOC) pyproject.toml-to-setuptools adapter which allows us to run flit- or poetry-based projects as if they used regular distutils. While it’s quite limited, it’s good enough to build and install the packages that we needed to deal with so far.

The design is quite simple — it reads pyproject.toml and calls setuptools’ setup() function with the metadata read. As such, the package can even be used to provide a backwards-compatible setup.py script in other packages. In fact, this is how its own setup.py works — it carries flit-compatible pyproject.toml and uses itself to install itself via setuptools.

dev-python/pyproject2setuppy is already packaged in Gentoo. I’ve sent eclass patches to easily integrate it into distutils-r1. Once they are merged, installing pyproject.toml packages should be as simple as adding the following declaration into ebuilds:


This should make things easier both for us (as it saves us from having to hurriedly add new build systems and their NIH dependencies) and for users who will not have to suffer from more circular dependencies in the Python world. It may also help some upstream projects to maintain backwards compatibility while migrating to new build systems.

December 19, 2019
Michał Górny a.k.a. mgorny (homepage, bugs)
A distribution kernel for Gentoo (December 19, 2019, 12:32 UTC)

The traditional Gentoo way of getting a kernel is to install the sources, and then configure and build one yourself. For those who didn’t want to go through the tedious process of configuring it manually, an alternative route of using genkernel was provided. However, neither of those variants was able to really provide the equivalent of kernels provided by binary distributions.

I manually configured the kernels for my private systems a long time ago. Today, I wouldn’t really have bothered. In fact, I realized that for some time I’ve been really hesitant to even upgrade them because of the effort needed to update the configuration. The worst part is, whenever a new kernel does not boot, I have to ask myself: is it a real bug, or is it my fault for configuring it wrong?

I’m not alone in this. Recently Михаил Коляда (zlogene) has talked to me about providing binary kernels for Gentoo. While I have not strictly implemented what he had in mind, he inspired me to start working on a distribution kernel. The goal was to create a kernel package that users can install to get a working kernel with minimal effort, and that would be upgraded automatically as part of regular @world upgrades.

Pros and cons of your own kernel

If I am to justify switching from the old tradition of custom kernels to a universal kernel package, I should start by discussing the reasons why you may want to configure a custom kernel in the first place.

In my opinion, the most important feature of a custom kernel is that you can fine-tune it to your hardware. You just have to build the drivers you need (or may need), and the features you care about. The modules for my last custom kernel have occupied 44 MiB. The modules for the distribution kernel occupy 294 MiB. Such a difference in size also comes with a proportional increase of build time. This can be an important argument for people with low-end hardware. On the other hand, the distribution kernel permits building reusable binary packages that can save more computing power.

The traditional Gentoo argument is performance. However, these days I would be very careful arguing about that. I suppose you are able to reap benefits if you know how to configure your kernel towards a specific workload. But then — a misconfiguration can have the exact opposite effect. We must not forget that binary distributions are important players in the field — and the kernel must also be able to achieve good performance when not using a dedicated configuration.

At some point I have worked on achieving a very fast startup. For this reason I’ve switched to using LILO as the bootloader, and a kernel suitable for booting my system without an initramfs. A universal kernel naturally needs an initramfs, and is slower to boot.

The main counterargument is the effort. As mentioned earlier, I’ve personally grown tired of having to manually deal with my kernel. Do the potential gains mentioned outweigh the loss of human time on configuring and maintaining a custom kernel?

Creating a truly universal kernel

A distribution kernel makes sense only if it works on a wide range of systems. Furthermore, I didn’t forget the original idea of binary kernel packages. I didn’t merely want to write an ebuild that can install a working kernel anywhere. I wanted to create an ebuild that can be used to build a binary package that’s going to work on a wide range of setups — including not only different hardware but also different bootloaders and /boot layouts. A package that would work fine both for my ‘traditional’ LILO setup and a UEFI systemd-boot setup.

The first part of a distribution kernel is the right configuration. I wanted to use a well-tested configuration known to build kernels used on many systems, while at the same time minimizing the maintenance effort on our end. Reusing the configuration from a binary distro was the obvious solution. I went for using the config from Arch Linux’s kernel package with minimal changes (e.g. changing the default hostname to Gentoo).

The second part is an initramfs. Since we need to support a wide variety of setups, we can’t get away without it. To follow the configuration used, Dracut was the natural choice.

The third and hardest part is installing it all. Since I’ve already set a goal of reusing the same binary package on different filesystem layouts, the actual installation needed to be moved to postinst phase. Our distribution kernel package installs the kernel into an interim location which is entirely setup-independent, rendering the binary packages setup-agnostic as well. The initramfs is created and installed into the final location along with the kernel in pkg_postinst.

Support for different install layouts is provided by reusing the installkernel tool, originally installed by debianutils. As part of the effort, it was extended with initramfs support and moved into a separate sys-kernel/installkernel-gentoo package. Furthermore, an alternative sys-kernel/installkernel-systemd-boot package was created to provide an out-of-the-box support for systemd-boot layout. If neither of those two work for you, you can easily create your own /usr/local/bin/installkernel that follows your own layout.
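As a sketch of what the core of such a custom script could look like; the function name is mine, and the argument order follows the traditional installkernel convention (version, kernel image, System.map, optional destination):

```shell
# Copy the kernel image and System.map into version-suffixed locations
# under the destination directory (default: /boot).
installkernel_sketch() {
    version="$1"; image="$2"; map="$3"; dest="${4:-/boot}"
    install -m 0644 "$image" "$dest/vmlinuz-$version"
    install -m 0644 "$map" "$dest/System.map-$version"
    # A real script would also place the initramfs and update the bootloader.
}
```

A real /usr/local/bin/installkernel would be a standalone script rather than a function, but the logic is the same.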


The experimental versions of the distribution kernel are packaged as sys-kernel/vanilla-kernel (in distinction from sys-kernel/vanilla-sources which installs the sources). Besides providing the default zero-effort setup, the package supports using your own configuration via savedconfig (but with no easy way to update it at the moment). It also provides a forced flag that can be used by expert users to disable the initramfs.

The primary goal at the moment is to test the package and find bugs that could prevent our users from using it. In the future, we’re planning to extend it to other architectures, kernel variants (Gentoo patch set in particular) and LTS versions. We’re also considering providing prebuilt binary packages — however, this will probably be a part of a bigger effort into providing an official Gentoo binhost.

December 16, 2019
Hanno Böck a.k.a. hanno (homepage, bugs)
#include </etc/shadow> (December 16, 2019, 17:38 UTC)

Recently I saw a tweet where someone mentioned that you can include /dev/stdin in C code compiled with gcc. This is, to say the very least, surprising.

When you see something like this with an IT security background you start to wonder if this can be abused for an attack. While I couldn't come up with anything, I started to wonder what else you could include. As you can basically include arbitrary paths on a system this may be used to exfiltrate data - if you can convince someone else to compile your code.

There are plenty of webpages that offer online services where you can type in C code and run it. It is obvious that such systems are insecure if the code running is not sandboxed in some way. But is it equally obvious that the compiler also needs to be sandboxed?

How would you attack something like this? Exfiltrating data directly through the code is relatively difficult, because you need to include data that ends up being valid C code. Maybe there's a trick to make something like /etc/shadow valid C code (you can put code before and after the include), but I haven't found it. But it's not needed either: The error messages you get from the compiler are all you need. All online tools I tested will show you the errors if your code doesn't compile.

I even found one service that allowed me to add

#include </etc/shadow>

and showed me the hash of the root password. This effectively means this service is running compile tasks as root.

Including various files in /etc allows one to learn something about the system. For example, /etc/lsb-release often gives information about the distribution in use. Interestingly, including pseudo-files from /proc does not work; it seems gcc treats them like empty files. This limits the possibilities to learn about the system. /sys and /dev work, but they contain less human-readable information.

In summary, I think services letting other people compile code should sandbox the compile process and thus make sure no interesting information can be exfiltrated via these attack vectors.

December 12, 2019
Michał Górny a.k.a. mgorny (homepage, bugs)

Many developers today continue using repoman commit as their primary way of committing to Gentoo. While this tool was quite helpful, if not indispensable, in the times of CVS, today it’s a burden. The workflow of using a single serial tool to check your packages and commit to them is not very efficient. Not only does it waste your time and slow you down — it discourages you from splitting your changes into more atomic commits.

Upon hearing the pkgcheck advocacy, many developers ask whether it can commit for you. It won’t do that; that’s not its purpose. Not only would it be a waste of time to implement that — it would actually make pkgcheck a worse tool. With its parallel engine, pkgcheck really shines when dealing with multiple packages — forcing it to work on one package at a time is a waste of its potential.

Rather than perpetuating your bad old habits, you should learn how to use git and pkgcheck efficiently. This post aims to give you a few pieces of advice.

pkgcheck after committing

Repoman was built under the assumption that checks should be done prior to committing. That is understandable when you’re working on a ‘live’ repository like the ones used by CVS or Subversion. However, in the case of VCSes involving staged commits, such as Git, there is no real difference between checking before or after the commit. The most efficient pkgcheck workflow is to check once all changes are committed and you are ready to push.

The most recent version of pkgcheck has a command just for that:

$ pkgcheck scan --commits

Yes, it’s that simple. It checks what you’ve committed compared to origin (note: you’ll need to have a correct origin remote), and runs scan on all those packages. Now, if you’re committing changes to multiple packages (which should be pretty common), the scan is run in parallel to utilize your CPU power better.

You might say: but repoman ensures that my commit message is neat these days! Guess what. The --commits option does exactly that — it raises warnings if your commit message is bad. Admittedly, it only checks the summary line at the moment, but that’s something that can (and will) be improved easily.

And I’ve forgotten the coolest thing of all: pkgcheck also reports if you accidentally remove the newest ebuild with stable keywords on a given arch!

One more tip. You can use the following option to include full live verification of URLs:

$ pkgcheck scan --net --commits

Again, this is a feature missing entirely from repoman.

pkgcommit to ease committing to ebuilds

While the majority of repoman’s VCS support is superficial or better implemented elsewhere, there’s one killer feature worth keeping: automatically prepending the package name to the summary line. Since that is a really trivial thing, I’ve reimplemented it in a few lines of bash as pkgcommit.

When run in a package directory, it opens an editor with a pre-filled commit message template to let you type the message in, then passes it along with its own arguments to git. Usually, I use it as follows (I like to be explicit about signoffs and signing; you can make .git/config take care of that):

$ pkgcommit -sS .

Its extra feature is that it processes -m option and lets you skip the editor for simple messages:

$ pkgcommit -sS . -m 'Bump to 1.2.3'

Note that it does not go out of its way to figure out what to commit. You need to either stage changes yourself via git add, or pass appropriate paths to the command. What’s important is that it does not limit you to committing to one directory — you can e.g. include some profile changes easily.
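The summary-prefixing core of this can be sketched in a few lines of shell (a hypothetical reimplementation, not the actual pkgcommit; it assumes you run it from a category/package directory inside the repository and pass the message as the first argument):

```shell
# Derive "category/package" from the path relative to the repository root
# and use it to prefix the commit summary.
pkgcommit_sketch() {
    msg="$1"; shift
    pkg=$(git rev-parse --show-prefix | cut -d/ -f1-2)
    git commit "$@" -m "${pkg}: ${msg}"
}
```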

You’ll also need pkg script from the same repository. Or you just install the whole bundle of app-portage/mgorny-dev-scripts.

Amending commits via fixups

Most of you probably know that you can update commits via git commit --amend. However, that’s useful only for editing the most recent commit. You can also use an interactive rebase to choose specific commits for editing, and then amend them. Yet, usually there’s a much more convenient way of doing that.

In order to commit a fixup to a particular past commit, use:

$ git commit --fixup OLD_COMMIT_ID

This will create a specially titled commit that will be automatically picked up and ordered by the interactive rebase:

$ git rebase -i -S origin

Again, I have a tool of greater convenience. Frequently, I just want to update the latest commit to a particular package (directory). git-fixup does exactly that — it finds the identifier of the latest commit to a particular file/directory (or the current directory when no parameter is given) and commits a fixup to that:

$ git-fixup .
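The whole trick fits in a couple of lines. A hypothetical minimal reimplementation of the idea (not the actual git-fixup code):

```shell
# Find the newest commit touching the given path (default: the current
# directory) and record a --fixup commit against it.
git_fixup_sketch() {
    commit=$(git log -n 1 --format=%H -- "${1:-.}")
    git commit --fixup "$commit"
}
```

The resulting "fixup! …" commit is then folded in by the autosquash pass of the interactive rebase.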

Note that if you try to push fixups into the repository, nothing will stop you. This is one of the reasons I don’t enable signoffs and signing on all commits by default: this way, if I forget to rebase my fixups, the git hook will reject them as lacking a signoff and/or signature.

Again, it is part of app-portage/mgorny-dev-scripts.

Interactive rebase to the rescue

When trivial tools are no longer sufficient, interactive rebase is probably one of the best tools for editing your commits. Start by initiating it for all commits since the last push:

$ git rebase -i -S origin

It will bring your editor with a list of all commits. Using this list, you can do a lot: reorder commits, drop them, reword their commit messages, use squash or fixup to merge them into other commits, and finally: edit them (open for amending).

The interactive rebase is probably the most powerful porcelain git command. I’ve personally found the immediate tips given by git good enough but I realize that many people find it hard nevertheless. Since it’s not my goal here to provide detailed instructions on using git, I’m going to suggest looking online for tutorials and guides. The Rewriting History section of the Git Book also has a few examples.

Before pushing: git log

git log seems to be one of the most underappreciated pre-push tools. However, it can be of great service to you. When run prior to pushing, it can help you verify that what you’re pushing is actually what you meant to push.

$ git log --stat

will list all the commits to be pushed along with a pretty summary of affected files. This can help you notice that you’ve forgotten to git add a patch, that you’ve accidentally committed some extraneous change, or that you’ve mixed changes from two commits.

Of course, you can go even further and take a look at the changes in patch form:

$ git log -p

While I realize this is nothing new or surprising to you, sometimes it’s worthwhile to reiterate the basics in a different context to make you realize something obvious.

November 23, 2019
Luca Barbato a.k.a. lu_zero (homepage, bugs)
rav1e-0.1.0 Made in Tokyo (November 23, 2019, 06:17 UTC)


AV1 is a modern video codec brought to you by an alliance of many different bigger and smaller players in the multimedia field.
I’m part of the VideoLan organization and I spent quite a bit of time on this codec lately.

rav1e: The safest and fastest AV1 encoder, built by many volunteers and Mozilla/Xiph developers.
It is written in Rust and strives to provide good speed and quality while staying maintainable.

Made in Tokyo

The first official release of rav1e happened during the Video Dev Days 2019 in Tokyo.
Since it was originally presented during Video Dev Days 2017, it felt like the right thing to do.

What is inside

Since the last time I blogged about it, there are a few changes:

crav1e is no more

The C API is now part of rav1e itself and everything is built by cargo-c.

cargo install cargo-c
cargo cinstall --destdir=/tmp/staging
sudo cp /tmp/staging/* /

That’s all you need to use rav1e from C.

New API features

Keyframe placement

The Rust API now lets you override the keyframe placement:

  let cfg = Config::default();
  let mut ctx: Context<u8> = cfg.new_context().unwrap();
  let f1 = ctx.new_frame();
  let f2 = f1.clone();
  let info = FrameParameters {
    frame_type_override: FrameTypeOverride::Key
  };

  // Send the plain frame data
  ctx.send_frame(f1)?;
  // Send the data and the per-frame parameters
  // In this case the frame is forced to be a keyframe.
  ctx.send_frame((f2, info))?;
  // Flush the encoder, it is equivalent to a call to `flush()`
  ctx.send_frame(None)?;

Multipass rate control

It is possible to feed the rate control information that Context::twopass_out() produces back in through Context::twopass_in().

The API is intentionally opaque: you deal with pre-serialized data rather than structured information.

Config validation

We added Config::validate() to make sure the settings are correct and return a detailed error (InvalidConfig) if that’s not the case.


Overall, we made it a lot faster:

15:23 < koda> hey guys, did an encode with the new 0.1.0, 20 hours for 8 minutes, down from 32 hours using a two month old build
15:23 < koda> congrats (and thanks) for making these speed improvements

And we are still working to speed it up a lot. The current weekly snapshot is an additional 20-25% faster compared to 0.1.0.

NOTE: rav1e is still resource-conscious, so it will not use all the threads and memory available. This makes it good if you want to encode multiple videos in parallel, but we will work on adding additional parallelism so that even the single-video scenario is covered better.

What’s next

Ideally, a 0.2.0 will appear in early December. It will contain many speed improvements, lots of bugfixes (in particular, docs.rs will serve the documentation), and possibly a large boost in single-pass quality if what Tim and Guillaume are working on lands in time.

For those who want to try it in Gentoo, a live package is available.

November 11, 2019
Craig Andrews a.k.a. candrews (homepage, bugs)
HTTP/3 Support Added to cURL in Gentoo (November 11, 2019, 17:03 UTC)

HTTP/3 may still be in the draft state but that isn’t stopping software from adding support for it. As a Gentoo developer, I decided to maintain Gentoo’s reputation for not being one to shy away from the bleeding edge by adding (optional) support for HTTP/3 to cURL. I believe that this makes Gentoo the first Linux distribution to ship support for this new protocol outside of the Firefox and Chrome/Chromium browsers.

cURL is a command line tool as well as a library (libcurl) that is used by a wide variety of software. It’s commonly used by applications written in PHP, it’s used by the Kodi media center, and it’s at least an optional dependency of everything from git to systemd to cmake and dovecot. By adding support for HTTP/3 to cURL, potentially everything that uses cURL will instantly start supporting HTTP/3 as well.

cURL added HTTP/3 support in version 7.66.0. Rather than writing the entirety of large, complex, and evolving HTTP/3 protocol implementation again (and having to maintain that forever), cURL instead leverages libraries. The two options it currently supports for this purpose are quiche and the combination of ngtcp2 and nghttp3.

Quiche is an HTTP/3 implementation first released by Cloudflare in January 2019. Since Cloudflare is using it to add support for HTTP/3 to its entire CDN (Content Distribution Network), they’re actively developing it keeping track of the latest changes being made in the HTTP/3 drafts. Quiche uses Google’s boringssl for cryptography which allows it to evolve faster, not having to wait for OpenSSL to implement features. It’s written in Rust which is great for security and maintainability. However, being written in Rust is also a problem as that means quiche is only available on platforms that Rust supports (amd64, arm64, ppc64, and x86) which is a much reduced subset of what cURL and the C language support (which is pretty much everything).

ngtcp2 (which implements IETF QUIC, the underlying HTTP/3 protocol) and nghttp3 (which implements the higher level HTTP/3 protocol) together form an HTTP/3 implementation. They are closely modeled on nghttp2 which is already used by cURL as well as the Apache web server (httpd). Therefore, they’re easier for existing software to use. They are written in C using standard build tools making them highly portable and able to run on essentially any architecture. ngtcp2 uses OpenSSL but the changes necessary for HTTP/3 support are not yet available in OpenSSL. This situation is also preventing HTTP/3 support from being available in other software that uses OpenSSL, including nodejs (see nodejs issue). Therefore, for the moment, in order to use ngtcp2, a patched version of OpenSSL must also be used. That isn’t a tenable solution for a Linux distribution such as Gentoo for a variety of reasons, including maintainability and security concerns involved with carrying a non-upstream version of such a critical package as OpenSSL. In the meantime, I’ve included the net-libs/ngtcp2 and net-libs/nghttp3 packages in Gentoo but masked them; that way, when OpenSSL is updated, the packages are ready and can simply be unmasked.

To enable HTTP/3 support in Gentoo, add the quiche use flag to the net-misc/curl package and re-emerge curl:

echo "net-misc/curl quiche" >> /etc/portage/package.use
emerge -1 net-misc/curl

After that, use the curl command’s new --http3 argument when making https requests. See the cURL documentation for more information.
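Since not every installed curl will have been built with the quiche USE flag, a script can probe for the feature before using it; the `HTTP3` feature flag shows up in `curl --version` on builds that support it. A hedged sketch (the `fetch` helper name is mine):

```shell
# Use HTTP/3 when the installed curl supports it, plain HTTPS otherwise.
# "HTTP3" appears in the Features line of `curl --version` on builds
# compiled with an HTTP/3 backend (curl 7.66.0 or later).
fetch() {
  if curl --version 2>/dev/null | grep -q HTTP3; then
    curl --http3 -sS "$1"
  else
    curl -sS "$1"
  fi
}
```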

November 06, 2019
Craig Andrews a.k.a. candrews (homepage, bugs)

Linters are static analysis tools that analyze source code and report problems. The term goes all the way back to Bell Labs in 1978 but the concept is still very important today. In my opinion, linters are a key ingredient of a successful DevSecOps implementation, and yet not enough people are aware of linters, how easy they are to use, and how important to quality and security they are.

Linters can be syntax checkers or more complex tools. “Lint” is more or less a term used for lightweight, simple static analysis. For example, yamllint checks the syntax of YAML files. At first, this tool may seem nice but not really necessary; YAML is pretty easy to read and understand. However, it’s simple to make a mistake in YAML and introduce a hard-to-discover problem. Take this .gitlab-ci.yml for instance:

include:
  - template: Code-Quality.gitlab-ci.yml
  - template: SAST.gitlab-ci.yml

variables:
  password: "swordfish"

build:
  stage: build
  image:
    name: python:buster
  script:
    - ./build.sh
  artifacts:
    paths:
      - target/beanstalk-app.zip
  password: "taco"

This file is valid and GitLab will process it. However, it’s not clear what it actually does – did you spot all the errors? In this case, an unexpected password is specified, among other issues. This error may introduce a security vulnerability. And this example is short and relatively easy to check manually – what if the YAML file were much longer?

For more complex languages than YAML, linting is even more important. With a more expressive language, errors are easier to introduce (through misunderstandings and typos) and harder to notice. Linters also make code more consistent, understandable, and maintainable. They not only improve security but also reduce cost and improve quality.

For a real world example, I’ve been doing a lot of CloudFormation work lately. It’s easy to accidentally create more open security groups and network access control lists than necessary, to forget to enable encryption, or make other such mistakes. cfn_nag and cfn-lint have caught many errors for me, as well as encouraged me to improve the quality by setting resource descriptions and being explicit about intentions.

Another example is with Java. By using PMD to check for logic, syntax, and convention violation errors, the code can be more likely to work as expected. By using Checkstyle, the code is all consistently formatted, follows the same naming conventions, has required comments, and other such benefits that make the code easy to understand and maintain. And easy to understand and maintain inherently means more secure.

Therefore, always add as many linters as possible and have them run as often as possible. Running linters in the git pre-commit hook is ideal (as then detected errors are never even committed). Running them from the build process (maven, msbuild, make, grunt, gulp, etc) is really important. But ultimately, running them in continuous integration is an absolute requirement. Running them daily or weekly is simply not enough.
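A pre-commit hook of this kind is only a few lines of shell. This is a sketch of the idea, assuming yamllint is installed; the hook wiring and the `lint_staged_yaml` helper name are mine:

```shell
# .git/hooks/pre-commit sketch: lint every staged YAML file and block
# the commit if any of them fail yamllint.
lint_staged_yaml() {
  status=0
  for f in $(git diff --cached --name-only --diff-filter=ACM | grep -E '\.ya?ml$'); do
    yamllint "$f" || status=1
  done
  return $status
}
# in the hook itself:  lint_staged_yaml || exit 1
```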

A common scenario I’ve seen is that static analysis is only done periodically (once per day or once per week) instead of for every commit (via a commit hook or continuous integration). For example, I’ve seen SonarQube set up to run daily for many projects. The problem with this approach is that errors are reported much later than they’re introduced making them lower priority to fix and harder to fix. If a daily SonarQube scan discovers a security vulnerability, management will triage the issue and perhaps put fixing it on the backlog, then eventually an engineer is tasked with fixing it but before they can do so they have to study and understand the relevant code. A superior approach leading to better efficiency and better security is to perform this scanning for every commit and fail the build if the scan fails – that way, the person who introduced the problem has to immediately fix it. This reduces exposure (as detected issues can never progress in the pipeline) and improves efficiency (as the same person who introduced the issue fixes it immediately without having to re-learn anything).

Here’s how a few linters are run on GitLab CI for the VersionPress on AWS project:

yamllint:
  stage: test
  image: sdesbure/yamllint
  script:
    - yamllint -s ./beanstalk/.ebextensions/*.config .

hadolint:
  stage: test
  image: hadolint/hadolint:latest-debian
  script:
    - hadolint beanstalk/Dockerfile

cfn-lint:
  stage: test
  image: aztek/cfn-lint
  script:
    - cfn-lint target/cloudformation.json

validate-template:
  stage: test
  image: python:latest
  script:
    - pip install awscli
    - aws cloudformation validate-template --template-body file://target/cloudformation.json

cfn_nag:
  stage: test
  image:
    name: stelligent/cfn_nag
    entrypoint: [""]
  script:
    - cfn_nag target/cloudformation.json

Note that docker is used to run the linters. This approach allows them to be quickly and easily run, and it’s much easier to maintain than having to manually install each one.
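For instance, hadolint can read a Dockerfile on stdin, so a throwaway container run needs no volume mount at all. A small wrapper sketch (the function name is mine):

```shell
# Run hadolint from its official image without installing it on the host;
# the Dockerfile is fed over stdin so no volume mount is needed.
docker_lint_dockerfile() {
  docker run --rm -i hadolint/hadolint < "$1"
}
```

Usage: `docker_lint_dockerfile beanstalk/Dockerfile`.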

And finally, here are my favorite linters that I use frequently:

  • shellcheck is a static analysis tool for shell scripts.
  • cfn-lint validates templates against the CloudFormation spec and additional checks. Includes checking valid values for resource properties and best practices.
  • cfn-nag tool looks for patterns in CloudFormation templates that may indicate insecure infrastructure.
  • yamllint does not only check for syntax validity, but for weirdnesses like key repetition and cosmetic problems such as lines length, trailing spaces, indentation, etc.
  • PMD is an extensible cross-language static code analyzer. Easy to use from Java via Maven with the Maven PMD Plugin.
  • Checkstyle is a development tool to help programmers write Java code that adheres to a coding standard. Easy to use from Java via Maven with the Maven Checkstyle Plugin.
  • PHP_CodeSniffer tokenizes PHP, JavaScript and CSS files and detects violations of a defined set of coding standards.
  • Hadolint is a Dockerfile linter which validates inline bash.
  • CSSLint is a tool to help point out problems with your CSS code.
  • ESLint is the pluggable linting utility for JavaScript and JSX (prefer this tool over JSLint)
  • JSLint is The Javascript Code Quality Tool.
  • pkgcheck and repoman check Gentoo ebuilds (packages).
  • GitLab offers SAST (which is a bit more than the usual lightweight linter)

Michał Górny a.k.a. mgorny (homepage, bugs)
Gentoo eclass design pitfalls (November 06, 2019, 07:57 UTC)

I have written my share of eclasses, and I have made my share of mistakes. Designing good eclasses is a non-trivial problem, and there are many pitfalls you should be watching for. In this post, I would like to highlight three of them.

Not all metadata variables are combined

PMS provides a convenient feature for eclass writers: cumulative handling of metadata variables. Quoting the relevant passage:

The IUSE, REQUIRED_USE, DEPEND, BDEPEND, RDEPEND and PDEPEND variables are handled specially when set by an eclass. They must be accumulated across eclasses, appending the value set by each eclass to the resulting value after the previous one is loaded. Then the eclass-defined value is appended to that defined by the ebuild. […]

Package Manager Specification (30th April 2018), 10.2 Eclass-defined Metadata Keys

That’s really handy! However, the important thing that’s not obvious from this description is that not all metadata variables work this way. The following multi-value variables don’t: HOMEPAGE, SRC_URI, LICENSE, KEYWORDS, PROPERTIES and RESTRICT. Admittedly, some of them are not supposed to be set in eclasses, but e.g. the last two are confusing.

This means that technically you need to append when defining them, e.g.:

# my.eclass
RESTRICT+=" !test? ( test )"

However, that’s not the biggest problem. The real issue is that those variables are normally set in ebuilds after inherit, so you actually need to make sure that all ebuilds append to them. For example, the ebuild needs to do:

# my-1.ebuild
inherit my
RESTRICT+=" bindist"

Therefore, this design is prone to mistakes at ebuild level. I’m going to discuss an alternative solution below.
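The pitfall is easy to demonstrate in plain shell (variable values borrowed from the snippets above; ebuilds are bash, where `RESTRICT+=" bindist"` is the usual append spelling):

```shell
# The eclass sets the non-accumulated variable first...
RESTRICT="!test? ( test )"
# ...then an ebuild that assigns instead of appending silently drops it:
RESTRICT="bindist"
echo "$RESTRICT"   # the eclass restriction is gone

# Appending (RESTRICT+=" bindist" in bash ebuilds) keeps both values:
RESTRICT="!test? ( test )"
RESTRICT="${RESTRICT} bindist"
echo "$RESTRICT"
```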

Declarative vs functional

It is common to use declarative style in eclasses — create a bunch of variables that ebuilds can use to control the eclass behavior. However, this style has two significant disadvantages.

Firstly, it is prone to typos. If someone recalls the variable name wrong, and its effects are not explicitly visible, it is very easy to commit an ebuild with a silly bug. If the effects are visible, it can still cause quite a debugging headache.

Secondly, in order to affect global scope, the variables need to be set before inherit. This is not trivially enforced, and it is easy to miss that the variable doesn’t work (or partially misbehaves) when set too late.

The alternative is to use functional style, especially for affecting global scope variables. Instead of immediately editing variables in global scope and expecting ebuilds to control the behavior via variables, give them a function to do it:

# my.eclass
my_enable_pytest() {
  IUSE+=" test"
  RESTRICT+=" !test? ( test )"
  BDEPEND+=" test? ( dev-python/pytest[${PYTHON_USEDEP}] )"
  python_test() {
    pytest -vv || die
  }
}

Note that this function is evaluated in ebuild context, so all variables need appending. Its main advantage is that it works independently of where in ebuild it’s called (but if you call it early, remember to append!), and in case of typo you get an explicit error. Example use in ebuild:

# my-1.ebuild
inherit my
RDEPEND="randomstuff? ( dev-libs/random )"
my_enable_pytest

Think what phases to export

Exporting phase functions is often a matter of convenience. However, doing it poorly can cause ebuild writers more pain than if they weren’t exported in the first place. An example of this is vala.eclass as of today. It wrongly exports dysfunctional src_prepare(), and all ebuilds have to redefine it anyway.

It is often a good idea to consider how your eclass is going to be used. If there are use cases both for having the phases exported and for providing utility functions without any phases, it is probably a good idea to split the eclass in two: into a -utils eclass that just provides the functions, and a main eclass that combines them with phase functions. Good examples today are the xdg and xdg-utils eclasses.

When you do need to export phases, it is worthwhile to consider how different eclasses are going to be combined. Generally, a few eclass types can be listed:

  • Unpacking (fetching) eclasses; e.g. git-r3 with src_unpack(),
  • Build system eclasses; e.g. cmake-utils, src_prepare() through src_install(),
  • Post-install eclasses; e.g. xdg, pkg_*inst(), pkg_*rm(),
  • Build environment setup eclasses; e.g. python-single-r1, pkg_setup().

Generally, it’s best to fit your eclass into as few of those as possible. If you do that, there’s a good chance that the ebuild author would be able to combine multiple eclasses easily:

# my-1.ebuild
PYTHON_COMPAT=( python3_7 )
inherit cmake-utils git-r3 python-single-r1

Note that since each of those eclasses uses a different phase function set to do its work, they combine just fine! The inherit order is also irrelevant. If we e.g. need to add llvm to the list, we just have to redefine pkg_setup().

October 30, 2019
Craig Andrews a.k.a. candrews (homepage, bugs)
The Importance of Upstreaming Issues (October 30, 2019, 00:40 UTC)

Any software builds upon other software – nothing truly starts from scratch. Even the most trivial “Hello World” demo program relies on a compiler, (most likely) a standard library, and then all of the low level system services, such as the operating system, drivers, and hardware. In any of those areas, it’s pretty much certain that there are bugs, and in all of those areas, there are guaranteed to be features that would be nice to have. So what does one do when a bug is discovered or a need for a new feature arises?

The Two Schools of Thought

There are two schools of thought when it comes to these bugs and feature requests in dependent software (and hardware):

  1. Work around the problem / implement the enhancement and move on (keeping the work to yourself)
  2. Report the bug or enhancement request to the dependent software’s maintainer/vendor (also known as “upstream”), then wait for them to do what is necessary, working with them as necessary

The “Easy” Way

Option 1 is the so-called easy way – at first, it seems easy. The basic idea is that if you encounter a problem, you work around it. Let’s say a developer is graphing some data, and finds that if certain types of data are graphed, the graph doesn’t look right (perhaps infinities aren’t handled in a way that makes sense). The obvious approach is, whenever an infinity is encountered, to substitute a value which makes the graph look acceptable (perhaps a really big but not infinite number). In a matter of an hour or two, the bug is solved.

This approach seems great – the problem was solved quickly. However, perhaps there are multiple graphs in the application – this workaround must now be made to all the areas of code that perform graphing. And when a new graph is introduced, we have to remember to add this workaround.

Worse yet, now consider that some time has passed, the project is done, and the developer has moved on. A new developer is now working on the project and adds a new graph – and is baffled by why infinities don’t work right. Time is lost as the new developer has to re-discover the original developer’s workaround.

And even worse, let’s say the original developer is now working on a completely new project that happens to need graphs. Having experience with this graphing library, he chooses it for the new project. And once again, he gets a bug report saying that infinities are not handled properly. And once again, he must re-discover and re-apply the workaround all over this new project.

The end result is that there is substantial time wasted at multiple levels with lots of unhappy people:

  • The developer loses time by applying the same fix in multiple places
  • The testers/QA lose time by continuously finding the same bug in multiple places
  • The project managers get annoyed as the schedule is less than predictable
  • The client is unhappy because a bug that was “fixed” keeps reappearing
  • On subsequent projects, all the same time loss and malaise occurs again and again

The Right Way

Option 2 initially seems like more work but ends up being a significantly lower effort for everyone. Continuing the previous scenario: instead of simply making a one-off fix, the developer immediately reports the bug to the charting library’s project. He spends a few minutes coming up with a test case, fills in the form on the charting library’s bug tracker, and then moves on to something else. There are two things that can happen at this point: either someone will reply to the bug with a fix, or not.

If someone replies with a fix, and that fix gets included in the charting library, then great! The developer never even needs to figure out the bug – he simply applies the fix to his copy of the charting library or upgrades the charting library. He immediately saved time and effort, and the problem is fixed for this particular chart, all other charts in the project, and even all other projects.

If no one replies, the developer eventually goes back and tries to fix it himself. He figures out the problem, and now posts a patch to the bug tracker with the fix. With luck, someone accepts the patch into the project, and as with the previous case, the bug is fixed for everyone. In this scenario, the developer may have actually spent more time on the bug than in the “Easy” Way scenario initially (as he had to collaborate with the charting library project, get the patch to be acceptable, etc) – but this bug won’t be coming back. And, because the experts from the charting library project reviewed the work, we can be sure the work is good, maintainable, and doesn’t introduce other problems.

In this scenario, the project manager is happy – the same bug isn’t being reported over and over. The developer is happy  – developers hate fixing the same thing multiple times. And the client never encounters that annoying scenario of having to question why the managers/developers are not fixing things they claimed to have fixed. Plus, other managers and developers out there are unknowingly happy, as they will never encounter this problem.

In the Real World

In the real world, projects tend to not think ahead. As deadlines loom, projects enter “fire fighting” mode – rather than take 5 minutes to do something right, the response becomes “just hack it and worry about it another day.” The same goes for documentation, testing, and good architecture. All of these hacks add up to horrible technical debt, leading to a scenario where a project is in constant fire fighting mode and can never catch up, constantly fixing bugs caused by earlier hacks, laying hacks on hacks. This spiral results in turnover of the managers and developers and very unhappy clients.

Let’s take an example from a project I once worked on. The project was an Android app that, among lots of other things, graphs some data. Rather than reinvent the wheel by creating my own graphing library, which would be, to say the least, a large project in and of itself, I chose to use achartengine. The library had a small community and was pretty new, so it had a decent number of bugs and could use some features.

The project deadline (as always) was very tight. But I was determined to do things “The Right Way” and have the team do the same. The result, for example, was this bug about how infinities are graphed incorrectly. There were also other issues, such as this one about zooming not working right, and bar charts not handling nulls properly. I reported bugs in and worked with other projects as well, such as Spring and Android. Not everything is fixed – but even when it’s not (such as this example from Android), the team learned something, and ultimately the project became better and more efficient. The code is clean, high quality, and easy to work with. In dollars-and-cents terms, it has a lower TCO (better maintainability) than if we had “done things quickly.” And, based on previous experience, if we had “done things quickly” (aka the “easy” way), the project would have actually taken longer.

When the Right Way Isn’t Possible

Doing it “The Right Way” is great when it’s possible. However, there are cases where it is actually impossible.

  • The problem is in hardware, and replacing it isn’t feasible. In that case, reporting the issue is still important as the hardware’s creator could help design the best workaround, but the end result will be a workaround and not a new, fixed version.
  • The bug is in proprietary software that doesn’t accept bug reports (Microsoft, for example… they oftentimes charge fees to report bugs in their products, then take years to provide a fix, if they ever do at all)
  • The problem is in software that cannot be replaced/upgraded/modified for any number of reasons (legal, compliance, etc.). Again, reporting the issue is still important, as the author could potentially help in formulating a workaround.

In that case, well… good luck 🙂 The best that can be done in such cases is to have plenty of comments in code and documentation (especially in the relevant bugs in the project’s bug tracker).

AWS Secrets Manager is a simple and powerful way to handle secrets (such as database username/password credentials). It provides support for storing, retrieving, managing, and rotating credentials at an affordable cost (currently $0.40 per secret per month). However, it’s not terribly easy to use with WordPress. I have not been able to find any documentation or samples for how to set up WordPress to use AWS Secrets Manager for its database access credentials, so I figured out how to do that and I’m sharing my findings here.

In the process, I encountered a few challenges:

To get around these challenges, I decided to not use the AWS SDK for PHP and instead have PHP use exec to call the AWS CLI. For caching, the secret is written to a file in the system temp directory. When accessing the database, the credentials are read from the file. If login fails, the credentials are retrieved from AWS Secrets Manager again and the file is updated, then the connection is retried. This approach is similar to that used by the AWS Secrets Manager JDBC Library except instead of a file, it stores the information in memory. And finally, to get around WordPress’s lack of database access hooks, the wpdb global is overridden by placing a drop-in db.php file in the wp-content directory.

WP-CLI is another challenge. It always uses the DB_USER and DB_PASSWORD constants defined in wp-config.php; it will not use the wp-content/db.php drop-in. Therefore, a WP-CLI specific file is necessary, wp-cli-secrets-manager.php. Unlike the WordPress db.php approach, however, this WP-CLI implementation always gets the secret and cannot cache it. The reason is that there is no WP-CLI hook or method that is consistently used to get the database connection or credentials, so there is no way to try potentially expired credentials and then refresh them only if they do not work.

The Implementation

You must set two environment variables: AWS_DEFAULT_REGION (must contain the region of the secret, for example, us-east-1) and WORDPRESS_DB_SECRET_ID (contains either the name or ARN of the secret). The secret must have a SecretString containing a JSON object with username and password properties. Then place the following file in wp-content/db.php:

<?php
if ( empty ( getenv('WORDPRESS_DB_SECRET_ID') ) ) {
  // if the secret id isn't set, don't install this approach
  return;
}

/**
 * Plugin Name: Extended wpdb to use AWS Secrets Manager credentials
 * Description: Get the database username/password from an AWS Secrets Manager secret
 * Version: 1.0
 * Author: Craig Andrews
 */
class wpdb_aws_secrets_manager_extended extends wpdb {

  /**
   * Path to the cache file
   * @var string
   */
  private $secretCacheFile;

  public function __construct() {
    $this->dbname     = defined( 'DB_NAME' ) ? DB_NAME : '';
    $this->dbhost     = defined( 'DB_HOST' ) ? DB_HOST : '';
    $this->secretCacheFile = sys_get_temp_dir() . DIRECTORY_SEPARATOR . md5(getenv('WORDPRESS_DB_SECRET_ID'));
    $this->_load_credentials();
    parent::__construct( $this->dbuser, $this->dbpassword, $this->dbname, $this->dbhost );
  }

  public function db_connect( $allow_bail = true ) {
    $ret = parent::db_connect( false );
    if ( ! $ret ) {
      // connection failed, refresh the credentials and retry
      $this->_refresh_credentials();
      $ret = parent::db_connect( $allow_bail );
    }
    return $ret;
  }

  /**
   * Load the credentials from cached storage
   * If no credentials are cached, refresh credentials
   */
  private function _load_credentials() {
    if ( file_exists ( $this->secretCacheFile ) ) {
      $data = json_decode ( file_get_contents ( $this->secretCacheFile ) );
      $this->dbuser = $data->username;
      $this->dbpassword = $data->password;
    } else {
      $this->_refresh_credentials();
    }
  }

  /**
   * Refresh the credentials from Secrets Manager
   * and write to cached storage
   */
  private function _refresh_credentials() {
    exec('aws secretsmanager get-secret-value --secret-id ' . escapeshellarg(getenv('WORDPRESS_DB_SECRET_ID')) . ' --query SecretString --output text > ' . escapeshellarg($this->secretCacheFile), $retArr, $status);
    chmod($this->secretCacheFile, 0600); // Read and write for owner, nothing for everybody else
    if ( $status != 0 ) {
      $this->bail("Could not refresh the AWS Secrets Manager secret");
    }
    $this->_load_credentials();
  }
}

global $wpdb;
$wpdb = new wpdb_aws_secrets_manager_extended();

For WP-CLI, create this file:

<?php
if ( empty ( getenv('WORDPRESS_DB_SECRET_ID') ) ) {
  // if the secret id isn't set, don't install this approach
  return;
}

class aws_secrets_manager_utility {

  /**
   * Database user name
   * @var string
   */
  public $dbuser;

  /**
   * Database password
   * @var string
   */
  public $dbpassword;

  public function __construct() {
    exec('aws secretsmanager get-secret-value --secret-id ' . escapeshellarg(getenv('WORDPRESS_DB_SECRET_ID')) . ' --query SecretString --output text', $retArr, $status);
    if ( $status != 0 ) {
      die("Could not retrieve the AWS Secrets Manager secret");
    } else {
      $data = json_decode ( $retArr[0] );
      $this->dbuser = $data->username;
      $this->dbpassword = $data->password;
    }
  }
}

$aws_secrets_manager_utility = new aws_secrets_manager_utility();
# These constants have to be defined here before WP-CLI loads wp-config.php
define('DB_USER', $aws_secrets_manager_utility->dbuser);
define('DB_PASSWORD', $aws_secrets_manager_utility->dbpassword);

Then invoke wp-cli using the --require=wp-cli-secrets-manager.php argument.

In Action as Part of VersionPress On AWS

I’ve implemented full support for AWS Secrets Manager for WordPress’s database credentials (including automatic password rotation) in VersionPress on AWS. If you’d like to see a complete, real world example of how to configure AWS Secrets Manager including the rotation lambda, VPC support, security groups configuration, and more, please take a look at this commit in VersionPress on AWS.

AWS Secrets Manager Rotation in CloudFormation (October 30, 2019, 00:32 UTC)

I found AWS's documentation for how to set up Secrets Manager secret rotation in CloudFormation to be severely lacking: no AWS documentation explains how to use the secret rotation templates provided by AWS within CloudFormation. Automating Secret Creation in AWS CloudFormation gives an example of how to set up the CloudFormation resources for the secret and its rotation, but it's an incomplete example; it tells you where to place your Lambda's S3 information in the template, but what if you want to use one of AWS's provided rotation functions? AWS Templates You Can Use to Create Lambda Rotation Functions gives you a list of RDS secret rotation "templates" and information about them, but it does not give any information or examples for how to actually use them. After a bit of work, I figured out how to combine the information from those two documents to create a complete CloudFormation based RDS secret rotation example.

The AWS templates are Serverless Applications which can be loaded in CloudFormation using AWS::Serverless::Application. To use them, start with the CloudFormation example given in Automating Secret Creation in AWS CloudFormation. Add the top level directive Transform: AWS::Serverless-2016-10-31 to allow processing of Serverless resources. Remove the MyRotationLambda resource. Add a new resource of type AWS::Serverless::Application referencing one of the rotation functions listed in AWS Templates You Can Use to Create Lambda Rotation Functions. Then change LambdaInvokePermission to reference the Serverless Application. Here's a complete example:

Description: "This is an example template to demonstrate CloudFormation resources for Secrets Manager"
Transform: AWS::Serverless-2016-10-31

Resources:

  #This is a Secret resource with a randomly generated password in its SecretString JSON.
  MyRDSInstanceRotationSecret:
    Type: AWS::SecretsManager::Secret
    Properties:
      Description: 'This is my rds instance secret'
      GenerateSecretString:
        SecretStringTemplate: '{"username": "admin"}'
        GenerateStringKey: 'password'
        PasswordLength: 16
        ExcludeCharacters: '"@/\'
      Tags:
        - Key: AppName
          Value: MyApp

  #This is a RDS instance resource. Its master username and password use dynamic references to resolve values from
  #SecretsManager. The dynamic reference guarantees that CloudFormation will not log or persist the resolved value
  #We use a ref to the Secret resource logical id in order to construct the dynamic reference, since the Secret name is being
  #generated by CloudFormation
  MyDBInstance:
    Type: AWS::RDS::DBInstance
    Properties:
      AllocatedStorage: 20
      DBInstanceClass: db.t2.micro
      Engine: mysql
      MasterUsername: !Join ['', ['{{resolve:secretsmanager:', !Ref MyRDSInstanceRotationSecret, ':SecretString:username}}' ]]
      MasterUserPassword: !Join ['', ['{{resolve:secretsmanager:', !Ref MyRDSInstanceRotationSecret, ':SecretString:password}}' ]]
      BackupRetentionPeriod: 0
      DBInstanceIdentifier: 'rotation-instance'

  #This is a SecretTargetAttachment resource which updates the referenced Secret resource with properties about
  #the referenced RDS instance
  SecretRDSInstanceAttachment:
    Type: AWS::SecretsManager::SecretTargetAttachment
    Properties:
      SecretId: !Ref MyRDSInstanceRotationSecret
      TargetId: !Ref MyDBInstance
      TargetType: AWS::RDS::DBInstance

  #This is a RotationSchedule resource. It configures rotation of password for the referenced secret using a rotation lambda
  #The first rotation happens at resource creation time, with subsequent rotations scheduled according to the rotation rules
  #We explicitly depend on the SecretTargetAttachment resource being created to ensure that the secret contains all the
  #information necessary for rotation to succeed
  MySecretRotationSchedule:
    Type: AWS::SecretsManager::RotationSchedule
    DependsOn: SecretRDSInstanceAttachment
    Properties:
      SecretId: !Ref MyRDSInstanceRotationSecret
      RotationLambdaARN: !GetAtt MyRotationServerlessApplication.Outputs.RotationLambdaARN
      RotationRules:
        AutomaticallyAfterDays: 30

  #This is ResourcePolicy resource which can be used to attach a resource policy to the referenced secret.
  #The resource policy in this example denies the DeleteSecret action to all principals within the current account
  MySecretResourcePolicy:
    Type: AWS::SecretsManager::ResourcePolicy
    Properties:
      SecretId: !Ref MyRDSInstanceRotationSecret
      ResourcePolicy: !Sub '{
                         "Version" : "2012-10-17",
                         "Statement" : [ {
                             "Effect": "Deny",
                             "Principal": {"AWS":"arn:aws:iam::${AWS::AccountId}:root"},
                             "Action": "secretsmanager:DeleteSecret",
                             "Resource": "*"
                         } ]
                       }'

  #This is a Serverless Application resource. We will use its lambda to rotate secrets
  #For details about rotation lambdas, see https://docs.aws.amazon.com/secretsmanager/latest/userguide/rotating-secrets.html
  MyRotationServerlessApplication:
    Type: AWS::Serverless::Application
    Properties:
      Location:
        ApplicationId: arn:aws:serverlessrepo:us-east-1:297356227824:applications/SecretsManagerRDSMySQLRotationSingleUser
        SemanticVersion: 1.0.117
      Parameters:
        endpoint: !Sub 'https://secretsmanager.${AWS::Region}.${AWS::URLSuffix}'
        functionName: 'cfn-rotation-lambda'

  #This is a lambda Permission resource which grants Secrets Manager permission to invoke the rotation lambda function
  LambdaInvokePermission:
    Type: AWS::Lambda::Permission
    Properties:
      FunctionName: !GetAtt MyRotationServerlessApplication.Outputs.RotationLambdaARN
      Action: 'lambda:InvokeFunction'
      Principal: secretsmanager.amazonaws.com
This example uses SecretsManagerRDSMySQLRotationSingleUser which is the single user template for MySQL, however there are AWS provided templates for MariaDB, Oracle, and PostgreSQL as well. To find them, search the AWS Serverless Application Repository (the rotation templates all start with “SecretsManagerRDS”). Once you’ve found the desired template, click the “Deploy” button then select “Copy as SAM Resource” – the clipboard will now contain the CloudFormation YAML for the AWS::Serverless::Application resource.

October 29, 2019
Craig Andrews a.k.a. candrews (homepage, bugs)
End to End Encryption with Beanstalk (October 29, 2019, 18:52 UTC)

Beanstalk is often configured to terminate SSL at the load balancer then make the connection to the web server/application instances using unencrypted HTTP. That’s usually okay as the AWS network is designed to keep such traffic private, but under certain conditions, such as those requiring PCI compliance, DoD/government rules, or simply out of an abundance of caution, there’s a desire to have all traffic encrypted – including that between the Beanstalk load balancer and servers.

There are two approaches for implementing end to end encryption on Beanstalk:

  • Use a layer 4 load balancer (Network or Classic Elastic Load Balancer).
    Using this approach, the load balancer never decrypts the traffic. The downside is that advanced reporting isn’t possible and layer 7 features, such as session affinity, cannot be implemented.
  • Use a layer 7 load balancer (Application or Classic Load Balancer).
    Using this approach, traffic is decrypted at the load balancer. The load balancer would then re-encrypt traffic to the servers. Session affinity and traffic reporting are available.

The preferred solution is to use the layer 7 approach with an Application Load Balancer. This preference is due to the additional features a layer 7 load balancer offers, because Network Load Balancers are more expensive, and because AWS is deprecating Classic Load Balancers.

The simplest way to accomplish this goal is to use a self-signed certificate on the servers and then use HTTPS from the load balancer to the server. Application Load Balancers do not currently perform validation of certificates, which is why the self-signed approach works and why there's no advantage to using a CA-issued certificate.

The following approach will work on any Beanstalk supported platform that uses nginx as the proxy server. This configuration is based on AWS’s documentation, but trimmed for only Application Load Balancers and to include the nginx configuration and self-signed certificate generation.

In your Beanstalk application archive, add these files:


# HTTPS server

server {
    listen       443;
    server_name  localhost;
    ssl                  on;
    # The certificate is generated in generate-certificate.sh
    ssl_certificate      /etc/pki/tls/certs/server.crt;
    ssl_certificate_key  /etc/pki/tls/certs/server.key;
    ssl_session_timeout  5m;
    ssl_protocols  TLSv1.2 TLSv1.3;
    ssl_prefer_server_ciphers   on;
    location / {
        proxy_pass  http://localhost:5000;
        proxy_set_header   Connection "";
        proxy_http_version 1.1;
        proxy_set_header        Host            $host;
        proxy_set_header        X-Real-IP       $remote_addr;
        proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header        X-Forwarded-Proto https;
    }
}

option_settings:
  aws:elasticbeanstalk:application:
    Application Healthcheck URL: HTTPS:443/
  aws:elasticbeanstalk:environment:process:https:
    Port: '443'
    Protocol: HTTPS
  aws:elbv2:listener:443:
    DefaultProcess: https
    ListenerEnabled: 'true'
    Protocol: HTTPS
  aws:elasticbeanstalk:environment:process:default:
    Port: '443'
    Protocol: HTTPS
packages:
  yum:
    bash: []
    openssl: []
files:
  "/tmp/generate_nginx_certificate.sh":
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/usr/bin/env bash
      set -Eeuo pipefail # stop on all errors
      # These files are used by nginx, see nginx/conf.d/https.conf
      openssl genrsa 2048 > /etc/pki/tls/certs/server.key
      openssl req -new -x509 -nodes -sha1 -days 3650 -extensions v3_ca -key /etc/pki/tls/certs/server.key -subj "/CN=localhost" > /etc/pki/tls/certs/server.crt
container_commands:
  01generate_nginx_certificate:
    command: "/tmp/generate_nginx_certificate.sh"

To connect to AWS RDS databases using TLS/SSL, the client must trust the certificate provided by RDS; RDS doesn’t use certificates trusted by the CAs (Certificate Authorities) included by operating systems.

Without TLS/SSL, the connection to the database isn’t secure, meaning an attacker on the network between the client (running in EC2) and the database (running RDS) could eavesdrop or modify data.

To trust the AWS RDS certificate authority, on Docker, for a Red Hat / CentOS / Fedora / Amazon Linux (or other Fedora-type system) derived container, add the following to the Dockerfile:

RUN set -e \
 && curl "https://s3-us-gov-west-1.amazonaws.com/rds-downloads/rds-combined-ca-us-gov-bundle.pem" --output /etc/pki/ca-trust/source/anchors/rds-combined-ca-us-gov-bundle.pem \
 && curl "https://s3.amazonaws.com/rds-downloads/rds-combined-ca-bundle.pem" --output /etc/pki/ca-trust/source/anchors/rds-combined-ca-bundle.pem \
 && update-ca-trust \
 && update-ca-trust force-enable

On AWS Elastic Beanstalk the .ebextensions mechanism can be used. In the jar/war/etc deployment archive, add this file:


packages:
  yum:
    bash: []
    curl: []
files:
  "/tmp/install_rds_certificates.sh":
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/usr/bin/env bash
      set -Eeuo pipefail # stop on all errors
      AVAILABILITY_ZONE=$(curl -s http://169.254.169.254/latest/dynamic/instance-identity/document | grep region | cut -d\" -f4)
      if [[ ${AVAILABILITY_ZONE} == us-gov-* ]]; then
        curl "https://s3-us-gov-west-1.amazonaws.com/rds-downloads/rds-combined-ca-us-gov-bundle.pem" --output /etc/pki/ca-trust/source/anchors/rds-combined-ca-us-gov-bundle.pem
      else
        curl "https://s3.amazonaws.com/rds-downloads/rds-combined-ca-bundle.pem" --output /etc/pki/ca-trust/source/anchors/rds-combined-ca-bundle.pem
      fi
      update-ca-trust force-enable
container_commands:
  01install_rds_certificates:
    command: "/tmp/install_rds_certificates.sh"

Next, modify the client to require a secure connection. For example, with the PostgreSQL JDBC client, add “?ssl=true” to the connection string url.
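For example, a PostgreSQL JDBC connection string (the hostname and database name are illustrative) would become:

```
jdbc:postgresql://mydb.example-id.us-east-1.rds.amazonaws.com:5432/mydb?ssl=true
```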

That's it – you can now connect to your RDS database using SSL/TLS with the assurance that no MITM (Man In The Middle) attacks, eavesdropping attacks, etc are possible.

AWS recently enhanced its Systems Manager offering with shell access to EC2 instances and then they enhanced it further with SSH tunnel support. With these improvements, it's now possible to improve your application's security posture while reducing its operational costs and simplifying setup/maintenance.

Systems Manager vs Bastion Hosts

Minimizing the attack surface, simplifying as much as possible, not sharing credentials, and having audit trails are key facets of information security. The classic approach to providing access for authorized personnel (such as system administrators, database administrators, sometimes developers) to AWS resources is by setting up a bastion host (sometimes called a “jump box”) which acts as a gateway to other AWS resources. This approach has significant disadvantages:

  • The bastion host represents an additional attack surface. The surface can be minimized by using networking restrictions (such as NACLs and security group rules to limit access to a trusted IP address range), but no matter what, the surface is still larger than just the application.
  • The bastion host is additional complexity beyond just the application. It has to be set up, secured, audited, updated, and maintained. There’s a non-zero risk that in this work, a mistake can be made opening a vulnerability.
  • Authorized users have to have credentials to access the bastion host. Oftentimes, because AWS allows only a single SSH secret key to be associated with an EC2 instance, multiple users will each get a copy of and use the same SSH key. This credential sharing is far from best practice, eliminating non-repudiation: now it’s impossible to tell from the logs who did what. Alternatively, either manually or through automation such as Ansible, keys can be created and managed for each user. However, that contributes greatly to cost and complexity.
  • Audit trails are not automatically generated using the basic configuration of bastion hosts. To generate audit trails, CloudWatch Logs has to be configured and the bastion host has to be configured to send logs (such as /var/log/secure) to CloudWatch Logs. Even then, the individual commands run are not logged; additional work (again using Ansible or some other automation system) has to be done to get that level of detail.
  • The bastion host also has a cost associated with it as it is a running EC2 instance. Even a t2.micro costs about $10/month. If a larger instance is used to support more users, many bastions are used across many applications, or multiple availability zone redundancy is needed, the cost of bastion hosts can climb quickly.

AWS’s new SSM features solve all of these problems.

  • SSM is part of AWS, so there is nothing exposed. It has no additional attack surface beyond the existing use of AWS.
  • In terms of complexity, SSM is easier to enable than setting up a bastion host. There are no SSH keys to manage and no additional credentials to create (it uses AWS IAM credentials that authorized users would already have). In summary, additional IAM roles must be added to the instance profiles of the EC2 instances that users will access via SSM, the SSM service has to be installed on the EC2 instances (the most common Amazon Linux AMIs have it pre-installed), and standard IAM rules should be used to grant SSM access to users who should have it.
  • In terms of credentials, users use their existing AWS IAM credentials. That means any IAM polices apply, including Single Sign On (SSO), password rotation, multi-factor authentication (MFA), etc. Non-repudiation is in effect.
  • Because it is part of AWS, AWS SSM logs to AWS CloudTrail.

Setting up Systems Manager

So how do you get rid of that bastion in favor of using SSM? I recently introduced SSM support into VersionPress On AWS (which I'll use as an example here) as well as client projects. The process is to configure the EC2 instance to be able to communicate with SSM; the following details how to do that for standalone EC2 instances as well as ones managed by Elastic Beanstalk.

Enabling EC2 Instances to use SSM

For EC2 instances, create a new Instance Profile pointing to a new IAM role that includes the AmazonSSMManagedInstanceCore policy. This allows the EC2 instance to communicate with SSM. In CloudFormation json:

"EC2IamRole": {
	"Type" : "AWS::IAM::Role",
	"Properties" : {
		"AssumeRolePolicyDocument": {
			"Version" : "2012-10-17",
			"Statement": [ {
				"Effect": "Allow",
				"Principal": {
					"Service": [ "ec2.amazonaws.com" ]
				},
				"Action": [ "sts:AssumeRole" ]
			} ]
		},
		"ManagedPolicyArns" : [
			"arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore",
			"arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy"
		]
	}
},
"EC2IamInstanceProfile": {
	"Type" : "AWS::IAM::InstanceProfile",
	"Properties" : {
		"Roles" : [
			{ "Ref": "EC2IamRole" }
		]
	}
}
CloudWatchAgentServerPolicy is not strictly required, but it’s generally a good idea to allow EC2s to log to CloudWatch. Add any other roles as desired, of course.

For the EC2, in its properties, set IamInstanceProfile to EC2IamInstanceProfile:

"EC2Host": {
	"Type": "AWS::EC2::Instance",
	"Properties": {
		"IamInstanceProfile": {
			"Ref": "EC2IamInstanceProfile"
		}
	}
}

If you're using an Amazon Linux AMI dated 2017.09 or later or an Amazon Linux 2 AMI, then you're done unless you want to make sure the latest version of the SSM agent gets installed. Otherwise, you need to install the SSM agent. Here's how to do that using CloudFormation with EC2 user data, assuming an Amazon Linux 1 or 2 AMI is used:

"EC2Host": {
	"Type": "AWS::EC2::Instance",
	"Properties": {
		"UserData": {
			"Fn::Base64": {
				"Fn::Join": [
					"\n",
					[
						"#cloud-config",
						"runcmd:",
						{ "Fn::Sub" : "  - yum -y localinstall https://s3.${AWS::Region}.amazonaws.com/amazon-ssm-${AWS::Region}/latest/linux_amd64/amazon-ssm-agent.rpm" }
					]
				]
			}
		}
	}
}

Further instructions, including those for other operating systems, are covered in the AWS documentation.

That’s it for EC2.

Enabling Beanstalk Instances to use SSM

For Elastic Beanstalk, the process is very similar to standalone EC2 instances. Create a new Instance Profile pointing to a new IAM role that includes the AmazonSSMManagedInstanceCore policy. In CloudFormation json:

"BeanstalkInstanceIamRole": {
	"Type" : "AWS::IAM::Role",
	"Properties" : {
		"AssumeRolePolicyDocument": {
			"Version" : "2012-10-17",
			"Statement": [ {
				"Effect": "Allow",
				"Principal": {
					"Service": [ "ec2.amazonaws.com" ]
				},
				"Action": [ "sts:AssumeRole" ]
			} ]
		},
		"ManagedPolicyArns" : [
			"arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
		]
	}
},
"BeanstalkInstanceIamInstanceProfile": {
	"Type" : "AWS::IAM::InstanceProfile",
	"Properties" : {
		"Roles" : [
			{ "Ref": "BeanstalkInstanceIamRole" }
		]
	}
}

For the Beanstalk Environment, in its properties, in the namespace aws:autoscaling:launchconfiguration set IamInstanceProfile to BeanstalkInstanceIamInstanceProfile:

"BeanstalkEnvironment": {
	"Type": "AWS::ElasticBeanstalk::Environment",
	"Properties": {
		"OptionSettings": [
			{
				"Namespace": "aws:autoscaling:launchconfiguration",
				"OptionName": "IamInstanceProfile",
				"Value": {
					"Ref": "BeanstalkInstanceIamInstanceProfile"
				}
			}
		]
	}
}

If you're using an Amazon Linux AMI dated 2017.09 or later or an Amazon Linux 2 AMI, then you're done. Otherwise, you need to install the SSM agent. Create an ebextension (named, for example, ssm-agent.config) like this to do so:

commands:
  install_ssm_agent:
    command: 'yum -y localinstall https://s3.`{ "Ref" : "AWS::Region"}`.amazonaws.com/amazon-ssm-`{ "Ref" : "AWS::Region"}`/latest/linux_amd64/amazon-ssm-agent.rpm'

That’s it – the Beanstalk managed EC2 instances should now be visible to SSM.

Using the SSM Console to Connect to Instances

The SSM Console allows you to connect to EC2 instances from the browser – there is no client software (other than the browser) involved. This eliminates the need to worry about SSH clients and firewalls.

To start a session, from the SSM console, go to Session Manager.

AWS Systems Manager > Start a session

Select the desired instance and click “Start Session”. The result is a shell in your browser.
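The same session can be started from a terminal using the AWS CLI (this assumes the Session Manager plugin for the AWS CLI is installed; the instance id is illustrative):

```shell
aws ssm start-session --target i-0123456789abcdef0
```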

AWS Systems Manager session started

Use an SSH Client to Connect to Instances Including File Transfers and Port Forwarding

The browser based access is nice, but it doesn’t allow for file transfers or port forwarding. To be able to do that, SSM supports SSH connections. To set that up:
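A minimal sketch of that setup, following AWS's documented ssh_config ProxyCommand approach (it assumes the AWS CLI and the Session Manager plugin are installed on your machine):

```shell
# Route ssh connections to EC2/managed instance ids through SSM instead of TCP port 22.
mkdir -p "${HOME}/.ssh"
cat >> "${HOME}/.ssh/config" <<'EOF'
host i-* mi-*
    ProxyCommand sh -c "aws ssm start-session --target %h --document-name AWS-StartSSHSession --parameters 'portNumber=%p'"
EOF
```

The EC2 instance must still have your SSH public key in its authorized_keys; SSM only replaces the network transport.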

With that setup in place, ssh to any SSH registered EC2 instance by running:

ssh ec2-user@i-012345679

Port forwarding also works. For example, I set up a tunnel to my RDS instance so I can use SQuirreL SQL to query my RDS database like this:

ssh ec2-user@i-012345679 -A -L5432:production-env-db.c5nl5xt26oh8.us-east-1.rds.amazonaws.com:5432

Finally, scp and sftp work for copying files both to and from EC2 instances:

scp myfile ec2-user@i-012345679:


Bastion hosts were a good solution, but now there is a better solution.

And it's not just AWS that's offering better alternatives to bastion hosts; Microsoft offers Azure Bastion which closely mirrors AWS's Systems Manager Session Manager.

Dynamic references in CloudFormation to secure strings are very handy, providing a simple way to keep secrets (such as passwords) secure. However, SSM Secure String Parameters are only supported in a limited set of places and Elastic Beanstalk environment variables are not one of them (feature request for adding support). Therefore, if you want to use SSM Secure Strings with a Beanstalk application some extra work is necessary.

There are a few ways to access SSM Secure Strings from a Beanstalk application. One approach is to modify the application to get the secure string itself. But that can be quite a bit of extra work, or even impossible if the application can’t be modified. Another approach is to use an ebextension to pre-process the environment variable values provided in CloudFormation to resolve SSM secure string dynamic references. That way, the application continues to use environment variables removing the need to modify it.

To implement this approach of adding support for SSM Secure String dynamic references in Beanstalk environment variables values, first add this ebextension to your Elastic Beanstalk deployment archive:

    bash: []
    curl: []
    jq: []
    perl: []
    mode: "000700"
    owner: root
    group: root
    content: |
      #!/usr/bin/env bash

    mode: "000700"
    owner: root
    group: root
    content: |
      #!/usr/bin/env bash

    mode: "000700"
    owner: root
    group: root
    content: |
      #!/usr/bin/env bash

    mode: "000700"
    owner: root
    group: root
    content: |
      #!/usr/bin/env bash
      set -Eeuo pipefail

      # Resolve SSM parameter references in the elasticbeanstalk option_settings environment variables.
      # SSM parameter references must take the same form used in CloudFormation, see https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/dynamic-references.html#dynamic-references-ssm-secure-strings
      # supported forms are:
      # {{resolve:ssm-secure-env:path:version}}
      # {{resolve:ssm-secure-env:path}}
      # {{resolve:ssm-env:path:version}}
      # {{resolve:ssm-env:path}}
      # where "path" is the SSM parameter path and "version" is the parameter version.

      if [[ -z "${AWS_DEFAULT_REGION:-}" ]]; then
        # not set so get from configuration
        AWS_DEFAULT_REGION="$(aws configure get region)" || :
      fi
      if [[ -z "${AWS_DEFAULT_REGION:-}" ]]; then
        # not set so get from instance metadata
        AWS_DEFAULT_REGION="$(curl -s http://169.254.169.254/latest/dynamic/instance-identity/document | jq -r .region)" || :
      fi
      if [[ -z "${AWS_DEFAULT_REGION:-}" ]]; then
        echo "Could not determine region." 1>&2
        exit 1
      fi
      export AWS_DEFAULT_REGION

      readonly CONTAINER_CONFIG_FILE="${1:-/opt/elasticbeanstalk/deploy/configuration/containerconfiguration}"
      readonly TEMP_CONTAINER_CONFIG_FILE="$(mktemp)"

      i=0
      for envvar in $(jq -r ".optionsettings[\"aws:elasticbeanstalk:application:environment\"][]" "${CONTAINER_CONFIG_FILE}"); do
        envvar="$(echo "${envvar}" | perl -p \
          -e 's|{{resolve:ssm(?:-secure)?-env:([a-zA-Z0-9_./-]+?):(\d+?)}}|qx(aws ssm get-parameter-history --name "$1" --with-decryption --query Parameters[?Version==\\\x60$2\\\x60].Value --output text) or die("Failed to get SSM parameter named \"$1\" with version \"$2\"")|eg;' \
          -e 's|{{resolve:ssm(?:-secure)?-env:([a-zA-Z0-9_./-]+?)}}|qx(aws ssm get-parameter --name "$1" --with-decryption --query Parameter.Value --output text) or die("Failed to get SSM parameter named \"$1\"")|eg;')"
        export envvar
        jq ".optionsettings[\"aws:elasticbeanstalk:application:environment\"][${i}]=env.envvar" < "${CONTAINER_CONFIG_FILE}" > "${TEMP_CONTAINER_CONFIG_FILE}"
        cp "${TEMP_CONTAINER_CONFIG_FILE}" "${CONTAINER_CONFIG_FILE}"
        ((i++)) || :
      done

Then use dynamic references in your Beanstalk environment variables in one of these forms:

  • {{resolve:ssm-secure-env:path:version}}
  • {{resolve:ssm-secure-env:path}}
  • {{resolve:ssm-env:path:version}}
  • {{resolve:ssm-env:path}}

For example, a fragment of the CloudFormation (in yaml) might look like this:

AWSTemplateFormatVersion: '2010-09-09'
Resources:
  BeanstalkEnvironment:
    Type: AWS::ElasticBeanstalk::Environment
    Properties:
      OptionSettings:
        - Namespace: "aws:elasticbeanstalk:application:environment"
          OptionName: "MY_SECRET" # hypothetical environment variable name
          Value: !Sub "{{resolve:ssm-secure-env:/my/parameter:42}}"

Spring Session JDBC is a great way to allow an application to be stateless. By storing the session in the database, a request can be routed to any application server. This approach provides significant advantages such as automatic horizontal scaling, seamless failover, and no need for session affinity. By using JDBC, the database the application is already using provides the storage avoiding the need to setup and maintain other software, such as Memcache or Redis.

When Spring Session JDBC stores the session in the database, it has to serialize the session (convert the Java object to a stream of bytes) and also deserialize it (convert the bytes back into a Java object). By default, it uses Java's built in serialization.

There are numerous reasons not to use Java’s built in serialization (ObjectInputSteam / ObjectOutputStream). Oracle calls it a “Horrible Mistake” and plans to remove it in a future Java release. It’s also less performant and produces a larger serialized form than many alternatives.

Since Java serialization is (at least, for now) included in Java, it’s still commonly used, including by Spring Session JDBC. Switching to another serialization method can be a relatively quick and easy way to improve performance.

Any serialization can be used, including Jackson (which uses the JSON or XML format), Protocol Buffers, Avro, and more. However, all require work to define schemas for the data and additional configuration. In the interest of avoiding those efforts (which is especially important for legacy applications), a schemaless serializer (which is what Java’s built in serializer is) can be used such as FST (fast-serializer) or Kryo.

Switching the serializer used by Spring Session JDBC is done by defining a bean named springSessionConversionService of type ConversionService. The following examples provide the code to use FST or Kryo.

Using FST with Spring Session JDBC

Add FST as a dependency to the project. For example, using Maven:
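A minimal pom.xml fragment would look like this (the version shown is illustrative; check for the latest release):

```xml
<dependency>
	<groupId>de.ruedigermoeller</groupId>
	<artifactId>fst</artifactId>
	<version>2.57</version>
</dependency>
```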


And add these classes:


import org.springframework.beans.factory.BeanClassLoaderAware;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.convert.ConversionService;
import org.springframework.core.convert.support.GenericConversionService;
import org.springframework.core.serializer.support.DeserializingConverter;
import org.springframework.core.serializer.support.SerializingConverter;

@Configuration
public class FstSessionConfig implements BeanClassLoaderAware {

	private ClassLoader classLoader;

	@Bean
	public ConversionService springSessionConversionService() {
		final FstDeserializerSerializer fstDeserializerSerializer = new FstDeserializerSerializer(classLoader);

		final GenericConversionService conversionService = new GenericConversionService();
		conversionService.addConverter(Object.class, byte[].class,
				new SerializingConverter(fstDeserializerSerializer));
		conversionService.addConverter(byte[].class, Object.class,
				new DeserializingConverter(fstDeserializerSerializer));
		return conversionService;
	}

	@Override
	public void setBeanClassLoader(final ClassLoader classLoader) {
		this.classLoader = classLoader;
	}
}

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.nustaq.serialization.FSTConfiguration;
import org.nustaq.serialization.FSTObjectOutput;
import org.springframework.core.NestedIOException;
import org.springframework.core.serializer.Deserializer;
import org.springframework.core.serializer.Serializer;

public class FstDeserializerSerializer implements Serializer<Object>, Deserializer<Object> {

	private final FSTConfiguration fstConfiguration;

	public FstDeserializerSerializer(final ClassLoader classLoader) {
		fstConfiguration = FSTConfiguration.createDefaultConfiguration();
		fstConfiguration.setClassLoader(classLoader);
	}

	@Override
	public Object deserialize(InputStream inputStream) throws IOException {
		try {
			return fstConfiguration.getObjectInput(inputStream).readObject();
		}
		catch (ClassNotFoundException ex) {
			throw new NestedIOException("Failed to deserialize object type", ex);
		}
	}

	@Override
	public void serialize(Object object, OutputStream outputStream) throws IOException {
		// Do not close fstObjectOutput - that would prevent reuse and cause an error
		// see https://github.com/RuedigerMoeller/fast-serialization/wiki/Serialization
		final FSTObjectOutput fstObjectOutput = fstConfiguration.getObjectOutput(outputStream);
		fstObjectOutput.writeObject(object);
		fstObjectOutput.flush();
	}
}

Using Kryo with Spring Session JDBC

Add Kryo as a dependency to the project. For example, using Maven:
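A minimal pom.xml fragment would look like this (the version shown is illustrative; as noted below, pick 5.0.0-RC5 or later to avoid a known bug):

```xml
<dependency>
	<groupId>com.esotericsoftware</groupId>
	<artifactId>kryo</artifactId>
	<version>5.0.0-RC5</version>
</dependency>
```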


And add these classes:


import org.springframework.beans.factory.BeanClassLoaderAware;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.convert.ConversionService;
import org.springframework.core.convert.support.GenericConversionService;
import org.springframework.core.serializer.support.DeserializingConverter;
import org.springframework.core.serializer.support.SerializingConverter;

@Configuration
public class KryoSessionConfig implements BeanClassLoaderAware {

	private ClassLoader classLoader;

	@Bean
	public ConversionService springSessionConversionService() {
		final KryoDeserializerSerializer kryoDeserializerSerializer = new KryoDeserializerSerializer(classLoader);

		final GenericConversionService conversionService = new GenericConversionService();
		conversionService.addConverter(Object.class, byte[].class,
				new SerializingConverter(kryoDeserializerSerializer));
		conversionService.addConverter(byte[].class, Object.class,
				new DeserializingConverter(kryoDeserializerSerializer));
		return conversionService;
	}

	@Override
	public void setBeanClassLoader(final ClassLoader classLoader) {
		this.classLoader = classLoader;
	}
}

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.springframework.core.serializer.Deserializer;
import org.springframework.core.serializer.Serializer;

import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;
import com.esotericsoftware.kryo.util.Pool;

public class KryoDeserializerSerializer implements Serializer<Object>, Deserializer<Object> {

	private final ClassLoader classLoader;

	// Pool constructor arguments: thread safe, soft references
	private final Pool<Kryo> kryoPool = new Pool<Kryo>(true, true) {
		@Override
		protected Kryo create() {
			final Kryo kryo = new Kryo();
			kryo.setClassLoader(classLoader);
			kryo.setRegistrationRequired(false);
			return kryo;
		}
	};

	public KryoDeserializerSerializer(final ClassLoader classLoader) {
		this.classLoader = classLoader;
	}

	@Override
	public Object deserialize(InputStream inputStream) throws IOException {
		final Kryo kryo = kryoPool.obtain();
		try (final Input input = new Input(inputStream)) {
			return kryo.readClassAndObject(input);
		} finally {
			kryoPool.free(kryo);
		}
	}

	@Override
	public void serialize(Object object, OutputStream outputStream) throws IOException {
		final Kryo kryo = kryoPool.obtain();
		try (final Output output = new Output(outputStream)) {
			kryo.writeClassAndObject(output, object);
		} finally {
			kryoPool.free(kryo);
		}
	}
}

How to Choose which Serializer to Use

The process of selecting which serializer to use for an application should be done only through testing, both for functionality and for performance.

For some applications, a given serializer simply won’t work, due to known limitations or outright bugs. For example, FST doesn’t currently support the Java 8 time classes, so if your application stores session data using such a class, FST is not for you. With Kryo, I ran into a bug that stopped me from using it (which will be fixed in version 5.0.0-RC5 and later).

Performance will also vary between serializers for each application. Factors that impact performance include exactly what is being serialized, how big that data is, how often it’s accessed, the version of Java, and how the system is configured. FST has published some benchmarks, but that information must be taken with a grain of salt as those benchmarks are only measuring very specific, isolated scenarios. That data does provide general guidance though – you can expect better performance when you switch from the Java serializer to FST, for example, but testing of the full application will need to be done to determine if the improvement is 0.1% or 10%.

October 13, 2019
Michał Górny a.k.a. mgorny (homepage, bugs)
Improving distfile mirror structure (October 13, 2019, 14:34 UTC)

The Gentoo distfile mirror network is essential in distributing sources to our users. It offloads upstream download locations, improves throughput and reliability, and guarantees distfile persistence.

The current structure of distfile mirrors dates back to 2002. It might have worked well back when we mirrored around 2500 files but it proved not to scale well. Today, mirrors hold almost 70 000 files, and this number has been causing problems for mirror admins.

The most recent discussion on restructuring mirrors started in January 2015. I have started the preliminary research in January 2017, and it resulted in GLEP 75 being created in January 2018. With the actual implementation effort starting in October 2019, I’d like to summarize all the data and update it with fresh statistics.

Continue reading

October 11, 2019
Thomas Raschbacher a.k.a. lordvan (homepage, bugs)

So I was trying to set up a service that should be accessible from the internet and locally (under the same domain name, and in a different subnet than the internal hosts for security reasons).

The variables (from the script I use for testing) should be self-explanatory, I think (IP addresses and interface names).

iptables -t nat -A PREROUTING -p tcp -d ${EXTIP} --dport 80 -j DNAT --to ${WEB}
iptables -A INPUT -p tcp -m state --state NEW --dport 80 -i ${IFACE_WAN} -j ACCEPT

Just using my usual DNAT rules worked from outside, but not from the inside. Luckily I found help in #Netfilter on irc.freenode.net: duclicsic pointed out that I needed SNAT or MASQUERADE as well, so the router also rewrites the local packets, and told me that this whole setup is called hairpin NAT. Thanks to that I now have my iptables rules in place:

iptables -t nat -A POSTROUTING -s ${INTERNALNET} -d ${WEB} -p tcp --dport 80 -j MASQUERADE

INTERNALNET is just the internal network in CIDR notation.

He also pointed out that a packet forwarded this way does not hit the INPUT chain on the host, but FORWARD instead (since the LAN interface on my router does not block port 80, I did not have an issue with this).
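If the LAN interface did filter traffic, the accept rule would accordingly go into the FORWARD chain rather than INPUT, roughly like this (an untested sketch using the same variables as above):

iptables -A FORWARD -p tcp -d ${WEB} --dport 80 -m state --state NEW -j ACCEPT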

An alternative to this would be split horizon DNS, where the internal DNS server returns the internal IP for internal hosts instead of the public IP.

October 04, 2019
Thomas Raschbacher a.k.a. lordvan (homepage, bugs)
[gentoo] Network Bridge device for Qemu kvm (October 04, 2019, 08:53 UTC)

So I needed to set up qemu+kvm on a new server (After the old one died)

Seems like I forgot to mention how I set up the bridge network in my previous blog post, so here you go:

First let me mention that I am using the second physical Interface on the server for the bridge. Depending on your available hardware or use case you might need / want to change this:

So this is fairly simple (if one has a properly configured kernel of course - details on the Gentoo Wiki article on Network Bridges):

First add this in your /etc/conf.d/net (adjust the interface names as needed):

# QEMU / KVM bridge
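
For reference, a typical netifrc bridge configuration in /etc/conf.d/net looks something like the following; the interface name enp2s0 and the static address are only placeholders for illustration:

config_enp2s0="null"          # the physical NIC carries no address itself
bridge_br0="enp2s0"           # enslave it to the bridge
config_br0="192.168.0.2/24"   # address for the bridge (or "dhcp")
routes_br0="default via 192.168.0.1"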

then add an init script and start it:

cd /etc/init.d; ln -s net.lo net.br0
/etc/init.d/net.br0 start # to test it

So then I get this error when trying to start my kvm/qemu instance:

 * creating qtap (auto allocating one) ...
/usr/sbin/qtap-manipulate: line 28: tunctl: command not found
tunctl failed
 * failed to create qtap interface

Seems like I was missing sys-apps/usermode-utilities, so I just emerged that, only to get this:

 * creating qtap (auto allocating one) ...
/usr/sbin/qtap-manipulate: line 29: brctl: command not found
brctl failed
 * failed to create qtap interface

Yep, I forgot to install that too ^^ .. so install net-misc/bridge-utils as well, and now the VM starts.

September 30, 2019
Thomas Raschbacher a.k.a. lordvan (homepage, bugs)

So on a new install I was just sitting there wondering what I did wrong, and why I kept getting these errors:

# ping lordvan.com
connect: Network is unreachable

then I realized something when checking the output of route -n:

 # route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
<network>       0.0.0.0         <netmask>       U     0      0        0 enp96s0f0
<router IP>     0.0.0.0         255.255.255.255 UH    2      0        0 enp96s0f0

It should be:

 # route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         <router IP>     0.0.0.0         UG    2      0        0 enp96s0f0
<network>       0.0.0.0         <netmask>       U     0      0        0 enp96s0f0

Turns out I had forgotten something quite simple, yet important: adding "default via <router IP>" to /etc/conf.d/net. So after adding

routes_enp96s0f0="default via <router IP>"

and restarting the interface everything works just fine ;)

Silly mistake, easy fix, but it can be a pain to realize what went wrong. Maybe someone will make the same mistake, find this blog post, and hopefully fix it faster than I did ;)

September 22, 2019
Alexys Jacob a.k.a. ultrabug (homepage, bugs)
py3status v3.20 – EuroPython 2019 edition (September 22, 2019, 14:27 UTC)

Shame on me to post this so long after it happened… Still, that’s a funny story to tell and a lot of thank you to give so let’s go!

The py3status EuroPython 2019 sprint

I’ve attended all EuroPython conferences since 2013. It’s a great event and I encourage everyone to get there!

The last two days of the conference week are meant for Open Source projects collaboration: this is called sprints.

I don’t know why but this year I decided that I would propose a sprint to welcome anyone willing to work on py3status to come and help…

To be honest I was expecting that nobody would be interested, so when I sat down at an empty table on Saturday I thought that it would remain empty… but hey, I would have worked on py3status anyway, so every option was okay!

Then two students came. They ran Windows and Mac OS and had never heard of i3wm or py3status, but were curious, so I showed them. They could read C, so I asked them if they could understand how i3status was reading its horrible configuration file… and they did!

Then Oliver Bestwalter (main maintainer of tox) came and told me he was a long time py3status user… followed by Hubert Bryłkowski and Ólafur Bjarni! Wow..

We joined forces to create a py3status module that allows the use of the great PewPew hardware device created by Radomir Dopieralski (which was given to all attendees) to control i3!

And we did it and had a lot of fun!

Oliver’s major contribution

The module itself is awesome, okay… but thanks to Oliver’s experience with tox, he proposed and contributed one of the most significant features py3status has had: the ability to import modules from other pypi packages!

The idea is that you have your module or set of modules. Instead of having to contribute them to py3status you could just publish them to pypi and py3status will automatically be able to detect and load them!

The use of entry points allows custom and more distributed module creation for our project!

Read more about this amazing feature on the docs.
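To sketch how entry-point based discovery works in general (the group name used here is hypothetical, not necessarily the one py3status actually registers):

```python
from importlib.metadata import entry_points  # Python 3.8+

GROUP = "py3status.modules"  # hypothetical entry point group, for illustration

def discover_modules(group=GROUP):
    """Map entry point names to their loader callables for every
    installed package that declares an entry point in `group`."""
    try:
        eps = entry_points(group=group)          # Python 3.10+ keyword form
    except TypeError:
        eps = entry_points().get(group, [])      # Python 3.8/3.9 dict form
    return {ep.name: ep.load for ep in eps}
```

A package would then only need to declare its module under that group in its packaging metadata, publish to pypi, and the consumer finds it automatically.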

All of this happened during EuroPython 2019 and I want to extend once again my gratitude to everyone who participated!

Thank you contributors

Version 3.20 is also the work of cool contributors.
See the changelog.

  • Daniel Peukert
  • Kevin Pulo
  • Maxim Baz
  • Piotr Miller
  • Rodrigo Leite
  • lasers
  • luto

Michał Górny a.k.a. mgorny (homepage, bugs)
The gruesome MediaWiki API (September 22, 2019, 06:44 UTC)

I have recently needed to work with the MediaWiki API. I wanted to create a trivial script to update the UID/GID assignment table from its text counterpart. Sounds trivial? Well, it was not, as the update-wiki-table script proves.

MediaWiki API really feels like someone took the webpage and replaced HTML templates with JSON, preserving all the silly aspects that do not make any sense. In this short article, I would like to summarize my experience by pointing out what is wrong with it, why and how it could be done much better.

How many requests does it take?

How many API requests does it take to change a Project page? None because you can’t grant your bot password permissions to do that. Sadly, that’s not a joke but the reality in Gentoo Wiki. Surely it is our fault for not configuring it properly — or maybe upstream’s for making that configuration so hard? Nevertheless, the actual table had to be moved to public space to resolve that.

Now, seriously, how many requests? I would have thought: maybe two. Actually, it’s four. Five, if you want to be nice. That is, in order:

  1. request login token,
  2. log in using the login token,
  3. request CSRF token,
  4. update the page using CSRF token,
  5. log out.
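For illustration, the five requests above can be sketched in Python as the payloads they carry (the URL, credentials and token values are placeholders; parameter names follow the MediaWiki action API, and a cookie-aware HTTP session must be used across all of them):

```python
API = "https://wiki.example.org/api.php"  # placeholder URL

def request_sequence(bot_user, bot_password, title, new_text,
                     login_token, csrf_token):
    """Return the (method, params) pairs for a single page edit.
    The two tokens come from the responses to steps 1 and 3."""
    return [
        # 1. request login token
        ("GET", {"action": "query", "meta": "tokens", "type": "login",
                 "format": "json"}),
        # 2. log in using the login token
        ("POST", {"action": "login", "lgname": bot_user,
                  "lgpassword": bot_password, "lgtoken": login_token,
                  "format": "json"}),
        # 3. request CSRF token
        ("GET", {"action": "query", "meta": "tokens", "format": "json"}),
        # 4. update the page using the CSRF token
        ("POST", {"action": "edit", "title": title, "text": new_text,
                  "token": csrf_token, "format": "json"}),
        # 5. log out (reuses the same CSRF token)
        ("POST", {"action": "logout", "token": csrf_token,
                  "format": "json"}),
    ]
```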

Good news: you don’t need to fetch yet another token to log out. You can use the same CSRF token you used to edit the Wiki page!

Now, what’s the deal with all those tokens? CSRF attacks are a danger to people using a web browser! How would you issue a CSRF attack against an API client if the client has pretty clearly defined what it’s supposed to do? Really, requesting extra tokens is just busywork that makes the API unpleasant and slow.

So, the bare minimum: remove those useless tokens, and get down to three requests. Ideally, since I only care to perform a single action, the API would let me provide credentials along with it without the login-logout ping-pong. This would get it down to one request.

Cookies, sir?

The Python examples in the MediaWiki documentation use the requests module (e.g. the API:Edit example). Since I don’t like extraneous external dependencies, I initially rewrote it to use the built-in urllib. That was a mistake, and a badly documented one.

I’ve gotten as far as to the login request. However, it repeatedly claimed that I’m implicitly requesting a login token (which is deprecated) and gave me a new one rather than actually logging me in. Except that I did pass the login token!

It turned out that everything hinges on… cookies. Sure, it’s my fault for not reading the API documentation thoroughly. It is upstream’s fault for making a really silly API that requires a deep browser emulation to work, and for providing horribly misguided error messages.

Why should an API use cookies in the first place? You don’t need to pass data behind my back! Since I am writing the API client, I am more than happy to pass whatever data needs to be passed explicitly, in API requests. After all, you require me to pass lots of tokens explicitly anyway, so why not actually make them do something useful?!

Bot passwords, bot usernames

Now, MediaWiki prohibits you from using your account credentials to log in, unless you jump through ever bigger hoops. Of course, that makes sense — I neither want my password stored somewhere script-accessible, nor give the script full admin powers. Instead, I am supposed to obtain a bot password and grant it specific permissions. Feels like a typical case of an API key? Well, it’s not.

Not only do I need to explicitly pass a username along with the autogenerated bot password, but it has to be a special bot username. This is just plain silly. Since bot passwords are autogenerated, it should be trivially possible to enforce their uniqueness and infer the correct bot account from that. There is no technical reason to require a username/password pair for bot login, and it just adds complexity for no benefit.

I am actively using both Bugzilla and GitHub APIs. Both work fine with a simple API token that I keep stored in an unstructured text file. Now I’m being picky but why has MediaWiki to be a special snowflake here?

Summary, or the ideal API

How should the MediaWiki API look like, if done properly? For a start, it would be freed of all its useless complexity. Instead of bot username/password pair, just a single unique API key. No login tokens, no CSRF tokens, no cookies! Just issue a login request with your API key, get a session key in return and pass it to other requests. Or even better — just pass the API key directly to all the requests, so simple one-shot actions such as edits would actually take one request.

September 13, 2019
Nathan Zachary a.k.a. nathanzachary (homepage, bugs)
Vim pulling in xorg dependencies in Gentoo (September 13, 2019, 17:10 UTC)

Today I went to update one of my Gentoo servers, and noticed that it wanted to pull in a bunch of xorg dependencies. This is a simple music server without any type of graphical environment, so I don’t really want any xorg libraries or other GUI components installed. Looking through the full output, I couldn’t see a direct reason that these components were now requirements.

To troubleshoot, I started adding packages to /etc/portage/package.mask, starting with cairo (which was the package directly requesting the ‘X’ USE flag be enabled). That didn’t get me very far as it still just indicated that GTK+ needed to be installed. After following the dependency chain for a bit, I noticed that something was pulling in libcanberra and found that the default USE flags now include ‘sound’ and that vim now had it enabled. It looks like this USE flag was added between vim-8.1.1486 and vim-8.1.1846.

For my needs, the most straightforward solution was to just remove the ‘sound’ USE flag from vim by adding the following to /etc/portage/package.use:

# grep vim /etc/portage/package.use 
app-editors/vim -sound

Hanno Böck a.k.a. hanno (homepage, bugs)

In discussions around the PGP ecosystem one thing I often hear is that while PGP has its problems, it's an important tool for package signatures in Linux distributions. I therefore want to highlight a few issues I came across in this context that are rooted in problems in the larger PGP ecosystem.

Let's look at an example of the use of PGP signatures for deb packages, the Ubuntu Linux installation instructions for HHVM. HHVM is an implementation of the Hack programming language and developed by Facebook. I'm just using HHVM as an example here, as it nicely illustrates two attacks I want to talk about, but you'll find plenty of similar installation instructions for other software packages. I have reported these issues to Facebook, but they decided not to change anything.

The instructions for Ubuntu (and very similarly for Debian) recommend that users execute these commands in order to install HHVM from the repository of its developers:

apt-get update
apt-get install software-properties-common apt-transport-https
apt-key adv --recv-keys --keyserver hkp://keyserver.ubuntu.com:80 0xB4112585D386EB94

add-apt-repository https://dl.hhvm.com/ubuntu
apt-get update
apt-get install hhvm

The crucial part here is the line starting with apt-key. It fetches the key that is used to sign the repository from the Ubuntu key server, which itself is part of the PGP keyserver network.

Attack 1: Flooding a Key with Signatures

The first possible attack is actually quite simple: One can make the signature key offered here unusable by appending many signatures.

A key concept of the PGP keyservers is that they operate append-only. New data gets added, but never removed. PGP keys can sign other keys and these signatures can also be uploaded to the keyservers and get added to a key. Crucially the owner of a key has no influence on this.

This means everyone can grow the size of a key by simply adding many signatures to it. Lately this has happened to a number of keys, see the blog posts by Daniel Kahn Gillmor and Robert Hansen, two members of the PGP community who have personally been affected by this. The effect of this is that when GnuPG tries to import such a key it becomes excessively slow and at some point will simply not work any more.

For the above installation instructions this means anyone can make them unusable by attacking the referenced release key. In my tests I was still able to import one of the attacked keys with apt-key after several minutes, but these keys "only" have a few tens of thousands of signatures, growing them to a few megabytes in size. There's no reason an attacker couldn't use millions of signatures and grow single keys to gigabytes.

Attack 2: Rogue packages with a colliding Key Id

The installation instructions reference the key as 0xB4112585D386EB94, which is a 64 bit hexadecimal key id.

Key ids are a central concept in the PGP ecosystem. The key id is a truncated SHA1 hash of the public key. It's possible to either use the last 32 bit, 64 bit or the full 160 bit of the hash.
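To illustrate the truncation (simplified: a real v4 fingerprint hashes a specific serialization of the key packet, not the raw key material as done here):

```python
import hashlib

def key_ids(public_key_material: bytes) -> dict:
    # Simplified illustration: the fingerprint is a SHA-1 hash, and the
    # shorter ids are just its trailing hex digits.
    fingerprint = hashlib.sha1(public_key_material).hexdigest().upper()
    return {
        "fingerprint": fingerprint,    # full 160 bit, 40 hex digits
        "long_id": fingerprint[-16:],  # last 64 bit
        "short_id": fingerprint[-8:],  # last 32 bit -- cheap to collide
    }
```

Because the short ids are pure truncations, an attacker only needs to brute-force keys until the trailing digits match the target.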

It's been pointed out in the past that short key ids allow colliding key ids. This means an attacker can generate a different key with the same key id where he owns the private key simply by bruteforcing the id. In 2014 Richard Klafter and Eric Swanson showed with the Evil32 attack how to create colliding key ids for all keys in the so-called strong set (meaning all keys that are connected with most other keys in the web of trust). Later someone unknown uploaded these keys to the key servers causing quite some confusion.

It should be noted that the issue of colliding key ids was known and discussed in the community way earlier, see for example this discussion from 2002.

The practical attacks targeted 32 bit key ids, but the same attack works against 64 bit key ids too; it just costs more. I contacted the authors of the Evil32 attack, and Eric Swanson estimated in a back-of-the-envelope calculation that it would cost roughly $120,000 to perform such an attack with GPUs on cloud providers. This is expensive, but within the possibilities of powerful attackers. Though one can also find similar installation instructions using a 32 bit key id, where the attack is really cheap.

Going back to the installation instructions from above we can imagine the following attack: A man in the middle network attacker can intercept the connection to the keyserver - it's not encrypted or authenticated - and provide the victim a colliding key. Afterwards the key is imported by the victim, so the attacker can provide repositories with packages signed by his key, ultimately leading to code execution.

You may notice that there's a problem with this attack: The repository provided by HHVM is using HTTPS. Thus the attacker can not simply provide a rogue HHVM repository. However the attack still works.

The imported PGP key is not bound to any specific repository. Thus if the victim has any non-HTTPS repository configured in his system the attacker can provide a rogue repository on the next call of "apt update". Notably by default both Debian and Ubuntu use HTTP for their repositories (a Debian developer even runs a dedicated web page explaining why this is no big deal).

Attack 3: Key over HTTP

Issues with package keys aren't confined to Debian/APT-based distributions. I found these installation instructions at Dropbox (Link to Wayback Machine, as Dropbox has changed them after I reported this):

Add the following to /etc/yum.conf.

name=Dropbox Repository

It should be obvious what the issue here is: Both the key and the repository are fetched over HTTP, a network attacker can simply provide his own key and repository.


The standard answer you often get when you point out security problems with PGP-based systems is: "It's not PGP/GnuPG, people are just using it wrong". But I believe these issues show some deeper problems with the PGP ecosystem. The key flooding issue is inherited from the systemically flawed concept of the append-only key servers.

The other issue here is lack of deprecation. Short key ids are problematic, it's been known for a long time and there have been plenty of calls to get rid of them. This raises the question of why no effort has been made to deprecate support for them. One could have said at some point: Future versions of GnuPG will show a warning for short key ids and in three years we will stop supporting them.

This is reminiscent of other issues like unauthenticated encryption, where people have been arguing that this was fixed back in 1999 by the introduction of the MDC. Yet in 2018 it was still exploitable, because the unauthenticated version was never properly deprecated.


For all people having installation instructions for external repositories my recommendation would be to avoid any use of public key servers. Host the keys on your own infrastructure and provide them via HTTPS. Furthermore any reference to 32 bit or 64 bit key ids should be avoided.

Update: Some people have pointed out to me that the Debian Wiki contains guidelines for third party repositories that avoid the issues mentioned here.

September 06, 2019
Nathan Zachary a.k.a. nathanzachary (homepage, bugs)
Adobe Flash and Firefox 68+ in Gentoo Linux (September 06, 2019, 19:17 UTC)

Though many sites have abandoned Adobe Flash in favour of HTML5 these days, there are still some legacy applications (e.g. older versions of VMWare’s vSphere web client) that depend on it. Recent versions of Firefox in Linux (68+) started failing to load Flash content for me, and it took some digging to find out why. First off, I noticed that the content wouldn’t load even on Adobe’s Flash test page. Second off, I found that the plugin wasn’t listed in Firefox’s about:plugins page.

So, I realised that the problem was due to the Adobe Flash plugin not integrating properly with Firefox. I use Gentoo Linux, so these instructions may not directly apply to other distributions, but I would imagine that the directory structures are at least similar. To start, I made sure that I had installed the www-plugins/adobe-flash ebuild with the ‘nsplugin’ USE flag enabled:

$ eix adobe-flash
[I] www-plugins/adobe-flash
     Available versions:  (22)
       {+nsplugin +ppapi ABI_MIPS="n32 n64 o32" ABI_RISCV="lp64 lp64d" ABI_S390="32 64" ABI_X86="32 64 x32"}
     Installed versions:  (03:13:05 22/08/19)(nsplugin -ppapi ABI_MIPS="-n32 -n64 -o32" ABI_RISCV="-lp64 -lp64d" ABI_S390="-32 -64" ABI_X86="64 -32 -x32")
     Homepage:            https://www.adobe.com/products/flashplayer.html https://get.adobe.com/flashplayer/ https://helpx.adobe.com/security/products/flash-player.html
     Description:         Adobe Flash Player

That ebuild installs the libflashplayer.so (shared object) in the /usr/lib64/nsbrowser/plugins/ directory by default.

However, through some digging, I found that Firefox 68+ was looking in another directory for the plugin (in my particular situation, that directory was /usr/lib64/mozilla/plugins/, which actually didn’t exist on my system). Seeing as the target directory didn’t exist, I had to firstly create it, and then I decided to symlink the shared object there so that future updates to the www-plugins/adobe-flash package would work without any further manual intervention:

mkdir -p /usr/lib64/mozilla/plugins/
cd $_
ln -s /usr/lib64/nsbrowser/plugins/libflashplayer.so .

After restarting Firefox, the Adobe Flash test page started working as did other sites that use Flash. So, though your particular Linux distribution, version of Firefox, and version of Adobe Flash may require the use of different directories than the ones I referenced above, I hope that these instructions can help you troubleshoot the problem with Adobe Flash not showing in the Firefox about:plugins page.

August 12, 2019
Andreas K. Hüttel a.k.a. dilfridge (homepage, bugs)

We are happy to be able to announce that our manuscript "Coulomb Blockade Spectroscopy of a MoS2 Nanotube" has been accepted for publication by pssRRL Rapid Research Letters.

Everybody is talking about novel semiconductor materials, and in particular the transition metal dichalcogenides (TMDCs), "layer materials" similar to graphene. With a chemical composition of TX2, where the transition metal T is, e.g., tungsten W or molybdenum Mo, and the chalcogenide X is, e.g., sulphur S or selenium Se, a wide range of interesting properties is expected.

What's by far not so well known is that many of these materials also form nanotubes, similar to carbon nanotubes in structure but with distinct properties inherited from the planar system. Here, we present the first low-temperature transport measurements on a quantum dot in a MoS2 nanotube. The metallic contacts to the nanotube still require a lot of improvement, but the nanotube between them acts as a clean potential well for electrons.

Also, our measurements show possible traces of quantum confined behaviour. This is something that has not been achieved yet in planar, lithographically designed devices - since these have by their very geometric nature larger length scales. It means that via transport spectroscopy we can learn about the material properties and its suitability for quantum electronics devices.

A lot of complex physical phenomena have been predicted for MoS2, including spin filtering and intrinsic, possibly topologic superconductivity - a topic of high interest for the quantum computing community, where larger semiconductor nanowires are used at the moment. So this is the start of an exciting project!

"Coulomb Blockade Spectroscopy of a MoS2 Nanotube"
S. Reinhardt, L. Pirker, C. Bäuml, M. Remskar, and A. K. Hüttel
Physica Status Solidi RRL, doi:10.1002/pssr.201900251 (2019); arXiv:1904.05972 (PDF)

August 11, 2019
AArch64 (arm64) profiles are now stable! (August 11, 2019, 00:00 UTC)


The ARM64 project is pleased to announce that all ARM64 profiles are now stable.

While our developers and users have contributed significantly to this accomplishment, we must also thank our Packet sponsor for their contribution. Providing the Gentoo developer community with access to bare metal hardware has accelerated progress in achieving the stabilization of the ARM64 profiles.

About Packet.com

This access has been kindly provided to Gentoo by the bare metal cloud provider Packet via their Works on Arm project. Learn more about their commitment to supporting open source here.

About Gentoo

Gentoo Linux is a free, source-based, rolling release meta distribution that features a high degree of flexibility and high performance. It empowers you to make your computer work for you, and offers a variety of choices at all levels of system configuration.

As a community, Gentoo consists of approximately two hundred developers and over fifty thousand users globally.

July 21, 2019
Thomas Raschbacher a.k.a. lordvan (homepage, bugs)

So I had been wondering why my Roundcube had lost all the contacts and settings for my users (I changed the DB host after a server outage - see my older post).

Checked the DB and seen this:

roundcubemail=# select user_id, username, mail_host, created from users;
 user_id | username | mail_host |            created
---------+----------+-----------+-------------------------------
       3 | User3    | localhost | 2016-04-06 14:32:28.637887+02
       2 | User2    | localhost | 2013-09-26 15:32:08.848301+02
       4 | User2    |           | 2019-06-17 14:21:18.059167+02
       5 | User1    |           | 2019-07-19 14:26:41.113583+02
       1 | User1    | localhost | 2013-08-29 10:39:47.995082+02
(5 rows)

(I changed the actual usernames to User1, 2, 3.)

Turns out that while I was changing stuff I had changed the config from

$config['default_host'] = 'localhost';


to

$config['default_host'] = '';

This resulted in Roundcube thinking it was talking to a different host - apparently it doesn't check or resolve hostnames to IPs, but instead just treats each distinct value as its own host.

The solution was simple: changing it back from '' to 'localhost'. Hope this helps someone else too; it took me a bit to figure out.

July 18, 2019
Thomas Raschbacher a.k.a. lordvan (homepage, bugs)
(Suspected) Arson at Kyoto Animation Studio (July 18, 2019, 07:47 UTC)

Latest info 18:03 (JST) 

Already 12 dead at Kyoto Animation Studio. I hope that this number won't go up anymore!

I will probably add more links and information later as I find out.

Article at NHK world

Article at NHK world in japanese - update 2019-07-18 17:19 (JST) -- 13 people confirmed dead :(

Article on bbc.com

Our thoughts and prayers are with the victims and their families & friends!

July 09, 2019
Michał Górny a.k.a. mgorny (homepage, bugs)

Gentoo elections are conducted using a custom software called votify. During the voting period, the developers place their votes in their respective home directories on one of the Gentoo servers. Afterwards, the election officials collect the votes, count them, compare their results and finally announce them.

The simplified description stated above suggests two weak points. Firstly, we rely on honesty of election officials. If they chose to conspire, they could fake the result. Secondly, we rely on honesty of all Infrastructure members, as they could use root access to manipulate the votes (or the collection process).

To protect against possible fraud, we make the elections transparent (but pseudonymous). This means that all votes cast are public, so everyone can count them and verify the result. Furthermore, developers can verify whether their personal vote has been included. Ideally, all developers would do that and therefore confirm that no votes were manipulated.

Currently, we are pretty much implicitly relying on developers doing that, and assuming that no protest implies successful verification. However, this is not really reliable, and given the unfriendly nature of our scripts I have reasons to doubt that the majority of developers actually verify the election results. In this post, I would like to shortly explain how Gentoo elections work, how they could be manipulated and introduce Votrify — a tool to explicitly verify election results.

Gentoo voting process in detail

Once the nomination period is over, an election official sets the voting process up by creating control files for the voting scripts. Those control files include election name, voting period, ballot (containing all vote choices) and list of eligible voters.

There are no explicit events corresponding to the beginning or the end of the voting period. The votify script used by developers reads the election data on each execution, and uses it to determine whether the voting period is open. During the voting period, it permits the developer to edit the vote, and finally to ‘submit’ it. Both the draft and the submitted vote are stored as files in the developer’s home directory; ‘submitted’ votes are not collected automatically. This means that the developer can still manually manipulate the vote once the voting period concludes, and before the votes are manually collected.

Votes are collected explicitly by an election official. When run, the countify script collects all vote files from developers’ home directories. A unique ‘confirmation ID’ is generated for each voting developer. All votes along with their confirmation IDs are placed in a so-called ‘master ballot’, while the mapping from developer names to confirmation IDs is stored separately. The latter is used to send developers their respective confirmation IDs, and can be discarded afterwards.

Each of the election officials uses the master ballot to count the votes. Afterwards, they compare their results and if they match, they announce the election results. The master ballot is attached to the announcement mail, so that everyone can verify the results.
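Verifying your own vote against the master ballot can be sketched as follows (the actual master ballot format may differ; this assumes a hypothetical one-line-per-voter format of a confirmation ID followed by the ranked choices):

```python
def verify_own_vote(master_ballot_lines, confirmation_id, expected_vote):
    """Return True if the master ballot reproduces our vote under our
    confirmation ID, False if it is missing or was altered."""
    for line in master_ballot_lines:
        ident, _, vote = line.strip().partition(" ")
        if ident == confirmation_id:
            return vote.split() == list(expected_vote)
    return False
```

Non-voters would run the complementary check: that their confirmation ID does not appear in the ballot at all.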

Possible manipulations

The three methods of manipulating the vote that I can think of are:

  1. Announcing fake results. An election result may be presented that does not match the votes cast. This is actively prevented by having multiple election officials, and by making the votes transparent so that everyone can count them.
  2. Manipulating votes cast by developers. The result could be manipulated by modifying the votes cast by individual developers. This is prevented by including pseudonymous vote attribution in the master ballot. Every developer can therefore check whether his/her vote has been reproduced correctly. However, this presumes that the developer is active.
  3. Adding fake votes to the master ballot. The result could be manipulated by adding votes that were not cast by any of the existing developers. This is a major problem, and such manipulation is entirely plausible if the turnout is low enough, and developers who did not vote fail to check whether they have not been added to the casting voter list.

Furthermore, the efficiency of the last method can be improved if the attacker is able to restrict communication between voters and/or reliably deliver different versions of the master ballot to different voters, i.e. convince the voters that their own vote was included correctly while manipulating the remaining votes to achieve the desired result. The former is rather unlikely but the latter is generally feasible.

Finally, the results could be manipulated via manipulating the voting software. This can be counteracted through verifying the implementation against the algorithm specification or, to some degree, via comparing the results with a third-party tool. Robin H. Johnson and I were historically working on this (or more specifically, on verifying whether the Gentoo implementation of the Schulze method is correct) but neither of us was able to finish the work. If you’re interested in the topic, you can look at my election-compare repository. For the purpose of this post, I’m going to consider this possibility out of scope.

Verifying election results using Votrify

Votrify uses a two-stage verification model. It consists of individual verification which is performed by each voter separately and produces signed confirmations, and community verification that uses the aforementioned files to provide final verified election result.

The individual verification part involves:

  1. Verifying that the developer’s vote has been recorded correctly. This takes part in detecting whether any votes have been manipulated. The positive result of this verification is implied by the fact that a confirmation is produced. Additionally, developers who did not cast a vote also need to produce confirmations, in order to detect any extraneous votes.
  2. Counting the votes and producing the election result. This produces the election results as seen from the developer’s perspective, and therefore prevents manipulation via announcing fake results. Furthermore, comparing the results between different developers helps finding implementation bugs.
  3. Hashing the master ballot. The hash of master ballot file is included, and comparing it between different results confirms that all voters received the same master ballot.
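As a rough illustration of the individual checks, here is a shell sketch; the flat ‘confirmation-id vote’ ballot format is an assumption made up for the example, not Votrify’s actual file layout:

```shell
# assumed ballot format: one "<confirmation-id> <vote>" line per voter
printf '%s\n' 'c0ffee larry moe curly' 'deadbf moe larry curly' > master-ballot.txt

# 1. check that your own vote was reproduced verbatim under your confirmation ID
grep -q '^c0ffee larry moe curly$' master-ballot.txt && echo 'vote recorded correctly'

# 3. hash the master ballot; every participating voter should obtain the same digest
sha512sum master-ballot.txt
```

Counting the votes (step 2) requires a full Schulze-method implementation and is left out of this sketch.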

If the verification is positive, a confirmation is produced and signed using developer’s OpenPGP key. I would like to note that no private data is leaked in the process. It does not even indicate whether the dev in question has actually voted — only that he/she participates in the verification process.

Afterwards, confirmations from different voters are collected. They are used to perform community verification which involves:

  1. Verifying the OpenPGP signature. This is necessary to confirm the authenticity of the signed confirmation. The check also involves verifying that the key owner was an eligible voter and that each voter produced only one confirmation. Therefore, it prevents attempts to fake the verification results.
  2. Comparing the results and master ballot hashes. This confirms that everyone participating received the same master ballot, and produced the same results.
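The community stage can likewise be sketched in shell; the one-file-per-voter layout with a ‘&lt;ballot-hash&gt; &lt;result&gt;’ line is again an assumption made up for the example:

```shell
# each (signature-verified) confirmation contributes one "<hash> <result>" line
mkdir -p confirmations
printf 'abc123 winner:larry\n' > confirmations/dev1.txt
printf 'abc123 winner:larry\n' > confirmations/dev2.txt

# every voter must report the same master-ballot hash and the same result
if [ "$(sort -u confirmations/*.txt | wc -l)" -eq 1 ]; then
    echo "all $(ls confirmations | wc -l) confirmations agree"
fi
```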

If the verification for all confirmations is positive, the election results are repeated, along with explicit quantification of how trustworthy they are. The number indicates how many confirmations were used, and therefore how many of the votes (or non-votes) in master ballot were confirmed. The difference between the number of eligible voters and the number of confirmations indicates how many votes may have been altered, planted or deleted. Ideally, if all eligible voters produced signed confirmations, the election would be 100% confirmed.

Thomas Raschbacher a.k.a. lordvan (homepage, bugs)
Autoclicker for Linux (July 09, 2019, 14:02 UTC)

So I wanted an autoclicker for Linux, for one of my browser-based games that requires a lot of clicking.

Looked around and tried to find something useful, but all I could find were old pages and outdated download links.

In the end I stumbled upon something simple yet immensely more powerful: xdotool (github), or check out the xdotool website.

As an extra bonus it is in the Gentoo repository so a simple

emerge xdotool

got it installed. It also has minimal dependencies, which is nice.

The good part, but also a bit of a downside is that there is no UI (maybe I'll write one when I get a chance .. just as a wrapper).

Anyway, to do what I wanted was as simple as this:

xdotool click --repeat 1000 --delay 100 1

Pretty self-explanatory, but here's a short explanation anyway:

  • click .. simulate a mouse click
  • --repeat 1000 ... repeat 1000 times
  • --delay 100 ... wait 100ms between clicks
  • 1  .. mouse button 1

The only problem is I need to know how many clicks I need beforehand - which can also be a nice feature of course.

There is one way to stop it if you have the terminal you ran the command from visible (which I always have, set to always on top): press and hold your left mouse button. This stops the click events from registering, since xdotool is stuck on mouse-down and waits for mouse-up, I guess, but I'm not sure if that is the reason. Then move to the terminal and either close it or abort the command with ctrl+c, or just wait for the program to exit after finishing the requested number of clicks. On a side note, if you don't like that way of stopping it, you could always switch to a text console with ctrl+alt+f1 (or whichever terminal you want to use), log in there and kill the xdotool process: either find the PID and kill it, or just killall xdotool, which will of course kill all instances, but I doubt you'll run more than one at once.
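For repeated runs, a tiny wrapper can at least tell you what it is about to do and how long it will take. This is a hypothetical sketch (the function names are made up, not part of xdotool):

```shell
# hypothetical helper functions around xdotool
autoclick_cmd() {
    # $1 = number of clicks (default 1000), $2 = delay in ms (default 100)
    echo "xdotool click --repeat ${1:-1000} --delay ${2:-100} 1"
}

autoclick_eta_s() {
    # rough run time in whole seconds: clicks * delay / 1000
    echo "$(( ${1:-1000} * ${2:-100} / 1000 ))"
}

autoclick_cmd 500 200     # -> xdotool click --repeat 500 --delay 200 1
autoclick_eta_s 500 200   # -> 100 (seconds)
```

To actually click you would run the generated command (with an X session available): eval "$(autoclick_cmd 500 200)".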

July 05, 2019
Thomas Raschbacher a.k.a. lordvan (homepage, bugs)
openvpn client on raspberry PI (July 05, 2019, 05:56 UTC)

Needing OpenVPN on my raspberry PI caused me to have some .. unexpected issues. But first a very quick run-down on what I did:

apt-get install openvpn

(I did an upgrade and dist-upgrade to buster too since my install was quite old already, but that is a different story).

then create a .conf file in /etc/openvpn:

Here's a simple example that I am using (I use the "embedded" style config since I don't like having loads of files in that folder):

# OpenVPN Client Configuration

dev tun
proto udp
resolv-retry infinite

user nobody
group nogroup

# if you want to save your certificates in separate files then use this:
# ca <path/to/your/ca.crt>
# cert <path/to/your/client.crt>
# key <path/to/your/client.key>

ns-cert-type server
verb 3

# I like to just embed the keys and certificates in the conf file
# (useful also for the android client); each one goes into its own
# inline block:

# <ca>
# ... paste contents of ca.crt ...
# </ca>
# <cert>
# ... paste contents of client.crt ...
# </cert>
# <key>
# ... paste contents of client.key ...
# </key>

Then test it by running sudo openvpn /etc/openvpn/client.conf. If you didn't make any mistakes, the output will look something like this:

Fri Jul  5 07:21:33 2019 OpenVPN 2.4.7 arm-unknown-linux-gnueabihf [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [PKCS11] [MH/PKTINFO] [AEAD] built on Feb 20 2019
Fri Jul  5 07:21:33 2019 library versions: OpenSSL 1.1.1c  28 May 2019, LZO 2.10
Fri Jul  5 07:21:33 2019 WARNING: --ns-cert-type is DEPRECATED.  Use --remote-cert-tls instead.
Fri Jul  5 07:21:33 2019 TCP/UDP: Preserving recently used remote address: [AF_INET]SERVER_NAME_IP:SERVER_PORT
Fri Jul  5 07:21:33 2019 Socket Buffers: R=[163840->163840] S=[163840->163840]
Fri Jul  5 07:21:33 2019 UDP link local: (not bound)
Fri Jul  5 07:21:33 2019 UDP link remote: [AF_INET]SERVER_NAME_IP:SERVER_PORT
Fri Jul  5 07:21:33 2019 NOTE: UID/GID downgrade will be delayed because of --client, --pull, or --up-delay
Fri Jul  5 07:21:33 2019 TLS: Initial packet from [AF_INET]SERVER_NAME_IP:SERVER_PORT, sid=60ead8c7 c03a7c1d
Fri Jul  5 07:21:33 2019 VERIFY OK: nsCertType=SERVER
Fri Jul  5 07:21:34 2019 Control Channel: TLSv1.2, cipher TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384, 1024 bit RSA
Fri Jul  5 07:21:34 2019 [SERVER_NAME_IP] Peer Connection Initiated with [AF_INET]SERVER_NAME_IP:SERVER_PORT
Fri Jul  5 07:21:35 2019 SENT CONTROL [SERVER_NAME_IP]: 'PUSH_REQUEST' (status=1)
Fri Jul  5 07:21:35 2019 PUSH: Received control message: 'PUSH_REPLY,route VPN_SUBNET,topology net30,ping 10,ping-restart 120,ifconfig YOUR_VPN_IP YOUR_VPN_ROUTER,peer-id 1,cipher AES-256-GCM'
Fri Jul  5 07:21:35 2019 OPTIONS IMPORT: timers and/or timeouts modified
Fri Jul  5 07:21:35 2019 OPTIONS IMPORT: --ifconfig/up options modified
Fri Jul  5 07:21:35 2019 OPTIONS IMPORT: route options modified
Fri Jul  5 07:21:35 2019 OPTIONS IMPORT: peer-id set
Fri Jul  5 07:21:35 2019 OPTIONS IMPORT: adjusting link_mtu to 1625
Fri Jul  5 07:21:35 2019 OPTIONS IMPORT: data channel crypto options modified
Fri Jul  5 07:21:35 2019 Data Channel: using negotiated cipher 'AES-256-GCM'
Fri Jul  5 07:21:35 2019 Outgoing Data Channel: Cipher 'AES-256-GCM' initialized with 256 bit key
Fri Jul  5 07:21:35 2019 Incoming Data Channel: Cipher 'AES-256-GCM' initialized with 256 bit key
Fri Jul  5 07:21:35 2019 TUN/TAP device tun0 opened
Fri Jul  5 07:21:35 2019 TUN/TAP TX queue length set to 100
Fri Jul  5 07:21:35 2019 /sbin/ip link set dev tun0 up mtu 1500
Fri Jul  5 07:21:35 2019 /sbin/ip addr add dev tun0 local YOUR_VPN_IP peer YOUR_VPN_ROUTER
Fri Jul  5 07:21:35 2019 /sbin/ip route add VPN_SUBNET/24 via YOUR_VPN_ROUTER
Fri Jul  5 07:21:35 2019 GID set to nogroup
Fri Jul  5 07:21:35 2019 UID set to nobody
Fri Jul  5 07:21:35 2019 WARNING: this configuration may cache passwords in memory -- use the auth-nocache option to prevent this
Fri Jul  5 07:21:35 2019 Initialization Sequence Completed

If you run ifconfig it should now include an entry for your new VPN device - similar to this:

        inet YOUR_VPN_IP netmask  destination YOUR_VPN_ROUTER
        inet6 YOUR_VPN_IPV6_IP  prefixlen 64  scopeid 0x20<link>
        unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  txqueuelen 100  (UNSPEC)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 9  bytes 432 (432.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

Or if you prefer ip addr show:

5: tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 100
    inet YOUR_VPN_IP peer YOUR_VPN_ROUTER/32 scope global tun0
       valid_lft forever preferred_lft forever
    inet6 YOUR_VPN_IPV6_IP/64 scope link stable-privacy
       valid_lft forever preferred_lft forever

Then just stop it with <ctrl>+c or otherwise kill the process. The output will be like this:

Fri Jul  5 07:22:15 2019 event_wait : Interrupted system call (code=4)
Fri Jul  5 07:22:15 2019 /sbin/ip route del VPN_SUBNET/24
RTNETLINK answers: Operation not permitted
Fri Jul  5 07:22:15 2019 ERROR: Linux route delete command failed: external program exited with error status: 2
Fri Jul  5 07:22:15 2019 Closing TUN/TAP interface
Fri Jul  5 07:22:15 2019 /sbin/ip addr del dev tun0 local YOUR_VPN_IP peer YOUR_VPN_ROUTER
RTNETLINK answers: Operation not permitted
Fri Jul  5 07:22:15 2019 Linux ip addr del failed: external program exited with error status: 2
Fri Jul  5 07:22:15 2019 SIGTERM[hard,] received, process exiting

So I was having a little .. issue with getting my openvpn client to start (and then to start on boot).

Turns out both were easy to solve issues:

First I had this error:

ERROR: Cannot open TUN/TAP dev /dev/net/tun: No such device (errno=19)

That was just because I forgot to reboot after a dist-upgrade (which included a new kernel). So reboot, and done :)

The second issue is because I normally use Gentoo, and without systemd at that, so I had no idea how to get it to start my VPN. Just running /etc/init.d/openvpn start did not start it, for one. Changing AUTOSTART="all" to AUTOSTART="client" in the init file did not do anything either. After looking it up on Google a bit I found what I needed:

systemctl start openvpn@lv_new.service

And then you can check again with ifconfig or ip addr show that you have your device up. So then how to get it to autostart? Turns out that is fairly similar:

systemctl enable openvpn@lv_new.service
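For reference, the NAME part of the unit maps to the conf file in /etc/openvpn; as far as I can tell this is how the Debian-style openvpn packaging wires up systemd:

```shell
# openvpn@NAME.service starts /etc/openvpn/NAME.conf, so the unit name
# can be derived from the config file name
conf=/etc/openvpn/lv_new.conf
unit="openvpn@$(basename "$conf" .conf).service"
echo "$unit"   # -> openvpn@lv_new.service
```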

Turns out most of it was rather simple and the "biggest" issue for me was lack of systemd knowledge .. maybe I should install Gentoo on it after all ;)

July 04, 2019
Michał Górny a.k.a. mgorny (homepage, bugs)

The recent key poisoning attack on SKS keyservers shook the world of OpenPGP. While this isn’t a new problem, it has not been exploited on this scale before. The attackers have proved how easy it is to poison commonly used keys on the keyservers and effectively render GnuPG unusably slow. A renewed discussion on improving keyservers has started as a result. It also forced Gentoo to employ countermeasures. You can read more on them in the ‘Impact of SKS keyserver poisoning on Gentoo’ news item.

Coincidentally, the attack happened shortly after the launch of keys.openpgp.org, that advertises itself as both poisoning-resistant and GDPR-friendly keyserver. Naturally, many users see it as the ultimate solution to the issues with SKS. I’m afraid I have to disagree — in my opinion, this keyserver does not solve any problems, it merely cripples OpenPGP in order to avoid being affected by them, and harms its security in the process.

In this article, I’d like to shortly explain what the problem is, and which of the different solutions proposed so far to it (e.g. on gnupg-users mailing list) make sense, and which make things even worse. Naturally, I will also cover the new Hagrid keyserver as one of the glorified non-solutions.

The attack — key poisoning

OpenPGP uses a distributed design — once the primary key is created, additional packets can be freely appended to it and recombined on different systems. Those packets include subkeys, user identifiers and signatures. Signatures are used to confirm the authenticity of appended packets. The packets are only meaningful if the client can verify the authenticity of their respective signatures.

The attack is carried through third-party signatures that normally are used by different people to confirm the authenticity of the key — that is, to state that the signer has verified the identity of the key owner. It relies on three distinct properties of OpenPGP:

  1. The key can contain an unlimited number of signatures. After all, it is natural that very old keys will have a large number of signatures made by different people on them.
  2. Anyone can append signatures to any OpenPGP key. This is partially keyserver policy, and partially the fact that SKS keyserver nodes propagate keys one to another.
  3. There is no way to distinguish legitimate signatures from garbage. To put it another way, it is trivial to make garbage signatures look like the real deal.

The attacker abuses those properties by creating a large number of garbage signatures and sending them to keyservers. When users fetch key updates from the keyserver, GnuPG normally appends all those signatures to the local copy. As a result, the key becomes unusually large and causes severe performance issues with GnuPG, preventing its normal usage. The user ends up having to manually remove the key in order to fix the installation.

The obvious non-solutions and potential solutions

Let’s start by analyzing the properties I’ve listed above. After all, removing at least one of the requirements should prevent the attack from being possible. But can we really do that?

Firstly, we could set a hard limit on the number of signatures or on key size. This should obviously prevent the attacker from breaking user systems via huge keys. However, it would make it entirely possible for the attacker to ‘brick’ the key by appending garbage up to the limit. Then it would no longer be possible to append any valid signatures to the key. Users would suffer less but the key owner would lose the ability to use the key meaningfully. It’s a no-go.

Secondly, we could limit key updates to the owner. However, the keyserver update protocol currently does not provide any standard way of verifying who the uploader is, so it would effectively require incompatible changes at least to the upload protocol. Furthermore, in order to prevent malicious keyservers from propagating fake signatures we’d also need to carry the verification along when propagating key updates. This effectively means an extension of the key format, and it has been proposed e.g. in the ‘Abuse-Resistant OpenPGP Keystores’ draft. This is probably a worthwhile option but it will take time before it’s implemented.

Thirdly, we could try to validate signatures. However, any validation can be easily worked around. If we started requiring signing keys to be present on the keyserver, the attackers can simply mass-upload keys used to create garbage signatures. If we went even further and e.g. started requiring verified e-mail addresses for the signing keys, the attackers can simply mass-create e-mail addresses and verify them. It might work as a temporary solution but it will probably cause more harm than good.

There were other non-solutions suggested — most notably, blacklisting poisoned keys. However, this is even worse. It means that every victim of poisoning attack would be excluded from using the keyserver, and in my opinion it will only provoke the attackers to poison even more keys. It may sound like a good interim solution preventing users from being hit but it is rather short-sighted.

keys.openpgp.org / Hagrid — a big non-solution

A common suggestion for OpenPGP users — one that even the Gentoo news item mentions for lack of an alternative — is to switch to the keys.openpgp.org keyserver, or to switch keyservers to their Hagrid software. It is not vulnerable to the key poisoning attack because it strips away all third-party signatures. However, this and other limitations make it a rather poor replacement and, in my opinion, potentially harmful to the security of OpenPGP.

Firstly, stripping all third-party signatures is not a solution. It simply avoids the problem by killing a very important portion of OpenPGP protocol — the Web of Trust. Without it, the keys obtained from the server can not be authenticated otherwise than by direct interaction between the individuals. For example, Gentoo Authority Keys can’t work there. Most of the time, you won’t be able to tell whether the key on keyserver is legitimate or forged.

The e-mail verification makes it even worse, though not intentionally. While I agree that many users do not understand or use WoT, Hagrid is implicitly going to cause users to start relying on e-mail verification as proof of key authenticity. In other words, people are going to assume that if a key on keys.openpgp.org has verified e-mail address, it has to be legitimate. This makes it trivial for an attacker that manages to gain unauthorized access to the e-mail address or the keyserver to publish a forged key and convince others to use it.

Secondly, Hagrid does not support UID revocations. This is an entirely absurd case where GDPR fear won over security. If your e-mail address becomes compromised, you will not be able to revoke it. Sure, the keyserver admins may eventually stop propagating it along with your key, but all users who fetched the key before will continue seeing it as a valid UID. Of course, if users send encrypted mail the attacker won’t be able to read it. However, the users can be trivially persuaded to switch to a new, forged key.

Thirdly, Hagrid rejects all UIDs except for verified e-mail-based UIDs. This is something we could live with if key owners actively pursue having their identities verified. However, this also means you can’t publish a photo identity or use keybase.io. The ‘explicit consent’ argument used by upstream is rather silly — apparently every UID requires separate consent, while at the same time you can trivially share somebody else’s PII as the real name of a valid e-mail address.

Apparently, upstream is willing to resolve the first two of those issues once satisfactory solutions are established. However, this doesn’t mean that it’s fine to ignore those problems. Until they are resolved, and necessary OpenPGP client updates are sufficiently widely deployed, I don’t believe Hagrid or its instance at keys.openpgp.org are good replacements for SKS and other keyservers.

So what are the solutions?

Sadly, I am not aware of any good global solution at the moment. The best workaround for GnuPG users so far is the new self-sigs-only option that prevents it from importing third-party signatures. Of course, it shares the first limitation of the Hagrid keyserver. Future versions of GnuPG will supposedly fall back to this option upon meeting excessively large keys.
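For GnuPG this boils down to a one-line configuration change. The sketch below writes it to an example file rather than your live config, since option availability depends on the GnuPG version (self-sigs-only appeared in 2.2.17, as far as I know; check your version's manual):

```shell
# write the workaround into an example config; in practice this line goes
# into ~/.gnupg/gpg.conf (self-sigs-only drops third-party signatures on
# keyserver fetches, import-clean removes unusable packets)
cat >> gpg.conf.example <<'EOF'
keyserver-options self-sigs-only,import-clean
EOF
cat gpg.conf.example
```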

For domain-limited use cases such as Gentoo’s, running a local keyserver with restricted upload access is an option. However, it requires users to explicitly specify our keyserver, and effectively end up having to specify multiple different keyservers for each domain. Furthermore, WKD can be used to distribute keys. Sadly, at the moment GnuPG uses it only to locate new keys and does not support refreshing keys via WKD (gemato employs a cheap hack to make it happen). In both cases, the attack is prevented via isolating the infrastructure and preventing public upload access.

The long-term solution probably lies in the ‘First-party-attested Third-party Certifications’ section of the ‘Abuse-Resistant OpenPGP Keystores’ draft. In this proposal, every third-party signature must be explicitly attested by the key owner. Therefore, only the key owner can append additional signatures to the key, and keyservers can reject any signatures that were not attested. However, this is not currently supported by GnuPG, and once it is, deploying it will most likely take significant time.

July 03, 2019
Marek Szuba a.k.a. marecki (homepage, bugs)
Case label for Pocket Science Lab V5 (July 03, 2019, 17:28 UTC)

tl;dr: Here (PDF, 67 kB) is a case label for Pocket Science Lab version 5 that is compatible with the design for a laser-cut case published by FOSSAsia.

In case you haven’t heard about it, Pocket Science Lab [1] is a really nifty board developed by the FOSSAsia community which combines a multichannel, megahertz-range oscilloscope, a multimeter, a logic probe, several voltage sources and a current source, several wave generators, UART and I2C interfaces… and all of this in the form factor of an Arduino Mega, i.e. only somewhat larger than that of a credit card. Hook it up over USB to a PC or an Android device running the official (free and open source, of course) app and you are all set.

Well, not quite set yet. What you get for your 50-ish EUR is just the board itself. You will quite definitely need a set of probe cables (sadly, I have yet to find even an unofficial adaptor allowing one to equip PSLab with standard industry oscilloscope probes using BNC connectors) but if you expect to lug yours around anywhere you go, you will quite definitely want to invest in a case of some sort. While FOSSAsia does not to my knowledge sell PSLab cases, they provide a design for one [2]. It is meant to be laser-cut but I have successfully managed to 3D-print it as well, and for the more patient among us it shouldn’t be too difficult to hand-cut one with a jigsaw either.

Of course in addition to making sure your Pocket Science Lab is protected against accidental damage it would also be nice to have all the connectors clearly labelled. Documentation bundled with PSLab software does show not a few “how to connect instrument X” diagrams but unfortunately said diagrams picture a version 4 of the board and the current major version, V5, features a radically different pinout (compare [3] with [4]/[5] and you will see immediately what I mean), not to mention that having to stare at a screen while wiring your circuit isn’t always optimal. Now, all versions of the board feature a complete set of header labels (along with LEDs showing the device is active) on the front side and at least the more recent ones additionally show more detailed descriptions on the back, clearly suggesting the optimal way to go is to make your case out of transparent material. But what if looking at the provided labels directly is not an option, for instance because you have gone eco-friendly and made your case out of wood? Probably stick a label to the front of the case… which brings us back to the problem of the case label from [5] not being compatible with recent versions of the board.

Which brings me to my take on adapting the design from [5] to match the header layout and labels of PSLab V5.1 as well as the laser-cut case design from [2]. It could probably be more accurate but having tried it out, it is close enough. Bluetooth and ICSP-programmer connectors near the centre of the board are not included because the current case design does not provide access to them and indeed, they haven’t even got headers soldered in. Licence and copyright: same as the original.






Impact of SKS keyserver poisoning on Gentoo (July 03, 2019, 00:00 UTC)

The SKS keyserver network has been a victim of certificate poisoning attack lately. The OpenPGP verification used for repository syncing is protected against the attack. However, our users can be affected when using GnuPG directly. In this post, we would like to shortly summarize what the attack is, what we did to protect Gentoo against it and what can you do to protect your system.

The certificate poisoning attack abuses three facts: that OpenPGP keys can contain an unlimited number of signatures, that anyone can append signatures to any key, and that there is no way to distinguish a legitimate signature from garbage. The attackers are appending a large number of garbage signatures to keys stored on SKS keyservers, causing them to become very large, which in turn causes severe performance issues in GnuPG clients that fetch them.

The attackers have poisoned the keys of a few high ranking OpenPGP people on the SKS keyservers, including one Gentoo developer. Furthermore, the current expectation is that the problem won’t be fixed any time soon, so it seems plausible that more keys may be affected in the future. We recommend users not to fetch or refresh keys from SKS keyserver network (this includes aliases such as keys.gnupg.net) for the time being. GnuPG upstream is already working on client-side countermeasures and they can be expected to enter Gentoo as soon as they are released.

The Gentoo key infrastructure has not been affected by the attack. Shortly after it was reported, we have disabled fetching developer key updates from SKS and today we have disabled public key upload access to prevent the keys stored on the server from being poisoned by a malicious third party.

The gemato tool used to verify the Gentoo ebuild repository uses WKD by default. During normal operation it should not be affected by this vulnerability. Gemato has a keyserver fallback that might be vulnerable if WKD fails; however, gemato operates in an isolated environment that will prevent a poisoned key from causing permanent damage to your system. In the worst case, Gentoo repository syncs will be slow or hang.

The webrsync and delta-webrsync methods also support gemato, although it is not used by default at the moment. In order to use it, you need to remove PORTAGE_GPG_DIR from /etc/portage/make.conf (if it is present) and put the following values into /etc/portage/repos.conf:

[gentoo]
sync-type = webrsync
sync-webrsync-delta = true  # false to use plain webrsync
sync-webrsync-verify-signature = true

Afterwards, calling emerge --sync or emaint sync --repo gentoo will use gemato key management rather than the vulnerable legacy method. The default is going to be changed in a future release of Portage.

When using GnuPG directly, Gentoo developer and service keys can be securely fetched (and refreshed) via:

  1. Web Key Directory, e.g. gpg --locate-key developer@gentoo.org
  2. Gentoo keyserver, e.g. gpg --keyserver hkps://keys.gentoo.org ...
  3. Key bundles, e.g.: active devs, service keys

Please note that the aforementioned services provide only keys specific to Gentoo. Keys belonging to other people will not be found on our keyserver. If you are looking for them, you may try keys.openpgp.org keyserver that is not vulnerable to the attack, at the cost of stripping all signatures and unverified UIDs.

July 02, 2019
Andreas K. Hüttel a.k.a. dilfridge (homepage, bugs)

The Nature Index 2019 Annual Tables have been published, and there is a valuable new addition: the tables now include a "normalized ranking", where the quality of a university's research output, and not its quantity counts. If we look at the world-wide natural sciences ranking, University of Regensburg is at spot 44, best of all universities in Germany, and in a similar ranking range as, e.g., University of Oxford, University of Tokyo, or University of California San Francisco! Cheers and congratulations!

July 01, 2019
Luca Barbato a.k.a. lu_zero (homepage, bugs)

I presented cargo-c at rustlab 2019; here is a longer follow-up to that talk.

Mixing Rust and C

One of the best selling points for rust is being highly interoperable with the C ABI, in addition to its safety, speed and amazing community.

This comes really handy when you have well optimized hand-crafted asm kernels you’d like to use as-they-are:

  • They are small and with a clear interface, usually strict boundaries on what they read/write by their own nature.
  • Rewriting them in rust would basically mean reproducing them as they are using some inline assembly, for dubious gains.
  • Both cc-rs and nasm-rs make the process of building and linking relatively painless.

Also, if you plan to integrate in a foreign language project some rust component, it is quite straightforward to link the staticlib produced by cargo in your main project.

If you have a pure-rust crate and you want to export it to the world as if it were a normal C (shared/dynamic) library, it gets quite gory.

Well behaved C-API Library structure

Usually when you want to use a C-library in your own project you should expect it to provide the following:

  • A header file, telling the compiler which symbols it should expect
  • A static library
  • A dynamic library
  • A pkg-config file giving you directions on where to find the header and what you need to pass to the linker to correctly link the library, be it static or dynamic

Header file

In C you usually keep a list of function prototypes and type definitions in a separate file and then embed it in your source file to let the compiler know what to expect.

Since you rely on a quite simple preprocessor to do that, you have to be careful about adding guards so the file does not get included more than once; and, in order to avoid clashes, you install it in a subdirectory of your include dir.

Since the location of the header might not be part of the default search path, you usually store this information in pkg-config.

Static Libraries

Static libraries are quite simple in concept (and execution):

  • they are an archive of object code files.
  • the linker simply reads them as it would read freshly produced .o files and links everything together.

There is a pitfall though:

  • On some platforms, even if you want to make a fully static binary, you end up dynamically linking some system libraries for a number of reasons.

    The worst offenders are the pthread libraries and, in some cases, the compiler builtins (e.g. libgcc_s).

  • Which libraries those are is usually not known to the consumer.

rustc comes to the rescue with --print native-static-libs. It isn’t the best example of integration, since the list is printed to stderr as a side effect of the actual build, but it is still a good step in the right direction.

pkg-config is the de-facto standard way to preserve this information and let the build systems know about it (I guess you are seeing a pattern now).

Dynamic Libraries

A shared or dynamic library is a specially crafted lump of executable code that gets linked to the binary as it is being executed.
The advantages compared to statically linking everything are mainly two:

  • Sparing disk space: without link-time pruning you end up carrying multiple copies of the same library in every binary using it.
  • Safer and simpler updates: if you need to update, say, openssl, you do that once, instead of updating the 100+ consumers of it existing in your system.

There is some inherent complexity, and there are constraints to respect in order to get this feature right; the most problematic one is ABI stability:

  • The dynamic linker needs to find the symbols the binary expects, with the correct sizes
  • If you change the in-memory layout of a struct or how the function names are represented, you should make sure the linker is aware of it.

Usually that means that, depending on your platform, there is some versioning information you should provide when preparing your library. This can be as simple as telling the compile-time linker to embed the version information in the library (e.g. Mach-O dylib or ELF), or as complex as crafting a version script.

Compared to crafting a staticlib, there are more moving parts and more platform-specific knowledge involved.

Sadly, in this case rustc does not provide any help for now: even if the C ABI is stable and set in stone, the rust mangling strategy is not finalized yet, and that is a large part of being ABI stable, so the work on fully supporting dynamic libraries is yet to be completed.

Dynamic libraries on most platforms have a means to store which other dynamic libraries they rely on and which paths to search for them. When the information is incomplete, or you are storing the library in a non-standard path, pkg-config comes to the rescue again, helpfully storing the information for you.


Pkg-config
It is your single point of truth, as long as your build system supports it and the libraries you want to use craft it properly.
It simplifies your life a lot if you want to keep multiple versions of a library around or if you are doing non-system packaging (e.g. Homebrew or Gentoo Prefix).
Besides the search path, link line and dependency information I mentioned above, it also stores the library version and inter-library compatibility relationships.
If you are publishing a C library and you aren’t providing a .pc file, please consider doing it.
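As an illustration, a minimal .pc file for a hypothetical library called example (the names, paths and versions here are all made up) could look like:

```
prefix=/usr/local
exec_prefix=${prefix}
libdir=${exec_prefix}/lib
includedir=${prefix}/include

Name: example
Description: A hypothetical C-compatible library
Version: 1.0.0
Libs: -L${libdir} -lexample
Libs.private: -lpthread
Cflags: -I${includedir}/example
```

A build system would then resolve it with something like pkg-config --cflags --libs example, with Libs.private covering the extra libraries needed for static linking.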

Producing a C-compatible library out of a crate

I explained what we are expected to produce; now let’s see what we can do on the rust side:

  • We need to export C-ABI-compatible symbols, which means we have to:
  • Decorate the data types we want to export with #[repr(C)]
  • Decorate the functions with #[no_mangle] and declare them extern "C"
  • Tell rustc the crate type is both staticlib and cdylib
  • Pass rustc the platform-correct link line so the library produced has the right information inside.
    > NOTE: In some platforms beside the version information also the install path must be encoded in the library.
  • Generate the header file so that the C compiler knows about them.
  • Produce a pkg-config file with the correct information

    NOTE: It requires knowing where the whole lot will be eventually installed.
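As a sketch, the decoration steps above look like this in rust (the type and function names are illustrative, not any real crate’s API):

```rust
// Illustrative example of exporting a C-ABI-compatible surface from rust.

// A struct with a C-compatible memory layout.
#[repr(C)]
pub struct Version {
    pub major: u32,
    pub minor: u32,
}

// An unmangled function callable from C as `example_version`.
#[no_mangle]
pub extern "C" fn example_version() -> Version {
    Version { major: 1, minor: 2 }
}

fn main() {
    // Nothing stops rust code from calling the exported function too.
    let v = example_version();
    println!("{}.{}", v.major, v.minor);
}
```

Building the same items with --crate-type staticlib or cdylib is what makes the symbols available to C consumers; cbindgen can then derive the matching header from the decorated items.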

cargo does not support installing libraries at all (since, for now, rust dynamic libraries should not be used at all), so we are a bit on our own.

For rav1e I did that the hard way, and then I came up with an easier way for you to use (and that I used to do the same again with lewton, spending about half a day instead of several ~~weeks~~ months).

The hard way

As seen in crav1e, you can explore the history there.

It isn’t the fully hard way, since before cargo-c there were already nice tools to avoid some of the time-consuming tasks: cbindgen.
In a terse summary, what I had to do was:

  • Come up with an external build system, since cargo itself cannot install anything nor have direct knowledge of the install path information. I used Make since it is simple and sufficiently widespread; anything richer would probably get in the way and be more time-consuming to set up.
  • Figure out how to extract the information provided in Cargo.toml so I would have it at the Makefile level. I gave up and duplicated it, since parsing toml or json is pointlessly complicated for a prototype.
  • Write down the platform-specific logic on how to build (and install) the libraries. It ended up living in the build.rs and the Makefile. Thanks again to Derek for taking care of the Windows-specific details.
  • Use cbindgen to generate the C header (and in the process smooth some of its rough edges).
  • Since we already have a build system, add more targets for testing and continuous integration purposes.

If you do not want to use cargo-c, I spun the cdylib-link-line logic off into a stand-alone crate so you can use it in your build.rs.

The easier way

Using a Makefile and a separate crate with a customized build.rs works fine and keeps the developers that care just about writing rust fully shielded from the gory details and contraptions presented above.

But it comes with some additional churn:

  • Keeping the API in sync
  • Duplicate the release work
  • Users get confused about where to report issues or where to find the actual sources. (Users tend to miss the information presented in obvious places such as the README way too often.)

So, to try to minimize it, I came up with a cargo applet that provides two subcommands:

  • cbuild to build the libraries, the .pc file and header.
  • cinstall to install the whole lot if already built, or to build and then install it.

They are two subcommands, since it is quite common to build as a user and then install as root. If you are using rustup and root does not have cargo, you can get away with using --destdir and then sudo install, or craft your local package if your distribution provides a means to do that.

All I mentioned in the hard way happens under the hood and, besides bugs in the current implementation, you should be completely oblivious of the details.

Using cargo-c

As seen in lewton and rav1e.

  • Create a capi.rs with the C-API you want to expose and use #[cfg(cargo_c)] to hide it when you build a normal rust library.
  • Make sure you have a lib target and, if you are using a workspace, that the first member is the crate you want to export; that means you might have to add a "." member at the start of the list.
  • Remember to add a cbindgen.toml and fill it with at least the include guard; you probably also want to set the language to C (it defaults to C++).
  • Once you are happy with the result, update your documentation to tell the user to install cargo-c and run cargo cinstall --prefix=/usr --destdir=/tmp/some-place or something along those lines.
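The capi.rs gating from the first step can be sketched like this (example_init is a made-up name; a real capi.rs would wrap your crate’s actual API):

```rust
// When built normally the C API below is compiled out entirely;
// cargo-c builds pass --cfg cargo_c to enable it.
#[cfg(cargo_c)]
pub mod capi {
    // A hypothetical C entry point wrapping the crate's rust API.
    #[no_mangle]
    pub extern "C" fn example_init() -> i32 {
        0
    }
}

fn main() {
    // cfg!(cargo_c) is false in a plain `cargo build` / `rustc` invocation.
    println!("cargo_c enabled: {}", cfg!(cargo_c));
}
```

This way consumers of the crate as a plain rust library never see the unsafe C surface.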

Coming next

cargo-c is a young project and far from complete, even if it is functional for my needs.

Help in improving it is welcome, there are plenty of rough edges and bugs to find and squash.


Thanks to est31 and sdroege for the in-depth review in #rust-av and kodabb for the last minute edits.

June 26, 2019
Sergei Trofimovich a.k.a. slyfox (homepage, bugs)
An old linux kernel tty/vt bug (June 26, 2019, 00:00 UTC)


This post is another one in a series on obscure bugs. This time an elusive bug manifested on my desktop for years until it was pinned down by luck.

The Bug

The bug initially manifested in a very magical way: I would boot up my desktop, start a window manager, use it for a week, and then at some point, when I pressed Ctrl-F1, my machine would reboot gracefully. System logs said I pressed the power button. I did not though :)

That kept happening once every few months, and it was very hard to say what had changed.

I was not sure how to debug that. My only clue was the following message in boot logs:

Mar 29 19:22:42 sf systemd-logind[413]: Power key pressed
<graceful shutdown goes here>

To work around the effect I made poweroff a no-op in systemd. I never use the “power” button.

The patch still kept messages popping up in the logs but did not shutdown my machine any more. This allowed me to track frequency of these events without distracting actual work on the machine.

But how would one track this down to a faulty component? Was it my hardware (keyboard, USB host, etc.) losing its mind for a second, or some obscure software bug?

I tried to track it down backwards from “Power key pressed” in systemd to the source that generated the event.

Apparently all systemd does is read the /dev/input/event&lt;N&gt; devices for power keypresses and react accordingly. That means the kernel itself sends those signals as code=KEY_POWER and code=KEY_POWER2 values of struct input_event. I was not able to trace it down to my keyboard driver at that time.

The clue

A few years passed. I forgot about the local systemd patch.

And one day I got very scary kernel backtraces when my system booted:

Apr 29 13:12:24 sf kernel: BUG: unable to handle kernel paging request at ffffa39b3b117000
Apr 29 13:12:24 sf kernel: #PF error: [PROT] [WRITE]
Apr 29 13:12:24 sf kernel: PGD 5e4a01067 P4D 5e4a01067 PUD 5e4a06067 PMD 7f7d0f063 PTE 80000007fb117161
Apr 29 13:12:24 sf kernel: Oops: 0003 [#1] PREEMPT SMP
Apr 29 13:12:24 sf kernel: CPU: 7 PID: 423 Comm: loadkeys Tainted: G         C        5.1.0-rc7 #98
Apr 29 13:12:24 sf kernel: Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M-D3H, BIOS F12 11/14/2013
Apr 29 13:12:24 sf kernel: RIP: 0010:__memmove+0x81/0x1a0
Apr 29 13:12:24 sf kernel: Code: 4c 89 4f 10 4c 89 47 18 48 8d 7f 20 73 d4 48 83 c2 20 e9 a2 00 00 00 66 90 48 89 d1 4c 8b 5c 16 f8 4c 8d 54 17 f8 48 c1 e9 03 <f3> 48 a5 4d 89 1a e9 0c 01 00 00 0f 1f 40 00 48 89 d1 4c $
Apr 29 13:12:24 sf kernel: RSP: 0018:ffffc0c3c0c7fd08 EFLAGS: 00010203
Apr 29 13:12:24 sf kernel: RAX: ffffa39b39c9b08c RBX: 0000000000000019 RCX: 00000b8c90633fcb
Apr 29 13:12:24 sf kernel: RDX: 00005c648461bdcd RSI: ffffa39b3b116ffc RDI: ffffa39b3b116ffc
Apr 29 13:12:24 sf kernel: RBP: ffffa39b3ac04400 R08: ffffa39b3b802f00 R09: 00000000fffff73b
Apr 29 13:12:24 sf kernel: R10: ffffffffbe2b6e51 R11: 00505b1b004d5b1b R12: 0000000000000000
Apr 29 13:12:24 sf kernel: R13: ffffa39b39c9b087 R14: 0000000000000018 R15: ffffa39b39c9b08c
Apr 29 13:12:24 sf kernel: FS:  00007f84c341e580(0000) GS:ffffa39b3f1c0000(0000) knlGS:0000000000000000
Apr 29 13:12:24 sf kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 29 13:12:24 sf kernel: CR2: ffffa39b3b117000 CR3: 00000007e9d42003 CR4: 00000000000606e0
Apr 29 13:12:24 sf kernel: Call Trace:
Apr 29 13:12:24 sf kernel:  vt_do_kdgkb_ioctl+0x352/0x450
Apr 29 13:12:24 sf kernel:  vt_ioctl+0xba3/0x1190
Apr 29 13:12:24 sf kernel:  ? __bpf_prog_run32+0x39/0x60
Apr 29 13:12:24 sf kernel:  ? trace_hardirqs_on+0x31/0xe0
Apr 29 13:12:24 sf kernel:  tty_ioctl+0x23f/0x920
Apr 29 13:12:24 sf kernel:  ? preempt_count_sub+0x98/0xe0
Apr 29 13:12:24 sf kernel:  ? __seccomp_filter+0xc2/0x450
Apr 29 13:12:24 sf kernel:  ? __handle_mm_fault+0x7b0/0x1530
Apr 29 13:12:24 sf kernel:  do_vfs_ioctl+0xa2/0x6a0
Apr 29 13:12:24 sf kernel:  ? syscall_trace_enter+0x126/0x280
Apr 29 13:12:24 sf kernel:  ksys_ioctl+0x3a/0x70
Apr 29 13:12:24 sf kernel:  __x64_sys_ioctl+0x16/0x20
Apr 29 13:12:24 sf kernel:  do_syscall_64+0x54/0xe0
Apr 29 13:12:24 sf kernel:  entry_SYSCALL_64_after_hwframe+0x49/0xbe
Apr 29 13:12:24 sf kernel: RIP: 0033:0x7f84c334a3b7
Apr 29 13:12:24 sf kernel: Code: 00 00 00 75 0c 48 c7 c0 ff ff ff ff 48 83 c4 18 c3 e8 dd d2 01 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a9 ca 0c 00 f7 d8 64 $
Apr 29 13:12:24 sf kernel: RSP: 002b:00007ffed2cc88f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Apr 29 13:12:24 sf kernel: RAX: ffffffffffffffda RBX: 0000000000000018 RCX: 00007f84c334a3b7
Apr 29 13:12:24 sf kernel: RDX: 00007ffed2cc8910 RSI: 0000000000004b49 RDI: 0000000000000003
Apr 29 13:12:24 sf kernel: RBP: 00007ffed2cc8911 R08: 00007f84c3417c40 R09: 0000561cb25db4a0
Apr 29 13:12:24 sf kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 0000561cb25d32b0
Apr 29 13:12:24 sf kernel: R13: 00007ffed2cc8910 R14: 0000000000000018 R15: 0000000000000003
Apr 29 13:12:24 sf kernel: Modules linked in: sit tunnel4 ip_tunnel snd_hda_codec_hdmi snd_hda_codec_via snd_hda_codec_generic snd_hda_intel snd_hda_codec r8712u(C) snd_hwdep ath9k_htc snd_hda_core ath9k_common ath9k_h$
Apr 29 13:12:24 sf kernel: CR2: ffffa39b3b117000
Apr 29 13:12:24 sf kernel: ---[ end trace 9c4dbd36dd993d54 ]---
Apr 29 13:12:24 sf kernel: RIP: 0010:__memmove+0x81/0x1a0
Apr 29 13:12:24 sf kernel: Code: 4c 89 4f 10 4c 89 47 18 48 8d 7f 20 73 d4 48 83 c2 20 e9 a2 00 00 00 66 90 48 89 d1 4c 8b 5c 16 f8 4c 8d 54 17 f8 48 c1 e9 03 <f3> 48 a5 4d 89 1a e9 0c 01 00 00 0f 1f 40 00 48 89 d1 4c $
Apr 29 13:12:24 sf kernel: RSP: 0018:ffffc0c3c0c7fd08 EFLAGS: 00010203
Apr 29 13:12:24 sf kernel: RAX: ffffa39b39c9b08c RBX: 0000000000000019 RCX: 00000b8c90633fcb
Apr 29 13:12:24 sf kernel: RDX: 00005c648461bdcd RSI: ffffa39b3b116ffc RDI: ffffa39b3b116ffc
Apr 29 13:12:24 sf kernel: RBP: ffffa39b3ac04400 R08: ffffa39b3b802f00 R09: 00000000fffff73b
Apr 29 13:12:24 sf kernel: R10: ffffffffbe2b6e51 R11: 00505b1b004d5b1b R12: 0000000000000000
Apr 29 13:12:24 sf kernel: R13: ffffa39b39c9b087 R14: 0000000000000018 R15: ffffa39b39c9b08c
Apr 29 13:12:24 sf kernel: FS:  00007f84c341e580(0000) GS:ffffa39b3f1c0000(0000) knlGS:0000000000000000
Apr 29 13:12:24 sf kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 29 13:12:24 sf kernel: CR2: ffffa39b3b117000 CR3: 00000007e9d42003 CR4: 00000000000606e0
Apr 29 13:12:24 sf kernel: BUG: sleeping function called from invalid context at include/linux/percpu-rwsem.h:34
Apr 29 13:12:24 sf kernel: in_atomic(): 0, irqs_disabled(): 1, pid: 423, name: loadkeys
Apr 29 13:12:24 sf kernel: CPU: 7 PID: 423 Comm: loadkeys Tainted: G      D  C        5.1.0-rc7 #98
Apr 29 13:12:24 sf kernel: Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M-D3H, BIOS F12 11/14/2013
Apr 29 13:12:24 sf kernel: Call Trace:
Apr 29 13:12:24 sf kernel:  dump_stack+0x67/0x90
Apr 29 13:12:24 sf kernel:  ? wake_up_klogd+0x10/0x70
Apr 29 13:12:24 sf kernel:  ___might_sleep.cold.18+0xd4/0xe4
Apr 29 13:12:24 sf kernel:  exit_signals+0x1c/0x200
Apr 29 13:12:24 sf kernel:  do_exit+0xa8/0xbb0
Apr 29 13:12:24 sf kernel:  ? ksys_ioctl+0x3a/0x70
Apr 29 13:12:24 sf kernel:  rewind_stack_do_exit+0x17/0x20

These backtraces did not prevent the machine from booting and did not seem to cause any immediate ill effect. But they still looked very scary: something failed to copy data somewhere after all, and that meant certain corruption.

This trace says that the loadkeys program managed to crash the kernel by calling an ioctl syscall (__x64_sys_ioctl) and that the crash happens somewhere in the memmove() function.

Sounds like a very strange bug to have. What could loadkeys do that is so complicated it gets the kernel confused? Its whole source is 200 lines. Well, the actual key loading happens here via ioctl(KDSKBMODE) and ioctl(KDSKBENT).

Searching the internet for __memmove+loadkeys shows that people have been occasionally seeing these crashes since at least 2009 (kernel 4.1). I encountered no conclusive investigations and dived in.

The backtrace above suggests crash happened somewhere at vt_do_kdgkb_ioctl():

/* FIXME: This one needs untangling and locking */
int vt_do_kdgkb_ioctl(int cmd, struct kbsentry __user *user_kdgkb, int perm)
{
	struct kbsentry *kbs;
	char *p;
	u_char *q;
	u_char __user *up;
	int sz;
	int delta;
	char *first_free, *fj, *fnw;
	int i, j, k;
	int ret;

	if (!capable(CAP_SYS_TTY_CONFIG))
		perm = 0;

	kbs = kmalloc(sizeof(*kbs), GFP_KERNEL);
	if (!kbs) {
		ret = -ENOMEM;
		goto reterr;
	}

	/* we mostly copy too much here (512bytes), but who cares ;) */
	if (copy_from_user(kbs, user_kdgkb, sizeof(struct kbsentry))) {
		ret = -EFAULT;
		goto reterr;
	}
	kbs->kb_string[sizeof(kbs->kb_string) - 1] = '\0';
	i = kbs->kb_func;

	switch (cmd) {
	case KDGKBSENT:
		sz = sizeof(kbs->kb_string) - 1;	/* sz should have been
							   a struct member */
		up = user_kdgkb->kb_string;
		p = func_table[i];
		if (p)
			for (; *p && sz; p++, sz--)
				if (put_user(*p, up++)) {
					ret = -EFAULT;
					goto reterr;
				}
		if (put_user('\0', up)) {
			ret = -EFAULT;
			goto reterr;
		}
		kfree(kbs);
		return ((p && *p) ? -EOVERFLOW : 0);
	case KDSKBSENT:
		if (!perm) {
			ret = -EPERM;
			goto reterr;
		}

		q = func_table[i];
		first_free = funcbufptr + (funcbufsize - funcbufleft);
		for (j = i + 1; j < MAX_NR_FUNC && !func_table[j]; j++) ;
		if (j < MAX_NR_FUNC)
			fj = func_table[j];
		else
			fj = first_free;

		delta = (q ? -strlen(q) : 1) + strlen(kbs->kb_string);
		if (delta <= funcbufleft) {	/* it fits in current buf */
			if (j < MAX_NR_FUNC) {
				memmove(fj + delta, fj, first_free - fj);
				for (k = j; k < MAX_NR_FUNC; k++)
					if (func_table[k])
						func_table[k] += delta;
			}
			if (!q)
				func_table[i] = fj;
			funcbufleft -= delta;
		} else {	/* allocate a larger buffer */
			sz = 256;
			while (sz < funcbufsize - funcbufleft + delta)
				sz <<= 1;
			fnw = kmalloc(sz, GFP_KERNEL);
			if (!fnw) {
				ret = -ENOMEM;
				goto reterr;
			}

			if (!q)
				func_table[i] = fj;
			if (fj > funcbufptr)
				memmove(fnw, funcbufptr, fj - funcbufptr);
			for (k = 0; k < j; k++)
				if (func_table[k])
					func_table[k] =
					    fnw + (func_table[k] - funcbufptr);

			if (first_free > fj) {
				memmove(fnw + (fj - funcbufptr) + delta, fj,
					first_free - fj);
				for (k = j; k < MAX_NR_FUNC; k++)
					if (func_table[k])
						func_table[k] =
						    fnw + (func_table[k] -
							   funcbufptr) + delta;
			}
			if (funcbufptr != func_buf)
				kfree(funcbufptr);
			funcbufptr = fnw;
			funcbufleft = funcbufleft - delta + sz - funcbufsize;
			funcbufsize = sz;
		}
		strcpy(func_table[i], kbs->kb_string);
		break;
	}
	ret = 0;
reterr:
	kfree(kbs);
	return ret;
}

It’s a huge function but it’s high-level purpose is simple:

  • handle ioctl(KDGKBSENT) call (Get KeyBoard Entries)
  • handle ioctl(KDSKBSENT) call (Set KeyBoard Entries)

Entries are struct kbsentry:
struct kbsentry {
	unsigned char kb_func;
	unsigned char kb_string[512];
};

All it does is substitute the input char kb_func with a sequence of chars, kb_string (these can be escape sequences understood by the linux terminal).

The KDSKBSENT handler above is full of array-handling logic. To understand it we need to look at the actual data structures in drivers/tty/vt/defkeymap.c_shipped:

/* Do not edit this file! It was automatically generated by   */
/*    loadkeys --mktable defkeymap.map > defkeymap.c          */

#include <linux/types.h>
#include <linux/keyboard.h>
#include <linux/kd.h>


/*
 * Philosophy: most people do not define more strings, but they who do
 * often want quite a lot of string space. So, we statically allocate
 * the default and allocate dynamically in chunks of 512 bytes.
 */

char func_buf[] = {
 '\033', '[', '[', 'A', 0, 
 '\033', '[', '[', 'B', 0, 
 '\033', '[', '[', 'C', 0, 
 '\033', '[', '[', 'D', 0, 
 '\033', '[', '[', 'E', 0, 
 '\033', '[', '1', '7', '~', 0, 
 '\033', '[', '1', '8', '~', 0, 
 '\033', '[', '1', '9', '~', 0, 
 '\033', '[', '2', '0', '~', 0, 
 '\033', '[', '2', '1', '~', 0, 
 '\033', '[', '2', '3', '~', 0, 
 '\033', '[', '2', '4', '~', 0, 
 '\033', '[', '2', '5', '~', 0, 
 '\033', '[', '2', '6', '~', 0, 
 '\033', '[', '2', '8', '~', 0, 
 '\033', '[', '2', '9', '~', 0, 
 '\033', '[', '3', '1', '~', 0, 
 '\033', '[', '3', '2', '~', 0, 
 '\033', '[', '3', '3', '~', 0, 
 '\033', '[', '3', '4', '~', 0, 
 '\033', '[', '1', '~', 0, 
 '\033', '[', '2', '~', 0, 
 '\033', '[', '3', '~', 0, 
 '\033', '[', '4', '~', 0, 
 '\033', '[', '5', '~', 0, 
 '\033', '[', '6', '~', 0, 
 '\033', '[', 'M', 0, 
 '\033', '[', 'P', 0, 
};

char *funcbufptr = func_buf;
int funcbufsize = sizeof(func_buf);
int funcbufleft = 0;          /* space left */

char *func_table[MAX_NR_FUNC] = {
 func_buf + 0,
 func_buf + 5,
 func_buf + 10,
 func_buf + 15,
 func_buf + 20,
 func_buf + 25,
 func_buf + 31,
 func_buf + 37,
 func_buf + 43,
 func_buf + 49,
 func_buf + 55,
 func_buf + 61,
 func_buf + 67,
 func_buf + 73,
 func_buf + 79,
 func_buf + 85,
 func_buf + 91,
 func_buf + 97,
 func_buf + 103,
 func_buf + 109,
 func_buf + 115,
 func_buf + 120,
 func_buf + 125,
 func_buf + 130,
 func_buf + 135,
 func_buf + 140,
 func_buf + 145,
 func_buf + 149,
};

Here we can see that func_buf is a statically allocated, flattened array of the default keymaps. The func_table array of pointers is a fast lookup table into the flat func_buf array. If func_buf does not have enough space, a larger buffer gets allocated and funcbufptr points at it.

That’s why vt_do_kdgkb_ioctl() is so complicated: it patches and updates all these offsets.

Also note: func_buf and funcbufptr are both globals, without any locking around them (as also stressed by the FIXME above).

This is our somewhat smoking gun: if something in my system happens to call ioctl(KDSKBSENT) in parallel on multiple CPUs, it can mess func_table up into something that does not make sense. That can lead to strange things when you press those keys!

The only problem was that normally you have only one loadkeys run, for a short time, when your system boots. Nothing else should be touching keymaps at that time anyway (or after).

Into the rabbit hole

To validate the race theory I added a debug statement to the vt_do_kdgkb_ioctl() function to see who calls it at boot:

Feb 24 12:06:35 sf systemd-vconsole-setup[343]: Executing "/usr/bin/loadkeys -q -C /dev/tty1 -u ru4"...
Feb 24 12:06:35 sf systemd-vconsole-setup[344]: /usr/bin/setfont succeeded.
Feb 24 12:06:35 sf systemd-vconsole-setup[344]: Executing "/usr/bin/loadkeys -q -C /dev/tty1 -u ru4"...
Feb 24 12:06:35 sf systemd-vconsole-setup[343]: Successfully forked off '(loadkeys)' as PID 423.
Feb 24 12:06:35 sf systemd-vconsole-setup[344]: Successfully forked off '(loadkeys)' as PID 424.
Feb 24 12:06:35 sf kernel: In vt_do_kdgkb_ioctl(19273=KDSKBSENT)/cpu=5/comm=loadkeys(424)
Feb 24 12:06:35 sf kernel: In vt_do_kdgkb_ioctl(19273=KDSKBSENT)/cpu=2/comm=loadkeys(423)
<more of these with interleaved PIDs>

Bingo: systemd was running exactly two instances of loadkeys at the same time: loadkeys(424) and loadkeys(423). It’s an ideal way to trigger the race: two processes are likely blocked by IO as they are executed for the first time from disk, and once unblocked they execute exactly the same code in parallel, instruction for instruction.

But why does systemd run loadkeys twice? Why not once, or as many times as I have ttys?

For many systems it’s supposed to happen only once. See the 90-vconsole.rules udev rule:

# Each vtcon keeps its own state of fonts.
ACTION=="add", SUBSYSTEM=="vtconsole", KERNEL=="vtcon*", RUN+="@rootlibexecdir@/systemd-vconsole-setup"

Normally you have only one /sys/devices/virtual/vtconsole/vtcon0. But my system has two of these:

# cat /sys/devices/virtual/vtconsole/vtcon0/name
(S) dummy device
# cat /sys/devices/virtual/vtconsole/vtcon1/name
(M) frame buffer device

That dummy console comes from the intel framebuffer driver:

i915 is an intel VGA video driver. My system has this driver compiled into the kernel, which causes the kernel to discover and expose vtcon0 and vtcon1 at the same time.

My speculation is that for non-intel-video systems (or for systems with the intel driver loaded at a late stage) the condition might not trigger at all, because those get only one loadkeys run (or a few runs spread out in time after each module is loaded).

The fix was simple: add some locking, at least for the write/write race. I did not touch the read paths, as I was not sure which subsystems use the vt subsystem. Maybe some of them require decent throughput, and a lock for every character would be too much.

After this patch was applied I had no backtraces at boot and no more unexpected poweroffs. But who knows, maybe it was a distraction and the power button can’t be simulated through any tty escapes. We’ll see.

If you are wondering what you could fix yourself in the linux kernel, you can finish this work and add read/write locking!

Parting words

  • The possible cause of the spurious reboots was data corruption caused by a very old race condition in the kernel.
  • Silent data corruption is hard to diagnose if you don’t know where to look. I was lucky to get a kernel oops in the same buggy code.
  • The tty/vt driver is full of globals. Those should perhaps be changed to per-vtcon arrays (some non-x86 platforms already have it that way).
  • The tty/vt global tables are actually generated by the old userspace loadkeys --mktable tool and stored in the kernel as-is.
  • There is still a read/write race in the kernel waiting for you to fix it!

Have fun!


June 16, 2019
Thomas Raschbacher a.k.a. lordvan (homepage, bugs)
DBMail swap from LDAP to SQL auth (June 16, 2019, 19:05 UTC)

Due to my old ldap server (with samba 2.X) dying, I wanted to decouple the emails from the rest of the accounts; since only 3 users actually use the email anyway, the whole setup with LDAP was complete overkill.

Fortunately swapping that was relatively simple! More to come soon (when I get a chance to finish this blog entry).

June 14, 2019
Thomas Raschbacher a.k.a. lordvan (homepage, bugs)

So I had errors in my nagios for UCS on the backup domain server and fileserver (probably after an upgrade):

these services were CRITICAL on several servers:


The errors were like "NRPE: Command 'UNIVENTION_DISK_ROOT' not defined" .. and so on ...

After a bit of searching I found this help forum page (in German): https://help.univention.com/t/nagios-zeigt-nach-update-auf-4-3-0-fur-slave-dc-nrpe-command-not-defined/8261/21

that in the end helped me solve the problem.

The following commands are what I ran (not all may be needed on each system, since for some reason one package was missing on the fileserver but not on the backup domain server; also, I am not sure if all of it is 100% needed, but it works now):

apt-get install nagios-nrpe-server nagios-nrpe-plugin univention-nagios-s4-connector # make sure all packages are installed
systemctl restart nagios-nrpe-server.service # restart the service
univention-directory-listener-ctrl resync nagios-client # resync - settings i guess?

I had run the following 2 commands before the above too, but just got errors:

sudo -u nagios /usr/lib/nagios/plugins/check_univention_s4_connector_suidwrapper 
sudo -u nagios /usr/lib/nagios/plugins/check_univention_samba_drs_failures_suidwrapper

Now the first of the two just gives me this: S4CONNECTOR OK: Connector not activated (connector/s4/autostart =! true)
The second one still gives “command not found”, but that seems not to matter in my case.

Then I re-scheduled the nagios checks because I didn't feel like waiting 10 minutes... then everything was OK again. Looks like the upgrade missed some packages and/or settings?

May 03, 2019
Luca Barbato a.k.a. lu_zero (homepage, bugs)
Using Wireguard (May 03, 2019, 08:51 UTC)

wireguard is a modern, secure and fast vpn tunnel that is extremely simple to set up and already works nearly everywhere.

Since I spent a little bit of time playing with it because it looked quite interesting, I thought of writing a small tutorial.

I normally use Gentoo (and macOS), so this guide is about Gentoo.

General concepts

Wireguard sets up peers identified by a public key and manages a virtual network interface and, optionally, the routing across them.

The server is just a peer that knows about lots of peers, while a client knows how to directly reach the server and that’s it.

Setting up in Gentoo

Wireguard on Linux is implemented as a kernel module.

So in general you have to build both the module and the userspace tools (wg).
If you want some advanced features, make sure your kernel has the following settings:


After that, using emerge will get you all you need:

$ emerge wireguard


The default distribution of tools comes with the wg command and a helper script called wg-quick that makes it easier to bring the virtual network interface up and down.

wg help
Usage: wg <cmd> [<args>]

Available subcommands:
  show: Shows the current configuration and device information
  showconf: Shows the current configuration of a given WireGuard interface, for use with `setconf'
  set: Change the current configuration, add peers, remove peers, or change peers
  setconf: Applies a configuration file to a WireGuard interface
  addconf: Appends a configuration file to a WireGuard interface
  genkey: Generates a new private key and writes it to stdout
  genpsk: Generates a new preshared key and writes it to stdout
  pubkey: Reads a private key from stdin and writes a public key to stdout
You may pass `--help' to any of these subcommands to view usage.
Usage: wg-quick [ up | down | save | strip ] [ CONFIG_FILE | INTERFACE ]

  CONFIG_FILE is a configuration file, whose filename is the interface name
  followed by `.conf'. Otherwise, INTERFACE is an interface name, with
  configuration found at /etc/wireguard/INTERFACE.conf. It is to be readable
  by wg(8)'s `setconf' sub-command, with the exception of the following additions
  to the [Interface] section, which are handled by wg-quick:

  - Address: may be specified one or more times and contains one or more
    IP addresses (with an optional CIDR mask) to be set for the interface.
  - DNS: an optional DNS server to use while the device is up.
  - MTU: an optional MTU for the interface; if unspecified, auto-calculated.
  - Table: an optional routing table to which routes will be added; if
    unspecified or `auto', the default table is used. If `off', no routes
    are added.
  - PreUp, PostUp, PreDown, PostDown: script snippets which will be executed
    by bash(1) at the corresponding phases of the link, most commonly used
    to configure DNS. The string `%i' is expanded to INTERFACE.
  - SaveConfig: if set to `true', the configuration is saved from the current
    state of the interface upon shutdown.

See wg-quick(8) for more info and examples.

Creating a configuration

Wireguard is quite straightforward: you can either prepare a configuration with your favourite text editor, or generate one by setting up the virtual network device by hand and then saving the result that wg showconf presents.

A configuration file can then be augmented with wg-quick-specific options (such as Address) or just passed to wg setconf, while the other networking details are managed by your usual tools (e.g. ip).

Create your keys

The first step is to create the public-private key pair that identifies your peer.

  • wg genkey generates a private key for you.
  • You feed it to wg pubkey to have your public key.

In a single line:

$ wg genkey | tee privkey | wg pubkey > pubkey

Prepare a configuration file

Both wg-quick and wg setconf use an ini-like configuration file.

If you put it in /etc/wireguard/${ifname}.conf, then wg-quick just needs the interface name and will look it up for you.

The minimum configuration needs an [Interface] and a [Peer] section.
You may add additional peers later.
A server would specify its ListenPort and identify the peers by their PublicKey.

Address =
ListenPort = 51820
PrivateKey = <key>

PublicKey = <key>
AllowedIPs =

A client would have a peer with an Endpoint defined, and may omit the ListenPort in its interface description.

[Interface]
PrivateKey = <key>
Address =

[Peer]
PublicKey = <key>
AllowedIPs =
Endpoint = <ip>:<port>

The AllowedIPs mask lets you specify how much of your traffic you want to route over the VPN.
By setting it to 0.0.0.0/0 you tell WireGuard you want to route ALL the traffic through it.
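For example, a peer entry routing only a single subnet might look like this (the subnet and key are placeholders):

```ini
[Peer]
PublicKey = <key>
# route only the VPN subnet through this peer
AllowedIPs = 10.0.0.0/24

# ...or, to route ALL IPv4 and IPv6 traffic through it instead:
# AllowedIPs = 0.0.0.0/0, ::/0
```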

NOTE: Address is a wg-quick-specific option.

Using a configuration

wg-quick is really simple to use, assuming you have created /etc/wireguard/wg0.conf:

$ wg-quick up wg0
$ wg-quick down wg0

If you are using netifrc, WireGuard is supported from version 0.6.1 onwards and you can have a configuration such as:
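A minimal sketch of such a configuration in /etc/conf.d/net (assuming netifrc's wireguard module and wg0 as the interface name; the address is a placeholder):

```shell
# /etc/conf.d/net
# point the wireguard module at the config file
wireguard_wg0="/etc/wireguard/wg0.conf"
# the address is set here instead of via the wg-quick Address option
config_wg0="10.0.0.2/24"
```

As with any netifrc interface, you would then symlink /etc/init.d/net.wg0 to net.lo and add it to a runlevel.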


The wg0.conf file is the same as the one above, but stripped of the wg-quick-specific options.

Summing up

WireGuard is a breeze to set up compared to nearly all other VPN solutions.

Non-Linux systems can currently use a Go implementation, and in the future a Rust implementation (help welcome).

Android and macOS already have some pretty front-ends that make the setup easy even on those platforms.

I hope you enjoyed it 🙂

May 02, 2019
Andreas K. Hüttel a.k.a. dilfridge (homepage, bugs)

Term has already started, so this announcement is technically a bit late, however... This summer term I'm offering a lecture "High Frequency Engineering for Physicists". If you plan to work with signals in the frequency range 10MHz - 50GHz, this might be interesting for you...

When and where? Wednesdays, 12h - 14h, seminar room PHY 9.1.10. The next lecture is on 8 May 2019.
  • Concepts and formalisms for the frequency range 10MHz - 50GHz
  • Handling equipment for this frequency range, designing devices and measurements
  • Using this frequency range in a (millikelvin) cryostat
More information can be found soon on the homepage of the lecture.

See you next Wednesday!

April 30, 2019
Andreas K. Hüttel a.k.a. dilfridge (homepage, bugs)
Press release (in German) on our recent PRL (April 30, 2019, 13:43 UTC)

Regensburg University has published a press release (in German) on our recent Physical Review Letters "Editor's Suggestion" publication, "Shaping Electron Wave Functions in a Carbon Nanotube with a Parallel Magnetic Field". Read it on the university web page!

(A summary in English can be found in a previous blog post.)

April 29, 2019
Yury German a.k.a. blueknight (homepage, bugs)
Gentoo Blogs Update (April 29, 2019, 03:41 UTC)

This is just a notification that the Blogs and the appropriate plug-ins have been updated for release 5.1.1.

With the release of these updates, we (the Gentoo Blog Team) have also updated the themes that had updates available. If you have a blog on this site based on one of the following themes, please consider switching, as these themes are no longer maintained and things will break in your blog.

  • KDE Breathe
  • KDE Graffiti
  • Oxygen
  • The following default WordPress themes might also stop working (simply because of age)
    • Twenty Fourteen
    • Twenty Fifteen
    • Twenty Sixteen

If you are using one of these themes, it is recommended that you switch to one of the other themes available. If there is an open-source theme that you would like to have available, please contact the Blogs team by opening a Bugzilla bug with the pertinent information.