November 27, 2015

Russ Allbery

remctl 3.10

remctl is a simple and secure remote command execution protocol using GSS-API. Essentially, it's the thinnest and simplest possible way to deploy remote network APIs for commands using Kerberos authentication and encryption.

Most of the work in this release is around supporting anonymous authentication for an upcoming project of mine. This included cleaning up ACL handling so that clients that authenticated with anonymous PKINIT didn't count as ANYUSER (it's not likely this would have been a security problem for existing users, since you would have had to enable anonymous service tickets in your KDC), and adding new anyuser:auth and anyuser:anonymous ACLs that are explicit about whether anonymous users are included.
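As a rough illustration (the command names and paths here are made up; only the anyuser:auth and anyuser:anonymous ACL names come from this release, and the line format is the usual remctl.conf one of command, subcommand, executable and ACLs), the distinction looks like this in a server configuration:

# any authenticated, non-anonymous Kerberos principal may run this
status check /usr/local/bin/status-check anyuser:auth
# even clients authenticated via anonymous PKINIT may run this
bootstrap request /usr/local/bin/bootstrap-request anyuser:anonymous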

With this change, it's possible, using a KDC with anonymous service tickets enabled, to use anonymous PKINIT to make entirely unauthenticated remctl calls. I plan on using this with wallet to support initial system key bootstrapping using external validation of whether a system is currently allowed to bootstrap keys. Note that you need to be very careful when enabling anonymous service tickets, since many other Kerberos applications (including remctl prior to this release) assume that any client that can get a service ticket is in some way authenticated.

Also new in this release, the server now sets the REMOTE_EXPIRES environment variable to the time when the authenticated remote session will expire. This is usually the expiration time of the user's credentials. I'm planning on using this as part of a better kx509 replacement to issue temporary X.509 certificates from Kerberos tickets, limited in lifetime to the lifetime of the Kerberos ticket.
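As a sketch of how a backend command might use this (the script below is mine, not part of the release, and it assumes REMOTE_EXPIRES is a Unix timestamp in seconds; check the remctld documentation for the exact format):

#!/bin/sh
# hypothetical remctl backend: cap the lifetime of whatever we issue
# at the remaining lifetime of the client's authenticated session
now=$(date +%s)
remaining=$(( REMOTE_EXPIRES - now ))
echo "issuing a credential valid for at most $remaining seconds"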

This release also includes some portability fixes, a bug fix for the localgroup ACL for users who are members of lots of local groups, and some (mildly backwards-incompatible) fixes for the Python RemctlError exception class.

You can get the latest release from the remctl distribution page.

27 November, 2015 11:42PM

Simon Josefsson

Automatic Replicant Backup over USB using rsync

I have been using Replicant on the Samsung SIII I9300 for over two years. I have written before on taking a backup of the phone using rsync but recently I automated my setup as described below. This work was prompted by a screen accident with my phone that caused it to die, and I noticed that I hadn’t taken regular backups. I did not lose any data this time, since typically all content I create on the device is immediately synchronized to my clouds. Photos are uploaded by the ownCloud app, SMS Backup+ saves SMS and call logs to my IMAP server, and I use DAVDroid for synchronizing contacts, calendar and task lists with my instance of ownCloud. Still, I strongly believe in regular backups of everything, so it was time to automate this.

For my use-case, taking backups of the phone whenever I connect it to one of my laptops is sufficient. I typically connect it to my laptops for charging at least every other day. My laptops are all running Debian, but this should be applicable to most modern GNU/Linux systems. This is not Replicant-specific, although you need a rooted phone. I thought that automating this would be simple, but I got to learn the ins and outs of systemd and udev in the process, and it ended up taking the better part of an evening.

I started out adding a udev rule and a small script, thinking I could invoke the backup process from the udev rule. However, rsync would magically die after running for a few seconds. After an embarrassingly long debugging session, I finally found someone with a similar problem, which led me to a nice writeup on the topic of running long-running services on udev events. I created a file /etc/udev/rules.d/99-android-backup.rules with the following content:

ACTION=="add", SUBSYSTEMS=="usb", ENV{ID_SERIAL_SHORT}=="323048a5ae82918b", TAG+="systemd", ENV{SYSTEMD_WANTS}+="android-backup@$env{ID_SERIAL_SHORT}.service"
ACTION=="add", SUBSYSTEMS=="usb", ENV{ID_SERIAL_SHORT}=="4df9e09c25e75f63", TAG+="systemd", ENV{SYSTEMD_WANTS}+="android-backup@$env{ID_SERIAL_SHORT}.service"

The serial numbers correspond to the device serial numbers of the two devices I wish to back up. The adb devices command will print them for you, and you need to replace my values with the values from your phones. Next I created a systemd service to describe a oneshot service. The file /etc/systemd/system/android-backup@.service has the following content:

ExecStart=/usr/local/sbin/android-backup %I

The at-sign (“@”) in the service filename signals that this is a service that takes a parameter. I'm not enough of a udev/systemd person to explain these two files using the proper terminology, but at least you can pattern-match and follow the basic idea of them: the udev rule matches the devices that I'm interested in (I don't want this to happen to all random Android devices I attach, hence matching against known serial numbers), and it causes a systemd service with a parameter to be started. The systemd service file describes the script to run, and passes on the parameter.
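For completeness, here is a sketch of what the whole unit file might look like; only the ExecStart line above is from my setup, while the [Unit]/[Service] headers and Type=oneshot are simply what a parameterized oneshot service would typically need:

[Unit]
Description=Android backup for device %I

[Service]
Type=oneshot
ExecStart=/usr/local/sbin/android-backup %I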

Now for the juicy part, the script. I have /usr/local/sbin/android-backup with the following content.


#!/bin/bash

export ANDROID_SERIAL="$1"

# base directory for backups; /var/backups/android-<serial> must already exist
DIRBASE=/var/backups/android

# send stdout and stderr to syslog
exec > >(logger -t android-backup) 2>&1

if ! test -d "$DIRBASE-$ANDROID_SERIAL"; then
    echo "could not find directory: $DIRBASE-$ANDROID_SERIAL"
    exit 1
fi

set -x

adb wait-for-device
adb root
adb wait-for-device
adb shell printf "address 127.0.0.1\nuid = root\ngid = root\n[root]\n\tpath = /\n" \> /mnt/secure/rsyncd.conf
adb shell rsync --daemon --no-detach --config=/mnt/secure/rsyncd.conf &
adb forward tcp:6010 tcp:873
sleep 2
rsync -av --delete --exclude /dev --exclude /acct --exclude /sys --exclude /proc rsync://localhost:6010/root/ $DIRBASE-$ANDROID_SERIAL/
: rc $?
adb forward --remove tcp:6010
adb shell rm -f /mnt/secure/rsyncd.conf
This script warrants a more detailed explanation. Backups are placed under, e.g., /var/backups/android-323048a5ae82918b/ for later off-site backup (you do back up your laptop, right?). You have to create this directory manually, as a safety catch against wildly rsyncing data into non-existing directories. The script logs everything using syslog, so run a tail -F /var/log/syslog& when setting this up. You may want to reduce the verbosity of rsync if you prefer (replace rsync -av with rsync -a). The script runs adb wait-for-device which, as you rightly guessed, waits for the device to settle. Next adb root is invoked to get root on the device (reading all files from the system naturally requires root). It takes some time to switch, so another wait-for-device call is needed. Next a small rsyncd configuration file is created in /mnt/secure/rsyncd.conf on the phone. The file tells rsync to listen on localhost, run as root, and use / as the path. By default, rsyncd is read-only, so the host will not be able to upload any data over rsync, just read data out. Next rsync is started on the phone. The adb forward command forwards port 6010 on the laptop to port 873 on the phone (873 is the default rsyncd port). Unfortunately, setting up the TCP forward appears to take some time, and adb wait-for-device will not wait for that to complete, hence an ugly sleep 2 at this point. Next comes the rsync invocation itself, which just pulls in everything from the phone to the laptop, excluding some usual suspects. The somewhat cryptic : rc $? merely logs the exit code of the rsync process into syslog. Finally we clean up the TCP forward and remove the rsyncd.conf file that was temporarily created.

This setup appears stable to me. I can plug in a phone and a backup will be taken. I can even plug in both my devices at the same time, and they will run at the same time. If I unplug a device, the script or rsync will error out and systemd cleans up.

If anyone has ideas on how to avoid the ugly temporary rsyncd.conf file or the ugly sleep 2, I’m interested. It would also be nice to not have to do the ‘adb root’ dance, and instead have the phone start the rsync daemon when connecting to my laptop somehow. TCP forwarding might be troublesome on a multi-user system, but my laptops aren’t. Killing rsync on the phone is probably a good idea too. If you have ideas on how to fix any of this, other feedback, or questions, please let me know!

27 November, 2015 11:33PM by simon

Russ Allbery

rra-c-util 5.9

A minor release of my C utility library, including some changes required for the previous release of pam-afs-session and the upcoming release of remctl.

The Kerberos portability layer now correctly defines the strings for dealing with anonymous principals when built with Heimdal, and adds KRB5_ANON_REALM (required for doing the authentication). The PAM testing framework has some improvements for handling pam_modutil_getpwnam and supports testing against PAM_SESSION_ERR.

You can get the latest version from the rra-c-util distribution page.

27 November, 2015 10:42PM


Erich Schubert

ELKI 0.7.0 on Maven and GitHub

Version 0.7.0 of our data mining toolkit ELKI is now available on the project homepage, GitHub and Maven.
You can also clone this example project to get started easily.
What is new in ELKI 0.7.0? Too much, see the release notes, please!
What is ELKI exactly?
ELKI is a Java based data mining toolkit. We focus on cluster analysis and outlier detection, because there are plenty of tools available for classification already. But there is a kNN classifier, and a number of frequent itemset mining algorithms in ELKI, too.
ELKI is highly modular. You can combine almost everything with almost everything else. In particular, you can combine algorithms such as DBSCAN with arbitrary distance functions, and you can choose from many index structures to accelerate the algorithm. But because we separate them well, you can add a new index, or a new distance function, or a new data type, and still benefit from the other parts. In other tools such as R, you cannot easily add a new distance function into an arbitrary algorithm and get good performance - all the fast code in R is written in C and Fortran, and cannot be easily extended this way. In ELKI, you can define a new data type, new distance function, new index, and still use most algorithms. (Some algorithms may have prerequisites that e.g. your new data type does not fulfill, of course.)
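As a rough illustration of that modularity, a command-line run lets you pick the algorithm, the distance function and its parameters independently. The parameter names below are quoted from memory and may differ between ELKI versions, so treat this as a sketch rather than a reference:

java -jar elki.jar KDDCLIApplication \
  -dbc.in pointcloud.csv \
  -algorithm clustering.DBSCAN \
  -algorithm.distancefunction minkowski.ManhattanDistanceFunction \
  -dbscan.epsilon 0.05 -dbscan.minpts 10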
ELKI is also very fast. Of course a good C code can be faster - but then it usually is not as modular and easy to extend anymore.
ELKI is documented. We have JavaDoc, and we annotate classes with their scientific references (see a list of all references we have). So you know which algorithm a class is supposed to implement, and can look up details there. This makes it very useful for science.
ELKI is not: a turnkey solution. It aims at researchers, developers and data scientists. If you have a SQL database, and want to do a point-and-click analysis of your data, please get a business solution instead with commercial support.

27 November, 2015 05:27PM


Gergely Nagy

Feeding Emacs

For the past fifteen years, I have been tweaking my ~/.emacs continuously, most recently by switching to Spacemacs. With that switch done, I started to migrate a few more things to Emacs, an Atom/RSS reader being one that's been in the queue for years - ever since Google Reader shut down. Since March 2013, I have been a Feedly user, but I have wanted to migrate to something better for a long time. I wanted to use Free Software, for one.

I saw a mention of Elfeed somewhere a little while ago, and in the past few days, I decided to give it a go. The results are pretty amazing.

For now, I'm using my own fork of Elfeed, because of some new features I'm adding, which have not been submitted upstream yet. It's all possible using the upstream sources, if one monkey-patches a few functions - but that's ugly.

Anyhow, this is how my feed reader looks right now:

Elfeed setup

This required quite a bit of elisp to be written, and depends on popwin and powerline, but I'm very happy with the results.

Without much further ado, the magic to make this happen:

(defun feed-reader/stats ()
  "Count the number of entries and feeds being currently displayed."
  (if (and elfeed-search-filter-active elfeed-search-filter-overflowing)
      (list 0 0 0)
    (cl-loop with feeds = (make-hash-table :test 'equal)
             for entry in elfeed-search-entries
             for feed = (elfeed-entry-feed entry)
             for url = (elfeed-feed-url feed)
             count entry into entry-count
             count (elfeed-tagged-p 'unread entry) into unread-count
             do (puthash url t feeds)
             finally (cl-return (list unread-count entry-count
                                      (hash-table-count feeds))))))

(defun feed-reader/search-header ()
  "Returns the string to be used as the Elfeed header."
  (let* ((separator-left (intern (format "powerline-%s-%s"
                                         (powerline-current-separator)
                                         (car powerline-default-separator-dir))))
         (separator-right (intern (format "powerline-%s-%s"
                                          (powerline-current-separator)
                                          (cdr powerline-default-separator-dir)))))
    (if (zerop (elfeed-db-last-update))
        (elfeed-search--intro-header)
      (let* ((db-time (seconds-to-time (elfeed-db-last-update)))
             (update (format-time-string "%Y-%m-%d %H:%M:%S %z" db-time))
             (stats (feed-reader/stats))
             (search-filter (cond (elfeed-search-filter-active "")
                                  (elfeed-search-filter elfeed-search-filter)
                                  ("")))
             (lhs (list (powerline-raw (concat search-filter " ") 'powerline-active1 'l)
                        (funcall separator-right 'powerline-active1 'mode-line)))
             (center (list (funcall separator-left 'mode-line 'powerline-active2)
                           (destructuring-bind (unread entry-count feed-count) stats
                             (let* ((content (format " %d/%d:%d" unread entry-count feed-count))
                                    (help-text nil))
                               (if url-queue
                                   (let* ((total (length url-queue))
                                          (in-process (cl-count-if #'url-queue-buffer url-queue)))
                                     (setf content (concat content " (*)"))
                                     (setf help-text (format " %d feeds pending, %d in process ... "
                                                             total in-process))))
                               (propertize content
                                           'face 'powerline-active2
                                           'help-echo help-text)))
                           (funcall separator-right 'powerline-active2 'mode-line)))
             (rhs (list (funcall separator-left 'mode-line 'powerline-active1)
                        (powerline-raw (concat " " update) 'powerline-active1 'r))))
        (concat (powerline-render lhs)
                (powerline-fill-center nil (/ (powerline-width center) 2.0))
                (powerline-render center)
                (powerline-fill nil (powerline-width rhs))
                (powerline-render rhs))))))

(defun popwin:elfeed-show-entry (buff)
  (popwin:popup-buffer buff
                       :position 'right
                       :width 0.5
                       :dedicated t
                       :noselect nil
                       :stick t))

(defun popwin:elfeed-kill-buffer ()
  (interactive)
  (let ((window (get-buffer-window (get-buffer "*elfeed-entry*"))))
    (kill-buffer (get-buffer "*elfeed-entry*"))
    (delete-window window)))

(setq elfeed-show-entry-switch #'popwin:elfeed-show-entry
      elfeed-show-entry-delete #'popwin:elfeed-kill-buffer
      elfeed-search-header-function #'feed-reader/search-header)

I probably learned more elisp with this experiment than in the past 15 years combined, and I'm finally starting to understand how properties and powerline work. Great! Looking forward to simplifying and then pushing the current setup further!

I'd like to be able to change the search page layout, for example: I want the tags on a separate column. Elfeed also looks like a great candidate for a number of pull requests I plan to make in December...

27 November, 2015 02:01PM by Gergely Nagy


Norbert Preining

Slick Google Map 0.3 released

I just have pushed a new version of the Slick Google Map plugin for WordPress to the servers. There are not many changes, but a crucial fix for parsing coordinates in DMS (degree-minute-second) format.


The documentation described that all kinds of DMS formats can be used to specify a location, but these DMS-encoded locations were sent to Google for geocoding. Unfortunately it seems Google is incapable of handling DMS formats, and returns slightly off coordinates. By using a library for DMS conversion, which I adapted slightly, it is now possible to use a wide variety of location formats. Practically everything that can reasonably be interpreted as a location will be properly converted to decimal coordinates.

Plans for the next release are:

  • prepare for translation via the WordPress translation team
  • add html support for the marker text via uuencoding

Please see the dedicated page or the WordPress page for more details and downloads.


27 November, 2015 07:17AM by Norbert Preining

Paul Wise

The GPL is not magic pixie dust

Become a Software Freedom Conservancy Supporter!

The GPL is not magic pixie dust. It does not work by itself.
The first step is to choose a copyleft license for your code.
The next step is, when someone fails to follow that copyleft license, it must be enforced
and it's a simple fact of our modern society that such type of work
is incredibly expensive to do and incredibly difficult to do.

-- Bradley Kuhn, in FaiF episode 0x57

As the Debian Website used to imply, public domain and permissively licensed software can lead to the production of more proprietary software as people discover useful software, extend it and/or incorporate it into their hardware or software products. Copyleft licenses such as the GNU GPL were created to close off this avenue to the production of proprietary software, but such licenses are not enough. With the ongoing adoption of Free Software by individuals and groups, inevitably the community's expectations of license compliance are violated, usually out of ignorance of the way Free Software works, but not always. As Karen and Bradley explained in FaiF episode 0x57, copyleft is nothing if no-one is willing and able to stand up in court to protect it. The reality of today's world is that legal representation is expensive, difficult and time consuming. With gpl-violations.org in hiatus until some time in 2016, the Software Freedom Conservancy (a tax-exempt charity) is the major defender of the Linux project, Debian and other groups against GPL violations. In March the SFC supported a lawsuit by Christoph Hellwig against VMware for refusing to comply with the GPL in relation to their use of parts of the Linux kernel. Since then two of their sponsors pulled corporate funding and conferences blocked or cancelled their talks. As a result they have decided to rely less on corporate funding and more on the broad community of individuals who support Free Software and copyleft. So the SFC has launched a campaign to create a community of folks who stand up for copyleft and the GPL by supporting their work on promoting and supporting copyleft and Free Software.

If you support Free Software, like what the SFC does, agree with their compliance principles, are happy about their successes in 2015, work on a project that is an SFC member and/or just want to stand up for copyleft, please join Christopher Allan Webber, Carol Smith, Jono Bacon, myself and others in becoming a supporter. For the next week your donation will be matched by an anonymous donor. Please also consider asking your employer to match your donation or become a sponsor of SFC. Don't forget to spread the word about your support for SFC via email, your blog and/or social media accounts.

27 November, 2015 03:48AM

November 26, 2015


Olivier Berger

Handling video files produced for a MOOC on Windows with git and git-annex

This post is intended to document some elements of workflow that I’ve setup to manage videos produced for a MOOC, where different colleagues work collaboratively on a set of video sequences, in a remote way.

We are a team of several schools working on the same course, and we have an incremental process, so we need collaboration among many remote authors, over quite a long period, on a set of video sequences.

We’re probably going to review some of the videos and make changes, so we need to monitor changes, and submit versions to colleagues on remote sites so they can criticize them and get later edits. We may have more than one site doing video production. Thus we need to share videos along the flow of production, editing and revision of the course contents, in a way that is manageable by power users (we’re all computer scientists, used to SVN or Git).

I’ve decided to start an experiment with Git and Git-Annex to try and manage the videos the way we usually do for slide sources in LaTeX. Obviously the main issue is that videos are big files, demanding in storage space and bandwidth for transfers.

We want to keep track of everything which is done during the production of the videos, so that we can later re-do some of the video editing, for instance if we change the graphic design elements (logos, subtitles, frame dimensions, additional effects, etc.), or in the case where we would improve the classes over the seasons. On the other hand, not all colleagues want to have to download a full copy of all rushes on their laptop if they just want to review one particular sequence of the course; they will only need to download the final edit MP4. They may still be interested in being able to fetch all the rushes, should they want to try and improve the videos.

Git-Annex brings us the ability to decouple the presence of files in directories, managed by regular Git commands, from the presence of the file contents (the big stuff), which is managed by Git-Annex.

Here’s a quick description of our setup:

  • we do screen capture and video editing with Camtasia on a Windows 7 system. Camtasia (although proprietary) is quite manageable without being a video editing expert, and suits quite well our needs in terms of screen capture, green background shooting and later face insertion over slides capture, additional “motion design”-like enhancement, etc.
  • the rushes captured (audio, video) are kept on that machine
  • the MP4 rendering of the edits are performed on that same machine
  • all these files are stored locally on that computer, but we perform regular backups, on demand, on a remote system, with rsync+SSH. We have installed git for Windows so we use bash and rsync and ssh from git’s install. SSH happens using a public key without a passphrase, to connect easily to the Linux remote, but that isn’t mandatory.
  • the mirrored files appear on a Linux filesystem on another host (running Debian), where the target is actually managed with git and git-annex.
  • there we handle all the files added, removed or modified with git-annex.
  • we have 2 more git-annex remote repos, accessed through SSH (again using a passphrase-less public key), run by GitoLite, to which git-annex rsyncs copies of all the file contents. These repos are on different machines keeping backups in case of crashes. git-annex is setup to mandate keeping at least 2 copies of files (numcopies).
  • colleagues in turn clone from either of these repos and git-annex get to download the video contents, only for files which they are interested in (for instance final edits, but not rushes), which they can then play locally on their preferred OS and video player.

Why didn’t we use git-annex on Windows directly, on the Windows host which is the source of the files?

We tried, but that didn’t work out. The Git-Annex assistant somehow crashed on us, which made the Git history strange and unmanageable. More importantly, we need robust backups, so we can’t afford to rely on something we don’t fully trust: shooting a video again is really costly (setting up a shooting set again, with lighting, cameras, and a professor who has to repeat the speech!).

The rsync (with --delete on destination) from Windows to Linux is robust. Git-Annex on Linux seems robust so far. That’s enough for now :-)

The drawback is that we need manual intervention for starting the rsync, and also that we must make sure that the rsync target is ready to get a backup.

The target of the rsync on Linux is a git-annex clone using the default “indirect” mode, which handles the files as symlinks to the actual copies managed by git-annex inside the .git/ directory. But that isn’t suitable for comparison with the origin of the rsync mirror, which consists of plain files on the Windows computer.

We must then do a “git-annex edit” on the whole target of the rsync mirror before the rsync, so that the files are there as regular video files. This is costly in terms of storage, and also in copying time (our repo contains around 50 GB, and the Linux host is a rather tiny laptop).

After the rsync, all the files need to be compared to the SHA256 hashes known to git-annex so that only modified files are taken into account in the commit. We perform a “git-annex add” on all the files (for new files having appeared at rsync time), and then a “git-annex sync”. That takes a lot of time, since the SHA256 computations are quite long for such a set of big files (the video rushes and edited videos are in HD).

So the process needs to be the following, on the target Linux host:

  1. git annex add .
  2. git annex sync
  3. git annex copy . --to server1
  4. git annex copy . --to server2
  5. git annex edit .
  6. only then: rsync

Iterate ad lib 😉
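A small shell wrapper for the steps above keeps the order straight (just a sketch: it assumes the two git-annex remotes are literally named server1 and server2, and that the repository path is adjusted):

#!/bin/sh
set -e
cd /srv/mooc-videos          # top of the git-annex repository (adjust)
git annex add .
git annex sync
git annex copy . --to server1
git annex copy . --to server2
git annex edit .
# only now may the rsync from the Windows machine be started again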

I would have preferred to have a working git-annex on Windows, but this is a much more manageable process for me for now, and until our repo holds more videos than our Linux laptop can store, we’re quite safe.

Next steps will probably involve gardening the contents of the repo on the Linux host so we’re only keeping copies of current files, and older copies are only kept on the 2 servers, in case of later need.

I hope this can be useful to others, and I’d welcome suggestions on how to improve our process.

26 November, 2015 09:51AM by Olivier Berger

Tiago Bortoletto Vaz

Birthday as in the good old days

This year I've got zero happy-birthday spam messages from phone, post, email, and from random people on that Internet social thing. These days, that's a WOW, yes it is.

On the other hand, full of love and simple celebrations together with local ones. A few emails and phone calls from close friends/family who are physically distant.

I'm happier than ever with my last years' choices of caring about my privacy, not spending time with fake relationships and keeping myself an unimportant one for the $SYSTEM. That means a lot for me.

26 November, 2015 12:43AM by Tiago Bortoletto Vaz

November 25, 2015


Steve Kemp

A transient home-directory?

For the past few years all my important work has been stored in git repositories. Thanks to the mr tool I have a single configuration file that allows me to pull/maintain a bunch of repositories with ease.
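For readers who haven't used mr: the configuration is a simple INI-style file with one section per repository, along these lines (the paths and URLs are made up):

# ~/.mrconfig
[src/dotfiles]
checkout = git clone 'git@example.com:dotfiles.git' 'dotfiles'

[src/sysadmin]
checkout = git clone 'git@example.com:sysadmin.git' 'sysadmin'

A single "mr checkout" on a fresh machine clones everything listed, and "mr update" pulls them all later.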

Having recently wiped & reinstalled a pair of desktop systems I'm now wondering if I can switch to using a totally transient home-directory.

The basic intention is that:

  • Every time I login "rm -rf $HOME/*" will be executed.

I see only three problems with this:

  • Every time I login I'll have to reclone my "dotfiles", passwords, bookmarks, etc.
  • Some programs will need their configuration updated, post-login.
  • SSH key management will be a pain.

My dotfiles contain my bookmarks, passwords, etc. But they don't contain setup for GNOME, etc.

So there might be some configuration that will become annoying - for example I like "Ctrl-Alt-t" to open a new gnome-terminal. That's something I configure the first time I login to each new system.

My images/videos/books are all stored beneath /srv and not in my home directory - so the only thing I'll be losing is program configuration, caches, and similar.

Ideally I'd be using a smartcard for my SSH keys - but I don't have one - so for the moment I might just have to rsync them into place, but that's grossly bad.

It'll be interesting to see how well this works out, but I see a potential gain in portability and discipline at the very least.

25 November, 2015 02:00PM


Daniel Pocock

Introducing elfpatch, for safely patching ELF binaries

I recently had a problem with a program behaving badly. As a developer familiar with open source, my normal strategy in this case would be to find the source and debug or patch it. Although I was familiar with the source code, I didn't have it on hand and would have faced significant inconvenience having it patched, recompiled and introduced to the runtime environment.

Conveniently, the program has not been stripped of symbol names, and it was running on Solaris. This made it possible for me to whip up a quick dtrace script to print a log message as each function was entered and exited, along with the return values. This gives a precise record of the runtime code path. Within a few minutes, I could see that just changing the return value of a couple of function calls would resolve the problem.

On the x86 platform, functions set their return value by putting the value in the EAX register. This is a trivial thing to express in assembly language and there are many web-based x86 assemblers that will allow you to enter the instructions in a web-form and get back hexadecimal code instantly. I used the bvi utility to cut and paste the hex code into a copy of the binary and verify the solution.
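For example (an illustration of the idea, not the actual patch from this incident), forcing a function to always report success by returning 1 takes only six bytes of machine code:

mov eax, 1      ; the return value lives in EAX: b8 01 00 00 00
ret             ; return to the caller:          c3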

All I needed was a convenient way to apply these changes to all the related binary files, with a low risk of error. Furthermore, it needed to be clear for a third-party to inspect the way the code was being changed and verify that it was done correctly and that no other unintended changes were introduced at the same time.

Finding or writing a script to apply the changes seemed like the obvious solution. A quick search found many libraries and scripts for reading ELF binary files, but none offered a patching capability. Tools like objdump on Linux and elfedit on Solaris show the raw ELF data, such as virtual addresses, which must be converted manually into file offsets, which can be quite tedious if many binaries need to be patched.

My initial thought was to develop a concise C/C++ program using libelf to parse the ELF headers and then calculate locations for the patches. While searching for an example, I came across pyelftools and it occurred to me that a Python solution may be quicker to write and more concise to review.

elfpatch (on github) was born. As input, it takes a text file with a list of symbols and hexadecimal representations of the patch for each symbol. It then reads one or more binary files and either checks for the presence of the symbols (read-only mode) or writes out the patches. It can optionally backup each binary before changing it.

25 November, 2015 10:30AM by Daniel.Pocock

Drone strikes coming to Molenbeek?

The St Denis siege last week and the Brussels lockdown this week provides all of us in Europe with an opportunity to reflect on why over ten thousand refugees per day have been coming here from the middle east, especially Syria.

At this moment, French warplanes and American drones are striking cities and villages in Syria, killing whole families in their effort to shortcut the justice system and execute a small number of very bad people without putting them on trial. Some observers estimate air strikes and drones kill twenty innocent people for every one bad guy. Women, children, the sick, elderly and even pets are most vulnerable. The leak of the collateral murder video simultaneously brought Wikileaks into the public eye and demonstrated how the crew of a US attack helicopter had butchered unarmed civilians and journalists like they were playing a video game.

Just imagine that the French president had sent the fighter jets to St Denis and Molenbeek instead of using law enforcement. After all, how are the terrorists there any better or worse than those in Syria, don't they deserve the same fate? Or what if Obama had offered to help out with a few drone strikes on suburban Brussels? After all, if the drones are such a credible solution for Syria's future, why won't they solve Brussels' (perceived) problems too?

If the aerial bombing "solution" had been attempted in a western country, it would have led to chaos. Half the population of Paris and Brussels would find themselves camping at the migrant camps in Calais, hoping to sneak into the UK in the back of a truck.

Over a hundred years ago, Russian leaders proposed a treaty agreeing never to drop bombs from balloons and the US and UK happily signed it. Sadly, the treaty wasn't updated after the invention of fighter jets, attack helicopters, rockets, inter-continental ballistic missiles, satellites and drones.

The reality is that asymmetric warfare hasn't worked and never will work in the middle east and as long as it is continued, experts warn that Europe may continue to face the consequences of refugees, terrorists and those who sympathize with their methods. By definition, these people can easily move from place to place and it is ordinary citizens and small businesses who will suffer a lot more under lockdowns and other security measures.

In our modern world, people often look to technology for shortcuts. The use of drones in the middle east is a shortcut from a country that spent enormous money on ground invasions of Iraq and Afghanistan and doesn't want to do it again. Unfortunately, technological shortcuts can't always replace the role played by real human beings, whether it is bringing law and order to the streets or in any other domain.

Aerial bombardment - by warplane or by drone - carries an implicitly racist message, that the people abused by these drone attacks are not equivalent to the rest of us, they can't benefit from the normal procedures of justice, they don't have rights, they are not innocent until proven guilty and they are expendable.

The French police deserve significant credit for the relatively low loss of life in the St Denis siege. If their methods and results were replicated in Syria and other middle eastern hotspots, would it be more likely to improve the situation in the long term than drone strikes?

25 November, 2015 07:28AM by Daniel.Pocock


Ben Armstrong

Debian Live After Debian Live

Get involved

After this happened, my next step was to get re-involved in Debian Live to help it carry on after the loss of Daniel. Here’s a quick update on some team progress, notes that could help people building Stretch images right now, and what to expect next.

Team progress

  • Iain uploaded live-config, incorporating an important fix (#bc8914bc) for a bug that prevented images from booting.
  • I want to get live-images ready for an upload, including #8f234605 to fix a wrong config/bootloaders setting that prevented images from building.

Test build notes

  • As always, build Stretch images with latest live-build from Sid (i.e. 5.x).
  • Build Stretch images, not Sid, as there’s less of a chance of dependency issues spoiling the build, and that’s the default anyway.
  • To make build iterations faster, make sure the config is modified to not build source & not include installer (edit auto/config before ‘lb config’) and use an apt caching proxy.
  • Don’t forget to inject fixed packages (e.g. live-config) into each config. Use apt pinning as per live-manual, or drop the debs into config/packages.chroot (see the sketch below).
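For the packages.chroot route, that amounts to something like this before building (the path to the fixed .deb is illustrative):

cp /path/to/live-config_*.deb config/packages.chroot/
lb config
lb build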

Test boot notes

  • Use kvm, giving it enough ram (-m 1024 works for me).
  • For gnome-desktop and kde-desktop, use -vga qxl, or else the desktop will crash and restart repeatedly.
  • When using qxl, edit boot params to add qxl.modeset=1 (workaround for #779515, which will be fixed in kernel >= 4.3).
  • My gnome image test was spoiled by #802929. The mouse doesn’t work (pointer moves, but no buttons work). Waiting on a new kernel to fix this. This is a test environment related bug only, i.e. should work fine on hardware. (Test pending.)
  • The Stretch standard, lxde-desktop, cinnamon-desktop, xfce-desktop, and gnome-desktop images all built and booted fine (except for the gnome issue noted above).
  • The Stretch kde-desktop and mate-desktop images are next on my list to test, along with Jessie images.
  • I’ve only tested, on the standard and lxde-desktop images, that if the installer is included, booting from the Install boot menu option starts the installer (i.e. I didn’t do an actual install).

Coming soon

See the TODO in the wiki. We’re knocking these off steadily. It will be faster with more people helping (hint, hint).


25 November, 2015 01:00AM by Ben Armstrong

November 24, 2015


Bernd Zeimetz online again

Finally, my blog is back online and I’m planning to start blogging again! Part of the reason why I became inactive was the usage of ikiwiki, which is great, but in the end unnecessarily complicated. So I’ve migrated my pages to Hugo - a static website generator written in Go. Hugo has an active community and it is easy to create themes for it or to enhance it. Also, it uses plain Markdown syntax instead of special ikiwiki syntax mixed into it - which should make it easy to migrate away again if necessary.

In case somebody else would like to convert from ikiwiki to Hugo, here is the script I’ve hacked together to migrate my old blog posts.


#!/bin/bash
# convert ikiwiki .mdwn posts into Hugo .md posts with TOML front matter
find . -type f -name '*.mdwn' | while read i; do
    tmp=$(mktemp)   # temporary file for the converted post
    {
        echo '+++'
        slug="$(echo $i | sed 's,.*/,,;s,\.mdwn$,,')"
        echo "slug = \"${slug}\""
        echo "title = \"$(echo $i | sed 's,.*/,,;s,\.mdwn$,,;s,_, ,g;s/\b\(.\)/\u\1/;s,debian,Debian,g')\""
        # prefer the date from the ikiwiki "meta updated" directive,
        # otherwise fall back to the date the file was added to git
        if grep -q 'meta updated' $i; then
            echo -n 'date = '
            sed '/meta updated/!d;/.*meta updated.*/s,.*=",,;s,".*,,;s,^,",;s,$,",' $i
        else
            echo -n 'date = '
            git log --diff-filter=A --follow --format='"%aI"' -1 -- $i
        fi
        if grep -q '\[\[!tag' $i; then
            echo -n 'tags ='
            sed '/\[\[!tag/!d;s,[^ ]*tag ,,;s,\]\],,;s,\([^ ]*\),"\1",g;s/ /,/g;s,^,[,;s,$,],' $i
        fi
        echo 'categories = ["linux"]'
        echo 'draft = false'
        echo '+++'
        echo ''

        # strip ikiwiki directives and convert links/images to plain Markdown
        sed -e '/\[\[!tag/d' \
            -e '/meta updated/d' \
            -e '/\[\[!plusone *\]\]/d' \
            -e 's,\[\[!img files[0-9/]*/\([^ ]*\) alt="\([^"]*\).*,![\2](../\1),g' \
            -e 's,\[\([^]]*\)\](\([^)]*\)),[\1](\2),g' \
            -e 's,\[\[\([^|]*\)|\([^]]*\)\]\],[\1](\2),g' \
            $i
    } > $tmp
    #cat $tmp; rm $tmp
    mv $tmp `echo $i | sed 's,\.mdwn,.md,g'`
done

For the Planet Debian readers - only Linux-related posts will show up on the planet. If you are interested in my mountain activities and other things I post, please follow my blog directly.

24 November, 2015 07:41PM

Carl Chenet

db2twitter: Twitter out of the browser

You have a database, a tweet pattern and want to automatically tweet on a regular basis? No need for RSS, fancy tricks, 3rd-party websites to translate RSS to Twitter or whatever. Just use db2twitter.

db2twitter is pretty easy to use!  First define your Twitter credentials:
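The file is INI-style; a purely hypothetical sketch of the credentials section follows (the real section and key names are in the official documentation, so treat these as placeholders):

[twitter]
consumer_key=xxxxxxxxxxxxxxxxxxxx
consumer_secret=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
access_token=xxxxxxxxx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
access_token_secret=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx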


Then your database information:
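Again purely as a hypothetical sketch - db2twitter uses SQLAlchemy, so the database is typically described by a connection URL plus the table and columns to build tweets from (key names are placeholders, see the documentation for the real ones):

[database]
dburl=mysql+pymysql://user:password@localhost/jobboard
table=jobs
fields=company,title,skills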


Then the pattern of your tweet, a Python-style formatted string:

tweet={} hires a {}{}

Add db2twitter in your crontab:

*/10 * * * * db2twitter db2twitter.ini

And you’re all set! db2twitter will generate and tweet the following tweets:

MyGreatCompany hires a web developer
CoolStartup hires a devops skilled in Docker

db2twitter is developed by and run for LinuxJobs.fr, the job board of the French-speaking Free Software and Open Source community.


db2twitter also has cool options like:

  • only tweet during user-specified times (e.g. 9AM-6PM)
  • use a user-specified SQL filter in order to get data from the database (e.g. only fetch rows where status == « edited »)

db2twitter is coded in Python 3.4, uses SQLAlchemy (see supported database types) and Tweepy. The official documentation is available on readthedocs.

24 November, 2015 06:00PM by Carl Chenet


Rhonda D'Vine

Salut Salon

I don't really remember where or how I stumbled upon these four women, so I'm sorry that I can't give credit where credit is due, and I do believe that I already started writing a blog entry about them somewhere. Anyway, today I want to present to you Salut Salon. They might play classical instruments, but not in a classical way. But see and hear for yourself:

  • Wettstreit zu viert: This is the first one I stumbled upon, and it did catch my attention. A lovely interpretation of classic tunes and a sweet mixup.
  • Ievan Polkka: I love the catchy tune—and their interpretation of the song.
  • We'll Meet Again: While the history of the song might not be so laughable, their giggling is just contagious. :)

So like always, enjoy!


24 November, 2015 08:26AM by Rhonda


Michal Čihař

Wammu 0.40

Yesterday, Wammu 0.40 was released.

The list of changes is not really huge:

  • Correctly escape XML output.
  • Make error message selectable.
  • Fixed spurious D-Bus error message.
  • Translation updates.

I will not make any promises for future releases (if there are any), as the tool is not really in active development.


24 November, 2015 08:09AM by Michal Čihař

November 23, 2015


Riku Voipio

Using ser2net for serial access.

Is your table a mess of wires? Do you have multiple devices connected via serial and can't remember which /dev/ttyUSBX is connected to which board? Unless you are an embedded developer, you are unlikely to deal with serial much anymore - in that case you can just jump to the next post in your news feed.

Introducing ser2net

Usually people start with minicom for serial access. There are better tools - picocom, screen, etc. But to easily map multiple serial ports, use ser2net. Ser2net makes serial ports available over telnet.

Persistent usb device names and ser2net

To remember which usb-serial adapter is connected to what, we use the /dev/serial tree created by udev, in /etc/ser2net.conf:

# arndale
7004:telnet:0:'/dev/serial/by-path/pci-0000:00:1d.0-usb-0:1.8.1:1.0-port0':115200 8DATABITS NONE 1STOPBIT
# cubox
7005:telnet:0:/dev/serial/by-id/usb-Prolific_Technology_Inc._USB-Serial_Controller_D-if00-port0:115200 8DATABITS NONE 1STOPBIT
# sonic-screwdriver
7006:telnet:0:/dev/serial/by-id/usb-FTDI_FT230X_96Boards_Console_DAZ0KA02-if00-port0:115200 8DATABITS NONE 1STOPBIT
The by-path syntax is needed if you have many identical usb-to-serial adapters. In that case a patch from the BTS is needed to support quoting in the serial path. Ser2net doesn't seem very actively maintained upstream - a sure sign that a project is stagnant is a homepage still hosted on SourceForge. This patch, among other interesting features, can also be found in various ser2net forks on GitHub.

Setting easy to remember names

Finally, unless you want to memorize the port numbers, set TCP port to name mappings in /etc/services:

# Local services
arndale 7004/tcp
cubox 7005/tcp
sonic-screwdriver 7006/tcp
Now finally:
telnet localhost sonic-screwdriver
Mandatory picture of serial port connection in action

23 November, 2015 07:55PM by Riku Voipio


C.J. Adams-Collier

Regarding fdupes

Dear readers,

There is a very useful tool for finding duplicate files and merging them to share permanent storage, and its name is fdupes. There was a terrible occurrence in the software after version 1.51, however: they removed the -L argument because too many people were complaining about lost data. It sounds like user error to me, and so I continue to use this version. I have to build from source, since the newer versions do not have the -L option.
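Typical usage looks something like this (paths are examples; -r recurses, and the removed -L option hard-linked the duplicates within each set, which only works on builds up to 1.51):

# report duplicate files first, without touching anything
fdupes -r /srv/mirror
# then merge them into hard links to reclaim space
fdupes -r -L /srv/mirror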

And so there you are. I recommend using it, even though this most useful feature has been deprecated and removed from the software. Perhaps there should be a fdupes-danger package in Debian?

23 November, 2015 06:04PM by C.J. Adams-Collier

Lunar


Reproducible builds: week 30 in Stretch cycle

What happened in the reproducible builds effort this week:

Toolchain fixes

  • Markus Koschany uploaded antlr3/3.5.2-3 which includes a fix by Emmanuel Bourg to make the generated parser reproducible.
  • Markus Koschany uploaded maven-bundle-plugin/2.4.0-2 which includes a fix by Emmanuel Bourg to use the date in the DEB_CHANGELOG_DATETIME variable in the file embedded in the jar files.
  • Niels Thykier uploaded debhelper/9.20151116 which makes the timestamp of directories created by dh_install, dh_installdocs, and dh_installexamples reproducible. Patch by Niko Tyni.

Mattia Rizzolo uploaded a version of perl to the “reproducible” repository including the patch written by Niko Tyni to add support for SOURCE_DATE_EPOCH in Pod::Man.

Dhole sent an updated version of his patch adding support for SOURCE_DATE_EPOCH in GCC to the upstream mailing list. Several comments have been made in response which have been quickly addressed by Dhole.
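For context, SOURCE_DATE_EPOCH is just an environment variable carrying a Unix timestamp that build tools use in place of the current time; in Debian packaging it is typically derived from the latest changelog entry, roughly like this:

SOURCE_DATE_EPOCH=$(date -d "$(dpkg-parsechangelog -SDate)" +%s)
export SOURCE_DATE_EPOCH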

Dhole also forwarded his patch adding support for SOURCE_DATE_EPOCH in libxslt upstream.

Packages fixed

The following packages have become reproducible due to changes in their build dependencies: antlr3/3.5.2-3, clusterssh, cme, libdatetime-set-perl, libgraphviz-perl, liblingua-translit-perl, libparse-cpan-packages-perl, libsgmls-perl, license-reconcile, maven-bundle-plugin/2.4.0-2, siggen, stunnel4, systemd, x11proto-kb.

The following packages became reproducible after getting fixed:

Some uploads fixed some reproducibility issues, but not all of them:

Vagrant Cascadian has set up a new armhf node using a Raspberry Pi 2. It should soon be added to the Jenkins infrastructure.

diffoscope development

diffoscope version 42 was released on November 20th. It adds a missing dependency on python3-pkg-resources and, to prevent similar regressions, another autopkgtest to ensure that the command line is functional when Recommends are not installed. Two more encoding-related problems have been fixed (#804061, #805418). A missing Build-Depends on binutils-multiarch has also been added to make the test suite pass on architectures other than amd64.

Package reviews

180 reviews have been removed, 268 added and 59 updated this week.

70 new “fail to build from source” bugs have been reported by Chris West, Chris Lamb and Niko Tyni.

New issue this week: randomness_in_ocaml_preprocessed_files.


Jim MacArthur started to work on a system to rebuild and compare packages built on using .buildinfo and

On December 1-3rd 2015, a meeting of about 40 participants from 18 different free software projects will be held in Athens, Greece with the intent of improving the collaboration between projects, helping new efforts to be started, and brainstorming on end-user aspects of reproducible builds.

23 November, 2015 04:43PM


Jonathan Dowland

CDs should come with download codes

boxes of CDs & the same data on MicroSD

There's a Vinyl resurgence going on, with vinyl record sales growing year-on-year. Many of the people buying records don't have record players. Many records are sold including a download code, granting the owner an (often one-time) opportunity to download a digital copy of the album they just bought.

Some may be tempted to look down upon those buying vinyl records, especially those who don't have a means to play them. The record itself is, now more than ever, a physical totem rather than a media for the music. But is this really that different to how we've treated audio CDs this century?

For at least 15 years, I've ripped every CD I've bought and then stored it in a shoebox. (I'm up to 10 shoeboxes). The ripped copy is the only thing I listen to. The CD is little more than a totem, albeit one which I have to use in a relatively inconvenient ritual in order to get something I can conveniently listen to.

The process of ripping CDs has improved a lot in this time, but it's still a pain. CD-ROM drives are also becoming a lot more scarce. Ripping is not necessarily reliable, either. The best tool to verify a rip is AccurateRip, a privately-owned database of track checksums. The private status is a problem for the community (remember what happened to CDDB?) and it is only useful if other people using an AccurateRip-supported ripper have already successfully ripped the CD.

Then there's things like CD pre-emphasis. It turns out that the Red Book standard defines a rarely-used flag that means the CD (or individual tracks) have had pre-emphasis applied to the treble end of the frequency spectrum. The CD player is supposed to apply de-emphasis on playback. This doesn't happen if you fetch the audio data digitally, so it becomes the CD ripper's responsibility to handle this. CD rippers have only relatively recently grown support for it. Awareness has been pretty low, so low that nobody has a good idea about how many CDs actually have pre-emphasis set: it's thought to be very rare, but (as far as I know) MusicBrainz doesn't (yet) track it.

So some proportion of my already-ripped CDs may have actually been ripped incorrectly, and I can't easily determine which ones without re-ripping them all. I know that at least my Quake computer game CD has it set, and I have suspicions about some other releases.

Going forward, this could be avoided entirely if CDs were treated more like totems, as vinyl records are, than the media delivering the music itself, and if record labels routinely included download cards with audio CDs. For just about anyone, no matter how the music was obtained, media-less digital is the canonical form for engaging with it. Attention should also be paid to make sure that digital releases are of a high quality: but that's a topic for another blog post.

23 November, 2015 04:06PM


Gergely Nagy

Keyboard updates

Last Friday, I compiled a list of keyboards I'm interested in, and received a lot of incredible feedback, thank you all! This allowed me to shorten the list considerably, to basically two pieces. I'm reasonably sure by now which one I want to buy (both), but will spend this week calming down to avoid impulse-buying. My attention was also brought to a few keyboards originally not on my list, and I'll take this opportunity to present my thoughts on those too.

The Finalists

ErgoDox

Pros:
  • Great design, by the looks of it.
  • Mechanical keys.
  • Open source hardware and firmware, thus programmable.
  • Thumb keys.
  • Available as an assembled product, from multiple sources.

Cons:
  • Primarily a kit, but assembled available.
  • Assembled versions aren't as nice as home-made variants.


The keyboard looks interesting, primarily due to the thumb keys. From the ErgoDox EZ campaign, I'm looking at $270. That's friendly, and makes ErgoDox a viable option! (Thanks @miffe!)

There's also another option, FalbaTech, which ships sooner, I can customize the keyboard to some extent, and Poland is much closer to Hungary than the US. With this option, I'm looking at $205 + shipping, a very low price for what the keyboard has to offer. (Thanks @pkkolos for the suggestion!)

Keyboardio M01

Keyboardio Model 01

Pros:
  • Mechanical keyboard.
  • Hardwood body.
  • Blank and dot-only keycaps option.
  • Open source: firmware, hardware, and so on. Comes with a screwdriver.
  • The physical key layout has much in common with my TypeMatrix.
  • Numerous thumb-accessible keys.
  • A palm key, that allows me to use the keyboard as a mouse.
  • Fully programmable LEDs.
  • Custom macros, per-application even.

Cons:
  • Fairly expensive.
  • Custom keycap design, thus rearranging them physically is not an option, which leaves me with the blank or dot-only keycap options only.
  • Available late summer, 2016.


With shipping cost and whatnot, I'm looking at something in the $370 ballpark, which is on the more expensive side. On the other hand, I get a whole lot of bang for my buck: LEDs, two center bars (tripod mounting sounds really awesome!), hardwood body, and a key layout that is very similar to what I came to love on the TypeMatrix.

I also have a thing for wooden stuff. I like the look of it, the feel of it.

The Verdict

Right now, I'm seriously considering the Model 01, because even if it is about twice the price of the ErgoDox, it also offers a lot more: hardwood body (I love wood), LEDs, palm key. I also prefer the layout of the thumb keys on the Model 01.

The Model 01 also comes pre-assembled and looks stunning, while the ErgoDox pales a little in comparison. I know I could make it look stunning too, but I do not want to build things. I'm not good at it, I don't want to be good at it, I don't want to learn it. I hate putting things together. I'm the kind of guy who needs three tries to put together a set of IKEA shelves, and I'm not exaggerating. I also like the shape of the keys better on the Model 01.

Nevertheless, the ErgoDox is still an option, due to the price. I'd love to buy both, if I could. Which means that once I'm ready to replace my keyboard at work, I will likely buy an ErgoDox. But for home, Model 01 it is, unless something even better comes along before my next pay.

The Kinesis Advantage was also a strong contender, but I ended up removing it from my preferred options, because it doesn't come with blank keys, and is not a split keyboard. And similar to the ErgoDox, I prefer the Model 01's thumb-key layout. Despite all this, I'm very curious about the key wells, and want to try it someday.

Suggested options

Yogitype

Suggested by Andred Carter, this is a very interesting keyboard with a unique design.

Pros:
  • Portable, foldable.
  • Active support for forearm and hand.
  • Hands never obstruct the view.

Cons:
  • Not mechanical.
  • Needs a special inlay.
  • Best used for word processing, programmers may run into limitations.


I like the idea of the keyboard, and if it didn't need a special inlay, but used a small screen or something to show the keys, I'd like it even more. Nevertheless, I'm looking for a mechanical keyboard right now, which I can also use for coding.

But I will definitely keep the Yogitype in mind for later!

Matias Ergo Pro

Matias Ergo Pro

Pros:
  • Mechanical keys.
  • Simple design.
  • Split keyboard.

Cons:
  • Doesn't seem to come with a blank keys option, nor in Dvorak.
  • No thumb key area.
  • Neither open source, nor open hardware.
  • I have no need for the dedicated undo, cut, paste keys.
  • Does not appear to be programmable.


This keyboard hardly meets any of my desired properties, and doesn't have anything standing out in comparison with the others. I had a quick look at it when compiling my original list, but quickly discarded it. Nevertheless, people asked me why, so I'm including my reasoning here:

While it is a split keyboard, with a fairly simple design, it doesn't come in the layout I'd prefer, nor with blank keys. It lacks the thumb key area that ErgoDox and the Model 01 have, and which I developed an affection for.

Microsoft Sculpt Ergonomic Keyboard

Microsoft Sculpt

Pros:
  • Numpad is a separate unit.
  • Reverse tilt.
  • Well positioned, big Alt keys.
  • Cheap.

Cons:
  • Not a split keyboard.
  • Not mechanical.
  • No blank or Dvorak option as far as I see.


This keyboard does not buy me much over my current TypeMatrix 2030. If I were looking for the cheapest possible ergonomic keyboard, this would be my choice. But only because of the price.

Truly Ergonomic Keyboard

Truly Ergonomic Keyboard

Pros:
  • Mechanical.
  • Detachable palm rest.
  • Programmable firmware.

Cons:
  • Not a split keyboard.
  • Layouts are virtual only, the printed keycaps stay QWERTY, as far as I see.
  • Terrible navigation key setup.


Two important factors for me are physical layout and splittability. This keyboard fails both. While it is a portable device, that's not a priority for me at this time.

23 November, 2015 12:00PM by Gergely Nagy


Jonathan Dowland

On BBC 6 Music

Back in July I had a question of mine read out on the Radcliffe and Maconie programme on BBC 6 Music. The pair were interviewing Stephen Morris of New Order and I took the opportunity to ask a question about backing vocals on the 1989 song "Run2". Here's the question and answer (318K MP3, 21s):

23 November, 2015 11:02AM


Thomas Goirand

OpenStack Liberty and Debian

Long over due post

It’s been a long time since I’ve written here. And lots of things happened in the OpenStack planet. As a full-time employee with the mission to package OpenStack in Debian, it feels like it is kind of my duty to tell everyone about what’s going on.

Liberty is out, uploaded to Debian

Since my last post, OpenStack Liberty, the 12th release of OpenStack, was released. In late August, Debian was the first platform which included Liberty, as I proudly outran both RDO and Canonical. So I was the first to make the announcement that Liberty passed most of the Tempest tests with the beta 3 release of Liberty (the Beta 3 is always kind of the first pre-release, as this is when feature freeze happens). Though I never made the announcement that Liberty final was uploaded to Debian, it was done just a single day after the official release.

Before the release, all of Liberty was living in Debian Experimental. Following the upload of the final packages in Experimental, I uploaded all of it to Sid. This represented 102 packages, so it took me about 3 days to do it all.

Tokyo summit

I had the pleasure to be in Tokyo for the Mitaka summit. I was very pleased with the cross-project sessions during the first day. Lots of these sessions were very interesting for me. In fact, I wish I could have attended them all, but of course, I can’t split myself in 3 to follow all of the 3 tracks.

Then there were the 2 sessions about Debian packaging on upstream OpenStack infra. The goal is to set up the OpenStack upstream infrastructure to allow packaging using Gerrit, and to gate each git commit using the usual tools: building the package and checking there’s no FTBFS, running checks like lintian, piuparts and such. I already knew the overview of what was needed to make it happen. What I didn’t know were the implementation details, which I hoped we could figure out during the 1:30 slot. Unfortunately, this didn’t happen as I expected, and we discussed more general things than I wished. I was told that just reading the docs from the infra team was enough, but in reality, it was not. What currently needs to happen is building a Debian-based image, using disk-image-builder, which would include the usual tools to build packages: git-buildpackage, sbuild, and so on. I’m still stuck at this stage, which would be trivial if I knew a bit more about how the upstream infra works, since I already know how to set up all of that on a local machine.
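
To give an idea of that missing piece, here is a minimal sketch of such an image build; the element names and the extra package list below are only assumptions of what could work, not the actual openstack-infra configuration:

# minimal sketch only: element names and package list are assumptions
pip install diskimage-builder
export DIB_RELEASE=jessie
disk-image-create -o debian-pkg-build \
    -p git-buildpackage,sbuild,lintian,piuparts \
    debian vm devuser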

I’ve been told by Monty Taylor that he would help. Though he’s always a very busy man, and to date, he still hasn’t found enough time to give me a hand. Nobody replied to my request for help on the openstack-dev list either. Hopefully, with a bit of insistence, someone will help.

Keystone migration to Testing (aka: Debian Stretch) blocked by python-repoze.who

Absolutely all of OpenStack Liberty, as of today, has migrated to Stretch. All? No. Keystone is blocked by a chain of dependencies. Keystone depends on python-pysaml2, itself blocked by python-repoze.who. The latter I upgraded to version 2.2, though python-repoze.what depends on version <= 1.9, which is blocking the migration. Since python-repoze.who-plugins, python-repoze.what and python-repoze.what-plugins aren’t used by any package anymore, I asked for them to be removed from Debian (see #805407). Until this request is processed by the FTP masters, Keystone, which is the most important piece of OpenStack (it does the authentication), will be blocked from migrating to Stretch.
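
For reference, this is the kind of check one can run before filing such a removal request; these are generic commands, not necessarily the exact ones used here:

# verify that nothing in the archive still depends on the obsolete packages
apt-cache rdepends python-repoze.what python-repoze.what-plugins python-repoze.who-plugins
# then file the removal request as a bug against the ftp.debian.org pseudo-package
reportbug ftp.debian.org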

New OpenStack server packages available

In my presentation at Debconf 15, I quickly introduced new services which were released upstream. I have since packaged them all:

  • Barbican (Key management as a Service)
  • Congress (Policy as a Service)
  • Magnum (Container as a Service)
  • Manila (Filesystem share as a Service)
  • Mistral (Workflow as a Service)
  • Zaqar (Queuing as a Service)

Congress, unfortunately, was not accepted to Sid yet, because of some licensing issues, especially with the doc of python-pulp. I will correct this (remove the non-free files) and reattempt an upload.

I hope to make them all available in jessie-backports (see below). For the previous release of OpenStack (ie: Kilo), I skipped the uploads of services which I thought were not really critical (like Ironic, Designate and more). But from the feedback of users, they would really like to have them all available. So this time, I will upload them all to the official jessie-backports repository.

Keystone v3 support

For those who don’t know about it, Keystone API v3 means that, on top of users and tenants, there’s a new entity called a “domain”. All of Liberty now comes with Keystone v3 support. This includes the automated Keystone catalog registration done using debconf for all *-api packages. As far as I can tell by running tempest on my CI, everything still works pretty well. In fact, Liberty is, in my experience, the first release of OpenStack to support Keystone API v3.
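
For those wanting to try it from the client side, a Keystone v3 openrc only needs the domain bits added on top of the usual variables; the URL and names below are placeholders, not anything Debian-specific:

export OS_AUTH_URL=https://keystone.example.org:5000/v3
export OS_IDENTITY_API_VERSION=3
export OS_PROJECT_DOMAIN_NAME=Default
export OS_USER_DOMAIN_NAME=Default
export OS_PROJECT_NAME=demo
export OS_USERNAME=demo
export OS_PASSWORD=secret
openstack token issue    # quick check that v3 authentication works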

Uploading Liberty to jessie-backports

I have rebuilt all of Liberty for jessie-backports on my laptop using sbuild. This is more than 150 packages (166 packages currently). It took me about 3 days to rebuild them all, including unit tests run at build time. As soon as #805407 is closed by the FTP masters, all that remains will be available in Stretch (mostly Keystone), and the upload will be possible. As there will be a lot of NEW packages (from the point of view of backports), I do expect that the approval will take some time. Also, I have to warn the original maintainers of the packages that I don’t maintain (for example, those maintained within the DPMT) that, because of the big number of packages, I will not be able to send the usual communication announcing that I’m uploading to backports. If you see a package that you maintain and you wish to upload the backport yourself, please let me know. Here is the, hopefully exhaustive, list of packages that I will upload to jessie-backports and that I don’t maintain myself (a sketch of the rebuild loop follows the list):

alabaster contextlib2 kazoo python-cachetools python-cffi python-cliff python-crank python-ddt python-docker python-eventlet python-git python-gitdb python-hypothesis python-ldap3 python-mock python-mysqldb python-pathlib python-repoze.who python-setuptools python-smmap python-unicodecsv python-urllib3 requests routes ryu sphinx sqlalchemy turbogears2 unittest2 zzzeeksphinx.
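
For reference, the rebuild loop itself can be as simple as the following sketch, assuming an sbuild chroot for jessie-backports already exists; this is an illustration, not the exact command history:

# simplified sketch: rebuild every source package against a jessie-backports chroot
for dsc in *.dsc; do
    sbuild --dist=jessie-backports --arch-all "$dsc"
done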

More than ever, I wish I could just upload these to a PPA^W Bikeshed, to minimize the disruption for the backports FTP masters, other maintainers, and our OpenStack users. Hopefully, Bikesheds will be available soon. I am sorry to give that much approval work to the backports FTP masters; however, using the latest stable system with the latest release is what most OpenStack users really want to do. All other major distributions have specific repositories too (i.e.: RDO for CentOS / Red Hat, and the cloud archive for Ubuntu), and stable-backports is currently the only place where I can upload support for the stable release.

Debian listed as supported distribution on

Good news! If you go at you will see a list of supported distributions. I am proud to be able to tell that, after 6 months of lobbying from my side, Debian is also listed there. The process of having Debian there included talking with folks from the OpenStack foundation, and having Bdale sign an agreement so that the Debian logo could be reproduced on Thanks to Bdale Garbee, Neil McGovern, Jonathan Bryce, and Danny Carreno, without whom this wouldn't have happened.

23 November, 2015 08:30AM by Goirand Thomas

November 21, 2015

hackergotchi for Bálint Réczey

Bálint Réczey

Wireshark 2.0 switched default UI to Qt in unstable

With the latest release, the Wireshark Project decided to make the Qt GUI the default interface. In line with Debian’s Policy, the packages shipped by Debian also switched the default GUI to minimize the difference from upstream. The GTK+ interface, which was the previous default, is still available from the wireshark-gtk package.

You can read more about the new 2.0.0 release in the release notes or on the Wireshark Blog featuring some of the improvements.

Happy sniffing!

update: Wireshark 2.0.0 will be available from testing and jessie-backports in a week. Ubuntu users can already download binary packages from the Wireshark stable releases PPA maintained by the Wireshark Project (including me:-)).

21 November, 2015 10:54PM by Réczey Bálint

hackergotchi for Jonathan McDowell

Jonathan McDowell

Updating a Brother HL-3040CN firmware from Linux

I have a Brother HL-3040CN networked colour laser printer. I bought it 5 years ago and I kinda wish I hadn’t. I’d done the appropriate research to confirm it worked with Linux, but I didn’t realise it only worked via a 32-bit binary driver. It’s the only reason I have 32 bit enabled on my house server, and I really wish I’d either bought a GDI printer that had an open driver (Samsung were great for this in the past) or something that did PCL or PostScript (my parents have an Xerox Phaser that Just Works). However I don’t print much (still just on my first set of toner) and once set up, the driver hasn’t needed much kicking.

A more major problem comes with firmware updates. Brother only ship update software for Windows and OS X. I have a Windows VM but the updater wants the full printer driver setup installed and that seems like overkill. I did a bit of poking around and found reference in the service manual to the ability to do an update via USB and a firmware file. Further digging led me to a page on resurrecting a Brother HL-2250DN, which discusses recovering from a failed firmware flash. It provided a way of asking the Brother site for the firmware information.

First I queried my printer details:

$ snmpwalk -v 2c -c public hl3040cn.local iso.
iso. = STRING: "MODEL=\"HL-3040CN series\""
iso. = STRING: "SPEC=\"0001\""
iso. = STRING: "FIRMID=\"MAIN\""
iso. = STRING: "FIRMVER=\"1.11\""
iso. = STRING: "FIRMVER=\"1.02\""
iso. = STRING: ""
iso. = STRING: ""
iso. = STRING: ""
iso. = STRING: ""
iso. = STRING: ""
iso. = STRING: ""
iso. = STRING: ""
iso. = STRING: ""
iso. = STRING: ""

I used that to craft an update file which I sent to Brother via curl:

curl -X POST -d @hl3040cn-update.xml -H "Content-Type:text/xml" --sslv3

This gave me back some XML with a URL for the latest main firmware, version 1.19, filename LZ2599_N.djf. I downloaded that and took a look at it, discovering it looked like a PJL file. I figured I’d see what happened if I sent it to the printer:

cat LZ2599_N.djf | nc hl3040cn.local 9100

The LCD on the front of the printer proceeded to display something like “Updating Program” and eventually the printer re-DHCPed and indicated the main firmware had gone from 1.11 to 1.19. Great! However the PCLPS firmware was still at 1.02 and I’d got the impression that 1.04 was out. I didn’t manage to figure out how to get the Brother update website to give me the 1.04 firmware, but I did manage to find a copy of LZ2600_D.djf which I was then able to send to the printer in the same way. This led to:

$ snmpwalk -v 2c -c public hl3040cn.local iso.
iso. = STRING: "MODEL=\"HL-3040CN series\""
iso. = STRING: "SPEC=\"0001\""
iso. = STRING: "FIRMID=\"MAIN\""
iso. = STRING: "FIRMVER=\"1.19\""
iso. = STRING: "FIRMVER=\"1.04\""
iso. = STRING: ""
iso. = STRING: ""
iso. = STRING: ""
iso. = STRING: ""
iso. = STRING: ""
iso. = STRING: ""
iso. = STRING: ""
iso. = STRING: ""
iso. = STRING: ""

Cool, eh?

[Disclaimer: This worked for me. I’ve no idea if it’ll work for anyone else. Don’t come running to me if you brick your printer.]

21 November, 2015 01:27PM

November 20, 2015

hackergotchi for Jonathan Dowland

Jonathan Dowland


It's been at least a year since I last did any work on Debian, but this week I finally uploaded a new version of squishyball, an audio sample comparison tool, incorporating a patch from Thibaut Girka which fixes the X/X/Y test method. Shamefully Thibaut's patch is nearly a year old too. Better late than never...

I've also uploaded a new version of smartmontools which updates the package to the new upstream version. I'm not the regular maintainer for this package, but it is in the set of packages covered by the collab-maint team. To be polite I uploaded it to DELAYED-7, so it will take a week to hit unstable. I've temporarily put a copy of the package here in the meantime.

20 November, 2015 09:08PM

John Goerzen

I do not fear

I am so saddened by the news this week. The attacks in Paris, Beirut, and Mali. The reaction of fear, anger, and hate. Governors racing to claim they will keep out refugees, even though they lack the power to do so. Congress voting to keep out refugees.

Emotions are a powerful thing. They can cause people to rise up and accomplish stunning things that move humanity forward. And they can move us back. Fear, and the manipulation of it, is one of those.

What have I to fear?

Even if the United States accepted half a million Syrian refugees tomorrow, I would be far more likely to die in a car accident than at the hands of a Syrian terrorist. I am a careful and cautious person, but I understand that life is not lived unless risk is balanced. I know there is a risk of being in a car crash every time I drive somewhere — but if that kept me at home, I would never see my kids’ violin concert, the beautiful “painted” canyon of Texas, or the Flint Hills of Kansas. So I drive smart and carefully, but I still drive without fear. I accept this level of risk as necessary to have a life worth living in this area (where there are no public transit options and the nearest town is miles away).

I have had pain in my life. I’ve seen grandparents pass away, I’ve seen others with health scares. These things are hard to think about, but they happen to us all at some point.

What have I to fear?

I do not fear giving food to the hungry, shelter to the homeless, comfort to those that have spent the last years being shot at. I do not fear helping someone that is different than me. If I fail to do these things for someone because of where they come from or what their holy book is, then I have become less human. I have become consumed by fear. I have let the terrorists have control over my life. And I refuse to do that.

If governors really wanted to save lives, they would support meaningful mass transit alternatives that would prevent tens of thousands of road deaths a year. They would support guaranteed health care for all. They would support good education, science-based climate change action, clean water and air, mental health services for all, and above all, compassion for everyone.

By supporting Muslim registries, we look like Hitler to them. By discriminating against refugees based on where they’re from or their religion, we support the terrorists, making it easy for them to win hearts and minds. By ignoring the fact that entering the country as a refugee takes years, as opposed to entering as a tourist taking only minutes, we willfully ignore the truth about where dangers lie.

So what do I have to fear?

Only, as the saying goes, fear. Fear is making this country turn its backs on the needy. Fear is making not just the US but much of Europe turn its backs on civil liberties and due process. Fear gives the terrorists control, and that helps them win.

I refuse. I simply refuse to play along. No terrorist, no politician, no bigot gets to steal MY humanity.

Ultimately, however, I know that the long game is not one of fear. The arc of the universe bends towards justice, and ultimately, love wins. It takes agonizingly long sometimes, but in the end, love wins.

So I do not fear.

20 November, 2015 07:22PM by John Goerzen

hackergotchi for Gergely Nagy

Gergely Nagy

Looking for a keyboard

Even though I spend more time staring at the screen than typing, there are times when I - after lots and lots of prior brain work - sit down and start typing, a lot. A couple of years ago, I started to feel pain in my wrists, and there were multiple occasions when I had to completely stop writing for longer periods of time. These were situations I obviously did not want repeated, so I started to look for remedies. First, I bought a new keyboard, a TypeMatrix 2300, which while not ergonomic, was a huge relief for my hands and wrists. I also started to learn Dvorak, but that's still something that is kind-of in progress: my left hand can write Dvorak reasonably fast, but my right one seems to be Qwerty-wired, even after a month of typing Dvorak almost exclusively.

This keyboard served me well for the past five years or so. But recently, I started to look for a replacement, partly triggered by a Clojure/conj talk I watched. I got as far as assembling a list of keyboards I'm interested in, but I have a hard time choosing. This blog post serves two purposes then: first, to make a clear pros/cons list for myself; second, to solicit feedback from others who may have more experience with any of the options below.

Update: There is a [follow up post], with a few more keyboards explored, and a semi-final verdict. Thanks everyone for the feedback and help, much appreciated!

Let's start with the current keyboard!

TypeMatrix 2030

TypeMatrix 2030


  • The Matrix architecture, with straight vertical key columns has been incredibly convenient.
  • Enter and Backspace in the middle, both large: loving it.
  • Skinnable (easier to clean, and aids in learning a new layout).
  • Optional dvorak skin, and a hardware Dvorak switch.
  • The layout (cursor keys, home/end, page up/down, etc) is something I got used to very fast.
  • Multimedia keys close by with Fn.
  • Small, portable, lightweight - ideal for travel.


  • Small: while also a feature, this is a downside too. Shoulder position is not ideal.
  • Skins: while they are a terrific aid when learning a new layout, and make cleaning a lot easier, they wear off quickly. Sometimes fingernails are left to grow too long, and that doesn't do the skin any good. One of my two QWERTY skins has a few holes already, sadly.
  • Not a split keyboard, which is starting to feel undesirable.


All in all, this is a keyboard I absolutely love, and am very happy with. Yet, I feel I'm ready to try something different. With my skins aging, and the aforementioned Clojure/conj talk, the desire to switch has been growing for a while now.

Desired properties

There are a few desired properties of the keyboard I want next. The perfect keyboard need not have all of these, but the more the merrier.

  • Ergonomic design.
  • Available in Dvorak, or with blank keys.
  • Preferably a split keyboard, so I can position the two parts as I see fit.
  • Ships to Hungary, or Germany, in a reasonable time frame. (If all else fails, shipping to the US may work too, but I'd rather avoid going through extra hoops.)
  • Mechanical keys preferred. But not the loud clicky type: I work in an office; and at home, I don't want to wake my wife either.

I plan to buy one keyboard for a start, but may end up buying another to bring to work (like I did with the TypeMatrix, except my employer at the time bought the second one for me). At work, I will continue using the TypeMatrix, most likely, but I'm not sure yet.

Anyhow, there are a number of things I do with my computer that require a keyboard:

  • I write code, a considerable amount.
  • I write prose, even more than code. Usually in English, sometimes in Hungarian.
  • I play games. Most of them, with a dedicated controller, but there are some where I use the keyboard a lot.
  • I browse the web, listen to music, and occasionally edit videos.
  • I multi-task all the time.
  • 90% of my time is spent within Emacs (recently switched to Spacemacs).
  • I hate the mouse, with a passion. Trackballs, trackpoints and touchpads even more. If I can use my keyboard to do mouse-y stuff well enough to control the browser, and do some other things that do not require precise movement (that is, not games), I'll be very happy.

I am looking for a keyboard that helps me do these things. A keyboard that will stay with me not for five years or a decade, but pretty much forever.

The options

Ultimate Hacking Keyboard

Ultimate Hacking Keyboard


  • Split keyboard.
  • Mechanical keys (with a quiet option).
  • Ships to Hungary. Made in Hungary!
  • Optional addons: three extra buttons and a small trackball for the left side, and a trackball for the right side. While I'm not a big fan of the mouse, the primary reason is that I have to move my hand. If it's in the middle, that sounds much better.
  • Four layers of the factory keymap: I love the idea of these layers, especially the mouse layer.
  • Programmable, so I can define any layout I want.
  • Open source firmware, design and agent!
  • An optional palm rest is available as well.
  • Blank option available.


  • Likely not available before late summer, 2016.
  • No thumb keys.
  • Space/Mod arrangement feels alien.
  • The LED area is useless to me, and bothers my eye. Not a big deal, but still.
  • While thumb keys are available for the left side, not so for the right one. I'd rather have keys there than a trackball. The only reason I'd want the $50 addon set, is the left thumb-key module (which also seems to have a trackpoint, another pointless gadget).


The keyboard looks nice, has a lot of appealing features. It is programmable, so much so that by the looks of it, I could emulate the hardware dvorak switch my TypeMatrix has. However, I'm very unhappy with the addons, so there's that too.

All in all, this would cost me about $304 (base keyboard, modules, palm rest and shipping). Not too bad, certainly a strong contender, despite the shortcomings.

ErgoDox


  • Great design, by the looks of it.
  • Mechanical keys.
  • Open source hardware and firmware, thus programmable.
  • Thumb keys.
  • Available via ErgoDox EZ as an assembled product.


  • Primarily a kit, though an assembled version is available.
  • Not sure when it would ship (December shipments are sold out).


The keyboard looks interesting, primarily due to the thumb keys. From the ErgoDox EZ campaign, I'm looking at $270. That's friendly, and makes ErgoDox a viable option! (Thanks @miffe!)

Kinesis Advantage

Kinesis Advantage


  • Mechanical keys, Cherry-MX brown.
  • Separate thumb keys.
  • Key wells look interesting.
  • Available right now.
  • QWERTY/Dvorak layout available.


  • Not a split keyboard.
  • Not open source, neither hardware, nor firmware.
  • Shipping to Hungary may be problematic.
  • The QWERTY/Dvorak layout is considerably more expensive.
  • Judging by some of the videos I saw, keys are too loud.


The key wells look interesting, but it's not a split keyboard, nor is it open source. The cost comes out to about $325 plus shipping and VAT and so on, so I'm probably looking at something closer to $400. Nah. I'm pretty sure I can rule this out.

Kinesis FreeStyle2

Kinesis FreeStyle2


  • Split keyboard.
  • Available right now.
  • Optional accessory, to adjust the slope of the keyboard.


  • Not open source, neither hardware, nor firmware.
  • Doesn't seem to be mechanical.
  • Shipping to Hungary may be problematic.
  • No Dvorak layout.
  • No thumb keys.


While a split keyboard, at a reasonably low cost ($149 + shipping + VAT), it lacks too many things to be considered a worthy contender.

Maltron


  • Mechanical keyboard.
  • Key wells.
  • Thumb keys.
  • Built in palm rest.
  • Available in Dvorak too.


  • Not a split keyboard.
  • The center numeric area looks weird.
  • Not sure about programmability.
  • Not open source.
  • Expensive.


Without shipping, I'm looking at £450. That's a very steep price. I love the wells, and the thumb keys, but it's not split, and customisability is a big question here.

Atreus


  • Sleek, compact design.
  • No keycaps.
  • Mechanical keyboard.
  • Open source firmware.
  • More keys within thumbs reach.
  • Available right now.


  • Ships as a DIY kit.
  • Not a split keyboard.


While not a split keyboard, it does look very interesting, and the price is much lower than the rest: $149 + shipping ($50 or so). It is similar - in spirit - to my existing TypeMatrix. It wouldn't take much to get used to, and is half the price of the alternatives. A strong option, for sure.

Keyboardio M01

Keyboardio Model 01


  • Mechanical keyboard.
  • Hardwood body.
  • Blank and dot-only keycaps option.
  • Open source: firmware, hardware, and so on. Comes with a screwdriver.
  • The physical key layout has much in common with my TypeMatrix.
  • Numerous thumb-accessible keys.
  • A palm key that allows me to use the keyboard as a mouse.
  • Fully programmable LEDs.
  • Custom macros, per-application even.


  • Fairly expensive.
  • Custom keycap design, so rearranging them physically is not an option, which leaves me with only the blank or dot-only keycap options.
  • Available late summer, 2016.


With shipping cost and whatnot, I'm looking at something in the $370 ballpark, which is on the more expensive side. On the other hand, I get a whole lot of bang for my buck: LEDs, two center bars (tripod mounting sounds really awesome!), hardwood body, and a key layout that is very similar to what I came to love on the TypeMatrix.

I also have a thing for wooden stuff. I like the look of it, the feel of it.

The Preference List

After writing this all up, I think I prefer the Model 01, but the UHK and ErgoDox come close too!

The UHK is cheaper, but not by a large margin. It lacks the thumb keys and the palm key the M01 has. It also looks rather dull (sorry). They'd both ship about the same time, but the M01 is already funded, while the UHK is not (mind you, there's a pretty darn high chance it will be).

The ErgoDox has thumb keys, is a split keyboard, and is open source. Compared to the UHK, we have the thumb keys, and less distraction, for a better price. But the case is not so nice. Compared to the Model 01: no LEDs, no center bar, and an inferior case. But a much better price, which is an important factor too.

Then, there's the Atreus. While it's a DIY kit, it is much more affordable than the rest, and I could have it far sooner. Yet... it doesn't feel like a big enough switch from my current keyboard. I might as well continue using the TypeMatrix then, right?

The rest, I ruled out earlier, while I was reviewing them anyway.

So, the big question is: should I invest close to $400 into a keyboard that looks stunning, and will likely grow old with me? Or should I give up some of the features, and settle for one of the $300 ones that will also grow old with me? Or is there an option I did not consider, that may match my needs and preferences better?

If you, my dear reader, got this far, and have a suggestion, please either tweet at me, or write an email, or reach me over any other medium I am reachable at (including IRC, hanging out as algernon on FreeNode and OFTC).

Thank you in advance, to all of you who contact me, and help me choose a keyboard!

20 November, 2015 06:45PM by Gergely Nagy

hackergotchi for Daniel Pocock

Daniel Pocock

Databases of Muslims and homosexuals?

One US presidential candidate has said a lot recently, but the comments about making a database of Muslims may qualify as the most extreme.

Of course, if he really wanted to, somebody with this mindset could find all the Muslims anyway. A quick and easy solution would involve tracing all the mobile phone signals around mosques on a Friday. Mr would-be President could compel Facebook and other social networks to disclose lists of users who identify as Muslim.

Databases are a dangerous side-effect of gay marriage

In 2014 there was significant discussion about Brendan Eich's donation to the campaign against gay marriage.

One fact that never ranked very highly in the debate at the time is that not all gay people actually support gay marriage. Even where these marriages are permitted, not everybody who can marry now is choosing to do so.

The reasons for this are varied, but one key point that has often been missed is that there are two routes to marriage equality: one involves permitting gay couples to visit the register office and fill in a form just as other couples do. The other route to equality is to remove all the legal artifacts around marriage altogether.

When the government does issue a marriage certificate, it is not long before other organizations start asking for confirmation of the marriage. Everybody from banks to letting agents and Facebook wants to know about it. Many companies outsource that data into cloud CRM systems such as Salesforce. Before you know it, there are numerous databases that somebody could mine to make a list of confirmed homosexuals.

Of course, if everybody in the world was going to live happily ever after none of this would be a problem. But the reality is different.

While discrimination, whether against Muslims or homosexuals, is prohibited and can even lead to criminal sanctions in some countries, this attitude is not shared globally. Once gay people have their marriage status documented in the frequent flyer or hotel loyalty program, or in the public part of their Facebook profile, there are various countries where they are going to be at much higher risk of prosecution/persecution. The equality to marry in the US or UK may mean they have less equality when choosing travel destinations.

Those places are not as obscure as you might think: even in Australia, regarded as a civilized and laid-back western democracy, the state of Tasmania fought tooth-and-nail to retain the criminalization of virtually all homosexual conduct until 1997 when the combined actions of the federal government and high court compelled the state to reform. Despite the changes, people with some of the most offensive attitudes are able to achieve and retain a position of significant authority. The same Australian senator who infamously linked gay marriage with bestiality has successfully used his position to set up a Senate inquiry as a platform for conspiracy theories linking Halal certification with terrorism.

There are many ways a database can fall into the wrong hands

Ironically, one of the most valuable lessons about the risk of registering Muslims and homosexuals was an injustice against the very same tea-party supporters a certain presidential candidate is trying to woo. In 2013, it was revealed IRS employees had started applying a different process to discriminate against groups with Tea party in their name.

It is not hard to imagine other types of rogue or misinformed behavior by people in positions of authority when they are presented with information that they don't actually need about somebody's religion or sexuality.

Beyond this type of rogue behavior by individual officials and departments, there is also the more sinister proposition that somebody truly unpleasant is elected into power and can immediately use things like a Muslim database, surveillance data or the marriage database for a program of systematic discrimination. France had a close shave with this scenario in the 2002 presidential election when Jean-Marie Le Pen, who has at least six convictions for racism or inciting racial hatred, made it to the final round in a two-candidate run-off with Jacques Chirac.

The best data security

The best way to be safe, wherever you go, both now and in the future, is not to have data about yourself in any database. When filling out forms, think need-to-know. If some company doesn't really need your personal mobile number, your date of birth, your religion or your marriage status, don't give it to them.

20 November, 2015 06:02PM by Daniel.Pocock

Sylvain Beucler

Rebuilding Android proprietary SDK binaries

Going back to Android recently, I saw that all tools binaries from the Android project are now click-wrapped by a quite ugly proprietary license, among others an anti-fork clause (details below). Apparently those T&C are years old, but the click-wrapping is newer.

This applies to the SDK, the NDK, Android Studio, and all the essentials you download through the Android SDK Manager.

Since I keep my hands clean of smelly EULAs, I'm working on rebuilding the Android tools I need.
We're talking about hours-long, quad-core + 8GB-RAM + 100GB-disk-eating builds here, so I'd like to publish them as part of a project that cares.

As a proof-of-concept, the Replicant project ships a 4.2 SDK and I contributed build instructions for ADT and NDK (which I now use daily).

(Replicant is currently stuck to a 2013 code base though.)

I also have in-progress instructions on my hard-drive to rebuild various newer versions of the SDK/API levels, and for the NDK whose releases are quite hard to reproduce (no git tags, requires fixes committed after the release, updates are partial rebuilds, etc.) - not to mention that Google doesn't publish the source code until after the official release (closed development) :/ And in some cases like Android Support Repository [not Library] I didn't even find the proper source code, only an old prebuilt.
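
To give an idea of the scale, the rough shape of a from-source SDK build is the standard AOSP procedure sketched below; the manifest URL and branch name are examples only, and the real pain lies in the details mentioned above:

# rough sketch of a from-source SDK build; branch name is an example only
repo init -u https://android.googlesource.com/platform/manifest -b android-4.2.2_r1
repo sync -j4
source build/envsetup.sh
lunch sdk-eng
make -j4 sdk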

Would you be interested in contributing, and would you recommend a structure that would promote Free, rebuilt Android *DK?

The legalese

Anti-fork clause:

3.4 You agree that you will not take any actions that may cause or result in the fragmentation of Android, including but not limited to distributing, participating in the creation of, or promoting in any way a software development kit derived from the SDK.

So basically the source is Apache 2 + GPL, but the binaries are non-free. By the way this is not a GPL violation because right after:

3.5 Use, reproduction and distribution of components of the SDK licensed under an open source software license are governed solely by the terms of that open source software license and not this License Agreement.

Still, AFAIU by clicking "Accept" to get the binary you still accept the non-free "Terms and Conditions".

(Incidentally, if Google wanted SDK forks to spread and increase fragmentation, introducing an obnoxious EULA is probably the first thing I'd have recommended. What was its legal team thinking?)

Indemnification clause:

12.1 To the maximum extent permitted by law, you agree to defend, indemnify and hold harmless Google, its affiliates and their respective directors, officers, employees and agents from and against any and all claims, actions, suits or proceedings, as well as any and all losses, liabilities, damages, costs and expenses (including reasonable attorneys fees) arising out of or accruing from (a) your use of the SDK, (b) any application you develop on the SDK that infringes any copyright, trademark, trade secret, trade dress, patent or other intellectual property right of any person or defames any person or violates their rights of publicity or privacy, and (c) any non-compliance by you with this License Agreement.

Usage restriction:

3.1 Subject to the terms of this License Agreement, Google grants you a limited, worldwide, royalty-free, non-assignable and non-exclusive license to use the SDK solely to develop applications to run on the Android platform.

3.3 You may not use the SDK for any purpose not expressly permitted by this License Agreement. Except to the extent required by applicable third party licenses, you may not: (a) copy (except for backup purposes), modify, adapt, redistribute, decompile, reverse engineer, disassemble, or create derivative works of the SDK or any part of the SDK; or (b) load any part of the SDK onto a mobile handset or any other hardware device except a personal computer, combine any part of the SDK with other software, or distribute any software or device incorporating a part of the SDK.

If you know the URLs, you can still direct-download some of the binaries which don't embed the license, but all this feels fishy. GNU licensing didn't answer me (yet). Maybe debian-legal has an opinion?

In any case, the difficulty to reproduce the *DK builds is worrying enough to warrant an independent rebuild.

Did you notice this?

20 November, 2015 02:18PM

No to ACTA - Paris

Today, there were events all around Europe to block ACTA.

In Paris, the protest started at Place de la Bastille :

APRIL was present, with in particular its president Lionel Allorge, and two members who wore the traditional anti-DRM suit :

Jérémie Zimmermann from La Quadrature du Net gave a speech and urged people to contact their legal representatives, in addition to protesting in the street :

The protest was cheerful and free of violence :

It got decent media coverage :

Notable places it crossed include Place des Victoires :

and Palais Royal, where it ended :

Next protest is in 2 weeks, on March 10th. Update your agenda!

20 November, 2015 02:18PM

New free OpenGL ES documentation

Great news!

The Learn OpenGL ES website recently switched its licensing to Creative Commons BY-SA 3.0 :)

It provides tutorials for OpenGL ES using Java/Android and WebGL, and is focusing on a more community-oriented creative process. Give them cheers!

20 November, 2015 02:18PM

Mini-sendmail... in bash

I recently faced an environment where there is no MTA.

WTF? The reason is that people who work there get security audits on a regular basis, and the security people are usually mo...derately skilled guys who blindly run a set of scripts, e.g. by ordering to disable Apache modules that "were seen enabled in /etc/apache2/mods-available/"...

To avoid spending days arguing with them and nitpicking with non-technical managers, the system is trimmed to the minimum - and there is no MTA. No MTA means no cron output, which makes it hard to understand why last night's cron job failed miserably.

Since it was not my role to reshape the whole business unit, I decided to hack a super-light, but functional way to get my cron output:

cat <<'EOF' > /usr/sbin/sendmail
#!/bin/sh
(
    echo "From me  $(LANG=C date)"
    cat    # the message itself, read from stdin
) >> /var/mail/all
EOF
chmod 755 /usr/sbin/sendmail

It works! :)
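
A quick sanity check, for the curious (the recipient argument is simply ignored by this fake sendmail):

echo "hello from cron" | /usr/sbin/sendmail root
tail -n 3 /var/mail/all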

There is a companion logrotate script, to avoid filling the file system:

cat <<'EOF' > /etc/logrotate.d/mail-all
/var/mail/all {
  rotate 10
  create 622 root mail
}
EOF

Bootstrap with:

touch /var/mail/all
logrotate -f /etc/logrotate.d/mail-all

You now can check your sys-mails with:

mutt -f /var/mail/all


20 November, 2015 02:18PM

Meritous: Free game ported on Android

Base attack

Meritous is a nice, addictive action-adventure dungeon crawl game. Each new game is unique since the dungeon is built in a semi-random fashion. Last but not least, the engine, graphics and sound effects are GPL'd :)

The game is based on SDL 1.2, which has an unofficial Android variant, so I decided to try and port it to my cell phone! The port was surprisingly smooth and only non-SDL fixes (move a big stack allocation to the heap) were necessary. Who said it was difficult to program in C on Android? ;)

It was also an opportunity to study the build system for F-Droid, an app market for free software apps, where APKs are rebuilt from source. The spec-like file is here.

The game packaging is also being resurrected for Debian but has been distressfully held hostage in the NEW queue for 2 weeks!

You can download the very first (aka beta) Android version:

  • for free at F-Droid
  • for 0.50€ at GPlay - because publishing at GPlay costs $25 (+30% of sales...)

Comments welcome!

20 November, 2015 02:18PM

hackergotchi for Timo Jyrinki

Timo Jyrinki

Converting an existing installation to LUKS using luksipc

This is a burst of notes that I wrote in an e-mail in June when asked about it, and I'm not going to have any better steps since I don't remember even that amount as back then. I figured it's better to have it out than not.

So... if you want to use LUKS In-Place Conversion Tool, the notes below on converting a shipped-with-Ubuntu Dell XPS 13 Developer Edition (2015 Intel Broadwell model) may help you. There were a couple of small learnings to be had...
The page itself is good and without errors, although it funnily uses reiserfs as an example. It was only a bit unclear why I did save the initial_keyfile.bin since it was then removed in the next step (I guess it's for the case you want to have a recovery file hidden somewhere in case you forget the passphrase).

To use the tool I booted from a 14.04.2 LTS USB live image and operated there, including downloading and compiling luksipc in the live session. The exact reason for resizing before luksipc was a bit unclear to me at first, so I simply resized the main rootfs partition and left unallocated space in the partition table.

Then finally I ran ./luksipc -d /dev/sda4 etc.

I realized I want /boot to be on an unencrypted partition to be able to load the kernel + initrd from grub before entering into LUKS unlocking. I couldn't resize the luks partition anymore since it was encrypted... So I resized what I think was the empty small DIAGS partition (maybe used for some system diagnostic or something, I don't know), or possibly the next one that is the actual recovery partition one can reinstall the pre-installed Ubuntu from. And naturally I had some problems because it seems vfatresize tool didn't do what I wanted it to do and gparted simply crashed when I tried to use it first to do the same. Anyway, when done with getting some extra free space somewhere, I used the remaining 350MB for /boot where I copied the rootfs's /boot contents to.

After adding the passphrase in LUKS I had everything encrypted and decryptable, but obviously I could only access it from a live session by manual cryptsetup luksOpen + mount /dev/mapper/myroot commands. I needed to configure GRUB, and I needed to do it with the grub-efi-amd64 package, which was a bit unfamiliar to me. There's also grub-efi-amd64-signed, which I have installed now, but I'm not sure if it was required for the configuration. Secure boot is not enabled by default in the BIOS, so maybe it isn't needed.

I did the GRUB installation, I think inside the rootfs chroot, where I also mounted /dev/sda6 as /boot (inside the rootfs chroot), i.e. mounted dev and sys with -o bind under the chroot (from outside the chroot) and mount -t proc proc proc too. I did a lot of trial and error, so I surely also tried from outside the chroot, in the live session, using some parameters to point to the mounted rootfs's directories...

I definitely needed to install cryptsetup etc. inside the encrypted rootfs with apt, and I remember debugging for some time whether they made it into the initrd correctly after I executed mkinitramfs/update-initramfs inside the chroot.
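
Pieced together, the sequence looks roughly like the sketch below; device names are the ones from my setup described above, and this is a reconstruction rather than a verbatim log:

# reconstruction, not a verbatim log: sda4 = LUKS rootfs, sda6 = new /boot, sda1 = EFI
cryptsetup luksOpen /dev/sda4 myroot
mount /dev/mapper/myroot /mnt
mount /dev/sda6 /mnt/boot
mount /dev/sda1 /mnt/boot/efi
mount --bind /dev /mnt/dev
mount --bind /sys /mnt/sys
mount -t proc proc /mnt/proc
chroot /mnt /bin/bash
apt-get install cryptsetup grub-efi-amd64   # inside the chroot
update-initramfs -u
update-grub
grub-install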

At the end I had grub asking for the password correctly at bootup. Obviously I had edited the rootfs's /etc/fstab to include the new /boot partition, I changed / to be "UUID=/dev/mapper/myroot /     ext4    errors=remount-ro 0       ", kept /boot/efi as coming from the /dev/sda1 and so on. I had also added "myroot /dev/sda4 none luks" to /etc/crypttab. I seem to also have GRUB_CMDLINE_LINUX="cryptdevice=/dev/sda4:myroot root=/dev/mapper/myroot" in /etc/default/grub.

The only thing I did save from the live session was the original partition table if I want to revert.

So the original was:

Found valid GPT with protective MBR; using GPT.
Disk /dev/sda: 500118192 sectors, 238.5 GiB
Logical sector size: 512 bytes
First usable sector is 34, last usable sector is 500118158
Partitions will be aligned on 2048-sector boundaries
Total free space is 6765 sectors (3.3 MiB)
Number  Start (sector)    End (sector)  Size       Code  Name
1            2048         1026047   500.0 MiB   EF00  EFI system partition
2         1026048         1107967   40.0 MiB    FFFF  Basic data partition
3         1107968         7399423   3.0 GiB     0700  Basic data partition
4         7399424       467013631   219.2 GiB   8300
5       467017728       500117503   15.8 GiB    8200

And I now have:

Number  Start (sector)    End (sector)  Size       Code  Name

1            2048         1026047   500.0 MiB   EF00  EFI system partition
2         1026048         1107967   40.0 MiB    FFFF  Basic data partition
3         1832960         7399423   2.7 GiB     0700  Basic data partition
4         7399424       467013631   219.2 GiB   8300
5       467017728       500117503   15.8 GiB    8200
6         1107968         1832959   354.0 MiB   8300

So it seems I did not edit DIAGS (and it was also originally just 40MB) but did something with the recovery partition while preserving its contents. It's a FAT partition so maybe I was able to somehow resize it after all.

The 16GB partition is the default swap partition. I did not encrypt it at least yet, I tend to not run into swap anyway ever in my normal use with the 8GB RAM.

If you go this route, good luck! :D

20 November, 2015 02:14PM by Timo Jyrinki (

hackergotchi for Pau Garcia i Quiles

Pau Garcia i Quiles

November 19, 2015

hackergotchi for Matthew Garrett

Matthew Garrett

If it's not practical to redistribute free software, it's not free software in practice

I've previously written about Canonical's obnoxious IP policy and how Mark Shuttleworth admits it's deliberately vague. After spending some time discussing specific examples with Canonical, I've been explicitly told that while Canonical will gladly give me a cost-free trademark license permitting me to redistribute unmodified Ubuntu binaries, they will not tell me what "Any redistribution of modified versions of Ubuntu must be approved, certified or provided by Canonical if you are going to associate it with the Trademarks. Otherwise you must remove and replace the Trademarks and will need to recompile the source code to create your own binaries" actually means.

Why does this matter? The free software definition requires that you be able to redistribute software to other people in either unmodified or modified form without needing to ask for permission first. This makes it clear that Ubuntu itself isn't free software - distributing the individual binary packages without permission is forbidden, even if they wouldn't contain any infringing trademarks[1]. This is obnoxious, but not inherently toxic. The source packages for Ubuntu could still be free software, making it fairly straightforward to build a free software equivalent.

Unfortunately, while true in theory, this isn't true in practice. The issue here is the apparently simple phrase you must remove and replace the Trademarks and will need to recompile the source code. "Trademarks" is defined later as being the words "Ubuntu", "Kubuntu", "Juju", "Landscape", "Edubuntu" and "Xubuntu" in either textual or logo form. The naive interpretation of this is that you have to remove trademarks where they'd be infringing - for instance, shipping the Ubuntu bootsplash as part of a modified product would almost certainly be clear trademark infringement, so you shouldn't do that. But that's not what the policy actually says. It insists that all trademarks be removed, whether they would embody an infringement or not. If a README says "To build this software under Ubuntu, install the following packages", a literal reading of Canonical's policy would require you to remove or replace the word "Ubuntu" even though failing to do so wouldn't be a trademark infringement. If an email address is present in a changelog, you'd have to change it. You wouldn't be able to ship the juju-core package without renaming it and the application within. If this is what the policy means, it's so impractical to be able to rebuild Ubuntu that it's not free software in any meaningful way.

This seems like a pretty ludicrous interpretation, but it's one that Canonical refuse to explicitly rule out. Compare this to Red Hat's requirements around Fedora - if you replace the fedora-logos, fedora-release and fedora-release-notes packages with your own content, you're good. A policy like this satisfies the concerns that Dustin raised over people misrepresenting their products, but still makes it easy for users to distribute modified code to other users. There's nothing whatsoever stopping Canonical from adopting a similarly unambiguous policy.

Mark has repeatedly asserted that attempts to raise this issue are mere FUD, but he won't answer you if you ask him direct questions about this policy and will insist that it's necessary to protect Ubuntu's brand. The reality is that if Debian had had an identical policy in 2004, Ubuntu wouldn't exist. The effort required to strip all Debian trademarks from the source packages would have been immense[2], and this would have had to be repeated for every release. While this policy is in place, nobody's going to be able to take Ubuntu and build something better. It's grotesquely hypocritical, especially when the Ubuntu website still talks about their belief that people should be able to distribute modifications without licensing fees.

All that's required for Canonical to deal with this problem is to follow Fedora's lead and isolate their trademarks in a small set of packages, then tell users that those packages must be replaced if distributing a modified version of Ubuntu. If they're serious about this being a branding issue, they'll do it. And if I'm right that the policy is deliberately obfuscated so Canonical can encourage people to buy licenses, they won't. It's easy for them to prove me wrong, and I'll be delighted if they do. Let's see what happens.

[1] The policy is quite clear on this. If you want to distribute something other than an unmodified Ubuntu image, you have two choices:
  1. Gain approval or certification from Canonical
  2. Remove all trademarks and recompile the source code
Note that option 2 requires you to rebuild even if there are no trademarks to remove.

[2] Especially when every source package contains a directory called "debian"…

comment count unavailable comments

19 November, 2015 10:16PM

Vincent Fourmond

Purely shell way to extract a numbered line from a file

I feel almost ashamed to write this down, but as it took me a long time to realize it, I'll write it down anyway. Here's the simplest portable shell one-liner I've found to extract only, say, the 5th line from a file:

~ cat file | tail -n +5 | head -n1

Hope it helps...

Update: following the comments to this post, here are a couple of other solutions. Thanks to all who contributed !

~ cat file | sed -ne 5p
~ cat file | sed -ne '5{p;q}' 

The second solution has the advantage of closing the input after line 5, so if you have an expensive command, you'll kill it with a SIGPIPE soon after it produces line 5. Other ones:

~ cat file | awk 'NR==5 {print; exit}'
~ cat file | head -n5 | tail -n1 

The last one, while simpler, is slightly more expensive because all the lines before the one you're interested in are copied twice (first from cat to head and then from head to tail). This happens less with the first solution because, even if tail passes on all the lines after the one you're interested in, it is killed by a SIGPIPE when head closes.

19 November, 2015 09:38PM by Vincent Fourmond (

hackergotchi for Miriam Ruiz

Miriam Ruiz

Projects, Conflicts and Emotions

The Debian Project includes many people, groups and teams with different goals, priorities and ways of doing things. Diversity is a good thing, and the results of the continuous interaction, cooperation and competition among different points of view and components make up a successful developing framework both in Debian and in other Free / Libre / Open Source Software communities.

The cost of this evolutionary paradigm is that sometimes there are subprojects that might have been extremely successful and useful that are surpassed by newer approaches, or that have to compete with alternative approaches that were not there before, and which might pursue different goals or have a different way of doing things that their developers find preferable in terms of modularity, scalability, stability, maintenance, aesthetics or any other reason.

Whenever this happens, the emotional impact on the person or group of people that are behind the established component (or process, or organizational structure), that is being questioned and put under test by the newer approach can be important, particularly when they have invested a lot of time and effort and a considerable amount of emotional energy doing a great job for many years. Something they should be thanked for.

This might be particularly hard when, for whatever reason, the communication between the teams is not too fluent or constant, and the author or authors of the solution that was considered mainstream until then might feel left out and their territory stolen. As development teams and technical people in the Free / Libre / Open Source world are generally more focused on results than on relationships, projects are not too good at managing these (emotional, relational) situations, even though they (we) are gradually learning and improving.

What has happened with the Debian Live Project is indeed a hurtful situation, even though it’s probably an unavoidable one. The Debian Live Project has done a great job for many years and it is sad to see it dying abruptly. A new competing approach is on its way with a different set of priorities and a different way of doing things, and all that can be done at the moment is to thank Daniel for all his work, as well as everyone who has made the Debian Live Project successful for so many years, and also to thank the people who are investing their time and effort in developing something that might be even better. Let’s wait and see.

Source of the image: Conflict Modes and Managerial Styles by Ed Batista

19 November, 2015 12:09PM by Miry

November 18, 2015

hackergotchi for Daniel Pocock

Daniel Pocock

Improving DruCall and JSCommunicator user interface

DruCall is one of the easiest ways to get up and running with WebRTC voice and video calling on your own web site or blog. It is based on 100% open source and 100% open standards - no binary browser plugins and no lock-in to a specific service provider or vendor.

On Debian or Ubuntu, just running a command such as

# apt-get install -t jessie-backports drupal7-mod-drucall

will install Drupal, Apache, MySQL, JSCommunicator, JsSIP and all the other JavaScript library packages and module dependencies for DruCall itself.
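
If you prefer to stay on the command line for the Drupal side as well, enabling the module with drush should be enough; note that I'm assuming here that the module's machine name is drucall:

# assumes the module machine name is "drucall"; adjust if yours differs
drush -y en drucall
drush cc all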

The user interface

Most of my experience is in server-side development, including things like the powerful SIP over WebSocket implementation in the reSIProcate SIP proxy repro.

In creating DruCall, I have simply concentrated on those areas related to configuring and bringing up the WebSocket connection and creating the authentication tokens for the call.

Those things provide a firm foundation for the module, but it would be nice to improve the way it is presented and optimize the integration with other Drupal features. This is where the projects (both DruCall and JSCommunicator) would really benefit from feedback and contributions from people who know Drupal and web design in much more detail.

Benefits for collaboration

If anybody wants to collaborate on either or both of these projects, I'd be happy to offer access to a pre-configured SIP WebSocket server in my lab for more convenient testing. The DruCall source code is a hosted project and the JSCommunicator source code is on Github.

When you get to the stage where you want to run your own SIP WebSocket server as well then free community support can also be provided through the repro-user mailing list. The free, online RTC Quick Start Guide gives a very comprehensive overview of everything you need to do to run your own WebRTC SIP infrastructure.

18 November, 2015 05:45PM by Daniel.Pocock

November 17, 2015

hackergotchi for Norbert Preining

Norbert Preining

Debian/TeX Live 2015.20151116-1

One month has passed since the big multiarch update, and not one bug report concerning it has come in, which is good news. So here is a completely boring update with nothing more than the usual checkout from the TeX Live tlnet distribution as of yesterday.

Debian - TeX Live 2015

I cannot recall anything particular to mention here, so this time let me go with the list of updated and new packages only:

Updated packages

alegreya, algorithm2e, animate, archaeologie, articleingud, attachfile, bankstatement, beebe, biber, biblatex, biblatex-manuscripts-philology, biblatex-opcit-booktitle, bidi, br-lex, bytefield, catcodes, chemfig, chemformula, chemgreek, comprehensive, computational-complexity, ctable, datetime2, dowith, dvipdfmx, dvipdfmx-def, dynamicnumber, e-french, epspdf, etoc, fetamont, findhyph, fix2col, gitinfo2, gradstudentresume, indextools, kotex-oblivoir, kotex-utils, kpathsea, l3build, l3experimental, l3kernel, l3packages, latex, latex2e-help-texinfo, lisp-on-tex, ltxfileinfo, luatexja, makedtx, mathastext, mathtools, mcf2graph, media9, medstarbeamer, morehype, nameauth, nicetext, ocgx2, perltex, preview, prftree, pst-eucl, pstricks, reledmac, resphilosophica, selnolig, substances, tcolorbox, tetex, teubner, tex4ht, texdoc, texinfo, texlive-scripts, toptesi, translations, turabian-formatting, uptex, xepersian, xetex, xetex-def, xint.

New packages

asciilist, babel-macedonian, bestpapers, bibtexperllibs, fixcmex, iffont, nucleardata, srcredact, texvc, xassoccnt.


17 November, 2015 11:08PM by Norbert Preining

Sven Hoexter

The 2015 version of Alanis Morissette Ironic

Something that made my day this week was a 2015 version of Alanis Morissette Ironic. It's even a bit more ironic when you're partially caught in a hands clean situation.

17 November, 2015 09:42PM

Failing with F5: assign a http profile and an irule at the same time

Besides an upgrade to TMOS 11.4.1HF9, I wanted to use a maintenance window today to assign a specific irule to a VS. Within the irule I use some HTTP functions, so when I tried to add the irule to the already existing VS, tmsh correctly told me that I also need an http profile on this VS. Thanks tmsh, you're right, an oversight on my part.

So what I did was:

tmsh modify ltm virtual mySpecialVS rules { mySpecialiRule } profiles add { company-http-profile }

tmsh accepted the command, but all my tests ended at the VS: I could connect but got no reply at all. That was strange, because I had tested this irule extensively. So I reverted back to the known-good state with just plain tcp forwarding.

My next try was to assign only the http profile without the irule.

tmsh modify ltm virtual mySpecialVS profiles add { company-http-profile }

Tested that and it worked. So what on earth was wrong with my irule? I added some debug statements and re-added the irule like this:

tmsh modify ltm virtual mySpecialVS rules { mySpecialiRule }

And now it worked as intended. So I went on and removed my debug statements, tested again and it still works. Let's see if I can reproduce that case some time later this week to file a proper bug report with F5.

Update: Turns out it was all my fault. Due to a misunderstanding about RULE_INIT and the static namespace, I managed to overwrite important variables globally. Lesson learned: be very careful if you use "static::", or better, avoid it. Also think twice before you start to set things in the RULE_INIT event. Since it is only called on saving an irule or on restarts of the device, your errors might only show up later, when you do not expect them.

17 November, 2015 08:38PM

Petter Reinholdtsen

PGP key transition statement for key EE4E02F9

I've needed a new OpenPGP key for a while, but have not had time to set it up properly. I wanted to generate it offline and have it available on an OpenPGP smart card for daily use, and learning how to do it and finding time to sit down with an offline machine almost took forever. But finally I've been able to complete the process, and have now moved from my old GPG key to a new GPG key. See the full transition statement, signed with both my old and new keys, for the details. This is my new key:

pub   3936R/EE4E02F9 2015-11-03 [expires: 2019-11-14]
      Key fingerprint = 3AC7 B2E3 ACA5 DF87 78F1  D827 111D 6B29 EE4E 02F9
uid                  Petter Reinholdtsen <>
uid                  Petter Reinholdtsen <>
sub   4096R/87BAFB0E 2015-11-03 [expires: 2019-11-02]
sub   4096R/F91E6DE9 2015-11-03 [expires: 2019-11-02]
sub   4096R/A0439BAB 2015-11-03 [expires: 2019-11-02]

The key can be downloaded from the OpenPGP key servers, signed by my old key.

If you signed my old key (DB4CCC4B2A30D729), I'd very much appreciate a signature on my new key; details and instructions are in the transition statement. I'm happy to reciprocate if you have a similarly signed transition statement to present.

17 November, 2015 09:50AM

Dirk Eddelbuettel

RcppAnnoy 0.0.7

A new version of RcppAnnoy, our Rcpp-based R integration of the nifty Annoy library by Erik, is now on CRAN. Annoy is a small, fast, and lightweight C++ template header library for approximate nearest neighbours.

This release mostly just catches up with the Annoy release 1.6.2 of last Friday. No new features were added on our side.

Courtesy of CRANberries, there is also a diffstat report for this release.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

17 November, 2015 02:54AM

November 16, 2015

Daniel Pocock

Quick start using Blender for video editing

Updated 2015-11-16 for WebM

Although it is mostly known for animation, Blender includes a non-linear video editing system that is available in all the current stable versions of Debian, Ubuntu and Fedora.

Here are some screenshots showing how to start editing a video of a talk from a conference.

In this case, there are two input files:

  • A video file from a DSLR camera, including an audio stream from a microphone on the camera
  • A separate audio file with sound captured by a lapel microphone attached to the speaker's smartphone. This is a much better quality sound and we would like this to replace the sound included in the video file.

Open Blender and choose the video editing mode

Launch Blender and choose the video sequence editor from the pull down menu at the top of the window:

Now you should see all the video sequence editor controls:

Setup the properties for your project

Click the context menu under the strip editor panel and change the panel to a Properties panel:

The video file we are playing with is 720p, so it seems reasonable to use 720p for the output too. Change that here:

The input file is 25fps so we need to use exactly the same frame rate for the output, otherwise you will either observe the video going at the wrong speed or there will be a conversion that is CPU intensive and degrades the quality. Also check that the resolution_percentage setting under the picture dimensions is 100%:

Now specify output to PNG files. Later we will combine them into a WebM file with a script. Specify the directory where the files will be placed and use the # placeholder to specify the number of digits to use to embed the frame number in the filename:

Now your basic rendering properties are set. When you want to generate the output file, come back to this panel and use the Animation button at the top.

Editing the video

Use the context menu to change the properties panel back to the strip view panel:

Add the video file:

and then right click the video strip (the lower strip) to highlight it and then add a transform strip:

Audio waveform

Right click the audio strip to highlight it and then go to the properties on the right hand side and click to show the waveform:

Rendering length

By default, Blender assumes you want to render 250 frames of output. Looking in the properties to the right of the audio or video strip you can see the actual number of frames. Put that value in the box at the bottom of the window where it says 250:

Enable AV-sync

Also at the bottom of the window is a control to enable AV-sync. If your audio and video are not in sync when you preview, you need to set this AV-sync option and also make sure you set the frame rate correctly in the properties:

Add the other sound strip

Now add the other sound file that was recorded using the lapel microphone:

Enable the waveform display for that sound strip too, this will allow you to align the sound strips precisely:

You will need to listen to the strips to make an estimate of the time difference. Use this estimate to set the "start frame" in the properties for your audio strip, it will be a negative value if the audio strip starts before the video. You can then zoom the strip panel to show about 3 to 5 seconds of sound and try to align the peaks. An easy way to do this is to look for applause at the end of the audio strips, the applause generates a large peak that is easily visible.

Once you have synced the audio, you can play the track and you should not be able to hear any echo. You can then silence the audio track from the camera by right clicking it, look in the properties to the right and change volume to 0.

Make any transforms you require

For example, to zoom in on the speaker, right click the transform strip (3rd from the bottom) and then in the panel on the right, click to enable "Uniform Scale" and then set the scale factor as required:

Render the video output to PNG

Click the context menu under the Curves panel and choose Properties again.

Click the Animation button to generate a sequence of PNG files for each frame.

Render the audio output

On the Properties panel, click the Audio button near the top. Choose a filename for the generated audio file.

Look on the bottom left-hand side of the window for the audio file settings, change it to the ogg container and Vorbis codec:

Ensure the filename has a .ogg extension

Now look at the top right-hand corner of the window for the Mixdown button. Click it and wait for Blender to generate the audio file.

Combine the PNG files and audio file into a WebM video file

You will need to have a few command line tools installed for manipulating the files from scripts. Install them using the package manager, for example, on a Debian or Ubuntu system:

# apt-get install mjpegtools vpx-tools mkvtoolnix

Now create a script like the following:

#!/bin/bash -e

# Example paths; adjust to your PNG directory, output files and audio mixdown
PNG_DIR=/path/to/png-frames
YUV_FILE=/tmp/video.yuv
WEBM_FILE=video.webm
AUDIO_FILE=audio-mixdown.ogg

# Set this to match the project properties
FRAME_RATE=25

# Set this to the rate you desire:
TARGET_BITRATE=1000

NUM_FRAMES=`find ${PNG_DIR} -type f | wc -l`

png2yuv -I p -f $FRAME_RATE -b 1 -n $NUM_FRAMES \
    -j ${PNG_DIR}/%08d.png > ${YUV_FILE}

vpxenc --good --cpu-used=0 --auto-alt-ref=1 \
   --lag-in-frames=16 --end-usage=vbr --passes=2 \
   --threads=2 --target-bitrate=${TARGET_BITRATE} \
   -o ${WEBM_FILE}-noaudio ${YUV_FILE}

rm ${YUV_FILE}

mkvmerge -o ${WEBM_FILE} -w ${WEBM_FILE}-noaudio ${AUDIO_FILE}

rm ${WEBM_FILE}-noaudio

Next steps

There are plenty of more comprehensive tutorials, including some videos on Youtube, explaining how to do more advanced things like fading in and out or zooming and panning dynamically at different points in the video.

If the lighting is not good (faces too dark, for example), you can right click the video strip, go to the properties panel on the right hand side and click Modifiers, Add Strip Modifier and then select "Color Balance". Use the Lift, Gamma and Gain sliders to adjust the shadows, midtones and highlights respectively.

16 November, 2015 06:53PM by Daniel.Pocock

Steve Kemp

lumail2 nears another release

I'm pleased with the way that Lumail2 development is proceeding, and it is reaching a point where there will be a second source-release.

I've made a lot of changes to the repository recently, and most of them boil down to moving code from the C++ side of the application, over to the Lua side.

This morning, for example, I updated the handling of index.limit to be entirely Lua-based.

When you open a Maildir folder you see the list of messages it contains, as you would expect.

The notion of the index.limit is that you can limit the messages displayed, for example:

  • See all messages: Config:set( "index.limit", "all")
  • See only new/unread messages: Config:set( "index.limit", "new")
  • See only messages which arrived today: Config:set( "index.limit", "today")
  • See only messages which contain "Steve" in their formatted version: Config:set( "index.limit", "steve")

These are just examples that are present as defaults, but they give an idea of how things can work. I guess it isn't so different to Mutt's "limit" facilities - but thanks to the dynamic Lua nature of the application you can add your own with relative ease.

One of the biggest changes, recently, was the ability to display coloured text! That was always possible before, but a single line could only be one colour. Now colours can be mixed within a line, so this works as you might imagine:

Panel:append( "$[RED]This is red, $[GREEN]green, $[WHITE]white, and $[CYAN]cyan!" )

Other changes include a persistent cache of "stuff", which is Lua-based, the inclusion of at least one luarocks library to parse Date: headers, and a simple API for all our objects.

All good stuff. Perhaps time for a break in the next few weeks, but right now I think I'm making useful updates every other evening or so.

16 November, 2015 06:35PM

Julien Danjou

Profiling Python using cProfile: a concrete case

Writing programs is fun, but making them fast can be a pain. Python programs are no exception to that, but the basic profiling toolchain is actually not that complicated to use. Here, I would like to show you how you can quickly profile and analyze your Python code to find what part of the code you should optimize.

What's profiling?

Profiling a Python program means doing a dynamic analysis that measures the execution time of the program and of everything that composes it. That means measuring the time spent in each of its functions. This will give you data about where your program is spending time, and which areas might be worth optimizing.

It's a very interesting exercise. Many people focus on local optimizations, such as determining e.g. which of the Python functions range or xrange is going to be faster. It turns out that knowing which one is faster may never be an issue in your program, and that the time gained by one of the functions above might not be worth the time you spend researching that, or arguing about it with your colleague.

Trying to blindly optimize a program without measuring where it is actually spending its time is a useless exercise. Following your gut alone is not always sufficient.

There are many types of profiling, as there are many things you can measure. In this exercise, we'll focus on CPU utilization profiling, meaning the time spent by each function executing instructions. Obviously, we could do many more kinds of profiling and optimization, such as memory profiling, which would measure the memory used by each piece of code – something I talk about in The Hacker's Guide to Python.


Since Python 2.5, Python provides a C module called cProfile which has a reasonable overhead and offers a good enough feature set. The basic usage comes down to:

>>> import cProfile
>>>'2 + 2')
2 function calls in 0.000 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.000 0.000 <string>:1(<module>)
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}

Though you can also run a script with it, which turns out to be handy:

$ python -m cProfile -s cumtime
72270 function calls (70640 primitive calls) in 4.481 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.004 0.004 4.481 4.481<module>)
1 0.001 0.001 4.296 4.296
3 0.000 0.000 4.286 1.429
3 0.000 0.000 4.268 1.423
4/3 0.000 0.000 3.816 1.272
4 0.000 0.000 2.965 0.741
4 0.000 0.000 2.962 0.740
4 0.000 0.000 2.961 0.740
2 0.000 0.000 2.675 1.338
30 0.000 0.000 1.621 0.054
30 0.000 0.000 1.621 0.054
30 1.621 0.054 1.621 0.054 {method 'read' of '_ssl._SSLSocket' objects}
1 0.000 0.000 1.611 1.611
4 0.000 0.000 1.572 0.393
4 0.000 0.000 1.572 0.393
60 0.000 0.000 1.571 0.026
4 0.000 0.000 1.571 0.393
1 0.000 0.000 1.462 1.462
1 0.000 0.000 1.462 1.462
1 0.000 0.000 1.462 1.462
1 0.000 0.000 1.459 1.459

This prints out all the functions called, with the time spent in each and the number of times they have been called.

Advanced visualization with KCacheGrind

While useful, the output format is very basic and does not make it easy to extract knowledge from complete programs. For more advanced visualization, I leverage KCacheGrind. If you have done any C programming and profiling in the last few years, you may have used it, as it is primarily designed as a front-end for Valgrind-generated call-graphs.

In order to use it, you need to generate a cProfile result file, then convert it to the KCacheGrind format. To do that, I use pyprof2calltree.

$ python -m cProfile -o myscript.cprof
$ pyprof2calltree -k -i myscript.cprof

And the KCacheGrind window magically appears!

Concrete case: Carbonara optimization

I was curious about the performance of Carbonara, the small time series library I wrote for Gnocchi. I decided to do some basic profiling to see if there was any obvious optimization to do.

In order to profile a program, you need to run it. But running the whole program in profiling mode can generate a lot of data that you don't care about, and adds noise to what you're trying to understand. Since Gnocchi has thousands of unit tests and a few for Carbonara itself, I decided to profile the code used by these unit tests, as it's a good reflection of basic features of the library.

Note that this is a good strategy for a curious and naive first-pass profiling. There's no way that you can make sure that the hotspots you will see in the unit tests are the actual hotspots you will encounter in production. Therefore, a profiling in conditions and with a scenario that mimics what's seen in production is often a necessity if you need to push your program optimization further and want to achieve perceivable and valuable gain.

I activated cProfile using the method described above, creating a cProfile.Profile object around my tests (I actually started to implement that in testtools). I then ran KCacheGrind as described above. Using KCacheGrind, I generated the following figures.
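
For reference, here is a minimal sketch (not the actual testtools integration; the helper name and output filename are made up for illustration) of what wrapping a test in a cProfile.Profile object and saving the result for pyprof2calltree can look like:

import cProfile

def profile_call(func, output_file="test_fetch.cprof"):
    # Run func (e.g. the body of the test_fetch unit test) under the profiler
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func()
    finally:
        profiler.disable()
    # The dump can then be converted with: pyprof2calltree -k -i test_fetch.cprof
    profiler.dump_stats(output_file)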

The test I profiled here is called test_fetch and is pretty easy to understand: it puts data into a time series object, and then fetches the aggregated result. The above list shows that 88% of the ticks are spent in set_values (44 ticks out of 50). This function is used to insert values into the time series, not to fetch the values. That means that it's really slow to insert data, and pretty fast to actually retrieve it.

Reading the rest of the list indicates that several functions share the rest of the ticks: update, _first_block_timestamp, _truncate, _resample, etc. Some of the functions in the list are not part of Carbonara, so there's no point in trying to optimize them. The only thing that can sometimes be optimized is the number of times they're called.

The call graph gives me a bit more insight about what's going on here. Using my knowledge about how Carbonara works, I don't think that the whole stack on the left for _first_block_timestamp makes much sense. This function is supposed to find the first timestamp for an aggregate, e.g. with a timestamp of 13:34:45 and a period of 5 minutes, the function should return 13:30:00. The way it currently works is by calling the resample function from Pandas on a time series with only one element, but that seems to be very slow. Indeed, this function currently represents 25% of the time spent by set_values (11 ticks out of 44).

Fortunately, I recently added a small function called _round_timestamp that does exactly what _first_block_timestamp needs, without calling any Pandas function, so no resample. So I ended up rewriting that function this way:

def _first_block_timestamp(self):
- ts = self.ts[-1:].resample(self.block_size)
- return (ts.index[-1] - (self.block_size * self.back_window))
+ rounded = self._round_timestamp(self.ts.index[-1], self.block_size)
+ return rounded - (self.block_size * self.back_window)

And then I re-ran the exact same test to compare the output of cProfile.

The list of functions looks quite different this time. The share of time used by set_values dropped from 88% to 71%.

The call stack for set_values shows that pretty well: we can't even see the _first_block_timestamp function as it is so fast that it totally disappeared from the display. It's now being considered insignificant by the profiler.

So we just sped up the whole process of inserting values into Carbonara by a nice 25% in a few minutes. Not bad for a first naive pass, right?

16 November, 2015 03:00PM by Julien Danjou

Wouter Verhelst


noun | ter·ror·ism | \ˈter-ər-ˌi-zəm\ | no plural

The mistaken belief that it is possible to change the world through acts of cowardice.

First use in English 1795 in reference to the Jacobin rule in Paris, France.

ex.: They killed a lot of people, but their terrorism only intensified the people's resolve.

16 November, 2015 08:07AM

Dirk Eddelbuettel

Rcpp 0.12.2: More refinements

The second update in the 0.12.* series of Rcpp is now on the CRAN network for GNU R. As usual, I will also push a Debian package. This follows the 0.12.0 release from late July, which started to add some serious new features, and builds upon the 0.12.1 release in September. It also marks the sixth release this year, with which we have managed to keep a steady bi-monthly release frequency.

Rcpp has become the most popular way of enhancing GNU R with C or C++ code. As of today, 512 packages on CRAN depend on Rcpp for making analytical code go faster and further. That is up by more than fifty packages since the last release in September (and we recently blogged about crossing 500 dependents).

This release once again features pull requests from two new contributors, with Nathan Russell and Tianqi Chen joining in. As shown below, other recent contributors (such as Dan) are keeping at it too. Keep'em coming! Luke Tierney also emailed about a code smell he spotted, which we took care of. A big Thank You! to everybody helping with code, bug reports or documentation. See below for a detailed list of changes extracted from the NEWS file.

Changes in Rcpp version 0.12.2 (2015-11-14)

  • Changes in Rcpp API:

    • Correct return type in product of matrix dimensions (PR #374 by Florian)

    • Before creating a single String object from a SEXP, ensure that it is from a vector of length one (PR #376 by Dirk, fixing #375).

    • No longer use STRING_ELT as a left-hand side, thanks to a heads-up by Luke Tierney (PR #378 by Dirk, fixing #377).

    • Rcpp Module objects are now checked more carefully (PR #381 by Tianqi, fixing #380)

    • An overflow in Matrix column indexing was corrected (PR #390 by Qiang, fixing a bug reported by Allessandro on the list)

    • Nullable types can now be assigned R_NilValue in function signatures. (PR #395 by Dan, fixing issue #394)

    • operator<<() now always shows decimal points (PR #396 by Dan)

    • Matrix classes now have a transpose() function (PR #397 by Dirk fixing #383)

    • operator<<() for complex types was added (PRs #398 by Qiang and #399 by Dirk, fixing #187)

  • Changes in Rcpp Attributes:

    • Enable export of C++ interface for functions that return void.

  • Changes in Rcpp Sugar:

    • Added new Sugar function cummin(), cummax(), cumprod() (PR #389 by Nathan Russell fixing #388)

    • Enabled sugar math operations for subsets; e.g. x[y] + x[z]. (PR #393 by Kevin and Qiang, implementing #392)

  • Changes in Rcpp Documentation:

    • The NEWS file now links to GitHub issue tickets and pull requests.

    • The Rcpp.bib file with bibliographic references was updated.

Thanks to CRANberries, you can also look at a diff to the previous release. As always, even fuller details are on the Rcpp Changelog page and the Rcpp page, which also leads to the downloads page, the browseable doxygen docs and zip files of doxygen output for the standard formats. A local directory has source and documentation too. Questions, comments etc. should go to the rcpp-devel mailing list off the R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

16 November, 2015 03:43AM

Norbert Preining

Movies: Monuments Men and Interstellar

Over the rainy weekend we watched two movies: Monuments Men (in Japanese it is called Michelangelo Project!) and Interstellar. Both are blockbuster movies from the usual American companies, yet they are light-years apart when it comes to quality. The Monuments Men is boring, without a story, without depth, historically inaccurate, a complete failure. Interstellar, although a long movie, keeps you frozen in your seat while being as scientific as possible, and sets your brain working heavily.


My personal verdict: 3 rotten eggs (because Rotten Tomatoes are not stinky enough) for the Monuments Men, and 4 stars for Interstellar.


First, the plots of the two movies: The Monuments Men is loosely based on a true story about rescuing pieces of art at the end of the Second World War, before the Nazis destroy them or the Russians take them away. A group of art experts is sent into Europe and manages to find several hiding places of art taken by the Nazis.

Interstellar is set in a near future where conditions on Earth are deteriorating to a degree that human life seems to soon become impossible. Some years before the movie takes place, a group of astronauts was sent through a wormhole into a different galaxy to search for new habitable planets. Now it is time to check out these planets and try to establish colonies there. Cooper, a retired NASA officer and pilot, now working as a farmer, and his daughter are guided in some mysterious way to a secret NASA facility. Cooper is drafted to be the pilot of the reconnaissance mission, and leaves Earth and our galaxy through the same wormhole. (Not telling more!)

Monuments Men

Looking at the cast of Monuments Men (George Clooney, Matt Damon, Bill Murray, John Goodman, Jean Dujardin, Bob Balaban, Hugh Bonneville, and Cate Blanchett) one would expect a great movie – but from the very first to the very last scene, it is a slowly meandering, shallow flow of stuck-together scenes without much coherence. Tension is generated only through unrelated events (stepping onto a landmine, patting a horse), but never developed properly. Dialogues are shallow and boring – with one exception: when Frank Stokes (George Clooney) meets the one German and inquires about the art, predicting his future of being hanged.

Historically, the movie is as inaccurate as it can be – despite Clooney stating that “80 percent of the story is still completely true and accurate, and almost all of the scenes happened”. That contrasts starkly with the verdict of Nigel Pollard (Swansea University): “There’s a kernel of history there, but The Monuments Men plays fast and loose with it in ways that are probably necessary to make the story work as a film, but the viewer ends up with a fairly confused notion of what the organisation was, and what it achieved.”

The movie leaves a bitter aftertaste, hailing American heroism paired with the usual stereotypes (French amour, German retarded, Russian ignorance, etc). Together with the half-baked dialogues it feels like a permanent coitus interruptus.


Interstellar

Interstellar cannot serve up a similar cast, but still has a few well-known people (Matthew McConaughey, Anne Hathaway, and Michael Caine!). But I believe this is actually a sign of quality. Well balancing scientific accuracy and the requirements of a blockbuster, the movie successfully spans the bridge between complicated science, in particular gravity, and entertainment. While not going so far as to call the movie edutainment (like both the old and new Cosmos), it is surprising how much hard science is packed into this movie. This is mostly thanks to the theoretical physicist Kip Thorne acting as scientific consultant for the movie, but also due to the director Christopher Nolan being serious about it and studying relativity at Caltech.

Of course, scientific accuracy has its limits – nobody knows what happens if one crosses the event horizon of a black hole, and even the existence of wormholes is purely theoretical for now. Still, the movie throughout follows the two requirements laid out by Kip Thorne: “First, that nothing would violate established physical laws. Second, that all the wild speculations… would spring from science and not from the fertile mind of a screenwriter.”

I think the biggest compliment was that, despite the length, despite a long day out (see next blog), and despite the rather unfamiliar topic, my wife, who is normally not interested in space movies and the like, didn't fall asleep during the movie, and I had to stop several times to explain details of the theory of gravity and astronomy. So in some sense it was perfect edutainment!

16 November, 2015 12:01AM by Norbert Preining

November 15, 2015

Manuel A. Fernandez Montecelo

Work on aptitude

Midsummer for me is also known as “Noite do Lume Novo” (literally “New Fire Night”), one of the big calendar events of the year, marking the end of the school year and the beginning of summer.

On this day, there are celebrations not very unlike the bonfires in the Guy Fawkes Night in England or Britain [1]. It is a bit different in that it is not a single event for the masses, more of a friends and neighbours thing, and that it lasts for a big chunk of the night (sometimes until morning). Perhaps for some people, or outside bigger towns or cities, Guy Fawkes Night is also celebrated in that way ─ and that's why during the first days of November there are fireworks rocketing and cracking in the neighbourhoods all around.

Like many other celebrations around the world involving bonfires, many of them also happening around the summer solstice, it is supposed to be a time of renewal of cycles, purification and keeping the evil spirits away; with rituals to that effect like jumping over the fire ─ when the flames are not high and it is safe enough.

So it was fitting that, in the middle of June (almost Midsummer in the northern hemisphere), I learnt that I was about to leave my now-previous job, which is a pretty big signal and precursor for renewal (and it might have something to do with purifying and keeping the evil away as well ;-) ).

Whatever... But what does all of this have to do with aptitude or Debian, anyway?

For one, it was a question of timing.

While looking for a new job (and I am still at it), I had more spare time than usual. DebConf 15 @ Heidelberg was within sight, and for the first time circumstances allowed me to attend this event.

It also coincided with the time when I re-gained access to commit to aptitude on the 19th of June. Which means Renewal.

End of June was also the time of the announcement of the colossal GCC-5/C++11 ABI transition in Debian, that was scheduled to start on the 1st of August, just before the DebConf. Between 2 and 3 thousand source packages in Debian were affected by this transition, which a few months later is not yet finished (although the most important parts were completed by mid-end September).

aptitude itself is written in C++, and depends on several libraries written in C++, like Boost, Xapian and SigC++. All of them had to be compiled with the new C++11 ABI of GCC-5, in unison and in a particular order, for aptitude to continue to work (and for minimal breakage). aptitude and some dependencies did not even compile straight away, so this transition meant that aptitude needed attention just to keep working.

Having recently been awarded the Aptitude Hat again, attending DebConf for the first time and sailing towards the Transition Maelstrom, it was a clear sign that Something Had to Be Done (to avoid the sideways looks and consequent shame at DebConf, if nothing else).

Happily (or a bit unhappily for me, but let's pretend...), with the unexpected free time in my hands, I changed the plans that I had before re-gaining the Aptitude Hat (some of them involving Debian, but in other ways ─ maybe I will post about that soon).

In July I worked to fix the problems before the transition started, so aptitude would be (mostly) ready, or in the worst case broken only for a few days, while the chain of dependencies was rebuilt. But apart from the changes needed for the new GCC-5, it was decided at the last minute that Boost 1.55 would not be rebuilt with the new ABI, and that the only version with the new ABI would be 1.58 (which caused further breakage in aptitude, was added to experimental only a few days before, and was moved to unstable after the transition had started). Later, in the first days of the transition, aptitude was affected for a few days by breakage in the dependencies, due to not being compiled in sequence according to the transition levels (so with a mix of old and new ABI).

With the critical intervention of Axel Beckert (abe / XTaran), things were not as bad as they could have been. He was busy testing and uploading in the critical days when I was enjoying a small holiday on my way to DebConf, with minimal internet access and communicating almost exclusively with him; and he promptly tended to the complaints arriving in the Bug Tracking System and asked for rebuilds of the dependencies with the new ABI. He also brought the packaging up to shape, which had decayed a bit in the last few years.

Gruesome Challenges

But not all was solved yet, more storms were brewing and started to appear in the horizon, in the form of clouds of fire coming from nearby realms.

The APT Deities, which had long ago spilled out their secret, inner challenge (just the initial paragraphs), were relentless. Moreover, they were present at Heidelberg in full force, in ─or close to─ their home grounds, and they were Marching Decidedly towards Victory:

apt BTS Graph, 2015-11-15

In the talk @ DebConf “This APT has Super Cow Powers” (video available), by David Kalnischkies, they told us about the niceties of apt 1.1 (still in experimental but hopefully coming to unstable soon), and they boasted about getting the lead in our arms race (should I say bugs race?) by a few open bug reports.

This act of provocation further escalated the tensions. The fierce competition which had been going on for some time gained new heights. So much so that APT Deities and our team had to sit together in the outdoor areas of the venue and have many a weissbier together, while discussing and fixing bugs.

But beneath the calm on the surface, and while pretending to keep good diplomatic relations, I knew that Something Had to Be Done, again. So I could only do one thing ─ jump over the bonfire and Keep the Evil away, be that Keep Evil bugs Away or Keep Evil APT Deities Away from winning the challenge, or both.

After returning from DebConf I continued to dedicate time to the project, more than a full time job in some weeks, and this is what happened in the last few months, summarised in another graph, showing the evolution of the BTS for aptitude:

aptitude BTS Graph, 2015-11-15

The numbers for apt right now (15th November 2015) are:

  • 629 open (731 if counting all merged bugs independently)
  • 0 Release Critical
  • 275 (318 unmerged) with severity Important or Normal
  • 354 (413 unmerged) with severity Minor or Wishlist
  • 0 marked as Forwarded or Pending

The numbers for aptitude right now are:

  • 488 open (573 if counting all merged bugs independently)
  • 1 Release Critical (but it is an artificial bug to keep it from migrating to testing)
  • 197 (239 unmerged) with severity Important or Normal
  • 271 (313 unmerged) with severity Minor or Wishlist
  • 19 (20 unmerged) marked as Forwarded or Pending

The Aftermath

As we can see, for the time being I could keep the Evil at bay, both in terms of bugs themselves and re-gaining the lead in the bugs race ─ the Evil APT Deities were thwarted again in their efforts.

... More seriously, as most of you suspected, the graph above is not the whole truth, so I don't want to boast too much. A big part of the reduction in the number of bugs is due to merging duplicates, closing obsolete bugs, applying translations coming from multiple contributors, or simple fixes like typos and useful suggestions needing minor changes. Many of the remaining problems are comparatively more difficult or time-consuming than the ones addressed so far (except perhaps avoiding the immediate breakage of the transition, which took weeks to solve), and there are many important problems still there; chief among them is aptitude offering very poor solutions to resolve conflicts.

Still, even the simplest of the changes takes effort, and triaging hundreds of bugs is not fun at all and mostly a thankless effort ─ although there is the occasional kind soul that thanks you for handling a decade-old bug.

If being subjected to the rigours of the BTS and reading and solving hundreds of bug reports is not Purification, I don't know what it is.

Apart from the triaging, there were 118 bugs closed (or pending) due to changes made in the upstream part or the packaging in the last few months, and there are many changes that are not reflected in bugs closed (like most of the changes needed due to the C++11 ABI transition, bugs and problems fixed that had no report, and general rejuvenation or improvement of some parts of the code).

How long this will last, I cannot know. I hope to find a job at some point, which obviously will reduce the time available to work on this.

But in the meantime, for all aptitude users: Enjoy the fixes and new features!


[1] ^ Some visitors of the recent mini-DebConf @ Cambridge perhaps thought that the fireworks and throngs gathered were in honour of our mighty Universal Operating System, but sadly they were not. They might be, some day. In any case, the reports say that the visitors enjoyed the fireworks.

15 November, 2015 11:44PM by Manuel A. Fernandez Montecelo

Carl Chenet

Retweet 0.5 : only retweet some tweets

Retweet 0.5 is now available! The main new feature: Retweet now lets you define a list of hashtags; if any of them appears in the text of a tweet, that tweet is not retweeted.

You only need this line in your retweet.ini configuration file:


Have a look at the official documentation to read how it works in detail.

Retweet 0.5 is available on the PyPI repository and is already in the official Debian unstable repository.


Retweet is already in production for Le Journal Du hacker, a French FOSS community website to share and relay news, and for a job board for the French-speaking FOSS community.



What about you? Does Retweet help you grow your Twitter account? Leave your comments on this article.

15 November, 2015 11:00PM by Carl Chenet

Lunar


Reproducible builds: week 29 in Stretch cycle

What happened in the reproducible builds effort this week:

Toolchain fixes

Emmanuel Bourg uploaded eigenbase-resgen/ which uses the scm-safe comment style by default to make the generated files deterministic.

Mattia Rizzolo started a new thread on debian-devel to ask a wider audience about issues with the -Wdate-time compile-time flag. When enabled, GCC and clang print warnings when __DATE__, __TIME__, or __TIMESTAMP__ are used. Having the flag set by default would prompt maintainers to remove these sources of unreproducibility from the sources.

Packages fixed

The following packages have become reproducible due to changes in their build dependencies: bmake, cyrus-imapd-2.4, drobo-utils, eigenbase-farrago, fhist, fstrcmp, git-dpm, intercal, libexplain, libtemplates-parser, mcl, openimageio, pcal, powstatd, ruby-aggregate, ruby-archive-tar-minitar, ruby-bert, ruby-dbd-odbc, ruby-dbd-pg, ruby-extendmatrix, ruby-rack-mobile-detect, ruby-remcached, ruby-stomp, ruby-test-declarative, ruby-wirble, vtprint.

The following packages became reproducible after getting fixed:

Some uploads fixed some reproducibility issues, but not all of them:

Patches submitted which have not made their way to the archive yet:

  • #804729 on pbuilder by Reiner Herrmann: tell dblatex to build in a deterministic path.

The fifth and sixth armhf build nodes have been set up, resulting in five more builder jobs for armhf. More than 10,000 packages have now been identified as reproducible with the “reproducible” toolchain on armhf. (Vagrant Cascadian, h01ger)

Helmut Grohne and Mattia Rizzolo now have root access on all 12 build nodes used by and (h01ger) is now linked from all package pages and the dashboard. (h01ger)

profitbricks-build5-amd64 and profitbricks-build6-amd64, responsible for running amd64 tests now run 398.26 days in the future. This means that one of the two builds that are being compared will be run on a different minute, hour, day, month, and year. This is not yet the case for armhf. FreeBSD tests are also done with 398.26 days difference. (h01ger)

The design of the Arch Linux test page has been greatly improved. (Levente Polyak)

diffoscope development

Three releases of diffoscope happened this week numbered 39 to 41. It includes support for EPUB files (Reiner Herrmann) and Free Pascal unit files, usually having .ppu as extension (Paul Gevers).

The rest of the changes were mostly targeted at making it easier to run diffoscope on other systems. The tlsh, rpm, and debian modules are now all optional. The test suite will properly skip tests that need optional tools or modules when they are not available. As a result, diffoscope is now available on PyPI and, thanks to the work of Levente Polyak, in Arch Linux.

Getting these versions into Debian was a bit cumbersome. Version 39 was uploaded with an expired key (according to the keyring on which will hopefully be updated soon), which is currently handled by keeping the files in the queue without REJECTing them. This prevented any other Debian Developers from uploading the same version. Version 40 was uploaded as a source-only upload… but failed to build from source, which had the undesirable side effect of removing the previous version from unstable. The package failed to build from source because it was built passing -I to debuild. This excluded the ELF object files and static archives used by the test suite from the archive, preventing the test suite from working correctly. Hopefully, in the near future it will be possible to implement a sanity check to prevent such mistakes.

It has also been identified that ppudump outputs time in the system timezone without considering the TZ environment variable. Zachary Vance and Paul Gevers raised the issue on the appropriate channels.

strip-nondeterminism development

Chris Lamb released strip-nondeterminism version 0.014-1 which disables stripping Mono binaries as it is too aggressive and the source of the problem is being worked on by Mono upstream.

Package reviews

133 reviews have been removed, 115 added and 103 updated this week.

Chris West and Chris Lamb reported 57 new FTBFS bugs.


The video of h01ger and Chris Lamb's talk at MiniDebConf Cambridge is now available.

h01ger gave a talk at CCC Hamburg on November 13th, which was well received and sparked some interest among Gentoo folks. Slides and video should be available shortly.

Frederick Kautz has started to revive Dhiru Kholia's work on testing Fedora packages.

Your editor wishes to once again thank the #debian-reproducible regulars for reviewing these reports week after week.

15 November, 2015 05:51PM

Simon McVittie

Discworld Noir in a Windows 98 virtual machine on Linux

Discworld Noir was a superb adventure game, but is also notoriously unreliable, even in Windows on real hardware; using Wine is just not going to work. After many attempts at bringing it back into working order, I've settled on an approach that seems to work: now that qemu and libvirt have made virtualization and emulation easy, I can run it in a version of Windows that was current at the time of its release. Unfortunately, Windows 98 doesn't virtualize particularly well either, so this still became a relatively extensive yak-shaving exercise.


These instructions assume that /srv/virt is a suitable place to put disk images, but you can use anywhere you want.

The emulated PC

After some trial and error, it seems to work if I configure qemu to emulate the following:

  • Fully emulated hardware instead of virtualization (qemu-system-i386 -no-kvm)
  • Intel Pentium III
  • Intel i440fx-based motherboard with ACPI
  • Real-time clock in local time
  • No HPET
  • 256 MiB RAM
  • IDE primary master: IDE hard disk (I used 30 GiB, which is massively overkill for this game; qemu can use sparse files so it actually ends up less than 2 GiB on the host system)
  • IDE primary slave, secondary master, secondary slave: three CD-ROM drives
  • PS/2 keyboard and mouse
  • Realtek AC97 sound card
  • Cirrus video card with 16 MiB video RAM

A modern laptop CPU is an order of magnitude faster than what Discworld Noir needs, so full emulation isn't a problem, despite being inefficient.

There is deliberately no networking, because Discworld Noir doesn't need it, and a 17 year old operating system with no privilege separation is very much not safe to use on the modern Internet!

Software needed

  • Windows 98 installation CD-ROM as a .iso file (cp /dev/cdrom windows98.iso) - in theory you could also use a real optical drive, but my laptop doesn't usually have one of those. I used the OEM disc, version 4.10.1998 (that's the original Windows 98, not the Second Edition), which came with a long-dead PC, and didn't bother to apply any patches.
  • A Windows 98 license key. Again, I used an OEM key from a past PC.
  • A complete set of Discworld Noir (English) CD-ROMs as .iso files. I used the UK "Sold Out Software" budget release, on 3 CDs.
  • A multi-platform Realtek AC97 audio driver.

Windows 98 installation

It seems to be easiest to do this bit by running qemu-system-i386 manually:

qemu-img create -f qcow2 /srv/virt/discworldnoir.qcow2 30G
qemu-system-i386 -hda /srv/virt/discworldnoir.qcow2 \
    -drive media=cdrom,format=raw,file=/srv/virt/windows98.iso \
    -no-kvm -vga cirrus -m 256 -cpu pentium3 -localtime

Don't start the installation immediately. Instead, boot the installation CD to a DOS prompt with CD-ROM support. From here, run

fdisk

and create a single partition filling the emulated hard disk. When finished, hard-reboot the virtual machine (press Ctrl+C on the qemu-system-i386 process and run it again).

The DOS FORMAT.COM utility is on the Windows CD-ROM but not in the root directory or the default %PATH%, so you'll have to run:

d:\win98\format c:

to create the FAT filesystem. You might have to reboot again at this point.

The reason for doing this the hard way is that the Windows 98 installer doesn't detect qemu as supporting ACPI. You want ACPI support, so that Windows will issue IDLE instructions from its idle loop, instead of occupying a CPU core with a busy-loop. To get that, boot to a DOS prompt again, and use:

setup /p j /iv

/p j forces ACPI support (Thanks to "Richard S" on the Virtualbox forums for this tip.) /iv is unimportant, but it disables the annoying "billboards" during installation, which advertised once-exciting features like support for dial-up modems and JPEG wallpaper.

I used a "Typical" installation; there didn't seem to be much point in tweaking the installed package set when everything is so small by modern standards.

Windows 98 has built-in support for the Cirrus VGA card that we're emulating, so after a few reboots, it should be able to run in a semi-modern resolution and colour depth. Discworld Noir apparently prefers a 640 × 480 × 16-bit video mode, so right-click on the desktop background, choose Properties and set that up.

Audio drivers

This is the part that took me the longest to get working. Of the sound cards that qemu can emulate, Windows 98 only supports the SoundBlaster 16 out of the box. Unfortunately, the Soundblaster 16 emulation in qemu is incomplete, and in particular version 2.1 (as shipped in Debian 8) has a tendency to make Windows lock up during boot.

I've seen advice in various places to emulate an Ensoniq ES1370 (SoundBlaster AWE 64), but that didn't work for me: one of the drivers I tried caused Windows to lock up at a black screen during boot, and the other didn't detect the emulated hardware.

The next-oldest sound card that qemu can emulate is a Realtek AC97, which was often found integrated into motherboards in the late 1990s. This one seems to work, with the "A400" driver bundle linked above. For Windows 98 first edition, you need a driver bundle that includes the old "VXD" drivers, not just the "WDM" drivers supported by Second Edition and newer.

The easiest way to get that into qemu seems to be to turn it into a CD image:

genisoimage -o /srv/virt/discworldnoir-drivers.iso WDM_A400.exe
qemu-system-i386 -hda /srv/virt/discworldnoir.qcow2 \
    -drive media=cdrom,format=raw,file=/srv/virt/windows98.iso \
    -drive media=cdrom,format=raw,file=/srv/virt/discworldnoir-drivers.iso \
    -no-kvm -vga cirrus -m 256 -cpu pentium3 -localtime -soundhw ac97

Run the installer from E:, then reboot with the Windows 98 CD inserted, and Windows should install the driver.

Installing Discworld Noir

Boot up the virtual machine with CD 1 in the emulated drive:

qemu-system-i386 -hda /srv/virt/discworldnoir.qcow2 \
    -drive media=cdrom,format=raw,file=/srv/virt/DWN_ENG_1.iso \
    -no-kvm -vga cirrus -m 256 -cpu pentium3 -localtime -soundhw ac97

You might be thinking "... why not insert all three CDs into D:, E: and F:?" but the installer expects subsequent disks to appear in the same drive where CD 1 was initially, so that won't work. Instead, when prompted for a new CD, switch to the qemu monitor with Ctrl+Alt+2 (note that this is 2, not F2). At the (qemu) prompt, use info block to see a list of emulated drives, then issue a command like

change ide0-cd1 /srv/virt/DWN_ENG_2.iso

to swap the CD. Then switch back to Windows' console with Ctrl+Alt+1 and continue installation. I used a Full installation of Discworld Noir.

Transferring the virtual machine to GNOME Boxes

Having finished the "control freak" phase of installation, I wanted a slightly more user-friendly way to run this game, so I transferred the virtual machine to be used by libvirtd, which is the backend for both GNOME Boxes and virt-manager:

virsh create discworldnoir.xml

Here is the configuration I used. It's a mixture of automatic configuration from virt-manager, and hand-edited configuration to make it match the qemu-system-i386 command-line.

Running the game

If all goes well, you should now see a discworldnoir virtual machine in GNOME Boxes, in which you can boot Windows 98 and play the game. Have fun!

15 November, 2015 05:19PM

November 14, 2015

Juliana Louback

PaperTrail - Powered by IBM Watson

In the final semester of my MSc program at Columbia SEAS, I was lucky enough to be able to attend a seminar course taught by Alfio Gliozzo entitled Q&A with IBM Watson. A significant part of the course is dedicated to learning how to leverage the services and resources available on the Watson Developer Cloud. This post describes the course project my team developed, the PaperTrail application.

Project Proposal

Create an application to assist in the development of future academic papers. Based on a paper’s initial proposal, Paper Trail predicts publications to be used as references or acknowledgement of prior art and provides a trend analysis of major topics and methods.

The objective is to speed the discovery of relevant papers early in the research process, and allow for early assessment of the depth of prior research concerning the initial proposal.

Meet the Team

Wesley Bruning, Software Engineer, MSc. in Computer Science

Xavier Gonzalez, Industrial Engineer, MSc. in Data Science

Juliana Louback, Software Engineer, MSc. in Computer Science

Aaron Zakem, Patent Attorney, MSc. in Computer Science

Prior Art

A significant amount of attention has been given to this topic over the past few decades. The table below shows the work the team deemed most relevant due to recency, accuracy and similarity of functionality.


The variation in accuracy displayed is a result of experimentation with different dataset sizes and algorithm variations. More information and details can be found in the prior art report.

The main differentiator of PaperTrail is providing a form of access to the citation prediction and trend analysis algorithms. With the exception of the project by McNee et al., these algorithms aren't currently available for general use. The application on is open to use, but its objective is to rank publications and authors for given topics.


Citation Prediction: PaperTrail builds on the work done by Wolski's team in Fall 2014. This algorithm builds a reference graph used to define research communities, with an associated vector of topic scores generated by an LDA model. The papers in each research community are then ranked by importance within the community with a custom ranking algorithm. When a target document is given to the algorithm as input, the LDA model is used to generate a vector of the topics that are present in the document. The communities with the most similar topic vectors are selected, and the publications within these communities with the highest rank and greatest similarity to the input document are recommended as references. A more detailed description can be found here.
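
To illustrate that last selection step, here is a rough sketch in Python (hypothetical names and data structures, not the actual PaperTrail or Wolski code) of picking the communities whose LDA topic vectors are most similar to the topic vector of the input proposal:

import math

def cosine_similarity(u, v):
    # Similarity between two topic score vectors of equal length
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def most_similar_communities(proposal_topics, community_topics, k=3):
    # community_topics maps a community id to its LDA topic score vector
    ranked = sorted(community_topics.items(),
                    key=lambda item: cosine_similarity(proposal_topics, item[1]),
                    reverse=True)
    # The highest-ranked papers within these communities are then recommended
    return [community_id for community_id, _ in ranked[:k]]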

Trend Analysis: Initially, the idea was to use the AlchemyData News API to obtain statistics on the number of publications on a given topic over time. However, with the exception of buzz-words (e.g. ‘big data’), many more specialized topics appeared very infrequently in news articles, if at all. This isn't entirely surprising given the target audience of PaperTrail. As a workaround, we use the Alchemy Language API to extract keywords from the abstracts in the dataset, along with relevance scores. The PaperTrail database can then be queried for entry counts for a given year and keyword to provide an indication of publication trends in academia. Note that the Alchemy Language API extracts multiple-word ‘keywords’ as well as single words.
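
As a simple illustration of that counting step (with a hypothetical entry format, not the actual PaperTrail schema), the per-year counts behind the trend indication could be computed like this:

from collections import Counter

def publication_trend(entries, keyword):
    # entries: iterable of dicts such as {"year": 2011, "keywords": ["lda", ...]}
    counts = Counter(entry["year"] for entry in entries
                     if keyword in entry["keywords"])
    return sorted(counts.items())  # [(year, number of matching publications), ...]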


To maintain consistency with Wolski’s project, we are using the DBLP data as made available on The DBLP-Citation-network V5 dataset contains 1,572,277 entries; we are limited to the use of entries that contain both abstracts and citations, bringing the dataset size down to 265,865 entries.


A high-level visualization of the project architecture is displayed below. Before launching PaperTrail, it’s necessary to train Wolski’s algorithm offline. Currently any documentation with regard to the performance of said algorithm is unavailable; the PaperTrail project will include an evaluation phase and report the findings made.

The PaperTrail app and database will be hosted on the Bluemix Platform.


Status Report

Phases completed:

  • Project design

  • Prior art research

  • Data cleansing

  • Development and deployment of an alpha version of the PaperTrail app

Phases under development:

  • Algorithm training and evaluation

  • Keyword extraction

  • MapReduce of publication frequency by year and topic

  • Data visualization component

14 November, 2015 10:06AM

Craig Small

Mixing pysnmp and stdin

Depending on the application, sometimes you want to have some socket operations going (such as loading a website) and have stdin read at the same time. There are plenty of examples of this in Python, which usually boil down to making stdin behave like a socket and mixing it into the list of sockets select() cares about.

A while ago I asked on an email list whether I could have pysnmp use a different socket map so I could add my own sockets in (UDP, TCP and a zmq socket, to name a few), and Ilya, the author of pysnmp, explained how pysnmp can use a foreign socket map.

The sample code below is merely a mixture of Ilya's example code and the way stdin gets mixed into the fold. I have also updated it to the high-level pysnmp API, which explains the slight differences in the calls.

  1. from time import time
  2. import sys
  3. import asyncore
  4. from pysnmp.hlapi import asyncore as snmpAC
  5. from pysnmp.carrier.asynsock.dispatch import AsynsockDispatcher
  8. class CmdlineClient(asyncore.file_dispatcher):
  9.     def handle_read(self):
  10.     buf = self.recv(1024)
  11.     print "you said {}".format(buf)
  14. def myCallback(snmpEngine, sendRequestHandle, errorIndication,
  15.                errorStatus, errorIndex, varBinds, cbCtx):
  16.     print "myCallback!!"
  17.     if errorIndication:
  18.         print(errorIndication)
  19.         return
  20.     if errorStatus:
  21.         print('%s at %s' % (errorStatus.prettyPrint(),
  22.               errorIndex and varBinds[int(errorIndex)-1] or '?')
  23.              )
  24.         return
  26.     for oid, val in varBinds:
  27.     if val is None:
  28.         print(oid.prettyPrint())
  29.     else:
  30.         print('%s = %s' % (oid.prettyPrint(), val.prettyPrint()))
  32. sharedSocketMap = {}
  33. transportDispatcher = AsynsockDispatcher()
  34. transportDispatcher.setSocketMap(sharedSocketMap)
  35. snmpEngine = snmpAC.SnmpEngine()
  36. snmpEngine.registerTransportDispatcher(transportDispatcher)
  37. sharedSocketMap[sys.stdin] = CmdlineClient(sys.stdin)
  39. snmpAC.getCmd(
  40.     snmpEngine,
  41.     snmpAC.CommunityData('public'),
  42.     snmpAC.UdpTransportTarget(('', 161)),
  43.     snmpAC.ContextData(),
  44.     snmpAC.ObjectType(
  45.         snmpAC.ObjectIdentity('SNMPv2-MIB', 'sysDescr', 0)),
  46.     cbFun=myCallback)
  48. while True:
  49.     asyncore.poll(timeout=0.5, map=sharedSocketMap)
  50.     if transportDispatcher.jobsArePending() or transportDispatcher.transportsAreWorking():
  51.         transportDispatcher.handleTimerTick(time())

Some interesting lines from the above code:

  • Lines 8-11 define the stdin handler class; its handle_read method is called whenever there is text available on stdin.
  • Line 34 is where pysnmp is told to use our socket map rather than its built-in one.
  • Line 37 is where we use the socket map to register the handler for input arriving on stdin.
  • Lines 39-46 send an SNMP query using the high-level API.
  • Lines 48-51 are my simple socket poller.

With all this I can handle keyboard presses and network traffic, such as a simple SNMP poll.

14 November, 2015 07:04AM by Craig

November 13, 2015

hackergotchi for Francois Marier

Francois Marier

How Tracking Protection works in Firefox

Firefox 42, which was released last week, introduced a new feature in its Private Browsing mode: tracking protection.

If you are interested in how the list of blocked trackers is put together and then used in Firefox, this post is for you.

Safe Browsing lists

There are many possible ways to download URL lists to the browser and check against that list before loading anything. One of those is already implemented as part of our malware and phishing protection. It uses the Safe Browsing v2.2 protocol.

In a nutshell, the way that this works is that each URL on the block list is hashed (using SHA-256) and then that list of hashes is downloaded by Firefox and stored into a data structure on disk:

  • ~/.cache/mozilla/firefox/XXXX/safebrowsing/mozstd-track* on Linux
  • ~/Library/Caches/Firefox/Profiles/XXXX/safebrowsing/mozstd-track* on Mac
  • C:\Users\XXXX\AppData\Local\mozilla\firefox\profiles\XXXX\safebrowsing\mozstd-track* on Windows
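
As a rough illustration, the digests stored in these files are plain SHA-256 hashes of the canonicalized list entries; a minimal sketch (using a hypothetical entry) looks like this:

import hashlib

def digest256(entry):
    # Full SHA-256 digest of a canonicalized block-list entry, as kept in
    # the digest256 tables.
    return hashlib.sha256(entry.encode("utf-8")).hexdigest()

print(digest256("tracker.example/"))  # hypothetical canonicalized entry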

This sbdbdump script can be used to extract the hashes contained in these files and will output something like this:

$ ~/sbdbdump/ -v .
- Reading sbstore: mozstd-track-digest256
[mozstd-track-digest256] magic 1231AF3B Version 3 NumAddChunk: 1 NumSubChunk: 0 NumAddPrefix: 0 NumSubPrefix: 0 NumAddComplete: 1696 NumSubComplete: 0
[mozstd-track-digest256] AddChunks: 1445465225
[mozstd-track-digest256] SubChunks:
[mozstd-track-digest256] addComplete[chunk:1445465225] e48768b0ce59561e5bc141a52061dd45524e75b66cad7d59dd92e4307625bdc5
[mozstd-track-digest256] MD5: 81a8becb0903de19351427b24921a772

The name of the blocklist being dumped here (mozstd-track-digest256) is set in the urlclassifier.trackingTable preference which you can find in about:config. The most important part of the output shown above is the addComplete line which contains a hash that we will see again in a later section.

List lookups

Once it's time to load a resource, Firefox hashes the URL, as well as a few variations of it, and then looks for it in the local lists.

If there's no match, then the load proceeds. If there's a match, then we do an additional check against a pairwise allowlist.

The pairwise allowlist (hardcoded in the urlclassifier.trackingWhitelistTable pref) is designed to encode what we call "entity relationships". The list groups related domains together for the purpose of checking whether a load is first or third party (e.g. two domains owned by the same company count as the same entity).

Entries on this list (named mozstd-trackwhite-digest256) look like this:

which translates to "if you're on the first site, then don't block resources from the second".

If there's a match on the second list, we don't block the load. It's only when we get a match on the first list and not the second one that we go ahead and cancel the network load.
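
The decision logic boils down to something like the following sketch, where the two sets stand in for the blocklist and the pairwise allowlist (the canonicalization and the allowlist entry format are simplified assumptions here, not the exact ones Firefox uses):

import hashlib

def h(entry):
    return hashlib.sha256(entry.encode("utf-8")).digest()

def should_block(page_host, resource_host, blocklist, allowlist):
    if h(resource_host + "/") not in blocklist:
        return False   # not a known tracker: the load proceeds
    if h(page_host + "," + resource_host) in allowlist:
        return False   # same entity as the top-level page: don't block
    return True        # on the blocklist and not allowlisted: cancel the load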

If you visit our test page, you will see tracking protection in action with a shield icon in the URL bar. Opening the developer tool console will expose the URL of the resource that was blocked:

The resource at "" was blocked because tracking protection is enabled.

Creating the lists

The blocklist is created by Disconnect according to their definition of tracking.

The Disconnect list is on their GitHub page, but the copy we use in Firefox is the copy we have in our own repository. Similarly, the Disconnect entity list is from here, but our copy is in our repository. Should you wish to be notified of any changes to the lists, you can simply subscribe to this Atom feed.

To convert this JSON-formatted list into the binary format needed by the Safe Browsing code, we run a custom list generation script whenever the list changes on GitHub.

If you run that script locally using the same configuration as our server stack, you can see the conversion from the original list to the binary hashes.

Here's a sample entry from the mozstd-track-digest256.log file:

[m] >>
[hash] e48768b0ce59561e5bc141a52061dd45524e75b66cad7d59dd92e4307625bdc5

and one from mozstd-trackwhite-digest256.log:

[entity] Twitter >> (canonicalized), hash a8e9e3456f46dbe49551c7da3860f64393d8f9d96f42b5ae86927722467577df

This, in combination with the sbdbdump script mentioned earlier, will allow you to audit the contents of the local lists.
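
For illustration, a much-simplified version of that conversion could look like this, assuming a plain {"domains": [...]} JSON input rather than the real Disconnect format:

import hashlib, json

def generate_digests(json_path, out_path):
    # Hash every domain and append the raw 32-byte digests to the output file.
    with open(json_path) as f:
        domains = json.load(f)["domains"]
    with open(out_path, "wb") as out:
        for domain in domains:
            out.write(hashlib.sha256((domain + "/").encode("utf-8")).digest())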

Serving the lists

The way that the binary lists are served to Firefox is through a custom server component written by Mozilla: shavar.

Every hour, Firefox requests updates from the shavar server. If new data is available, the whole list is downloaded again. Otherwise, all it receives in return is an empty 204 response.
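
A hedged sketch of that update check might look like the following, with a placeholder endpoint and request format rather than the real shavar URL and protocol:

import requests

UPDATE_URL = "https://shavar.example/downloads"   # placeholder, not the real endpoint

def fetch_update(list_name="mozstd-track-digest256"):
    resp = requests.post(UPDATE_URL, data={"list": list_name})
    if resp.status_code == 204:
        return None            # nothing new since the last check
    resp.raise_for_status()
    return resp.content        # the full list, to be stored locally again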

Should you want to play with it and run your own server, follow the installation instructions and then go into about:config to change these preferences to point to your own instance:


Note that on Firefox 43 and later, these prefs have been renamed to:


Learn more

If you want to learn more about how tracking protection works in Firefox, you can find all of the technical details on the Mozilla wiki or you can ask questions on our mailing list.

Thanks to Tanvi Vyas for reviewing a draft of this post.

13 November, 2015 08:40PM