December 01, 2020

Jonathan Carter

Free Software Activities for 2020-11

This month went past way too fast. I didn't get to all the stuff I wanted to, but I managed to cover many essentials (not all listed here) that I'll cover in follow-up posts. In particular, highlights that I'm thankful for are that we've selected the final artwork for Bullseye, and that we've successfully hosted another two MiniDebConfs: one that was gaming themed, and a Brazilian event held entirely in Portuguese! Videos are up on Debian's PeerTube instance (Gaming Edition | Brazil) and on the DebConf video archive for direct download.

Remember to take care of yourself out there! Physical safety is high on everyone's mind in these times, but remember to pay attention to your mental health too. It's ok if you don't hit all your usual targets and goals right now; don't be so hard on yourself that you burn out!

2020-11-01: Upload package gtetrinet (0.7.11+git20200916.46e7ade-2~bpo10+1) to Debian buster-backports.

2020-11-01: Upload package gnome-shell-extension-disconnect-wifi (26-1) to Debian unstable.

2020-11-02: Merge MR!2, MR!4 and MR!5 for zram-tools, followed by a 3-way merge closing MR!1 and MR!3.

2020-11-02: Upload package zram-swap (0.3.3-1) to Debian unstable (Closes: #917643, #928439, #928443).

2020-11-02: Close live-installer bugs #646704 (fix released a few years ago already), #700642 (nothing left to fix), #835391 (unreproducible on latest images), #847446 (graphical d-i installer no longer provided) and #714710 (problem not present on latest installation media).

2020-11-02: File ROM for calcoo (#973638) – no longer maintained upstream, GTK-2 only.

2020-11-03: Upload package bundlewrap (4.2.2-1) to Debian unstable.

2020-11-03: Upload package feed2toot (0.14-1) to Debian unstable.

2020-11-03: Upload package feed2toot (0.14-2) to Debian unstable.

2020-11-03: Upload package flask-autoindex (0.6.6-2) to Debian unstable.

2020-11-03: Upload package flask-caching (1.9.0-1) to Debian unstable.

2020-11-03: Upload package flask-restful (0.3.8-5) to Debian unstable.

2020-11-08: Upload package s-tui (1.0.2-2) to Debian unstable (Closes: #961534).

2020-11-09: Merge MR!1 for bluefish (remove old icon).

2020-11-10: Upload package bluefish (2.2.12-1) to Debian unstable.

2020-11-10: Upload package calamares (3.2.33-1) to Debian unstable.

2020-11-11: Upload package calamares-settings-debian (11.0.4-1) to Debian unstable.

2020-11-17: Upload package gnome-shell-extension-multiple-workspaces (22-1) to Debian unstable.

2020-11-24: Sponsor package xmodem (0.4.6+dfsg-2) for Debian unstable (Python Team request).

2020-11-24: Sponsor package python-opentracing (2.4.0-1) for Debian unstable (Python Team request).

2020-11-24: Sponsor package python-css-parser (1.0.6-1) for Debian unstable (Python Team request).

2020-11-24: Review package buildbot (2.8.4-1) (Needs some more work) (Python Team request).

2020-11-24: Review package gbsplay (0.0.94-1) (Needs some more work) (Games Team request).

2020-11-24: Sponsor package goverlay (0.4.2-1) for Debian unstable (Games Team request).

2020-11-24: Sponsor package lutris (0.5.8-1) for Debian unstable (Games Team request).

2020-11-24: Review package mangohud (0.5.1-1) for Debian unstable (Needs some more work) (Games Team request).

2020-11-24: Sponsor package vkbasalt (0.3.2.3-1) for Debian unstable (Games Team request).

2020-11-25: Sponsor package starfighter (2.3.3-1) for Debian unstable (Games Team request).

2020-11-25: Sponsor package pentobi (18.3-1) for Debian unstable (Games Team request).

2020-11-30: Sponsor package lutris (0.5.8-1) for Debian unstable (Games Team request) (New upload).

01 December, 2020 07:44AM by jonathan

Paul Wise

FLOSS Activities November 2020

Focus

This month I didn't have any particular focus. I just worked on issues in my info bubble.

Changes

Issues

Review

Administration

  • Debian wiki: disable attachments due to security issue, approve accounts

Communication

  • Respond to queries from Debian users and contributors on the mailing lists and IRC

Sponsors

The visdom and apt-listchanges work and the lintian-brush bug report were sponsored by my employer. All other work was done on a volunteer basis.

01 December, 2020 01:12AM

Dirk Eddelbuettel

inline 0.3.17: Refactored and New Tests

A new release of the inline package arrived on CRAN this evening and has already been shipped to Debian as well. inline facilitates writing code in-line in simple string expressions or short files. The package was used quite extensively by Rcpp in the days before Rcpp Attributes arrived on the scene, providing an even better alternative for its use cases. inline is still used by rstan and a number of other packages.

One of those other packages is mkin, and its author Johannes Ranke overhauled the saving and re-loading of compiled C functions with a really well-done set of contributions. In the process we also added unit testing via the lovely tinytest, and switched the continuous integration setup to r-ci.

See below for a detailed list of changes extracted from the NEWS file.

Changes in inline version 0.3.17 (2020-11-30)

  • Unit testing is now supported via tinytest (Johannes in #15 addressing #14).

  • CI was updated to use focal and run.sh from r-ci on Travis and GitHub Actions (Dirk)

  • The writing and reading of compiled code was refactored and extended (Johannes in #16 fixing #13).

  • Some minor problems related to CRAN checks and tests were corrected (Johannes and Dirk in #17, Johannes in #18, #19, #20).

  • Small stylistic updates have been applied to some R and Rd files (Dirk).

Courtesy of my CRANberries, there is a comparison to the previous release.

If you like this or other open-source work I do, you can now sponsor me at GitHub. For the first year, GitHub will match your contributions.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

01 December, 2020 01:09AM

November 30, 2020

Chris Lamb

Free software activities in November 2020

Here is my monthly update covering what I have been doing in the free software world during November 2020 (previous month):

  • Merged a pull request from Jens Nistler for django-slack (my library which provides a convenient wrapper between projects using the Django web framework and the Slack chat platform) to make it compatible with Celery version 5. [...]

§


Reproducible Builds

One of the original promises of open source software is that distributed peer review and transparency of process results in enhanced end-user security. However, whilst anyone may inspect the source code of free and open source software for malicious flaws, almost all software today is distributed as pre-compiled binaries. This allows nefarious third-parties to compromise systems by injecting malicious code into ostensibly secure software during the various compilation and distribution processes.

The motivation behind the Reproducible Builds effort is to ensure no flaws have been introduced during this compilation process by promising identical results are always generated from a given source, thus allowing multiple third-parties to come to a consensus on whether a build was compromised.

The project is proud to be a member project of the Software Freedom Conservancy. Conservancy acts as a corporate umbrella allowing projects to operate as non-profit initiatives without managing their own corporate structure. If you like the work of the Conservancy or the Reproducible Builds project, please consider becoming an official supporter.


This month, I:


I also made the following changes to diffoscope:

  • Improvements:

    • Move the slightly-confusing behaviour when a single file is passed to diffoscope on the command line to a new --load-existing-diff command-line option. [...]
    • Ensure the new diffoscope-minimal package that was introduced by Mattia Rizzolo has a different short description from the primary diffoscope one. [...]
    • Refresh the long and short descriptions of all of the Debian packages. [...]
  • Bug fixes:

    • Don't depend on radare2 in the Debian 'autopkgtests' as it will not be in bullseye due to security considerations. (#975313)
    • Avoid some incorrectly-formatted error messages. This was caused by diffoscope raising an artificial CalledProcessError exception in a generic handler. [...]
  • Codebase improvements:

    • Add a comment regarding Java tests to aid diffoscope contributors who are not using Debian [...] and don't use the old-style super(...) call [...].

§


Debian

I performed the following uploads to the Debian Linux distribution this month:

  • python-django (2.2.17-1 & 3.1.3-1) — New upstream releases.

  • memcached (1.6.9+dfsg-1) — New upstream release.

  • lintian (2.101.0, 2.102.0, 2.103.0 & 2.104.0) — New upstream releases.

  • xtrlock (2.14) — Mark an autopkgtest as 'superficial'. (#974491)

  • bfs (2.1-1) — New upstream release.

  • splint (3.1.2+dfsg-3) — Re-upload a previous QA upload of mine (3.1.2+dfsg-2) to ensure the package's transition to the testing distribution. (#974872)

I also filed a release-critical bug against the minidlna package which could not be successfully purged from the system without reporting a cannot remove '/var/log/minidlna' error. (#975372)


§


Debian LTS

This month I have worked 18 hours on Debian Long Term Support (LTS) and 12 hours on its sister Extended LTS project, including:

You can find out more about the Debian LTS project via the following video:

30 November, 2020 06:58PM

John Goerzen

Thanksgiving in 2020

With COVID-19, Thanksgiving is a little different this year.

The kids enjoyed doing a little sightseeing by air – in our own plane (all socially-distanced of course!). We built a Prusa 3D printer from a kit (the boys and I, though Martha checked in periodically too). It arrived earlier than expected so that kept us busy for several days. And, of course, there was the Christmas decorating and Zoom church (where only our family is in the building, hosting the service for everyone).

What, so Thanksgiving doesn’t normally involve assembling printers, sightseeing from the sky, and printing tiny cups and dishes for miniature houses on a 3D printer?

I’ll be glad when COVID is over. Meantime, we have some memories to treasure too.

30 November, 2020 04:20AM by John Goerzen

November 28, 2020

Mark Brown

Book club: Rust after the honeymoon

Earlier this month Daniel, Lars and I got together to discuss Bryan Cantrill's article Rust after the honeymoon. It is an overview of what keeps him enjoying working with Rust after having used it for an extended period of time for low-level systems work at Oxide; we were particularly interested to read a perspective from someone who is both very experienced in general and has been working with the language for a while. While I have no experience with Rust, both Lars and Daniel have been using it for a while and greatly enjoy it.

One of the first areas we discussed was data-bearing enums – these have been very important to Bryan. In keeping with a pattern we all noted, they take a construct that's relatively commonly implemented by hand in C (or skipped as too much effort, as Lars found) and provide direct support in the language for it. For both Daniel and Lars this has been key to their enjoyment of Rust: it turns things that are good practice or common idioms in C and C++ into first-class language features, which makes them more robust and allows them to fade into the background in a way they can't when done by hand.
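
For readers who, like me, have little Rust experience, here is a minimal sketch of what a data-bearing enum looks like (the Command type and its variants are made up purely for illustration): each variant carries its own payload, and an exhaustive match replaces the hand-rolled tagged union one would write in C.

// A minimal illustrative sketch of a data-bearing enum: each variant
// carries its own payload, and `match` forces every case to be handled.
enum Command {
    Quit,
    Move { x: i32, y: i32 },
    Write(String),
}

fn describe(cmd: &Command) -> String {
    match cmd {
        Command::Quit => String::from("quit"),
        Command::Move { x, y } => format!("move to ({}, {})", x, y),
        Command::Write(text) => format!("write {:?}", text),
    }
}

fn main() {
    println!("{}", describe(&Command::Write(String::from("hello"))));
}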

Daniel was also surprised by some omissions from the article, some small such as the ? operator but others much more substantial – the standout one being editions. These aim to address the problems seen with version transitions in other languages like Python, allowing individual parts of a Rust program to adopt potentially incompatible language features while remaining interoperable with older editions of the language, rather than requiring the entire program to be upgraded en masse. This helps Rust move forwards with less need to maintain strict source-level compatibility, allowing much more rapid evolution and helping deal with any issues that are found. Lars expressed the results of this very clearly, saying that while lots of languages offer a 20%/80% solution which does very well in specific problem domains but has issues for some applications, Rust is much more able to move towards general applicability by addressing problems and omissions as they are understood.
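
As a concrete illustration of how editions are opted into (a sketch of my own, not something from the article – the crate name and snippet are hypothetical): the edition is declared per crate in Cargo.toml, so crates on different editions can coexist in the same build.

// Editions are chosen per crate, in that crate's Cargo.toml:
//
//     [package]
//     name = "example"    # hypothetical crate name
//     version = "0.1.0"
//     edition = "2018"
//
// A dependency still on the 2015 edition can keep using `async` as an
// ordinary identifier, while a 2018-edition crate treats it as a keyword;
// the compiler applies each crate's own edition, so a program never has to
// migrate all at once.
fn main() {
    println!("editions are a per-crate setting, not a per-program one");
}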

This distracted us a bit from the actual content of the article and we had an interesting discussion of the issues with handling OS differences in filenames portably. Rather than mapping filenames onto a standard type within the language and then having to map back out into whatever representation the system actually uses, Rust has an explicit type for filenames which must be explicitly converted on those occasions when it's required, meaning that a lot of file handling never needs to worry about anything except the OS-native format and doesn't run into surprises. This is in keeping with Rust's general approach to interfacing with things that can't be represented in its abstractions: rather than hide things, it keeps track of where things that might break its assumptions are and requires the programmer to acknowledge and handle them explicitly. Both Lars and Daniel said that this made them feel a lot more confident in the code that they were writing and that they had a good handle on where complexity might lie; Lars noted that Rust is the first language he's felt comfortable writing multi-threaded code in.
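
A rough sketch of the filename point (my own illustration, with a made-up path): the standard library keeps paths in the OS-native OsStr representation, and converting to a UTF-8 &str is an explicit, fallible step that the programmer has to acknowledge.

use std::path::Path;

fn main() {
    // Paths stay in the OS-native representation (OsStr) rather than being
    // forced into UTF-8 strings.
    let path = Path::new("/var/log/syslog");

    // Converting to &str is explicit and can fail, so the "not valid
    // Unicode" case has to be handled somehow.
    match path.file_name().and_then(|name| name.to_str()) {
        Some(name) => println!("file name: {}", name),
        None => println!("file name is not valid UTF-8"),
    }
}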

We all agreed that the effect here is more about having idioms which tend to be robust and which both encourage writing things well and give readers tools to help them know where particular attention is required – no tooling can avoid problems entirely. This was definitely an interesting discussion for me with my limited familiarity with Rust; hopefully Daniel and Lars also got a lot out of it!

28 November, 2020 03:58PM by broonie

Russ Allbery

Review: Nine Goblins

Review: Nine Goblins, by T. Kingfisher

Publisher: Red Wombat Tea Company
Copyright: 2013
ASIN: B00G9GSEXO
Format: Kindle
Pages: 140

The goblins are at war, a messy multi-sided war also involving humans, elves, and orcs. The war was not exactly their idea, although the humans would claim otherwise. Goblins kept moving farther and farther into the wilderness to avoid human settlements, and then they ran out of wilderness, and it wasn't clear what else to do. For the Nineteenth Infantry, the war is a confusing business, full of boredom and screaming and being miserable and following inexplicable orders. And then they run into a wizard.

Wizards in this world are not right in the head, and by not right I mean completely psychotic. That's the only way that you get magical powers. Wizards are therefore incredibly dangerous and scarily unpredictable, so when the Whinin' Nineteenth run into a human wizard who shoots blue out of his mouth, making him stop shooting blue out of his mouth becomes a high priority. Goblins have only one effective way of stopping things: charge at them and hit them with something until they stop. Wizards have things like emergency escape portals. And that's how the entire troop of nine goblins ended up far, far behind enemy lines.

Sings-to-Trees's problems, in contrast, are rather more domestic. At the start of the book, they involve, well:

Sings-to-Trees had hair the color of sunlight and ashes, delicately pointed ears, and eyes the translucent green of new leaves. His shirt was off, he had the sort of tanned muscle acquired from years of healthy outdoor living, and you could have sharpened a sword on his cheekbones.

He was saved from being a young maiden's fantasy — unless she was a very peculiar young maiden — by the fact that he was buried up to the shoulder in the unpleasant end of a heavily pregnant unicorn.

Sings-to-Trees is the sort of elf who lives by himself, has a healthy appreciation for what nursing wild animals involves, and does it anyway because he truly loves animals. Despite that, he was not entirely prepared to deal with a skeleton deer with a broken limb, or at least with the implications of injured skeleton deer who are attracted by magical disturbances showing up in his yard.

As one might expect, Sings-to-Trees and the goblins run into each other while having to sort out some problems that are even more dangerous than the war the goblins were unexpectedly removed from. But the point of this novella is not a deep or complex plot. It pushes together a bunch of delightfully weird and occasionally grumpy characters, throws a challenge at them, and gives them space to act like fundamentally decent people working within their constraints and preconceptions. It is, in other words, an excellent vehicle for Ursula Vernon (writing as T. Kingfisher) to describe exasperated good-heartedness and stubbornly determined decency.

Sings-to-Trees gazed off in the middle distance with a vague, pleasant expression, the way that most people do when present at other people's minor domestic disputes, and after a moment, the stag had stopped rattling, and the doe had turned back and rested her chin trustingly on Sings-to-Trees' shoulder.

This would have been a touching gesture, if her chin hadn't been made of painfully pointy blades of bone. It was like being snuggled by an affectionate plow.

It's not a book you read for the twists and revelations (the resolution is a bit of an anti-climax). It's strength is in the side moments of characterization, in the author's light-hearted style, and in descriptions like the above. Sings-to-Trees is among my favorite characters in all of Vernon's books, surpassed only by gnoles and a few characters in Digger.

The Kingfisher books I've read recently have involved humans and magic and romance and more standard fantasy plots. This book is from seven years ago and reminds me more of Digger. There is less expected plot machinery, more random asides, more narrator presence, inhuman characters, no romance, and a lot more focus on characters deciding moment to moment how to tackle the problem directly in front of them. I wouldn't call it a children's book (all of the characters are adults), but it has a bit of that simplicity and descriptive focus.

If you like Kingfisher in descriptive mode, or enjoy Vernon's descriptions of D&D campaigns on Twitter, you are probably going to like this. If you don't, you may not. I thought it was slight but perfect for my mood at the time.

Rating: 7 out of 10

28 November, 2020 07:05AM

November 27, 2020

Shirish Agarwal

Farmer Protests and RCEP

Farmer Protests

While I was hoping to write about RCEP exclusively, just today farmer protests have broken out against three farm laws which were passed by our Govt. about a month ago without consulting anybody. The bills benefit only big business houses at the cost of farmers. This has been amply laid out in an open letter to one of the biggest business houses, which will benefit the most.

While that is the national picture, let me share some experience from the state I come from, Maharashtra. About 4-5 years back Maharashtra delisted fruit and vegetables from the APMC market, but to date the APMC market keeps working; the reasons are many. What the change did do, however, was push farmers much more towards sugarcane, a water-guzzling crop, than before. This has lowered the water table in Maharashtra, pushed farmers further into the debt trap, and later drove some of them to suicide.

Now let us see why the Punjab farmers have been so agitated that they are walking all the way to Delhi. They are right now somewhere around the Haryana-Delhi border. The reason is that even their experiments with contract farming have not gone well. This is why they are struggling to get to Delhi, to make their collective voices heard and get the farm bills rolled back. Even the farmers from Gujarat were sued, and although the suits were rolled back because of elections, the intentions are clear. At the end of the day, the laws made by the Govt. leave our farmers at the mercy of big corporations. It is preposterous to believe that the farmer, with their small land holdings, will be able to stand up to the corporations. Add to that, they cannot go to court: it is the SDM (Sub-Divisional Magistrate) who will decide on the matters and has the last word. If this is allowed, in a couple of years there will be only a few farmers or corporations with large land holdings, and they would be easily co-opted by the Government in power.

Just in – a gentleman who turned off a water cannon that was being fired at farmers has been charged with murder 😦

Currently, the Government procures rice in vast quantities in the states of Punjab and Haryana, and the farmers there are assured at least some basic income –

Procurement of Rice by Various States

Recently there was also an article in the Indian Express which covers the farmers' apprehensions and acknowledges that it's a complex problem with no easy solutions; the only real solution is dialogue between the two parties. This was also the view of Vivek Kaul, who is far more knowledgeable than me on the subject and wrote a long read on it.

The Canada Way

Recently, while sparring on the Internet, I came to know of the Canada way. There, the Government makes the farmer a corporation and then helps them. But the Canada way seems to work largely because the Canadian Government owns the majority of the lands in question. And yes, Indians have benefited from it, but that is also due to (a) the currency differential between the Canadian dollar and the Indian rupee and (b) the 99-year land lease. There may be other advantages that the Canadian Government bestows, and that is possibly the reason most Punjabi farmers go to Canada and the UK to farm.

While looking into it, I also came across the situation in the United States, which seems to be becoming even grimmer.

RCEP

RCEP stands for Regional Comprehensive Economic Partnership. We were supposed to be part of this partnership. Why didn't we join? For two reasons. First, our judicial infrastructure is in terrible shape: it took 8 years to decide a retrospective tax case (Vodafone), and that too finally outside India – and that decision is by no means the end. The other reason is that all those who have joined RCEP have lower duties and tariffs than India, which means they are much more competitive than India. While there is fear that China may take over assets, as it has done with a few countries around the world, the opportunity for those countries was too good to pass up even with the dangers. But then, even India has taken loans from the Asian Infrastructure Investment Bank (AIIB), where China is the biggest shareholder, so it doesn't make sense to be insecure on that front. And again, it is up to India, or any other sovereign country, to decide whether to take loans from some country, some multilateral organization or any other source, and on what terms.

What China has done and is doing is similar to what the IMF (used primarily by the United States) did in the past. The only difference is that then it was the United States and now it is China. America co-opted Governments and got assets; China is doing the same – no real difference in tactics, more or less the same playbook.

There has also been a somewhat interesting paper which discusses how the RCEP may unfold in different circumstances. In short, it finds that the partners will benefit, some more than others. It also compares the RCEP to the CPTPP (the Comprehensive and Progressive Agreement for Trans-Pacific Partnership). The study is a bit academic in nature, since the United States has walked out and president-elect Joe Biden hasn't made any moves – and is unlikely to, as there is a deep divide and resentment about multilateral trade partnerships domestically within the United States. This was quite shocking to me, as it shows that unlike the United States of the past, which was supposed to be a beacon of capitalism and seemed to enjoy it, it now seems to be merely opportunistic. There is also the truth that under Biden there are only so many things on which he would need to, and can, spend his political capital.

Statistica Chart of differences between Republicans and Democrats

As can be seen, the economy, at least for the Democrats, is pretty far down the list this time around. Biden has a host of battles and will have to choose which to fight and which to ignore.

In the end, we are left to our own devices. At the moment, India does not know when its economy will recover –

PTI News, Nov 27, 2020

There has been another worrying bit of news: all newspapers will now need some sort of permission or certification from the Govt. of India for any news of the world. This harks back to the 1970s and 1980s era.

27 November, 2020 07:22PM by shirishag75

Arturo Borrero González

Netfilter virtual workshop 2020 summary

Netfilter logo

Once a year folks interested in Netfilter technologies gather together to discuss past, ongoing and future works. The Netfilter Workshop is an opportunity to share and discuss new ideas, the state of the project, bring people together to work & hack and to put faces to people who otherwise are just email names. This is an event that has been happening since at least 2001, so we are talking about a genuine community thing here.

It was decided there would be an online format, split into 3 short meetings, once per week on Fridays. I was unable to attend the first session on 2020-11-06 due to a scheduling conflict, but I made it to the sessions on 2020-11-13 and 2020-11-20. I would say the sessions were joined by about 8 to 10 people, depending on the day. This post is a summary with some notes on what happened in this edition, in no special order.

Pablo did the classical review of all the changes and updates that happened in all the Netfilter project software components since last workshop. I was unable to watch this presentation, so I have nothing special to comment. However, I’ve been following the development of the project very closely, and there are several interesting things going on, some of them commented below.

Florian Westphal brought to the table the status of some open/pending work on mptcp option matching, systemd integration and, finally, interfacing nft with cgroupv2. I was unable to participate in the talk for the first two items, so I cannot comment much more. On the cgroupv2 side, several options were evaluated for how to match them: identification methods, the hierarchical tree that cgroups present, etc. We will have to wait a bit more to see what the final implementation looks like.

Also, Florian presented his concerns on conntrack hash collisions. There are no real-world known issues at the moment, but there is an old paper that suggests we should keep an eye on this and introduce improvements to prevent future DoS attack vectors. Florian mentioned these attacks are not practical at the moment, but who knows in a few years. He wants to explore introducing RB trees for conntrack. It will probably be an rbtree structure of hash tables in order to keep supporting parallel insertions. He was encouraged by others to go ahead and play/explore with this.

Phil Sutter shared his past and future iptables development efforts. He highlighted fixed bugs and his short/midterm TODO list. I know Phil has been busy lately fixing iptables-legacy/iptables-nft incompatibilities. Basically addressing annoying bugs discovered by all ruleset managers out there (kubernetes, docker, openstack neutron, etc). Lots of work has been done to improve the situation; moreover I myself reported, or forwarded from the Debian bug tracker, several bugs. Anyway I was unable to attend this talk, only learnt a few bits in the following sessions, so I don’t have a lot to comment here.

But when I was fully present, I was asked by Phil about the status of netfilter components in Debian, and future plans. I shared my information. The idea for the next Debian stable release is to not include iptables in the installer, and to include nftables instead. Since Debian Buster, nftables has been the default firewalling tool anyway. He shared the plans for the RedHat-related ecosystem, and we were able to confirm that we are basically in sync.

Pablo commented on the latest Netfilter flowtable enhancements. Using the flowtable infrastructure, one can create kernel network bypasses to speed up packet throughput. The latest changes are aimed at bridge and VLAN-enabled setups. The flowtable component will now know how to bypass in these 2 network architectures as well as the previously supported ingress hook. This is basically aimed at virtual machine and container scenarios. There was some debate on use cases and supported setups. I commented that a bunch of virtual machines connected to a classic Linux bridge and then doing NAT is basically what OpenStack Neutron does, specifically in DVR setups. The same can be found in some container-based environments. Early/simple benchmarks done by Pablo suggest there could be huge performance improvements for those use cases. There was some inevitable comparison of this approach to what others, like DPDK or XDP, can do. A point was raised about this being a more generic and operating-system-integrated solution, which should make it more extensible and easier to use.

flowtable for bridges

Stefano Brivio commented on several open nftables topics that he is interested in working on. One of them is issues related to concatenations + vmaps. He also addressed concerns about people's expectations when migrating from ipset to nftables. There are several corner features in ipset that aren't currently supported in nftables, and we should document them. Stefano is also wondering about some tools to help in the migration, a translation layer like the one in place for iptables. Eric Garver commented that there are a couple of semantics that will not be suitable for translation, such as global sets, or sets of sets. But ipset is way simpler than iptables, so a translation mechanism should probably be created. In any case, there was agreement that anything that helps people migrate is more than welcome, even if it doesn't support 100% of the use cases.

Stefano is planning to write documentation in the nftables wiki on how the pipapo algorithm works and the supported use cases. Other plans by Stefano include working on some optimisations for faster matches. He mentioned using architecture-specific instructions to speed up set operations, like lookups.

Finally, he commented that some folks working with eBPF have shown interest in reusing some parts of the nftables sets infrastructure (pipapo) because they have detected performance issues in their own data structures in some cases. It is not clear how to best achieve this, or how to better bridge the two things together. Probably the ideal would be to generalize the pipapo data structures and integrate them into the generic bitmap library, or something similar which can be used by anyone. Anyway, he hopes to get some more time to focus on Netfilter stuff beginning next year, in a couple of months.

Moving a bit away from the pure software development topics, Pablo commented on the netfilter.org infrastructure. Right now the servers are running on gandi.net, on virtual machines that are basically donated to us. He pointed out that the plan is to simplify the infrastructure. For that reason, for example, FTP services have been shut down. Rsync services have been shut down as well, so basically we no longer have a mirrors infrastructure. The bugzilla and wikis we have need some attention, given they are old pieces of software, and we need to migrate them to something more modern. Finally, the new logo that was created was presented.

Later on, we spent a good chunk of the meeting discussing options on how to address the inevitable iptables deprecation situation. There are some open questions, and we discussed several approaches, from doing nothing at all, which means keeping the current status quo, to setting a deadline date for the deprecation like the Python community did with Python 2. I personally like this deadline idea, but it is perceived as a negative push by others. We all agree that the current 'do nothing' approach is not sustainable either. Probably the way to go is basically to be more informative. We need to clearly communicate that choosing iptables for anything in 2020 is a bad idea. There are additional initiatives to help on this topic without being too aggressive. A FAQ will probably be introduced. Eric Garver suggested we should bring nftables front and center. Given the website still mentions iptables everywhere, we will probably refresh the web content and introduce additional informative banners and similar things.

There was an interesting talk on the topic of nft table ownership. The idea is to attach a table, and all its child objects, to a process, and then prevent any modifications to the table or the child objects by external entities. Basically, allocating and locking a table for a certain netlink socket. This is a nice way for ruleset managers, like firewalld, to ensure they have full control of what's happening to their ruleset, reducing the chances of ending up with an inconsistent configuration. There is a proof-of-concept patch by Pablo to support this, and Eric mentioned he is very much interested in any improvements to support this use case.

The final time block in the final session day was dedicated to talk about the next workshop. We are all very happy we could meet. Meeting virtually is way easier (and cheaper) than in person. Perhaps we can make it online every 3 or 6 months instead of, or in addition to, one big annual physical event. We will see what happens next year.

That’s all on my side!

27 November, 2020 01:30PM

Reproducible Builds (diffoscope)

diffoscope 162 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 162. This version includes the following changes:

[ Chris Lamb ]
* Don't depends on radare2 in the Debian autopkgtests as it will not be in
  bullseye due to security considerations (#950372). (Closes: #975313)
* Avoid "Command `s p a c e d o u t` failed" messages when creating an
  artificial CalledProcessError instance in our generic from_operation
  feeder creator.
* Overhaul long and short descriptions.
* Use the operation's full name so that "command failed" messages include
  its arguments.
* Add a missing comma in a comment.

[ Jelmer Vernooij ]
* Add missing space to the error message when only one argument is passed to
  diffoscope.

[ Holger Levsen ]
* Update Standards-Version to 4.5.1.

[ Mattia Rizzolo ]
* Split the diffoscope package into a diffoscope-minimal package that
  excludes the larger packages from Recommends. (Closes: #975261)
* Drop support for Python 3.6.

You can find out more by visiting the project homepage.

27 November, 2020 12:00AM

November 26, 2020

Jonathan Dowland

Touched by the Hand of God

picture of a vinyl record

In honour of Diego Maradona (RIP), this morning's cobweb-shifter is New Order's "Touched by the Hand of God"

26 November, 2020 10:18AM

November 25, 2020

Shirish Agarwal

Women state in India and proposal for corporates in Indian banking

Gradle and Kotlin in Debian

A few months back, I was looking at where Gradle and Kotlin were in Debian. They still seem to be a work in progress. I found the Android-tools salsa repo, which tells me the state of things. While there has been movement on both, a bit more on Kotlin, it still seems it will take a while. For Kotlin, the wiki page is most helpful, as is the android-tools salsa kotlin board page. Ironically, some of the more important info is shared in a blog post which ideally should also have been reflected on the kotlin board page. I did see some of the bugs, so I know it's pretty much dependency hell. I can only congratulate and encourage Samyak Jain and Raman Sarda. I also played a bit with the google-android-emulator-installer, which is basically a hook that downloads the binary from Google. I do not know what the plans are, but perhaps in the future it might also be built locally, who knows. Just sharing/stating it here so it's part of my notes whenever I wanna see what's happening 🙂

Women in India

I am sure some of you might remember my blog post from last year. It is now almost a year on, and the question to be asked is: has much changed? After a lot of hue and cry, the Government of India shared the NCRB data on crimes against women and caste crimes. The report showed that crimes against women had risen by 7.3% in a year, and crimes against lower castes rose by a similar percentage. With the 2020 pandemic, I am sure the numbers have gone up more. And there is a possibility that, just like last year, next year the Government will cite the pandemic and say there is no data. This year they have done it for migrant deaths during lockdown, for job losses due to the pandemic and so on and so forth. So it will be no surprise if the Govt. says the same about NCRB data next year as well, although the media has been reporting some of it in spite of the regular threats to journalists, as shared in the last blog post. There is also data showing that women's participation in the labor force has fallen sharply, especially in the last few years, and the Government seems to have neither any idea about it nor any apparent concern. There aren't any concrete plans to bring back the balance even a little bit.

Few Court judgements

But all hope is not lost. There have been a couple of good judgements. One is from the CIC (Chief Information Commissioner), wherein in specific cases a wife can know the salary details of her husband, especially if some kind of maintenance is due from the husband. There was so much hue and cry against this order that it was taken down from the Livelaw RTI corner. Luckily, I had downloaded it, hence I could upload and share it.

Another one was a case/suit about a woman of legal age who had decided to marry without parental consent. In this case, the Delhi High Court took the woman's side and stated she can marry whom she wants. Interestingly, about a week back Uttar Pradesh (most notorious for crimes against women) made laws against so-called 'Love Jihad', and 2-3 states have followed. The idea is to create an atmosphere of hate against Muslims, and to ensure women have no autonomy over what they want. This is even as, in a separate suit/case against Sudharshan TV (a far-right-leaning channel promoting hate against Muslims), the Government of India itself filed an affidavit stating that the Tablighis (a sect of Muslims who came from Malaysia to India for religious discourse and assembly) were not responsible for dissemination of the virus, and some media has correctly portrayed the same. Of course, those who are on the side of the Govt. on this topic think a 'traitor' wrote it. They also thought that the Govt. had taken a wrong approach, but couldn't suggest a better approach to the matter.

There are too many matters in the Supreme Court of women asking for justice to cover them all here, but two instances show how the SC has been buckling under stress of late. One is a webinar chaired by Justice Subramaniam, where he shared how the executive is using judicial appointments to do what it wants. The gulf between the executive and the SC has existed since the Indira Gandhi days, especially the judicial orders which by and large declared the Emergency valid; it has grown much more recently, and the executive has been muscling in, which has resulted in more regressive decisions than progressive ones.

This observation is also in tune with another study which came to the same result, though using data. The raw data from the study could give so much more than what has been shared. For example, of the judgements cited, how many were in civil law, personal law, criminal law or constitutional law? This would give a better understanding of things. What is also shocking is that none of our court orders have been cited in the west in the recent past, whereas there used to be a time when the west would take guidance from Indian jurisprudence and cite our orders to reach similar conclusions, or at least use them as precedent. I guess those days are over.

Government giving Corporate ownership to Private Sector Banks

There was a report of the Internal Working Group to Review Extant Ownership Guidelines and Corporate Structure for Indian Private Sector Banks – this is the actual title of the report.

Now there were and are concerns about the move, which were put forth by Dr. Raghuram Rajan and Viral Acharya. Dr. Rajan was the 23rd Governor of the RBI, from 4th September 2013 to 4th September 2016.

His most commendable work, which is largely unknown to most people, was the report A Hundred Small Steps, which you can buy from Sage Publications. Viral Acharya was the deputy governor from 23rd January 2017 to 23rd July 2019. Mr. Acharya just recently published his book Quest for Restoring Financial Stability in India, which can be bought from the same publishing house as well.

They also wrote a three-page article asking: does India need corporates in banking? More interestingly, it shares two points from history, from both World War 1 and World War 2. In both cases, the allies had to cut down the businesses that owned banks – in Germany it was the same, and in Japan the zaibatsu were dissolved – both of which were needed to make the world safe again. Now, if we don't learn lessons from history, it is our fault, not history's.

It was also shared that this idea had been taken up in 2013 but was put into cold storage. He also commented on the pressure on the RBI, as all co-operative banks have come under its ambit in the last few months. The RBI has had a patchy record, especially in the last couple of years, with big scams like IL&FS, Yes Bank, PMC Bank and Laxmi Vilas Bank, among others – LVB being the most recent one.

If new banking licenses have to be given, they can be given to good NBFCs which have been in the market for a long time and have shown maturity while dealing with public money. What is the hurry to give them to corporate/business houses? There are many other good points in the report with which both Mr. Rajan and Mr. Acharya are in agreement, and I do hope the other points/suggestions/proposals are implemented.

Interestingly, while looking through the people who were part of the committee, I came across a somewhat familiar name: Murmu. This is perhaps the first time you see someone from a sort of political background in what should be a cut-and-dried review, which normally has people with careers in finance or accounting. It also turns out that only one person was in favor of banks going to corporates; all the rest were against.

It seems that this specific person hadn't heard of the terms 'self-lending', 'connected lending' and 'conflict of interest'. One of the more interesting comments in the report is that if a corporate owns a bank, why would he go to Switzerland – he would just wash the money in his own bank. And if banks were to become too big to fail, as happened in the United States, it would again be private gains and public losses. There was also a Washington Post article which shares some of the reasons that Indian banks fail. I think we need to remind ourselves once again how things can turn out –


https://www.youtube.com/watch?v=2gK3s5j7PgA

Positive News at end

At the end, I do not want to finish on a sour note, hence I am sharing the YouTube channel of Films Division India, where you can see some of the classic works and interviews of some of the greats of Indian art cinema.

https://www.youtube.com/user/FilmsDivision/videos

Also sharing a bit of a funny story I came to know about youtube-dl: apparently it was taken off GitHub, but thanks to efforts from the EFF, Hacker News and others, it is now back in action.

25 November, 2020 11:31PM by shirishag75

Junichi Uekawa

Grabbing screenshot.

Grabbing screenshots. I wanted to know the size of a screenshot generated by canvas.toDataURL, so I wrote a web app that just measures the size at 60fps, because I could. From the output I can see webp: 19087, png: 115818, jpg: 115818, so I figured webp is really good at this, or maybe Chrome is really good at using webp. But png and jpg look like they are the same size... hmm, why? UPDATE: "image/jpg" generated PNG (it is not a recognized MIME type, so toDataURL falls back to the default PNG format), while "image/jpeg" generated actual JPEG.

25 November, 2020 12:56AM by Junichi Uekawa

November 23, 2020

Shirish Agarwal

White Hat Senior and Education

I had been thinking of doing a blog post on RCEP, which China signed with 14 countries a week and a day back, but this new story has broken and is going a bit viral on the interwebs, especially Twitter, and it is pretty much in our domain, so I thought it would be better to do a blog post about it. There is also quite a lot packed in, so there is quite a bit of unpacking to do.

Whitehat, Greyhat and Blackhat

For those of you who may not know, there are actually three terms, especially in computer security, that one comes across: white hats, grey hats and black hats. Clinically speaking, white hats are the good guys who take permission to try and find weaknesses in an application, program, website, organization and so on and so forth. Somewhat dated hacker references would be Sandra Bullock (The Net, 1995), Sneakers (1992) and Live Free or Die Hard (2007). One could argue that Sandra's character was actually into viruses, which are part of computer security, but still she showed some bad-ass skills – but then that is what actors are paid to do 🙂 Sneakers was much more interesting for me because in it you get the ultimate key which can unlock any lock, something like what quantum computing is supposed to do. One could place both of the first two movies in either the white-hat or grey-hat category. A grey hat is more flexible in his/her moral values, and there are plenty of such people. For example, Julian Assange could be described as a grey hat, but as you can see and understand, those are moral issues.

A black hat, on the other hand, is one who does things for profit even if it harms others. The easiest fictitious examples are the Die Hard movies: all of them had bad guys, or black hats. The 4th one is the odd one out, as it also had Matthew Farrell (Justin Long) as a grey-hat hacker. In real life, Kevin Mitnick, Kevin Poulsen, Robert Tappan Morris, George Hotz and Gary McKinnon are some examples of hackers, most of whom were black hats and most of whom reformed into white hats and security specialists. There are many other groups and names, but that is perhaps best left for another day altogether.

Now why am I sharing this? Because in all of the above, the people using and working with the systems have a better-than-average understanding of them and would arguably be better than most people at securing their networks, systems etc., but as we shall see in this case, there have been lots of issues in the company.

WhiteHat Jr. and 300 Million Dollars

Before I start, I would like to share that to me this suit in many ways seems similar to the suit filed against Krishnaraj Rao. The difference is that Krishnaraj Rao's case/suit was in real estate while this one is in 'education'; many things are similar to that case, but they also differ in some obvious ways. For example, in the suit against Krishnaraj Rao, the plaintiffs first approached the High Court and then the Supreme Court. Of course, Krishnaraj Rao won in the High Court, and then in the SC the plaintiffs agreed to Krishnaraj Rao's demands as they knew they could not win there. In that case, a compromise was reached by the plaintiff just before judgement was to be delivered.

In this case, the plaintiffs have come directly to the SC, short-circuiting the High Court process. This seems to be a new trend with the current Government in power, where the rich get to be in the SC without having to go through the Honorable HC. It says much about the SC as well, as it entertained the application and didn't ask the plaintiffs to go to the lower court first, as should have been the case – but that is and was the honorable SC's right to decide. The charges against Pradeep Poonia (the defendant in this case) are very similar to those made in Krishnaraj Rao's suit, hence I won't be going into those details. They have claimed defamation and filed a 20 crore suit. The idea is basically to silence any whistle-blowers.

Fictional Character Wolf Gupta

The first issue in this case, and perhaps the most famous or infamous character, is an unknown. He has reportedly been hired by Google India, BYJU'S, Chandigarh; this has been reported by Yahoo News. I did a cursory search on LinkedIn to see if there indeed is a Wolf Gupta, but wasn't able to find any person with such a name. I am not even talking about the amount of money/salary the fictitious gentleman is supposed to have got, and the various variations on the salary figures at different times and in the different ads.

If I wanted to, I could have asked a few of the kind souls whom I know work at Google to see if they could find such a person using their own credentials, but it probably would have been a waste of time. When you show a LinkedIn profile in your social media, it should come up in the results; in this case it doesn't. I also tried to find out if somehow BYJU'S was a partner of Google and came up empty there as well. There is another story done by Kan India, but as I'm not a subscriber I don't know what they have written, though the beginning of the story itself does not bode well.

While I can understand marketing, there is a line between marketing something and being misleading. At least to me, all of the references shared seem misleading.

Taking down dissent

One of the big no-nos, at least as I perceive it, is that you cannot and should not take down dissent or critique. Indians, like most people elsewhere around the world, critique and criticize day and night. Social media like Twitter, Mastodon and many others would not exist in the first place if criticism were not there. In fact, one could argue that Twitter and most social media are used to drive engagement with a person, brand etc.; it is even an official policy at Twitter. Now you can't drive engagement without also being open to critique, and this is true of all the web, including WordPress and me 🙂 . What has been happening is that WhiteHat Jr, with the help of BYJU'S, has been taking down people's content citing copyright violation, which seems laughable.

When citizens critique anything, we are obviously going to name the product; otherwise people would have to start using new names, similar to how Tom Riddle was known as the 'Dark Lord', 'Voldemort' and 'He who shall not be named'. There have been quite a few takedowns; I just provide one for reference, and the rest of the takedowns would probably come up in the ongoing suit/case.

Whitehat Jr. ad showing investors fighting


Now a brief synopsis of what the ad is about. The ad is about a kid named 'Chintu' who makes an app. The app is so good that investors come to his house and, right there on the lawn, start fighting each other over it. The parents enjoy watching the fight, and to add to the whole thing there is also a nosy neighbor who has his own observations. Simply speaking, it is a juvenile ad, but it works because most parents in India, as elsewhere, are insecure.

Jihan critiquing the whitehatjr ad

Before starting, let me say that I asked Jihan's parents if it's ok to share his video on my blog, and they agreed. What he has done is break down the ad and show how juvenile it is, using logic and humor as the template. He does make sure to state that he does not know how the product is, as he hasn't used it; his critique is about the ad and not the product.

The Website

If you look at the website, sadly, most of the site only talks about itself rather than giving examples that people can look at in detail. For example, they say they have a few apps on the Google Play Store, but there is no link to confirm the same. The same is true of quite a few other things. In another ad a Paralympic star says don't get into sports, get into coding. Which athlete in their right mind would say that? And it isn't as if we (India) are brimming with athletes at the international level. In the last outing, in 2016, India sent a stunning 117 athletes, but that was an exception as we had the women's hockey squad of 16 women, and even then they were outnumbered by the bureaucratic and support staff. There was criticism about the staff bit, but that is probably a story for another date.

Most of the site doesn't really give much value, and the point seems to be driving sales of their courses. This puts pressure on small kids as well as on teenagers who are in the second and third year of science and engineering, whose parents don't get that the advertising is fake and so think that their kids are incompetent. So this pressurizes both small kids and those who are already learning at whatever college or educational institution. The teenagers, more often than not, are unable to explain to their parents that this is advertising and it is fake. Also, most of us have been raised on a good diet of ads: Fair & Lovely still sells even though we know it doesn't work.

This does remind me of a similar fake academy which showed very similar symptoms and which nobody remembers today. There used to be an academy called Wings Academy or some similar name. They used to advertise that if you come to us, we will make you into a pilot or an air hostess, and it was only much later that it was found out that most kids ended up doing laundry work in hotels and other such work. Many had taken loans, went bankrupt and even committed suicide because they were unable to pay off the loans, given the dreams sold by the company and the harsh realities that awaited them. They were sued in court, but I don't know what happened; soon they were off the radar, so we never came to know what happened to those millions of kids whose life dreams were shattered.

Security

Now comes the security part. They have alleged that Pradeep Poonia broke into their systems. While this may be true, what I find funny is that, with the name 'whitehat', how can they justify it? If you are saying you are a white hat, you are supposed to be much better than this. And while I have not tried to penetrate their systems, I did find it laughable that the site is using an expired HTTPS certificate. I could have tried further to figure out their systems, but I chose not to. How they could not have an automated script to renew the certificate is beyond me. But that is their concern, not mine.

Comparison

A similar offering would be Unacademy, but as can be seen, they neither try to push you in any way nor do they make any ridiculous claims. In fact, how genuine Unacademy is can be gauged from the fact that many of its learning resources are available for people to see on YouTube, and if they have the tools they can also download them. Now, does this mean that every educational website should give its content away for free? Of course not. But when 80-90% of a channel's YouTube content is ads and testimonials, that should surely give both parents and students a reason to pause. But if parents had done that much research, then things would not be where they are now.

Allegations

Just to complete the picture, there are allegations by Pradeep Poonia, with some screenshots, which show the company has been doing a lot of bad things. For example, they were harassing an employee at 2 a.m., who was frustrated and working for the company at the time. Many of the company staff routinely made sexist, offensive and sexually abusive remarks privately among themselves about prospective women candidates who came to interview via webcam (due to the pandemic). There also seems to be a bit of porn on the web/mobile server of the company as well. There have also been allegations that while the company says refunds are done the next day, many parents who have demanded those refunds have not got them. While Pradeep has shared some of the quotations of the staff, hiding the identities of both the victims and the perpetrators, the language being used in itself tells a lot. I am in two minds about whether to share those photos or not, hence at the moment I am choosing not to. Poonia has also contended that not all teachers know programming, and that they are given scripts to read from. There have been some people who did share that experience with him –

Suruchi Sethi

From the company's side, they allege that he hacked the company servers, and they will probably rely on the Fruit of the Poisonous Tree argument, which we have seen used in many cases.

Conclusion

Now it lies in the hands of the Court whether the single bench chooses the literal meaning, the spirit of the law, or the genuine concerns of the people involved. In today's hearing the company asked for a complete, sweeping injunction but was unable to get it. Whatever happens, we may hope to see some fireworks in the second hearing, slated for 6.01.2021, where all of this plays out. Till later.

23 November, 2020 07:22PM by shirishag75

Vincent Fourmond

QSoas tips and tricks: using meta-data, first level

By essence, QSoas works with \(y = f(x)\) datasets. However, in practice, when working with experimental data (or data generated from simulations), one often has more than just the single experimental parameter \(x\). For instance, one could record series of spectra (\(A = f(\lambda)\)) for different pH values, so that the absorbance is in fact a function of both the pH and \(\lambda\). QSoas has different ways to deal with such situations, and we'll describe one today, using meta-data.

Setting meta-data

Meta-data are simply series of name/value pairs attached to a dataset. They can be numbers, dates or just text. Some of them are automatically detected from certain types of data files (but that is a topic for another day). The simplest way to set meta-data is to use the set-meta command:
QSoas> set-meta pH 7.5
This command sets the meta-data pH to the value 7.5. Keep in mind that QSoas does not know anything about the meaning of the meta-data[1]. It can keep track of the meta-data you give, and manipulate them, but it will not interpret them for you. You can set several meta-data by repeating calls to set-meta, and you can display the meta-data attached to a dataset using the command show. Here is an example:
QSoas> generate-buffer 0 10
QSoas> set-meta pH 7.5
QSoas> set-meta sample "My sample"
QSoas> show 0
Dataset generated.dat: 2 cols, 1000 rows, 1 segments, #0
Flags: 
Meta-data:	pH =	 7.5	sample =	 My sample
Note here the use of quotes around My sample since there is a space inside the value.

Using meta-data

There are many ways to use meta-data in QSoas. In this post, we will discuss just one: using meta-data in the output file. The output file can collect data from several commands, like peak data, statistics and so on. For instance, each time the command 1 is run, a line with the information about the largest peak of the current dataset is written to the output file. It is possible to automatically add meta-data to those lines by using the /meta= option of the output command. Just listing the names of the meta-data will add them to each line of the output file.

As a full example, we'll see how one can take advantage of meta-data to determine how the position of the peak of the function \(x^2 \exp (-a\,x)\) depends on \(a\). For that, we first create a script that generates the function for a certain value of \(a\), sets the meta-data a to the corresponding value, and finds the peak. Let's call this file do-one.cmds (all the script files can be found in the GitHub repository):
generate-buffer 0 20 x**2*exp(-x*${1})
set-meta a ${1}
1 
This script takes a single argument, the value of \(a\), generates the appropriate dataset, sets the meta-data a and writes the data about the largest (and only in this case) peak to the output file. Let's now run this script with 1 as an argument:
QSoas> @ do-one.cmds 1
This command generates a file out.dat containing the following data:
## buffer       what    x       y       index   width   left_width      right_width     area
generated.dat   max     2.002002002     0.541340590883  100     3.4034034034    1.24124124124   2.16216216216   1.99999908761
This gives various information about the peak found: the name of the dataset it was found in, whether it's a maximum or minimum, the x and y positions of the peak, the index in the file, the widths of the peak and its area. We are interested here mainly in the x position.
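
As a quick sanity check, the analytical peak position follows from \(\frac{\mathrm{d}}{\mathrm{d}x}\left[x^2 \exp(-a\,x)\right] = (2x - a\,x^2)\exp(-a\,x) = 0\), i.e. \(x = 2/a\). With \(a = 1\) the peak should therefore sit at \(x = 2\), which matches the \(x \approx 2.002\) reported above; the small offset simply comes from the finite sampling grid of the generated dataset.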

Then, we just run this script for several values of \(a\) using run-for-each, and in particular the option /range-type=lin that makes it interpret values like 0.5..5:80 as 80 values evenly spread between 0.5 and 5. The script is called run-all.cmds:
output peaks.dat /overwrite=true /meta=a
run-for-each do-one.cmds /range-type=lin 0.5..5:80
V all /style=red-to-blue
The first line sets up the output to the output file peaks.dat. The option /meta=a makes sure the meta a is added to each line of the output file, and /overwrite=true makes sure the file is overwritten just before the first data is written to it, in order to avoid accumulating the results of different runs of the script. The last line just displays all the curves with a color gradient.
Running this script (with @ run-all.cmds) creates a new file peaks.dat, whose first line looks like this:
## buffer       what    x       y       index   width   left_width      right_width     area    a
The column x (the 3rd) contains the position of the peaks, and the column a (the 10th) contains the meta a (this column wasn't present in the output described above, because we had not yet used the output /meta=a command). Therefore, to load the peak position as a function of a, one just has to run:
QSoas> load peaks.dat /columns=10,3
Et voilà !

To train further, you can:
  • improve the resolution in x;
  • improve the resolution in y;
  • plot the magnitude of the peak;
  • extend the range;
  • derive the analytical formula for the position of the peak and verify it !

[1] this is not exactly true. For instance, some commands like unwrap interpret the sr meta-data as a voltammetric scan rate if it is present. But this is the exception.

About QSoas

QSoas is a powerful open source data analysis program that focuses on flexibility and powerful fitting capacities. It is released under the GNU General Public License. It is described in Fourmond, Anal. Chem., 2016, 88 (10), pp 5050–5052. Current version is 2.2. You can download its source code there (or clone from the GitHub repository) and compile it yourself, or buy precompiled versions for MacOS and Windows there.

23 November, 2020 06:55PM by Vincent Fourmond (noreply@blogger.com)

November 22, 2020

François Marier

Removing a corrupted data pack in a Restic backup

I recently ran into a corrupted data pack in a Restic backup on my GnuBee. It led to consistent failures during the prune operation:

incomplete pack file (will be removed): b45afb51749c0778de6a54942d62d361acf87b513c02c27fd2d32b730e174f2e
incomplete pack file (will be removed): c71452fa91413b49ea67e228c1afdc8d9343164d3c989ab48f3dd868641db113
incomplete pack file (will be removed): 10bf128be565a5dc4a46fc2fc5c18b12ed2e77899e7043b28ce6604e575d1463
incomplete pack file (will be removed): df282c9e64b225c2664dc6d89d1859af94f35936e87e5941cee99b8fbefd7620
incomplete pack file (will be removed): 1de20e74aac7ac239489e6767ec29822ffe52e1f2d7f61c3ec86e64e31984919
hash does not match id: want 8fac6efe99f2a103b0c9c57293a245f25aeac4146d0e07c2ab540d91f23d3bb5, got 2818331716e8a5dd64a610d1a4f85c970fd8ae92f891d64625beaaa6072e1b84
github.com/restic/restic/internal/repository.Repack
        github.com/restic/restic/internal/repository/repack.go:37
main.pruneRepository
        github.com/restic/restic/cmd/restic/cmd_prune.go:242
main.runPrune
        github.com/restic/restic/cmd/restic/cmd_prune.go:62
main.glob..func19
        github.com/restic/restic/cmd/restic/cmd_prune.go:27
github.com/spf13/cobra.(*Command).execute
        github.com/spf13/cobra/command.go:838
github.com/spf13/cobra.(*Command).ExecuteC
        github.com/spf13/cobra/command.go:943
github.com/spf13/cobra.(*Command).Execute
        github.com/spf13/cobra/command.go:883
main.main
        github.com/restic/restic/cmd/restic/main.go:86
runtime.main
        runtime/proc.go:204
runtime.goexit
        runtime/asm_amd64.s:1374

Thanks to the excellent support forum, I was able to resolve this issue by dropping a single snapshot.

First, I identified the snapshot which contained the offending pack:

$ restic -r sftp:hostname.local: find --pack 8fac6efe99f2a103b0c9c57293a245f25aeac4146d0e07c2ab540d91f23d3bb5
repository b0b0516c opened successfully, password is correct
Found blob 2beffa460d4e8ca4ee6bf56df279d1a858824f5cf6edc41a394499510aa5af9e
 ... in file /home/francois/.local/share/akregator/Archive/http___udd.debian.org_dmd_feed_
     (tree 602b373abedca01f0b007fea17aa5ad2c8f4d11f1786dd06574068bf41e32020)
 ... in snapshot 5535dc9d (2020-06-30 08:34:41)

Then, I could simply drop that snapshot:

$ restic -r sftp:hostname.local: forget 5535dc9d
repository b0b0516c opened successfully, password is correct
[0:00] 100.00%  1 / 1 files deleted

and run the prune command to remove the snapshot, as well as the incomplete packs that were also mentioned in the above output but could never be removed due to the other error:

$ restic -r sftp:hostname.local: prune
repository b0b0516c opened successfully, password is correct
counting files in repo
building new index for repo
[20:11] 100.00%  77439 / 77439 packs
incomplete pack file (will be removed): b45afb51749c0778de6a54942d62d361acf87b513c02c27fd2d32b730e174f2e
incomplete pack file (will be removed): c71452fa91413b49ea67e228c1afdc8d9343164d3c989ab48f3dd868641db113
incomplete pack file (will be removed): 10bf128be565a5dc4a46fc2fc5c18b12ed2e77899e7043b28ce6604e575d1463
incomplete pack file (will be removed): df282c9e64b225c2664dc6d89d1859af94f35936e87e5941cee99b8fbefd7620
incomplete pack file (will be removed): 1de20e74aac7ac239489e6767ec29822ffe52e1f2d7f61c3ec86e64e31984919
repository contains 77434 packs (2384522 blobs) with 367.648 GiB
processed 2384522 blobs: 1165510 duplicate blobs, 47.331 GiB duplicate
load all snapshots
find data that is still in use for 15 snapshots
[1:11] 100.00%  15 / 15 snapshots
found 1006062 of 2384522 data blobs still in use, removing 1378460 blobs
will remove 5 invalid files
will delete 13728 packs and rewrite 15140 packs, this frees 142.285 GiB
[4:58:20] 100.00%  15140 / 15140 packs rewritten
counting files in repo
[18:58] 100.00%  50164 / 50164 packs
finding old index files
saved new indexes as [340cb68f 91ff77ef ee21a086 3e5fa853 084b5d4b 3b8d5b7a d5c385b4 5eff0be3 2cebb212 5e0d9244 29a36849 8251dcee 85db6fa2 29ed23f6 fb306aba 6ee289eb 0a74829d]
remove 190 old index files
[0:00] 100.00%  190 / 190 files deleted
remove 28868 old packs
[1:23] 100.00%  28868 / 28868 files deleted
done

22 November, 2020 07:30PM

Molly de Blanc

Why should you work on free software (or other technology issues)?

Twice this week I was asked how it can be okay to work on free software when there are issues like climate change and racial injustice. I have a few answers for that.

You can work on injustice while working on free software.

A world in which all technology is just cannot exist under capitalism. It cannot exist under racism or sexism or ableism. It cannot exist in a world that does not exist if we are ravaged by the effects of climate change. At the same time, free software is part of the story of each of these. The modern technology state fuels capitalism, and capitalism fuels it. It cannot exist without transparency at all levels of the creation process. Proprietary software and algorithms reinforce racial and gender injustice. Technology is very guilty of its contributions to the climate crisis. By working on making technology more just, by making it more free, we are working to address these issues. Software makes the world work, and oppressive software creates an oppressive world.

You can work on free software while working on injustice.

Let’s say you do want to devote your time to working on climate justice full time. Activism doesn’t have to only happen in the streets or in legislative buildings. Being a body in a protest is activism, and so is running servers for your community’s federated social network, providing wiki support, developing custom software, and otherwise bringing your free software skills into new environments. As long as your work is being accomplished under an ethos of free software, with free software, and under free software licenses, you’re working on free software issues while saving the world in other ways too!

Not everyone needs to work on everything all the time.

When your house is on fire, you need to put out the fire. However, maybe you can't help put out the fire. Maybe you don't have the skills or knowledge or physical ability. Maybe your house is on fire, but there's also an earthquake and a meteor and an airborne toxic event all coming at once. When that happens, we have to split up our efforts, and that's okay.

22 November, 2020 05:41PM by mollydb

Arturo Borrero González

How to use nftables from python

Netfilter logo

One of the most interesting (and possibly unknown) features of the nftables framework is the native python interface, which allows python programs to access all nft features programmatically, from the source code.

There is a high-level library, libnftables, which is responsible for translating the human-readable syntax used by the nft binary into low-level expressions that the nf_tables kernel subsystem can run. The nft command line utility is basically a wrapper around this library, which is where all the actual nftables logic lives, so you can imagine how powerful it is. The library is written in C; the python bindings use ctypes to wrap the shared library object natively, in pure python.

To use nftables in your python script or program, first you have to install the libnftables library and the python bindings. In Debian systems, installing the python3-nftables package should be enough to have everything ready to go.

To interact with libnftables you have two options: either use the standard nft syntax or the JSON format. The standard format allows you to send commands exactly as you would using the nft binary. That format is intended for humans and doesn't make a lot of sense in a programmatic interaction, whereas JSON is pretty convenient, especially in a python environment where there are direct data structure equivalents.

The following code snippet gives you an example of how easy this is to use:

#!/usr/bin/env python3

import nftables
import json

nft = nftables.Nftables()
nft.set_json_output(True)
rc, output, error = nft.cmd("list ruleset")
print(json.loads(output))

This is functionally equivalent to running nft -j list ruleset. Basically, all you have to do in your python code is:

  • import the nftables & json modules
  • init the libnftables instance
  • configure library behavior
  • run commands and parse the output (ideally using JSON)

The key here is to use the JSON format. It allows adding ruleset modifications in batches, i.e. creating tables, chains, rules, sets, stateful counters, etc. in a single atomic transaction, which is the proper way to update firewalling and NAT policies in the kernel and to avoid inconsistent intermediate states.

The JSON schema is pretty well documented in the libnftables-json(5) manpage. The following example is copied from there and illustrates the basic idea behind the JSON format. The structure accepts an arbitrary number of commands, which are interpreted in order of appearance. For instance, the following standard syntax input:

flush ruleset
add table inet mytable
add chain inet mytable mychain
add rule inet mytable mychain tcp dport 22 accept

Translates into JSON as such:

{ "nftables": [
    { "flush": { "ruleset": null }},
    { "add": { "table": {
        "family": "inet",
        "name": "mytable"
    }}},
    { "add": { "chain": {
        "family": "inet",
        "table": "mytable",
        "chain": "mychain"
    }}},
    { "add": { "rule": {
        "family": "inet",
        "table": "mytable",
        "chain": "mychain",
        "expr": [
            { "match": {
                "left": { "payload": {
                    "protocol": "tcp",
                    "field": "dport"
                }},
                "right": 22
            }},
            { "accept": null }
        ]
    }}}
]}
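
As a rough sketch of how such a document can be driven from python (my own illustration, not taken from the manpage): the json_cmd() and json_validate() helpers shipped with recent python3-nftables versions accept the equivalent python dict directly, and the whole batch is applied as one atomic transaction. The file name and error handling below are made up for the example.

#!/usr/bin/env python3
# Sketch: apply a libnftables-json document (like the one above) atomically.
# Assumes the json_cmd()/json_validate() helpers of recent python3-nftables.

import json
import sys

import nftables

def apply_batch(path):
    """Load a JSON ruleset document from *path* and apply it in one transaction."""
    with open(path) as f:
        batch = json.load(f)

    nft = nftables.Nftables()

    # Optional sanity check against the bundled schema (needs python3-jsonschema).
    try:
        nft.json_validate(batch)
    except Exception as exc:
        print(f"warning: could not validate batch: {exc}", file=sys.stderr)

    rc, output, error = nft.json_cmd(batch)
    if rc != 0:
        sys.exit(f"nft batch failed: {error}")

if __name__ == "__main__":
    apply_batch(sys.argv[1])

Either the whole ruleset in the file is applied or none of it is, which is exactly the atomicity property discussed above.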

I encourage you to take a look at the manpage if you want to know about how powerful this interface is. I’ve created a git repository to host several source code examples using different features of the library: https://github.com/aborrero/python-nftables-tutorial. I plan to introduce more code examples as I learn and create them.

There are several relevant projects out there using this nftables python integration already. One of the most important pieces of software is firewalld. They started using the JSON format back in 2019.

In the past, people interacting with iptables programmatically would either call the iptables binary directly or, in the case of some C programs, hack the libiptc/libxtables libraries into their source code. The native python approach to using libnftables is a huge step forward, and should come in handy for developers, network engineers, integrators and other folks using the nftables framework in a pythonic environment.

If you are interested to know how this python binding works, I invite you to take a look at the upstream source code, nftables.py, which contains all the magic behind the scenes.

22 November, 2020 05:08PM

hackergotchi for Markus Koschany

Markus Koschany

My Free Software Activities in October 2020

Welcome to gambaru.de. Here is my monthly report (+ the first week in November) that covers what I have been doing for Debian. If you’re interested in Java, Games and LTS topics, this might be interesting for you.

Debian Games

  • I released a new version of debian-games, a collection of metapackages for games. As expected the Python 2 removal takes its toll on games in Debian that depend on pygame or other Python 2 libraries. Currently we have lost more games in 2020 than could be newly introduced to the archive. All in all it could be better but also a lot worse.
  • New upstream releases were packaged for freeorion and xaos.
  • Most of the time was spent on upgrading the bullet physics library to version 3.06, testing all reverse-dependencies and requesting a transition for it. (#972395) Similar to bullet I also updated box2d, the 2D counterpart. The only reverse-dependency, caveexpress, fails to build from source with box2d 2.4.1, so unless I can fix it, it doesn't make much sense to upload the package to unstable.
  • Some package polishing: I could fix two bugs in stormbaancoureur (patch by Helmut Grohne) and in ardentryst, which required a dependency on python3-future to start.
  • I sponsored mgba and pekka-kana-2 for Ryan Tandy and Carlos Donizete Froes
  • and started to work on porting childsplay to Python 3.
  • Finally I did a NMU for bygfoot to work around a GCC 10 FTBFS.

Debian Java

pdfsam
  • I uploaded pdfsam and its related sejda libraries to unstable and applied an upstream patch to fix an error with Debian’s jackson-jr version. Everything should be usable and up-to-date now.
  • I updated mina2 and investigated a related build failure in apache-directory-server, packaged a new upstream release of commons-io and undertow and fixed a security vulnerability in junit4 by upgrading to version 4.13.1.
  • The upgrade of jflex to version 1.8.2 took a while. The package is available in experimental now, but regression tests with ratt showed that several reverse-dependencies FTBFS with 1.8.2. Since all of these projects work fine with 1.7.0, I intend to postpone the upload to unstable. No need to break something.

Misc

  • This month also saw new upstream versions of wabt and binaryen.
  • I intend to update ublock-origin in Buster but I haven’t heard back from the release team yet. (#973695)

Debian LTS

This was my 56th month as a paid contributor and I have been paid to work 20.75 hours on Debian LTS, a project started by Raphaël Hertzog. In that time I did the following:

  • DLA-2440-1. Issued a security update for poppler fixing 9 CVE.
  • DLA-2445-1. Issued a security update for libmaxminddb fixing 1 CVE.
  • DLA-2447-1. Issued a security update for pacemaker fixing 1 CVE. The update had to be reverted because of an unexpected permission problem. I am in contact with one of the users who reported the regression and my intention is to update pacemaker to the latest supported release in the 1.x branch. If further tests show no regressions anymore, a new update will follow shortly.
  • Investigated CVE-2020-24614 in fossil and marked the issue as no-dsa because the impact for Debian users was low.
  • Investigated the open security vulnerabilities in ansible (11) and prepared some preliminary patches. The work is ongoing.
  • Fixed the remaining zsh vulnerabilities in Stretch in line with Debian 8 „Jessie“, so that all versions in Debian are equally protected.

ELTS

Extended Long Term Support (ELTS) is a project led by Freexian to further extend the lifetime of Debian releases. It is not an official Debian project, but all Debian users benefit from it without cost. The current ELTS release is Debian 8 „Jessie“. This was my 29th month and I have been paid to work 15 hours on ELTS.

  • ELA-302-1. Issued a security update for poppler fixing 2 CVE. Investigated Debian bug #942391, identified the root cause and reverted the patch for CVE-2018-13988.
  • ELA-303-1. Issued a security update for junit4 fixing 1 CVE.
  • ELA-316-1. Issued a security update for zsh fixing 7 CVE.

Thanks for reading and see you next time.

22 November, 2020 03:45PM by apo

November 21, 2020

Giovanni Mascellani

Having fun with signal handlers

As every C and C++ programmer knows far too well, if you dereference a pointer that points outside of the space mapped on your process' memory, you get a segmentation fault and your program crashes. As far as the language itself is concerned, you don't have a second chance and you cannot know in advance whether that dereferencing operation is going to set a bomb off or not. In technical terms, you are invoking undefined behaviour, and you should never do that: you are responsible for knowing in advance if your pointers are valid, and if they are not, you get to keep the pieces.

However, it turns out that most actual operating systems give you a second chance, although with a lot of fine print attached. So I tried to implement a function that tries to dereference a pointer: if it can, it gives you the value; if it can't, it tells you it couldn't. Again, I stress this should never happen in a real program, except possibly for debugging (or for having fun).

The prototype is

word_t peek(word_t *addr, int *success);

The function is basically equivalent to return *addr, except that if addr is not mapped it doesn't crash, and if success is not NULL it is set to 0 or 1 to indicate that addr was not mapped or mapped. If addr was not mapped the return value is meaningless.

I won't explain it in detail, to leave you some fun. Basically, the idea is to install a handler for SIGSEGV: if the address is invalid, the handler is called, and it fixes everything by advancing the instruction pointer a little bit in order to skip the faulting instruction. The dereferencing instruction is written as hardcoded Assembly bytes, so that I know exactly how many bytes need to be skipped.

Of course this is very architecture-dependent: I wrote the i386 and amd64 variants (no x32). And I don't guarantee there are no bugs or subtleties!

Another solution would have been to just parse /proc/self/maps before dereferencing and check whether the pointer is in a mapped area, but it would have suffered from a TOCTTOU problem: another thread might have changed the mappings between the time when /proc/self/maps was parsed and when the pointer was dereferenced (also, parsing that file can take a relatively long amount of time). Another less architecture-dependent but still not pure-C approach would have been to establish a setjmp before attempting the dereference and longjmp-ing back from the signal handler (but again you would need to use different setjmp contexts in different threads to exclude race conditions).

Have fun! (and again, don't try this in real programs)

EDIT I realized I should specify the language for source code highlighting to work decently. Now it's better!

EDIT 2 I also realized that my version of peek has problems when there are other threads, because signal actions are per-process, not per-thread (as I initially thought). See the comments for a better version (though not perfect).

#define _GNU_SOURCE
#include <stdint.h>
#include <signal.h>
#include <assert.h>
#include <stdlib.h>
#include <stdio.h>
#include <ucontext.h>

#ifdef __i386__
typedef uint32_t word_t;
#define IP_REG REG_EIP
#define IP_REG_SKIP 3
#define READ_CODE __asm__ __volatile__(".byte 0x8b, 0x03\n"  /* mov (%ebx), %eax */ \
                                       ".byte 0x41\n"        /* inc %ecx */ \
                                       : "=a"(ret), "=c"(tmp) : "b"(addr), "c"(tmp));
#endif

#ifdef __x86_64__
typedef uint64_t word_t;
#define IP_REG REG_RIP
#define IP_REG_SKIP 6
#define READ_CODE __asm__ __volatile__(".byte 0x48, 0x8b, 0x03\n"  /* mov (%rbx), %rax */ \
                                       ".byte 0x48, 0xff, 0xc1\n"  /* inc %rcx */ \
                                       : "=a"(ret), "=c"(tmp) : "b"(addr), "c"(tmp));
#endif

static void segv_action(int sig, siginfo_t *info, void *ucontext) {
    (void) sig;
    (void) info;
    ucontext_t *uctx = (ucontext_t*) ucontext;
    uctx->uc_mcontext.gregs[IP_REG] += IP_REG_SKIP;
}

struct sigaction peek_sigaction = {
    .sa_sigaction = segv_action,
    .sa_flags = SA_SIGINFO,
    .sa_mask = {0},  /* empty mask; sigemptyset() cannot be used in a static initializer */
};

word_t peek(word_t *addr, int *success) {
    word_t ret;
    int tmp, res;
    struct sigaction prev_act;

    res = sigaction(SIGSEGV, &peek_sigaction, &prev_act);
    assert(res == 0);

    tmp = 0;
    READ_CODE

    res = sigaction(SIGSEGV, &prev_act, NULL);
    assert(res == 0);

    if (success) {
        *success = tmp;
    }

    return ret;
}

int main() {
    int success;
    word_t number = 22;
    word_t value;

    number = 22;
    value = peek(&number, &success);
    printf("%d %llu\n", success, (unsigned long long) value);

    value = peek(NULL, &success);
    printf("%d %llu\n", success, (unsigned long long) value);

    value = peek((word_t*)0x1234, &success);
    printf("%d %llu\n", success, (unsigned long long) value);

    return 0;
}

21 November, 2020 08:00PM by Giovanni Mascellani

Michael Stapelberg

Debian Code Search: positional index, TurboPFor-compressed

See the Conclusion for a summary if you’re impatient :-)

Motivation

Over the last few months, I have been developing a new index format for Debian Code Search. This required a lot of careful refactoring, re-implementation, debug tool creation and debugging.

Multiple factors motivated my work on a new index format:

  1. The existing index format has a 2G size limit, into which we have bumped a few times, requiring manual intervention to keep the system running.

  2. Debugging the existing system required creating ad-hoc debugging tools, which made debugging sessions unnecessarily lengthy and painful.

  3. I wanted to check whether switching to a different integer compression format would improve performance (it does not).

  4. I wanted to check whether storing positions with the posting lists would improve performance of identifier queries (= queries which are not using any regular expression features), which make up 78.2% of all Debian Code Search queries (it does).

I figured building a new index from scratch was the easiest approach, compared to refactoring the existing index to increase the size limit (point ①).

I also figured it would be a good idea to develop the debugging tool in lock step with the index format so that I can be sure the tool works and is useful (point ②).

Integer compression: TurboPFor

As a quick refresher, search engines typically store document IDs (representing source code files, in our case) in an ordered list (“posting list”). It usually makes sense to apply at least a rudimentary level of compression: our existing system used variable integer encoding.
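
To make the refresher concrete, here is a toy sketch (Python, purely illustrative; the real index is written in Go) of a delta + variable-length integer scheme, i.e. the kind of rudimentary compression mentioned above:

# Toy illustration (not the actual index format): delta + varint encoding of a
# sorted posting list.

def encode_varint(n):
    """Encode a non-negative integer, 7 bits per byte, MSB as continuation flag."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def encode_postings(doc_ids):
    """Store the gaps between successive document IDs instead of the IDs themselves."""
    out, prev = bytearray(), 0
    for doc in doc_ids:
        out += encode_varint(doc - prev)
        prev = doc
    return bytes(out)

print(len(encode_postings([3, 7, 135, 100000])), "bytes for 4 doc IDs")  # 7 bytes

Four 32-bit document IDs shrink from 16 bytes to 7 in this toy example; TurboPFor pushes the same idea a lot further with block-wise bit packing and a carefully tuned SIMD implementation.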

TurboPFor, the self-proclaimed “Fastest Integer Compression” library, combines an advanced on-disk format with a carefully tuned SIMD implementation to reach better speeds (in micro benchmarks) at less disk usage than Russ Cox’s varint implementation in github.com/google/codesearch.

If you are curious about its inner workings, check out my “TurboPFor: an analysis”.

Applied on the Debian Code Search index, TurboPFor indeed compresses integers better:

Disk space

 
8.9G codesearch varint index
5.5G TurboPFor index

Switching to TurboPFor (via cgo) for storing and reading the index results in a slight speed-up of a dcs replay benchmark, which is more pronounced the more i/o is required.

Query speed (regexp, cold page cache)

 
18s codesearch varint index
14s TurboPFor index (cgo)

Query speed (regexp, warm page cache)

 
15s codesearch varint index
14s TurboPFor index (cgo)

Overall, TurboPFor is an all-around improvement in efficiency, albeit with a high cost in implementation complexity.

Positional index: trade more disk for faster queries

This section builds on the previous section: all figures come from the TurboPFor index, which can optionally support positions.

Conceptually, we’re going from:

type docid uint32
type index map[trigram][]docid

…to:

type occurrence struct {
    doc docid
    pos uint32 // byte offset in doc
}
type index map[trigram][]occurrence

The resulting index consumes more disk space, but can be queried faster:

  1. We can do fewer queries: instead of reading all the posting lists for all the trigrams, we can read the posting lists for the query’s first and last trigram only.
    This is one of the tricks described in the paper “AS-Index: A Structure For String Search Using n-grams and Algebraic Signatures” (PDF), and goes a long way without incurring the complexity, computational cost and additional disk usage of calculating algebraic signatures.

  2. Verifying that the delta between the last and the first position matches the length of the query term significantly reduces the number of files to read (lower false positive rate); see the sketch after this list.

  3. The matching phase is quicker: instead of locating the query term in the file, we only need to compare a few bytes at a known offset for equality.

  4. More data is read sequentially (from the index), which is faster.
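
The following is a simplified Python sketch of the position checks in items 2 and 3 (illustrative only, not the actual Go implementation): given the byte offsets of the query's first and last trigram within a candidate document, the delta check narrows down the candidates, and the final verification only compares a few bytes at a known offset.

# Simplified sketch of the positional checks (not dcs code).

def candidate_offsets(first_positions, last_positions, query):
    """Keep only offsets where last - first equals the expected delta (item 2)."""
    expected_delta = len(query) - 3  # the last trigram starts 3 bytes before the end
    last = set(last_positions)
    return [pos for pos in first_positions if pos + expected_delta in last]

def verify(document_bytes, query, offsets):
    """Compare a few bytes at a known offset instead of scanning the file (item 3)."""
    q = query.encode()
    return [off for off in offsets if document_bytes[off:off + len(q)] == q]

doc = b"int main() { return strcmp(a, b); }"
query = "strcmp"
first = [i for i in range(len(doc) - 2) if doc[i:i + 3] == query[:3].encode()]
last = [i for i in range(len(doc) - 2) if doc[i:i + 3] == query[-3:].encode()]
print(verify(doc, query, candidate_offsets(first, last)))  # [20]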

Disk space

A positional index consumes significantly more disk space, but not so much as to pose a challenge: a Hetzner EX61-NVME dedicated server (≈ 64 €/month) provides 1 TB worth of fast NVMe flash storage.

 
  6.5G non-positional
123G positional
 93G positional (posrel)

The idea behind the positional index (posrel) is to not store a (doc,pos) tuple on disk, but to store positions, accompanied by a stream of doc/pos relationship bits: 1 means this position belongs to the next document, 0 means this position belongs to the current document.

This is an easy way of saving some space without modifying the TurboPFor on-disk format: the posrel technique reduces the index size to about ¾.
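
A tiny Python sketch of one way to read that description (illustrative only; the real encoding lives in the Go/TurboPFor code and may differ in details): document IDs, positions and the doc/pos relationship bits live in three parallel streams, and the bits are all that is needed to know when to advance to the next document.

# Illustrative posrel sketch (round-trips with itself; not the actual on-disk format).

def encode_posrel(occurrences):
    """occurrences: (doc_id, pos) tuples, sorted by document, then position."""
    doc_ids, positions, bits = [], [], []
    for doc, pos in occurrences:
        if not doc_ids or doc != doc_ids[-1]:
            doc_ids.append(doc)
            bits.append(1 if len(doc_ids) > 1 else 0)  # convention: first doc keeps 0
        else:
            bits.append(0)
        positions.append(pos)
    return doc_ids, positions, bits

def decode_posrel(doc_ids, positions, bits):
    """Rebuild the (doc_id, pos) tuples from the three streams."""
    out, doc_idx = [], 0
    for pos, bit in zip(positions, bits):
        if bit:  # 1: this position belongs to the next document
            doc_idx += 1
        out.append((doc_ids[doc_idx], pos))
    return out

occ = [(7, 12), (7, 90), (9, 3), (12, 44), (12, 45)]
print(decode_posrel(*encode_posrel(occ)) == occ)  # True

Spending one bit per position instead of repeating a full document ID next to every position is where the reduction to about ¾ mentioned above comes from.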

With the increase in size, the Linux page cache hit ratio will be lower for the positional index, i.e. more data will need to be fetched from disk for querying the index.

As long as the disk can deliver data as fast as you can decompress posting lists, this only translates into one disk seek’s worth of additional latency. This is the case with modern NVMe disks that deliver thousands of MB/s, e.g. the Samsung 960 Pro (used in Hetzner’s aforementioned EX61-NVME server).

The values were measured by running dcs du -h /srv/dcs/shard*/full without and with the -pos argument.

Bytes read

A positional index requires fewer queries: reading only the first and last trigram's posting lists and positions is sufficient to achieve a lower (!) false positive rate than evaluating all trigrams' posting lists in a non-positional index.

As a consequence, fewer files need to be read, resulting in fewer bytes required to read from disk overall.

As an additional bonus, in a positional index, more data is read sequentially (index), which is faster than random i/o, regardless of the underlying disk.

 1.2G (index)
19.8G (files)
21.0G regexp queries

 4.2G (index)
10.8G (files)
15.0G identifier queries

The values were measured by running iostat -d 25 just before running bench.zsh on an otherwise idle system.

Query speed

Even though the positional index is larger and requires more data to be read at query time (see above), thanks to the C TurboPFor library, the 2 queries on a positional index are roughly as fast as the n queries on a non-positional index (≈4s instead of ≈3s).

This is more than made up for by the combined i/o matching stage, which shrinks from ≈18.5s (7.1s i/o + 11.4s matching) to ≈1.3s.

 3.3s (index)
 7.1s (i/o)
11.4s (matching)
21.8s regexp queries

 3.92s (index)
≈1.3s (i/o + matching)
 5.22s identifier queries

Note that identifier query i/o was sped up not just by needing to read fewer bytes, but also by only having to verify bytes at a known offset instead of needing to locate the identifier within the file.

Conclusion

The new index format is overall slightly more efficient. This disk space efficiency allows us to introduce a positional index section for the first time.

Most Debian Code Search queries are positional queries (78.2%) and will be answered much quicker by leveraging the positions.

Bottom line: it is beneficial to use a positional index on disk over a non-positional index in RAM.

21 November, 2020 09:04AM

Linux package managers are slow

I measured how long the package managers of the most popular Linux distributions take to install small and large packages (the ack(1p) source code search Perl script and qemu, respectively).

Where required, my measurements include metadata updates such as transferring an up-to-date package list. For me, requiring a metadata update is the more common case, particularly on live systems or within Docker containers.

All measurements were taken on an Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz running Docker 1.13.1 on Linux 4.19, backed by a Samsung 970 Pro NVMe drive boasting many hundreds of MB/s write performance. The machine is located in Zürich and connected to the Internet with a 1 Gigabit fiber connection, so the expected top download speed is ≈115 MB/s.

See Appendix C for details on the measurement method and command outputs.

Measurements

Keep in mind that these are one-time measurements. They should be indicative of actual performance, but your experience may vary.

ack (small Perl program)

distribution  package manager  data    wall-clock time  rate
Fedora        dnf              114 MB  33s              3.4 MB/s
Debian        apt              16 MB   10s              1.6 MB/s
NixOS         Nix              15 MB   5s               3.0 MB/s
Arch Linux    pacman           6.5 MB  3s               2.1 MB/s
Alpine        apk              10 MB   1s               10.0 MB/s

qemu (large C program)

distribution  package manager  data    wall-clock time  rate
Fedora        dnf              226 MB  4m37s            1.2 MB/s
Debian        apt              224 MB  1m35s            2.3 MB/s
Arch Linux    pacman           142 MB  44s              3.2 MB/s
NixOS         Nix              180 MB  34s              5.2 MB/s
Alpine        apk              26 MB   2.4s             10.8 MB/s


(Looking for older measurements? See Appendix B (2019).)

The difference between the slowest and fastest package managers is 30x!

How can Alpine’s apk and Arch Linux’s pacman be an order of magnitude faster than the rest? They are doing a lot less than the others, and more efficiently, too.

Pain point: too much metadata

For example, Fedora transfers a lot more data than others because its main package list is 60 MB (compressed!) alone. Compare that with Alpine’s 734 KB APKINDEX.tar.gz.

Of course the extra metadata which Fedora provides helps some use cases, otherwise they hopefully would have removed it altogether. Still, the amount of metadata seems excessive for installing a single package, which I consider the main use case of an interactive package manager.

I expect any modern Linux distribution to only transfer absolutely required data to complete my task.

Pain point: no concurrency

Because they need to sequence executing arbitrary package maintainer-provided code (hooks and triggers), all tested package managers need to install packages sequentially (one after the other) instead of concurrently (all at the same time).

In my blog post “Can we do without hooks and triggers?”, I outline that hooks and triggers are not strictly necessary to build a working Linux distribution.

Thought experiment: further speed-ups

Strictly speaking, the only required feature of a package manager is to make available the package contents so that the package can be used: a program can be started, a kernel module can be loaded, etc.

By only implementing what's needed for this feature, and nothing more, a package manager could likely beat apk's performance. It could, for example (see the sketch after this list):

  • skip archive extraction by mounting file system images (like AppImage or snappy)
  • use compression which is light on CPU, as networks are fast (like apk)
  • skip fsync when it is safe to do so, i.e.:
    • package installations don’t modify system state
    • atomic package installation (e.g. an append-only package store)
    • automatically clean up the package store after crashes
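
To make the last group of bullets concrete, here is a toy Python sketch (entirely hypothetical paths and names, not a real package manager): a content-addressed, append-only package store where installing an image is a copy plus an atomic rename, no fsync is issued, and leftovers from crashes are trivial to clean up.

#!/usr/bin/env python3
# Toy sketch of an append-only, content-addressed package store (hypothetical).

import hashlib
import os
import sys

STORE = "/var/lib/toypkg/store"  # hypothetical store directory

def cleanup_after_crash():
    """Remove temp files left behind by interrupted installations."""
    for name in os.listdir(STORE):
        if name.endswith(".tmp"):
            os.unlink(os.path.join(STORE, name))

def install_image(image_path):
    """Copy a package image into the store under its SHA-256 name, atomically."""
    digest = hashlib.sha256()
    tmp_path = os.path.join(STORE, os.path.basename(image_path) + ".tmp")
    with open(image_path, "rb") as src, open(tmp_path, "wb") as dst:
        while chunk := src.read(1 << 20):
            digest.update(chunk)
            dst.write(chunk)
        # No fsync: the store is append-only and every entry is verifiable by hash.
    final_path = os.path.join(STORE, digest.hexdigest() + ".img")
    os.rename(tmp_path, final_path)  # atomic within the same file system
    return final_path  # e.g. a SquashFS image, ready to be loop-mounted

if __name__ == "__main__":
    os.makedirs(STORE, exist_ok=True)
    cleanup_after_crash()
    print(install_image(sys.argv[1]))

Because entries are immutable and named by their content, a crash can at worst leave a *.tmp file behind, which the cleanup pass removes on the next run.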

Current landscape

Here’s a table outlining how the various package managers listed on Wikipedia’s list of software package management systems fare:

name        scope   package file format                hooks/triggers
AppImage    apps    image: ISO9660, SquashFS           no
snappy      apps    image: SquashFS                    yes: hooks
FlatPak     apps    archive: OSTree                    no
0install    apps    archive: tar.bz2                   no
nix, guix   distro  archive: nar.{bz2,xz}              activation script
dpkg        distro  archive: tar.{gz,xz,bz2} in ar(1)  yes
rpm         distro  archive: cpio.{bz2,lz,xz}          scriptlets
pacman      distro  archive: tar.xz                    install
slackware   distro  archive: tar.{gz,xz}               yes: doinst.sh
apk         distro  archive: tar.gz                    yes: .post-install
Entropy     distro  archive: tar.bz2                   yes
ipkg, opkg  distro  archive: tar{,.gz}                 yes

Conclusion

As per the current landscape, there is no distribution-scoped package manager which uses images and leaves out hooks and triggers, not even in smaller Linux distributions.

I think that space is really interesting, as it uses a minimal design to achieve significant real-world speed-ups.

I have explored this idea in much more detail, and am happy to talk more about it in my post “Introducing the distri research linux distribution”.

There are a couple of recent developments going into the same direction:

Appendix C: measurement details (2020)

ack

You can expand each of these:

Fedora’s dnf takes almost 33 seconds to fetch and unpack 114 MB.

% docker run -t -i fedora /bin/bash
[root@62d3cae2e2f9 /]# time dnf install -y ack
Fedora 32 openh264 (From Cisco) - x86_64     1.9 kB/s | 2.5 kB     00:01
Fedora Modular 32 - x86_64                   6.8 MB/s | 4.9 MB     00:00
Fedora Modular 32 - x86_64 - Updates         5.6 MB/s | 3.7 MB     00:00
Fedora 32 - x86_64 - Updates                 9.9 MB/s |  23 MB     00:02
Fedora 32 - x86_64                            39 MB/s |  70 MB     00:01
[…]
real	0m32.898s
user	0m25.121s
sys	0m1.408s

NixOS’s Nix takes a little over 5s to fetch and unpack 15 MB.

% docker run -t -i nixos/nix
39e9186422ba:/# time sh -c 'nix-channel --update && nix-env -iA nixpkgs.ack'
unpacking channels...
created 1 symlinks in user environment
installing 'perl5.32.0-ack-3.3.1'
these paths will be fetched (15.55 MiB download, 85.51 MiB unpacked):
  /nix/store/34l8jdg76kmwl1nbbq84r2gka0kw6rc8-perl5.32.0-ack-3.3.1-man
  /nix/store/9df65igwjmf2wbw0gbrrgair6piqjgmi-glibc-2.31
  /nix/store/9fd4pjaxpjyyxvvmxy43y392l7yvcwy1-perl5.32.0-File-Next-1.18
  /nix/store/czc3c1apx55s37qx4vadqhn3fhikchxi-libunistring-0.9.10
  /nix/store/dj6n505iqrk7srn96a27jfp3i0zgwa1l-acl-2.2.53
  /nix/store/ifayp0kvijq0n4x0bv51iqrb0yzyz77g-perl-5.32.0
  /nix/store/w9wc0d31p4z93cbgxijws03j5s2c4gyf-coreutils-8.31
  /nix/store/xim9l8hym4iga6d4azam4m0k0p1nw2rm-libidn2-2.3.0
  /nix/store/y7i47qjmf10i1ngpnsavv88zjagypycd-attr-2.4.48
  /nix/store/z45mp61h51ksxz28gds5110rf3wmqpdc-perl5.32.0-ack-3.3.1
copying path '/nix/store/34l8jdg76kmwl1nbbq84r2gka0kw6rc8-perl5.32.0-ack-3.3.1-man' from 'https://cache.nixos.org'...
copying path '/nix/store/czc3c1apx55s37qx4vadqhn3fhikchxi-libunistring-0.9.10' from 'https://cache.nixos.org'...
copying path '/nix/store/9fd4pjaxpjyyxvvmxy43y392l7yvcwy1-perl5.32.0-File-Next-1.18' from 'https://cache.nixos.org'...
copying path '/nix/store/xim9l8hym4iga6d4azam4m0k0p1nw2rm-libidn2-2.3.0' from 'https://cache.nixos.org'...
copying path '/nix/store/9df65igwjmf2wbw0gbrrgair6piqjgmi-glibc-2.31' from 'https://cache.nixos.org'...
copying path '/nix/store/y7i47qjmf10i1ngpnsavv88zjagypycd-attr-2.4.48' from 'https://cache.nixos.org'...
copying path '/nix/store/dj6n505iqrk7srn96a27jfp3i0zgwa1l-acl-2.2.53' from 'https://cache.nixos.org'...
copying path '/nix/store/w9wc0d31p4z93cbgxijws03j5s2c4gyf-coreutils-8.31' from 'https://cache.nixos.org'...
copying path '/nix/store/ifayp0kvijq0n4x0bv51iqrb0yzyz77g-perl-5.32.0' from 'https://cache.nixos.org'...
copying path '/nix/store/z45mp61h51ksxz28gds5110rf3wmqpdc-perl5.32.0-ack-3.3.1' from 'https://cache.nixos.org'...
building '/nix/store/m0rl62grplq7w7k3zqhlcz2hs99y332l-user-environment.drv'...
created 49 symlinks in user environment
real	0m 5.60s
user	0m 3.21s
sys	0m 1.66s

Debian’s apt takes almost 10 seconds to fetch and unpack 16 MB.

% docker run -t -i debian:sid
root@1996bb94a2d1:/# time (apt update && apt install -y ack-grep)
Get:1 http://deb.debian.org/debian sid InRelease [146 kB]
Get:2 http://deb.debian.org/debian sid/main amd64 Packages [8400 kB]
Fetched 8546 kB in 1s (8088 kB/s)
[…]
The following NEW packages will be installed:
  ack libfile-next-perl libgdbm-compat4 libgdbm6 libperl5.30 netbase perl perl-modules-5.30
0 upgraded, 8 newly installed, 0 to remove and 23 not upgraded.
Need to get 7341 kB of archives.
After this operation, 46.7 MB of additional disk space will be used.
[…]
real	0m9.544s
user	0m2.839s
sys	0m0.775s

Arch Linux’s pacman takes a little under 3s to fetch and unpack 6.5 MB.

% docker run -t -i archlinux/base
[root@9f6672688a64 /]# time (pacman -Sy && pacman -S --noconfirm ack)
:: Synchronizing package databases...
 core            130.8 KiB  1090 KiB/s 00:00
 extra          1655.8 KiB  3.48 MiB/s 00:00
 community         5.2 MiB  6.11 MiB/s 00:01
resolving dependencies...
looking for conflicting packages...

Packages (2) perl-file-next-1.18-2  ack-3.4.0-1

Total Download Size:   0.07 MiB
Total Installed Size:  0.19 MiB
[…]
real	0m2.936s
user	0m0.375s
sys	0m0.160s

Alpine’s apk takes a little over 1 second to fetch and unpack 10 MB.

% docker run -t -i alpine
fetch http://dl-cdn.alpinelinux.org/alpine/v3.12/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.12/community/x86_64/APKINDEX.tar.gz
(1/4) Installing libbz2 (1.0.8-r1)
(2/4) Installing perl (5.30.3-r0)
(3/4) Installing perl-file-next (1.18-r0)
(4/4) Installing ack (3.3.1-r0)
Executing busybox-1.31.1-r16.trigger
OK: 43 MiB in 18 packages
real	0m 1.24s
user	0m 0.40s
sys	0m 0.15s

qemu

You can expand each of these:

Fedora’s dnf takes over 4 minutes to fetch and unpack 226 MB.

% docker run -t -i fedora /bin/bash
[root@6a52ecfc3afa /]# time dnf install -y qemu
Fedora 32 openh264 (From Cisco) - x86_64     3.1 kB/s | 2.5 kB     00:00
Fedora Modular 32 - x86_64                   6.3 MB/s | 4.9 MB     00:00
Fedora Modular 32 - x86_64 - Updates         6.0 MB/s | 3.7 MB     00:00
Fedora 32 - x86_64 - Updates                 334 kB/s |  23 MB     01:10
Fedora 32 - x86_64                            33 MB/s |  70 MB     00:02
[…]

Total download size: 181 M
Downloading Packages:
[…]

real	4m37.652s
user	0m38.239s
sys	0m6.321s

NixOS’s Nix takes almost 34s to fetch and unpack 180 MB.

% docker run -t -i nixos/nix
83971cf79f7e:/# time sh -c 'nix-channel --update && nix-env -iA nixpkgs.qemu'
unpacking channels...
created 1 symlinks in user environment
installing 'qemu-5.1.0'
these paths will be fetched (180.70 MiB download, 1146.92 MiB unpacked):
[…]
real	0m 33.64s
user	0m 16.96s
sys	0m 3.05s

Debian’s apt takes over 95 seconds to fetch and unpack 224 MB.

% docker run -t -i debian:sid
root@b7cc25a927ab:/# time (apt update && apt install -y qemu-system-x86)
Get:1 http://deb.debian.org/debian sid InRelease [146 kB]
Get:2 http://deb.debian.org/debian sid/main amd64 Packages [8400 kB]
Fetched 8546 kB in 1s (5998 kB/s)
[…]
Fetched 216 MB in 43s (5006 kB/s)
[…]
real	1m25.375s
user	0m29.163s
sys	0m12.835s

Arch Linux’s pacman takes almost 44s to fetch and unpack 142 MB.

% docker run -t -i archlinux/base
[root@58c78bda08e8 /]# time (pacman -Sy && pacman -S --noconfirm qemu)
:: Synchronizing package databases...
 core          130.8 KiB  1055 KiB/s 00:00
 extra        1655.8 KiB  3.70 MiB/s 00:00
 community       5.2 MiB  7.89 MiB/s 00:01
[…]
Total Download Size:   135.46 MiB
Total Installed Size:  661.05 MiB
[…]
real	0m43.901s
user	0m4.980s
sys	0m2.615s

Alpine’s apk takes only about 2.4 seconds to fetch and unpack 26 MB.

% docker run -t -i alpine
/ # time apk add qemu-system-x86_64
fetch http://dl-cdn.alpinelinux.org/alpine/v3.10/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.10/community/x86_64/APKINDEX.tar.gz
[…]
OK: 78 MiB in 95 packages
real	0m 2.43s
user	0m 0.46s
sys	0m 0.09s

Appendix B: measurement details (2019)

ack

You can expand each of these:

Fedora’s dnf takes almost 30 seconds to fetch and unpack 107 MB.

% docker run -t -i fedora /bin/bash
[root@722e6df10258 /]# time dnf install -y ack
Fedora Modular 30 - x86_64            4.4 MB/s | 2.7 MB     00:00
Fedora Modular 30 - x86_64 - Updates  3.7 MB/s | 2.4 MB     00:00
Fedora 30 - x86_64 - Updates           17 MB/s |  19 MB     00:01
Fedora 30 - x86_64                     31 MB/s |  70 MB     00:02
[…]
Install  44 Packages

Total download size: 13 M
Installed size: 42 M
[…]
real	0m29.498s
user	0m22.954s
sys	0m1.085s

NixOS’s Nix takes 14s to fetch and unpack 15 MB.

% docker run -t -i nixos/nix
39e9186422ba:/# time sh -c 'nix-channel --update && nix-env -i perl5.28.2-ack-2.28'
unpacking channels...
created 2 symlinks in user environment
installing 'perl5.28.2-ack-2.28'
these paths will be fetched (14.91 MiB download, 80.83 MiB unpacked):
  /nix/store/57iv2vch31v8plcjrk97lcw1zbwb2n9r-perl-5.28.2
  /nix/store/89gi8cbp8l5sf0m8pgynp2mh1c6pk1gk-attr-2.4.48
  /nix/store/gkrpl3k6s43fkg71n0269yq3p1f0al88-perl5.28.2-ack-2.28-man
  /nix/store/iykxb0bmfjmi7s53kfg6pjbfpd8jmza6-glibc-2.27
  /nix/store/k8lhqzpaaymshchz8ky3z4653h4kln9d-coreutils-8.31
  /nix/store/svgkibi7105pm151prywndsgvmc4qvzs-acl-2.2.53
  /nix/store/x4knf14z1p0ci72gl314i7vza93iy7yc-perl5.28.2-File-Next-1.16
  /nix/store/zfj7ria2kwqzqj9dh91kj9kwsynxdfk0-perl5.28.2-ack-2.28
copying path '/nix/store/gkrpl3k6s43fkg71n0269yq3p1f0al88-perl5.28.2-ack-2.28-man' from 'https://cache.nixos.org'...
copying path '/nix/store/iykxb0bmfjmi7s53kfg6pjbfpd8jmza6-glibc-2.27' from 'https://cache.nixos.org'...
copying path '/nix/store/x4knf14z1p0ci72gl314i7vza93iy7yc-perl5.28.2-File-Next-1.16' from 'https://cache.nixos.org'...
copying path '/nix/store/89gi8cbp8l5sf0m8pgynp2mh1c6pk1gk-attr-2.4.48' from 'https://cache.nixos.org'...
copying path '/nix/store/svgkibi7105pm151prywndsgvmc4qvzs-acl-2.2.53' from 'https://cache.nixos.org'...
copying path '/nix/store/k8lhqzpaaymshchz8ky3z4653h4kln9d-coreutils-8.31' from 'https://cache.nixos.org'...
copying path '/nix/store/57iv2vch31v8plcjrk97lcw1zbwb2n9r-perl-5.28.2' from 'https://cache.nixos.org'...
copying path '/nix/store/zfj7ria2kwqzqj9dh91kj9kwsynxdfk0-perl5.28.2-ack-2.28' from 'https://cache.nixos.org'...
building '/nix/store/q3243sjg91x1m8ipl0sj5gjzpnbgxrqw-user-environment.drv'...
created 56 symlinks in user environment
real	0m 14.02s
user	0m 8.83s
sys	0m 2.69s

Debian’s apt takes almost 10 seconds to fetch and unpack 16 MB.

% docker run -t -i debian:sid
root@b7cc25a927ab:/# time (apt update && apt install -y ack-grep)
Get:1 http://cdn-fastly.deb.debian.org/debian sid InRelease [233 kB]
Get:2 http://cdn-fastly.deb.debian.org/debian sid/main amd64 Packages [8270 kB]
Fetched 8502 kB in 2s (4764 kB/s)
[…]
The following NEW packages will be installed:
  ack ack-grep libfile-next-perl libgdbm-compat4 libgdbm5 libperl5.26 netbase perl perl-modules-5.26
The following packages will be upgraded:
  perl-base
1 upgraded, 9 newly installed, 0 to remove and 60 not upgraded.
Need to get 8238 kB of archives.
After this operation, 42.3 MB of additional disk space will be used.
[…]
real	0m9.096s
user	0m2.616s
sys	0m0.441s

Arch Linux’s pacman takes a little over 3s to fetch and unpack 6.5 MB.

% docker run -t -i archlinux/base
[root@9604e4ae2367 /]# time (pacman -Sy && pacman -S --noconfirm ack)
:: Synchronizing package databases...
 core            132.2 KiB  1033K/s 00:00
 extra          1629.6 KiB  2.95M/s 00:01
 community         4.9 MiB  5.75M/s 00:01
[…]
Total Download Size:   0.07 MiB
Total Installed Size:  0.19 MiB
[…]
real	0m3.354s
user	0m0.224s
sys	0m0.049s

Alpine’s apk takes only about 1 second to fetch and unpack 10 MB.

% docker run -t -i alpine
/ # time apk add ack
fetch http://dl-cdn.alpinelinux.org/alpine/v3.10/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.10/community/x86_64/APKINDEX.tar.gz
(1/4) Installing perl-file-next (1.16-r0)
(2/4) Installing libbz2 (1.0.6-r7)
(3/4) Installing perl (5.28.2-r1)
(4/4) Installing ack (3.0.0-r0)
Executing busybox-1.30.1-r2.trigger
OK: 44 MiB in 18 packages
real	0m 0.96s
user	0m 0.25s
sys	0m 0.07s

qemu

You can expand each of these:

Fedora’s dnf takes over a minute to fetch and unpack 266 MB.

% docker run -t -i fedora /bin/bash
[root@722e6df10258 /]# time dnf install -y qemu
Fedora Modular 30 - x86_64            3.1 MB/s | 2.7 MB     00:00
Fedora Modular 30 - x86_64 - Updates  2.7 MB/s | 2.4 MB     00:00
Fedora 30 - x86_64 - Updates           20 MB/s |  19 MB     00:00
Fedora 30 - x86_64                     31 MB/s |  70 MB     00:02
[…]
Install  262 Packages
Upgrade    4 Packages

Total download size: 172 M
[…]
real	1m7.877s
user	0m44.237s
sys	0m3.258s

NixOS’s Nix takes 38s to fetch and unpack 262 MB.

% docker run -t -i nixos/nix
39e9186422ba:/# time sh -c 'nix-channel --update && nix-env -i qemu-4.0.0'
unpacking channels...
created 2 symlinks in user environment
installing 'qemu-4.0.0'
these paths will be fetched (262.18 MiB download, 1364.54 MiB unpacked):
[…]
real	0m 38.49s
user	0m 26.52s
sys	0m 4.43s

Debian’s apt takes 51 seconds to fetch and unpack 159 MB.

% docker run -t -i debian:sid
root@b7cc25a927ab:/# time (apt update && apt install -y qemu-system-x86)
Get:1 http://cdn-fastly.deb.debian.org/debian sid InRelease [149 kB]
Get:2 http://cdn-fastly.deb.debian.org/debian sid/main amd64 Packages [8426 kB]
Fetched 8574 kB in 1s (6716 kB/s)
[…]
Fetched 151 MB in 2s (64.6 MB/s)
[…]
real	0m51.583s
user	0m15.671s
sys	0m3.732s

Arch Linux’s pacman takes 1m2s to fetch and unpack 124 MB.

% docker run -t -i archlinux/base
[root@9604e4ae2367 /]# time (pacman -Sy && pacman -S --noconfirm qemu)
:: Synchronizing package databases...
 core       132.2 KiB   751K/s 00:00
 extra     1629.6 KiB  3.04M/s 00:01
 community    4.9 MiB  6.16M/s 00:01
[…]
Total Download Size:   123.20 MiB
Total Installed Size:  587.84 MiB
[…]
real	1m2.475s
user	0m9.272s
sys	0m2.458s

Alpine’s apk takes only about 2.4 seconds to fetch and unpack 26 MB.

% docker run -t -i alpine
/ # time apk add qemu-system-x86_64
fetch http://dl-cdn.alpinelinux.org/alpine/v3.10/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.10/community/x86_64/APKINDEX.tar.gz
[…]
OK: 78 MiB in 95 packages
real	0m 2.43s
user	0m 0.46s
sys	0m 0.09s

21 November, 2020 09:04AM

Winding down my Debian involvement

This post is hard to write, both in the emotional sense but also in the “I would have written a shorter letter, but I didn’t have the time” sense. Hence, please assume the best of intentions when reading it—it is not my intention to make anyone feel bad about their contributions, but rather to provide some insight into why my frustration level ultimately exceeded the threshold.

Debian has been in my life for well over 10 years at this point.

A few weeks ago, I visited some old friends at the Zürich Debian meetup after a multi-year period of absence. On my bike ride home, it occurred to me that the topics of our discussions had remarkable overlap with my last visit. We had a discussion about the merits of systemd, which took a detour to respect in open source communities, returned to processes in Debian and eventually culminated in democracies and their theoretical/practical failings. Admittedly, that last one might be a Swiss thing.

I say this not to knock on the Debian meetup, but because it prompted me to reflect on what feelings Debian is invoking lately and whether it’s still a good fit for me.

So I’m finally making a decision that I should have made a long time ago: I am winding down my involvement in Debian to a minimum.

What does this mean?

Over the coming weeks, I will:

  • transition packages to be team-maintained where it makes sense
  • remove myself from the Uploaders field on packages with other maintainers
  • orphan packages where I am the sole maintainer

I will try to keep up best-effort maintenance of the manpages.debian.org service and the codesearch.debian.net service, but any help would be much appreciated.

For all intents and purposes, please treat me as permanently on vacation. I will try to be around for administrative issues (e.g. permission transfers) and questions addressed directly to me, provided they are easy enough to answer.

Why?

When I joined Debian, I was still studying, i.e. I had luxurious amounts of spare time. Now, over 5 years of full time work later, my day job taught me a lot, both about what works in large software engineering projects and how I personally like my computer systems. I am very conscious of how I spend the little spare time that I have these days.

The following sections each deal with what I consider a major pain point, in no particular order. Some of them influence each other—for example, if changes worked better, we could have a chance at transitioning packages to be more easily machine readable.

Change process in Debian

The last few years, my current team at work conducted various smaller and larger refactorings across the entire code base (touching thousands of projects), so we have learnt a lot of valuable lessons about how to effectively do these changes. It irks me that Debian works almost the opposite way in every regard. I appreciate that every organization is different, but I think a lot of my points do actually apply to Debian.

In Debian, packages are nudged in the right direction by a document called the Debian Policy, or its programmatic embodiment, lintian.

While it is great to have a lint tool (for quick, local/offline feedback), it is even better to not require a lint tool at all. The team conducting the change (e.g. the C++ team introduces a new hardening flag for all packages) should be able to do their work transparent to me.

Instead, currently, all packages become lint-unclean, all maintainers need to read up on what the new thing is, how it might break, whether/how it affects them, manually run some tests, and finally decide to opt in. This causes a lot of overhead and manually executed mechanical changes across packages.

Notably, the cost of each change is distributed onto the package maintainers in the Debian model. At work, we have found that the opposite works better: if the team behind the change is put in power to do the change for as many users as possible, they can be significantly more efficient at it, which reduces the total cost and time a lot. Of course, exceptions (e.g. a large project abusing a language feature) should still be taken care of by the respective owners, but the important bit is that the default should be the other way around.

Debian is lacking tooling for large changes: it is hard to programmatically deal with packages and repositories (see the section below). The closest to “sending out a change for review” is to open a bug report with an attached patch. I thought the workflow for accepting a change from a bug report was too complicated and started mergebot, but only Guido ever signaled interest in the project.

Culturally, reviews and reactions are slow. There are no deadlines. I literally sometimes get emails notifying me that a patch I sent out a few years ago (!!) is now merged. This turns projects from a small number of weeks into many years, which is a huge demotivator for me.

Interestingly enough, you can see artifacts of the slow online activity manifest itself in the offline culture as well: I don’t want to be discussing systemd’s merits 10 years after I first heard about it.

Lastly, changes can easily be slowed down significantly by holdouts who refuse to collaborate. My canonical example for this is rsync, whose maintainer refused my patches to make the package use debhelper purely out of personal preference.

Granting so much personal freedom to individual maintainers prevents us as a project from raising the abstraction level for building Debian packages, which in turn makes tooling harder.

What would things look like in a better world?

  1. As a project, we should strive towards more unification. Uniformity still does not rule out experimentation, it just changes the trade-off from easier experimentation and harder automation to harder experimentation and easier automation.
  2. Our culture needs to shift from “this package is my domain, how dare you touch it” to a shared sense of ownership, where anyone in the project can easily contribute (reviewed) changes without necessarily even involving individual maintainers.

To learn more about what successful large changes can look like, I recommend my colleague Hyrum Wright’s talk “Large-Scale Changes at Google: Lessons Learned From 5 Yrs of Mass Migrations”.

Fragmented workflow and infrastructure

Debian generally seems to prefer decentralized approaches over centralized ones. For example, individual packages are maintained in separate repositories (as opposed to in one repository), each repository can use any SCM (git and svn are common ones) or no SCM at all, and each repository can be hosted on a different site. Of course, what you do in such a repository also varies subtly from team to team, and even within teams.

In practice, non-standard hosting options are used rarely enough to not justify their cost, but frequently enough to be a huge pain when trying to automate changes to packages. Instead of using GitLab’s API to create a merge request, you have to design an entirely different, more complex system, which deals with intermittently (or permanently!) unreachable repositories and abstracts away differences in patch delivery (bug reports, merge requests, pull requests, email, …).

Wildly diverging workflows are not just a temporary problem either. I participated in long discussions about different git workflows during DebConf 13, and I gather that there have been similar discussions in the meantime.

Personally, I cannot keep enough details of the different workflows in my head. Every time I touch a package that works differently than mine, it frustrates me immensely to re-learn aspects of my day-to-day.

After noticing workflow fragmentation in the Go packaging team (which I started), I tried fixing this with the workflow changes proposal, but did not succeed in implementing it. The lack of effective automation and slow pace of changes in the surrounding tooling despite my willingness to contribute time and energy killed any motivation I had.

Old infrastructure: package uploads

When you want to make a package available in Debian, you upload GPG-signed files via anonymous FTP. There are several batch jobs (the queue daemon, unchecked, dinstall, possibly others) which run on fixed schedules (e.g. dinstall runs at 01:52 UTC, 07:52 UTC, 13:52 UTC and 19:52 UTC).

Depending on timing, I estimated that you might wait for over 7 hours (!!) before your package is actually installable.

What’s worse for me is that feedback to your upload is asynchronous. I like to do one thing, be done with it, move to the next thing. The current setup requires a many-minute wait and costly task switch for no good technical reason. You might think a few minutes aren’t a big deal, but when all the time I can spend on Debian per day is measured in minutes, this makes a huge difference in perceived productivity and fun.

The last communication I can find about speeding up this process is ganneff’s post from 2008.

What would things look like in a better world?

  1. Anonymous FTP would be replaced by a web service which ingests my package and returns an authoritative accept or reject decision in its response.
  2. For accepted packages, there would be a status page displaying the build status and when the package will be available via the mirror network.
  3. Packages should be available within a few minutes after the build completed.

Old infrastructure: bug tracker

I dread interacting with the Debian bug tracker. debbugs is a piece of software (from 1994) which is only used by Debian and the GNU project these days.

Debbugs processes emails, which is to say it is asynchronous and cumbersome to deal with. Despite running on the fastest machines we have available in Debian (or so I was told when the subject last came up), its web interface loads very slowly.

Notably, the web interface at bugs.debian.org is read-only. Setting up a working email setup for reportbug(1) or manually dealing with attachments is a rather big hurdle.

For reasons I don’t understand, every interaction with debbugs results in many different email threads.

Aside from the technical implementation, I can also never remember the different ways that Debian uses pseudo-packages for bugs and processes. I use them too rarely to establish a mental model of how they are set up, or working memory of how they are used, but frequently enough to be annoyed by this.

What would things look like in a better world?

  1. Debian would switch from a custom bug tracker to a (any) well-established one.
  2. Debian would offer automation around processes. It is great to have a paper-trail and artifacts of the process in the form of a bug report, but the primary interface should be more convenient (e.g. a web form).

Old infrastructure: mailing list archives

It baffles me that in 2019, we still don’t have a conveniently browsable threaded archive of mailing list discussions. Email and threading is more widely used in Debian than anywhere else, so this is somewhat ironic. Gmane used to paper over this issue, but Gmane’s availability over the last few years has been spotty, to say the least (it is down as I write this).

I tried to contribute a threaded list archive, but our listmasters didn’t seem to care or want to support the project.

Debian is hard to machine-read

While it is obviously possible to deal with Debian packages programmatically, the experience is far from pleasant. Everything seems slow and cumbersome. I have picked just 3 quick examples to illustrate my point.

debiman needs help from piuparts in analyzing the alternatives mechanism of each package to display the manpages of e.g. psql(1). This is because maintainer scripts modify the alternatives database by calling shell scripts. Without actually installing a package, you cannot know what changes it makes to the alternatives database.

pk4 needs to maintain its own cache to look up package metadata based on the package name. Other tools parse the apt database from scratch on every invocation. A proper database format, or at least a binary interchange format, would go a long way.

Debian Code Search wants to ingest new packages as quickly as possible. There used to be a fedmsg instance for Debian, but it no longer seems to exist. It is unclear where to get notifications from for new packages, and where best to fetch those packages.

Complicated build stack

See my “Debian package build tools” post. It really bugs me that the sprawl of tools is not seen as a problem by others.

Developer experience pretty painful

Most of the points discussed so far deal with the experience in developing Debian, but as I recently described in my post “Debugging experience in Debian”, the experience when developing using Debian leaves a lot to be desired, too.

I have more ideas

At this point, the article is getting pretty long, and hopefully you got a rough idea of my motivation.

While I described a number of specific shortcomings above, the final nail in the coffin is actually the lack of a positive outlook. I have more ideas that seem really compelling to me, but, based on how my previous projects have been going, I don’t think I can make any of these ideas happen within the Debian project.

I intend to publish a few more posts about specific ideas for improving operating systems here. Stay tuned.

Lastly, I hope this post inspires someone, ideally a group of people, to improve the developer experience within Debian.

21 November, 2020 09:04AM

hackergotchi for Kentaro Hayashi

Kentaro Hayashi

Introduction about recent debexpo (mentors.debian.net)

I've made a presentation about "How to hack debexpo (mentors.debian.net)" at Tokyo Debian (a local Debian meeting) on 21 November 2020.

Here is the agenda of the presentation:

  • What is mentors.debian.net
  • How to setup debexpo development environment
  • One example to hack debexpo (Showing "In Debian" flag)

The presentation slides are published at Rabbit Slide Show (written in Japanese).

I hope that more people will get involved in hacking debexpo!

21 November, 2020 07:37AM

November 20, 2020

hackergotchi for Shirish Agarwal

Shirish Agarwal

Rights, Press freedom and India

It is both sad and interesting to see how personal liberty is viewed in India, and how those with the highest fame and power can get a kind of justice that the rest cannot.

Arnab Goswami

This particular gentleman is a class apart. He is the editor as well as a part-owner of Republic TV, a right-leaning channel which demonizes minorities, women, and whatever else is antithetical to the Central Govt. of India. As a result there has been a spate of cases against him in the past few months. But surprisingly, in each of them he got a hearing the day after the suit was filed. This is unique in Indian legal history, so much so that a popular legal site which publishes ongoing cases put up a post sharing how he was getting prompt hearings. That post itself needs to be updated, as there have been three more hearings held back to back for him. This is unusual, as there are so many cases pending for the SC's attention, some arguably more important than this gentleman's. So many precedents have been set which will send the wrong message. The biggest one: that even though a trial is taking place in the sessions court (below the High Court), the SC can interject on matters. What this will do to the morale of both lawyers and judges of the various sessions courts is a matter of speculation, and yet, as shared, it is unprecedented. The saddest part was when Justice Chandrachud said –

Justice Chandrachud – If you don’t like a channel then don’t watch it. – 11th November 2020 .

This is basically giving free rein to hate speech. How can the SC say something like that? And this is the same Supreme Court which could not tolerate two tweets from Shri Prashant Bhushan when he made remarks against the judiciary.

J&K pleas in Supreme Court pending since August 2019 (Abrogation 370)

After the abrogation of Article 370, the citizens of Jammu and Kashmir, a population of 13.6 million people including 4 million Hindus, have been stuck with reduced rights and with their land being taken away due to new laws. Many of the Hindus, who regionally are a minority, now rue the fact that they supported the abrogation of 370A. Imagine a whole state whose pleas and prayers have not been heard by the Supreme Court, and whose people need to move a prayer stating the same.

100 Journalists, activists languishing in Jail without even a hearing

55 journalists alone have been threatened, booked, and jailed for their reporting on the pandemic. Their fault: they were bringing to light the irregularities and corruption of the pandemic's early months. Activists such as Sudha Bharadwaj, who gave up her American citizenship and settled here to fight for tribals, has been in jail for 2 years without any charges. There are many like her, and there are several more petitions lying in the Supreme Court, e.g. Varavara Rao's: not a single hearing in the last couple of years, even though he has taken part in so many national movements, including during the Emergency, and was partly responsible for the creation of Telangana state out of Andhra Pradesh.

Then there is Devangana Kalita, who works for gender rights. Similar to Sudha Bharadwaj, she had an opportunity to go to the UK and settle there. She did her master's and came back. And now she is in jail for the very things that she studied. While she took part in anti-CAA sit-ins, none of her speeches were incendiary, but she is still locked up under the UAPA (Unlawful Activities (Prevention) Act). I could go on and on, but at the moment these should suffice.

Petitions over the hate speech which resulted in riots in Delhi are pending, and the (controversial) Citizenship Amendment Act has had no hearings to date. All of this has been explained in a newspaper article which articulates perhaps all that I wanted to articulate and more. It is and was amazing to see how in certain cases Article 32 is valid and in many it is not. A fair reading of Justice Bobde's article also tells you a lot about how the SC is functioning. I would like to point out that barandbench, along with livelawindia, makes it easier for non-lawyers and the public to know how arguments are made in court and what evidence is taken, as well as giving some clue about judicial orders and judgements. Both of these resources provide an invaluable service, more often than not free of charge.

Student Suicide and High Cost of Education

For quite some time now, the cost of education has been shooting up. While I have visited this topic earlier as well, recently a young girl committed suicide because she was unable to pay the fees as well as the additional costs due to the pandemic. Further investigation shows that this is the case with many students who are unable to buy laptops. While one could think it is limited to one college, that would be wrong. It is happening almost all across India, and this will continue for months and years. People do know that the pandemic is going to last a significant time, and it will be a long time before the R value becomes zero. Even the promising vaccine from Pfizer needs constant refrigeration, which is next to impossible in India. It is going to make things very costly.

Last Nail on Indian Media

Just today, the last nail has been put in the coffin of Indian media. Thankfully, Freedom Gazette India did a much better job of explaining it, so I am just pasting that –

Information and Broadcasting Ministry bringing OTT services as well as news within its ambit.

With this, projects like Scam 1992: The Harshad Mehta Story, Bad Boy Billionaires: India, Test Case, Delhi Crime, Laakhon Mein Ek and the like, and investigative journalism in general, would be still-births. Many of these web series also shared tales of women's empowerment while at the same time showing some of the hard choices that women have to contend with.

Even western media may be censored where the government finds the political discourse not to its liking. There have been so many accounts from Mr. Ravish Kumar, the winner of the Ramon Magsaysay award, of how the electricity was cut in many places during his shows. I too have been a victim of this when the BJP governed Maharashtra, as almost all Puneites experienced it: the power would go out for just 30 or 45 minutes, at exactly that time.

There is another aspect to it. The U.S. elections showed how independent media was able to counter Mr. Trump's various falsehoods and give rise to alternative ideas, which led to the team of Bernie Sanders, Joe Biden and Kamala Harris, with Biden now the President-elect and Kamala Harris the Vice-President-elect, although the journey to the White House seems as tough as before. Let's see what happens.

Hopefully 2021 will bring in some good news.

Update – On 27th November 2020, Martin, who runs the planet, got an e-mail/notice from a Mr. Nikhil Sethi, who runs the wikibio.com property. Mr. Sethi asked me to remove the link pointing from the mention of Devangana Kalita in my blog post to his site, even though he had used a nofollow link. On inquiring further, the gentleman stated that it is an ‘updated mandate’ (his exact quote) from the Google algorithm. To further understand the issue, I went to SERP, as they are one of the better-known resources on the subject, and I also looked it up on Google. I found that the gentleman was BSing the whole time. The page basically talks about the weightage and authoritativeness of a page/site, which are known and yet highly contested ideas. In any case, the point for me was that, for whatever reason (could be fear, could be something else entirely), Mr. Sethi did not want me to link to the content. Hence, I have complied above. I could have dragged it out, but I do not wish Mr. Sethi any ill-being or further harm unduly and unintentionally caused by me. Hence, I have taken down the link.

20 November, 2020 07:39PM by shirishag75

November 19, 2020

Molly de Blanc

Transparency

Technology must be transparent in order to be knowable. Technology must be knowable in order for us to be able to consent to it in good faith. Good faith informed consent is necessary to preserving our (digital) autonomy.

Let’s now look at this in reverse, considering first why informed consent is necessary to our digital autonomy.

Let’s take the concept of our digital autonomy as being one of the highest goods. It is necessary to preserve and respect the value of each individual, and the collectives we choose to form. It is a right to which we are entitled by our very nature, and a prerequisite for building the lives we want, that fulfill us. This is something that we have generally agreed on as important or even sacred. Our autonomy, in whatever form it takes, in whatever part of our life it governs, is necessary and must be protected.

One of the things we must do in order to accomplish this is to build a practice and culture of consent. Giving consent — saying yes — is not enough. This consent must come from a place of understanding of that to which one is consenting. “Informed consent is consenting to the unknowable.”(1)

Looking at sexual consent as a parallel, even when we have a partner who discloses their sexual history and activities, we cannot know whether they are being truthful and complete. Let’s even say they are and that we can trust this, there is a limit to how much even they know about their body, health, and experience. They might not know the extent of their other partners’ experience. They might be carrying HPV without symptoms; we rarely test for herpes.

Arguably, we have more potential to definitely know what is occurring when it comes to technological consent. Technology can be broken apart. We can share and examine code, schematics, and design documentation. Certainly, lots of information is being hidden from us — a lot of code is proprietary, technical documentation unavailable, and the skills to process these things are treated as special, arcane, and even magical. Tracing the resource pipelines for the minerals and metals essential to building circuit boards is not possible for the average person. Knowing the labor practices of each step of this process, and understanding what those imply for individuals, societies, and the environments they exist in seems improbable at best.

Even though true informed consent might not be possible, it is an ideal towards which we must strive. We must work with what we have, and we must be provided as much as possible.

A periodic conversation that arises in the consideration of technology rights is whether companies should build backdoors into technology for the purpose of government exploitation. A backdoor is a hidden vulnerability in a piece of technology that, when used, would afford someone else access to your device or work or cloud storage or whatever. As long as the source code that powers computing technology is proprietary and opaque, we cannot truly know whether backdoors exist and how secure we are in our digital spaces and even our own computers, phones, and other mobile devices.

We must commit wholly to transparency and openness in order to create the possibility of as-informed-as-possible consent in order to protect our digital autonomy. We cannot exist in a vacuum and practical autonomy relies on networks of truth in order to provide the opportunity for the ideal of informed consent. These networks of truth are created through the open availability and sharing of information, relating to how and why technology works the way it does.

(1) Heintzman, Kit. 2020.

19 November, 2020 03:24PM by mollydb

hackergotchi for Steinar H. Gunderson

Steinar H. Gunderson

COVID-19 vaccine confidence intervals

I keep hearing about new vaccines being “at least 90% effective”, “94.5% effective”, “92% effective” etc... and that's obviously really good news. But is that a point estimate, or a confidence interval? Does 92% mean “anything from 70% to 99%”, given that n=20?

I dusted off my memories of how bootstrapping works (I didn't want to try to figure out whether one could really approximate using the Cauchy distribution or not) and wrote some R code. Obviously, don't use this for medical or policy decisions, since I don't have a background in either medicine or medical statistics. But the results are uplifting nevertheless; here they are, from the Pfizer/BioNTech data that I could find:

> N <- 43538 / 2
> infected_vaccine <- c(rep(1, times = 8), rep(0, times=N-8))
> infected_placebo <- c(rep(1, times = 162), rep(0, times=N-162))
>
> infected <- c(infected_vaccine, infected_placebo)
> vaccine <- c(rep(1, times=N), rep(0, times=N))
> mydata <- data.frame(infected, vaccine)
>
> library(boot)
> rsq <- function(data, indices) {
+   d <- data[indices,]
+   num_infected_vaccine <- sum(d[which(d$vaccine == 1), ]$infected)
+   num_infected_placebo <- sum(d[which(d$vaccine == 0), ]$infected)
+   return(1.0 - num_infected_vaccine / num_infected_placebo)
+ }
>
> results <- boot(data=mydata, statistic=rsq, R=1000)
> results

ORDINARY NONPARAMETRIC BOOTSTRAP


Call:
boot(data = mydata, statistic = rsq, R = 1000)


Bootstrap Statistics :
     original       bias    std. error
t1* 0.9506173 -0.001428342  0.01832874
> boot.ci(results, type="perc")
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 1000 bootstrap replicates

CALL :
boot.ci(boot.out = results, type = "perc")

Intervals :
Level     Percentile
95%   ( 0.9063,  0.9815 )
Calculations and Intervals on Original Scale

So that would be a 95% CI of between 90.6% and 98.1% effective, roughly. The confidence intervals might be slightly too wide, since I didn't have enough RAM (!) to run the bootstrap calibrated ones (BCa).

Again, take it with a grain of salt. Corrections welcome. :-)

19 November, 2020 09:39AM

hackergotchi for Daniel Silverstone

Daniel Silverstone

Withdrawing Gitano from support

Unfortunately, in Debian in particular, libgit2 is undergoing a transition which is blocked by gall. Despite having had over a month to deal with this, I've not managed to summon the tuits to update Gall to the new libgit2 which means, nominally, I ought to withdraw it from testing and possibly even from unstable given that I'm not really prepared to look after Gitano and friends in Debian any longer.

However, I'd love for Gitano to remain in Debian if it's useful to people. Gall isn't exactly a large piece of C code, and so probably won't be a huge job to do the port, I simply don't have the desire/energy to do it myself.

If someone wanted to do the work and provide a patch / "pull request" to me, then I'd happily take on the change and upload a new package, or if someone wanted to NMU the gall package in Debian I'll take the change they make and import it into upstream. I just don't have the energy to reload all that context and do the change myself.

If you want to do this, email me and let me know, so I can support you and take on the change when it's done. Otherwise I probably will go down the lines of requesting Gitano's removal from Debian in the next week or so.

19 November, 2020 08:49AM by Daniel Silverstone

November 17, 2020

hackergotchi for Raphaël Hertzog

Raphaël Hertzog

Freexian’s report about Debian Long Term Support, October 2020

Like each month, here comes a report about the work of paid contributors to Debian LTS.

Individual reports

In October, 221.50 work hours have been dispatched among 13 paid contributors. Their reports are available:
  • Abhijith PA did 16.0h (out of 14h assigned and 2h from September).
  • Adrian Bunk did 7h (out of 20.75h assigned and 5.75h from September), thus carrying over 19.5h to November.
  • Ben Hutchings did 11.5h (out of 6.25h assigned and 9.75h from September), thus carrying over 4.5h to November.
  • Brian May did 10h (out of 10h assigned).
  • Chris Lamb did 18h (out of 18h assigned).
  • Emilio Pozuelo Monfort did 20.75h (out of 20.75h assigned).
  • Holger Levsen did 7.0h coordinating/managing the LTS team.
  • Markus Koschany did 20.75h (out of 20.75h assigned).
  • Mike Gabriel gave back the 8h he was assigned. See below 🙂
  • Ola Lundqvist did 10.5h (out of 8h assigned and 2.5h from September).
  • Roberto C. Sánchez did 13.5h (out of 20.75h assigned) and gave back 7.25h to the pool.
  • Sylvain Beucler did 20.75h (out of 20.75h assigned).
  • Thorsten Alteholz did 20.75h (out of 20.75h assigned).
  • Utkarsh Gupta did 20.75h (out of 20.75h assigned).

Evolution of the situation

October was a regular LTS month, with an LTS team meeting held via video chat, so there is no log to be shared. After more than five years of contributing to LTS (and ELTS), Mike Gabriel announced that he founded a new company called Frei(e) Software GmbH and thus would leave us to concentrate on this new endeavor. Best of luck with that, Mike! So, once again, this is a good moment to remind you that we are constantly looking for new contributors. Please contact Holger if you are interested!

The security tracker currently lists 42 packages with a known CVE and the dla-needed.txt file has 39 packages needing an update.

Thanks to our sponsors

Sponsors that joined recently are in bold.


17 November, 2020 09:06AM by Raphaël Hertzog

hackergotchi for Jaldhar Vyas

Jaldhar Vyas

Sal Mubarak 2077!

[Celebrating Diwali wearing a mask]

Best wishes to the entire Debian world for a happy, prosperous and safe Gujarati new year, Vikram Samvat 2077 named Paridhawi.

17 November, 2020 08:05AM

hackergotchi for Louis-Philippe Véronneau

Louis-Philippe Véronneau

A better git diff

A few days ago I wrote a quick patch and missed a dumb mistake that made the program crash. When reviewing the merge request on Salsa, the problem became immediately apparent; Gitlab's diff is much better than what git diff shows by default in a terminal.

Well, it turns out since version 2.9, git bundles a better pager, diff-highlight. À la Gitlab, it will highlight what changed in the line.

The output of git diff using diff-highlight

Sadly, even though diff-highlight comes with the git package in Debian, it is not built by default (925288). You will need to:

$ sudo make --directory /usr/share/doc/git/contrib/diff-highlight

You can then add this line to your .gitconfig file:

[core]
  pager = /usr/share/doc/git/contrib/diff-highlight/diff-highlight | less --tabs=4 -RFX
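
If you would rather try diff-highlight once before touching your configuration, you can also pipe a coloured diff through the freshly built script by hand; a quick sketch (the less flags are just a matter of taste):

$ git diff --color=always | /usr/share/doc/git/contrib/diff-highlight/diff-highlight | less -R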

If you use tig, you'll also need to add this line in your tigrc:

set diff-highlight = /usr/share/doc/git/contrib/diff-highlight/diff-highlight

17 November, 2020 05:00AM by Louis-Philippe Véronneau

hackergotchi for Dirk Eddelbuettel

Dirk Eddelbuettel

RcppArmadillo 0.10.1.2.0

armadillo image

Armadillo is a powerful and expressive C++ template library for linear algebra aiming towards a good balance between speed and ease of use with a syntax deliberately close to a Matlab. RcppArmadillo integrates this library with the R environment and language–and is widely used by (currently) 779 other packages on CRAN.

This release ties up a few loose ends from the recent 0.10.1.0.0.

Changes in RcppArmadillo version 0.10.1.2.0 (2020-11-15)

  • Upgraded to Armadillo release 10.1.2 (Orchid Ambush)

  • Remove three unused int constants (#313)

  • Include main armadillo header using quotes instead of brackets

  • Rewrite version number use in old-school mode because gcc 4.8.5

  • Skipping parts of sparse conversion on Windows as win-builder fails

Courtesy of my CRANberries, there is a diffstat report relative to previous release. More detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

If you like this or other open-source work I do, you can now sponsor me at GitHub. For the first year, GitHub will match your contributions.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

17 November, 2020 02:03AM

RcppAnnoy 0.0.17

annoy image

A new release 0.0.17 of RcppAnnoy is now on CRAN. RcppAnnoy is the Rcpp-based R integration of the nifty Annoy library by Erik Bernhardsson. Annoy is a small and lightweight C++ template header library for very fast approximate nearest neighbours—originally developed to drive the famous Spotify music discovery algorithm.

This release brings a new upstream version 1.17, released a few weeks ago, which adds multithreaded index building. This changes the API by adding a new ‘threading policy’ parameter requiring code using the main Annoy header to update. For this reason we waited a little for the dust to settle on the BioConductor 3.12 release before bringing the changes to BiocNeighbors via this commit and to uwot via this simple PR. Aaron and James updated their packages accordingly so by the time I uploaded RcppAnnoy it made for very smooth sailing as we all had done our homework with proper conditional builds, and the package had no other issue preventing automated processing at CRAN. Yay. I also added a (somewhat overdue one may argue) header file RcppAnnoy.h regrouping defines and includes which should help going forward.

Detailed changes follow below.

Changes in version 0.0.17 (2020-11-15)

  • Upgrade to Annoy 1.17, but default to serial use.

  • Add new header file to regroup includes and defines.

  • Upgrade CI script to use R with bspm on focal.

Courtesy of my CRANberries, there is also a diffstat report for this release.

If you like this or other open-source work I do, you can now sponsor me at GitHub. For the first year, GitHub will match your contributions.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

17 November, 2020 01:48AM

November 16, 2020

hackergotchi for Bits from Debian

Bits from Debian

New Debian Developers and Maintainers (September and October 2020)

The following contributors got their Debian Developer accounts in the last two months:

  • Benda XU (orv)
  • Joseph Nahmias (jello)
  • Marcos Fouces (marcos)
  • Hayashi Kentaro (kenhys)
  • James Valleroy (jvalleroy)
  • Helge Deller (deller)

The following contributors were added as Debian Maintainers in the last two months:

  • Ricardo Ribalda Delgado
  • Pierre Gruet
  • Henry-Nicolas Tourneur
  • Aloïs Micard
  • Jérôme Lebleu
  • Nis Martensen
  • Stephan Lachnit
  • Felix Salfelder
  • Aleksey Kravchenko
  • Étienne Mollier

Congratulations!

16 November, 2020 07:00PM by Jean-Pierre Giraud

hackergotchi for Adnan Hodzic

Adnan Hodzic

Degiro trading tracker – Simplified tracking of your investments

TL;DR: Visit degiro-trading-tracker on Github. I was always interested in stocks and investing. While I wanted to get into trading for a long time, I could never...

The post Degiro trading tracker – Simplified tracking of your investments appeared first on FoolControl: Phear the penguin.

16 November, 2020 07:21AM by Adnan Hodzic

November 15, 2020

Jamie McClelland

Being your own Certificate Authority

There are many blogs and tutorials with nice shortcuts providing the necessary openssl commands to create and sign x509 certificates.

However, there are precious few instructions for how to easily create your own certificate authority.

You probably never want to do this in a production environment, but in a development environment it will make your life significantly easier.

Create the certificate authority

Create the key and certificate

Pick a directory to store things in. Then, make your certificate authority key and certificate:

openssl genrsa -out cakey.pem 2048
openssl req -x509 -new -nodes -key cakey.pem -sha256 -days 1024 -out cacert.pem

Some tips:

  • You will be prompted to enter some information about your certificate authority. Provide the minimum information - i.e., only override the defaults where needed: provide a value for Country, State or Province, and Organization Name, and leave the rest blank.
  • You probably want to leave the password blank if this is a development/testing environment.

Want to review what you created?

openssl x509 -text -noout -in cacert.pem 

Prepare your directory

You can create your own /etc/ssl/openssl.cnf file and really customize things. But I find it safer to use your distribution's default file so you can benefit from changes to it every time you upgrade.

If you do take the default file, you may have the dir option coded to demoCA (in Debian at least, maybe it's the upstream default too).

So, to avoid changing any configuration files, let's just use this value. Which means... you'll need to create that directory. The setting is relative - so you can create this directory in the same directory you have your keys.

mkdir demoCA

Lastly, you have to have a file that keeps track of your certificates. If it doesn't exist, you get an error:

touch demoCA/index.txt

That's it! Your certificate authority is ready to go.

Create a key and certificate signing request

First, pick your domain names (aka "common" names). For example, example.org and www.example.org.

Set those values in an environment variable. If you just have one:

export subjectAltName=DNS:example.org

If you have more than one:

export subjectAltName=DNS:example.org,DNS:www.example.org

Next, create a key and a certificate signing request:

openssl req -new -nodes -out new.csr -keyout new.key

Again, you will be prompted for some values (country, state, etc.) - be sure to choose the same values you used with your certificate authority! When I set different values I get an error on the signing step below; the likely reason is that the stock openssl.cnf ships with policy = policy_match, which requires the country, state or province, and organization name of a request to match those of the CA.

Also, you must provide a common name for your certificate - you can choose the same name as one of the subjectAltName values you set above (but just one domain).

Want to review what you created?

openssl req -in new.csr -text -noout 

Sign it!

At last, the moment we have been waiting for.

openssl ca -keyfile cakey.pem -cert cacert.pem -out new.crt -outdir . -rand_serial -infiles new.csr

Now you have a new.crt which, together with new.key, you can install in your web server, mail server, etc., as appropriate.

Sanity check it

This command will confirm that the certificate is trusted by your certificate authority.

openssl verify -no-CApath -CAfile cacert.pem new.crt 
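
If you also want to confirm that the subjectAltName values you exported earlier made it into the signed certificate, a quick check (assuming OpenSSL 1.1.1 or newer, which added the -ext option) is:

openssl x509 -in new.crt -noout -ext subjectAltName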

But wait, there's still a question of trust

You probably want to tell your computer or browser that you want to trust your certificate signing authority.

Command line tools

Most tools in linux by default will trust all the certificates in /etc/ssl/certs/ca-certificates.crt. (If that file doesn't exist, try installing the ca-certificates package). If you want to add your certificate to that file:

cp cacert.pem /usr/local/share/ca-certificates/cacert.crt
sudo dpkg-reconfigure ca-certificates

Want to know what's funny? Ok, not really funny. If the certificate name ends with .pem the command above won't work. Seriously.

Once your certificate is installed with your web server you can now test to make sure it's all working with:

gnutls-cli --print-cert $domainName

Want a second opinion?

curl https://$domainName
wget https://$domainName -O-

Both will report errors if the certificate can't be verified by a system certificate.

If you really want to narrow down the cause of the error (maybe reconfiguring ca-certificates didn't work), try:

curl --cacert /path/to/your/cacert.pem --capath /tmp https://$domainName

Those arguments tell curl to use your certificate authority file and not to load any other certificate authority files (well, unless you have some installed in the temp directory).

Web browsers

Firefox and Chrome have their own store of trusted certificates - you'll have to import your cacert.pem file into each browser that you want to trust your key.
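
For Chromium/Chrome on Linux, one way to do this from the command line is a sketch like the following, assuming certutil from the libnss3-tools package and the default per-user NSS database (the nickname my-dev-ca is arbitrary):

certutil -d sql:$HOME/.pki/nssdb -A -t "C,," -n my-dev-ca -i cacert.pem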

15 November, 2020 03:11PM

hackergotchi for Steinar H. Gunderson

Steinar H. Gunderson

Using Buypass card readers in Linux

If you want to know the result of your corona test in Norway (or really, any other health information), you'll need to either get an electronic ID with a confidential spec where your bank holds the secret key, can use it towards other banks with no oversight, and allows whoever has that key to take up huge loans and other binding agreements in your name in a matter of minutes (also known as “BankID”)… or you can get a smart card from Buypass, where you hold the private key yourself.

Since most browsers won't talk directly to a smart card, you'll need a small proxy that exposes a REST interface on 127.0.0.1. (It used to be solved with a Java applet, but, yeah. That was 40 Chrome releases ago.) Buypass publishes those only for Windows and Mac, but the protocol was simple enough, so I made my own reimplementation called Linux Dallas Multipass. It's rough, only really seems to work in Firefox (and only if you spoof your UA to be Windows), you'll need to generate and install certificates to install it yourself… but yes. You can log in to find out that you're negative.

15 November, 2020 11:15AM

Vincent Bernat

Zero-Touch Provisioning for Juniper

Juniper’s official documentation on ZTP explains how to configure the ISC DHCP Server to automatically upgrade and configure a Juniper device on first boot. However, the proposed configuration could be a bit more elegant. This note explains how.

TL;DR

Do not redefine option 43. Instead, specify the vendor option space to use to encode parameters with vendor-option-space.


When booting for the first time, a Juniper device requests its IP address through a DHCP discover message, then requests additional parameters for autoconfiguration through a DHCP request message:

Dynamic Host Configuration Protocol (Request)
    Message type: Boot Request (1)
    Hardware type: Ethernet (0x01)
    Hardware address length: 6
    Hops: 0
    Transaction ID: 0x44e3a7c9
    Seconds elapsed: 0
    Bootp flags: 0x8000, Broadcast flag (Broadcast)
    Client IP address: 0.0.0.0
    Your (client) IP address: 0.0.0.0
    Next server IP address: 0.0.0.0
    Relay agent IP address: 0.0.0.0
    Client MAC address: 02:00:00:00:00:01 (02:00:00:00:00:01)
    Client hardware address padding: 00000000000000000000
    Server host name not given
    Boot file name not given
    Magic cookie: DHCP
    Option: (54) DHCP Server Identifier (10.0.2.2)
    Option: (55) Parameter Request List
        Length: 14
        Parameter Request List Item: (3) Router
        Parameter Request List Item: (51) IP Address Lease Time
        Parameter Request List Item: (1) Subnet Mask
        Parameter Request List Item: (15) Domain Name
        Parameter Request List Item: (6) Domain Name Server
        Parameter Request List Item: (66) TFTP Server Name
        Parameter Request List Item: (67) Bootfile name
        Parameter Request List Item: (120) SIP Servers
        Parameter Request List Item: (44) NetBIOS over TCP/IP Name Server
        Parameter Request List Item: (43) Vendor-Specific Information
        Parameter Request List Item: (150) TFTP Server Address
        Parameter Request List Item: (12) Host Name
        Parameter Request List Item: (7) Log Server
        Parameter Request List Item: (42) Network Time Protocol Servers
    Option: (50) Requested IP Address (10.0.2.15)
    Option: (53) DHCP Message Type (Request)
    Option: (60) Vendor class identifier
        Length: 15
        Vendor class identifier: Juniper-mx10003
    Option: (51) IP Address Lease Time
    Option: (12) Host Name
    Option: (255) End
    Padding: 00

It requests several options, including the TFTP server address option 150, and the Vendor-Specific Information Option 43—or VSIO. The DHCP server can use option 60 to identify the vendor-specific information to send. For Juniper devices, option 43 encodes the image name and the configuration file name. They are fetched from the IP address provided in option 150.

The official documentation on ZTP provides a valid configuration to answer such a request. However, it does not leverage the ability of the ISC DHCP Server to support several vendors and redefines option 43 to be Juniper-specific:

option NEW_OP-encapsulation code 43 = encapsulate NEW_OP;

Instead, it is possible to define an option space for Juniper, using a self-descriptive name, without overriding option 43:

# Juniper vendor option space
option space juniper;
option juniper.image-file-name     code 0 = text;
option juniper.config-file-name    code 1 = text;
option juniper.image-file-type     code 2 = text;
option juniper.transfer-mode       code 3 = text;
option juniper.alt-image-file-name code 4 = text;
option juniper.http-port           code 5 = text;

Then, when you need to set these suboptions, specify the vendor option space:

class "juniper-mx10003" {
  match if (option vendor-class-identifier = "Juniper-mx10003");
  vendor-option-space juniper;
  option juniper.transfer-mode    "http";
  option juniper.image-file-name  "/images/junos-vmhost-install-mx-x86-64-19.3R2-S4.5.tgz";
  option juniper.config-file-name "/cfg/juniper-mx10003.txt";
}

This configuration returns the following answer:1

Dynamic Host Configuration Protocol (ACK)
    Message type: Boot Reply (2)
    Hardware type: Ethernet (0x01)
    Hardware address length: 6
    Hops: 0
    Transaction ID: 0x44e3a7c9
    Seconds elapsed: 0
    Bootp flags: 0x8000, Broadcast flag (Broadcast)
    Client IP address: 0.0.0.0
    Your (client) IP address: 10.0.2.15
    Next server IP address: 0.0.0.0
    Relay agent IP address: 0.0.0.0
    Client MAC address: 02:00:00:00:00:01 (02:00:00:00:00:01)
    Client hardware address padding: 00000000000000000000
    Server host name not given
    Boot file name not given
    Magic cookie: DHCP
    Option: (53) DHCP Message Type (ACK)
    Option: (54) DHCP Server Identifier (10.0.2.2)
    Option: (51) IP Address Lease Time
    Option: (1) Subnet Mask (255.255.255.0)
    Option: (3) Router
    Option: (6) Domain Name Server
    Option: (43) Vendor-Specific Information
        Length: 89
        Value: 00362f696d616765732f6a756e6f732d766d686f73742d69…
    Option: (150) TFTP Server Address
    Option: (255) End

Using the vendor-option-space directive allows you to make different ZTP implementations coexist. For example, you can add the option space for PXE:

option space PXE;
option PXE.mtftp-ip    code 1 = ip-address;
option PXE.mtftp-cport code 2 = unsigned integer 16;
option PXE.mtftp-sport code 3 = unsigned integer 16;
option PXE.mtftp-tmout code 4 = unsigned integer 8;
option PXE.mtftp-delay code 5 = unsigned integer 8;
option PXE.discovery-control    code 6  = unsigned integer 8;
option PXE.discovery-mcast-addr code 7  = ip-address;
option PXE.boot-server code 8  = { unsigned integer 16, unsigned integer 8, ip-address };
option PXE.boot-menu   code 9  = { unsigned integer 16, unsigned integer 8, text };
option PXE.menu-prompt code 10 = { unsigned integer 8, text };
option PXE.boot-item   code 71 = unsigned integer 32;

class "pxeclients" {
  match if substring (option vendor-class-identifier, 0, 9) = "PXEClient";
  vendor-option-space PXE;
  option PXE.mtftp-ip 10.0.2.2;
  # […]
}
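
Whichever option spaces you end up defining, it is worth letting the server check the configuration syntax before reloading it. A minimal sanity check, assuming the configuration lives at the usual /etc/dhcp/dhcpd.conf, could be:

dhcpd -t -cf /etc/dhcp/dhcpd.conf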

On the same topic, do not override option 125 “VIVSO.” See “Zero-Touch Provisioning for Cisco IOS.”


  1. Wireshark knows how to decode option 43 for some vendors, thanks to option 60, but not for Juniper. ↩︎

15 November, 2020 10:20AM by Vincent Bernat

Russ Allbery

PGP::Sign 1.04

The refactor of PGP::Sign in the 1.00 release to use IPC::Run instead of hand-rolled process management code broke signing large files, which I discovered when trying to use the new module to sign checkgroups for the Big Eight Usenet hierarchies.

There were two problems: IPC::Run sets the sockets used to talk to the child process to non-blocking, and when you pass a scalar in as the data to pass to a child socket, IPC::Run expects to use it as a queue and thus doesn't send EOF to the child process when the input is exhausted.

This release works around both problems by handling non-blocking writes to the child using select and using a socket to write the passphrase to the child process instead of a scalar variable. It also adds a test to ensure that signing long input keeps working.

You can get the latest release from CPAN or from the PGP::Sign distribution page.

15 November, 2020 12:05AM

November 14, 2020

hackergotchi for Junichi Uekawa

Junichi Uekawa

Rewrote my build system in C++.

Rewrote my build system in C++. I used to write build rules in Node.js, but I figured that if my projects are mostly C++, I should probably write them in C++ too. I wanted to make it a bit more like BUILD files, but couldn't really, and it ended up looking more like plain C++ than I wanted. It seems key-value struct initialization isn't available until C++20.

14 November, 2020 08:43AM by Junichi Uekawa

November 13, 2020

hackergotchi for Martin Michlmayr

Martin Michlmayr

beancount2ledger 1.3 released

I released version 1.3 of beancount2ledger, the beancount to ledger converter that was moved from bean-report ledger into a standalone tool.

You can get beancount2ledger from GitHub or via pip install.
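
As a quick illustration (a sketch only: the file names are made up, and it assumes the command-line entry point takes a beancount file and writes ledger output to stdout):

$ pip install beancount2ledger
$ beancount2ledger main.beancount > main.ledger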

Here are the changes in 1.3:

  • Add rounding postings only when required (issue #9)
  • Avoid printing too much precision for a currency (issue #21)
  • Avoid creating two or more postings with null amount (issue #23)
  • Add price to cost when needed by ledger (issue #22)
  • Preserve posting order (issue #18)
  • Add config option indent
  • Show metadata with hledger output
  • Support setting auxiliary dates and posting dates from metadata (issue #14)
  • Support setting the code of transactions from metadata
  • Support mapping of account and currency names (issue #24)
  • Improve documentation:
    • Add user guide
    • Document limitations (issue #12)

13 November, 2020 12:12PM by Martin Michlmayr

November 12, 2020

hackergotchi for Dirk Eddelbuettel

Dirk Eddelbuettel

tidyCpp 0.0.2: More documentation and features

A first update of the still fairly new package tidyCpp is now on CRAN. The packages offers a C++ layer on top of the C API for R which aims to make its use a little easier and more consistent.

The vignette has been extended with a new example, a new section and some general editing. A few new defines have been added, mostly from the Rinternals.h header. We also replaced the Shield class with a simpler yet updated class, Protect. The name better represents the core functionality of offering a simpler alternative to the PROTECT and UNPROTECT macro pairing. We also added a short discussion to the vignette of a gotcha one has to be mindful of, and that we fell for ourselves in version 0.0.1. Finally, we added a typedef so that code using Shield can still be used.

The NEWS entry follows.

Changes in tidyCpp version 0.0.2 (2020-11-12)

  • Expanded definitions in internals.h to support new example.

  • The vignette has been extended with an example based on package uchardet.

  • Class Shield has been replaced by a new class Protect; a compatibility typedef has been added.

  • The examples and vignette have been clarified with respect to proper ownership of protected objects; a new vignette section was added.

Thanks to my CRANberries, there is also a diffstat report for this release.

For questions, suggestions, or issues please use the issue tracker at the GitHub repo.

If you like this or other open-source work I do, you can now sponsor me at GitHub. For the first year, GitHub will match your contributions.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

12 November, 2020 03:03PM

hackergotchi for Bits from Debian

Bits from Debian

"Homeworld" will be the default theme for Debian 11

The theme "Homeworld" by Juliette Taka has been selected as default theme for Debian 11 'bullseye'. Juliette says that this theme has been inspired by the Bauhaus movement, an art style born in Germany in the 20th century.

Homeworld wallpaper. Click to see the whole theme proposal

Homeworld debian-installer theme. Click to see the whole theme proposal

After the Debian Desktop Team made the call for proposing themes, a total of eighteen choices were submitted. The desktop artwork poll was open to the public, and we received 5,613 responses ranking the different choices, with Homeworld ranked as the winner among them.

This is the third time that a submission by Juliette has won. Juliette is also the author of the lines theme that was used in Debian 8 and the softWaves theme that was used in Debian 9.

We'd like to thank all the designers that have participated and have submitted their excellent work in the form of wallpapers and artwork for Debian 11.

Congratulations, Juliette, and thank you very much for your contribution to Debian!

12 November, 2020 12:30PM by Jonathan Carter

Mike Hommey

Announcing git-cinnabar 0.5.6

Please partake in the git-cinnabar survey.

Git-cinnabar is a git remote helper to interact with Mercurial repositories. It allows you to clone, pull and push from/to Mercurial remote repositories, using git.
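
For instance, cloning a Mercurial repository goes through the hg:: remote prefix (the URL below is just an example):

$ git clone hg::https://hg.mozilla.org/mozilla-unified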

Get it on github.

These release notes are also available on the git-cinnabar wiki.

What’s new since 0.5.5?

  • Updated git to 2.29.2 for the helper.
  • git cinnabar git2hg and git cinnabar hg2git now have a --batch flag.
  • Fixed a few issues with experimental support for python 3.
  • Fixed compatibility issues with mercurial >= 5.5.
  • Avoid downloading unsupported clonebundles.
  • Provide more resilience to network problems during bundle download.
  • Prebuilt helper for Apple Silicon macos now available via git cinnabar download.

12 November, 2020 02:40AM by glandium

November 11, 2020

Vincent Fourmond

Solution for QSoas quiz #1: averaging spectra

This post describes the solution to Quiz #1, based on the files found there. The point is to produce both the average and the standard deviation of a series of spectra. Below is how the final averaged spectrum should look:
I will present here two different solutions.

Solution 1: using the definition of standard deviation

There is a simple solution using the definition of the standard deviation: $$\sigma_y = \sqrt{<y^2> - {<y>}^2}$$ in which \(<y^2>\) is the average of \(y^2\) (and so on). So the simplest solution is to construct datasets with an additional column that would contain \(y^2\), average these columns, and replace the average with the above formula. For that, we need first a companion script that loads a single data file and adds a column with \(y^2\). Let's call this script load-one.cmds:
load ${1}
apply-formula y2=y**2 /extra-columns=1
flag /flags=processed
When this script is run with the name of a spectrum file as argument, it loads it (replaces ${1} by the first argument, the file name), adds a column y2 containing the square of the y column, and flags it with the processed flag. This is not absolutely necessary, but it makes it much easier to refer to all the spectra when they are processed. Then to process all the spectra, one just has to run the following commands:
run-for-each load-one.cmds Spectrum-1.dat Spectrum-2.dat Spectrum-3.dat
average flagged:processed
apply-formula y2=(y2-y**2)**0.5
dataset-options /yerrors=y2
The run-for-each command runs the load-one.cmds script for all the spectra (one could also have used Spectra-*.dat to not have to give all the file names). Then, the average averages the values of the columns over all the datasets. To be clear, it finds all the values that have the same X (or very close X values) and average them, column by column. The result of this command is therefore a dataset with the average of the original \(y\) data as y column and the average of the original \(y^2\) data as y2 column. So now, the only thing left to do is to use the above equation, which is done by the apply-formula code. The last command, dataset-options, is not absolutely necessary but it signals to QSoas that the standard error of the y column should be found in the y2 column. This is now available as script method-one.cmds in the git repository.

Solution 2: use QSoas's knowledge of standard deviation

The other method is a little more involved but it demonstrates a good approach to problem solving with QSoas. The starting point is that, in apply-formula, the value $stats.y_stddev corresponds to the standard deviation of the whole y column... Loading the spectra yields just a series of x,y datasets. We can contract them into a single dataset with one x column and several y columns:
load Spectrum-*.dat /flags=spectra
contract flagged:spectra
After these commands, the current dataset contains data in the form of:
lambda1	a1_1	a1_2	a1_3
lambda2	a2_1	a2_2	a2_3
...
in which the ai_1 come from the first file, ai_2 the second and so on. We need to use transpose to transform that dataset into:
0	a1_1	a2_1	...
1	a1_2	a2_2	...
2	a1_3	a2_3	...
In this dataset, the values of the absorbance at a given wavelength for each dataset are now stored in columns. The next step is just to use expand to obtain a series of datasets with the same x column and a single y column (each corresponding to a different wavelength in the original data). The game is now to replace these datasets with something that looks like:
0	a_average
1	a_stddev
For that, one takes advantage of the $stats.y_average and $stats.y_stddev values in apply-formula, together with the i special variable that represents the index of the point:
apply-formula "if i == 0; then y=$stats.y_average; end; if i == 1; then y=$stats.y_stddev; end"
strip-if i>1
Then, all that is left is to apply this to all the datasets created by expand, which can be just made using run-for-datasets, and then, we reverse the splitting by using contract and transpose ! In summary, this looks like this. We need two files. The first, process-one.cmds contains the following code:
apply-formula "if i == 0; then y=$stats.y_average; end; if i == 1; then y=$stats.y_stddev; end"
strip-if i>1
flag /flags=processed
The main file, method-two.cmds looks like this:
load Spectrum-*.dat /flags=spectra
contract flagged:spectra
transpose
expand /flags=tmp
run-for-datasets process-one.cmds flagged:tmp
contract flagged:processed
transpose
dataset-options /yerrors=y2
Note some of the code above can be greatly simplified using new features present in the upcoming 3.0 version, but that is the topic for another post.

About QSoas

QSoas is a powerful open source data analysis program that focuses on flexibility and powerful fitting capacities. It is released under the GNU General Public License. It is described in Fourmond, Anal. Chem., 2016, 88 (10), pp 5050–5052. Current version is 2.2. You can download its source code and compile it yourself or buy precompiled versions for MacOS and Windows there.

11 November, 2020 07:53PM by Vincent Fourmond (noreply@blogger.com)

Reproducible Builds

Reproducible Builds in October 2020

Welcome to the October 2020 report from the Reproducible Builds project.

In our monthly reports, we outline the major things that we have been up to over the past month. As a brief reminder, the motivation behind the Reproducible Builds effort is to ensure flaws have not been introduced in the binaries we install on our systems. If you are interested in contributing to the project, please visit our main website.

General

On Saturday 10th October, Morten Linderud gave a talk at Arch Conf Online 2020 on The State of Reproducible Builds in Arch. The video should be available later this month, but as a teaser:

The previous year has seen great progress in Arch Linux to get reproducible builds in the hands of the users and developers. In this talk we will explore the current tooling that allows users to reproduce packages, the rebuilder software that has been written to check packages and the current issues in this space.

During the Reproducible Builds summit in Marrakesh in 2019, developers from the GNU Guix, NixOS and Debian distributions were able to produce a bit-for-bit identical GNU Mes binary despite using three different versions of GCC. Since this summit, additional work resulted in a bit-for-bit identical Mes binary using tcc, and last month a fuller update was posted to this effect by the individuals involved. This month, however, David Wheeler updated his extensive page on Fully Countering Trusting Trust through Diverse Double-Compiling, remarking that:

GNU Mes rebuild is definitely an application of [Diverse Double-Compiling]. [..] This is an awesome application of DDC, and I believe it’s the first publicly acknowledged use of DDC on a binary

There was a small follow-up discussion on our mailing list.

In openSUSE, Bernhard M. Wiedemann published his monthly Reproducible Builds status update.

This month, the Reproducible Builds project restarted our IRC meetings, managing to convene twice: the first time on October 12th (summary & logs), and later on the 26th (logs). As mentioned in previous reports, due to the unprecedented events throughout 2020, there will be no in-person summit event this year.

On our mailing list this month, Elías Alejandro posted a request for help with a local configuration.

In August, Lucas Nussbaum performed an archive-wide rebuild of packages to test enabling the reproducible=+fixfilepath Debian build flag by default. Enabling this fixfilepath feature will likely fix reproducibility issues in an estimated 500-700 packages. However, this month Vagrant Cascadian posted to the debian-devel mailing list:

It would be great to see the reproducible=+fixfilepath feature enabled by default in dpkg-buildflags, and we would like to proceed forward with this soon unless we hear any major concerns or other outstanding issues. […] We would like to move forward with this change soon, so please raise any concerns or issues not covered already.

Debian Developer Stuart Prescott has been improving python-debian, a Python library that is used to parse Debian-specific files such as changelogs, .dscs, etc. In particular, Stuart is working on adding support for .buildinfo files used for recording reproducibility-related build metadata:

This can mostly be a very thin layer around the existing Deb822 types, using the existing Changes code for the file listings, the existing PkgRelations code for the package listing and gpg_* functions for signature handling.
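
As a rough illustration of what such "a very thin layer around the existing Deb822 types" could build on, the generic Deb822 class already shipped in python-debian can read the RFC822-style fields of a .buildinfo file today. The file name below is just an example, and this is not Stuart's planned API, merely a sketch of the existing building block it would wrap.

from debian import deb822

# Hypothetical sketch: parse an (unsigned, for simplicity) .buildinfo file
# with the generic Deb822 class; the dedicated .buildinfo support being
# worked on would add typed accessors for checksums, package lists, etc.
with open("hello_2.10-2_amd64.buildinfo") as f:
    buildinfo = deb822.Deb822(f)

print(sorted(buildinfo.keys()))
print(buildinfo["Source"], buildinfo["Version"])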

A total of 159 Debian packages were categorised, 69 had their categorisation updated, and 33 had their classification removed this month, adding to our knowledge about identified issues. As part of this, Chris Lamb identified and classified two new issues: build_path_captured_in_emacs_el_file and rollup_embeds_build_path.

Software development

This month, we tried to fix a large number of currently-unreproducible packages, including:

Bernhard M. Wiedemann also reported three issues against bison, ibus and postgresql12.

Tools

diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it also provides human-readable diffs of all kinds. This month, Chris Lamb uploaded version 161 to Debian (later backported by Mattia Rizzolo) and made the following changes:

  • Move test_ocaml to the assert_diff helper. []
  • Update tests to support OCaml version 4.11.1. Thanks to Sebastian Ramacher for the report. (#972518)
  • Bump minimum version of the Black source code formatter to 20.8b1. (#972518)

In addition, Jean-Romain Garnier temporarily updated the dependency on radare2 to ensure our test pipelines continue to work [], and for the GNU Guix distribution Vagrant Cascadian updated diffoscope to version 161 [].

In related development, trydiffoscope is the web-based version of diffoscope. This month, Chris Lamb made the following changes:

  • Mark a --help-only test as being a ‘superficial’ test. (#971506)
  • Add a real, albeit flaky, test that interacts with the try.diffoscope.org service. []
  • Bump debhelper compatibility level to 13 [] and bump Standards-Version to 4.5.0 [].

Lastly, disorderfs version 0.5.10-2 was uploaded to Debian unstable by Holger Levsen, which enabled security hardening via DEB_BUILD_MAINT_OPTIONS [] and dropped debian/disorderfs.lintian-overrides [].

Website and documentation

This month, a number of updates to the main Reproducible Builds website and related documentation were made by Chris Lamb:

  • Added a citation link to the academic article regarding dettrace [] and yet another supply-chain security attack publication [].
  • Reformatted the Jekyll Liquid templating and CSS to be consistent [], as well as expanding a number of tab characters [].
  • Used relative_url to fix missing translation icon on various pages. []
  • Published two announcement blog posts regarding the restarting of our IRC meetings. [][]
  • Added an explicit note regarding the lack of an in-person summit in 2020 to our events page. []

Testing framework

The Reproducible Builds project operates a Jenkins-based testing framework that powers tests.reproducible-builds.org. This month, Holger Levsen made the following changes:

  • Debian-related changes:

    • Refactor and improve the Debian dashboard. [][][]
    • Track bugs which are usertagged as ‘filesystem’, ‘fixfilepath’, etc. [][][]
    • Make a number of changes to package index pages. [][][]
  • System health checks:

    • Relax disk space warning levels. []
    • Specifically detect build failures reported by dpkg-buildpackage. []
    • Fix a regular expression to detect outdated package sets. []
    • Detect Lintian issues in diffoscope. []

  • Misc:

    • Make a number of updates to reflect that our sponsor Profitbricks has renamed itself to IONOS. [][][][]
    • Run a F-Droid maintenance routine twice a month to utilise its cleanup features. []
    • Fix the target name in OpenWrt builds to ath79 from ath97. []
    • Add a missing Postfix configuration for a node. []
    • Temporarily disable Arch Linux builds until a core node is back. []
    • Make a number of changes to our “thanks” page. [][][]

Build node maintenance was performed by both Holger Levsen [][] and Vagrant Cascadian [][][]. Vagrant Cascadian also updated the page listing the variations made when testing to reflect changes in build paths [], and Hans-Christoph Steiner made a number of changes for F-Droid, the free software app repository for Android devices, including:

  • Do not fail reproducibility jobs when their cleanup tasks fail. []
  • Skip libvirt-related sudo command if we are not actually running libvirt. []
  • Use direct URLs in order to eliminate a useless HTTP redirect. []

If you are interested in contributing to the Reproducible Builds project, please visit the Contribute page on our website. However, you can also get in touch with us via:

11 November, 2020 02:35PM

November 10, 2020

Thorsten Alteholz

My Debian Activities in October 2020

FTP master

This month I accepted 208 packages and rejected 29. The overall number of packages that got accepted was 563, so yeah, I was not alone this month :-).

Anyway, this month marked another milestone in my NEW package handling. My overall number of ACCEPTed packages exceeded the magic number of 20000. This is almost 30% of all packages accepted in Debian. I am a bit proud of this achievement.

Debian LTS

This was the seventy-sixth month in which I did some work for the Debian LTS initiative, started by Raphael Hertzog at Freexian.

This month my overall workload was 20.75h. During that time I did LTS uploads of:

  • [DLA 2415-1] freetype security update for one CVE
  • [DLA 2419-1] dompurify.js security update for two CVEs
  • [DLA 2418-1] libsndfile security update for eight CVEs
  • [DLA 2421-1] cimg security update for eight CVEs

I also started to work on golang-1.7 and golang-1.8.

Last but not least I did some days of frontdesk duties.

Debian ELTS

This month was the twenty-eighth ELTS month.

During my allocated time I uploaded:

  • ELA-289-2 for python3.4
  • ELA-304-1 for freetype
  • ELA-305-1 for libsndfile

The first upload of python3.4 last month did not build on armel, so I had to reupload an improved package this month. For amd64 and i386 the ELTS packages are built natively, whereas the packages on armel are cross-built. There is some magic in python's debian/rules to detect in which mode the package is built. This is important, as some tests of the testsuite do not really work in cross-build mode. Unfortunately I had to learn this the hard way …

The upload of libsndfile now aligns the number of fixed CVEs in all releases.

Last but not least I did some days of frontdesk duties.

Other stuff

Apart from my NEW handling and LTS/ELTS work, I didn't have much fun with Debian packages this month. Given the approaching freeze, I hope this will change again in November.

10 November, 2020 02:48PM by alteholz

hackergotchi for Jonathan Dowland

Jonathan Dowland

Borg, confidence in backups, GtkPod and software preservation

Over the summer I decided to migrate my backups from rdiff-backup to borg, which offers some significant advantages, in particular de-duplication, but comes at a cost of complexity, and a corresponding sense of unease about how sound my backup strategy might be. I've now hit the Point Of No Return: my second external backup drive is overdue being synced with my NAS, which will delete the last copy of the older rdiff-backup backups.

Whilst I hesitated over this last action that would commit me to borg, something else happened. My wife wanted to put a copy of her iTunes music library on her new phone, and I couldn't find it: not only could I not find it on any of our computers, I also couldn't find a copy on the NAS, or in backups, or even on old DVD-Rs. This has further knocked my confidence in our family data management, and makes me even more nervous to commit to borg. I'm now wondering about stashing the contents of the second external backup disk on some cloud service as a fail-safe.

There was one known-good copy of Sarah's music: on her ancient iPod Nano. Apple have gone to varying lengths to prevent you from copying music from an iPod. When Music is copied to an iPod, the files are stripped of all their metadata (artist, title, album, etc.) and renamed to something non-identifying (e.g. F01/MNRL.m4a), and the metadata (and correlation to the obscure file name) is saved in separate database files. The partition of the flash drive containing all this is also marked as "hidden" to prevent it appearing on macOS and Windows systems. We are lucky that the iPod is so old, because Apple went even further in more recent models, adding a layer of encryption.

To get the music off the iPod, one has to undo all of these steps.

Luckily, other fine folks have worked out how to reverse all these steps and implemented it in software such as libgpod and its frontend, GtkPod, which is still currently available as a Debian package. It mostly worked, and I got back 95% of the tracks. (It would have been nice if GtkPod had reported the tracks it hadn't recovered; it was aware they existed, based on the errors it did print. But you can't have everything.)

GtkPod is a quirky, erratic piece of software that is only useful for old Apple equipment that is long out of production, from before the introduction of the encryption. The upstream homepage is dead, and I suspect it is unmaintained. The Debian package is orphaned. It's been removed from testing, because it won't build with GCC 10. On the other hand, my experience shows that it worked, and was useful for a real problem that someone had today.

I'm in two minds about GtkPod's fate. On the one hand, I think Debian has far too many packages, with a corresponding burden of maintenance responsibility (for the whole project, not just the individual package maintainers), and there's a quality problem: once upon a time, if software had been packaged in a distribution like Debian, that was a mark of quality, a vote of confidence, and you could have some hope that the software would work and integrate well with the rest of the system. That is no longer true, and hasn't been in my experience for many years. If we were more discerning about what software we included in the distribution, and what we kept, perhaps we could be a leaner distribution, faster to adapt to the changing needs in the world, and of a higher quality.

On the other hand, this story about GtkPod is just one of many similar stories. Real problems have been solved in open source software, and computing historians, vintage computer enthusiasts, researchers etc. can still benefit from that software long into the future. Throwing out all this stuff in the name of "progress" could be misguided. I'm especially sad when I see the glee people have expressed when ditching libraries like Qt4 from the archive. Some software will not be ported on to Qt5 (or Gtk3, Qt6, Gtk4, Qt7, etc., in perpetuity). Such software might be all of: unmaintained, "finished", and useful for some purpose (however niche), all at the same time.

10 November, 2020 11:01AM

Red Hat at the Turing Institute

In Summer 2019 Red Hat were invited to the Turing Institute to provide a workshop on issues around building and sustaining an Open Source community. I was part of a group of about 6 people to visit the Turing and deliver the workshop. It seemed to have been well received by the audience.

The Turing Institute is based within the British Library. For many years I have enjoyed visiting the British Library if I was visiting or passing through London for some reason or other: it's such a lovely serene space in a busy, hectic part of London. On one occasion they had Jack Kerouac's manuscript for "On The Road" on display in one of the public gallery spaces: it's a continuous 120-foot long piece of paper that Kerouac assembled to prevent the interruption of changing sheets of paper in his typewriter from disturbing his flow whilst writing.

The Institute itself is a really pleasant-looking working environment. I got a quick tour of it back in February 2019 when visiting a friend who worked there, but last year's visit was my first prolonged experience of working there. (I also snuck in this February, when passing through London, to visit my supervisor who is a Turing Fellow)

I presented a section of a presentation entitled "How to build a successful Open Source community". My section attempted to focus on the "how". We've put out all the presentations under a Creative Commons license, and we've published them on the Red Hat Research website: https://research.redhat.com/blog/2020/08/12/open-source-at-the-turing/

The workshop participants were drawn from PhD students, research associates, research software engineers and Turing Institute fellows. We had some really great feedback from them which we've fed back into revisions of the workshop material including the presentations.

I'm hoping to stay involved in further collaborations between the Turing and Red Hat. I'm pleased to say that we participated in a recent Tools, practices and systems seminar (although I was not involved).

10 November, 2020 09:54AM

November 09, 2020

hackergotchi for Joachim Breitner

Joachim Breitner

Distributing Haskell programs in a multi-platform zip file

Maybe my most impactful piece of code is tttool and the surrounding project, which allows you to create your own content for the Ravensburger Tiptoi™ platform. The program itself is a command line tool, and in this blog post I want to show how I go about building that program for Linux (both normal and static builds), Windows (cross-compiled from Linux) and OSX (only on CI), all combined into and released as a single zip file.

Maybe some of it is useful or inspiring to my readers, or can even serve as a template. This being a blog post, though, note that it may become obsolete or outdated.

Ingredients

I am building on these components:

Without the nix build system and package manager I probably wouldn't even attempt to pull off complex tasks that may, say, require a patched ghc. For many years I resisted learning about nix, but when I eventually had to, I didn't want to go back.

This project provides an alternative Haskell build infrastructure for nix. While this is not crucial for tttool, it helps that it tends to carry more cross-compilation-related patches than the official nixpkgs. I also like that it more closely follows the cabal build workflow, where cabal calculates a build plan based on your project's dependencies. It even has decent documentation (which is a new thing compared to two years ago).

Niv is a neat little tool to keep track of your dependencies. You can quickly update them with, say, niv update nixpkgs. But what's really great is to temporarily replace one of your dependencies with a local checkout, e.g. via NIV_OVERRIDE_haskellNix=$HOME/build/haskell/haskell.nix nix-instantiate -A osx-exe-bundle. There is a Github action that will keep your niv-managed dependencies up-to-date.

This service (proprietary, but free for public stuff up to 10GB) gives your project its own nix cache. This means that build artifacts can be cached between CI builds, or even between build steps, and shared with your contributors. A cache like this is a must if you want to use nix in more interesting ways where you may end up using, say, a changed GHC compiler. Comes with GitHub Actions integration.

  • CI via Github actions

Until recently, I was using Travis, but Github Actions are just a tad easier to set up and, maybe more importantly here, the job time limits are high enough that you can rebuild GHC if you have to. Even if your build gets canceled or times out, cleanup CI steps still happen, so any new nix build products will still reach your nix cache.

The repository setup

All files discussed in the following are reflected at https://github.com/entropia/tip-toi-reveng/tree/7020cde7da103a5c33f1918f3bf59835cbc25b0c.

We are starting with a fairly normal Haskell project, with a single .cabal file (but multi-package projects should work just fine). To make things more interesting, I also have a cabal.project which configures one dependency to be fetched via git from a specific fork.

To start building the nix infrastructure, we can initialize niv and configure it to use the haskell.nix repo:

niv init
niv add input-output-hk/haskell.nix -n haskellNix

This creates nix/sources.json (which you can also edit by hand) and nix/sources.nix (which you can treat like a black box).

Now we can start writing the all-important default.nix file, which defines almost everything of interest here. I will just go through it line by line, and explain what I am doing here.

{ checkMaterialization ? false }:

This defines a flag that we can later set when using nix-build, by passing --arg checkMaterialization true, and which is off by default. I’ll get to that flag later.

let
  sources = import nix/sources.nix;
  haskellNix = import sources.haskellNix {};

This imports the sources as defined in nix/sources.json, and loads the pinned revision of the haskell.nix repository.

  # windows crossbuilding with ghc-8.10 needs at least 20.09.
  # A peek at https://github.com/input-output-hk/haskell.nix/blob/master/ci.nix can help
  nixpkgsSrc = haskellNix.sources.nixpkgs-2009;
  nixpkgsArgs = haskellNix.nixpkgsArgs;

  pkgs = import nixpkgsSrc nixpkgsArgs;

Now we can define pkgs, which is “our” version of the nixpkgs package set, extended with the haskell.nix machinery. We rely on haskell.nix to pin a suitable revision of the nixpkgs set (see how we are using their niv setup).

Here we could add our own configuration, overlays, etc. to nixpkgsArgs. In fact, we do so in

  pkgs-osx = import nixpkgsSrc (nixpkgsArgs // { system = "x86_64-darwin"; });

to get the nixpkgs package set of an OSX machine.

  # a nicer filterSource
  sourceByRegex =
    src: regexes: builtins.filterSource (path: type:
      let relPath = pkgs.lib.removePrefix (toString src + "/") (toString path); in
      let match = builtins.match (pkgs.lib.strings.concatStringsSep "|" regexes); in
      ( type == "directory"  && match (relPath + "/") != null
      || match relPath != null)) src;

Next I define a little helper that I have been copying between projects, and which allows me to define the input to a nix derivation (i.e. a nix build job) with a set of regexes. I’ll use that soon.

  tttool-exe = pkgs: sha256:
    (pkgs.haskell-nix.cabalProject {

The cabalProject function takes a cabal project and turns it into a nix project, running cabal v2-configure under the hood to let cabal figure out a suitable build plan. Since we want to have multiple variants of the tttool, this is so far just a function of two arguments pkgs and sha256, which will be explained in a bit.

      src = sourceByRegex ./. [
          "cabal.project"
          "src/"
          "src/.*/"
          "src/.*.hs"
          ".*.cabal"
          "LICENSE"
        ];

The cabalProject function wants to know the source of the Haskell projects. There are different ways of specifying this; in this case I went for a simple whitelist approach. Note that cabal.project.freeze (which exists in the directory) is not included.

      # Pinning the input to the constraint solver
      compiler-nix-name = "ghc8102";

The cabal solver doesn't find out which version of ghc to use; that is still my choice. I am using GHC-8.10.2 here. It may require a bit of experimentation to see which version works for your project, especially when cross-compiling to odd targets.

      index-state = "2020-11-08T00:00:00Z";

I want the build to be deterministic, and not let cabal suddenly pick different package versions just because something got uploaded. Therefore I specify which snapshot of the Hackage package index it should consider.

      plan-sha256 = sha256;
      inherit checkMaterialization;

Here we use the second parameter, but I’ll defer the explanation for a bit.

      modules = [{
        # smaller files
        packages.tttool.dontStrip = false;
      }] ++

These “modules” are essentially configuration data that is merged in a structural way. Here we say that we want the tttool binary to be stripped (which saves a few megabytes).

      pkgs.lib.optional pkgs.hostPlatform.isMusl {
        packages.tttool.configureFlags = [ "--ghc-option=-static" ];

Also, when we are building on the musl platform, that’s when we want to produce a static build, so let’s pass -static to GHC. This seems to be enough in terms of flags to produce static binaries. It helps that my project is using mostly pure Haskell libraries; if you link against C libraries you might have to jump through additional hoops to get static linking going. The haskell.nix documentation has a section on static building with some flags to cargo-cult.

        # terminfo is disabled on musl by haskell.nix, but still the flag
        # is set in the package plan, so override this
        packages.haskeline.flags.terminfo = false;
      };

This (again only used when the platform is musl) seems to be necessary to work around what might be a bug in haskell.nix.

    }).tttool.components.exes.tttool;

The cabalProject function returns a data structure with all Haskell packages of the project, and for each package the different components (libraries, tests, benchmarks and of course executables). We only care about the tttool executable, so let’s project that out.

  osx-bundler = pkgs: tttool:
   pkgs.stdenv.mkDerivation {
      name = "tttool-bundle";

      buildInputs = [ pkgs.macdylibbundler ];

      builder = pkgs.writeScript "tttool-osx-bundler.sh" ''
        source ${pkgs.stdenv}/setup

        mkdir -p $out/bin/osx
        cp ${tttool}/bin/tttool $out/bin/osx
        chmod u+w $out/bin/osx/tttool
        dylibbundler \
          -b \
          -x $out/bin/osx/tttool \
          -d $out/bin/osx \
          -p '@executable_path' \
          -i /usr/lib/system \
          -i ${pkgs.darwin.Libsystem}/lib
      '';
    };

This function, only to be used on OSX, takes a fully built tttool, finds all the system libraries it is linking against, and copies them next to the executable, using the nice macdylibbundler. This way we can get a self-contained executable.

A nix expert will notice that this probably should be written with pkgs.runCommandNoCC, but then dylibbundler fails because it lacks otool. This should work eventually, though.

in rec {
  linux-exe      = tttool-exe pkgs
     "0rnn4q0gx670nzb5zp7xpj7kmgqjmxcj2zjl9jqqz8czzlbgzmkh";
  windows-exe    = tttool-exe pkgs.pkgsCross.mingwW64
     "01js5rp6y29m7aif6bsb0qplkh2az0l15nkrrb6m3rz7jrrbcckh";
  static-exe     = tttool-exe pkgs.pkgsCross.musl64
     "0gbkyg8max4mhzzsm9yihsp8n73zw86m3pwvlw8170c75p3vbadv";
  osx-exe        = tttool-exe pkgs-osx
     "0rnn4q0gx670nzb5zp7xpj7kmgqjmxcj2zjl9jqqz8czzlbgzmkh";

Time to create the four versions of tttool. In each case we use the tttool-exe function from above, passing the package set (pkgs,…) and a SHA256 hash.

The package set is either the normal one, or it is one of those configured for cross compilation, building either for Windows or for Linux using musl, or it is the OSX package set that we instantiated earlier.

The SHA256 hash describes the result of the cabal plan calculation that happens as part of cabalProject. By noting down the expected result, nix can skip that calculation, or fetch it from the nix cache etc.

How do we know what number to put there, and when to change it? That’s when the --arg checkMaterialization true flag comes into play: When that is set, cabalProject will not blindly trust these hashes, but rather re-calculate them, and tell you when they need to be updated. We’ll make sure that CI checks them.

  osx-exe-bundle = osx-bundler pkgs-osx osx-exe;

For OSX, I then run the output through osx-bundler defined above, to make it independent of any library paths in /nix/store.

This is already good enough to build the tool for the various systems! The rest of the file is related to packaging up the binaries, to tests, and to various other things, but nothing too essential. So if you got bored, you can more or less stop now.

  static-files = sourceByRegex ./. [
    "README.md"
    "Changelog.md"
    "oid-decoder.html"
    "example/.*"
    "Debug.yaml"
    "templates/"
    "templates/.*\.md"
    "templates/.*\.yaml"
    "Audio/"
    "Audio/digits/"
    "Audio/digits/.*\.ogg"
  ];

  contrib = ./contrib;

The final zip file that I want to serve to my users contains a bunch of files from throughout my repository; I collect them here.

  book = …;

The project comes with documentation in the form of a Sphinx project, which we build here. I’ll omit the details, because they are not relevant for this post (but of course you can peek if you are curious).

  os-switch = pkgs.writeScript "tttool-os-switch.sh" ''
    #!/usr/bin/env bash
    case "$OSTYPE" in
      linux*)   exec "$(dirname "''${BASH_SOURCE[0]}")/linux/tttool" "$@" ;;
      darwin*)  exec "$(dirname "''${BASH_SOURCE[0]}")/osx/tttool" "$@" ;;
      msys*)    exec "$(dirname "''${BASH_SOURCE[0]}")/tttool.exe" "$@" ;;
      cygwin*)  exec "$(dirname "''${BASH_SOURCE[0]}")/tttool.exe" "$@" ;;
      *)        echo "unsupported operating system $OSTYPE" ;;
    esac
  '';

The zipfile should provide a tttool command that works on all systems. To that end, I implement a simple platform switch using bash. I use pkgs.writeScript so that I can include that file directly in default.nix, but it would have been equally reasonable to just save it into nix/tttool-os-switch.sh and include it from there.

  release = pkgs.runCommandNoCC "tttool-release" {
    buildInputs = [ pkgs.perl ];
  } ''
    # check version
    version=$(${static-exe}/bin/tttool --help|perl -ne 'print $1 if /tttool-(.*) -- The swiss army knife/')
    doc_version=$(perl -ne "print \$1 if /VERSION: '(.*)'/" ${book}/book.html/_static/documentation_options.js)

    if [ "$version" != "$doc_version" ]
    then
      echo "Mismatch between tttool version \"$version\" and book version \"$doc_version\""
      exit 1
    fi

Now the derivation that builds the content of the release zip file. First I double check that the version number in the code and in the documentation matches. Note how ${static-exe} refers to a path with the built static Linux build, and ${book} the output of the book building process.

    mkdir -p $out/
    cp -vsr ${static-files}/* $out
    mkdir $out/linux
    cp -vs ${static-exe}/bin/tttool $out/linux
    cp -vs ${windows-exe}/bin/* $out/
    mkdir $out/osx
    cp -vsr ${osx-exe-bundle}/bin/osx/* $out/osx
    cp -vs ${os-switch} $out/tttool
    mkdir $out/contrib
    cp -vsr ${contrib}/* $out/contrib/
    cp -vsr ${book}/* $out
  '';

The rest of the release script just copies files from various build outputs that we have defined so far.

Note that this is using both static-exe (built on Linux) and osx-exe-bundle (built on Mac)! This means you can only build the release if you have either set up a remote osx builder (a pretty nifty feature of nix, which I unfortunately can't use, since I don't have access to a Mac), or the build product is available in a nix cache (which it is in my case, as I will explain later).

The output of this derivation is a directory with all the files I want to put in the release.

  release-zip = pkgs.runCommandNoCC "tttool-release.zip" {
    buildInputs = with pkgs; [ perl zip ];
  } ''
    version=$(bash ${release}/tttool --help|perl -ne 'print $1 if /tttool-(.*) -- The swiss army knife/')
    base="tttool-$version"
    echo "Zipping tttool version $version"
    mkdir -p $out/$base
    cd $out
    cp -r ${release}/* $base/
    chmod u+w -R $base
    zip -r $base.zip $base
    rm -rf $base
  '';

And now these files are zipped up. Note that this automatically determines the right directory name and basename for the zipfile.

This concludes the steps necessary for a release.

  gme-downloads = …;
  tests = …;

These two definitions in default.nix are related to some simple testing, and again not relevant for this post.

  cabal-freeze = pkgs.stdenv.mkDerivation {
    name = "cabal-freeze";
    src = linux-exe.src;
    buildInputs = [ pkgs.cabal-install linux-exe.env ];
    buildPhase = ''
      mkdir .cabal
      touch .cabal/config
      rm cabal.project # so that cabal new-freeze does not try to use HPDF via git
      HOME=$PWD cabal new-freeze --offline --enable-tests || true
    '';
    installPhase = ''
      mkdir -p $out
      echo "-- Run nix-shell -A check-cabal-freeze to update this file" > $out/cabal.project.freeze
      cat cabal.project.freeze >> $out/cabal.project.freeze
    '';
  };

Above I mentioned that I still would like to be able to just run cabal, and ideally it should take the same library versions that the nix-based build does. But pinning the version of ghc in cabal.project is not sufficient; I also need to pin the precise versions of the dependencies. This is best done with a cabal.project.freeze file.

The above derivation runs cabal new-freeze in the environment set up by haskell.nix and grabs the resulting cabal.project.freeze. With this I can run nix-build -A cabal-freeze and fetch the file from result/cabal.project.freeze and add it to the repository.

  check-cabal-freeze = pkgs.runCommandNoCC "check-cabal-freeze" {
      nativeBuildInputs = [ pkgs.diffutils ];
      expected = cabal-freeze + /cabal.project.freeze;
      actual = ./cabal.project.freeze;
      cmd = "nix-shell -A check-cabal-freeze";
      shellHook = ''
        dest=${toString ./cabal.project.freeze}
        rm -f $dest
        cp -v $expected $dest
        chmod u-w $dest
        exit 0
      '';
    } ''
      diff -r -U 3 $actual $expected ||
        { echo "To update, please run"; echo "nix-shell -A check-cabal-freeze"; exit 1; }
      touch $out
    '';

But generated files in repositories are bad, so if that cannot be avoided, at least I want a CI job that checks if they are up to date. This job does that. What’s more, it is set up so that if I run nix-shell -A check-cabal-freeze it will update the file in the repository automatically, which is much more convenient than manually copying.

Lately, I have been using this pattern regularly when adding generated files to a repository:

  • Create one nix derivation that creates the files.
  • Create a second derivation that compares the output of that derivation against the file in the repo.
  • Create a derivation that, when run in nix-shell, updates that file.

Sometimes that last derivation is its own file (so that I can just run nix-shell nix/generate.nix), or it is merged into one of the other two.

This concludes the tour of default.nix.

The CI setup

The next interesting bit is the file .github/workflows/build.yml, which tells Github Actions what to do:

name: "Build and package"
on:
  pull_request:
  push:

Standard prelude: Run the jobs in this file upon all pushes to the repository, and also on all pull requests. Annoying downside: If you open a PR within your repository, everything gets built twice. Oh well.

jobs:
  build:
    strategy:
      fail-fast: false
      matrix:
        include:
        - target: linux-exe
          os: ubuntu-latest
        - target: windows-exe
          os: ubuntu-latest
        - target: static-exe
          os: ubuntu-latest
        - target: osx-exe-bundle
          os: macos-latest
    runs-on: ${{ matrix.os }}

The “build” job is a matrix job, i.e. there are four variants, one for each of the different tttool builds, together with an indication of what kind of machine to run this on.

    - uses: actions/checkout@v2
    - uses: cachix/install-nix-action@v12

We begin by checking out the code and installing nix via the install-nix-action.

    - name: "Cachix: tttool"
      uses: cachix/cachix-action@v7
      with:
        name: tttool
        signingKey: '${{ secrets.CACHIX_SIGNING_KEY }}'

Then we configure our Cachix cache. This means that this job will use build products from the cache if possible, and it will also push new builds to the cache. This requires a secret key, which you get when setting up your Cachix cache. See the nix and Cachix tutorial for good instructions.

    - run: nix-build --arg checkMaterialization true -A ${{ matrix.target }}

Now we can actually run the build. We set checkMaterialization to true so that CI will tell us if we need to update these hashes.

    # work around https://github.com/actions/upload-artifact/issues/92
    - run: cp -RvL result upload
    - uses: actions/upload-artifact@v2
      with:
        name: tttool (${{ matrix.target }})
        path: upload/

For convenient access to build products, e.g. from pull requests, we store them as Github artifacts. They can then be downloaded from Github’s CI status page.

  test:
    runs-on: ubuntu-latest
    needs: build
    steps:
    - uses: actions/checkout@v2
    - uses: cachix/install-nix-action@v12
    - name: "Cachix: tttool"
      uses: cachix/cachix-action@v7
      with:
        name: tttool
        signingKey: '${{ secrets.CACHIX_SIGNING_KEY }}'
    - run: nix-build -A tests

The next job repeats the setup, but now runs the tests. Because of needs: build, it will not start before the build job has completed. This also means that it should get the actual tttool executable to test from our nix cache.

  check-cabal-freeze:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v2
    - uses: cachix/install-nix-action@v12
    - name: "Cachix: tttool"
      uses: cachix/cachix-action@v7
      with:
        name: tttool
        signingKey: '${{ secrets.CACHIX_SIGNING_KEY }}'
    - run: nix-build -A check-cabal-freeze

The same, but now running the check-cabal-freeze test mentioned above. Quite annoying to repeat the setup instructions for each job…

  package:
    runs-on: ubuntu-latest
    needs: build
    steps:
    - uses: actions/checkout@v2
    - uses: cachix/install-nix-action@v12
    - name: "Cachix: tttool"
      uses: cachix/cachix-action@v7
      with:
        name: tttool
        signingKey: '${{ secrets.CACHIX_SIGNING_KEY }}'

    - run: nix-build -A release-zip

    - run: unzip -d upload ./result/*.zip
    - uses: actions/upload-artifact@v2
      with:
        name: Release zip file
        path: upload

Finally, with the same setup, but slightly different artifact upload, we build the release zip file. Again, we wait for build to finish so that the built programs are in the nix cache. This is especially important since this runs on linux, so it cannot build the OSX binary and has to rely on the cache.

Note that we don't need to enable checkMaterialization again.

Annoyingly, the upload-artifact action insists on zipping the files you hand to it. A zip file that contains just a zipfile is kinda annoying, so I unpack the zipfile here before uploading the contents.

Conclusion

With this setup, when I do a release of tttool, I just bump the version numbers, wait for CI to finish building, run nix-build -A release-zip and upload result/tttool-n.m.zip. A single file that works on all target platforms. I have not yet automated making the actual release, but with one release per year this is fine.

Also, when trying out a new feature, I can easily create a branch or PR for that and grab the build products from Github’s CI, or ask people to try them out (e.g. to see if they fixed their bugs). Note, though, that you have to sign into Github before being able to download these artifacts.

One might think that this is a fairly hairy setup – finding the right combinations of various repositories so that cross-compilation works as intended. But thanks to nix's value propositions, this does work! The setup presented here was a remake of a setup I did two years ago, with a much less mature haskell.nix. Back then, I committed a fair number of generated files to git, and juggled more complex files … but once it worked, it kept working for two years. I was indeed insulated from upstream changes. I expect that this setup will also continue to work reliably, until I choose to upgrade it again. Hopefully, things will then be even simpler, and require fewer workarounds or manual interventions.

09 November, 2020 07:42PM by Joachim Breitner (mail@joachim-breitner.de)

November 08, 2020

hackergotchi for Sean Whitton

Sean Whitton

Combining repeat and repeat-complex-command

In Emacs, you can use C-x z to repeat the last command you input, and subsequently you can keep tapping the ‘z’ key to execute that command again and again. If the command took minibuffer input, however, you’ll be asked for that input again. For example, suppose you type M-z : to delete through the next colon character. If you want to keep going and delete through the next few colons, you would need to use C-x z : z : z : etc. which is pretty inconvenient. So there’s also C-x ESC ESC RET or C-x M-: RET, which will repeat the last command which took minibuffer input, as if you’d given it the same minibuffer input. So you could use M-z : C-x M-: RET C-x M-: RET etc., but then you might as well just keep typing M-z : over and over. It’s also quite inconvenient to have to remember whether you need to use C-x z or C-x M-: RET.

I wanted to come up with a single command which would choose the correct repetition method. It turns out it’s a bit involved, but here’s what I came up with. You can use this under the GPL-3 or any later version published by the FSF. Assumes lexical binding is turned on for the file you have this in.

;; Adapted from `repeat-complex-command' as of November 2020
(autoload 'repeat-message "repeat")
(defun spw/repeat-complex-command-immediately (arg)
  "Like `repeat-complex-command&apos followed immediately by RET."
  (interactive "p")
  (if-let ((newcmd (nth (1- arg) command-history)))
      (progn
        (add-to-history 'command-history newcmd)
        (repeat-message "Repeating %S" newcmd)
        (apply #'funcall-interactively
               (car newcmd)
               (mapcar (lambda (e) (eval e t)) (cdr newcmd))))
    (if command-history
        (error "Argument %d is beyond length of command history" arg)
      (error "There are no previous complex commands to repeat"))))

(let (real-last-repeatable-command)
  (defun spw/repeat-or-repeat-complex-command-immediately ()
    "Call `repeat&apos or `spw/repeat-complex-command-immediately&apos as appropriate.

Note that no prefix argument is accepted because this has
different meanings for `repeat' and for
`spw/repeat-complex-command-immediately', so that might cause surprises."
    (interactive)
    (if (eq last-repeatable-command this-command)
        (setq last-repeatable-command real-last-repeatable-command)
      (setq real-last-repeatable-command last-repeatable-command))
    (if (eq last-repeatable-command (caar command-history))
        (spw/repeat-complex-command-immediately 1)
      (repeat nil))))

;; `suspend-frame' is bound to both C-x C-z and C-z
(global-set-key "\C-z" #'spw/repeat-or-repeat-complex-command-immediately)

08 November, 2020 09:16PM

Ian Jackson

Gazebo out of scaffolding

Today we completed our gazebo, which we designed and built out of scaffolding:

Picture of gazebo

Scaffolding is fairly expensive but building things out of it is enormous fun! You can see a complete sequence of the build process, including pictures of the "engineering maquette", at https://www.chiark.greenend.org.uk/~ijackson/2020/scaffold/

Post-lockdown maybe I will build a climbing wall or something out of it...

edited 2020-11-08 20:44Z to fix img url following hosting reorg

08 November, 2020 08:44PM

hackergotchi for Kentaro Hayashi

Kentaro Hayashi

debexpo: adding "Already in Debian" field for packages list

I've sent a merge request to show an "Already in Debian" column in the packages list on mentors.debian.net.

At first I used an emoji, but for consistency it has been changed to "Yes" or "No".

This feature is not fully merged yet, but it should make it easier for sponsors to see at a glance whether a package is already in Debian or not.

Screenshot: the "Already in Debian" column

08 November, 2020 06:16AM

Russell Coker

November 06, 2020

hackergotchi for Steinar H. Gunderson

Steinar H. Gunderson

plocate in backports

plocate 1.0.7 hit Debian backports.org today, which means that it's now available for use on Debian stable (and machines with zillions of files are not unlikely to run stable). (The package page still says 1.0.5, but 1.0.7 really is up.)

The other big change from 1.0.5 is that plocate now escapes potentially dangerous filenames, like modern versions of GNU ls do. I'm not overly happy about this, but it's really hard not to do it; any user can create a publicly-viewable file on the system, and allowing them to sneak in arbitrary escape sequences is just too much. Most terminals should be good by now, but there have been many in the past with potentially really dangerous behavior (like executing arbitrary commands!), and just being able to mess up another user's terminal isn't a good idea. Most users should never really see filenames being escaped, but if you have a filename with \n or other nonprintable characters in it, it will be escaped using bash's $'foo\nbar' syntax.
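
As a rough illustration of the kind of escaping being described (this is not plocate's actual code, just a sketch of the idea in Python; the exact escape forms plocate chooses may differ):

def escape_filename(name: str) -> str:
    # Printable names pass through untouched; anything else is wrapped
    # in bash's $'...' quoting, with control characters written out as
    # backslash escapes so they never reach the terminal raw.
    if all(c.isprintable() for c in name):
        return name
    special = {"\n": "\\n", "\t": "\\t", "\r": "\\r", "\\": "\\\\", "'": "\\'"}
    out = []
    for c in name:
        if c in special:
            out.append(special[c])
        elif c.isprintable():
            out.append(c)
        else:
            out.append("\\x%02x" % ord(c))
    return "$'" + "".join(out) + "'"

print(escape_filename("foo\nbar"))  # prints $'foo\nbar'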

So, anyone up for uploading to Fedora? :-)

06 November, 2020 04:15PM

hackergotchi for Louis-Philippe Véronneau

Louis-Philippe Véronneau

Book Review: Working in Public by Nadia Eghbal

I have a lot of respect for Nadia Eghbal, partly because I can't help but be jealous of her work on the economics of Free Software1. If you are not already familiar with Eghbal, she is the author of Roads and Bridges: The Unseen Labor Behind Our Digital Infrastructure, a great technical report published for the Ford Foundation in 2016. You may also have caught her excellent keynote at LCA 2017, entitled Consider the Maintainer.

Her latest book, Working in Public: The Making and Maintenance of Open Source Software, published by Stripe Press a few months ago, is a great read and if this topic interests you, I highly recommend it.

The book itself is simply gorgeous; bright orange, textured hardcover binding, thick paper, wonderful typesetting — it has everything to please. Well, nearly everything. Sadly, it is only available on Amazon, exclusively in the United States. A real let down for a book on Free and Open Source Software.

The book is divided in five chapters, namely:

  1. Github as a Platform
  2. The Structure of an Open Source Project
  3. Roles, Incentives and Relationships
  4. The Work Required by Software
  5. Managing the Costs of Production

A picture of the book cover

Contrary to what I was expecting, the book feels more like an extension of the LCA keynote I previously mentioned than Roads and Bridges. Indeed, as made apparent by the following quote, Eghbal doesn't believe funding to be the primary problem of FOSS anymore:

We still don't have a common understanding about who's doing the work, why they do it, and what work needs to be done. Only when we understand the underlying behavioral dynamics of open source today, and how it differs from its early origins, can we figure out where money fits in. Otherwise, we're just flinging wet paper towels at a brick wall, hoping that something sticks. — p.184

That is to say, the behavior of maintainers and the challenges they face — not the eternal money problem — is the real topic of this book. And it feels refreshing. When was the last time you read something on the economics of Free Software without it being mostly about which licences projects should pick and how business models can be tacked onto them? I certainly can't remember.

To be clear, I'm not sure I agree with Eghbal on this. Her having worked at Github for a few years and having interviewed mostly people in the Ruby on Rails and Javascript communities certainly shows in the form of a strong selection bias. As she herself admits, this is a book on how software on Github is produced. As much as this choice irks me (the Free Software community certainly cannot be reduced to Github), this exercise had the merit of forcing me to look at my own selection biases.

As such, reading Working in Public did to me something I wasn't expecting it to do: it broke my Free Software echo chamber. Although I consider myself very familiar with the world of Free and Open Source Software, I now understand my somewhat ill-advised contempt for certain programming languages — mostly JS — skewed my understanding of what FOSS in 2020 really is.

My Free Software world very much revolves around Debian, a project with a strong and opinionated view of Free Software, rooted in a historical and political understanding of the term. This, Eghbal argues, is not the case for a large swath of developers anymore. They are The Github Generation, people attached to Github as a platform first and foremost, and who feel "Open Source" is just a convenient way to make things.

Although I could intellectualise this, before reading the book, I didn't really grok how communities akin to npm have been reshaping the modern FOSS ecosystem and how different they are from Debian itself. To be honest, I am not sure I like this tangent and it is certainly part of the reason why I had a tendency to dismiss it as a fringe movement I could safely ignore.

Thanks to Nadia Eghbal, I come out of this reading more humble and certainly reminded that FOSS' heterogeneity is real and should not be idly dismissed. This book is rich in content and although I could go on (my personal notes clock in at around 2000 words and I certainly disagree with a number of things), I'll stop here for now. Go and grab a copy already!


  1. She insists on using the term open source, but I won't :) 

06 November, 2020 05:00AM by Louis-Philippe Véronneau

Sandro Tosi

QNAP firmware 4.5.1.1465: disable ssh management menu

as a good boy i just upgraded my QNAP NAS to the latest available firmware, 4.5.1.1465, but after the reboot there was an ugly surprise awaiting me

once i ssh'd into the box to do my stuff, instead of a familiar bash prompt i'm greeted by a management menu that allows me to perform some basic management tasks or quit it and go back to the shell. i don't really need this menu (in particular because i have automations that regularly ssh into the box and they are not meant to be interactive).

to disable it: edit /etc/profile and comment out the line "[[ "admin" = "$USER" ]] && /sbin/qts-console-mgmt -f" (you can judge me later for sshing as root)

06 November, 2020 02:42AM by Sandro Tosi (noreply@blogger.com)

November 05, 2020

hackergotchi for Mike Gabriel

Mike Gabriel

Welcome, Fre(i)e Software GmbH

Last week I received the official notice: There is now a German company named "Fre(i)e Software GmbH" registered with the German Trade Register.

Founding a New Company

Over the past months I have put my energy into founding a new company. As a freelancing IT consultant, I started running into the limitation that other companies have strict policies forbidding cooperation with one-person businesses (Personengesellschaften).

Thus, the requirement for setting up a GmbH business came onto my agenda. I will move some of my business activities into this new company, starting next year.

Policy Ideas

The "Fre(i)e Software GmbH" will be a platform to facilitate the growth and spreading of Free Software on this planet.

Here are some first ideas for company policies:

  • The idea is to bring together teams of developers and consultants that provide the highest expertise in FLOSS.

  • Everything this company will do will finally (or already during the development cycles) be published under some sort of free software / content license (for software, ideally a copyleft license).

  • Staff members will work and live across Europe; freelancers may live in any country that German businesses may do business with.

  • Ideally, staff members and freelancers work on projects that they can identify themselves with, projects that they love.

  • Software development and software design is an art. In the company we will honour this. We will be artists.

  • In software development, we will enroll our customers in non-CLA FLOSS copyright policies: developers can personally become copyright holders of the code projects they work on. This will strengthen the free nature of the FLOSS-licensed code brought forth in the company.

  • The Fre(i)e Software GmbH will be a business focussing on sustainability and sufficiency. We will be gentle to our planet. We won't work on projects that create artificial needs.

  • We all will be experts in communication. We all will continually work on improving our communication skills.

  • Integrity shall be a virtue to strive for in the company.

  • We will be honest with ourselves and our customers about the mistakes we make and the misassumptions we have.

  • We will honour and support diversity.

This is all pretty fresh. I'll be happy to hear your feedback, ideas and worries. If you are interested in joining the company, please let me know. If you are interested in supporting a company with such values, please also let me know.

light+love
Mike Gabriel (aka sunweaver)

05 November, 2020 11:38AM by sunweaver

hackergotchi for Jonathan Dowland

Jonathan Dowland

PhD Year 3 progression

I'm now into my 4th calendar year of my part-time PhD, corresponding to half-way through Stage 2, or Year 2 for a full-time student. Here's a report I wrote that describes what I did last year and what I'm working on going forward.

year3 progression report.pdf (223K PDF)

05 November, 2020 09:33AM

hackergotchi for Junichi Uekawa

Junichi Uekawa

Sent a pull request to document sshfs slave mode.

Sent a pull request to document sshfs slave mode. Every time I try to do it I forget, so at least I have a document about how to do it. Also changed the name from slave to passive, but I don't think that will help me remember... Not feeling particularly creative about the name.

05 November, 2020 01:34AM by Junichi Uekawa