May 27, 2022

Reproducible Builds (diffoscope)

diffoscope 214 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 214. This version includes the following changes:

[ Chris Lamb ]
* Support both python-argcomplete 1.x and 2.x.

[ Vagrant Cascadian ]
* Add external tool on GNU Guix for xb-tool.

You can find out more by visiting the project homepage.

27 May, 2022 12:00AM

May 26, 2022

hackergotchi for Sergio Talens-Oliag

Sergio Talens-Oliag

New Blog Config

As promised, in this post I’m going to explain how I’ve configured this blog using hugo, asciidoctor and the PaperMod theme, how I publish it using nginx, how I’ve integrated the remark42 comment system and how I’ve automated its publication using gitea and json2file-go.

It is a long post, but I hope that at least parts of it will be interesting to some; feel free to skip it if that is not your case … 😉

Hugo Configuration

Theme settings

The site uses the PaperMod theme and, as I’m using asciidoctor to publish my content, I’ve adjusted the settings to improve how things are rendered with it.

The current config.yml file is the one shown below (some of the settings are probably not required or not being used right now, but I’m including the full file so this post always has the latest version of it):

config.yml
baseURL: https://blogops.mixinet.net/
title: Mixinet BlogOps
paginate: 5
theme: PaperMod
destination: public/
enableInlineShortcodes: true
enableRobotsTXT: true
buildDrafts: false
buildFuture: false
buildExpired: false
enableEmoji: true
pygmentsUseClasses: true
minify:
  disableXML: true
  minifyOutput: true
languages:
  en:
    languageName: "English"
    description: "Mixinet BlogOps - https://blogops.mixinet.net/"
    author: "Sergio Talens-Oliag"
    weight: 1
    title: Mixinet BlogOps
    homeInfoParams:
      Title: "Sergio Talens-Oliag Technical Blog"
      Content: >
        ![Mixinet BlogOps](/images/mixinet-blogops.png)
    taxonomies:
      category: categories
      tag: tags
      series: series
    menu:
      main:
        - name: Archive
          url: archives
          weight: 5
        - name: Categories
          url: categories/
          weight: 10
        - name: Tags
          url: tags/
          weight: 10
        - name: Search
          url: search/
          weight: 15
outputs:
  home:
    - HTML
    - RSS
    - JSON
params:
  env: production
  defaultTheme: light
  disableThemeToggle: false
  ShowShareButtons: true
  ShowReadingTime: true
  disableSpecial1stPost: true
  disableHLJS: true
  displayFullLangName: true
  ShowPostNavLinks: true
  ShowBreadCrumbs: true
  ShowCodeCopyButtons: true
  ShowRssButtonInSectionTermList: true
  ShowFullTextinRSS: true
  ShowToc: true
  TocOpen: false
  comments: true
  remark42SiteID: "blogops"
  remark42Url: "/remark42"
  profileMode:
    enabled: false
    title: Sergio Talens-Oliag Technical Blog
    imageUrl: "/images/mixinet-blogops.png"
    imageTitle: Mixinet BlogOps
    buttons:
      - name: Archives
        url: archives
      - name: Categories
        url: categories
      - name: Tags
        url: tags
  socialIcons:
    - name: CV
      url: "https://www.uv.es/~sto/cv/"
    - name: Debian
      url: "https://people.debian.org/~sto/"
    - name: GitHub
      url: "https://github.com/sto/"
    - name: GitLab
      url: "https://gitlab.com/stalens/"
    - name: Linkedin
      url: "https://www.linkedin.com/in/sergio-talens-oliag/"
    - name: RSS
      url: "index.xml"
  assets:
    disableHLJS: true
    favicon: "/favicon.ico"
    favicon16x16:  "/favicon-16x16.png"
    favicon32x32:  "/favicon-32x32.png"
    apple_touch_icon:  "/apple-touch-icon.png"
    safari_pinned_tab:  "/safari-pinned-tab.svg"
  fuseOpts:
    isCaseSensitive: false
    shouldSort: true
    location: 0
    distance: 1000
    threshold: 0.4
    minMatchCharLength: 0
    keys: ["title", "permalink", "summary", "content"]
markup:
  asciidocExt:
    attributes: {}
    backend: html5s
    extensions: ['asciidoctor-html5s','asciidoctor-diagram']
    failureLevel: fatal
    noHeaderOrFooter: true
    preserveTOC: false
    safeMode: unsafe
    sectionNumbers: false
    trace: false
    verbose: false
    workingFolderCurrent: true
privacy:
  vimeo:
    disabled: false
    simple: true
  twitter:
    disabled: false
    enableDNT: true
    simple: true
  instagram:
    disabled: false
    simple: true
  youtube:
    disabled: false
    privacyEnhanced: true
services:
  instagram:
    disableInlineCSS: true
  twitter:
    disableInlineCSS: true
security:
  exec:
    allow:
      - '^asciidoctor$'
      - '^dart-sass-embedded$'
      - '^go$'
      - '^npx$'
      - '^postcss$'

Some notes about the settings:

  • disableHLJS and assets.disableHLJS are set to true; we plan to use rouge for the adoc highlighting and the inclusion of the hljs assets adds styles that collide with the ones used by rouge.
  • ShowToc is set to true and TocOpen is set to false so the ToC appears collapsed initially. My plan was to use the asciidoctor ToC, but after trying both I think the theme one looks nice and doesn’t need style adjustments. It does have some issues with the html5s processor (the admonition titles use <h6> and are shown on the ToC, which is weird); to fix that I’ve copied layouts/partials/toc.html to my site repository and changed the range of headings to end at 5 instead of 6 (in fact 5 still seems like a lot, but as I don’t think I’ll use that heading level in posts it doesn’t really matter).
  • The params.profileMode values are adjusted, but for now I’ve left it disabled by setting params.profileMode.enabled to false; instead I’ve set homeInfoParams to show more or less the same content with the latest posts under it (I’ve added some styles to my custom.css style sheet to center the text and image of the first post to match the look and feel of the profile).
  • In the asciidocExt section I’ve set the backend to html5s, added the asciidoctor-html5s and asciidoctor-diagram extensions to asciidoctor and set workingFolderCurrent to true to make asciidoctor-diagram work right (I haven’t tested it yet).

Theme customisations

To write in asciidoctor using the html5s processor I’ve added some files to the assets/css/extended directory:

  1. As said before, I’ve added the file assets/css/extended/custom.css to make the homeInfoParams content look like the profile page and I’ve also tweaked some theme styles a little to make things look better with the html5s output:

    custom.css
    /* Fix first entry alignment to make it look like the profile */
    .first-entry { text-align: center; }
    .first-entry img { display: inline; }
    /**
     * Remove margin for .post-content code and reduce padding to make it look
     * better with the asciidoctor html5s output.
     **/
    .post-content code { margin: auto 0; padding: 4px; }
  2. I’ve also added the file assets/css/extended/adoc.css with some styles taken from the asciidoctor-default.css, see this blog post about the original file; mine is the same after formatting it with css-beautify and editing it to use variables for the colors to support light and dark themes:

    adoc.css
    /* AsciiDoctor*/
    table {
        border-collapse: collapse;
        border-spacing: 0
    }
    
    .admonitionblock>table {
        border-collapse: separate;
        border: 0;
        background: none;
        width: 100%
    }
    
    .admonitionblock>table td.icon {
        text-align: center;
        width: 80px
    }
    
    .admonitionblock>table td.icon img {
        max-width: none
    }
    
    .admonitionblock>table td.icon .title {
        font-weight: bold;
        font-family: "Open Sans", "DejaVu Sans", sans-serif;
        text-transform: uppercase
    }
    
    .admonitionblock>table td.content {
        padding-left: 1.125em;
        padding-right: 1.25em;
        border-left: 1px solid #ddddd8;
        color: var(--primary)
    }
    
    .admonitionblock>table td.content>:last-child>:last-child {
        margin-bottom: 0
    }
    
    .admonitionblock td.icon [class^="fa icon-"] {
        font-size: 2.5em;
        text-shadow: 1px 1px 2px var(--secondary);
        cursor: default
    }
    
    .admonitionblock td.icon .icon-note::before {
        content: "\f05a";
        color: var(--icon-note-color)
    }
    
    .admonitionblock td.icon .icon-tip::before {
        content: "\f0eb";
        color: var(--icon-tip-color)
    }
    
    .admonitionblock td.icon .icon-warning::before {
        content: "\f071";
        color: var(--icon-warning-color)
    }
    
    .admonitionblock td.icon .icon-caution::before {
        content: "\f06d";
        color: var(--icon-caution-color)
    }
    
    .admonitionblock td.icon .icon-important::before {
        content: "\f06a";
        color: var(--icon-important-color)
    }
    
    .conum[data-value] {
        display: inline-block;
        color: #fff !important;
        background-color: rgba(100, 100, 0, .8);
        -webkit-border-radius: 100px;
        border-radius: 100px;
        text-align: center;
        font-size: .75em;
        width: 1.67em;
        height: 1.67em;
        line-height: 1.67em;
        font-family: "Open Sans", "DejaVu Sans", sans-serif;
        font-style: normal;
        font-weight: bold
    }
    
    .conum[data-value] * {
        color: #fff !important
    }
    
    .conum[data-value]+b {
        display: none
    }
    
    .conum[data-value]::after {
        content: attr(data-value)
    }
    
    pre .conum[data-value] {
        position: relative;
        top: -.125em
    }
    
    b.conum * {
        color: inherit !important
    }
    
    .conum:not([data-value]):empty {
        display: none
    }
  3. The previous file uses variables from a partial copy of the theme-vars.css file that changes the highlighted code background color and adds the color definitions used by the admonitions:

    theme-vars.css
    :root {
        /* Solarized base2 */
        /* --hljs-bg: rgb(238, 232, 213); */
        /* Solarized base3 */
        /* --hljs-bg: rgb(253, 246, 227); */
        /* Solarized base02 */
        --hljs-bg: rgb(7, 54, 66);
        /* Solarized base03 */
        /* --hljs-bg: rgb(0, 43, 54); */
        /* Default asciidoctor theme colors */
        --icon-note-color: #19407c;
        --icon-tip-color: var(--primary);
        --icon-warning-color: #bf6900;
        --icon-caution-color: #bf3400;
        --icon-important-color: #bf0000
    }
    
    .dark {
        --hljs-bg: rgb(7, 54, 66);
        /* Asciidoctor theme colors with tint for dark background */
        --icon-note-color: #3e7bd7;
        --icon-tip-color: var(--primary);
        --icon-warning-color: #ff8d03;
        --icon-caution-color: #ff7847;
        --icon-important-color: #ff3030
    }
  4. The previous styles use font-awesome, so I’ve downloaded the resources for version 4.7.0 (the one used by asciidoctor), storing font-awesome.css in the assets/css/extended dir (that way it is merged with the rest of the .css files) and copying the fonts to the static/assets/fonts/ dir (they will be served directly):

    FA_BASE_URL="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0"
    curl "$FA_BASE_URL/css/font-awesome.css" \
      > assets/css/extended/font-awesome.css
    for f in FontAwesome.otf fontawesome-webfont.eot \
      fontawesome-webfont.svg fontawesome-webfont.ttf \
      fontawesome-webfont.woff fontawesome-webfont.woff2; do
        curl "$FA_BASE_URL/fonts/$f" > "static/assets/fonts/$f"
    done
  5. As already said, the default highlighter (hljs) is disabled because its styles collide with the rouge ones, so we need a css file to do the highlight styling; as rouge provides a way to export its themes, I’ve created the assets/css/extended/rouge.css file with the thankful_eyes theme:

    rougify style thankful_eyes > assets/css/extended/rouge.css
  6. To support the use of the html5s backend with admonitions I’ve added a variation of the example found on this blog post to assets/js/adoc-admonitions.js:

    adoc-admonitions.js
    // replace the default admonitions block with a table that uses a format
    // similar to the standard asciidoctor ... as we are using fa-icons here there
    // is no need to add the icons: font entry on the document.
    window.addEventListener('load', function () {
      const admonitions = document.getElementsByClassName('admonition-block')
      for (let i = admonitions.length - 1; i >= 0; i--) {
        const elm = admonitions[i]
        const type = elm.classList[1]
        const title = elm.getElementsByClassName('block-title')[0];
        const label = title.getElementsByClassName('title-label')[0]
          .innerHTML.slice(0, -1);
        elm.removeChild(elm.getElementsByClassName('block-title')[0]);
        const text = elm.innerHTML
        const parent = elm.parentNode
        const tempDiv = document.createElement('div')
        tempDiv.innerHTML = `<div class="admonitionblock ${type}">
        <table>
          <tbody>
            <tr>
              <td class="icon">
                <i class="fa icon-${type}" title="${label}"></i>
              </td>
              <td class="content">
                ${text}
              </td>
            </tr>
          </tbody>
        </table>
      </div>`
        const input = tempDiv.childNodes[0]
        parent.replaceChild(input, elm)
      }
    })

    and enabled its minified use in the layouts/partials/extend_footer.html file by adding the following lines to it:

    {{- $admonitions := slice (resources.Get "js/adoc-admonitions.js")
      | resources.Concat "assets/js/adoc-admonitions.js" | minify | fingerprint }}
    <script defer crossorigin="anonymous" src="{{ $admonitions.RelPermalink }}"
      integrity="{{ $admonitions.Data.Integrity }}"></script>

Remark42 configuration

To integrate Remark42 with the PaperMod theme I’ve created the file layouts/partials/comments.html with the following content based on the remark42 documentation, including extra code to sync the dark/light setting with the one set on the site:

comments.html
<div id="remark42"></div>
<script>
  var remark_config = {
    host: {{ .Site.Params.remark42Url }},
    site_id: {{ .Site.Params.remark42SiteID }},
    url: {{ .Permalink }},
    locale: {{ .Site.Language.Lang }}
  };
  (function(c) {
    /* Adjust the theme using the local-storage pref-theme if set */
    if (localStorage.getItem("pref-theme") === "dark") {
      remark_config.theme = "dark";
    } else if (localStorage.getItem("pref-theme") === "light") {
      remark_config.theme = "light";
    }
    /* Add remark42 widget */
    for(var i = 0; i < c.length; i++){
      var d = document, s = d.createElement('script');
      s.src = remark_config.host + '/web/' + c[i] +'.js';
      s.defer = true;
      (d.head || d.body).appendChild(s);
    }
  })(remark_config.components || ['embed']);
</script>

In development I use it with anonymous comments enabled, but to avoid spam the production site uses social logins (for now I’ve only enabled GitHub & Google; if someone requests additional services I’ll look into them, but those were the easy ones for me initially).

To support theme switching with remark42 I’ve also added the following inside the layouts/partials/extend_footer.html file:

{{- if (not site.Params.disableThemeToggle) }}
<script>
/* Function to change theme when the toggle button is pressed */
document.getElementById("theme-toggle").addEventListener("click", () => {
  if (typeof window.REMARK42 != "undefined") {
    if (document.body.className.includes('dark')) {
      window.REMARK42.changeTheme('light');
    } else {
      window.REMARK42.changeTheme('dark');
    }
  }
});
</script>
{{- end }}

With this code, when the theme-toggle button is pressed we change the remark42 theme before the PaperMod one (that’s only needed here; on page loads the remark42 theme is synced with the main one by the code from layouts/partials/comments.html shown earlier).

Development setup

To preview the site on my laptop I’m using docker-compose with the following configuration:

docker-compose.yaml
version: "2"
services:
  hugo:
    build:
      context: ./docker/hugo-adoc
      dockerfile: ./Dockerfile
    image: sto/hugo-adoc
    container_name: hugo-adoc-blogops
    restart: always
    volumes:
      - .:/documents
    command: server --bind 0.0.0.0 -D -F
    user: ${APP_UID}:${APP_GID}
  nginx:
    image: nginx:latest
    container_name: nginx-blogops
    restart: always
    volumes:
      - ./nginx/default.conf:/etc/nginx/conf.d/default.conf
    ports:
      -  1313:1313
  remark42:
    build:
      context: ./docker/remark42
      dockerfile: ./Dockerfile
    image: sto/remark42
    container_name: remark42-blogops
    restart: always
    env_file:
      - ./.env
      - ./remark42/env.dev
    volumes:
      - ./remark42/var.dev:/srv/var

To run it properly we have to create a .env file with the current user ID and GID in the APP_UID and APP_GID variables (if we don’t, the files can end up owned by a user other than the one running the services):

$ echo "APP_UID=$(id -u)\nAPP_GID=$(id -g)" > .env

The Dockerfile used to generate the sto/hugo-adoc image is:

Dockerfile
FROM asciidoctor/docker-asciidoctor:latest
RUN gem install --no-document asciidoctor-html5s &&\
 apk update && apk add --no-cache curl libc6-compat &&\
 repo_path="gohugoio/hugo" &&\
 api_url="https://api.github.com/repos/$repo_path/releases/latest" &&\
 download_url="$(\
  curl -sL "$api_url" |\
  sed -n "s/^.*download_url\": \"\\(.*.extended.*Linux-64bit.tar.gz\)\"/\1/p"\
 )" &&\
 curl -sL "$download_url" -o /tmp/hugo.tgz &&\
 tar xf /tmp/hugo.tgz hugo &&\
 install hugo /usr/bin/ &&\
 rm -f hugo /tmp/hugo.tgz &&\
 /usr/bin/hugo version &&\
 apk del curl && rm -rf /var/cache/apk/*
# Expose port for live server
EXPOSE 1313
ENTRYPOINT ["/usr/bin/hugo"]
CMD [""]

If you review it you will see that I’m using the docker-asciidoctor image as the base; the idea is that this image already has everything I need to work with asciidoctor, and to use hugo I only need to download the binary from its latest release at GitHub (as the image is based on alpine we also need to install the libc6-compat package, but once that is done things have been working fine for me so far).

The image does not launch the server by default because I don’t want it to; in fact I use the same docker-compose file to publish the site in production, simply running the container without the arguments passed in the compose file (see later).

When running the containers with docker-compose up (or docker compose up if you have the docker-compose-plugin package installed) we also launch an nginx container and the remark42 service so we can test everything together.
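
As a reference, a local preview session with this setup looks more or less like the following (a sketch that assumes the repository checkout is called blogops and the images have not been built yet):

$ cd blogops
$ printf 'APP_UID=%s\nAPP_GID=%s\n' "$(id -u)" "$(id -g)" > .env
$ docker compose build
$ docker compose up

Once the containers are up the site is served on http://localhost:1313/ through nginx, and the comments widget talks to remark42 under the /remark42/ path of the same server.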

The Dockerfile for the remark42 image is the original one with an updated version of the init.sh script:

Dockerfile
FROM umputun/remark42:latest
COPY init.sh /init.sh

The updated init.sh is similar to the original, but allows us to use an APP_GID variable and updates the /etc/group file of the container so the files get the right user and group (with the original script the group is always 1001):

init.sh
#!/sbin/dinit /bin/sh

uid="$(id -u)"

if [ "${uid}" -eq "0" ]; then
  echo "init container"

  # set container's time zone
  cp "/usr/share/zoneinfo/${TIME_ZONE}" /etc/localtime
  echo "${TIME_ZONE}" >/etc/timezone
  echo "set timezone ${TIME_ZONE} ($(date))"

  # set UID & GID for the app
  if [ "${APP_UID}" ] || [ "${APP_GID}" ]; then
    [ "${APP_UID}" ] || APP_UID="1001"
    [ "${APP_GID}" ] || APP_GID="${APP_UID}"
    echo "set custom APP_UID=${APP_UID} & APP_GID=${APP_GID}"
    sed -i "s/^app:x:1001:1001:/app:x:${APP_UID}:${APP_GID}:/" /etc/passwd
    sed -i "s/^app:x:1001:/app:x:${APP_GID}:/" /etc/group
  else
    echo "custom APP_UID and/or APP_GID not defined, using 1001:1001"
  fi
  chown -R app:app /srv /home/app
fi

echo "prepare environment"

# replace {% REMARK_URL %} by content of REMARK_URL variable
find /srv -regex '.*\.\(html\|js\|mjs\)$' -print \
  -exec sed -i "s|{% REMARK_URL %}|${REMARK_URL}|g" {} \;

if [ -n "${SITE_ID}" ]; then
  #replace "site_id: 'remark'" by SITE_ID
  sed -i "s|'remark'|'${SITE_ID}'|g" /srv/web/*.html
fi

echo "execute \"$*\""
if [ "${uid}" -eq "0" ]; then
  exec su-exec app "$@"
else
  exec "$@"
fi

The environment file used with remark42 for development is quite minimal:

env.dev
TIME_ZONE=Europe/Madrid
REMARK_URL=http://localhost:1313/remark42
SITE=blogops
SECRET=123456
ADMIN_SHARED_ID=sto
AUTH_ANON=true
EMOJI=true

And the nginx/default.conf file used to publish the service locally is simple too:

default.conf
server { 
 listen 1313;
 server_name localhost;
 location / {
    proxy_pass http://hugo:1313;
    proxy_set_header Host $http_host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
 }
 location /remark42/ {
    rewrite /remark42/(.*) /$1 break;
    proxy_pass http://remark42:8080/;
    proxy_set_header Host $http_host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
  }
}

Production setup

The VM where I’m publishing the blog runs Debian GNU/Linux and uses binaries from local packages and applications packaged inside containers.

To run the containers I’m using docker-ce (I could have used podman instead, but I already had docker installed on the machine, so I stayed with it).

The binaries used in this project are included in the following packages from the main Debian repository:

  • git to clone & pull the repository,
  • jq to parse json files from shell scripts,
  • json2file-go to save the webhook messages to files,
  • inotify-tools to detect when new files are stored by json2file-go and launch scripts to process them,
  • nginx to publish the site using HTTPS and work as proxy for json2file-go and remark42 (I run it using a container),
  • task-spooler to queue the scripts that update the deployment.

And I’m using docker and docker compose from the Debian packages provided by the docker repository:

  • docker-ce to run the containers,
  • docker-compose-plugin to run docker compose (it is a plugin, so there is no dash in the command name).

Repository checkout

To manage the git repository I’ve created a deploy key, added it to gitea and cloned the project into the /srv/blogops path (that path is owned by a regular user that has permission to run docker, as I said before).
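
A sketch of that setup could be something like the following (the key file name is arbitrary and the ssh clone URL is an assumption, as only the https one appears later in this post):

$ ssh-keygen -t ed25519 -f ~/.ssh/blogops-deploy -C "blogops deploy key"
# Add ~/.ssh/blogops-deploy.pub as a deploy key on the gitea project, then:
$ git clone git@gitea.mixinet.net:mixinet/blogops.git /srv/blogops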

Compiling the site with hugo

To compile the site we use the docker-compose file seen before; to be able to run it we first build the container images and, once we have them, we launch hugo using docker compose run:

$ cd /srv/blogops
$ git pull
$ docker compose build
$ if [ -d "./public" ]; then rm -rf ./public; fi
$ docker compose run hugo --

The compilation leaves the static HTML on /srv/blogops/public (we remove the directory first because hugo does not clean the destination folder as jekyll does).

The deploy script re-generates the site as described and moves the public directory to its final place for publishing.

Running remark42 with docker

On the /srv/blogops/remark42 folder I have the following docker-compose.yml:

docker-compose.yml
version: "2"
services:
  remark42:
    build:
      context: ../docker/remark42
      dockerfile: ./Dockerfile
    image: sto/remark42
    env_file:
      - ../.env
      - ./env.prod
    container_name: remark42
    restart: always
    volumes:
      - ./var.prod:/srv/var
    ports:
      - 127.0.0.1:8042:8080

The ../.env file is loaded to get the APP_UID and APP_GID variables used by my version of the init.sh script to adjust file permissions, and the env.prod file contains the rest of the settings for remark42, including the social network tokens (see the remark42 documentation for the available parameters; I don’t include my configuration here because some of the values are secrets).
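
Just to give an idea of its shape, a hypothetical env.prod would look similar to the development one plus the social login credentials; the values below are placeholders and the parameter names should be double checked against the remark42 documentation:

TIME_ZONE=Europe/Madrid
REMARK_URL=https://blogops.mixinet.net/remark42
SITE=blogops
SECRET=a-long-random-secret
ADMIN_SHARED_ID=sto
AUTH_ANON=false
AUTH_GITHUB_CID=github-oauth-client-id
AUTH_GITHUB_CSEC=github-oauth-client-secret
AUTH_GOOGLE_CID=google-oauth-client-id
AUTH_GOOGLE_CSEC=google-oauth-client-secret
EMOJI=true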

Nginx configuration

The nginx configuration for the blogops.mixinet.net site is as simple as:

server {
  listen 443 ssl http2;
  server_name blogops.mixinet.net;
  ssl_certificate /etc/letsencrypt/live/blogops.mixinet.net/fullchain.pem;
  ssl_certificate_key /etc/letsencrypt/live/blogops.mixinet.net/privkey.pem;
  include /etc/letsencrypt/options-ssl-nginx.conf;
  ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem;
  access_log /var/log/nginx/blogops.mixinet.net-443.access.log;
  error_log  /var/log/nginx/blogops.mixinet.net-443.error.log;
  root /srv/blogops/nginx/public_html;
  location / {
    try_files $uri $uri/ =404;
  }
  include /srv/blogops/nginx/remark42.conf;
}
server {
  listen 80 ;
  listen [::]:80 ;
  server_name blogops.mixinet.net;
  access_log /var/log/nginx/blogops.mixinet.net-80.access.log;
  error_log  /var/log/nginx/blogops.mixinet.net-80.error.log;
  if ($host = blogops.mixinet.net) {
    return 301 https://$host$request_uri;
  }
  return 404;
}

In this configuration the certificates are managed by certbot and the server root directory is /srv/blogops/nginx/public_html and not /srv/blogops/public; the reason is that I want to be able to compile without affecting the running site: the deployment script generates the site on /srv/blogops/public and, if everything works, we rename folders to do the switch, making the change feel almost atomic.
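
A minimal sketch of that switch, using the same paths (the real version, with logging and error handling, is part of the webhook script shown later in this post):

TS="$(date +%Y%m%d-%H%M%S)"
# Keep the previously published version around with a timestamped name ...
mv /srv/blogops/nginx/public_html "/srv/blogops/nginx/public_html-$TS"
# ... and move the freshly compiled site into its place
mv /srv/blogops/public /srv/blogops/nginx/public_html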

json2file-go configuration

As I have a working WireGuard VPN between the machine running gitea at my home and the VM where the blog is served, I’m going to configure json2file-go to listen for connections on a high port, using a self-signed certificate and binding only to IP addresses reachable through the VPN.

To do it we create a systemd socket to run json2file-go and adjust its configuration to listen on a private IP (we use the FreeBind option in its definition so the service can start even when the IP is not available, that is, when the VPN is down).

The following script can be used to set up the json2file-go configuration:

setup-json2file.sh
#!/bin/sh

set -e

# ---------
# VARIABLES
# ---------

BASE_DIR="/srv/blogops/webhook"
J2F_DIR="$BASE_DIR/json2file"
TLS_DIR="$BASE_DIR/tls"

J2F_SERVICE_NAME="json2file-go"
J2F_SERVICE_DIR="/etc/systemd/system/json2file-go.service.d"
J2F_SERVICE_OVERRIDE="$J2F_SERVICE_DIR/override.conf"
J2F_SOCKET_DIR="/etc/systemd/system/json2file-go.socket.d"
J2F_SOCKET_OVERRIDE="$J2F_SOCKET_DIR/override.conf"

J2F_BASEDIR_FILE="/etc/json2file-go/basedir"
J2F_DIRLIST_FILE="/etc/json2file-go/dirlist"
J2F_CRT_FILE="/etc/json2file-go/certfile"
J2F_KEY_FILE="/etc/json2file-go/keyfile"
J2F_CRT_PATH="$TLS_DIR/crt.pem"
J2F_KEY_PATH="$TLS_DIR/key.pem"

# ----
# MAIN
# ----

# Install packages used with json2file for the blogops site
sudo apt update
sudo apt install -y json2file-go uuid
if [ -z "$(type mkcert)" ]; then
  sudo apt install -y mkcert
fi
sudo apt clean

# Configuration file values
J2F_USER="$(id -u)"
J2F_GROUP="$(id -g)"
J2F_DIRLIST="blogops:$(uuid)"
J2F_LISTEN_STREAM="172.31.31.1:4443"

# Configure json2file
[ -d "$J2F_DIR" ] || mkdir "$J2F_DIR"
sudo sh -c "echo '$J2F_DIR' >'$J2F_BASEDIR_FILE'"
[ -d "$TLS_DIR" ] || mkdir "$TLS_DIR"
if [ ! -f "$J2F_CRT_PATH" ] || [ ! -f "$J2F_KEY_PATH" ]; then
  mkcert -cert-file "$J2F_CRT_PATH" -key-file "$J2F_KEY_PATH" "$(hostname -f)"
fi
sudo sh -c "echo '$J2F_CRT_PATH' >'$J2F_CRT_FILE'"
sudo sh -c "echo '$J2F_KEY_PATH' >'$J2F_KEY_FILE'"
sudo sh -c "cat >'$J2F_DIRLIST_FILE'" <<EOF
$(echo "$J2F_DIRLIST" | tr ';' '\n')
EOF

# Service override
[ -d "$J2F_SERVICE_DIR" ] || sudo mkdir "$J2F_SERVICE_DIR"
sudo sh -c "cat >'$J2F_SERVICE_OVERRIDE'" <<EOF
[Service]
User=$J2F_USER
Group=$J2F_GROUP
EOF

# Socket override
[ -d "$J2F_SOCKET_DIR" ] || sudo mkdir "$J2F_SOCKET_DIR"
sudo sh -c "cat >'$J2F_SOCKET_OVERRIDE'" <<EOF
[Socket]
# Set FreeBind to listen on missing addresses (the VPN can be down sometimes)
FreeBind=true
# Set ListenStream to nothing to clear its value and add the new value later
ListenStream=
ListenStream=$J2F_LISTEN_STREAM
EOF

# Restart and enable service
sudo systemctl daemon-reload
sudo systemctl stop "$J2F_SERVICE_NAME"
sudo systemctl start "$J2F_SERVICE_NAME"
sudo systemctl enable "$J2F_SERVICE_NAME"

# ----
# vim: ts=2:sw=2:et:ai:sts=2
Warning:

The script uses mkcert to create the certificates; to install the package on bullseye the backports repository must be available.
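
For reference, one way of making it available on bullseye (assuming the standard Debian mirror layout) and installing the package from it is:

echo "deb http://deb.debian.org/debian bullseye-backports main" |
  sudo tee /etc/apt/sources.list.d/backports.list
sudo apt update
sudo apt install -y -t bullseye-backports mkcert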

Gitea configuration

To make gitea use our json2file-go server we go to the project and open the hooks/gitea/new page; once there we create a new webhook of type gitea, set the target URL to https://172.31.31.1:4443/blogops and put in the secret field the token generated with uuid by the setup script:

sed -n -e 's/blogops://p' /etc/json2file-go/dirlist

The rest of the settings can be left as they are:

  • Trigger on: Push events
  • Branch filter: *
Warning:

We are using an internal IP and a self-signed certificate, which means we have to make sure that the webhook section of our gitea server’s app.ini allows calls to that IP and skips the TLS verification (you can see the available options in the gitea documentation).

The [webhook] section of my server looks like this:

[webhook]
ALLOWED_HOST_LIST=private
SKIP_TLS_VERIFY=true

Once we have the webhook configured we can test it; if it works our json2file server will store the message as a file in the /srv/blogops/webhook/json2file/blogops/ folder.

The json2file spooler script

With the previous configuration our system is ready to receive webhook calls from gitea and store the messages as files, but we still have to process those files once they are saved on our machine.

One option would be to use a cronjob to look for new files, but we can do better on Linux using inotify: we will use the inotifywait command from inotify-tools to watch the json2file output directory and execute a script each time a new file is moved into it or closed after writing (the IN_MOVED_TO and IN_CLOSE_WRITE events).

To avoid concurrency problems we are going to use task-spooler to launch the scripts that process the webhooks with a single execution slot, so they are executed one by one in FIFO order.

The spooler script is this:

blogops-spooler.sh
#!/bin/sh

set -e

# ---------
# VARIABLES
# ---------

BASE_DIR="/srv/blogops/webhook"
BIN_DIR="$BASE_DIR/bin"
TSP_DIR="$BASE_DIR/tsp"

WEBHOOK_COMMAND="$BIN_DIR/blogops-webhook.sh"

# ---------
# FUNCTIONS
# ---------

queue_job() {
  echo "Queuing job to process file '$1'"
  TMPDIR="$TSP_DIR" TS_SLOTS="1" TS_MAXFINISHED="10" \
    tsp -n "$WEBHOOK_COMMAND" "$1"
}

# ----
# MAIN
# ----

INPUT_DIR="$1"
if [ ! -d "$INPUT_DIR" ]; then
  echo "Input directory '$INPUT_DIR' does not exist, aborting!"
  exit 1
fi

[ -d "$TSP_DIR" ] || mkdir "$TSP_DIR"

echo "Processing existing files under '$INPUT_DIR'"
find "$INPUT_DIR" -type f | sort | while read -r _filename; do
  queue_job "$_filename"
done

# Use inotifywait to watch for new files and queue them
echo "Watching for new files under '$INPUT_DIR'"
inotifywait -q -m -e close_write,moved_to --format "%w%f" -r "$INPUT_DIR" |
  while read -r _filename; do
    queue_job "$_filename"
  done

# ----
# vim: ts=2:sw=2:et:ai:sts=2

To run it as a daemon we install it as a systemd service using the following script:

setup-spooler.sh
#!/bin/sh

set -e

# ---------
# VARIABLES
# ---------

BASE_DIR="/srv/blogops/webhook"
BIN_DIR="$BASE_DIR/bin"
J2F_DIR="$BASE_DIR/json2file"

SPOOLER_COMMAND="$BIN_DIR/blogops-spooler.sh '$J2F_DIR'"
SPOOLER_SERVICE_NAME="blogops-j2f-spooler"
SPOOLER_SERVICE_FILE="/etc/systemd/system/$SPOOLER_SERVICE_NAME.service"

# Configuration file values
J2F_USER="$(id -u)"
J2F_GROUP="$(id -g)"

# ----
# MAIN
# ----

# Install packages used with the webhook processor
sudo apt update
sudo apt install -y inotify-tools jq task-spooler
sudo apt clean

# Configure process service
sudo sh -c "cat > $SPOOLER_SERVICE_FILE" <<EOF
[Install]
WantedBy=multi-user.target
[Unit]
Description=json2file processor for $J2F_USER
After=docker.service
[Service]
Type=simple
User=$J2F_USER
Group=$J2F_GROUP
ExecStart=$SPOOLER_COMMAND
EOF

# Restart and enable service
sudo systemctl daemon-reload
sudo systemctl stop "$SPOOLER_SERVICE_NAME" || true
sudo systemctl start "$SPOOLER_SERVICE_NAME"
sudo systemctl enable "$SPOOLER_SERVICE_NAME"

# ----
# vim: ts=2:sw=2:et:ai:sts=2

The gitea webhook processor

Finally, the script that processes the JSON files does the following:

  1. First, it checks if the repository and branch are right,
  2. Then, it fetches and checks out the commit referenced in the JSON file,
  3. Once the files are updated, it compiles the site using hugo with docker compose,
  4. If the compilation succeeds, it renames directories to swap the old version of the site for the new one.

If there is a failure the script aborts, but before doing so (or after a successful swap) the system sends an email with a log of what happened to the configured address and/or the user that pushed the updates to the repository.

The current script is this one:

blogops-webhook.sh
#!/bin/sh

set -e

# ---------
# VARIABLES
# ---------

# Values
REPO_REF="refs/heads/main"
REPO_CLONE_URL="https://gitea.mixinet.net/mixinet/blogops.git"

MAIL_PREFIX="[BLOGOPS-WEBHOOK] "
# Address that gets all messages, leave it empty if not wanted
MAIL_TO_ADDR="blogops@mixinet.net"
# If the following variable is set to 'true' the pusher gets mail on failures
MAIL_ERRFILE="false"
# If the following variable is set to 'true' the pusher gets mail on success
MAIL_LOGFILE="false"
# gitea's conf/app.ini value of NO_REPLY_ADDRESS, it is used for email domains
# when the KeepEmailPrivate option is enabled for a user
NO_REPLY_ADDRESS="noreply.example.org"

# Directories
BASE_DIR="/srv/blogops"

PUBLIC_DIR="$BASE_DIR/public"
NGINX_BASE_DIR="$BASE_DIR/nginx"
PUBLIC_HTML_DIR="$NGINX_BASE_DIR/public_html"

WEBHOOK_BASE_DIR="$BASE_DIR/webhook"
WEBHOOK_SPOOL_DIR="$WEBHOOK_BASE_DIR/spool"
WEBHOOK_ACCEPTED="$WEBHOOK_SPOOL_DIR/accepted"
WEBHOOK_DEPLOYED="$WEBHOOK_SPOOL_DIR/deployed"
WEBHOOK_REJECTED="$WEBHOOK_SPOOL_DIR/rejected"
WEBHOOK_TROUBLED="$WEBHOOK_SPOOL_DIR/troubled"
WEBHOOK_LOG_DIR="$WEBHOOK_SPOOL_DIR/log"

# Files
TODAY="$(date +%Y%m%d)"
OUTPUT_BASENAME="$(date +%Y%m%d-%H%M%S.%N)"
WEBHOOK_LOGFILE_PATH="$WEBHOOK_LOG_DIR/$OUTPUT_BASENAME.log"
WEBHOOK_ACCEPTED_JSON="$WEBHOOK_ACCEPTED/$OUTPUT_BASENAME.json"
WEBHOOK_ACCEPTED_LOGF="$WEBHOOK_ACCEPTED/$OUTPUT_BASENAME.log"
WEBHOOK_REJECTED_TODAY="$WEBHOOK_REJECTED/$TODAY"
WEBHOOK_REJECTED_JSON="$WEBHOOK_REJECTED_TODAY/$OUTPUT_BASENAME.json"
WEBHOOK_REJECTED_LOGF="$WEBHOOK_REJECTED_TODAY/$OUTPUT_BASENAME.log"
WEBHOOK_DEPLOYED_TODAY="$WEBHOOK_DEPLOYED/$TODAY"
WEBHOOK_DEPLOYED_JSON="$WEBHOOK_DEPLOYED_TODAY/$OUTPUT_BASENAME.json"
WEBHOOK_DEPLOYED_LOGF="$WEBHOOK_DEPLOYED_TODAY/$OUTPUT_BASENAME.log"
WEBHOOK_TROUBLED_TODAY="$WEBHOOK_TROUBLED/$TODAY"
WEBHOOK_TROUBLED_JSON="$WEBHOOK_TROUBLED_TODAY/$OUTPUT_BASENAME.json"
WEBHOOK_TROUBLED_LOGF="$WEBHOOK_TROUBLED_TODAY/$OUTPUT_BASENAME.log"

# Query to get variables from a gitea webhook json
ENV_VARS_QUERY="$(
  printf "%s" \
    '(.           | @sh "gt_ref=\(.ref);"),' \
    '(.           | @sh "gt_after=\(.after);"),' \
    '(.repository | @sh "gt_repo_clone_url=\(.clone_url);"),' \
    '(.repository | @sh "gt_repo_name=\(.name);"),' \
    '(.pusher     | @sh "gt_pusher_full_name=\(.full_name);"),' \
    '(.pusher     | @sh "gt_pusher_email=\(.email);")'
)"

# ---------
# Functions
# ---------

webhook_log() {
  echo "$(date -R) $*" >>"$WEBHOOK_LOGFILE_PATH"
}

webhook_check_directories() {
  for _d in "$WEBHOOK_SPOOL_DIR" "$WEBHOOK_ACCEPTED" "$WEBHOOK_DEPLOYED" \
    "$WEBHOOK_REJECTED" "$WEBHOOK_TROUBLED" "$WEBHOOK_LOG_DIR"; do
    [ -d "$_d" ] || mkdir "$_d"
  done
}

webhook_clean_directories() {
  # Try to remove empty dirs
  for _d in "$WEBHOOK_ACCEPTED" "$WEBHOOK_DEPLOYED" "$WEBHOOK_REJECTED" \
    "$WEBHOOK_TROUBLED" "$WEBHOOK_LOG_DIR" "$WEBHOOK_SPOOL_DIR"; do
    if [ -d "$_d" ]; then
      rmdir "$_d" 2>/dev/null || true
    fi
  done
}

webhook_accept() {
  webhook_log "Accepted: $*"
  mv "$WEBHOOK_JSON_INPUT_FILE" "$WEBHOOK_ACCEPTED_JSON"
  mv "$WEBHOOK_LOGFILE_PATH" "$WEBHOOK_ACCEPTED_LOGF"
  WEBHOOK_LOGFILE_PATH="$WEBHOOK_ACCEPTED_LOGF"
}

webhook_reject() {
  [ -d "$WEBHOOK_REJECTED_TODAY" ] || mkdir "$WEBHOOK_REJECTED_TODAY"
  webhook_log "Rejected: $*"
  if [ -f "$WEBHOOK_JSON_INPUT_FILE" ]; then
    mv "$WEBHOOK_JSON_INPUT_FILE" "$WEBHOOK_REJECTED_JSON"
  fi
  mv "$WEBHOOK_LOGFILE_PATH" "$WEBHOOK_REJECTED_LOGF"
  exit 0
}

webhook_deployed() {
  [ -d "$WEBHOOK_DEPLOYED_TODAY" ] || mkdir "$WEBHOOK_DEPLOYED_TODAY"
  webhook_log "Deployed: $*"
  mv "$WEBHOOK_ACCEPTED_JSON" "$WEBHOOK_DEPLOYED_JSON"
  mv "$WEBHOOK_ACCEPTED_LOGF" "$WEBHOOK_DEPLOYED_LOGF"
  WEBHOOK_LOGFILE_PATH="$WEBHOOK_DEPLOYED_LOGF"
}

webhook_troubled() {
  [ -d "$WEBHOOK_TROUBLED_TODAY" ] || mkdir "$WEBHOOK_TROUBLED_TODAY"
  webhook_log "Troubled: $*"
  mv "$WEBHOOK_ACCEPTED_JSON" "$WEBHOOK_TROUBLED_JSON"
  mv "$WEBHOOK_ACCEPTED_LOGF" "$WEBHOOK_TROUBLED_LOGF"
  WEBHOOK_LOGFILE_PATH="$WEBHOOK_TROUBLED_LOGF"
}

print_mailto() {
  _addr="$1"
  _user_email=""
  # Add the pusher email address unless it is from the domain NO_REPLY_ADDRESS,
  # which should match the value of that variable on the gitea 'app.ini' (it
  # is the domain used for emails when the user hides it).
  # shellcheck disable=SC2154
  if [ -n "${gt_pusher_email##*@"${NO_REPLY_ADDRESS}"}" ] &&
    [ -z "${gt_pusher_email##*@*}" ]; then
    _user_email="\"$gt_pusher_full_name <$gt_pusher_email>\""
  fi
  if [ "$_addr" ] && [ "$_user_email" ]; then
    echo "$_addr,$_user_email"
  elif [ "$_user_email" ]; then
    echo "$_user_email"
  elif [ "$_addr" ]; then
    echo "$_addr"
  fi
}

mail_success() {
  to_addr="$MAIL_TO_ADDR"
  if [ "$MAIL_LOGFILE" = "true" ]; then
    to_addr="$(print_mailto "$to_addr")"
  fi
  if [ "$to_addr" ]; then
    # shellcheck disable=SC2154
    subject="OK - $gt_repo_name updated to commit '$gt_after'"
    mail -s "${MAIL_PREFIX}${subject}" "$to_addr" \
      <"$WEBHOOK_LOGFILE_PATH"
  fi
}

mail_failure() {
  to_addr="$MAIL_TO_ADDR"
  if [ "$MAIL_ERRFILE" = true ]; then
    to_addr="$(print_mailto "$to_addr")"
  fi
  if [ "$to_addr" ]; then
    # shellcheck disable=SC2154
    subject="KO - $gt_repo_name update FAILED for commit '$gt_after'"
    mail -s "${MAIL_PREFIX}${subject}" "$to_addr" \
      <"$WEBHOOK_LOGFILE_PATH"
  fi
}

# ----
# MAIN
# ----
# Check directories
webhook_check_directories

# Go to the base directory
cd "$BASE_DIR"

# Check if the file exists
WEBHOOK_JSON_INPUT_FILE="$1"
if [ ! -f "$WEBHOOK_JSON_INPUT_FILE" ]; then
  webhook_reject "Input arg '$1' is not a file, aborting"
fi

# Parse the file
webhook_log "Processing file '$WEBHOOK_JSON_INPUT_FILE'"
eval "$(jq -r "$ENV_VARS_QUERY" "$WEBHOOK_JSON_INPUT_FILE")"

# Check that the repository clone url is right
# shellcheck disable=SC2154
if [ "$gt_repo_clone_url" != "$REPO_CLONE_URL" ]; then
  webhook_reject "Wrong repository: '$gt_clone_url'"
fi

# Check that the branch is the right one
# shellcheck disable=SC2154
if [ "$gt_ref" != "$REPO_REF" ]; then
  webhook_reject "Wrong repository ref: '$gt_ref'"
fi

# Accept the file
# shellcheck disable=SC2154
webhook_accept "Processing '$gt_repo_name'"

# Update the checkout
ret="0"
git fetch >>"$WEBHOOK_LOGFILE_PATH" 2>&1 || ret="$?"
if [ "$ret" -ne "0" ]; then
  webhook_troubled "Repository fetch failed"
  mail_failure
fi
# shellcheck disable=SC2154
git checkout "$gt_after" >>"$WEBHOOK_LOGFILE_PATH" 2>&1 || ret="$?"
if [ "$ret" -ne "0" ]; then
  webhook_troubled "Repository checkout failed"
  mail_failure
fi

# Remove the build dir if present
if [ -d "$PUBLIC_DIR" ]; then
  rm -rf "$PUBLIC_DIR"
fi

# Build site
docker compose run hugo -- >>"$WEBHOOK_LOGFILE_PATH" 2>&1 || ret="$?"
# go back to the main branch
git switch main && git pull
# Fail if public dir was missing
if [ "$ret" -ne "0" ] || [ ! -d "$PUBLIC_DIR" ]; then
  webhook_troubled "Site build failed"
  mail_failure
fi

# Remove old public_html copies
webhook_log 'Removing old site versions, if present'
find "$NGINX_BASE_DIR" -mindepth 1 -maxdepth 1 -name 'public_html-*' -type d \
  -exec rm -rf {} \; >>"$WEBHOOK_LOGFILE_PATH" 2>&1 || ret="$?"
if [ "$ret" -ne "0" ]; then
  webhook_troubled "Removal of old site versions failed"
  mail_failure
fi
# Switch site directory
TS="$(date +%Y%m%d-%H%M%S)"
if [ -d "$PUBLIC_HTML_DIR" ]; then
  webhook_log "Moving '$PUBLIC_HTML_DIR' to '$PUBLIC_HTML_DIR-$TS'"
  mv "$PUBLIC_HTML_DIR" "$PUBLIC_HTML_DIR-$TS" >>"$WEBHOOK_LOGFILE_PATH" 2>&1 ||
    ret="$?"
fi
if [ "$ret" -eq "0" ]; then
  webhook_log "Moving '$PUBLIC_DIR' to '$PUBLIC_HTML_DIR'"
  mv "$PUBLIC_DIR" "$PUBLIC_HTML_DIR" >>"$WEBHOOK_LOGFILE_PATH" 2>&1 ||
    ret="$?"
fi
if [ "$ret" -ne "0" ]; then
  webhook_troubled "Site switch failed"
  mail_failure
else
  webhook_deployed "Site deployed successfully"
  mail_success
fi

# ----
# vim: ts=2:sw=2:et:ai:sts=2

26 May, 2022 10:00PM

May 25, 2022

hackergotchi for Dirk Eddelbuettel

Dirk Eddelbuettel

RcppAPT 0.0.9: Minor Update

A new version of the RcppAPT package with the R interface to the C++ library behind the awesome apt, apt-get, apt-cache, … commands and their cache powering Debian, Ubuntu and the like arrived on CRAN earlier today.

RcppAPT allows you to query the (Debian or Ubuntu) package dependency graph at will, with build-dependencies (if you have deb-src entries), reverse dependencies, and all other goodies. See the vignette and examples for illustrations.

This release updates the code for the Apt 2.5.0 release, which makes a cleaner distinction between public and private components of the API. We adjusted one access point to a pattern we already used, and while at it, simplified some of the transition from the pre-Apt 2.0.0 interface. No new features. The NEWS entries follow.

Changes in version 0.0.9 (2022-05-25)

  • Simplified and standardized to only use public API

  • No longer tests and accommodates pre-Apt 2.0 API

Courtesy of my CRANberries, there is also a diffstat report for this release. A bit more information about the package is available here as well as at the GitHub repo.

If you like this or other open-source work I do, you can now sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

25 May, 2022 09:50PM

hackergotchi for Emmanuel Kasper

Emmanuel Kasper

One of the strangest bugs I have ever seen on Linux

Networking starts when you log in as root and stops when you log off!

The SELinux messages can be ignored, I guess, but we can clearly see the devices being activated (it's a Linux bridge).

If you have any explanations I am curious.

25 May, 2022 08:58PM by Emmanuel Kasper (noreply@blogger.com)

May 24, 2022

hackergotchi for Bits from Debian

Bits from Debian

Debian welcomes the 2022 GSOC interns

GSoC logo

We are very excited to announce that Debian has selected three interns to work under mentorship on a variety of projects with us during the Google Summer of Code.

Here is the list of projects, interns, and details of the tasks to be performed.


Project: Android SDK Tools in Debian

  • Interns: Nkwuda Sunday Cletus and Raman Sarda

The deliverables of this project will mostly be finished packages submitted to Debian sid, both for new packages and updated packages. Whenever possible, we should also try to get patches submitted and merged upstream in the Android sources.


Project: Quality Assurance for Biological and Medical Applications inside Debian

  • Intern: Mohammed Bilal

Deliverables of the project: Continuous integration tests for all Debian Med applications (life sciences, medical imaging, others), Quality Assurance review and bug fixing.


Congratulations and welcome to all the interns!

The Google Summer of Code program is possible in Debian thanks to the efforts of Debian Developers and Debian Contributors who dedicate part of their free time to mentoring interns and doing outreach tasks.

Join us and help extend Debian! You can follow the interns' weekly reports on the debian-outreach mailing-list, chat with us on our IRC channel or reach out to the individual projects' team mailing lists.

24 May, 2022 11:15AM by Abhijith Pa

May 23, 2022

Arturo Borrero González

Toolforge Jobs Framework

Toolforge jobs framework diagram

This post was originally published in the Wikimedia Tech blog, authored by Arturo Borrero Gonzalez.

This post continues the discussion of Toolforge updates as described in a previous post. Every non-trivial task performed in Toolforge (like executing a script or running a bot) should be dispatched to a job scheduling backend, which ensures that the job is run in a suitable place with sufficient resources.

Jobs can be scheduled synchronously or asynchronously, continuously, or simply executed once. The basic principle of running jobs is fairly straightforward:

  • You create a job from a submission server (usually login.toolforge.org).
  • The backend finds a suitable execution node to run the job on, and starts it once resources are available.
  • As it runs, the job will send output and errors to files until the job completes or is aborted.

So far, if a tool developer wanted to work with jobs, the Toolforge Grid Engine backend was the only suitable choice. This is despite the fact that Kubernetes supports this kind of workload natively. The truth is that we never prepared our Kubernetes environment to work with jobs. Luckily that has changed.

We no longer want to run Grid Engine

In a previous blog post we shared information about our desired future for Grid Engine in Toolforge. Our intention is to discontinue our usage of this technology.

Convenient way of running jobs on Toolforge Kubernetes

Some advanced Toolforge users really wanted to use Kubernetes. They were aware of the lack of abstractions or helpers, so they were forced to use the raw Kubernetes API. Eventually, they figured everything out and managed to succeed. The result of this move was in the form of docs on Wikitech and a few dozen jobs running on Kubernetes for the first time.

We were aware of this, and this initiative was much in sync with our ultimate goal: to promote Kubernetes over Grid Engine. We rolled up our sleeves and started thinking of a way to abstract and make it easy to run jobs without having to deal with lots of YAML and the raw Kubernetes API.

There is a precedent: the webservice command does exactly that. It hides all the details behind a simple command line interface to start/stop a web app running on Kubernetes. However, we wanted to go even further, be more flexible and prepare ourselves for more situations in the future: we decided to create a complete new REST API to wrap the jobs functionality in Toolforge Kubernetes. The Toolforge Jobs Framework was born.

Toolforge Jobs Framework components

The new framework is a small collection of components. As of this writing, we have three:

  • The REST API — responsible for creating/deleting/listing jobs on the Kubernetes system.
  • A command line interface — to interact with the REST API above (see the sketch after this list).
  • An emailer — to notify users about their jobs activity in the Kubernetes system.
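
As a purely illustrative sketch of the kind of interaction the command line interface enables (the command name, flags and container image used below are assumptions; the authoritative reference is the documentation on Wikitech):

# Hypothetical example: define a scheduled job for a tool and list its jobs.
# Flag names and the image name are assumptions, check the Wikitech docs.
toolforge-jobs run daily-update --command ./update.sh \
  --image tf-bullseye-std --schedule "0 3 * * *"
toolforge-jobs list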

Toolforge jobs framework diagram

There were a couple of challenges that weren’t trivial to solve. The authentication and authorization against the Kubernetes API was one of them. The other was deciding on the semantics of the new REST API itself. If you are curious, we invite you to take a look at the documentation we have in wikitech.

Open beta phase

Once we gained some confidence with the new framework, in July 2021 we decided to start a beta phase. We suggested some advanced Toolforge users try out the new framework. We tracked this phase in Phabricator, where our collaborators quickly started reporting some early bugs, helping each other, and creating new feature requests.

Moreover, when we launched the Grid Engine migration from Debian 9 Stretch to Debian 10 Buster we took a step forward and started promoting the new jobs framework as a viable replacement for the grid. Some official documentation pages were created on wikitech as well.

As of this writing the framework is still in its beta phase. We have solved basically all of the most important bugs, and we have already started thinking about how to address the few remaining feature requests.

We haven’t yet established the criteria for leaving the beta phase, but it would be good to have:

  • Critical bugs fixed and most feature requests addressed (or at least somehow planned).
  • Proper automated test coverage. We can do better on testing the different software components to ensure they are as bug free as possible. This also would make sure that contributing changes is easy.
  • REST API swagger integration.
  • Deployment automation. Deploying the REST API and the emailer is tedious. This is tracked in Phabricator.
  • Documentation, documentation, documentation.

Limitations

One of the limitations we have kept in mind since early in the development process of this framework is the lack of support for mixing different programming languages or runtime environments in the same job.

Solving this limitation is currently one of the WMCS team priorities, because this is one of the key workflows that was available on Grid Engine. The moment we address it, the framework adoption will grow, and it will pretty much enable the same workflows as in the grid, if not more advanced and featureful.

Stay tuned for more upcoming blog posts with additional information about Toolforge.

This post was originally published in the Wikimedia Tech blog, authored by Arturo Borrero Gonzalez.

23 May, 2022 07:19PM

May 22, 2022

hackergotchi for Ulrike Uhlig

Ulrike Uhlig

How do kids conceive the internet? - part 3

I received some feedback on the first part of interviews about the internet with children that I’d like to share publicly here. Thank you! Your thoughts and experiences are important to me!

In the first interview round there was this French girl.

Asked what she would change if she could, the 9 year old girl advocated for a global usage limit of the internet in order to protect the human brain. Also, she said, her parents spend way too much time on their phones and people should rather spend more time with their children.

To this bit, one person reacted saying that they first laughed when reading her proposal, but then felt extremely touched by it.

Another person reacted to the same bit of text:

That’s just brilliant. We spend so much time worrying about how the internet will affect children while overlooking how it has already affected us as parents. It actively harms our relationship with our children (keeping us distracted from their amazing life) and sets a bad example for them.

Too often, when we worry about children, we should look at our own behavior first. Until about that age (9-10+) at least, they are such a direct reflection of us that it’s frightening…

Yet another person reacted to the fact that many of the interviewees in the first round seemed to believe that the internet is immaterial, located somewhere in the air, while being at the same time omnipresent:

It reminds me of one time – about a dozen years ago, when i was still working closely with one of the city high schools – where i’d just had a terrible series of days, dealing with hardware failure, crappy service followthrough by the school’s ISP, and overheating in the server closet, and had basically stayed overnight at the school and just managed to get things back to mostly-functional before kids and teachers started showing up again.

That afternoon, i’d been asked by the teacher of a dystopian fiction class to join them for a discussion of Feed, which they’d just finished reading. i had read it the week before, and came to class prepared for their questions. (the book is about a near-future where kids have cybernetic implants and their society is basically on a runaway communications overload; not a bad Y[oung]A[dult] novel, really!)

The kids all knew me from around the school, but the teacher introduced my appearance in class as “one of the most Internet-connected people” and they wanted to ask me about whether i really thought the internet would “do this kind of thing” to our culture, which i think was the frame that the teacher had prepped them with. I asked them whether they thought the book was really about the Internet, or whether it was about mobile phones. Totally threw off the teacher’s lesson plans, i think, but we had a good discussion.

At one point, one of the kids asked me “if there was some kind of crazy disaster and all the humans died out, would the internet just keep running? what would happen on it if we were all gone?”

all of my labor – even that grueling week – was invisible to him! The internet was an immaterial thing, or if not immaterial, a force of nature, a thing that you accounted for the way you accounted for the weather, or traffic jams. It didn’t occur to him, even having just read a book that asked questions about what hyperconnectivity does to a culture (including grappling with issues of disparate access, effective discrimination based on who has the latest hardware, etc), it didn’t occur to him that this shit all works to the extent that it does because people make it go.

I felt lost trying to explain it to him, because where i wanted to get to with the class discussion was about how we might decide collectively to make it go somewhere else – that our contributions to it, and our labor to perpetuate it (or not) might actually help shape the future that the network helps us slide into. but he didn’t even see that human decisions or labor played a role it in at all, let alone a potentially directive role. We were really starting at square zero, which wasn’t his fault. Or the fault of his classmates that matter – but maybe a little bit of fault on the teacher, who i thought should have been emphasizing this more – but even the teacher clearly thought of the internet as a thing being done to us not as something we might actually drive one way or another. And she’s not even wrong – most people don’t have much control, just like most people can’t control the weather, even as our weather changes based on aggregate human activity.

I was quite impressed by seeing the internet perceived as a force of nature, so we continued this discussion a bit:

that whole story happened before we started talking about “the cloud”, but “the cloud” really reinforces this idea, i think. not that anyone actually thinks that “the cloud” is a literal cloud, but language shapes minds in subtle ways.

(Bold emphasis in the texts are mine.)

Thanks :) I’m happy and touched that these interviews prompted your wonderful reactions, and I hope that there’ll be more to come on this topic. I’m working on it!

22 May, 2022 10:00PM by Ulrike Uhlig

hackergotchi for Sergio Talens-Oliag

Sergio Talens-Oliag

New Blog

Welcome to my new Blog for Technical Stuff.

For a long time I was planning to start publishing technical articles again but to do it I wanted to replace my old blog based on ikiwiki by something more modern.

I’ve used Jekyll with GitLab Pages to build the Intranet of the ITI and to generate internal documentation sites on Agile Content, but, as happened with ikiwiki, I felt that things were kind of slow and not as easy to maintain as I would like.

So at Kyso (the company I work for right now) I switched to Hugo as the Static Site Generator (I still use GitLab Pages to automate the deployment, though), but there the contents are written in Markdown, while my personal preference is the Asciidoc format.

One thing I liked about Jekyll was that it was possible to use Asciidoctor to generate the HTML simply by using the Jekyll Asciidoc plugin (I even configured my site to generate PDF documents from .adoc files using the Asciidoctor PDF converter) and, luckily for me, that is also possible with Hugo, so that is what I plan to use on this blog; in fact, this post is written in .adoc.

My plan is to start publishing articles about things I’m working on to keep them documented for myself, and maybe they will be useful to someone else.

The general intention is to write about Container Orchestration (mainly Kubernetes), CI/CD tools (currently I’m using GitLab CE for that), System Administration (with Debian GNU/Linux as my preferred OS) and that sort of things.

My next post will be about how I build, publish and update the blog, but I probably won’t finish it until next week, once the site is fully operational and the publishing system has been tested.

Spoiler Alert:

This is a personal site, so I’m using Gitea to host the code instead of GitLab.

To handle the deployment I’ve configured json2file-go to save the data sent by the hook calls and process it asynchronously using inotify-tools.

When a new file is detected a script parses the JSON file using jq and builds and updates the site if appropriate.
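To give an idea of the shape of that watcher, here is a minimal sketch of what it could look like; the spool directory, the branch name and the hugo-update.sh helper are illustrative placeholders, not the actual scripts used here (those will be covered in the next post):

#!/bin/sh
# Minimal sketch: watch the directory where json2file-go drops the webhook
# payloads and rebuild the site when a push to the main branch arrives.
# SPOOL, the branch name and hugo-update.sh are hypothetical names.
SPOOL="/var/spool/json2file"

inotifywait -m -e close_write --format '%w%f' "$SPOOL" |
while read -r payload; do
  ref="$(jq -r '.ref' "$payload")"
  if [ "$ref" = "refs/heads/main" ]; then
    ./hugo-update.sh "$payload"   # pull the repo, run hugo and publish
  fi
done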

22 May, 2022 10:00PM

Russ Allbery

Review: On a Sunbeam

Review: On a Sunbeam, by Tillie Walden

Publisher: Tillie Walden
Copyright: 2016-2017
Format: Online graphic novel
Pages: 544

On a Sunbeam is a web comic that was published in installments between Fall 2016 and Spring 2017, and then later published in dead tree form. I read the on-line version, which is still available for free from its web site. It was nominated for an Eisner Award and won a ton of other awards, including the Los Angeles Times Book Prize.

Mia is a new high school graduate who has taken a job with a construction crew that repairs old buildings (that are floating in space, but I'll get to that in a moment). Alma, Elliot, and Charlotte have been together for a long time; Jules is closer to Mia's age and has been with them for a year. This is not the sort of job one commutes to: they live together on a spaceship that travels to the job sites, share meals together, and are more of an extended family than a group of coworkers. It's all a bit intimidating for Mia, but Jules provides a very enthusiastic welcome and some orientation.

The story of Mia's new job is interleaved with Mia's school experience from five years earlier. As a new frosh at a boarding school, Mia is obsessed with Lux, a school sport that involves building and piloting ships through a maze to capture orbs. Sent to the principal's office on the first day of school for sneaking into the Lux tower when she's supposed to be at assembly, she meets Grace, a shy girl with sparkly shoes and an unheard-of single room. Mia (a bit like Jules in the present timeline) overcomes Grace's reticence by being persistently outgoing and determinedly friendly, while trying to get on the Lux team and dealing with the typical school problems of bullies and in-groups.

On a Sunbeam is science fiction in the sense that it seems to take place in space and school kids build flying ships. It is not science fiction in the sense of caring about technological extrapolation or making any scientific sense whatsoever. The buildings that Mia and the crew repair appear to be hanging in empty space, but there's gravity. No one wears any protective clothing or air masks. The spaceships look (and move) like giant tropical fish. If you need realism in your science fiction graphic novels, it's probably best not to think of this as science fiction at all, or even science fantasy despite the later appearance of some apparently magical or divine elements.

That may sound surrealistic or dream-like, but On a Sunbeam isn't that either. It's a story about human relationships, found family, and diversity of personalities, all of which are realistically portrayed. The characters find their world coherent, consistent, and predictable, even if it sometimes makes no sense to the reader. On a Sunbeam is simply set in its own universe, with internal logic but without explanation or revealed rules.

I kind of liked this approach? It takes some getting used to, but it's an excuse for some dramatic and beautiful backgrounds, and it's oddly freeing to have unremarked train tracks in outer space. There's no way that an explanation would have worked; if one were offered, my brain would have tried to nitpick it to the detriment of the story. There's something delightful about a setting that follows imaginary physical laws this unapologetically and without showing the author's work.

I was, sadly, not as much of a fan of the art, although I am certain this will be a matter of taste. Walden mixes simple story-telling panels with sweeping vistas, free-floating domes, and strange, wild asteroids, but she uses a very limited color palette. Most panels are only a few steps away from monochrome, and the colors are chosen more for mood or orientation in the story (Mia's school days are all blue, the Staircase is orange) than for any consistent realism. There is often a lot of detail in the panels, but I found it hard to appreciate because the coloring confused my eye. I'm old enough to have been a comics reader during the revolution in digital coloring and improved printing, and I loved the subsequent dramatic improvement in vivid colors and shading. I know the coloring style here is an intentional artistic choice, but to me it felt like a throwback to the days of muddy printing on cheap paper.

I have a similar complaint about the lettering: On a Sunbeam is either hand-lettered or closely simulates hand lettering, and I often found the dialogue hard to read due to inconsistent intra- and interword spacing or ambiguous letters. Here too I'm sure this was an artistic choice, but as a reader I'd much prefer a readable comics font over hand lettering.

The detail in the penciling is more to my liking. I had occasional trouble telling some of the characters apart, but they're clearly drawn and emotionally expressive. The scenery is wildly imaginative and often gorgeous, which increased my frustration with the coloring. I would love to see what some of these panels would have looked like after realistic coloring with a full palette.

(It's worth noting again that I read the on-line version. It's possible that the art was touched up for the print version and would have been more to my liking.)

But enough about the art. The draw of On a Sunbeam for me is the story. It's not very dramatic or event-filled at first, starting as two stories of burgeoning friendships with a fairly young main character. (They are closely linked, but it's not obvious how until well into the story.) But it's the sort of story that I started reading, thought was mildly interesting, and then kept reading just one more chapter until I had somehow read the whole thing.

There are some interesting twists towards the end, but it's otherwise not a very dramatic or surprising story. What it is instead is open-hearted, quiet, charming, and deeper than it looks. The characters are wildly different and can be abrasive, but they invest time and effort into understanding each other and adjusting for each other's preferences. Personal loss drives a lot of the plot, but the characters are also allowed to mature and be happy without resolving every bad thing that happened to them. These characters felt like people I would like and would want to get to know (even if Jules would be overwhelming). I enjoyed watching their lives.

This reminded me a bit of a Becky Chambers novel, although it's less invested in being science fiction and sticks strictly to humans. There's a similar feeling that the relationships are the point of the story, and that nearly everyone is trying hard to be good, with differing backgrounds and differing conceptions of good. All of the characters are female or non-binary, which is left as entirely unexplained as the rest of the setting. It's that sort of book.

I wouldn't say this is one of the best things I've ever read, but I found it delightful and charming, and it certainly sucked me in and kept me reading until the end. One also cannot argue with the price, although if I hadn't already read it, I would be tempted to buy a paper copy to support the author. This will not be to everyone's taste, and stay far away if you are looking for realistic science fiction, but recommended if you are in the mood for an understated queer character story full of good-hearted people.

Rating: 7 out of 10

22 May, 2022 05:06AM

May 21, 2022

hackergotchi for Dirk Eddelbuettel

Dirk Eddelbuettel

#37: Introducing r2u with 2 x 19k CRAN binaries for Ubuntu 22.04 and 20.04

One month ago I started work on a new side project which is now up and running, and deserving of an introductory blog post: r2u. It was announced in two earlier tweets (first, second) which contained the two (wicked) demos below also found at the documentation site.

So what is this about? It brings full and complete CRAN installability to Ubuntu LTS, both the ‘focal’ release 20.04 and the recent ‘jammy’ release 22.04. It is unique in resolving all R and CRAN packages with the system package manager. So whenever you install something it is guaranteed to run as its dependencies are resolved and co-installed as needed. Equally important, no shared library will be updated or removed by the system as the possible dependency of the R package is known and declared. No other package management system for R does that as only apt on Debian or Ubuntu can — and this project integrates all CRAN packages (plus 200+ BioConductor packages). It will work with any Ubuntu installation on laptop, desktop, server, cloud, container, or in WSL2 (but is limited to Intel/AMD chips, sorry Raspberry Pi or M1 laptop). It covers all of CRAN (or nearly 19k packages), all the BioConductor packages depended-upon (currently over 200), and only excludes less than a handful of CRAN packages that cannot be built.

Usage

Setup instructions are described concisely in the repo README.md and at the documentation site. The setup consists of just five (or fewer) simple steps, and scripts are provided for both ‘focal’ (20.04) and ‘jammy’ (22.04).

Demos

Check out these two demos (also at the r2u site):

Installing the full tidyverse in one command and 18 seconds

Installing brms and its depends in one command and 13 seconds (and show gitpod.io)

Integration via bspm

The r2u setup can be used directly with apt (or dpkg or any other frontend to the package management system). Once installed apt update; apt upgrade will take care of new packages. For this to work, all CRAN packages (and all BioConductor packages depended upon) are mapped to names like r-cran-rcpp and r-bioc-s4vectors: an r prefix, the repo, and the package name, all lower-cased. That works—but thanks to the wonderful bspm package by Iñaki Úcar we can do much better. It connects R’s own install.packages() and update.packages() to apt. So we can just say (as the demos above show) install.packages("tidyverse") or install.packages("brms") and binaries are installed via apt which is fantastic and it connects R to the system package manager. The setup is really only two lines and described at the r2u site as part of the setup.
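As a concrete (if trivial) illustration of the naming scheme, once the r2u repository is configured the apt route looks roughly like this; r-cran-rcpp and r-bioc-s4vectors are simply the example names mentioned above:

# Illustrative only: r-cran-<name> and r-bioc-<name> follow the naming
# convention described above; r2u must already be set up as an apt repo.
sudo apt update
sudo apt install r-cran-rcpp r-bioc-s4vectors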

History and Motivation

Turning CRAN packages into .deb binaries is not a new idea. Albrecht Gebhardt was the first to realize this about twenty years ago (!!) and implemented it with a single Perl script. Next, Albrecht, Stefan Moeller, David Vernazobres and I built on top of this, which is described in this useR! 2007 paper. A most excellent generalization and rewrite was provided by Charles Blundell in a superb Google Summer of Code contribution in 2008 which I mentored. Charles and I described it in this talk at useR! 2009. I ran that setup for a while afterwards, but it died via an internal database corruption in 2010 right when I tried to demo it at CRAN headquarters in Vienna. This peaked at, if memory serves, about 5k packages: all of CRAN at the time. Don Armstrong took it one step further in a full reimplementation which, if I recall correctly, covered all of CRAN and BioConductor for what may have been 8k or 9k packages. Don had a stronger system (with full RAID-5) but it also died in a crash and was never rebuilt even though he and I could have relied on Debian resources (as all these approaches focused on Debian). During that time, Michael Rutter created a variant that cleverly used an Ubuntu-only setup utilizing Launchpad. This repo is still going strong, used and relied-upon by many, and about 5k packages (per distribution) strong. At one point, a group consisting of Don, Michael, Gábor Csárdi and myself (as lead/PI) had financial support from the RConsortium ISC for a more general re-implementation, but that support was withdrawn when we did not have time to deliver.

We should also note other long-standing approaches. Detlef Steuer has been using the openSUSE Build Service to provide nearly all of CRAN for openSUSE for many years. Iñaki Úcar built a similar system for Fedora described in this blog post. Iñaki and I also have an arXiv paper describing all this.

Details

Please see the r2u site for all details on using r2u.

Acknowledgements

The help of everybody who has worked on this is greatly appreciated. So a huge Thank you! to Albrecht, David, Stefan, Charles, Don, Michael, Detlef, Gábor, Iñaki—and whoever I may have omitted. Similarly, thanks to everybody working on R, CRAN, Debian, or Ubuntu—it all makes for a superb system. And another big Thank you! goes to my GitHub sponsors whose continued support is greatly appreciated.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

21 May, 2022 03:09PM

May 20, 2022

hackergotchi for Wouter Verhelst

Wouter Verhelst

Faster tar

I have a new laptop. The new one is a Dell Latitude 5521, whereas the old one was a Dell Latitude 5590.

As both the old and the new laptops are owned by the people who pay my paycheck, I'm supposed to copy all my data off the old laptop and then return it to the IT department.

A simple way of doing this (and what I'd usually use) is to just rsync the home directory (and other relevant locations) to the new machine. However, for various reasons I didn't want to do that this time around; for one, my home directory on the old laptop is a bit of a mess, and a new laptop is an ideal moment in time to clean that up. If I were to just rsync over the new home directory, then, well.

So instead, I'm creating a tar ball. The first attempt was quite slow:

tar cvpzf wouter@new-laptop:old-laptop.tar.gz /home /var /etc

The problem here is that the default compression algorithm, gzip, is quite slow, especially if you use the default non-parallel implementation.

So we tried something else:

tar cvpf wouter@new-laptop:old-laptop.tar.gz -Ipigz /home /var /etc

Better, but not quite great yet. The old laptop now has bursts of maxing out CPU, but it doesn't even come close to maxing out the gigabit network cable between the two.

Tar can also compress with the LZ4 algorithm. That algorithm doesn't compress very well, but it's the best choice if "speed" is the most important consideration. So I could do that:

tar cvpf wouter@new-laptop:old-laptop.tar.gz -Ilz4 /home /var /etc

The trouble with that, however, is that the tarball will then be quite big.

So why not use the CPU power of the new laptop?

tar cvpf - /home /var /etc | ssh new-laptop "pigz > old-laptop.tar.gz"

Yeah, that's much faster. Except, now the network speed becomes the limiting factor. We can do better.

tar cvpf - -Ilz4 /home /var /etc | ssh new-laptop "lz4 -d | pigz > old-laptop.tar.gz"

This uses about 70% of the link speed, just over one core on the old laptop, and 60% of CPU time on the new laptop.

After also adding --exclude="*cache*" to skip files we don't care about, things go quite quickly now: somewhere between 200 and 250G (uncompressed) was transferred into a 74G file in 20 minutes. My first attempt hadn't even done 10G after an hour!
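Putting the pieces above together, the final invocation looks roughly like this:

# lz4 keeps the old laptop's CPU mostly idle; pigz on the new laptop
# produces the final, reasonably small .tar.gz
tar cvpf - -Ilz4 --exclude="*cache*" /home /var /etc |
  ssh new-laptop "lz4 -d | pigz > old-laptop.tar.gz"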

20 May, 2022 12:53PM

hackergotchi for Louis-Philippe Véronneau

Louis-Philippe Véronneau

Introducing metalfinder

After going to an incredible Arch Enemy / Behemoth / Napalm Death / Unto Others concert a few weeks ago, I decided I wanted to go to more concerts.

I like music, and I really enjoy concerts. Sadly, I often miss great performances because no one told me about it, or my local newspaper didn't cover the event enough in advance for me to get tickets.

Some online services let you sync your Spotify account to notify you when a new concert is announced, but I don't use Spotify. As a music geek, I have a local music collection, and if I need to stream it, I have a supysonic server.

Introducing metalfinder, a CLI tool to find concerts using your local music collection! At the moment, it scans your music collection, creates a list of artists and queries Bandsintown for concerts in your town. Multiple output formats are supported, but I mainly use the ATOM one, as I'm a heavy feed reader user.

Screenshot of the ATOM output in my feed reader

The current metalfinder version (1.1.1) is an MVP: it works well enough, but I still have a lot of work to do... If you want to give it a try, the easiest way is to download it from PyPI. metalfinder is also currently in NEW and I'm planning to have something feature-complete in time for the Bookworm freeze.
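Until the Debian package clears NEW, a pip install should be enough to try it out (assuming the project is published on PyPI under the name metalfinder; a virtualenv keeps it out of the system Python):

# Assumes the package is published on PyPI as "metalfinder".
python3 -m venv ~/.venvs/metalfinder
~/.venvs/metalfinder/bin/pip install metalfinder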

20 May, 2022 04:00AM by Louis-Philippe Véronneau

Reproducible Builds (diffoscope)

diffoscope 213 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 213. This version includes the following changes:

* Don't mask differences in .zip/.jar central directory extra fields.
* Don't show a binary comparison of .zip/.jar files if we have at least
  one observed nested difference.
* Use assert_diff in test_zip over get_data and separate assert.

You find out more by visiting the project homepage.

20 May, 2022 12:00AM

May 19, 2022

hackergotchi for Ulrike Uhlig

Ulrike Uhlig

How do kids conceive the internet? - part 2

I promised a follow up to my post about interviews about how children conceptualize the internet. Here it is. (Maybe not the last one!)

The internet, it’s that thing that acts up all the time, right?

As said in my first post, I abandoned the idea of interviewing children younger than 9 years because it seems they are not necessarily aware that they are using the internet. But it turns out that some of them have heard about the internet. My friend Anna, who has 9 younger siblings, tried to recruit some of her brothers and sisters for an interview with me. At the dinner table, this turned into a discussion, and she sent me an incredibly funny video in which two of her brothers and sisters, aged 5 and 6, discuss the internet with her. I won’t share the video for privacy reasons — besides, the kids speak in the wondrous dialect of Vorarlberg, a region in western Austria, close to the border with Liechtenstein.

Here’s a transcription of the dinner table discussion:

  • Anna: what is the internet?
  • both children: (shouting as if it was a game of who gets it first) photo! mobile! device! camera!
  • Anna: But one can have a camera without the internet…
  • M.: Internet is the mobile phone charger! Mobile phone full!
  • J.: Internet is… internet is…
  • M.: I know! Internet is where you can charge something, the mobile phone and…
  • Anna: You mean electricity?
  • M.: Yeah, that is the internet, electricity!
  • Anna: (laughs), Yes, the internet works a bit similarly, true.
  • J.: It’s the electricity of the house!
  • Anna: The electricity of the house…

(everyone is talking at the same time now.)

  • Anna: And what’s WiFi?
  • M.: WiFi it’s the TV!
  • Anna (laughs)
  • M.: WiFi is there so it doesn’t act up!
  • Anna (laughs harder)
  • J. (repeats what M. just said): WiFi is there so it doesn’t act up!
  • Anna: So that what doesn’t act up?
  • M.: (moves her finger wildly drawing a small circle in the air) So that it doesn’t spin!
  • Anna: Ah?
  • M.: When one wants to watch something on Youtube, well then… that the thing doesn’t spin like that!
  • Anna: Ahhh! so when you use Youtube, you need the internet, right?
  • J.: Yes, so that one can watch things.

I really like how the kids associate the internet with a thing that works all the time, except for when it doesn’t work. Then they notice: “The internet is acting up!” Probably, when that happens, parents or older siblings say: “the internet is acting up” or “let me check why the internet acts up again” and maybe they get up from the sofa, switch a home router on and off again, which creates this association with electricity.

(Just for the sake of clarity for fellow multilingualist readers, the kids used the German word “spinnen”, which I translated to “acting up”. In French that would be “déconner”.)

WiFi for everyone!

I interviewed another of Anna’s siblings, a 10 year old boy. He told me that he does not really use the internet by himself yet, and does not own any internet capable device. He watches when older family members look up stuff on Google, or put on a video on Youtube, Netflix, or Amazon — he knew all these brand names though. In the living room, there’s Alexa, he told me, and he uses the internet by asking Alexa to play music.

Then I say: Alexa, play this song!

Interestingly, he knew that, in order to listen to a CD, the internet was not needed.

When asked what a drawing explaining the internet would look like, he drew a scheme of the living room at home, with the TV, Alexa, and some kind of WiFi dongle, maybe a repeater. (Unfortunately I did not manage to get his drawing.)

If he could ask a wise and friendly dragon one thing about the internet that he always wanted to know, he would ask “How much internet can one have and what are all the things one can do with the internet?”

If he could change the internet for the better for everyone, he would build a gigantic building which would provide the entire world with WiFi. ☺

Cut out the stupid stuff from the internet

His slightly older sister does own a laptop and a smartphone. She uses the internet to watch movies, or series, to talk with her friends, or to listen to music.

When asked how she would explain the internet to an alien, she said that

one can do a lot of things on the internet, but on the internet there can be stupid things, but also good things, one can learn stuff on the internet, for example how to do crochet.

Most importantly, she noticed that

one needs the internet nowadays.

A child's drawing. On the left, a smartphone with WhatsApp, saying 'calls with WhatsApp'. In the middle, a TV saying 'watching movies'. On the right, a laptop with lots of open windows.

Her drawing shows how she uses the internet: calls using WhatsApp, watching movies online, and a laptop with open windows on the screen.

She would ask the dragon that can explain one thing she always wanted to know about the internet:

What is the internet? How does it work at all? How does it function?

What she would change has to do with her earlier remark about stupid things:

I would make it so that there are less stupid things. It would be good to use the internet for better things, but not for useless things, that one doesn’t actually need.

When I asked her what she meant by “stupid things”, she replied:

Useless videos where one talks about nonsense. And one can also google stupid things, for example “how long will i be alive?” and stuff like that.

Patterns

From the interviews I have made until now, there seems to be a cut between the age where kids don’t own a device and use the internet mainly to watch movies or series or to listen to music, and the age where they start owning a device, talking to their friends online, and creating accounts on social media. This seems to happen roughly at ages 9-10.

I’m still surprised at the amount of ideas that kids have, when asked what they would change on the internet if they could. I’m sure there’s more if one goes looking for it.

Thanks

Thanks to my friends who made all these interviews possible either by allowing me to meet their children, or their younger siblings: Anna, Christa, Aline, Cindy, and Martina.

19 May, 2022 10:00PM by Ulrike Uhlig

Joerg Jaspert

Rust? Munin? munin-plugin…

My first Rust crate: munin-plugin

Sooo, some time ago I had to rewrite a munin plugin from Shell to Rust, because the shell version went crazy after some runtime and used up a whole CPU all for itself. Sure, it only did that on systems with Oracle Database installed, so that monster seems to be bad (who would have guessed?), but somehow I had to fix up this plugin and wasn’t allowed to drop that wannabe-database.

A while later I wrote a plugin to graph Fibre Channel host data, and then network interface statistics, all with a one-second resolution for the graphs, to allow one to zoom in and see every spike, and not have RRD round off the interesting parts.

As one can imagine, that turns out to be a lot of very similar code - after all, most of the difference is in the graph config statements and the actual data gathering, but the rest of the code is just the same.
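For readers who have never written a munin plugin: the protocol is tiny. A plugin is just an executable that prints the graph configuration when called with the config argument and the current values otherwise, and that is exactly the boilerplate that keeps repeating. A classic shell plugin boils down to something like this (field and graph names are made up for the example; this is the plain munin protocol, not the crate’s API):

#!/bin/sh
# Minimal munin plugin sketch (illustrative names).
case "$1" in
  config)
    echo "graph_title Example load"
    echo "graph_vlabel load"
    echo "load.label load"
    ;;
  *)
    echo "load.value $(cut -d' ' -f1 /proc/loadavg)"
    ;;
esac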

As I already know there are more plugins (hello rsyslog statistics) I have to (sometimes re-)write in Rust, I took some time and wrote me a Rust library to make writing munin-plugins in Rust easier. Yay, my first crate on crates.io (and wrote lots of docs for it).

By now I made my 1 second resolution CPU load plugin and the 1 second resolution Network interface plugin use this lib already. To test less complicated plugins with the lib, I took the munin default plugin “load” (Linux variant) and made a Rust version from it, but mostly to see that something as simple as that is also easy to implement: Munin load

I have some ideas on how to provide a useful default implementation of the fetch function, so one can write even less code when using this library.

It is my first library in Rust, so if you see something bad or missing in there, feel free to open issues or pull requests.

Now, having done this, one thing missing: Someone to (re)write munin itself in something that is actually fast… Not munin-node, but munin. Or maybe the RRD usage, but with a few hundred nodes in it, with loads of graphs, we had to adjust munin code and change some timeout or it would commit suicide regularly. And some other code change for not always checking for a rename, or something like it. And only run parts of the default cronjob once an hour, not on every update run. And switch to fetching data over ssh (and munin-async on the nodes). And rrdcached with loads of caching for the trillions of files (currently amounts to ~800G of data).. And it still needs way more CPU than it should. Soo, lots of possible optimizations hidden in there. Though I bet a non-scripting language rewrite might gain the most. (Except, of course, someone needs to do it… :) )

19 May, 2022 08:33PM

May 18, 2022

hackergotchi for Gunnar Wolf

Gunnar Wolf

I do have a full face

I have been a bearded subject since I was 18, back in 1994. Yes, during 1999-2000, I shaved for my military service, and I briefly tried the goatee look in 2008… Few people nowadays can imagine my face without a forest of hair.

But sometimes, life happens. And, unlike my good friend Bdale, I didn’t get Linus to do the honors… But, all in all, here I am:

Turns out, I have been suffering from quite bad skin infections for a couple of years already. Last Friday, I checked in to the hospital, with an ugly, swollen face (I won’t put you through that), and the hospital staff decided it was in my best interests to trim my beard. And then some more. And then shave me. I sat in the hospital for four days, getting soaked (medical term) with antibiotics and other stuff, got my prescriptions for the next few days, and… well, I really hope that’s the end of the infections. We shall see!

So, this is the result of the loving and caring work of three different nurses. Yes, not clean-shaven (I should not trim it further, as shaving blades are a risk of reinfection).

Anyway… I guess the bits of hair you see over the place will not take too long to become a beard again, even get somewhat respectable. But I thought some of you would like to see the real me™ 😉

PS- Thanks to all who have reached out with good wishes. All is fine!

18 May, 2022 02:52PM

Reproducible Builds

Supporter spotlight: Jan Nieuwenhuizen on Bootstrappable Builds, GNU Mes and GNU Guix

The Reproducible Builds project relies on several projects, supporters and sponsors for financial support, but they are also valued as ambassadors who spread the word about our project and the work that we do.

This is the fourth instalment in a series featuring the projects, companies and individuals who support the Reproducible Builds project.

We started this series by featuring the Civil Infrastructure Platform project and followed this up with a post about the Ford Foundation as well as more recent ones about ARDC and the Google Open Source Security Team (GOSST). Today, however, we will be talking with Jan Nieuwenhuizen about Bootstrappable Builds, GNU Mes and GNU Guix.


Chris Lamb: Hi Jan, thanks for taking the time to talk with us today. First, could you briefly tell me about yourself?

Jan: Thanks for the chat; it’s been a while! Well, I’ve always been trying to find something new and interesting that is just asking to be created but is mostly being overlooked. That’s how I came to work on GNU Guix and create GNU Mes to address the bootstrapping problem that we have in free software. It’s also why I have been working on releasing Dezyne, a programming language and set of tools to specify and formally verify concurrent software systems as free software.

Briefly summarised, compilers are often written in the language they are compiling. This creates a chicken-and-egg problem which leads users and distributors to rely on opaque, pre-built binaries of those compilers that they use to build newer versions of the compiler. To gain trust in our computing platforms, we need to be able to tell how each part was produced from source, and opaque binaries are a threat to user security and user freedom since they are not auditable. The goal of bootstrappability (and the bootstrappable.org project in particular) is to minimise the amount of these “bootstrap” binaries.

Anyway, after studying Physics at Eindhoven University of Technology (TU/e), I worked for digicash.com, a startup trying to create a digital and anonymous payment system – sadly, however, a traditional account-based system won. Separate to this, as there was no software (either free or proprietary) to automatically create beautiful music notation, together with Han-Wen Nienhuys, I created GNU LilyPond. Ten years ago, I took the initiative to co-found a democratic school in Eindhoven based on the principles of sociocracy. And last Christmas I finally went vegan, after being mostly vegetarian for about 20 years!


Chris: For those who have not heard of it before, what is GNU Guix? What are the key differences between Guix and other Linux distributions?

Jan: GNU Guix is both a package manager and a full-fledged GNU/Linux distribution. In both forms, it provides state-of-the-art package management features such as transactional upgrades and package roll-backs, hermetically sealed build environments, unprivileged package management as well as per-user profiles. One obvious difference is that Guix forgoes the usual Filesystem Hierarchy Standard (i.e. /usr, /lib, etc.), but there are other significant differences too, such as Guix being scriptable using Guile/Scheme, as well as Guix’s dedication and focus on free software.


Chris: How does GNU Guix relate to GNU Mes? Or, rather, what problem is Mes attempting to solve?

Jan: GNU Mes was created to address the security concerns that arise from bootstrapping an operating system such as Guix. Even if this process entirely involves free software (i.e. the source code is, at least, available), this commonly uses large and unauditable binary blobs.

Mes is a Scheme interpreter written in a simple subset of C and a C compiler written in Scheme, and it comes with a small, bootstrappable C library. Twice, the Mes bootstrap has halved the size of opaque binaries that were needed to bootstrap GNU Guix. These reductions were achieved by first replacing GNU Binutils, GNU GCC and the GNU C Library with Mes, and then replacing Unix utilities such as awk, bash, coreutils, grep, sed, etc., with Gash and Gash-Utils. The final goal of Mes is to help create a full-source bootstrap for any interested UNIX-like operating system.


Chris: What is the current status of Mes?

Jan: Mes supports all that is needed from ‘R5RS’ and GNU Guile to run MesCC with Nyacc, the C parser written for Guile, for 32-bit x86 and ARM. The next step for Mes would be to become more compatible with Guile, e.g., to have Guile-module support and to support running Gash and Gash-Utils.

In working to create a full-source bootstrap, I have disregarded the kernel and Guix build system for now, but otherwise, all packages should be built from source, and obviously, no binary blobs should go in. We still need a Guile binary to execute some scripts, and it will take at least another one to two years to remove that binary. I’m using the 80/20 approach, cutting corners initially to get something working and useful early.

Another metric would be how many architectures we have. We are quite a way with ARM, tinycc now works, but there are still problems with GCC and Glibc. RISC-V is coming, too, which could be another metric. Someone has looked into picking up NixOS this summer. “How many distros do anything about reproducibility or bootstrappability?” The bootstrappability community is so small that we don’t ‘need’ metrics, sadly. The number of bytes of binary seed is a nice metric, but running the whole thing on a full-fledged Linux system is tough to put into a metric. Also, it is worth noting that I’m developing on a modern Intel machine (ie. a platform with a management engine), that’s another key component that doesn’t have metrics.


Chris: From your perspective as a Mes/Guix user and developer, what does ‘reproducibility’ mean to you? Are there any related projects?

Jan: From my perspective, I’m more into the problem of bootstrapping, and reproducibility is a prerequisite for bootstrappability. Reproducibility clearly takes a lot of effort to achieve, however. It’s relatively easy to install some Linux distribution and be happy, but if you look at communities that really care about security, they are investing in reproducibility and other ways of improving the security of their supply chain. Projects I believe are complementary to Guix and Mes include NixOS, Debian and — on the hardware side — the RISC-V platform, which shares many of our core principles and goals.


Chris: Well, what are these principles and goals?

Jan: Reproducibility and bootstrappability often feel like the “next step” in the frontier of free software. If you have all the sources and you can’t reproduce a binary, that just doesn’t “feel right” anymore. We should start to desire (and demand) transparent, elegant and auditable software stacks. To a certain extent, that’s always been a low-level intent since the beginning of free software, but something clearly got lost along the way.

On the other hand, if you look at the NPM or Rust ecosystems, we see a world where people directly install binaries. As they are not as supportive of copyleft as the rest of the free software community, you can see that movement and people in our area are doing more as a response to that so that what we have continues to remain free, and to prevent us from falling asleep and waking up in a couple of years to see, for example, Rust in the Linux kernel and (more importantly) big binary blobs being required to use our systems. It’s an excellent time to advance right now, so we should get a foothold in and ensure we don’t lose any more.


Chris: What would be your ultimate reproducibility goal? And what would the key steps or milestones be to reach that?

Jan: The “ultimate” goal would be to have a system built with open hardware, with all software on it fully bootstrapped from its source. This bootstrap path should be easy to replicate and straightforward to inspect and audit. All fully reproducible, of course! In essence, we would have solved the supply chain security problem.

Our biggest challenge is ignorance. There is much unawareness about the importance of what we are doing. As it is rather technical and doesn’t really affect everyday computer use, that is not surprising. This unawareness can be a great force driving us in the opposite direction. Think of Rust being allowed in the Linux kernel, or Python being required to build a recent GNU C library (glibc). Also, the fact that companies like Google/Apple still want to play “us” vs “them”, not willing to support GPL software. Not ready yet to truly support user freedom.

Take the infamous log4j bug — everyone is using “open source” these days, but nobody wants to take responsibility and help develop or nurture the community. Not “ecosystem”, as that’s how it’s being approached right now: live and let live/die: see what happens without taking any responsibility. We are growing and we are strong and we can do a lot… but if we have to work against those powers, it can become problematic. So, let’s spread our great message and get more people involved!


Chris: What has been your biggest win?

Jan: From a technical point of view, the “full-source” bootstrap has been our biggest win. A talk by Carl Dong at the 2019 Breaking Bitcoin conference stated that connecting Jeremiah Orian’s Stage0 project to Mes would be the “holy grail” of bootstrapping, and we recently managed to achieve just that: in other words, starting from hex0, a 357-byte binary, we can now build the entire Guix system.

This past year we have not made significant visible progress, however, as our funding was unfortunately not there. The Stage0 project has advanced in RISC-V. A month ago, though, I secured NLnet funding for another year, and thanks to NLnet, Ekaitz Zarraga and Timothy Sample will work on GNU Mes and the Guix bootstrap as well. Separate to this, the bootstrappable community has grown a lot from two people it was six years ago: there are now currently over 100 people in the #bootstrappable IRC channel, for example. The enlarged community is possibly an even more important win going forward.


Chris: How realistic is a 100% bootstrappable toolchain? And from someone who has been working in this area for a while, is “solving Trusting Trust” actually feasible in reality?

Jan: Two answers: Yes and no, it really depends on your definition. One great thing is that the whole Stage0 project can also run on the Knight virtual machine, a hardware platform that was designed, I think, in the 1970s. I believe we can and must do better than we are doing today, and that there’s a lot of value in it.

The core issue is not the trust; we can probably all trust each other. On the other hand, we don’t want to trust each other or even ourselves. I am not, personally, going to inspect my RISC-V laptop, and other people create the hardware and probably do not want to inspect the software. The answer comes back to being conscientious and doing what is right. Inserting GCC as a binary blob is not right. I think we can do better, and that’s what I’d like to do. The security angle is interesting, but I don’t like the paranoid part of that; I like the beauty of what we are creating together and stepwise improving on that.


Chris: Thanks for taking the time to talk to us today. If someone wanted to get in touch or learn more about GNU Guix or Mes, where might someone go?

Jan: Sure! First, check out:

I’m also on Twitter (@janneke_gnu) and on octodon.social (@janneke@octodon.social).


Chris: Thanks for taking the time to talk to us today.

Jan: No problem. :)




For more information about the Reproducible Builds project, please see our website at reproducible-builds.org. If you are interested in ensuring the ongoing security of the software that underpins our civilisation and wish to sponsor the Reproducible Builds project, please reach out to the project by emailing contact@reproducible-builds.org.

18 May, 2022 10:00AM

hackergotchi for Louis-Philippe Véronneau

Louis-Philippe Véronneau

Clojure Team 2022 Sprint Report

This is the report for the Debian Clojure Team remote sprint that took place on May 13-14th.

Looking at my previous blog entries, this was my first Debian sprint since July 2020! Crazy how fast time flies...

Many thanks to those who participated, namely:

  • Rob Browning (rlb)
  • Elana Hashman (ehashman)
  • Jérôme Charaoui (lavamind)
  • Leandro Doctors (allentiak)
  • Louis-Philippe Véronneau (pollo)

Sadly, Utkarsh Gupta — although having planned on participating — ended up not being able to and worked on DebConf Bursary paperwork instead.

rlb

Rob mostly worked on creating a dh-clojure tool to help make packaging Clojure libraries easier.

At the moment, most of the packaging is done manually, by invoking build tools by hand. Having a tool to automate many of the steps required to build Clojure packages would go a long way in making them more uniform.

His work (although still very much a WIP) can be found here: https://salsa.debian.org/rlb/dh-clojure/

ehashman

Elana:

  • Finished the Java Team VCS migration to the Clojure Team namespace.
  • Worked on updating Leiningen to 2.9.8.
  • Proposed an upstream dependency update in Leiningen to match Debian's most recent version.
  • Gave pollo Owner access on the Clojure Team namespace and added lavamind as a Developer.
  • Uploaded Clojure 1.10.3-1.
  • Updated sjacket-clojure to version 0.1.1.1 and uploaded it to experimental.
  • Added build tests to spec-alpha-clojure.
  • Filed bug #1010995 for missing test dependency for Clojure.
  • Closed bugs #976151, #992735 and #992736.

lavamind

It was Jérôme's first time working on Clojure packages, and things went great! During the sprint, he:

  • Joined the Clojure Team on salsa.
  • Identified missing dependencies to update puppetdb to the 7.x release.
  • Learned how to package Clojure libraries in Debian.
  • Packaged murphy-clojure, truss-clojure and encore-clojure and uploaded them to NEW.
  • Began to package nippy-clojure.

allentiak

Leandro joined us on Saturday, since he couldn't get off work on Friday. He mostly continued working on replacing our in-house scripts for /usr/bin/clojure by upstream's, a task he had already started during GSoC 2021.

Sadly, none of us were familiar with Debian's mechanism for alternatives. If you (yes you, dear reader) are familiar with it, I'm sure he would warmly welcome feedback on his development branch.

pollo

As for me, I:

  • Fixed a classpath bug in core-async-clojure that was breaking other libraries.
  • Added meaningful autopkgtests to core-async-clojure.
  • Uploaded new versions of tools-analyzer-clojure and trapperkeeper-clojure with autopkgtests.
  • Updated pomegranate-clojure and nrepl-clojure to the latest upstream version and revamped the way they were packaged.
  • Assisted lavamind with Clojure packaging.

Overall, it was quite a productive sprint!

Thanks to Debian for sponsoring our food during the sprint. It was nice to be able to concentrate on fixing things instead of making food :)

Here's a bonus picture of the nice sushi platter I ended up getting for dinner on Saturday night:

Picture of a sushi platter

18 May, 2022 04:00AM by Louis-Philippe Véronneau

May 16, 2022

hackergotchi for Matthew Garrett

Matthew Garrett

Can we fix bearer tokens?

Last month I wrote about how bearer tokens are just awful, and a week later Github announced that someone had managed to exfiltrate bearer tokens from Heroku that gave them access to, well, a lot of Github repositories. This has inevitably resulted in a whole bunch of discussion about a number of things, but people seem to be largely ignoring the fundamental issue that maybe we just shouldn't have magical blobs that grant you access to basically everything even if you've copied them from a legitimate holder to Honest John's Totally Legitimate API Consumer.

To make it clearer what the problem is here, let's use an analogy. You have a safety deposit box. To gain access to it, you simply need to be able to open it with a key you were given. Anyone who turns up with the key can open the box and do whatever they want with the contents. Unfortunately, the key is extremely easy to copy - anyone who is able to get hold of your keyring for a moment is in a position to duplicate it, and then they have access to the box. Wouldn't it be better if something could be done to ensure that whoever showed up with a working key was someone who was actually authorised to have that key?

To achieve that we need some way to verify the identity of the person holding the key. In the physical world we have a range of ways to achieve this, from simply checking whether someone has a piece of ID that associates them with the safety deposit box all the way up to invasive biometric measurements that supposedly verify that they're definitely the same person. But computers don't have passports or fingerprints, so we need another way to identify them.

When you open a browser and try to connect to your bank, the bank's website provides a TLS certificate that lets your browser know that you're talking to your bank instead of someone pretending to be your bank. The spec allows this to be a bi-directional transaction - you can also prove your identity to the remote website. This is referred to as "mutual TLS", or mTLS, and a successful mTLS transaction ends up with both ends knowing who they're talking to, as long as they have a reason to trust the certificate they were presented with.

That's actually a pretty big constraint! We have a reasonable model for the server - it's something that's issued by a trusted third party and it's tied to the DNS name for the server in question. Clients don't tend to have stable DNS identity, and that makes the entire thing sort of awkward. But, thankfully, maybe we don't need to? We don't need the client to be able to prove its identity to arbitrary third party sites here - we just need the client to be able to prove it's a legitimate holder of whichever bearer token it's presenting to that site. And that's a much easier problem.

Here's the simple solution - clients generate a TLS cert. This can be self-signed, because all we want to do here is be able to verify whether the machine talking to us is the same one that had a token issued to it. The client contacts a service that's going to give it a bearer token. The service requests mTLS auth without being picky about the certificate that's presented. The service embeds a hash of that certificate in the token before handing it back to the client. Whenever the client presents that token to any other service, the service ensures that the mTLS cert the client presented matches the hash in the bearer token. Copy the token without copying the mTLS certificate and the token gets rejected. Hurrah hurrah hats for everyone.
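To make that a little more concrete, here's a rough sketch of the two artifacts involved, using plain openssl. This is just the general shape of certificate-bound tokens (RFC 8705 binds the token to the SHA-256 thumbprint of the client certificate), not any particular provider's implementation, and the filenames are illustrative:

# Client: generate a self-signed cert and key (illustrative filenames).
openssl req -x509 -newkey ec -pkeyopt ec_paramgen_curve:P-256 \
  -keyout client.key -out client.crt -days 365 -nodes -subj "/CN=ci-runner"

# Both sides: the fingerprint the issuer embeds in the token, and the
# value a service later compares against the cert presented over mTLS.
openssl x509 -in client.crt -outform DER | openssl dgst -sha256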

Well except for the obvious problem that if you're in a position to exfiltrate the bearer tokens you can probably just steal the client certificates and keys as well, and now you can pretend to be the original client and this is not adding much additional security. Fortunately pretty much everything we care about has the ability to store the private half of an asymmetric key in hardware (TPMs on Linux and Windows systems, the Secure Enclave on Macs and iPhones, either a piece of magical hardware or Trustzone on Android) in a way that avoids anyone being able to just steal the key.

How do we know that the key is actually in hardware? Here's the fun bit - it doesn't matter. If you're issuing a bearer token to a system then you're already asserting that the system is trusted. If the system is lying to you about whether or not the key it's presenting is hardware-backed then you've already lost. If it lied and the system is later compromised then sure all your apes get stolen, but maybe don't run systems that lie and avoid that situation as a result?

Anyway. This is covered in RFC 8705 so why aren't we all doing this already? From the client side, the largest generic issue is that TPMs are astonishingly slow in comparison to doing a TLS handshake on the CPU. RSA signing operations on TPMs can take around half a second, which doesn't sound too bad, except your browser is probably establishing multiple TLS connections to subdomains on the site it's connecting to and performance is going to tank. Fixing this involves doing whatever's necessary to convince the browser to pipe everything over a single TLS connection, and that's just not really where the web is right at the moment. Using EC keys instead helps a lot (~0.1 seconds per signature on modern TPMs), but it's still going to be a bottleneck.

The other problem, of course, is that ecosystem support for hardware-backed certificates is just awful. Windows lets you stick them into the standard platform certificate store, but the docs for this are hidden in a random PDF in a Github repo. Macs require you to do some weird bridging between the Secure Enclave API and the keychain API. Linux? Well, the standard answer is to do PKCS#11, and I have literally never met anybody who likes PKCS#11 and I have spent a bunch of time in standards meetings with the sort of people you might expect to like PKCS#11 and even they don't like it. It turns out that loading a bunch of random C bullshit that has strong feelings about function pointers into your security critical process is not necessarily something that is going to improve your quality of life, so instead you should use something like this and just have enough C to bridge to a language that isn't secretly plotting to kill your pets the moment you turn your back.

And, uh, obviously none of this matters at all unless people actually support it. Github has no support at all for validating the identity of whoever holds a bearer token. Most issuers of bearer tokens have no support for embedding holder identity into the token. This is not good! As of last week, all three of the big cloud providers support virtualised TPMs in their VMs - we should be running CI on systems that can do that, and tying any issued tokens to the VMs that are supposed to be making use of them.

So sure this isn't trivial. But it's also not impossible, and making this stuff work would improve the security of, well, everything. We literally have the technology to prevent attacks like Github suffered. What do we have to do to get people to actually start working on implementing that?


16 May, 2022 07:48AM

May 15, 2022

hackergotchi for Dirk Eddelbuettel

Dirk Eddelbuettel

RcppArmadillo 0.11.1.1.0 on CRAN: Updates

armadillo image

Armadillo is a powerful and expressive C++ template library for linear algebra and scientific computing. It aims towards a good balance between speed and ease of use, has syntax deliberately close to Matlab and is useful for algorithm development directly in C++, or quick conversion of research code into production environments. RcppArmadillo integrates this library with the R environment and language–and is widely used by (currently) 978 other packages on CRAN, downloaded over 24 million times (per the partial logs from the cloud mirrors of CRAN), and the CSDA paper (preprint / vignette) by Conrad and myself has been cited 469 times according to Google Scholar.

This release brings a first new upstream fix in the new release series 11.*. In particular, treatment of ill-conditioned matrices is further strengthened. We once again tested this very rigorously via three different RC releases, each of which got a full reverse-dependencies run (for which results are always logged here). A minor issue with old g++ compilers was found once 11.1.0 was tagged, so this upstream release is now 11.1.1. Also fixed is an OpenMP setup issue where Justin Silverman noticed that we did not propagate the -fopenmp setting correctly.

The full set of changes (since the last CRAN release 0.11.0.0.0) follows.

Changes in RcppArmadillo version 0.11.1.1.0 (2022-05-15)

  • Upgraded to Armadillo release 11.1.1 (Angry Kitchen Appliance)

    • added inv_opts::no_ugly option to inv() and inv_sympd() to disallow inverses of poorly conditioned matrices

    • more efficient handling of rank-deficient matrices via inv_opts::allow_approx option in inv() and inv_sympd()

    • better detection of rank deficient matrices by solve()

    • faster handling of symmetric and diagonal matrices by cond()

  • The configure script again propagates the 'found' case, thanks to Justin Silverman for the heads-up and suggested fix (Dirk and Justin in #376 and #377 fixing #375).

Changes in RcppArmadillo version 0.11.0.1.0 (2022-04-14)

  • Upgraded to Armadillo release 11.0.1 (Creme Brulee)

    • fix miscompilation of inv() and inv_sympd() functions when using inv_opts::allow_approx and inv_opts::tiny options

Courtesy of my CRANberries, there is a diffstat report relative to previous release. More detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

15 May, 2022 09:21PM

May 13, 2022

Antoine Beaupré

NVMe/SSD disk failure

Yesterday, my workstation (curie) was hung when I came into the office. After a "skinny elephant", the box rebooted, but it couldn't find the primary disk (in the BIOS). Instead, it booted on the secondary HDD, still running an old Fedora 27 install which somehow survived to this day, possibly because BTRFS is incomprehensible.

Somehow, I blindly accepted the Fedora prompt asking me to upgrade to Fedora 28, not realizing that:

  1. Fedora is now at release 36, not 28
  2. major upgrades take about an hour...
  3. ... and happen at boot time, blocking the entire machine (I'll remember this next time I laugh at Windows and Mac OS users stuck on updates on boot)
  4. you can't skip more than one major upgrade

Which means that upgrading to latest would take over 4 hours. Thankfully, it's mostly automated and seems to work pretty well (which is not exactly the case for Debian). It still seems like a lot of wasted time -- it would probably be better to just reinstall the machine at this point -- and not what I had planned to do that morning at all.

In any case, after waiting all that time, the machine booted (in Fedora) again, and now it could detect the SSD disk. The BIOS could find the disk too, so after I reinstalled grub (from Fedora) and fixed the boot order, it rebooted, but secureboot failed, so I turned that off (!?), and I was back in Debian.

I did an emergency backup with ddrescue, from the running system, which probably doesn't really work as a backup (because the filesystem is likely to be corrupt) but it was fast enough (20 minutes) and gave me some peace of mind. My offsite backups have been down for a while and since I treat my workstations as "cattle" (not "pets"), I don't have a solid recovery scenario for those situations other than "just reinstall and run Puppet", which takes a while.
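For reference, a ddrescue emergency copy of this sort looks roughly like the following; device and file names are illustrative, not the exact invocation used here:

# -d uses direct disc access; the third argument is the mapfile that
# would let a later ddrescue run resume or retry the bad areas.
ddrescue -d /dev/nvme0n1 /mnt/backup/curie-nvme.img /mnt/backup/curie-nvme.map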

Now I'm wondering what the next step is: probably replace the disk anyways (the new one is bigger: 1TB instead of 500GB), or keep the new one as a hot backup somehow. Too bad I don't have a snapshotting filesystem on there... (Technically, I have LVM, but LVM snapshots are heavy and slow, and can't atomically cover the entire machine.)

It's kind of scary how this thing failed: totally dropped off the bus, just not in the BIOS at all. I prefer the way spinning rust fails: clickety sounds, tons of warnings beforehand, partial recovery possible. With this new flashy junk, you just lose everything all at once. Not fun.

13 May, 2022 08:19PM

BTRFS notes

I'm not a fan of BTRFS. This page serves as a reminder of why, but also a cheat sheet to figure out basic tasks in a BTRFS environment because those are not obvious to me, even after repeatedly having to deal with them.

Content warning: there might be mentions of ZFS.

Stability concerns

I'm worried about BTRFS stability, which has been historically ... changing. RAID-5 and RAID-6 are still marked unstable, for example. It's kind of a lucky guess whether your current kernel will behave properly with your planned workload. For example, in Linux 4.9, RAID-1 and RAID-10 were marked as "mostly OK" with a note that says:

Needs to be able to create two copies always. Can get stuck in irreversible read-only mode if only one copy can be made.

Even as of now, RAID-1 and RAID-10 has this note:

The simple redundancy RAID levels utilize different mirrors in a way that does not achieve the maximum performance. The logic can be improved so the reads will spread over the mirrors evenly or based on device congestion.

Granted, that's not a stability concern anymore, just performance. A reviewer of a draft of this article actually claimed that BTRFS only reads from one of the drives, which hopefully is inaccurate, but goes to show how confusing all this is.

There are other warnings in the Debian wiki that are quite scary. Even the legendary Arch wiki has a warning on top of their BTRFS page, still.

Even if those issues are now fixed, it can be hard to tell when they were fixed. There is a changelog by feature but it explicitly warns that it doesn't know "which kernel version it is considered mature enough for production use", so it's also useless for this.

It would have been much better if BTRFS had been released into the world only once those bugs were completely fixed. Or, at least, if features were announced when they were stable, not just "we merged to mainline, good luck". Even now, we get mixed messages in the official BTRFS documentation, which says "The Btrfs code base is stable" (main page) while at the same time clearly stating unstable parts in the status page (currently RAID56).

There are much harsher BTRFS critics than me out there so I will stop here, but let's just say that I feel a little uncomfortable trusting server data with full RAID arrays to BTRFS. But surely, for a workstation, things should just work smoothly... Right? Well, let's see the snags I hit.

My BTRFS test setup

Before I go any further, I should probably clarify how I am testing BTRFS in the first place.

The reason I tried BTRFS is that I was ... let's just say "strongly encouraged" by the LWN editors to install Fedora for the terminal emulators series. That, in turn, meant the setup was done with BTRFS, because that was somewhat the default in Fedora 27 (or did I want to experiment? I don't remember, it's been too long already).

So Fedora was set up on my 1TB HDD and, with encryption, the partition table looks like this:

NAME                   MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda                      8:0    0 931,5G  0 disk  
├─sda1                   8:1    0   200M  0 part  /boot/efi
├─sda2                   8:2    0     1G  0 part  /boot
├─sda3                   8:3    0   7,8G  0 part  
│ └─fedora_swap        253:5    0   7.8G  0 crypt [SWAP]
└─sda4                   8:4    0 922,5G  0 part  
  └─fedora_crypt       253:4    0 922,5G  0 crypt /

(This might not entirely be accurate: I rebuilt this from the Debian side of things.)

This is pretty straightforward, except for the swap partition: normally, I just treat swap like any other volume and create it as a logical volume. This is now just speculation, but I bet it was set up this way because swap file support was only added to BTRFS in Linux 5.0.

I fully expect BTRFS experts to yell at me now because this is an old setup and BTRFS is so much better now, but that's exactly the point here. That setup is not that old (2018? old? really?), and migrating to a new partition scheme isn't exactly practical right now. But let's move on to more practical considerations.

No builtin encryption

BTRFS aims at replacing the entire mdadm, LVM, and ext4 stack with a single entity, and adding new features like deduplication, checksums and so on.

Yet there is one feature it is critically missing: encryption. See, my typical stack is actually mdadm, LUKS, and then LVM and ext4. This is convenient because I have only a single volume to decrypt.

If I were to use BTRFS on servers, I'd need to have one LUKS volume per-disk. For a simple RAID-1 array, that's not too bad: one extra key. But for large RAID-10 arrays, this gets really unwieldy.

The obvious BTRFS alternative, ZFS, supports encryption out of the box and mixes it above the disks so you only have one passphrase to enter. The main downside of ZFS encryption is that it happens above the "pool" level so you can typically see filesystem names (and possibly snapshots, depending on how it is built), which is not the case with a more traditional stack.

Subvolumes, filesystems, and devices

I find BTRFS's architecture to be utterly confusing. In the traditional LVM stack (which is itself kind of confusing if you're new to that stuff), you have those layers:

  • disks: let's say /dev/nvme0n1 and nvme1n1
  • RAID arrays with mdadm: let's say the above disks are joined in a RAID-1 array in /dev/md1
  • volume groups or VG with LVM: the above RAID device (technically a "physical volume" or PV) is assigned into a VG, let's call it vg_tbbuild05 (multiple PVs can be added to a single VG which is why there is that abstraction)
  • LVM logical volumes: out of that volume group actually "virtual partitions" or "logical volumes" are created, that is where your filesystem lives
  • filesystem, typically with ext4: that's your normal filesystem, which treats the logical volume as just another block device

A typical server setup would look like this:

NAME                      MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
nvme0n1                   259:0    0   1.7T  0 disk  
├─nvme0n1p1               259:1    0     8M  0 part  
├─nvme0n1p2               259:2    0   512M  0 part  
│ └─md0                     9:0    0   511M  0 raid1 /boot
├─nvme0n1p3               259:3    0   1.7T  0 part  
│ └─md1                     9:1    0   1.7T  0 raid1 
│   └─crypt_dev_md1       253:0    0   1.7T  0 crypt 
│     ├─vg_tbbuild05-root 253:1    0    30G  0 lvm   /
│     ├─vg_tbbuild05-swap 253:2    0 125.7G  0 lvm   [SWAP]
│     └─vg_tbbuild05-srv  253:3    0   1.5T  0 lvm   /srv
└─nvme0n1p4               259:4    0     1M  0 part

I stripped the other nvme1n1 disk because it's basically the same.

Now, if we look at my BTRFS-enabled workstation, which doesn't even have RAID, we have the following:

  • disk: /dev/sda with, again, /dev/sda4 being where BTRFS lives
  • filesystem: fedora_crypt, which is, confusingly, kind of like a volume group. It's where everything lives. I think.
  • subvolumes: home, root, /, etc. Those are actually the things that get mounted. You'd think you'd mount a filesystem, but no, you mount a subvolume. That is backwards.

It looks something like this to lsblk:

NAME                   MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda                      8:0    0 931,5G  0 disk  
├─sda1                   8:1    0   200M  0 part  /boot/efi
├─sda2                   8:2    0     1G  0 part  /boot
├─sda3                   8:3    0   7,8G  0 part  [SWAP]
└─sda4                   8:4    0 922,5G  0 part  
  └─fedora_crypt       253:4    0 922,5G  0 crypt /srv

Notice how we don't see all the BTRFS volumes here? Maybe it's because I'm mounting this from the Debian side, but lsblk definitely gets confused here. I frankly don't quite understand what's going on, even after repeatedly looking around the rather dismal documentation. But that's what I gather from the following commands:

root@curie:/home/anarcat# btrfs filesystem show
Label: 'fedora'  uuid: 5abb9def-c725-44ef-a45e-d72657803f37
    Total devices 1 FS bytes used 883.29GiB
    devid    1 size 922.47GiB used 916.47GiB path /dev/mapper/fedora_crypt

root@curie:/home/anarcat# btrfs subvolume list /srv
ID 257 gen 108092 top level 5 path home
ID 258 gen 108094 top level 5 path root
ID 263 gen 108020 top level 258 path root/var/lib/machines

I only got to that point through trial and error. Notice how I use an existing mountpoint to list the related subvolumes. If I try to use the filesystem path, the one that's listed in filesystem show, I fail:

root@curie:/home/anarcat# btrfs subvolume list /dev/mapper/fedora_crypt 
ERROR: not a btrfs filesystem: /dev/mapper/fedora_crypt
ERROR: can't access '/dev/mapper/fedora_crypt'

Maybe I just need to use the label? Nope:

root@curie:/home/anarcat# btrfs subvolume list fedora
ERROR: cannot access 'fedora': No such file or directory
ERROR: can't access 'fedora'

This is really confusing. I don't even know if I understand this right, and I've been staring at this all afternoon. Hopefully, the lazyweb will correct me eventually.

(As an aside, why are they called "subvolumes"? If something is a "sub" of "something else", that "something else" must exist, right? But no, BTRFS doesn't have "volumes", it only has "subvolumes". Go figure. Presumably the filesystem still holds "files" though; at least empirically it doesn't seem like it lost anything so far.)

In any case, at least I can refer to this section in the future, the next time I fumble around the btrfs commandline, as I surely will. I will possibly even update this section as I get better at it, or based on my reader's judicious feedback.

Mounting BTRFS subvolumes

So how did I even get to that point? I have this in my /etc/fstab, on the Debian side of things:

UUID=5abb9def-c725-44ef-a45e-d72657803f37   /srv    btrfs  defaults 0   2

This thankfully ignores all the subvolume nonsense because it relies on the UUID. mount tells me that's actually the "root" (? /?) subvolume:

root@curie:/home/anarcat# mount | grep /srv
/dev/mapper/fedora_crypt on /srv type btrfs (rw,relatime,space_cache,subvolid=5,subvol=/)

Let's see if I can mount the other volumes I have on there. Remember that subvolume list showed I had home, root, and var/lib/machines. Let's try root:

mount -o subvol=root /dev/mapper/fedora_crypt /mnt

Interestingly, root is not the same as /, it's a different subvolume! It seems to be the Fedora root (/, really) filesystem. No idea what is happening here. I also have a home subvolume, let's mount it too, for good measure:

mount -o subvol=home /dev/mapper/fedora_crypt /mnt/home

Note that lsblk doesn't notice those two new mountpoints, and that's normal: it only lists block devices and subvolumes (rather inconveniently, I'd say) do not show up as devices:

root@curie:/home/anarcat# lsblk 
NAME                   MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda                      8:0    0 931,5G  0 disk  
├─sda1                   8:1    0   200M  0 part  
├─sda2                   8:2    0     1G  0 part  
├─sda3                   8:3    0   7,8G  0 part  
└─sda4                   8:4    0 922,5G  0 part  
  └─fedora_crypt       253:4    0 922,5G  0 crypt /srv

This is really, really confusing. Maybe I did something wrong in the setup. Maybe it's because I'm mounting it from outside Fedora. Either way, it just doesn't feel right.

No disk usage per volume

If you want to see what's taking up space in one of those subvolumes, tough luck:

root@curie:/home/anarcat# df -h  /srv /mnt /mnt/home
Filesystem                Size  Used Avail Use% Mounted on
/dev/mapper/fedora_crypt  923G  886G   31G  97% /srv
/dev/mapper/fedora_crypt  923G  886G   31G  97% /mnt
/dev/mapper/fedora_crypt  923G  886G   31G  97% /mnt/home

(Notice, in passing, that it looks like the same filesystem is mounted in different places. In that sense, you'd expect /srv and /mnt (and /mnt/home?!) to be exactly the same, but no: they are entirely different directory structures, which I will not call "filesystems" here because everyone's head will explode in sparks of confusion.)

Yes, disk space is shared (that's the Size and Avail columns, makes sense). But nope, no cookie for you: they all have the same Used columns, so you need to actually walk the entire filesystem to figure out how much space each subvolume takes.

(For future reference, that's basically:

root@curie:/home/anarcat# time du -schx /mnt/home /mnt /srv
124M    /mnt/home
7.5G    /mnt
875G    /srv
883G    total

real    2m49.080s
user    0m3.664s
sys 0m19.013s

And yes, that was painfully slow.)

ZFS actually has some oddities in that regard, but at least it tells me how much disk each volume (and snapshot) takes:

root@tubman:~# time df -t zfs -h
Filesystem         Size  Used Avail Use% Mounted on
rpool/ROOT/debian  3.5T  1.4G  3.5T   1% /
rpool/var/tmp      3.5T  384K  3.5T   1% /var/tmp
rpool/var/spool    3.5T  256K  3.5T   1% /var/spool
rpool/var/log      3.5T  2.0G  3.5T   1% /var/log
rpool/home/root    3.5T  2.2G  3.5T   1% /root
rpool/home         3.5T  256K  3.5T   1% /home
rpool/srv          3.5T   80G  3.5T   3% /srv
rpool/var/cache    3.5T  114M  3.5T   1% /var/cache
bpool/BOOT/debian  571M   90M  481M  16% /boot

real    0m0.003s
user    0m0.002s
sys 0m0.000s

That's 56360 times faster, by the way.

But yes, that's not fair: those in the know will know there's a different command to do what df does with BTRFS filesystems, the btrfs filesystem usage command:

root@curie:/home/anarcat# time btrfs filesystem usage /srv
Overall:
    Device size:         922.47GiB
    Device allocated:        916.47GiB
    Device unallocated:        6.00GiB
    Device missing:          0.00B
    Used:            884.97GiB
    Free (estimated):         30.84GiB  (min: 27.84GiB)
    Free (statfs, df):        30.84GiB
    Data ratio:               1.00
    Metadata ratio:           2.00
    Global reserve:      512.00MiB  (used: 0.00B)
    Multiple profiles:              no

Data,single: Size:906.45GiB, Used:881.61GiB (97.26%)
   /dev/mapper/fedora_crypt  906.45GiB

Metadata,DUP: Size:5.00GiB, Used:1.68GiB (33.58%)
   /dev/mapper/fedora_crypt   10.00GiB

System,DUP: Size:8.00MiB, Used:128.00KiB (1.56%)
   /dev/mapper/fedora_crypt   16.00MiB

Unallocated:
   /dev/mapper/fedora_crypt    6.00GiB

real    0m0,004s
user    0m0,000s
sys 0m0,004s

Almost as fast as ZFS's df! Good job. But wait. That doesn't actually tell me usage per subvolume. Notice it's filesystem usage, not subvolume usage, which unhelpfully refuses to exist. That command only shows that one "filesystem"'s internal statistics, which are pretty opaque. You can also appreciate that it's wasting 6GB of "unallocated" disk space there: I probably did something Very Wrong and should be punished by Hacker News. I also wonder why it has 1.68GB of "metadata" used...

At this point, I just really want to throw that thing out of the window and restart from scratch. I don't really feel like learning the BTRFS internals, as they seem oblique and completely bizarre to me. It feels a little like the state of PHP now: it's actually pretty solid, but built upon so many layers of cruft that I still feel it corrupts my brain every time I have to deal with it (needle or haystack first? anyone?)...

Conclusion

I find BTRFS utterly confusing and I'm worried about its reliability. I think a lot of work is needed on usability and coherence before I even consider running this anywhere else than a lab, and that's really too bad, because there are really nice features in BTRFS that would greatly help my workflow. (I want to use filesystem snapshots as high-performance, high frequency backups.)

So now I'm experimenting with OpenZFS. It's so much simpler, just works, and it's rock solid. After this 8 minute read, I had a good understanding of how ZFS worked. Here's the 30 seconds overview:

  • vdev: a RAID array
  • zpool: a volume group of vdevs
  • datasets: normal filesystems (or block device, if you want to use another filesystem on top of ZFS)

There are also other special volumes like caches and logs that you can (really easily, compared to LVM caching) use to tweak your setup. You might also want to look at recordsize or ashift to tweak the filesystem to better fit your workload (or to deal with drives lying about their sector size, I'm looking at you Samsung), but that's it.

Running ZFS on Linux currently involves building kernel modules from scratch on every host, which I think is pretty bad. But I was able to setup a ZFS-only server using this excellent documentation without too much problem.

I'm hoping some day the copyright issues are resolved and we can at least ship binary packages, but the politics (e.g. convincing Debian that it is the right thing to do) and the logistics (e.g. DKMS auto-builders? is that even a thing? how about signed DKMS packages? fun-fun-fun!) seem really impractical. Who knows, maybe hell will freeze over (again) and Oracle will fix the CDDL. I personally think that we should just completely ignore this problem (which wasn't even supposed to be a problem) and ship binary packages directly, but I'm a pragmatist and do not always fit well with the free software fundamentalists.

All of this to say that, short term, we don't have a reliable, advanced filesystem/logical disk manager in Linux. And that's really too bad.

13 May, 2022 08:04PM

hackergotchi for Bits from Debian

Bits from Debian

New Debian Developers and Maintainers (March and April 2022)

The following contributors got their Debian Developer accounts in the last two months:

  • Henry-Nicolas Tourneur (hntourne)
  • Nick Black (dank)

The following contributors were added as Debian Maintainers in the last two months:

  • Jan Mojžíš
  • Philip Wyett
  • Thomas Ward
  • Fabio Fantoni
  • Mohammed Bilal
  • Guilherme de Paula Xavier Segundo

Congratulations!

13 May, 2022 03:00PM by Jean-Pierre Giraud

Arturo Borrero González

Toolforge GridEngine Debian 10 Buster migration

Toolforge logo, a circle with an anvil in the middle

This post was originally published in the Wikimedia Tech blog, authored by Arturo Borrero Gonzalez.

As discussed in the previous post, one of the most important and successful services provided by the Wikimedia Cloud Services team at the Wikimedia Foundation is Toolforge. Toolforge is a platform that allows users and developers to run and use a variety of applications with the ultimate goal of helping the Wikimedia mission from the technical side.

As you may know already, all Wikimedia Foundation servers are powered by Debian, and this includes Toolforge and Cloud VPS. The Debian Project mostly follows a two year cadence for releases, and Toolforge has been using Debian Stretch for some years now, which nowadays is considered “old-old-stable”. In accordance with our operating system upgrade policy, we should migrate our servers to Debian Buster.

Toolforge’s two different backend engines, Kubernetes and Grid Engine, are impacted by this upgrade policy. Grid Engine is notably tied to the underlying Debian release, and the execution environment offered to tools running in the grid is limited to what the Debian archive contains for a given release. This is unlike in Kubernetes, where tool developers can leverage container images and decouple the runtime environment selection from the base operating system.

Since the Toolforge grid's original conception, we have been doing the same operation over and over again:

  • Prepare a parallel grid deployment with the new operating system.
  • Ask our users (tool developers) to evaluate a newer version of their runtime and programming languages.
  • Introduce a migration window and coordinate a quick migration.
  • Finally, drop the old operating system from grid servers.

We’ve done this type of migration several times before. The last few ones were Ubuntu Precise to Ubuntu Trusty and Ubuntu Trusty to Debian Stretch. But this time around we had some special angles to consider.

So, you are upgrading the Debian release

  • You are migrating to Debian 11 Bullseye, no?
  • No, we’re migrating to Debian 10 Buster
  • Wait, but Debian 11 Bullseye exists!
  • Yes, we know! Let me explain…

We’re migrating the grid from Debian 9 Stretch to Debian 10 Buster, but perhaps we should be migrating from Debian 9 Stretch to Debian 11 Bullseye directly. This is a legitimate concern, and we discussed it in September 2021.

A timeline showing Debian versions since 2014

Back then, our reasoning was that skipping to Debian 11 Bullseye would be more difficult for our users, especially because of the greater jump in version numbers for the underlying runtimes. Additionally, all the migration work started before Debian 11 Bullseye was released. Our original intention was for the migration to be completed before the release. For a couple of reasons the project was delayed, and when it was time to restart the project we decided to continue with the original idea.

We had some work done to get Debian 10 Buster working correctly with the grid, and supporting Debian 11 Bullseye would require an additional effort. We didn’t even check if Grid Engine could be installed in the latest Debian release. For the grid, in general, the engineering effort to do a N+1 upgrade is lower than doing a N+2 upgrade. If we had tried a N+2 upgrade directly, things would have been much slower and difficult for us, and for our users.

In that sense, our conclusion was to not skip Debian 10 Buster.

We no longer want to run Grid Engine

In a previous blog post we shared information about our desired future for Grid Engine in Toolforge. Our intention is to discontinue our usage of this technology.

No grid? What about my tools?

Traditionally there have been two main workflows or use cases that were supported in the grid, but not in our Kubernetes backend:

  • Running jobs, long-running bots and other scheduled tasks.
  • Mixing runtime environments (for example, a nodejs app that runs some python code).

The good news is that work to handle the continuity of such use cases has already started. This takes the form of two main efforts:

  • The Toolforge buildpacks project — to support arbitrary runtime environments.
  • The Toolforge Jobs Framework — to support jobs, scheduled tasks, etc.

In particular, the Toolforge Jobs Framework has been available for a while in an open beta phase. We did some initial design and implementation, then deployed it in Toolforge for some users to try it and report bugs, report missing features, etc.

These are complex, feature-rich projects, and they deserve a dedicated blog post. More information on each will be shared in the future. For now, it is worth noting that both initiatives have some degree of development already.

The conclusion

Knowing all the moving parts, we were faced with a few hard questions when deciding how to approach the Debian 9 Stretch deprecation:

  • Should we not upgrade the grid, and focus on Kubernetes instead? Let Debian 9 Stretch be the last supported version on the grid?
  • What is the impact of these decisions on the technical community? What is best for our users?

The choices we made are already known in the community. A couple of weeks ago we announced the Debian 9 Stretch Grid Engine deprecation. In parallel to this migration, we decided to promote the new Toolforge Jobs Framework, even if it’s still in beta phase. This new option should help users to future-proof their tool, and reduce maintenance effort. An early migration to Kubernetes now will avoid any more future grid problems.

We truly hope that Debian 10 Buster is the last version we have for the grid, but as they say, hope is not a good strategy when it comes to engineering. What we will do is to work really hard in bringing Toolforge to the service level we want, and that means to keep developing and enabling more Kubernetes-based functionalities.

Stay tuned for more upcoming blog posts with additional information about Toolforge.

This post was originally published in the Wikimedia Tech blog, authored by Arturo Borrero Gonzalez.

13 May, 2022 09:42AM

Reproducible Builds (diffoscope)

diffoscope 212 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 212. This version includes the following changes:

* Add support for extracting vmlinuz/vmlinux Linux kernel images.
  (Closes: reproducible-builds/diffoscope#304)
* Some Python .pyc files report as "data", so support ".pyc" as a
  fallback extension.

You find out more by visiting the project homepage.

13 May, 2022 12:00AM

May 12, 2022

hackergotchi for Jonathan Dowland

Jonathan Dowland

Scalable Computing seminar

title slide

Last week I delivered a seminar for the research group I belong to, Scalable Computing. This was a slightly-expanded version of the presentation I gave at uksystems21. The most substantial change is the addition of a fourth example to describe recent work on optimising for a second non-functional requirement: Bandwidth.

12 May, 2022 10:19AM

May 11, 2022

hackergotchi for Raphaël Hertzog

Raphaël Hertzog

Debian 9 soon out of (free) security support

Organizations that are still running Debian 9 servers should be aware that the security support of the Debian LTS team will end on June 30th 2022.

If upgrading to a newer Debian release is not an option for them, then they should consider subscribing to Freexian’s Extended LTS to get security support for the packages that they are using on their servers.

It’s worth pointing out that we made some important changes to Freexian’s Extended LTS offering :

  • we are now willing to support each Debian release for up to 10 years (so 5 years of ELTS support after the 5 initial years), provided that we have customers willing to pay the required amount.
  • we have changed our pricing scheme so that we can announce up-front the (increasing) cost over the 5 years of ELTS
  • we have dropped the requirement to subscribe to the Debian LTS sponsorship, though it’s still a good idea to contribute to the funding of that project to ensure that one’s packages are properly monitored/maintained during the LTS period

This means that we have again extended the life of Debian 8 Jessie, this time until June 30th 2025. And that Debian 9 Stretch – that will start its “extended” life on July 1st 2022 – can be maintained up to June 30th 2027.

Organizations using Debian 10 should consider sponsoring the Debian LTS team since security support for that Debian release will soon transition from the regular security team to the LTS team.

11 May, 2022 01:13PM by Raphaël Hertzog

May 10, 2022

hackergotchi for Ben Hutchings

Ben Hutchings

Debian LTS work, April 2022

In April I was assigned 16 hours of work by Freexian's Debian LTS initiative and carried over 8 hours from March. I worked 11 hours, and will carry over the remaining time to May.

I spent most of my time triaging security issues for Linux, working out which of them were fixed upstream and which actually applied to the versions provided in Debian 9 "stretch". I also rebased the Linux 4.9 (linux) package on the latest stable update, but did not make an upload this month.

10 May, 2022 09:41PM

hackergotchi for Daniel Kahn Gillmor

Daniel Kahn Gillmor

2022 Digital Rights Job Fair

I'm lucky enough to work at the intersection between information communications technology and civil rights/civil liberties. I get to combine technical interests and social/political interests.

I've talked with many folks over the years who are interested in doing similar work. Some come from a technical background, and some from an activist background (and some from both). Are you one of them? Are you someone who works as an activist or in a technical field who wants to look into different ways of merging these interests?

Some great organizers maintain a job board for Digital Rights. Next month they'll host a Digital Rights Job Fair, which offers an opportunity to talk with good people at organizations that fight in different ways for a better world. You need to RSVP to attend.

Digital Rights Job Fair

10 May, 2022 08:39PM by Daniel Kahn Gillmor

Russell Coker

Elon and Free Speech

Elon Musk has made the news for spending billions to buy a share of Twitter for the alleged purpose of providing free speech. The problem with this claim is that having any company controlling a large portion of the world’s communication is inherently bad for free speech. The same applies for Facebook, but that’s not a hot news item at the moment.

If Elon wanted to provide free speech he would want to have decentralised messaging systems so that someone who breaks rules on one platform could find another with different rules. Among other things free speech ideally permits people to debate issues with residents of another country on issues related to different laws. If advocates for the Russian government get kicked off Twitter as part of the American sanctions against Russia then American citizens can’t debate the issue with Russian citizens via Twitter. Mastodon is one example of a federated competitor to Twitter [1]. With a federated messaging system each host could make independent decisions about interpretation of sanctions. Someone who used a Mastodon instance based in the US could get a second account in another country if they wanted to communicate with people in countries that are sanctioned by the US.

The problem with Mastodon at the moment is lack of use. It’s got a good set of features and support for different platforms, there are apps for Android and iPhone as well as lots of other software using the API. But if the people you want to communicate with aren’t on it then it’s less useful. Elon could solve that problem by creating a Tesla Mastodon server and give a free account to everyone who buys a new Tesla, which is the sort of thing that a lot of Tesla buyers would like. It’s quite likely that other companies selling prestige products would follow that example. Everyone has seen evidence of people sharing photos on social media with someone else’s expensive car, a Mastodon account on ferrari.com or mercedes.com would be proof of buying the cars in question. The number of people who buy expensive cars new is a very small portion of the world population, but it’s a group of people who are more influential than average and others would join Mastodon servers to follow them.

The next thing that Elon could do to kill Twitter would be to have all his companies (which between them have more than a dozen verified Twitter accounts) use Mastodon accounts for their primary PR releases and then send the same content to Twitter with a 48 hour delay. That would force journalists and people who want to discuss those companies on social media to follow the Mastodon accounts. Again this wouldn't be a significant number of people, but they would be influential people. Getting journalists to use a communications system increases its importance.

The question is whether Elon is lacking the vision necessary to plan a Mastodon deployment or whether he just wants to allow horrible people to run wild on Twitter.

The Verge has an interesting article from 2019 about Gab using Mastodon [2]. The fact that over the last 2.5 years I didn’t even hear of Gab using Mastodon suggests that the fears of some people significantly exceeded the problem. I’m sure that some Gab users managed to harass some Mastodon users, but generally they were apparently banned quickly. As an aside the Mastodon server I use doesn’t appear to ban Gab, a search for Gab on it gave me a user posting about being “pureblood” at the top of the list.

Gab claims to have 4 million accounts and has an estimated 100,000 active users. If 5.5% of Tesla owners became active users on a hypothetical Tesla server, that would be the largest Mastodon server. Elon could demonstrate his commitment to free speech by refusing to ban Gab in any way. The Wikipedia page about Gab [3] has a long list of horrible people and activities associated with it. Is that the "free speech" to associate with Tesla? Polestar makes some nice electric cars that appear quite luxurious [4] and doesn't get negative PR from the behaviour of its owner; that's something Elon might want to consider.

Is this really about bragging rights? Buying a controlling interest in a company that has a partial monopoly on Internet communication is something to boast about. Could users of commercial social media be considered serfs who serve their billionaire overlord?

10 May, 2022 11:53AM by etbe

Melissa Wen

Multiple syncobjs support for V3D(V) (Part 2)

In the previous post, I described how we enable multiple syncobjs capabilities in the V3D kernel driver. Now I will tell you what was changed on the userspace side, where we reworked the V3DV sync mechanisms to use Vulkan multiple wait and signal semaphores directly. This change represents greater adherence to the Vulkan submission framework.

I was not used to Vulkan concepts and the V3DV driver. Fortunately, I counted on the guidance of the Igalia’s Graphics team, mainly Iago Toral (thanks!), to understand the Vulkan Graphics Pipeline, sync scopes, and submission order. Therefore, we changed the original V3DV implementation for vkQueueSubmit and all related functions to allow direct mapping of multiple semaphores from V3DV to the V3D-kernel interface.

Disclaimer: Here’s a brief and probably inaccurate background, which we’ll go into more detail later on.

In Vulkan, GPU work submissions are described as command buffers. These command buffers, with GPU jobs, are grouped in a command buffer submission batch, specified by vkSubmitInfo, and submitted to a queue for execution. vkQueueSubmit is the command called to submit command buffers to a queue. Besides command buffers, vkSubmitInfo also specifies semaphores to wait before starting the batch execution and semaphores to signal when all command buffers in the batch are complete. Moreover, a fence in vkQueueSubmit can be signaled when all command buffer batches have completed execution.
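To make those structures concrete, here is a hedged sketch of a single-batch submission from the application's point of view; the wait_sem, signal_sem, cmd_buf, queue and fence handles are assumptions, created elsewhere:

/* Minimal illustrative sketch of the submission described above; all handles
 * (wait_sem, signal_sem, cmd_buf, queue, fence) are assumed to already exist. */
VkPipelineStageFlags wait_stage = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;

VkSubmitInfo submit_info = {
    .sType = VK_STRUCTURE_TYPE_SUBMIT_INFO,
    .waitSemaphoreCount = 1,
    .pWaitSemaphores = &wait_sem,       /* wait before the batch starts */
    .pWaitDstStageMask = &wait_stage,
    .commandBufferCount = 1,
    .pCommandBuffers = &cmd_buf,
    .signalSemaphoreCount = 1,
    .pSignalSemaphores = &signal_sem,   /* signaled when the batch completes */
};

/* The fence is signaled once all command buffer batches have completed. */
vkQueueSubmit(queue, 1, &submit_info, fence);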

From this sequence, we can see some implicit ordering guarantees. Submission order defines the start order of execution between command buffers; in other words, it is determined by the order in which pSubmits appear in vkQueueSubmit and pCommandBuffers appear in VkSubmitInfo. However, we don't have any completion guarantees for jobs submitted to different GPU queues, which means they may overlap and complete out of order. Of course, jobs submitted to the same GPU engine follow start and finish order. As for signal operation order, a fence is ordered after all semaphore signal operations. In addition to implicit sync, we also have some explicit sync resources, such as semaphores, fences, and events.

Considering these implicit and explicit sync mechanisms, we rework the V3DV implementation of queue submissions to better use multiple syncobjs capabilities from the kernel. In this merge request, you can find this work: v3dv: add support to multiple wait and signal semaphores. In this blog post, we run through each scope of change of this merge request for a V3D driver-guided description of the multisync support implementation.

Groundwork and basic code clean-up:

As the original V3D kernel interface allowed only one semaphore, V3DV resorted to booleans to "translate" multiple semaphores into one. Consequently, if a command buffer batch had at least one semaphore, it needed to wait for all previously submitted jobs to complete before starting its execution. So, instead of just a boolean, we created and changed the structs that store semaphore information to accept the actual list of wait semaphores.
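As a purely illustrative sketch (these names are hypothetical, not the actual V3DV structs), the change amounts to replacing a boolean with the real semaphore lists coming from VkSubmitInfo:

/* Hypothetical sketch only: instead of a single "needs to wait" boolean,
 * carry the actual semaphore lists provided by the application. */
struct v3dv_submit_sync_info {
    uint32_t wait_sem_count;
    VkSemaphore *wait_sems;     /* semaphores to wait on before execution */
    uint32_t signal_sem_count;
    VkSemaphore *signal_sems;   /* semaphores to signal on completion */
};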

Expose multisync kernel interface to the driver:

In the two commits below, we basically updated the DRM V3D interface from that one defined in the kernel and verified if the multisync capability is available for use.

Handle multiple semaphores for all GPU job types:

At this point, we were only changing the submission design to consider multiple wait semaphores. Before supporting multisync, V3DV was waiting for the last job submitted to be signaled when at least one wait semaphore was defined, even when serialization wasn't required. V3DV handles GPU jobs according to the GPU queue in which they are submitted:

  • Control List (CL) for binning and rendering
  • Texture Formatting Unit (TFU)
  • Compute Shader Dispatch (CSD)

Therefore, we changed their submission setup so that jobs submitted to any of the GPU queues are able to handle more than one wait semaphore.

These commits created all mechanisms to set arrays of wait and signal semaphores for GPU job submissions:

  • Checking the conditions to define the wait_stage.
  • Wrapping them in a multisync extension.
  • According to the kernel interface (described in the previous blog post), configure the generic extension as a multisync extension.

Finally, we extended the ability of GPU jobs to handle multiple signal semaphores, but at this point, no GPU job is actually in charge of signaling them. With this in place, we could rework part of the code that tracks CPU and GPU job completions by verifying the GPU status and threads spawned by Event jobs.

Rework the QueueWaitIdle mechanism to track the syncobj of the last job submitted in each queue:

As we had only single in/out syncobj interfaces for semaphores, we used a single last_job_sync to synchronize job dependencies of the previous submission. Although the DRM scheduler guarantees the order of starting to execute a job in the same queue in the kernel space, the order of completion isn't predictable. On the other hand, we still needed to use syncobjs to follow job completion since we have event threads on the CPU side. Therefore, a more accurate implementation requires last_job syncobjs to track when each engine (CL, TFU, and CSD) is idle. We also needed to keep the driver working on previous versions of the v3d kernel driver with single semaphores, so we kept tracking an ANY last_job_sync to preserve the previous implementation.
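A rough sketch of that per-queue tracking (with illustrative names, not the exact V3DV code) could look like this:

/* Illustrative only: one syncobj per GPU engine, plus a catch-all entry kept
 * for older kernels that only support the single-semaphore interface. */
enum v3dv_queue_type {
    V3DV_QUEUE_CL = 0,
    V3DV_QUEUE_TFU,
    V3DV_QUEUE_CSD,
    V3DV_QUEUE_ANY,
    V3DV_QUEUE_COUNT,
};

struct v3dv_last_job_syncs {
    uint32_t syncobjs[V3DV_QUEUE_COUNT];   /* DRM syncobj handles */
};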

Rework synchronization and submission design to let the jobs handle wait and signal semaphores:

With multiple semaphores support, the conditions for waiting and signaling semaphores changed accordingly to the particularities of each GPU job (CL, CSD, TFU) and CPU job restrictions (Events, CSD indirect, etc.). In this sense, we redesigned V3DV semaphores handling and job submissions for command buffer batches in vkQueueSubmit.

We scrutinized possible scenarios for submitting command buffer batches in order to change the original implementation carefully. This resulted in three more commits:

We keep track of whether we have submitted a job to each GPU queue (CSD, TFU, CL) and a CPU job for each command buffer. We use syncobjs to track the last job submitted to each GPU queue and a flag that indicates if this represents the beginning of a command buffer.

The first GPU job submitted to a GPU queue in a command buffer should wait on wait semaphores. The first CPU job submitted in a command buffer should call v3dv_QueueWaitIdle() to do the waiting and ignore semaphores (because it is waiting for everything).

If the job is not the first but has the serialize flag set, it should wait on the completion of all last job submitted to any GPU queue before running. In practice, it means using syncobjs to track the last job submitted by queue and add these syncobjs as job dependencies of this serialized job.

If this job is the last job of a command buffer batch, it may be used to signal semaphores if this command buffer batch has only one type of GPU job (because we have guarantees of execution ordering). Otherwise, we emit a no-op job just to signal semaphores. It waits on the completion of all last jobs submitted to any GPU queue and then signals the semaphores. Note: at some point we changed this approach to correctly deal with ordering changes caused by event threads. Whenever we have an event job in the command buffer, we cannot rely on the "last job in the last command buffer" assumption; we have to wait for all event threads to complete before signaling.

After submitting all command buffers, we emit a no-op job to wait on the completion of all last jobs by queue and signal the fence. Note: at some point, we changed this approach to correctly deal with ordering changes caused by event threads, as mentioned before.

Final considerations

With many changes and many rounds of reviews, the patchset was merged. After more validations and code review, we polished and fixed the implementation together with external contributions.

Also, multisync capabilities enabled us to add new features to V3DV and switch the driver to the common synchronization and submission framework:

  • v3dv: expose support for semaphore imports

    This was waiting for multisync support in the v3d kernel, which is already available. Exposing this feature however enabled a few more CTS tests that exposed pre-existing bugs in the user-space driver so we fix those here before exposing the feature.

  • v3dv: Switch to the common submit framework

    This should give you emulated timeline semaphores for free and kernel-assisted sharable timeline semaphores for cheap once you have the kernel interface wired in.

We used a set of games to ensure there was no performance regression in the new implementation. For this, we used GFXReconstruct to capture Vulkan API calls when playing those games. Then, we compared results with and without multisync capabilities in the kernel space, and also with multisync enabled in v3dv. We didn't observe any compromise in performance, and we even saw improvements when replaying scenes of the vkQuake game.

10 May, 2022 09:00AM

Multiple syncobjs support for V3D(V) (Part 1)

As you may already know, we at Igalia have been working on several improvements to the 3D rendering drivers of Broadcom Videocore GPU, found in Raspberry Pi 4 devices. One of our recent works focused on improving V3D(V) drivers adherence to Vulkan submission and synchronization framework. We had to cross various layers from the Linux Graphics stack to add support for multiple syncobjs to V3D(V), from the Linux/DRM kernel to the Vulkan driver. We have delivered bug fixes, a generic gate to extend job submission interfaces, and a more direct sync mapping of the Vulkan framework. These changes did not impact the performance of the tested games and brought greater precision to the synchronization mechanisms. Ultimately, support for multiple syncobjs opened the door to new features and other improvements to the V3DV submission framework.

DRM Syncobjs

But, first, what are DRM sync objs?

* DRM synchronization objects (syncobj, see struct &drm_syncobj) provide a
* container for a synchronization primitive which can be used by userspace
* to explicitly synchronize GPU commands, can be shared between userspace
* processes, and can be shared between different DRM drivers.
* Their primary use-case is to implement Vulkan fences and semaphores.
[...]
* At it's core, a syncobj is simply a wrapper around a pointer to a struct
* &dma_fence which may be NULL.

And Jason Ekstrand well-summarized dma_fence features in a talk at the Linux Plumbers Conference 2021:

A struct that represents a (potentially future) event:

  • Has a boolean “signaled” state
  • Has a bunch of useful utility helpers/concepts, such as refcount, callback wait mechanisms, etc.

Provides two guarantees:

  • One-shot: once signaled, it will be signaled forever
  • Finite-time: once exposed, it is guaranteed to signal in a reasonable amount of time

What does multiple semaphores support mean for Raspberry Pi 4 GPU drivers?

For our main purpose, the multiple syncobjs support means that V3DV can submit jobs with more than one wait and signal semaphore. In the kernel space, wait semaphores become explicit job dependencies to wait on before executing the job. Signal semaphores (or post dependencies), in turn, work as fences to be signaled when the job completes its execution, unlocking following jobs that depend on its completion.

The multisync support development comprised many decision-making points and steps, summarized as follows:

  • added capabilities to the v3d kernel driver to handle multiple syncobjs;
  • exposed the multisync capabilities to userspace through a generic extension;
  • reworked the synchronization mechanisms of the V3DV driver to benefit from this feature;
  • enabled the simulator to work with multiple semaphores; and
  • tested on Vulkan games to verify correctness and possible performance enhancements.

We decided to refactor parts of the V3D(V) submission design in kernel space and userspace during this development. We improved job scheduling on the V3D kernel side and the V3DV job submission design. We also delivered more accurate synchronization mechanisms and further updates in the Broadcom Vulkan driver running on the Raspberry Pi 4. Therefore, we summarize here the changes in the kernel space, describing the previous state of the driver, the decisions taken, side improvements, and fixes.

From single to multiple binary in/out syncobjs:

Initially, V3D was very limited in the number of syncobjs per job submission. The V3D job interfaces (CL, CSD, and TFU) only supported one syncobj (in_sync) to be added as an execution dependency and one syncobj (out_sync) to be signaled when a submission completes. The only exception was CL submission, which accepts two in_syncs: one for the binner and another for the render job; other than that, the options were just as limited.

Meanwhile in the userspace, the V3DV driver followed alternative paths to meet Vulkan’s synchronization and submission framework. It needed to handle multiple wait and signal semaphores, but the V3D kernel-driver interface only accepts one in_sync and one out_sync. In short, V3DV had to fit multiple semaphores into one when submitting every GPU job.

Generic ioctl extension

The first decision was how to extend the V3D interface to accept multiple in and out syncobjs. We could extend each ioctl with two entries of syncobj arrays and two entries for their counters. We could create new ioctls with multiple in/out syncobjs. But after examining other drivers' solutions for extending their submission interfaces, we decided to extend the V3D ioctls (v3d_cl_submit_ioctl, v3d_csd_submit_ioctl, v3d_tfu_submit_ioctl) with a generic ioctl extension.

I found a curious commit message when I was examining how other developers handled the issue in the past:

Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Mar 22 09:23:22 2019 +0000

    drm/i915: Introduce the i915_user_extension_method
    
    An idea for extending uABI inspired by Vulkan's extension chains.
    Instead of expanding the data struct for each ioctl every time we need
    to add a new feature, define an extension chain instead. As we add
    optional interfaces to control the ioctl, we define a new extension
    struct that can be linked into the ioctl data only when required by the
    user. The key advantage being able to ignore large control structs for
    optional interfaces/extensions, while being able to process them in a
    consistent manner.
    
    In comparison to other extensible ioctls, the key difference is the
    use of a linked chain of extension structs vs an array of tagged
    pointers. For example,
    
    struct drm_amdgpu_cs_chunk {
    	__u32		chunk_id;
        __u32		length_dw;
        __u64		chunk_data;
    };
[...]

So, inspired by amdgpu_cs_chunk and i915_user_extension, we opted to extend the V3D interface through a generic extension. After applying some suggestions from Iago Toral (Igalia) and Daniel Vetter, we reached the following struct:

struct drm_v3d_extension {
	__u64 next;
	__u32 id;
#define DRM_V3D_EXT_ID_MULTI_SYNC		0x01
	__u32 flags; /* mbz */
};

This generic extension has an id to identify the feature/extension we are adding to an ioctl (that maps the related struct type), a pointer to the next extension, and flags (if needed). Whenever we need to extend the V3D interface again for another specific feature, we subclass this generic extension into the specific one instead of extending ioctls indefinitely.
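To make this more concrete, here is a hedged sketch of how a driver can walk such an extension chain coming from userspace; the helper name and error handling are illustrative and not necessarily the exact v3d code:

/* Sketch: walk the user-provided chain of drm_v3d_extension structs and
 * dispatch on the extension id. */
static int v3d_get_extensions(struct drm_file *file_priv, u64 ext_handles)
{
	struct drm_v3d_extension __user *user_ext = u64_to_user_ptr(ext_handles);

	while (user_ext) {
		struct drm_v3d_extension ext;

		if (copy_from_user(&ext, user_ext, sizeof(ext)))
			return -EFAULT;

		switch (ext.id) {
		case DRM_V3D_EXT_ID_MULTI_SYNC:
			/* The drm_v3d_multi_sync struct embeds this header,
			 * so the full extension can be copied and parsed here. */
			break;
		default:
			return -EINVAL;
		}

		user_ext = u64_to_user_ptr(ext.next);
	}
	return 0;
}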

Multisync extension

For the multiple syncobjs extension, we define a multi_sync extension struct that subclasses the generic extension struct. It has arrays of in and out syncobjs, the respective number of elements in each of them, and a wait_stage value used in CL submissions to determine which job needs to wait for syncobjs before running.

struct drm_v3d_multi_sync {
	struct drm_v3d_extension base;
	/* Array of wait and signal semaphores */
	__u64 in_syncs;
	__u64 out_syncs;

	/* Number of entries */
	__u32 in_sync_count;
	__u32 out_sync_count;

	/* set the stage (v3d_queue) to sync */
	__u32 wait_stage;

	__u32 pad; /* mbz */
};

And if a multisync extension is defined, the V3D driver ignores the previous interface of single in/out syncobjs.

Once we had the interface to support multiple in/out syncobjs, the v3d kernel driver needed to handle it. As V3D uses the DRM scheduler for job execution, changing from a single syncobj to multiple ones is quite straightforward. V3D copies the in syncobjs from userspace and uses drm_syncobj_find_fence() + drm_sched_job_add_dependency() to add all in_syncs (wait semaphores) as job dependencies, i.e. syncobjs to be checked by the scheduler before running the job. On CL submissions, we have the bin and render jobs, so V3D follows the value of wait_stage to determine which of the two depends on those in_syncs to start its execution.
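A hedged sketch of that dependency loop (the helper name and the handle array are assumptions; the real code reads the handles from the multisync extension) could look like this:

/* Sketch: turn every wait syncobj handle into a dma_fence and register it as
 * a scheduler dependency, so the job only starts once all of them signal. */
static int v3d_add_wait_deps(struct drm_file *file_priv, struct v3d_job *job,
			     const u32 *handles, u32 count)
{
	u32 i;
	int ret;

	for (i = 0; i < count; i++) {
		struct dma_fence *fence;

		ret = drm_syncobj_find_fence(file_priv, handles[i], 0, 0,
					     &fence);
		if (ret)
			return ret;

		/* The scheduler takes over the fence reference. */
		ret = drm_sched_job_add_dependency(&job->base, fence);
		if (ret)
			return ret;
	}
	return 0;
}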

When V3D defines the last job in a submission, it replaces the dma_fence of the out_syncs with the done_fence from this last job. It uses drm_syncobj_find() + drm_syncobj_replace_fence() to do that. Therefore, when the job completes its execution and signals done_fence, all out_syncs are signaled too.
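The signaling side is symmetric; a minimal sketch with the same caveats (assumed helper name and handle array) looks like this:

/* Sketch: point every signal syncobj at the last job's done_fence, so they
 * all signal when that job finishes. */
static int v3d_attach_signal_fences(struct drm_file *file_priv,
				    struct dma_fence *done_fence,
				    const u32 *handles, u32 count)
{
	u32 i;

	for (i = 0; i < count; i++) {
		struct drm_syncobj *syncobj = drm_syncobj_find(file_priv,
							       handles[i]);

		if (!syncobj)
			return -ENOENT;

		drm_syncobj_replace_fence(syncobj, done_fence);
		drm_syncobj_put(syncobj);
	}
	return 0;
}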

Other improvements to v3d kernel driver

This work also made possible some improvements in the original implementation. Following Iago’s suggestions, we refactored the job’s initialization code to allocate memory and initialize a job in one go. With this, we started to clean up resources more cohesively, clearly distinguishing cleanups in case of failure from job completion. We also fixed the resource cleanup when a job is aborted before the DRM scheduler arms it - at that point, drm_sched_job_arm() had recently been introduced to job initialization. Finally, we prepared the semaphore interface to implement timeline syncobjs in the future.

Going Up

The patchset that adds multiple syncobjs support and improvements to V3D is available here and comprises four patches:

  • drm/v3d: decouple adding job dependencies steps from job init
  • drm/v3d: alloc and init job in one shot
  • drm/v3d: add generic ioctl extension
  • drm/v3d: add multiple syncobjs support

After extending the V3D kernel interface to accept multiple syncobjs, we worked on V3DV to benefit from V3D multisync capabilities. In the next post, I will describe a little of this work.

10 May, 2022 08:00AM

Utkarsh Gupta

FOSS Activites in April 2022

Here’s my (thirty-first) monthly but brief update about the activities I’ve done in the F/L/OSS world.

Debian

This was my 40th month of actively contributing to Debian. I became a DM in late March 2019 and a DD on Christmas ‘19! \o/

There’s a bunch of things I did this month but mostly non-technical, now that DC22 is around the corner. Here are the things I did:

Debian Uploads

  • Helped Andrius w/ FTBFS for php-text-captcha, reported via #977403.
    • I fixed the same in Ubuntu a couple of months ago and they copied over the patch here.

Other $things:

  • Volunteering for DC22 Content team.
  • Leading the Bursary team w/ Paulo.
  • Answering a bunch of questions of referees and attendees around bursary.
  • Being an AM for Arun Kumar, process #1024.
  • Mentoring for newcomers.
  • Moderation of -project mailing list.

Ubuntu

This was my 15th month of actively contributing to Ubuntu. Now that I joined Canonical to work on Ubuntu full-time, there’s a bunch of things I do! \o/

I mostly worked on different things, I guess.

I was too lazy to maintain a list of things I worked on so there’s no concrete list atm. Maybe I’ll get back to this section later or will start to list stuff from the fall, as I was doing before. :D


Debian (E)LTS

Debian Long Term Support (LTS) is a project to extend the lifetime of all Debian stable releases to (at least) 5 years. Debian LTS is not handled by the Debian security team, but by a separate group of volunteers and companies interested in making it a success.

And Debian Extended LTS (ELTS) is its sister project, extending support to the Jessie release (+2 years after LTS support).

This was my thirty-first month as a Debian LTS and twentieth month as a Debian ELTS paid contributor.
I worked for 23.25 hours for LTS and 20.00 hours for ELTS.

LTS CVE Fixes and Announcements:

  • Issued DLA 2976-1, fixing CVE-2022-1271, for gzip.
    For Debian 9 stretch, these problems have been fixed in version 1.6-5+deb9u1.
  • Issued DLA 2977-1, fixing CVE-2022-1271, for xz-utils.
    For Debian 9 stretch, these problems have been fixed in version 5.2.2-1.2+deb9u1.
  • Working on src:tiff and src:mbedtls to fix the issues, still waiting for more issues to be reported, though.
  • Looking at src:mutt CVEs. Haven’t had the time to complete but shall roll out next month.

ELTS CVE Fixes and Announcements:

  • Issued ELA 593-1, fixing CVE-2022-1271, for gzip.
    For Debian 8 jessie, these problems have been fixed in version 1.6-4+deb8u1.
  • Issued ELA 594-1, fixing CVE-2022-1271, for xz-utils.
    For Debian 8 jessie, these problems have been fixed in version 5.1.1alpha+20120614-2+deb8u1.
  • Issued ELA 598-1, fixing CVE-2019-16935, CVE-2021-3177, and CVE-2021-4189, for python2.7.
    For Debian 8 jessie, these problems have been fixed in version 2.7.9-2-ds1-1+deb8u9.
  • Working on src:tiff and src:beep to fix the issues, still waiting for more issues to be reported for src:tiff and src:beep is a bit of a PITA, though. :)

Other (E)LTS Work:

  • Triaged gzip, xz-utils, tiff, beep, python2.7, python-django, and libgit2,
  • Signed up to be a Freexian Collaborator! \o/
  • Read through some bits around that.
  • Helped and assisted new contributors joining Freexian.
  • Answered questions (& discussions) on IRC (#debian-lts and #debian-elts).
  • General and other discussions on LTS private and public mailing list.
  • Attended monthly Debian meeting. Held on Jitsi this month.

Debian LTS Survey

I’ve spent 18 hours on the LTS survey on the following bits:

  • Rolled out the announcement. Started the survey.
  • Answered a bunch of queries, people asked via e-mail.
  • Looked at another bunch of tickets: https://salsa.debian.org/freexian-team/project-funding/-/issues/23.
  • Sent a reminder and fixed a few things here and there.
  • Gave a status update during the meeting.
  • Extended the duration of the survey.

Until next time.
:wq for today.

10 May, 2022 05:41AM

May 09, 2022

hackergotchi for Robert McQueen

Robert McQueen

Evolving a strategy for 2022 and beyond

As a board, we have been working on several initiatives to make the Foundation a better asset for the GNOME Project. We’re working on a number of threads in parallel, so I wanted to explain the “big picture” a bit more to try and connect together things like the new ED search and the bylaw changes.

We’re all here to see free and open source software succeed and thrive, so that people can be be truly empowered with agency over their technology, rather than being passive consumers. We want to bring GNOME to as many people as possible so that they have computing devices that they can inspect, trust, share and learn from.

In previous years we’ve tried to boost the relevance of GNOME (or technologies such as GTK) or solicit donations from businesses and individuals with existing engagement in FOSS ideology and technology. The problem with this approach is that we’re mostly addressing people and organisations who are already supporting or contributing FOSS in some way. To truly scale our impact, we need to look to the outside world, build better awareness of GNOME outside of our current user base, and find opportunities to secure funding to invest back into the GNOME project.

The Foundation supports the GNOME project with infrastructure, arranging conferences, sponsoring hackfests and travel, design work, legal support, managing sponsorships, advisory board, being the fiscal sponsor of GNOME, GTK, Flathub… and we will keep doing all of these things. What we’re talking about here are additional ways for the Foundation to support the GNOME project – we want to go beyond these activities, and invest into GNOME to grow its adoption amongst people who need it. This has a cost, and that means in parallel with these initiatives, we need to find partners to fund this work.

Neil has previously talked about themes such as education, advocacy, privacy, but we’ve not previously translated these into clear specific initiatives that we would establish in addition to the Foundation’s existing work. This is all a work in progress and we welcome any feedback from the community about refining these ideas, but here are the current strategic initiatives the board is working on. We’ve been thinking about growing our community by encouraging and retaining diverse contributors, and addressing evolving computing needs which aren’t currently well served on the desktop.

Initiative 1. Welcoming newcomers. The community is already spending a lot of time welcoming newcomers and teaching them the best practices. Those activities are as time consuming as they are important, but currently a handful of individuals are running initiatives such as GSoC, Outreachy and outreach to Universities. These activities help bring diverse individuals and perspectives into the community, and helps them develop skills and experience of collaborating to create Open Source projects. We want to make those efforts more sustainable by finding sponsors for these activities. With funding, we can hire people to dedicate their time to operating these programs, including paid mentors and creating materials to support newcomers in future, such as developer documentation, examples and tutorials. This is the initiative that needs to be refined the most before we can turn it into something real.

Initiative 2: Diverse and sustainable Linux app ecosystem. I spoke at the Linux App Summit about the work that GNOME and Endless have been supporting in Flathub, but this is an example of something which has a great overlap between commercial, technical and mission-based advantages. The key goal here is to improve the financial sustainability of participating in our community, which in turn has an impact on the diversity of who we can expect to afford to enter and remain in our community. We believe this is critically important for individual developers and contributors to unlock earning potential from our ecosystem, through donations or app sales. In turn, a healthy app ecosystem also improves the usefulness of the Linux desktop as a whole for potential users. We believe that we can build a case for commercial vendors in the space to join an advisory board alongside GNOME, KDE, etc. to give input into the governance and contribute to the costs of growing Flathub.

Initiative 3: Local-first applications for the GNOME desktop. This is what Thib has been starting to discuss on Discourse, in this thread. There are many different threats to free access to computing and information in today’s world. The GNOME desktop and apps need to give users convenient and reliable access to technology which works similarly to the tools they already use every day, but keeps them and their data safe from surveillance, censorship, filtering or just being completely cut off from the Internet. We believe that we can seek both philanthropic and grant funding for this work. It will make GNOME a more appealing and comprehensive offering for the many people who want to protect their privacy.

The idea is that these initiatives all sit on the boundary between the GNOME community and the outside world. If the Foundation can grow and deliver these kinds of projects, we are reaching to new people, new contributors and new funding. These contributions and investments back into GNOME represent a true “win-win” for the newcomers and our existing community.

(Originally posted to GNOME Discourse, please feel free to join the discussion there.)

09 May, 2022 02:01PM by ramcq

Russ Allbery

remctl 3.18

remctl is a simple RPC mechanism using Kerberos GSS-API authentication (or SSH authentication).
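For those who have never used it, a client call looks roughly like the following; the host name and the "status show" command are placeholders for illustration, not part of any real configuration:

# Authenticate over GSS-API and ask the remote remctld to run "status show";
# server-side ACLs decide which Kerberos principals may run which commands.
remctl server.example.org status show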

The primary change in this release, and the reason for the release, is to add support for PCRE2, the latest version of the Perl-Compatible Regular Expression library, since PCRE1 is now deprecated.

This release also improves some documentation, marks the allocation functions in the C client library with deallocation functions for GCC 11, and fixes some issues with the Python and Ruby bindings that were spotted by Ken Dreyer, as well as the normal update of portability support.

I still do plan to move the language bindings into separate packages, since this will make it easier to upload them to their per-language module repositories and that, in turn, will make them easier to use, but this version doesn't have those changes. I wanted to flush the portability changes and PCRE update out first before starting that project.

You can get the latest version from the remctl distribution page.

09 May, 2022 04:49AM

rra-c-util 10.2

rra-c-util is my collection of utility functions, mostly but not entirely for C, that I use with my various software releases.

There are two major changes in this release. The first is Autoconf support for PCRE2, the new version of the Perl-Compatible Regular Expression library (PCRE1 is now deprecated), which was the motivation for a new release. The second is a huge update to the Perl formatting rules due to lots of work by Julien ÉLIE for INN.

This release also tags deallocation functions, similar to the change mentioned for C TAP Harness 4.8, for all the utility libraries provided by rra-c-util, and fixes an issue with the systemd support.

You can get the latest version from the rra-c-util distribution page.

09 May, 2022 04:43AM

C TAP Harness 4.8

C TAP Harness is my C implementation of the Perl "Test Anything Protocol" test suite framework. It includes a test runner and libraries for both C and shell.

This is mostly a cleanup release to resync with other utility libraries. It does fix an installation problem by managing symlinks correctly, and adds support for GCC 11's new deallocation warnings.

The latter is a rather interesting new GCC feature. There is a Red Hat blog post about the implementation with more details, but the short version is that the __malloc__ attribute can now take an argument that specifies the function that should be used to deallocate the allocated object. GCC 11 and later can use that information to catch some deallocation bugs, such as deallocating things with the wrong function.
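To make that concrete, here is a minimal sketch of the extended attribute; the buffer type and function names are invented for illustration and are not from C TAP Harness:

#include <stdlib.h>

struct buffer {
    char *data;
    size_t size;
};

/* The deallocator must be declared before it is named in the attribute. */
void buffer_free(struct buffer *buf);

/* Pair buffer_new with buffer_free so GCC 11+ can diagnose mismatched
 * deallocations (-Wmismatched-dealloc) for pointers it returns. */
__attribute__((malloc, malloc(buffer_free)))
struct buffer *buffer_new(size_t size);

struct buffer *buffer_new(size_t size)
{
    struct buffer *buf = malloc(sizeof(*buf));
    if (buf == NULL)
        return NULL;
    buf->data = calloc(1, size);
    buf->size = size;
    return buf;
}

void buffer_free(struct buffer *buf)
{
    if (buf != NULL)
        free(buf->data);
    free(buf);
}

int main(void)
{
    struct buffer *buf = buffer_new(64);
    free(buf);   /* GCC 11+ warns: free() is not the registered deallocator */
    return 0;
}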

You can get the latest version from the C TAP Harness distribution page.

09 May, 2022 04:26AM

May 08, 2022

Thorsten Alteholz

My Debian Activities in April 2022

FTP master

This month I accepted 186 and rejected 26 packages. The overall number of packages that got accepted was 188.

Debian LTS

This was the ninety-fourth month in which I did some work for the Debian LTS initiative, started by Raphael Hertzog at Freexian.

This month my overall workload was 40 hours. During that time I did LTS and normal security uploads of:

  • [DLA 2973-1] minidlna security update for one CVE
  • [DLA 2974-1] fribidi security update for three CVEs
  • [DLA 2988-1] tinyxml security update for one CVE
  • [DLA 2987-1] libarchive security update for three CVEs
  • [#1009076] buster-pu: minidlna/1.2.1+dfsg-2+deb10u3
  • [#1009077] bullseye-pu: minidlna/1.3.0+dfsg-2+deb11u1
  • [#1009251] buster-pu: fribidi/1.0.5-3.1+deb10u2
  • [#1009250] bullseye-pu: fribidi/1.0.8-2+deb11u1
  • [#1010380] buster-pu: flac/1.3.2-3+deb10u2

Further, I worked on libvirt; the dependency problems in unstable have been resolved, so fixes in other releases can continue.

I also continued to work on security support for golang packages.

Last but not least I did some days of frontdesk duties.

Debian ELTS

This month was the forty-sixth ELTS month.

During my allocated time I uploaded:

  • ELA-591-1 for minidlna
  • ELA-592-1 for fribidi
  • ELA-602-1 for tinyxml
  • ELA-603-1 for libarchive

Last but not least I did some days of frontdesk duties.

Debian Printing

This month I uploaded new upstream versions or improved packaging of:

As I had already become the maintainer of usb-modeswitch, I also adopted usb-modeswitch-data.

Debian Astro

Unfortunately I didn’t do anything for this group, but in May I will upload a new version of openvlbi and several indi-3rdparty packages.

Other stuff

Last but not least I uploaded several new upstream versions of golang packages, but not before checking with ratt that all dependencies still work.

08 May, 2022 10:01AM by alteholz

May 06, 2022

hackergotchi for Dirk Eddelbuettel

Dirk Eddelbuettel

RProtoBuf 0.4.19 on CRAN: Updates

A new release 0.4.19 of RProtoBuf arrived on CRAN earlier today. RProtoBuf provides R with bindings for the Google Protocol Buffers (“ProtoBuf”) data encoding and serialization library used and released by Google, and deployed very widely in numerous projects as a language and operating-system agnostic protocol.

This release contains a pull request contribution by Michael Chirico to add support for the TextFormat API, a minor maintenance fix ensuring (standard) strings are referenced as std::string to avoid a hiccup on Arch builds, some repo updates, plus reporting of (package and library) versions on startup. The following section from the NEWS.Rd file has more details.

Changes in RProtoBuf version 0.4.19 (2022-05-06)

  • Small cleanups to repository

  • Raise minimum Protocol Buffers version to 3.3 (closes #83)

  • Update package version display, added to startup message

  • Expose TextFormat API (Michael Chirico in #88 closing #87)

  • Add missing explicit std:: on seven string instances in one file (closes #89)

Thanks to my CRANberries, there is a diff to the previous release. The RProtoBuf page has copies of the (older) package vignette, the ‘quick’ overview vignette, and the pre-print of our JSS paper. Questions, comments etc should go to the GitHub issue tracker off the GitHub repo.

If you like this or other open-source work I do, you can now sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

06 May, 2022 11:33PM

Antoine Beaupré

Wallabako 1.4.0 released

I don't particularly like it when people announce their personal projects on their blog, but I'm making an exception for this one, because it's a little special for me.

You see, I have just released Wallabako 1.4.0 (and a quick, mostly irrelevant 1.4.1 hotfix) today. It's the first release of that project in almost 3 years (the previous was 1.3.1, before the pandemic).

The other reason I figured I would mention it is that I have almost never talked about Wallabako on this blog at all, so many of my readers probably don't even know I sometimes meddle with Golang, which surprises even me sometimes.

What's Wallabako

Wallabako is a weird little program I designed to read articles on my E-book reader. I use it to spend less time on the computer: I save articles in a read-it-later app named Wallabag (hosted by a generous friend), and then Wallabako connects to that app, downloads an EPUB version of the book, and then I can read it on the device directly.

When I'm done reading the book, Wallabako notices and sets the article as read in Wallabag. I also set it to delete the book locally, but you can actually configure it to keep those books around forever if you feel like it.

Wallabako supports syncing read status with the built-in Kobo interface (called "Nickel"), Koreader and Plato. I happen to use Koreader for everything nowadays, but it should work equally well on the others.

Wallabako is actually set up to be started by udev when there's a connection change detected by the kernel, which is kind of a gross hack. It's clunky, but it actually works, and I thought for a while about switching to something else, but it's really the easiest way to go, and the one that requires the least interaction from the user.

Why I'm (still) using it

I wrote Wallabako because I read a lot of articles on the internet. It's actually most of my reading. I read about 10 books a year (which I don't think is much), but I probably read more in terms of time and pages in Wallabag. I haven't actually done the math, but I estimate I spend at least twice as much time reading articles as I spend reading books.

If I didn't have Wallabag, I would have hundreds of tabs open in my web browser all the time. So at least that problem is easily solved: throw everything in Wallabag, sort and read later.

If I didn't have Wallabako, however, I would either spend that time reading on the computer -- which I prefer to spend working on free software or work -- or on my phone -- which is kind of better, but really cramped.

I had actually stopped using (and developing) Wallabako for a while. Around 2019, I got tired of always reading those technical articles (basically work stuff!) at home. I realized I was just not "reading" (as in books! fiction! fun stuff!) anymore, at least not as much as I wanted.

So I tried to make this separation: the ebook reader is for cool book stuff. The rest is work. But because I had the Wallabag Android app on my phone and tablet, I could still read those articles there, which I thought was pretty neat. But that meant that I was constantly looking at my phone, which is something I'm generally trying to avoid, as it sets a bad example for the kids (small and big) around me.

Then I realized there was one stray ebook reader lying around at home. I had recently bought a Kobo Aura HD to read books, and I like that device. And it's going to stay locked down to reading books. But there's still that old battered Kobo Glo HD reader lying around, and I figured I could just borrow it to read Wallabag articles.

What is this new release

But oh boy that was a lot of work. Wallabako was kind of a mess: it was using the deprecated go dep tool, which lost the battle with go mod. Cross-compilation was broken for older devices, and I had to implement support for Koreader.

go mod

So I had to learn go mod. I'm still not sure I got that part right: LSP is yelling at me because it can't find the imports, and I'm generally just "YOLO everything" every time I get anywhere close to it. That's not the way to do Go, in general, and not how I like to do it either.

But I guess that, given time, I'll figure it out and make it work for me. It certainly works now. I think.

Cross compilation

The hard part was different. You see, Nickel uses SQLite to store metadata about books, so Wallabako actually needs to tap into that SQLite database to propagate read status. Originally, I just linked against some sqlite3 library I found lying around. It's basically a wrapper around the C-based SQLite and generally works fine. But that means you actually link your Golang program against a C library. And that's when things get a little nutty.

If you just build Wallabako naively, it will fail when deployed on the Kobo Glo HD. That's because the device runs a really old kernel: the prehistoric Linux kobo 2.6.35.3-850-gbc67621+ #2049 PREEMPT Mon Jan 9 13:33:11 CST 2017 armv7l GNU/Linux. That was built in 2017, but the kernel was actually released in 2010, a whole 5 years before the Glo HD was released in 2015, which is kind of outrageous. And yes, that is with the latest firmware release.

My bet is they just don't upgrade the kernel on those things, as the Glo was probably bought around 2017...

In any case, the problem is we are cross-compiling here. And Golang is pretty good about cross-compiling, but because we have C in there, we're actually cross-compiling with "CGO" which is really just Golang with a GCC backend. And that's much, much harder to figure out because you need to pass down flags into GCC and so on. It was a nightmare.
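The basic recipe is simple enough on paper; the nightmare is in all the extra flags and libraries the target actually needs. As a sketch only (the cross-toolchain name and GOARM level are assumptions for an armv7 device like the Glo HD, not Wallabako's actual build script):

# Build for 32-bit ARM with CGO enabled, pointing Go at a C cross-compiler
# so the C parts (the SQLite wrapper) are compiled for the target too.
CGO_ENABLED=1 GOOS=linux GOARCH=arm GOARM=7 \
    CC=arm-linux-gnueabihf-gcc \
    go build -o wallabako .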

That's until I found this outrageous "little" project called modernc.org/sqlite. What that thing does (with a hefty dose of dependencies that would make any Debian developer recoil in horror) is to transpile the SQLite C source code to Golang. You read that right: it rewrites SQLite in Go. On the fly. It's nuts.

But it works. And you end up with a "pure go" program, and that thing compiles much faster and runs fine on older kernels.

I still wasn't sure I wanted to just stick with that forever, so I kept the old sqlite3 code around, behind a compile-time tag. At the top of the nickel_modernc.go file, there's this magic string:

// +build !sqlite3

And at the top of nickel_sqlite3.go file, there's this magic string:

// +build sqlite3

So now, by default, the modernc file gets included, but if I pass --tags sqlite3 to the Go compiler (to go install or whatever), it will actually switch to the other implementation. Pretty neat stuff.
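In practice that boils down to something like the following; this is just a sketch of the two variants, not the project's actual build script:

# Default build: the transpiled, pure-Go SQLite from modernc.org/sqlite
go build ./...

# Opt back into the CGO-based sqlite3 wrapper instead
go build -tags sqlite3 ./...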

Koreader port

The last part was something I was hesitant to do for a long time, but it turned out to be pretty easy. I have basically switched to using Koreader to read everything. Books, PDFs, everything goes through it. I really like that it stores its metadata in sidecar files: I synchronize all my books with Syncthing, which means I can carry my read status, annotations and all that stuff without having to think about it. (And yes, I installed Syncthing on my Kobo.)

The koreader.go port was less than 80 lines, and I could even make a nice little test suite so that I don't have to redeploy that thing to the ebook reader at every code iteration.

I had originally thought I should add some sort of graphical interface in Koreader for Wallabako as well, and had requested that feature upstream. Unfortunately (or fortunately?), they took my idea and just ran with it. Some courageous soul actually wrote a full Wallabag plugin for koreader, in Lua of course.

Compared to the Wallabako implementation, the koreader plugin is much slower, probably because it downloads articles serially instead of concurrently. It is, however, much more usable, as the user is given visible feedback on the various steps. I still had to enable full debugging to diagnose a problem (which was that I shouldn't have a trailing slash, and that some special characters don't work in passwords). It's also better to write the config file with a normal text editor, over SSH or with the Kobo mounted to your computer, instead of typing those really long strings on the Kobo.

There's no sample config file, which makes that harder, but a workaround is to save the configuration with dummy values and fix them up afterwards. Finally, I also found the default setting ("Remotely delete finished articles") really dangerous, as it can basically lead to data loss (the Wallabag article being deleted!) for an unsuspecting user...

So basically, I started working on Wallabako again because the koreader implementation of their Wallabag client was not up to spec for me. It might be good enough for you, but I guess if you like Wallabako, you should thank the koreader folks for their sloppy implementation, as I'm now working again on Wallabako.

Actual release notes

Those are the actual release notes for 1.4.0.

Ship a lot of fixes that have accumulated in the 3 years since the last release.

Features:

  • add timestamp and git version to build artifacts
  • cleanup and improve debugging output
  • switch to pure go sqlite implementation, which helps
  • update all module dependencies
  • port to wallabago v6
  • support Plato library changes from 0.8.5+
  • support reading koreader progress/read status
  • Allow containerized builds, use gomod and avoid GOPATH hell
  • overhaul Dockerfile
  • switch to go mod

Documentation changes:

  • remove instability warning: this works well enough
  • README: replace branch name master by main in links
  • tweak mention of libreoffice to clarify concern
  • replace "kobo" references by "nickel" where appropriate
  • make a section about related projects
  • mention NickelMenu
  • quick review of the koreader implementation

Bugfixes:

  • handle errors in http request creation
  • Use OutputDir configuration instead of hardcoded wallabako paths
  • do not noisily fail if there's no entry for book in plato
  • regression: properly detect read status again after koreader (or plato?) support was added

How do I use this?

This is amazing. I can't believe someone did something that awesome. I want to cover you with gold and Tesla cars and fresh water.

You're weird please stop. But if you want to use Wallabako, head over to the README file which has installation instructions. It basically uses a hack in Kobo e-readers that will happily overwrite their root filesystem as soon as you drop this file named KoboRoot.tgz in the .kobo directory of your e-reader.

Note that there is no uninstall procedure and it messes with the reader's udev configuration (to trigger runs on wifi connect). You'll also need to create a JSON configuration file and configure a client in Wallabag.

And if you're looking for Wallabag hosting, Wallabag.it offers a 14-day free trial. You can also, obviously, host it yourself. Which is not the case for Pocket, even years after Mozilla bought the company. All this wouldn't actually be necessary if Pocket was open-source because Nickel actually ships with a Pocket client.

Shame on you, Mozilla. But you still make an awesome browser, so keep doing that.

06 May, 2022 04:26PM

hackergotchi for Holger Levsen

Holger Levsen

20220506-i-had-an-abortion

I had an abortion...

Well, it wasn't me, but when I was 18 my partner thankfully was able to take a 'morning-after-pill' because we were seriously not ready to have a baby. As one data point: We were both still in high school.

It's not possible to ban abortions. It's only possible to ban safe abortions.

06 May, 2022 01:42PM

hackergotchi for Dirk Eddelbuettel

Dirk Eddelbuettel

RQuantLib 0.4.16 on CRAN: Small Updates

A new release 0.4.16 of RQuantLib arrived at CRAN earlier today, and has been uploaded to Debian as well.

QuantLib is a very comprehensive free/open-source library for quantitative finance; RQuantLib connects it to the R environment and language.

The release of RQuantLib comes again about four months after the previous release, and brings a few small updates for daycounters, all thanks to Kai Lin, plus a small parameter change to avoid an error in an example, and small updates to the Docker files.

Changes in RQuantLib version 0.4.16 (2022-05-05)

  • Documentation for daycounters was updated and extended (Kai Lin)

  • Deprecated daycounters were appropriately updated (Kai Lin)

  • One example parameterization was changed to avoid error (Dirk)

  • The Docker files were updated

Courtesy of my CRANberries, there is also a diffstat report for this release. As always, more detailed information is on the RQuantLib page. Questions, comments etc should go to the new rquantlib-devel mailing list. Issue tickets can be filed at the GitHub repo.

If you like this or other open-source work I do, you can now sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

06 May, 2022 12:50AM

May 05, 2022

Reproducible Builds

Reproducible Builds in April 2022

Welcome to the April 2022 report from the Reproducible Builds project! In these reports, we try to summarise the most important things that we have been up to over the past month. If you are interested in contributing to the project, please take a few moments to visit our Contribute page on our website.

News

Cory Doctorow published an interesting article this month about the possibility of Undetectable backdoors for machine learning models. Given that machine learning models can provide unpredictably incorrect results, Doctorow recounts that there exists another category of “adversarial examples” that comprise “a gimmicked machine-learning input that, to the human eye, seems totally normal — but which causes the ML system to misfire dramatically” that permit the possibility of planting “undetectable back doors into any machine learning system at training time”.


Chris Lamb published two ‘supporter spotlights’ on our blog: the first about Amateur Radio Digital Communications (ARDC) and the second about the Google Open Source Security Team (GOSST).


Piergiorgio Ladisa, Henrik Plate, Matias Martinez and Olivier Barais published a new academic paper titled A Taxonomy of Attacks on Open-Source Software Supply Chains (PDF):

This work proposes a general taxonomy for attacks on open-source supply chains, independent of specific programming languages or ecosystems, and covering all supply chain stages from code contributions to package distribution. Taking the form of an attack tree, it covers 107 unique vectors, linked to 94 real-world incidents, and mapped to 33 mitigating safeguards.


Elsewhere in academia, Ly Vu Duc published his PhD thesis. Titled Towards Understanding and Securing the OSS Supply Chain (PDF), Duc’s abstract reads as follows:

This dissertation starts from the first link in the software supply chain, ‘developers’. Since many developers do not update their vulnerable software libraries, thus exposing the user of their code to security risks. To understand how they choose, manage and update the libraries, packages, and other Open-Source Software (OSS) that become the building blocks of companies’ completed products consumed by end-users, twenty-five semi-structured interviews were conducted with developers of both large and small-medium enterprises in nine countries. All interviews were transcribed, coded, and analyzed according to applied thematic analysis


Upstream news

Filippo Valsorda published an informative blog post recently called How Go Mitigates Supply Chain Attacks outlining the high-level features of the Go ecosystem that helps prevent various supply-chain attacks.


There was new/further activity on a pull request filed against openssl by Sebastian Andrzej Siewior in order to prevent it from saving CFLAGS, which may contain the -fdebug-prefix-map=<PATH> flag that is used to strip an arbitrary build path from the debug info — if this information remains recorded, then the binary is no longer reproducible if the build directory changes.


Events

The Linux Foundation’s SupplyChainSecurityCon will take place June 21st — 24th 2022, both virtually and in Austin, Texas. Long-time Reproducible Builds and openSUSE contributor Bernhard M. Wiedemann learned that he had his talk accepted, and will speak on Reproducible Builds: Unexpected Benefits and Problems on June 21st.


There will be an in-person "Debian Reunion" in Hamburg, Germany later this year, taking place from 23 — 30 May. Although this is a "Debian" event, there will be some folks from the broader Reproducible Builds community and, of course, everyone is welcome. Please see the event page on the Debian wiki for more information. 41 people have registered so far, and there are approximately 10 "on-site" beds still left.


The minutes and logs from our April 2022 IRC meeting have been published. In case you missed this one, our next IRC meeting will take place on May 31st at 15:00 UTC on #reproducible-builds on the OFTC network.


Debian

Roland Clobus wrote another in-depth status update about the status of ‘live’ Debian images, summarising the current situation that all major desktops build reproducibly with bullseye, bookworm and sid, including the Cinnamon desktop on bookworm and sid, “but at a small functionality cost: 14 words will be incorrectly abbreviated”. This work incorporated:

  • Reporting an issue about unnecessarily modified timestamps in the daily Debian installer images. []
  • Reporting a bug against the debian-installer in order to use a suitable kernel version. (#1006800)
  • Reporting a bug in texlive-binaries regarding the unreproducible content of .fmt files. (#1009196)
  • Adding hacks to make the Cinnamon desktop image reproducible in bookworm and sid. []
  • Adding a script to rebuild a live-build ISO image from a given timestamp. []
  • etc.

On our mailing list, Venkata Pyla started a thread on the issue of the Debian debconf cache being non-reproducible when creating system images, and Vagrant Cascadian posted an excellent summary of the reproducibility status of core package sets in Debian, soliciting similar information from other distributions.


Lastly, 122 reviews of Debian packages were added, 44 were updated and 193 were removed this month, adding to our extensive knowledge about identified issues. A number of issue types have been updated as well, including timestamps_generated_by_hevea, randomness_in_ocaml_preprocessed_files, build_path_captured_in_emacs_el_file, golang_compiler_captures_build_path_in_binary and build_path_captured_in_assembly_objects.


Other distributions

Happy birthday to GNU Guix, which recently turned 10 years old! People have been sharing their stories, in which reproducible builds and bootstrappable builds are a recurring theme as a feature important to its users and developers. The experiences are available on the GNU Guix blog as well as a post on fossandcrafts.org


In openSUSE, Bernhard M. Wiedemann posted his usual monthly reproducible builds status report.


Upstream patches

The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:


diffoscope

diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb prepared and uploaded versions 210 and 211 to Debian unstable, as well as noticed that some Python .pyc files are reported as data, so we should support .pyc as a fallback filename extension [].

In addition, Mattia Rizzolo disabled the Gnumeric tests in Debian as the package is not currently available [] and dropped mplayer from Build-Depends too []. Mattia also fixed an issue to ensure that the PATH environment variable is properly modified for all actions, not just when running the comparator. []


Testing framework

The Reproducible Builds project runs a significant testing framework at tests.reproducible-builds.org, to check packages and other artifacts for reproducibility. This month, the following changes were made:

  • Daniel Golle:

    • Prefer a different solution to avoid building all OpenWrt packages; skip packages from optional community feeds. []
  • Holger Levsen:

    • Detect Python deprecation warnings in the node health check. []
    • Detect failure to build the Debian Installer. []
  • Mattia Rizzolo:

    • Install disorderfs for building OpenWrt packages. []
  • Paul Spooren (OpenWrt-related changes):

    • Don’t build all packages whilst the core packages are not yet reproducible. []
    • Add a missing RUN directive to node_cleanup. []
    • Be less verbose during a toolchain build. []
    • Use disorderfs for rebuilds and update the documentation to match. [][][]
  • Roland Clobus:

    • Publish the last reproducible Debian ISO image. []
    • Use the rebuild.sh script from the live-build package. []

Lastly, node maintenance was also performed by Holger Levsen [][].


If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

05 May, 2022 07:18PM

hackergotchi for Bits from Debian

Bits from Debian

Google Platinum Sponsor of DebConf22

Googlelogo

We are very pleased to announce that Google has committed to supporting DebConf22 as a Platinum sponsor. This is the third year in a row that Google is sponsoring The Debian Conference at the highest tier!

Google is one of the largest technology companies in the world, providing a wide range of Internet-related services and products such as online advertising technologies, search, cloud computing, software, and hardware.

Google has been supporting Debian by sponsoring DebConf for more than ten years, and is also a Debian partner sponsoring parts of Salsa's continuous integration infrastructure within Google Cloud Platform.

With this additional commitment as Platinum Sponsor for DebConf22, Google helps make our annual conference possible, and directly supports the progress of Debian and Free Software, helping to strengthen the community that continues to collaborate on Debian projects throughout the rest of the year.

Thank you very much Google, for your support of DebConf22!

Become a sponsor too!

DebConf22 will take place from July 17th to 24th, 2022 at the Innovation and Training Park (ITP) in Prizren, Kosovo, and will be preceded by DebCamp, from July 10th to 16th.

And DebConf22 is still accepting sponsors! Interested companies and organizations may contact the DebConf team through sponsors@debconf.org, and visit the DebConf22 website at https://debconf22.debconf.org/sponsors/become-a-sponsor.

DebConf22 banner open registration

05 May, 2022 08:00AM by The Debian Publicity Team

May 03, 2022

hackergotchi for Steve Kemp

Steve Kemp

A plea for books ..

Recently I've been getting much more interested in the "retro" computers of my youth, partly because I've been writing crazy code in Z80 assembly-language, and partly because I've been preparing to introduce our child to his first computer:

  • An actual 1982 ZX Spectrum, cassette deck and all.
    • No internet
    • No hi-rez graphics
    • Easily available BASIC
    • And as a nice bonus the keyboard is wipe-clean!

I've got a few books, books I've hoarded for 30+ years, but I'd love to collect some more. So here's my request:

  • If you have any books covering either the Z80 processor, or the ZX Spectrum, please consider dropping me an email.

I'd be happy to pay €5-10 each for any book I don't yet own, and I'd also be more than happy to cover the cost of postage to Finland.

I'd be particularly pleased to see anything from Melbourne House, and while low-level is best, the coding books from Usborne (The Mystery Of Silver Mountain, etc, etc) wouldn't go amiss either.

I suspect most people who have collected and kept these wouldn't want to part with them, but just in case ..

03 May, 2022 05:15PM

hackergotchi for Gunnar Wolf

Gunnar Wolf

Using a RPi as a display adapter

Almost ten months ago, I mentioned on this blog I bought an ARM laptop, which is now my main machine while away from home — a Lenovo Yoga C630 13Q50. Yes, yes, I am still not as much away from home as I used to before, as this pandemic is still somewhat of a thing, but I do move more.

My main activity in the outside world with my laptop is teaching. I teach twice a week, and… well, having a display for my slides and for showing examples in the terminal and such is a must. However, as I said back in August, one of the hardware support issues for this machine is:

No HDMI support via the USB-C displayport. While I don’t expect
to go to conferences or even classes in the next several months,
I hope this can be fixed before I do. It’s a potential important
issue for me.

It has sadly… not yet been solved ☹ While many things have improved since kernel 5.12 (the first I used), the Device Tree does not yet hint at where external video might sit.

So, I went to the obvious: Many people carry different kinds of video adaptors… I carry a slightly bulky one: A RPi3 😐

For two months already (time flies!), I had an ugly contraption where the RPi3 connected via Ethernet and displayed a VNC client, and my laptop had a VNC server. Oh, but did I mention — my laptop works so much better with Wayland than with Xorg that I switched, and am now a happy user of the Sway compositor (a drop-in replacement for the i3 window manager). It is built over WLRoots, which is a great and (relatively) simple project, but will thankfully not carry some of Gnome or KDE’s ideas — not even those I’d rather have. So it took a bit of searching; I was very happy to find WayVNC, a VNC server for wlroots-based Wayland compositors. I launched a second Wayland session, to be able to have my main session undisturbed and present only a window from it.

Only that… VNC is slow and laggy, and sometimes awkward. So I kept searching for something better. And something better is, happily, what I was finally able to do!

In the laptop, I am using wf-recorder to grab an area of the screen and funnel it into a V4L2 loopback device (which allows it to be used as a camera, solving the main issue with grabbing parts of a Wayland screen):

/usr/bin/wf-recorder -g '0,32 960x540' -t --muxer=v4l2 --codec=rawvideo --pixelformat=yuv420p --file=/dev/video10

(yes, my V4L2Loopback device is set to /dev/video10). You will note I’m grabbing a 960×540 rectangle, which is the top ¼ of my screen (1920x1080) minus the Waybar. I think I’ll increase it to 960×720, as the projector to which I connect the Raspberry has a 4×3 output.
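In case it helps anyone reproducing this: the loopback device itself comes from the v4l2loopback kernel module, and its video_nr parameter is what pins the node to /dev/video10. An invocation along these lines does the trick (the card_label is arbitrary; adjust to taste):

# Create /dev/video10 as a virtual camera that wf-recorder can write into.
sudo modprobe v4l2loopback video_nr=10 card_label="screen-cast"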

After this is sent to /dev/video10, I tell ffmpeg to send it via RTP to the fixed address of the Raspberry:

/usr/bin/ffmpeg -i /dev/video10 -an -f rtp -sdp_file /tmp/video.sdp rtp://10.0.0.100:7000/

Yes, some uglier things happen here. You will note /tmp/video.sdp is created in the laptop itself; this file describes the stream’s metadata so it can be used from the client side. I cheated and copied it over to the Raspberry, doing an ugly hardcode along the way:

user@raspi:~ $ cat video.sdp
v=0
o=- 0 0 IN IP4 127.0.0.1
s=No Name
c=IN IP4 10.0.0.100
t=0 0
a=tool:libavformat 58.76.100
m=video 7000 RTP/AVP 96
b=AS:200
a=rtpmap:96 MP4V-ES/90000
a=fmtp:96 profile-level-id=1

People familiar with RTP will scold me: How come I’m streaming to the unicast client address? I should do it to an address in the 224.0.0.0–239.0.0.0 range. And it worked, sometimes. I switched over to 10.0.0.100 because it works, basically always ☺

Finally, upon bootup, I have configured NoDM to start a session with the user user, and dropped the following in my user’s .xsession:

setterm -blank 0 -powersave off -powerdown 0
xset s off
xset -dpms
xset s noblank

mplayer -msglevel all=1 -fs /home/usuario/video.sdp

Anyway, as a result, my students are able to much better follow the pace of my presentation, and I’m able to do some tricks better (particularly when it requires quick reaction times, as often happens when dealing with concurrency and such issues).

Oh, and of course — in case it’s of interest to anybody, knowing that SD cards are anything but reliable in the long run, I wrote a vmdb2 recipe to build the images. You can grab it here; it requires some local files to be present to be built — some are the ones I copied over above, and the other ones are surely of no interest to you (such as my public ssh key or such :-] )

What am I still missing? (read: Can you help me with some ideas? 😉)

  • I’d prefer having Ethernet-over-USB. I have the USB-C Ethernet adapter, which powers the RPi and provides a physical link, but I’m sure I could do away with the fugly cable wrapped around the machine…
  • Of course, if that happens, I would switch to a much sexier Zero RPi. I have to check whether the video codec is light enough for a plain ol’ Zero (armel) or I have to use the much more powerful Zero 2… I prefer sticking to the lowest possible hardware!
  • Naturally… The best would be to just be able to connect my USB-C-to-{HDMI,VGA} adapter, that has been sitting idly… 😕 One day, I guess…

Of course, this is a blog post published to brag about my stuff, but also to serve me as persistent memory in case I need to recreate this…

03 May, 2022 04:16PM

May 01, 2022

hackergotchi for Thomas Koch

Thomas Koch

Missing memegen

Posted on May 1, 2022

Back at $COMPANY we had an internal meme-site. I had some reputation in my team for creating good memes. When I watched Episode 3 of Season 2 of Yes, Prime Minister yesterday, I really missed a place to post memes.

This is the full scene. Please watch it or even the full episode before scrolling down to the GIFs. I had a good laugh for some time.

With Debian, I could just download the episode from somewhere on the net with youtube-dl and easily create two GIFs using ffmpeg, with and without subtitle:

ffmpeg  -ss 0:5:59.600 -to 0:6:11.150 -i Downloads/Yes.Prime.Minister.S02E03-1254485068289.mp4 tmp/tragic.gif

ffmpeg  -ss 0:5:59.600 -to 0:6:11.150 -i Downloads/Yes.Prime.Minister.S02E03-1254485068289.mp4 \
        -vf "subtitles=tmp/sub.srt:force_style='Fontsize=60'" tmp/tragic_with_subtitle.gif

And this sub.srt file:

1
00:00:10,000 --> 00:00:12,000
Tragic.

I believe one needs to install the libavfilter-extra variant to burn the subtitle into the GIF.

Some

space

to

hide

the

GIFs.

The Prime Minister just learned that his predecessor, who was about to publish embarrassing memoirs, died of a sudden heart attack:

I can’t actually think of a meme with this GIF that the internal thought police community moderation would not immediately take down.

For a moment I thought that it would be fun to have a Meme-Site for Debian members. But it is probably not the right time for this.

Maybe somebody likes the above GIFs though and wants to use them somewhere.

01 May, 2022 06:17PM

lsp-java coming to debian

Posted on March 12, 2022
Tags: debian

The Language Server Protocol (LSP) standardizes communication between editors and so-called language servers for different programming languages. This reduces the old problem that every editor had to implement many different plugins for all different programming languages. With LSP an editor just needs to talk LSP and can immediately provide typical IDE features.

I already packaged the Emacs packages lsp-mode and lsp-haskell for Debian bullseye. Now lsp-java is waiting in the NEW queue.

I’m always worried about downloading and executing binaries from random places of the internet. It should be a matter of hygiene to only run binaries from official Debian repositories. Unfortunately this is not feasible when programming and many people don’t see a problem with running multiple curl-sh pipes to set up their programming environment.

I prefer to do such stuff only in virtual machines. With Emacs and LSP I can finally have a lightweight textmode programming environment even for Java.

Unfortunately the lsp-java mode does not yet work over tramp. Once this is solved, I could run emacs on my host and only isolate the code and language server inside the VM.

The next step would be to also keep the code on the host and mount it with Virtio FS in the VM. But so far the necessary daemon is not yet in Debian (RFP: #1007152).

In detail, I uploaded these packages:

01 May, 2022 06:17PM

Waiting for a STATE folder in the XDG basedir spec

Posted on February 18, 2014

The XDG Base Directory specification proposes default homedir folders for the categories DATA (~/.local/share), CONFIG (~/.config) and CACHE (~/.cache). One category, however, is missing: STATE. This category has been requested several times, but nothing has happened.

Examples for state data are:

  • history files of shells, repls, anything that uses libreadline
  • logfiles
  • state of application windows on exit
  • recently opened files
  • last time application was run
  • emacs: bookmarks, ido last directories, backups, auto-save files, auto-save-list

The missing STATE category is especially annoying if you’re managing your dotfiles with a VCS (e.g. via VCSH) and you care to keep your homedir tidy.

If you’re as annoyed as me about the missing STATE category, please voice your opinion on the XDG mailing list.

Of course it’s a very long way until applications really use such a STATE directory. But without a common standard it will never happen.

01 May, 2022 06:17PM

shared infrastructure coop

Posted on February 5, 2014

I’m working in a very small web agency with 4 employees, one of them part time and our boss who doesn’t do programming. It shouldn’t come as a surprise, that our development infrastructure is not perfect. We have many ideas and dreams how we could improve it, but not the time. Now we have two obvious choices: Either we just do nothing or we buy services from specialized vendors like github, atlassian, travis-ci, heroku, google and others.

Doing nothing does not work for me. But just buying all this stuff doesn’t please me either. We’d depend on proprietary software, lock-in effects or one-size-fits-all offerings. Another option would be to find other small web shops like us, form a cooperative and share essential services. There are thousands of web shops in the same situation like us and we all need the same things:

  • public and private Git hosting
  • continuous integration (Jenkins)
  • code review (Gerrit)
  • file sharing (e.g. git-annex + webdav)
  • wiki
  • issue tracking
  • virtual windows systems for Internet Explorer testing
  • MySQL / Postgres databases
  • PaaS for PHP, Python, Ruby, Java
  • staging environment
  • Mails, Mailing Lists
  • simple calendar, CRM
  • monitoring

As I said, all of the above is available as commercial offerings. But I’d prefer the following to be satisfied:

  • The infrastructure itself should be open (but not free of charge), like the OpenStack Project Infrastructure as presented at LCA. I especially like how they review their puppet config with Gerrit.

  • The process to become an admin for the infrastructure should work much the same like the process to become a Debian Developer. I’d also like the same attitude towards quality as present in Debian.

Does something like that already exist? There already is the German cooperative hostsharing, which is kind of similar but mainly provides hosting, not services. But I'll ask them next, after writing this blog post.

Is your company interested in joining such an effort? Does it sound silly?

Comments:

Sounds promising. I already answered by mail. Dirk Deimeke (Homepage: http://d5e.org) on 16.02.2014 08:16

I’m sorry for accidentally removing a comment that linked to https://mayfirst.org while moderating comments. I’m really looking forward to another blogging engine… Thomas Koch on 16.02.2014 12:20

Why? What are you missing? I have been using s9y for 9 years now. Dirk Deimeke (Homepage) on 16.02.2014 12:57

01 May, 2022 06:17PM

Paul Wise

FLOSS Activities April 2022

Focus

This month I didn't have any particular focus. I just worked on issues in my info bubble.

Changes

Issues

Review

  • Spam: reported 33 Debian mailing list posts
  • Debian wiki: RecentChanges for the month
  • Debian BTS usertags: changes for the month
  • Debian screenshots:

Administration

  • Debian wiki: unblock IP addresses, approve accounts

Communication

Sponsors

The libpst, gensim, SPTAG work was sponsored. All other work was done on a volunteer basis.

01 May, 2022 12:26AM

April 30, 2022

hackergotchi for Junichi Uekawa

Junichi Uekawa

Already May.

Already May. I've been writing some code in rust and a bit of javascript. But real life is too busy.

30 April, 2022 11:57PM by Junichi Uekawa

April 29, 2022

hackergotchi for Jonathan Dowland

Jonathan Dowland

hyperlinked PDF planner

The Year page

A day page

I've been having reasonable success with time blocking, a technique I learned from Cal Newport's writings, in particular Deep Work. I'd been doing it on paper for a while, but I wanted to try and move to a digital solution.

There's a cottage industry of people making (and selling) various types of diary and planner as PDF files for use on tablets such as the Remarkable. Some of these use PDF hyperlinks to greatly improve navigating around. This one from Clou Media is particularly good, but I found that I wanted something slightly different from what I could find out there, so I decided to build my own.

I explored a couple of different approaches for how to do this. One was LaTeX, and here's one example of a LaTeX-based planner, but I decided against it, as I already spend too much time wrestling with LaTeX for my PhD work.

Another approach might have been Pandoc, but as far as I could tell its PDF pipeline went via LaTeX, so I thought I might as well cut out the middleman.

Eventually I stumbled across tools to build PDFs from HTML, via "CSS Paged Media". This appealed, because I've done plenty of HTML generation. print-css.rocks is a fantastic resource to explore the print-specific CSS features. Weasyprint is a fantastic open source tool to convert appropriately-written HTML/CSS into PDF.
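The conversion step itself is pleasantly boring; something along these lines (the file names here are just placeholders) renders the paged-media CSS into a PDF:

# Render the HTML/CSS (including @page rules) into a PDF.
weasyprint planner.html planner.pdf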

Finally I wanted to use a templating system to take shortcuts on writing HTML. I settled for embedded Ruby, which is something I haven't touched in over a decade. This was a relatively simple project and I found it surprisingly fun.

The results are available on GitHub: https://github.com/jmtd/planner. Right now, you get exactly what I have described. But my next plan is to add support for re-generating a planner, incorporating new information: pulling diary info from iCal, and any annotations made (such as with the Remarkable tablet) on top of the last generation and preserving them on the next.

29 April, 2022 09:29PM

hackergotchi for Steinar H. Gunderson

Steinar H. Gunderson

Should we stop teaching the normal distribution?

I guess Betteridge's law of headlines gives you the answer, but bear with me. :-)

Like most engineers, I am a layperson in statistics; I had some in high school, then an intro course in university and then used it in a couple of random courses (like speech recognition). (I also took a multivariate statistics course on my own after I had graduated.) But pretty much every practical tool I ever learned was, eventually, centered around the normal distribution; we learned about Student's t-test in various scenarios, made confidence intervals, learned about the central limit theorem that showed its special place in statistics, how the binomial distribution converges to the normal distribution under reasonable circumstances (not the least due to the CLT), and so on.

But then I got out in the wild and started trying to make sense out of the troves of data coming my way (including some stemming from experiments I designed on my own). And it turns out… a lot of things really are not normal. I'd see distributions with heavy tails, with skew, or that were bimodal. And here's the thing—people, who had the same kind of non-statistics-specialized education as me, continued to treat these as Gaussian. And it still appears to work. You get the beautiful confidence intervals and low p-values that seem to make sense… it's just so odd that you get “p<0.05 significant“ tests way too often from random noise. You just assume that's how it is, without really realizing that you're doing junk statistics. And even if you do, you don't have the tools to do anything about it, because everything else is hidden away in obscure R libraries or somewhere on Math Stack Exchange.

So I ask: If we're really going to teach people one thing, is the normal distribution really the best tool? (Yes, sure, we learned about the Poisson and Weibull and many others, but we never really did hypothesis testing on them, and we never really learned what to do when things didn't follow a tidy mathematical formula. Or even how to identify that.) It's beautiful and simple (“simple”) and mathematical and you only need a huge table and then you can almost do calculations by hand, but perhaps that's not really what we want? I understand we want to teach fundamental understanding and not just “use this computer tool”, but again, we're sending people out with a really limited tool set to make sense of the world.

I don't know what we should do instead—again, I am a layperson, and my understanding of this is limited. But it feels like we should be able to come up with fairly simple techniques that don't break down fatally if the data doesn't follow one given distribution, no matter how important. Bootstrap? Wilcoxon signed-rank test? I know, of course, that if the data really is normal, you will need a lot less data for the same-quality result (and some natural processes, like, I guess, radioactive decay, surely follow normal distributions), but perhaps we should leave the Gaussians and other parametric tools for the advanced courses? I don't know. But it's worth a thought. And I need to learn more statistics.

29 April, 2022 06:59PM

hackergotchi for Holger Levsen

Holger Levsen

20220429-Debian-Reunion-Hamburg-2022

Debian Reunion Hamburg 2022 from May 23 to 30

This is just a quick reminder for the Debian Reunion Hamburg 2022 happening in a bit more than 3 weeks.

So far 43 people have registered, and thus there is still some on-site accommodation available. There's no real deadline for registration; however, if you register after May 1st you might not get a t-shirt in your preferred size.

Also: if you intend to give a presentation but haven't replied to the CfP, please do so.

The wiki page linked above has all the details.

29 April, 2022 12:34PM

Russ Allbery

Review: Interesting Times

Review: Interesting Times, by Terry Pratchett

Series: Discworld #17
Publisher: Harper
Copyright: 1994
Printing: February 2014
ISBN: 0-06-227629-8
Format: Mass market
Pages: 399

Interesting Times is the seventeenth Discworld novel and certainly not the place to start. At the least, you will probably want to read The Colour of Magic and The Light Fantastic before this book, since it's a sequel to those (although Rincewind has had some intervening adventures).

Lord Vetinari has received a message from the Counterweight Continent, the first in ten years, cryptically demanding the Great Wizzard be sent immediately.

The Agatean Empire is one of the most powerful states on the Disc. Thankfully for everyone else, it normally suits its rulers to believe that the lands outside their walls are inhabited only by ghosts. No one is inclined to try to change their minds or otherwise draw their attention. Accordingly, the Great Wizard must be sent, a task that Vetinari efficiently delegates to the Archchancellor. There is only the small matter of determining who the Great Wizzard is, and why it was spelled with two z's.

Discworld readers with a better memory than I will recall Rincewind's hat. Why the Counterweight Continent would be demanding a wizard notorious for his near-total inability to perform magic is a puzzle for other people. Rincewind is promptly located by a magical computer, and nearly as promptly transported across the Disc, swapping him for an unnecessarily exciting object of roughly equivalent mass and hurling him into an unexpected rescue of Cohen the Barbarian. Rincewind predictably reacts by running away, although not fast or far enough to keep him from being entangled in a glorious popular uprising. Or, well, something that has aspirations of being glorious, and popular, and an uprising.

I hate to say this, because Pratchett is an ethically thoughtful writer to whom I am willing to give the benefit of many doubts, but this book was kind of racist.

The Agatean Empire is modeled after China, and the Rincewind books tend to be the broadest and most obvious parodies, so that was already a recipe for some trouble. Some of the social parody is not too objectionable, albeit not my thing. I find ethnic stereotypes and making fun of funny-sounding names in other languages (like a city named Hunghung) to be in poor taste, but Pratchett makes fun of everyone's names and cultures rather equally. (Also, I admit that some of the water buffalo jokes, despite the stereotypes, were pretty good.) If it had stopped there, it would have prompted some eye-rolling but not much comment.

Unfortunately, a significant portion of the plot depends on the idea that the population of the Agatean Empire has been so brainwashed into obedience that they have a hard time even imagining resistance, and even their revolutionaries are so polite that the best they can manage for slogans are things like "Timely Demise to All Enemies!" What they need are a bunch of outsiders, such as Rincewind or Cohen and his gang. More details would be spoilers, but there are several deliberate uses of Ankh-Morpork as a revolutionary inspiration and a great deal of narrative hand-wringing over how awful it is to so completely convince people they are slaves that you don't need chains.

There is a depressingly tedious tendency of western writers, even otherwise thoughtful and well-meaning ones like Pratchett, to adopt a simplistic ranking of political systems on a crude measure of freedom. That analysis immediately encounters the problem that lots of people who live within systems that rate poorly on this one-dimensional scale seem inadequately upset about circumstances that are "obviously" horrific oppression. This should raise questions about the validity of the assumptions, but those assumptions are so unquestionable that the writer instead decides the people who are insufficiently upset about their lack of freedom must be defective. The more racist writers attribute that defectiveness to racial characteristics. The less racist writers, like Pratchett, attribute that defectiveness to brainwashing and systemic evil, which is not quite as bad as overt racism but still rests on a foundation of smug cultural superiority.

Krister Stendahl, a bishop of the Church of Sweden, coined three famous rules for understanding other religions:

  1. When you are trying to understand another religion, you should ask the adherents of that religion and not its enemies.
  2. Don't compare your best to their worst.
  3. Leave room for "holy envy."

This is excellent advice that should also be applied to politics. Most systems exist for some reason. The differences from your preferred system are easy to see, particularly those that strike you as horrible. But often there are countervailing advantages that are less obvious, and those are more psychologically difficult to understand and objectively analyze. You might find they have something that you wish your system had, which causes discomfort if you're convinced you have the best political system in the world, or are making yourself feel better about the abuses of your local politics by assuring yourself that at least you're better than those people.

I was particularly irritated to see this sort of simplistic stereotyping in Discworld given that Ankh-Morpork, the setting of most of the Discworld novels, is an authoritarian dictatorship. Vetinari quite capably maintains his hold on power, and yet this is not taken as a sign that the city's inhabitants have been brainwashed into considering themselves slaves. Instead, he's shown as adept at maintaining the stability of a precarious system with a lot of competing forces and a high potential for destructive chaos. Vetinari is an awful person, but he may be better than anyone who would replace him. Hmm.

This sort of complexity is permitted in the "local" city, but as soon as we end up in an analog of China, the rulers are evil, the system lacks any justification, and the peasants only don't revolt because they've been trained to believe they can't. Gah.

I was muttering about this all the way through Interesting Times, which is a shame because, outside of the ham-handed political plot, it has some great Pratchett moments. Rincewind's approach to any and all danger is a running (sorry) gag that keeps working, and Cohen and his gang of absurdly competent decrepit barbarians are both funnier here than they have been in any previous book and the rare highly-positive portrayal of old people in fantasy adventures who are not wizards or crones. Pretty Butterfly is a great character who deserved to be in a better plot. And I loved the trouble that Rincewind had with the Agatean tonal language, which is an excuse for Pratchett to write dialog full of frustrated non-sequiturs when Rincewind mispronounces a word.

I do have to grumble about the Luggage, though. From a world-building perspective its subplot makes sense, but the Luggage was always the best character in the Rincewind stories, and the way it lost all of its specialness here was oddly sad and depressing. Pratchett also failed to convince me of the drastic retcon of The Colour of Magic and The Light Fantastic that he does here (and which I can't talk about in detail due to spoilers), in part because it's entangled in the orientalism of the plot.

I'm not sure Pratchett could write a bad book, and I still enjoyed reading Interesting Times, but I don't think he gave the politics his normal care, attention, and thoughtful humanism. I hope later books in this part of the Disc add more nuance, and are less confident and judgmental. I can't really recommend this one, even though it has some merits.

Also, just for the record, "may you live in interesting times" is not a Chinese curse. It's an English saying that likely was attributed to China to make it sound exotic, which is the sort of landmine that good-natured parody of other people's cultures needs to be wary of.

Followed in publication order by Maskerade, and in Rincewind's personal timeline by The Last Continent.

Rating: 6 out of 10

29 April, 2022 02:50AM

April 28, 2022

hackergotchi for Jonathan McDowell

Jonathan McDowell

Resizing consoles automatically

I have 2 very useful shell scripts related to resizing consoles. The first is imaginatively called resize and just configures the terminal to be the requested size, neatly resizing an xterm or gnome-terminal:

#!/bin/sh

# resize <rows> <columns>
# CSI 8 ; <rows> ; <cols> t asks the terminal emulator to resize its text area
/bin/echo -e '\033[8;'$1';'$2't'
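
A quick usage sketch, assuming the script is saved as resize, marked executable and placed somewhere on your PATH; this asks the terminal for 50 rows by 132 columns:

# hypothetical invocation: arguments are rows first, then columns
resize 50 132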

The other is a bit more complicated and useful when connecting to a host via a serial console, or when driving a qemu VM with -display none -nographic and all output coming over a “serial console” on stdio. It figures out the size of the terminal it’s running in and correctly sets the local settings to match so you can take full advantage of a larger terminal than the default 80x24:

#!/bin/bash

# Save the cursor, then jump far past the bottom-right corner; the terminal
# clamps the move to its actual last row and column.
echo -ne '\e[s\e[5000;5000H'
# Ask where the cursor ended up (DSR, ESC[6n); the reply ESC[<row>;<col>R is
# split on '[' and ';' into the pos array.
IFS='[;' read -p $'\e[6n' -d R -a pos -rs
# Restore the saved cursor position.
echo -ne '\e[u'

# cols / rows
echo "Size: ${pos[2]} x ${pos[1]}"

stty cols "${pos[2]}" rows "${pos[1]}"

export TERM=xterm-256color

Generally I source this with . fix-term, otherwise the TERM export doesn’t get applied. Both of these exist in various places around the ‘net (and there’s a resize binary shipped along with xterm), but I always forget the exact search terms to find them again when I need them. So this post is mostly intended to serve as a future reference for the next time I don’t have them handy.
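
As a hypothetical session on a host reached over a serial console (the reported size is purely illustrative), assuming the script is saved as fix-term in the current directory:

$ . ./fix-term    # source it so the stty and TERM changes apply to this shell
Size: 211 x 53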

28 April, 2022 07:03PM

hackergotchi for Raphaël Hertzog

Raphaël Hertzog

Freexian’s report about Debian Long Term Support, March 2022

A Debian LTS logo

Every month we review the work funded by Freexian’s Debian LTS offering. Please find the report for March below.

Debian project funding

  • There was no new activity in Debian project funding in the two existing projects. However, there was a survey run with hundreds of Debian Developers and Debian contributors. The survey results are being collated and we will use the anonymized data to further develop the Freexian project funding initiative.
  • We are preparing to more broadly announce additional support for Debian 8 Jessie and Debian 9 Stretch. Debian 8 can now be supported until June 2025 and Debian 9 until June 2027. More information on ELTS support is available.
  • In March € 2250 was put aside to fund Debian projects.

Learn more about the rationale behind this initiative in this article.

Debian LTS contributors

In March, 11 contributors were paid to work on Debian LTS; their reports are available below. We welcome participation from the Debian community in the LTS or ELTS teams: simply get in touch with Jeremiah or Raphaël if you are interested in participating.

Evolution of the situation

In March we released 42 DLAs.

The security tracker currently lists 81 packages with a known CVE and the dla-needed.txt file has 52 packages needing an update.

We’re glad to welcome a few new sponsors such as Électricité de France (Gold sponsor), Telecats BV and Soliton Systems.

Thanks to our sponsors

Sponsors that joined recently are in bold.

28 April, 2022 10:47AM by Raphaël Hertzog

hackergotchi for Bits from Debian

Bits from Debian

DebConf22 bursary applications and call for papers are closing in less than 72 hours!

If you intend to apply for a DebConf22 bursary and/or submit an event proposal and have not yet done so, please proceed as soon as possible!

Bursary applications for DebConf22 will be accepted until May 1st at 23:59 UTC. Applications submitted after this deadline will not be considered.

You can apply for a bursary when you register for the conference.

Remember that giving a talk or organising an event is considered towards your bursary; if you have a submission to make, submit it even if it is only sketched-out. You will be able to detail it later. DebCamp plans can be entered in the usual Sprints page at the Debian wiki.

Please make sure to double-check your accommodation choices (dates and venue). Details about accommodation arrangements can be found on the accommodation page.

Event proposals will be accepted until May 1st at 23:59 UTC too.

Events are not limited to traditional presentations or informal sessions (BoFs): we welcome submissions of tutorials, performances, art installations, debates, or any other format of event that you think would be of interest to the Debian community.

Regular sessions may be either 20 or 45 minutes long (including time for questions); other kinds of sessions (workshops, demos, lightning talks, and so on) could have different durations. Please choose the most suitable duration for your event and explain any special requests. You can submit it here.

The 23rd edition of DebConf will take place from July 17th to 24th, 2022, at the Innovation and Training Park (ITP) in Prizren, Kosovo, and will be preceded by DebCamp, from July 10th to 16th.

See you in Prizren!

DebConf22 banner open registration

28 April, 2022 07:30AM by The Debian Publicity Team

hackergotchi for Louis-Philippe Véronneau

Louis-Philippe Véronneau

Montreal's Debian & Stuff - April 2022

After two long years of COVID hiatus, local Debian events in Montreal are back! Last Sunday, nine of us met at Koumbit to work on Debian (and other stuff!), chat and socialise.

Even though these events aren't always the most productive, it was super fun and definitely helps keep me motivated to work on Debian in my spare time.

Many thanks to Debian for providing us a budget to rent the venue for the day and for the pizzas! Here are a few pictures I took during the event:

Pizza boxes on a wooden bench

Whiteboard listing TODO items for some of the participants

A table with a bunch of laptops, and LeLutin :)

If everything goes according to plan, our next meeting should be sometime in June. If you are interested, the best way to stay in touch is either to subscribe to our mailing list or to join our IRC channel (#debian-quebec on OFTC). Events are also posted on Quebec's Agenda du libre.

28 April, 2022 04:00AM by Louis-Philippe Véronneau