July 12, 2025

Christian Kastner

Easy dynamic dispatch using GLIBC Hardware Capabilities

TL;DR With GLIBC 2.33+, you can build a shared library multiple times targeting various optimization levels, and the dynamic linker/loader will pick the highest version supported by the current CPU. For example, with the layout below, on a Ryzen 9 5900X, x86-64-v3/libfoo0.so would be loaded:

/usr/lib/glibc-hwcaps/x86-64-v4/libfoo0.so
/usr/lib/glibc-hwcaps/x86-64-v3/libfoo0.so
/usr/lib/glibc-hwcaps/x86-64-v2/libfoo0.so
/usr/lib/libfoo0.so

Longer Version

GLIBC Hardware Capabilities or "hwcaps" are an easy, almost trivial way to add a simple form of dynamic dispatch to any amd64 or POWER build, provided that either the build target or the compiler's optimizations can make use of certain CPU extensions.

Mo Zhou pointed me towards this when I was faced with the challenge of creating a performant Debian package for ggml, the tensor library behind llama.cpp and whisper.cpp.

The Challenge

A performant yet universally loadable library needs to make use of some form of dynamic dispatch to leverage the most effective SIMD extensions available on any given CPU it may run on. Last January, when I first started with the packaging of ggml for Debian, ggml did have support for this through its GGML_CPU_ALL_VARIANTS=ON option, but this was limited to amd64. This meant that on all the other architectures that Debian supports, I would need to target some ancient baseline, thus effectively crippling the package there.

Dynamic Dispatch using hwcaps

hwcaps were introduced in GLIBC 2.33 and replace the (now) Legacy Hardware Capabilities, which were removed in 2.37. The way hwcaps work is delightfully simple: the dynamic linker/loader will look for a shared library not just in the standard library paths, but also in subdirectories thereof of the form glibc-hwcaps/<level>, starting with the highest <level> that the current CPU supports. The levels are predefined; I'm using the amd64 levels below.

For ggml, this meant that I simply could build the library in multiple passes, each time targeting a different <level>, and install the result in the corresponding subdirectory, which resulted in the following layout (reduced to libggml.so for brevity):

/usr/lib/x86_64-linux-gnu/ggml/glibc-hwcaps/x86-64-v4/libggml.so
/usr/lib/x86_64-linux-gnu/ggml/glibc-hwcaps/x86-64-v3/libggml.so
/usr/lib/x86_64-linux-gnu/ggml/glibc-hwcaps/x86-64-v2/libggml.so
/usr/lib/x86_64-linux-gnu/ggml/libggml.so
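To make the multiple passes concrete: a build loop along the following lines would produce that layout. This is a sketch only; the CMake flags and install paths here are illustrative, not the actual Debian packaging rules.

# one pass per hwcaps level, each targeting a different psABI baseline
for level in x86-64-v2 x86-64-v3 x86-64-v4; do
    cmake -B "build-$level" -DCMAKE_C_FLAGS="-march=$level" \
        -DCMAKE_CXX_FLAGS="-march=$level"
    cmake --build "build-$level"
    install -Dm644 "build-$level/libggml.so" \
        "debian/tmp/usr/lib/x86_64-linux-gnu/ggml/glibc-hwcaps/$level/libggml.so"
done
# the baseline (x86-64-v1) pass installs to .../ggml/libggml.so itself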

In practice, this means that on a CPU supporting AVX512, the linker/loader would load x86-64-v4/libggml.so if it existed, and otherwise continue to look for the other levels, all the way down to the lowest one. On a CPU which supported only SSE4.2, the lookup process would be the same, ending with picking x86-64-v2/libggml.so. With QEMU, all of this was quickly verified.
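You can also ask the dynamic loader directly which levels it would search on a given machine; on a CPU supporting up to x86-64-v3, the output looks roughly like this (abbreviated, and the exact wording varies between glibc versions):

$ /lib64/ld-linux-x86-64.so.2 --help | grep -A4 glibc-hwcaps
Subdirectories of glibc-hwcaps directories, in priority order:
  x86-64-v4
  x86-64-v3 (supported, searched)
  x86-64-v2 (supported, searched)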

Note that the lowest-level library, targeting x86-64-v1, is not installed to a subdirectory, but to the path where the library would normally have been installed. This has the nice property that on systems not using GLIBC, and thus not having hwcaps available, package installation will still result in a loadable library, albeit the version with the worst performance. And a careful observer might have noticed that in the example above, the library is installed to a private ggml/ directory, so this mechanism also works when using RUNPATH or LD_LIBRARY_PATH.

Debian's ggml package will soon switch to GGML_CPU_ALL_VARIANTS=ON, but hwcaps was still quite the useful feature to discover.

12 July, 2025 06:20PM by Christian Kastner

Reproducible Builds

Reproducible Builds in June 2025

Welcome to the 6th report from the Reproducible Builds project in 2025. Our monthly reports outline what we’ve been up to over the past month, and highlight items of news from elsewhere in the increasingly-important area of software supply-chain security. If you are interested in contributing to the Reproducible Builds project, please see the Contribute page on our website.

In this report:

  1. Reproducible Builds at FOSSY 2025
  2. Distribution work
  3. diffoscope
  4. OSS Rebuild updates
  5. Website updates
  6. Upstream patches
  7. Reproducibility testing framework

Reproducible Builds at FOSSY 2025

On Saturday 2nd August, Vagrant Cascadian and Chris Lamb will be presenting at this year’s FOSSY 2025. Their talk, titled Never Mind the Checkboxes, Here’s Reproducible Builds!, is being introduced as follows:

There are numerous policy compliance and regulatory processes being developed that target software development… but do they solve actual problems? Does it improve the quality of software? Do Software Bill of Materials (SBOMs) actually give you the information necessary to verify how a given software artifact was built? What is the goal of all these compliance checklists anyways… or more importantly, what should the goals be? If a software object is signed, who should be trusted to sign it, and can they be trusted … forever?

The talk will introduce the audience to Reproducible Builds as a set of best practices which allow users and developers to verify that software artifacts were built from the source code, but which also allow auditing for license compliance, provide security benefits, and remove the need to trust arbitrary software vendors.

Hosted by the Software Freedom Conservancy and taking place in Portland, Oregon, USA, FOSSY aims to be a community-focused event: “Whether you are a long time contributing member of a free software project, a recent graduate of a coding bootcamp or university, or just have an interest in the possibilities that free and open source software bring, FOSSY will have something for you”. More information on the event is available on the FOSSY 2025 website, including the full programme schedule.

Vagrant and Chris will also be staffing a table this year, where they will be available to answer any questions about Reproducible Builds and discuss collaborations with other projects.



Distribution work

In Debian this month:

  • Holger Levsen has discovered that it is now possible to bootstrap a minimal Debian trixie using 100% reproducible packages. This result can itself be reproduced, using the debian-repro-status tool and mmdebstrap’s support for hooks:

      $ mmdebstrap --variant=apt --include=debian-repro-status \
           --chrooted-customize-hook=debian-repro-status \
           trixie /dev/null 2>&1 | grep "Your system has"
       INFO  debian-repro-status > Your system has 100.00% been reproduced.
    
  • On our mailing list this month, Helmut Grohne wrote an extensive message raising an issue related to Uploads with conflicting buildinfo filenames:

    Having several .buildinfo files for the same architecture is something that we plausibly want to have eventually. Imagine running two sets of buildds and assembling a single upload containing buildinfo files from both buildds in the same upload. In a similar vein, as a developer I may want to supply several .buildinfo files with my source upload (e.g. for multiple architectures). Doing any of this is incompatible with current incoming processing and with reprepro.

  • 5 reviews of Debian packages were added, 4 were updated and 8 were removed this month, adding to our ever-growing knowledge about identified issues.


In GNU Guix, Timothee Mathieu reported that a long-standing issue with reproducibility of shell containers across different host operating systems has been solved. In their message, Timothee mentions:

I discovered that pytorch (and maybe other dependencies) has a reproducibility problem of order 1e-5 when on AVX512 compared to AVX2. I first tried to solve the problem by disabling AVX512 at the level of pytorch, but it did not work. The dev of pytorch said that it may be because some components dispatch computation to MKL-DNN, I tried to disable AVX512 on MKL, and still the results were not reproducible, I also tried to deactivate in openmpi without success. I finally concluded that there was a problem with AVX512 somewhere in the dependencies graph but I gave up identifying where, as this seems very complicated.


The IzzyOnDroid Android APK repository made more progress in June. Not only have they just passed 48% reproducibility coverage, but Ben also started making their reproducible builds more visible by offering rbtlog shields, a kind of badge that has been quickly picked up by many developers who are proud to present their applications’ reproducibility status.


Lastly, in openSUSE news, Bernhard M. Wiedemann posted another monthly update for their work there.


diffoscope

diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made the following changes, including preparing and uploading versions 298, 299 and 300 to Debian:

  • Add python3-defusedxml to the Build-Depends in order to include it in the Docker image. []
  • Handle the RPM format’s HEADERSIGNATURES and HEADERIMMUTABLE as a special-case to avoid unnecessarily large diffs. Thanks to Daniel Duan for the report and suggestion. [][]
  • Update copyright years. []

In addition, @puer-robustus fixed a regression introduced in an earlier commit which resulted in some differences being lost. [][]

Lastly, Vagrant Cascadian updated diffoscope in GNU Guix to version 299 [][] and 300 [][].


OSS Rebuild updates

OSS Rebuild has added a new network analyzer that provides transparent HTTP(S) interception during builds, capturing all network traffic to monitor external dependencies and identify suspicious behavior, even in unmodified maintainer-controlled build processes.

The text-based user interface now features automated failure clustering that can group similar rebuild failures and provides natural language failure summaries, making it easier to identify and understand patterns across large numbers of build failures.

OSS Rebuild has also improved the local development experience with a unified interface for build execution strategies, allowing for more extensible environment setup for build execution. The team also designed a new website and logo.


Website updates

Once again, there were a number of improvements made to our website this month including:



Upstream patches

The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:


Reproducibility testing framework

The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In June, a number of changes were made by Holger Levsen, including:


  • reproduce.debian.net-related:

    • Installed and deployed rebuilderd version 0.24 from Debian unstable in order to make use of the new compression feature added by Jarl Gullberg for the database. This resulted in a massive decrease in the size of the SQLite databases:

      • 79G → 2.8G (all)
      • 84G → 3.2G (amd64)
      • 75G → 2.9G (arm64)
      • 45G → 2.1G (armel)
      • 48G → 2.2G (armhf)
      • 73G → 2.8G (i386)
      • 72G → 2.7G (ppc64el)
      • 45G → 2.1G (riscv64)

      … for a combined saving from 521G → 20.8G. This naturally reduces the requirements to run an independent rebuilderd instance and will permit us to add more Debian suites as well.

    • During migration to the latest version of rebuilderd, make sure several services are not started. []
    • Actually run rebuilderd from /usr/bin. []
    • Raise temperatures for NVME devices on some riscv64 nodes that should be ignored. [][]
    • Use a 64KB kernel page size on the ppc64el architecture (see #1106757). []
    • Improve ordering of some “failed to reproduce” statistics. []
    • Detect a number of potential causes of build failures within the statistics. [][]
    • Add support for manually scheduling builds for the ‘any’ architecture. []
  • Misc:

    • Update the Codethink nodes as there are now many kernels installed. [][]
    • Install linux-sysctl-defaults on Debian trixie systems as we need ping functionality. []
    • Limit the fs.nr_open kernel tunable. []
    • Stop submitting results to deprecated buildinfo.debian.net service. [][]

In addition, Jochen Sprickerhof greatly improved the statistics and the logging functionality, including adapting to the new database format of rebuilderd version 0.24.0 [] and temporarily increasing the maximum log size in order to debug a nettlesome build []. Jochen also dropped the CPUSchedulingPolicy=idle systemd flag on the workers. []



Finally, if you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. You can also get in touch with us via:

12 July, 2025 04:08PM

hackergotchi for Louis-Philippe Véronneau

Louis-Philippe Véronneau

Adulting

In the past few weeks, I have done something I had been meaning to do for a while but always pushed to the bottom of my TODO pile: prepare for my death.

I am still quite young and perfectly healthy (mentally and physically) and I do plan to live a long and full life, but death is something that comes for us all and can strike anytime. Having witnessed friends and colleagues lose loved ones who did not prepare adequately for their passing, dealing with all this legal stuff ahead of time seems like the best gift you can leave them.

Writing my will was the easiest part of this "preparation for death" process. I have few material possessions and I'm leaving everything to my SO. As for the copyright for my code, I have decided everything I wrote will be licensed under CC0 (public domain) when I die. Quebec — where I live — also accepts holograph wills, which means I didn't have to hire a notary.

Apart from the will, I also wrote a protection mandate1, filled out Quebec's organ donation form2, took a contract for prearranged funeral services3 and finally, wrote a disaster recovery plan.

This recovery plan was by far the longest and most tedious part of this entire ordeal. If all your machines use full-disk encryption and you die or forget your passwords (for example after a head injury), can your data be recovered? How do you arbitrate between easy recovery and data security? If all your local devices burn down and you also pass away in the process, how will your next of kin access your remote backups and extract the relevant data (in my case, my password manager)?

I had to ask myself many complex questions in this process and although I won't be sharing my disaster recovery plan here (security through obscurity), I urge you to take the time to do something similar yourself and make sure you will leave a house in order when you go away.


  1. in case I become incapacitated and can't make choices by myself anymore. 

  2. it's sadly still opt-in here... 

  3. you pay now for the services you want, the money is kept in a trust in your name and you can't be charged extra when you do pass away. This protects you from inflation and is a great way to make sure your next of kin don't have to deal with the complexities of funeral services while grieving. 

12 July, 2025 02:45AM by Louis-Philippe Véronneau

July 11, 2025

Jamie McClelland

Avoiding Apache Max Request Workers Errors

Wow, I hate this error:

AH00484: server reached MaxRequestWorkers setting, consider raising the MaxRequestWorkers setting

For starters, it means I have to relearn how MaxRequestWorkers functions in Apache:

For threaded and hybrid servers (e.g. event or worker), MaxRequestWorkers restricts the total number of threads that will be available to serve clients. For hybrid MPMs, the default value is 16 (ServerLimit) multiplied by the value of 25 (ThreadsPerChild). Therefore, to increase MaxRequestWorkers to a value that requires more than 16 processes, you must also raise ServerLimit.

Ok… remind me what ServerLimit refers to?

For the prefork MPM, this directive sets the maximum configured value for MaxRequestWorkers for the lifetime of the Apache httpd process. For the worker and event MPMs, this directive in combination with ThreadLimit sets the maximum configured value for MaxRequestWorkers for the lifetime of the Apache httpd process. For the event MPM, this directive also defines how many old server processes may keep running and finish processing open connections. Any attempts to change this directive during a restart will be ignored, but MaxRequestWorkers can be modified during a restart.

Special care must be taken when using this directive. If ServerLimit is set to a value much higher than necessary, extra, unused shared memory will be allocated. If both ServerLimit and MaxRequestWorkers are set to values higher than the system can handle, Apache httpd may not start or the system may become unstable.

With the prefork MPM, use this directive only if you need to set MaxRequestWorkers higher than 256 (default). Do not set the value of this directive any higher than what you might want to set MaxRequestWorkers to.

With worker, use this directive only if your MaxRequestWorkers and ThreadsPerChild settings require more than 16 server processes (default). Do not set the value of this directive any higher than the number of server processes required by what you may want for MaxRequestWorkers and ThreadsPerChild.

With event, increase this directive if the process number defined by your MaxRequestWorkers and ThreadsPerChild settings, plus the number of gracefully shutting down processes, is more than 16 server processes (default).

Got it? In other words, you can “consider” raising the MaxRequestWorkers setting all you want, but you can’t just change that setting. You have to read about several other complicated settings, do some math (with the event MPM defaults, that’s 16 ServerLimit x 25 ThreadsPerChild = 400 workers), and spend a lot of time wondering if you are going to remember what you just did and how to undo it if you blow up your server.

On the plus side, typically, nobody should increase this limit - because if the server runs out of connections, it usually means something else is wrong.

In our case, on a shared web server running Apache2 and PHP-FPM, it’s usually because a single web site has gone out of control.

But wait! How can that happen when we are using PHP-FPM’s max_children setting to prevent a single PHP web site from taking down the server?

After years of struggling with this problem I have finally made some headway.

Our PHP pool configuration typically looks like this:

user = site342999writer
group = site342999writer
listen = /run/php/8.1-site342999.sock
listen.owner = www-data
listen.group = www-data
pm = ondemand
pm.max_children = 12
pm.max_requests = 500
php_admin_value[memory_limit] = 256M

And we invoke PHP-FPM via this apache snippet:

<FilesMatch \.php$>
        SetHandler "proxy:unix:/var/run/php/8.1-site342999.sock|fcgi://localhost"
</FilesMatch>

With these settings in place, what happens when we use up all 12 max_children?

According to the docs:

By default, mod_proxy will allow and retain the maximum number of connections that could be used simultaneously by that web server child process. Use the max parameter to reduce the number from the default. The pool of connections is maintained per web server child process, and max and other settings are not coordinated among all child processes, except when only one child process is allowed by configuration or MPM design.

The max parameter seems to default to ThreadsPerChild, so the default here appears to allow any single web site to consume ThreadsPerChild (25) x ServerLimit (16) = 400 connections, which is also the maximum number of overall connections. Not great.

To make matters worse, there is another setting available which is mysteriously called acquire:

If set, this will be the maximum time to wait for a free connection in the connection pool, in milliseconds. If there are no free connections in the pool, the Apache httpd will return SERVER_BUSY status to the client.

By default this is not set, which seems to suggest that Apache will just hang on to connections forever until a free PHP process becomes available (or some other timeout happens).

So, let’s try something different:

<Proxy "fcgi://localhost">
    ProxySet acquire=1 max=12
</Proxy>

This snippet is the way you can configure the proxy we set up in the SetHandler statement above. It’s documented on the Apache mod_proxy page.

Now we limit the maximum pool size per process to half of what is available for the entire server, and we tell Apache to immediately throw a 503 error if we have exceeded our maximum number of connections.

Now, if a site is overwhelmed with traffic, instead of maxing out the available Apache connections while leaving users with constantly spinning browsers, the users will get 503’ed and the server will be able to serve other sites.
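A quick way to see the new behaviour (URL and numbers made up; adjust to taste) is to fire off more concurrent requests than the pool allows and count the status codes:

# 50 parallel requests against a PHP page; expect a mix of 200s and 503s
seq 1 50 | xargs -P 50 -I{} curl -s -o /dev/null \
    -w '%{http_code}\n' https://example.org/slow-page | sort | uniq -c

Once all 12 PHP children are busy, the excess requests should come back as 503s almost immediately instead of hanging.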

11 July, 2025 12:27PM

hackergotchi for David Bremner

David Bremner

Hibernate on the pocket reform 5/n

Context

A Kernel Patch

  • The following patch looks potentially relevant:

https://patchwork.kernel.org/project/linux-rockchip/patch/20250509-b4-pci_dwc_reset_support-v3-1-37e96b4692e7@wdc.com/

  • git clone https://github.com/torvalds/linux.git (Is there a better place? kernel.org is pretty opaque)

  • are the pre-reqs in mnt kernel? The patch header contains

    base-commit: 08733088b566b58283f0f12fb73f5db6a9a9de30
    change-id: 20250430-b4-pci_dwc_reset_support-d720dbafb7ea
    prerequisite-change-id: 20250404-pcie-reset-slot-730bfa71a202:v4
    prerequisite-patch-id: 2dad85eb26838d89569b12c19d70f392fa592667
    prerequisite-patch-id: 6238a682bd8e9476e5911b7a59263c3fc618d63e
    prerequisite-patch-id: 37cab00bc255a62b1e8396a48a3afba5e1751abd
    prerequisite-patch-id: ff711f65cf9926374646b76cd38bdd823d576764
    prerequisite-patch-id: 1654cca919d024b9a9190b28e90f722975c797e8
  • First check and see what is upstream. I had to remember how to use git-patch-id and also how to split a long regex disjunction into multiple lines.
git log --patch --no-merges v6.13.. | \
  git patch-id --stable | \
  grep -F -e 2dad85eb26838d89569b12c19d70f392fa592667 \
    -e 6238a682bd8e9476e5911b7a59263c3fc618d63e \
    -e 37cab00bc255a62b1e8396a48a3afba5e1751abd \
    -e ff711f65cf9926374646b76cd38bdd823d576764 \
    -e 1654cca919d024b9a9190b28e90f722975c797e8

yields

37cab00bc255a62b1e8396a48a3afba5e1751abd d1c696dba120624256ab335ab8247f535b872309
2dad85eb26838d89569b12c19d70f392fa592667 b06d125e6280603a34d9064cd9c12748ca2edb04

The two commits that are actually found are only in tag 'v6.16-rc1'.
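A convenient cross-check is git-describe, which names the first tag containing a commit:

git describe --contains d1c696dba120624256ab335ab8247f535b872309
git describe --contains b06d125e6280603a34d9064cd9c12748ca2edb04

Both should print names relative to v6.16-rc1 if that is indeed the first tag containing them.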

  • The discussion on LKML mentions pci/slot-reset. Where does that branch live?
git remote add pci https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git
git fetch pci
git for-each-ref refs/remotes/pci --format "%(refname)" | \
    while read branch
    do
        echo "checking $branch"
        git log --patch --no-merges --since 2025-01-01 $branch | \
            git patch-id --stable | \
            grep -F -e 2dad85eb26838d89569b12c19d70f392fa592667 \
                 -e 6238a682bd8e9476e5911b7a59263c3fc618d63e \
                 -e 37cab00bc255a62b1e8396a48a3afba5e1751abd \
                 -e ff711f65cf9926374646b76cd38bdd823d576764 \
                 -e 1654cca919d024b9a9190b28e90f722975c797e8
    done

This did not find any more commits, but I did learn how to use git-for-each-ref, so I guess not a total loss.

previous episode

11 July, 2025 10:29AM

Reproducible Builds (diffoscope)

diffoscope 301 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 301. This version includes the following changes:

[ Chris Lamb ]
* Avoid spurious differences in h5dump output caused by exposure of absolute
  internal extraction paths. (Closes: #1108690)
* Use our_check_output in the ODT comparator.
* Memoize a number of calls to --version. Thanks, Jade! (Closes: #412)
* Update copyright years.

You can find out more by visiting the project homepage.

11 July, 2025 12:00AM

July 10, 2025

hackergotchi for David Bremner

David Bremner

Hibernate on the pocket reform 4/n

Context

Log from (failed) platform test

After some fun I got the serial console working and re-ran the platform test.

After a bit of reading on the serial console, I realized that rmmod dwc3 was causing more problems than it solved, in particular a reliable hard lockup on one of the CPUs.

My revised test script is

set -x
# run hibernation in "platform" test mode: go through the motions but
# stop short of actually powering off
echo platform > /sys/power/pm_test
# have the hibernation image end in a reboot rather than a power-off
echo reboot > /sys/power/disk
sleep 2
# unload the USB wifi driver before hibernating
rmmod mt76x2u
sleep 2
# trigger hibernation
echo disk > /sys/power/state
sleep 2
# reload wifi after resume
modprobe mt76x2u

The current problem seems to be pcie not resuming properly.

[   65.306842] usbcore: deregistering interface driver mt76x2u
[   65.343606] wlx000a5205eb2d: deauthenticating from 20:05:b7:00:2d:89 by local choice (Reason: 3=DEAUTH_LEAVING)
[   67.995239] PM: hibernation: hibernation entry
[   68.048103] Filesystems sync: 0.022 seconds
[   68.049005] Freezing user space processes
[   68.051075] Freezing user space processes completed (elapsed 0.001 seconds)
[   68.051760] OOM killer disabled.
[   68.052597] PM: hibernation: Basic memory bitmaps created
[   68.053108] PM: hibernation: Preallocating image memory
[   69.719040] PM: hibernation: Allocated 366708 pages for snapshot
[   69.719650] PM: hibernation: Allocated 1466832 kbytes in 1.66 seconds (883.63 MB/s)
[   69.720370] Freezing remaining freezable tasks
[   69.723558] Freezing remaining freezable tasks completed (elapsed 0.002 seconds)
[   69.728002] rk_gmac-dwmac fe1b0000.ethernet end0: Link is Down
[   69.992324] rockchip-dw-pcie a40c00000.pcie: Failed to receive PME_TO_Ack
[   69.993405] PM: hibernation: debug: Waiting for 5 seconds.
[   76.059484] rockchip-dw-pcie a40c00000.pcie: Phy link never came up
[   76.060043] rockchip-dw-pcie a40c00000.pcie: fail to resume
[   76.060546] rockchip-dw-pcie a40c00000.pcie: PM: dpm_run_callback(): genpd_restore_noirq returns -110
[   76.061363] rockchip-dw-pcie a40c00000.pcie: PM: failed to restore noirq: error -110

previous episode|next episode

10 July, 2025 10:29AM

Russell Coker

Bad Product Comparisons and EVs

When companies design products a major concern seems to be what the reviewers will have to say about it. For any product of significant value the users are unable to perform any reasonable test before buying, for a casual user some problems may only be apparent after weeks of use so professional reviews are important to many people. The market apparently doesn’t want reviews of the form “here’s a list of products that are quite similar and all do the job well, you can buy any of them, it’s no big deal” which would be the most technically accurate way of doing it.

So the reviewers compare the products on the criteria that are easiest to measure; this leads to phones being compared by how light and thin they are. I think it’s often the case that users would be better served by thicker, heavier phones that have larger batteries, but instead they are being sold phones that have good battery life in a fresh installation but which don’t last a day with a full load of apps installed.

The latest issue with bad reviews driving poor product design is electric cars. For a while the advocates of old fashioned cars have touted the range of petrol cars which has become an issue for comparing EVs. I have been driving cars for 35 years and so far I have never driven anywhere that’s out of range of the current electric charging network, even with the range of the LEAF (which is smaller than many other EVs). If I ever felt the need to drive across the Nullarbor Plain then I could rent a car to do that and the costs of such car rental would be small compared to the money I’m saving by driving an EV and also small when compared to the premium I would have to pay for an EV with a larger range.

Some of the recent articles I’ve seen about EVs have covered vehicles with a battery range over 700Km, which is greater than the legal distance a commercial driver can drive without a break. I’ve also seen articles about plans to have a small petrol or Diesel motor in an EV to recharge the battery without directly driving the wheels. A 9KW Diesel motor could provide enough electricity on average to keep the charge maintained in a LEAF battery and according to the specs of Diesel generators would take about 55Kg of fuel to provide the charge a LEAF needs to drive 1000Km. The idea of a mostly electric hybrid car that can do 1000Km on one tank of fuel is interesting as a thought experiment but doesn’t seem to have much actual use. Apparently a Chinese company is planning to release a car that can do 1400Km on one tank of fuel using such technology, which is impressive but not particularly useful.

The next issue of unreasonable competition is in charge speed. Charging a car at 2KW from a regular power socket is a real limit to what you can do with a car. It’s a limit that hasn’t bothered me so far because the most driving I typically do in a week is less than one full charge, so at most I have to charge overnight twice in a week. But if I was going to drive to another city without hiring a car that has better range I’d need a fast charger. Most current models of the Nissan LEAF support charging speeds up to 50KW which means fully charging the battery in under an hour (or slightly over an hour for the long range version). If I was to drive from Melbourne to Canberra in my LEAF I’d have to charge twice which would be an annoyance at those speeds. There are a variety of EVs that can charge at 100KW and some as high as 350KW. 350KW is enough to fully charge the largest EV batteries in half an hour which seems to be as much as anyone would need. But there are apparently plans for 1MW car chargers which would theoretically be able to charge a Hummer (the EV with the largest battery) in 12 minutes. One obvious part of the solution to EV charging times is to not drive a Hummer! Another thing to note is that batteries can’t be charged at a high rate for all charge levels, this is why advertising for fast chargers makes claims like “80% charge in half an hour” which definitely doesn’t mean “100% charge in 37.5 minutes”!

There are significant engineering issues with high power applications. A 1MW cable is not just a bigger version of a regular power cable, there are additional safety issues, user training is required and cooling of the connector is probably required. That’s a lot to just get a better number in the table at the end of a review. There is research in progress on the Megawatt Charging System which is designed to charge heavy vehicles (presumably trucks and buses) at up to 3.75MW. Charging a truck at that rate is reasonable as the process of obtaining and maintaining a heavy vehicle license requires a significant amount of effort and some extra training in 3.75MW charging probably doesn’t make much difference.

A final issue with fast charging is the capacity of the grid. A few years ago I attended a lecture by an electrical engineer who works for the Victorian railway system which was very interesting. The Vic rail power setup involved about 100MW of grid connectivity with special contracts with the grid operators due to the fact that 1MW trains suddenly starting and stopping causes engineering problems that aren’t trivial to solve. They were also working on battery packs and super capacitors to deal with regenerative braking and to avoid brownouts in long sections of track. For a medium size petrol station 14 bays for fuelling cars is common. If 6 such petrol stations were replaced with fast charging stations that can charge cars at 1MW each that would draw the same power as the train network for the entire state! There is a need for significant engineering work to allow most cars to be electric no matter how it’s done, but we don’t need to make that worse just for benchmarks.

10 July, 2025 09:19AM by etbe

Tianon Gravi

Yubi Whati? (YubiKeys, ECDSA, and X.509)

Off-and-on over the last several weeks, I've been spending time trying to learn/understand YubiKeys better, especially from the perspective of ECDSA and signing. 🔏

I had a good mental model for how "slots" work (canonically referenced by their hexadecimal names such as 9C), but found that it had a gap related to "objects"; while closing that, I was annoyed that the main reference table for this gap lives primarily in either a PDF or inside several implementations, so I figured I should create the reference I want to see in the world, but that it would also be useful to write down some of my understanding for my own (and maybe others') future reference. 😎

So, to that end, I'm going to start with a bit (❗) of background information, with the heavy caveat that this only applies to "PIV" ("FIPS 201") usage of YubiKeys, and that I only actually care about ECDSA, although I've been reassured that it's the same for at least RSA (anything outside this is firmly Here Be Not Tianon; "gl hf dd"). 👍

(Incidentally, learning all this helped me actually appreciate the simplicity of cloud-based KMS solutions, which was an unexpected side effect. 😬)

At a really high level, ECDSA is like many other (asymmetric) cryptographic solutions – you've got a public key and a private key, the private key can be used to "sign" data (tiny amounts of data, in fact, like P-256 can only reasonably sign 256 bits of data, which is where cryptographic hashes like SHA256 come in as secure analogues for larger data in small bit sizes), and the public key can then be used to verify that the data was indeed signed by the private key, and only someone with the private key could've done so. There's some complex math and RNGs involved, but none of that's actually relevant to this post, so find that information elsewhere. 🙈
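(If you want to poke at that sign/verify flow without any hardware, a sketch of the same round-trip with plain openssl: prime256v1 is OpenSSL's name for P-256, and data.bin is any small file you have lying around. 🧪)

$ # generate a P-256 keypair, sign a SHA256 digest, verify it
$ openssl ecparam -name prime256v1 -genkey -noout -out priv.pem
$ openssl ec -in priv.pem -pubout -out pub.pem
$ openssl dgst -sha256 -sign priv.pem -out data.sig data.bin
$ openssl dgst -sha256 -verify pub.pem -signature data.sig data.bin
Verified OK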

Unfortunately, this is where things go off the rails: PIV is X.509 ("x509") heavy, and there's no X.509 in the naïve view of my use case. 😞

In a YubiKey (or any other PIV-signing-supporting smart card? do they actually have competitors in this specific niche? 🤔), a given "slot" can hold one single private key. There are ~24 slots which can hold a private key and be used for signing, although "Slot 9c" is officially designated as the "Digital Signature" slot and is encouraged for signing purposes. 🌈⭐

One of the biggest gotchas is that with pure-PIV (and older YubiKey firmware 🤬) the public key for a given slot is only available at the time the key is generated, and the whole point of the device in the first place is that the private key is never, ever available from it (all cryptographic operations happen inside the device), so if you don't save that public key when you first ask the device to generate a private key in a particular slot, the public key is lost forever (asterisk). 🙊

$ # generate a new ECDSA P-256 key in "slot 9c" ("Digital Signature")
$ # WARNING: THIS WILL GLEEFULLY WIPE SLOT 9C WITHOUT PROMPTING
$ yubico-piv-tool --slot 9c --algorithm ECCP256 --action generate
-----BEGIN PUBLIC KEY-----
MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEtGoWRGyjjUlJFXpu8BL6Rnx8jjKR
5+Mzl2Vepgor+k7N9q7ppOtSMWefjFVR0SEPmXqXINNsCi6LpLtNEigIRg==
-----END PUBLIC KEY-----
Successfully generated a new private key.
$ # this is the only time/place we (officially) get this public key

With that background, now let's get to the second aspect of "slots" and how X.509 fits. For every aforementioned slot, there is a corresponding "object" (read: place to store arbitrary data) which corresponds only by convention. For all these "key" slots the (again, by convention) corresponding "object" is explicitly supposed to be an X.509 certificate (see also the PDF reference linked above). 🙉

It turns out this is a useful and topical place to store that public key we need to keep handy! It's also an interesting place to shove additional details about what the key in a given slot is being used for, if that's your thing. Converting the raw public key into a (likely self-signed) X.509 certificate is an exercise for the reader, but if you want to follow the conventions, you need some way to convert a given "slot" to the corresponding "object", and that is the lookup table I wish existed in more forms. 🕳
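(As a sketch of that exercise: yubico-piv-tool can do both the self-signing and the object storage itself; the subject name here is arbitrary, pub.pem is the key saved from the generate step above, and you'll be prompted for the PIN. ✍️)

$ # wrap the public key in a self-signed cert and store it in slot 9C's
$ # conventional object (0x5FC10A)
$ yubico-piv-tool --slot 9c --action verify-pin \
    --action selfsign-certificate --subject '/CN=demo-signing-key/' \
    --input pub.pem --output cert.pem
$ yubico-piv-tool --slot 9c --action import-certificate --input cert.pem
$ # read it back later to recover the public key
$ yubico-piv-tool --slot 9c --action read-certificate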

So, without further ado, here is the anti-climax: 💫

Slot Object Description
0x9A 0x5FC105 X.509 Certificate for PIV Authentication
0x9E 0x5FC101 X.509 Certificate for Card Authentication
0x9C 0x5FC10A X.509 Certificate for Digital Signature
0x9D 0x5FC10B X.509 Certificate for Key Management
0x82 0x5FC10D Retired X.509 Certificate for Key Management 1
0x83 0x5FC10E Retired X.509 Certificate for Key Management 2
0x84 0x5FC10F Retired X.509 Certificate for Key Management 3
0x85 0x5FC110 Retired X.509 Certificate for Key Management 4
0x86 0x5FC111 Retired X.509 Certificate for Key Management 5
0x87 0x5FC112 Retired X.509 Certificate for Key Management 6
0x88 0x5FC113 Retired X.509 Certificate for Key Management 7
0x89 0x5FC114 Retired X.509 Certificate for Key Management 8
0x8A 0x5FC115 Retired X.509 Certificate for Key Management 9
0x8B 0x5FC116 Retired X.509 Certificate for Key Management 10
0x8C 0x5FC117 Retired X.509 Certificate for Key Management 11
0x8D 0x5FC118 Retired X.509 Certificate for Key Management 12
0x8E 0x5FC119 Retired X.509 Certificate for Key Management 13
0x8F 0x5FC11A Retired X.509 Certificate for Key Management 14
0x90 0x5FC11B Retired X.509 Certificate for Key Management 15
0x91 0x5FC11C Retired X.509 Certificate for Key Management 16
0x92 0x5FC11D Retired X.509 Certificate for Key Management 17
0x93 0x5FC11E Retired X.509 Certificate for Key Management 18
0x94 0x5FC11F Retired X.509 Certificate for Key Management 19
0x95 0x5FC120 Retired X.509 Certificate for Key Management 20

See also "piv-objects.json" for a machine-readable copy of this data. 👀🤖💻💾

(Major thanks to paultag and jon gzip johnson for helping me learn and generally putting up with me, but especially dealing with my live-stream-of-thoughts while I stumble through the dark. 💖)

10 July, 2025 07:00AM by Tianon Gravi (admwiggin@gmail.com)

July 08, 2025

hackergotchi for Steinar H. Gunderson

Steinar H. Gunderson

Superimposed codes, take two

After my last post on superimposed codes, I discovered that OEIS already had a sequence for it (I had just missed it due to a slightly different convention), namely A286874 (and its sister sequence A303977, which lists the number of distinct maximal solutions). However, very few terms of this sequence were known; in particular, it was known that a(12) >= 20 (easily proved by simply demonstrating a set of twenty 12-bit numbers with the desired property), but it wasn't known if the value could be higher (i.e., whether there existed a 12-bit set with 21 elements or more). The SAT solver wasn't really working well for this anymore, so I thought: can I just bruteforce it? I.e., can I enumerate all 12-bit 20-element sets and then see if any of them have room for a 21st element?

Now, obviously you cannot run a completely dumb bruteforce. The raw state space is 12*20 = 240 bits, and going through 2^240 different options is simply not going to happen. But it's a good place to start, and then we can start employing tricks from there. (I'm sure there are more fancy ways somehow, but this one was what I chose. I'm no genius with mathematics, but I can write code.)

So I started with a 20-level deep for loop, with each element counting from 0 to 4095 (inclusive). Now, there are some speedups that are obvious; for instance, once you have two elements, you can check that neither is a subset of the other (which is, except in some edge cases with small sets that we don't need to worry about here, a looser condition than what we're trying to test for), and then skip the remaining 18 levels. Similarly, once we have the first three elements, we can start testing whether one is a subset of OR of the two others, and abort similarly.

Furthermore, we can start considering symmetries. We only care about solutions that are qualitatively distinct, in that the ordering of the elements doesn't matter and the ordering of the bits also doesn't matter. So we can simply only consider sequences where the elements are in order, which is extremely simple, very cheap, and nets us a speedup of 20! ~= 2.4 * 10^18. We have to be a bit careful, though, because this symmetry can conflict with other symmetries that we'd like to use for speedup. For instance, it would be nice to impose the condition that the elements must be in order of increasing population count (number of set bits), but if we do this at the same time as the “strictly increasing” condition, we'll start missing valid solutions. (I did use a very weak variant of it, though; no element can have smaller popcount than the first one. Otherwise, you could just swap those two elements and shuffle columns around, and it wouldn't be in conflict.)

However, there is more that we can do which isn't in conflict. In particular, let's consider (writing only 5-bit elements for brevity) that we are considering candidates for the first element:

00011
00101
00110
10010

These are all, obviously, the same (except that the latter ones will be more restrictive); we could just shuffle bits around and get the same thing. So we impose a new symmetry: Whenever we introduce new bits (bits that were previously always unset), they need to start from the right. So now this start of a sequence is valid:

00011
00101

but this is not:

00011
01001

The reason is, again, that we could get the first sequence from the second by flipping the second and third bit (counting from the left). This is cheap and easy to test for, and is not in conflict with our “increasing” criterion as long as we make this specific choice.

But we can extend this even further. Look at these two alternatives:

00111
01001

and

00111
01010

They are also obviously equivalent as prefixes (just swap the fourth and fifth bits), so we don't want to keep both. We make a very similar restriction as before; if all previous bits in a position are the same, then we need to fill bits from the right. (If they're not, then we cannot impose a restriction.) This is also fairly easy to do with some bit fiddling, although my implementation only considers consecutive bits. (It's not in conflict with the strictly-increasing criterion, again because it only makes values lower, not higher. It is, in a sense, a non-decreasing criterion on the columns.)

And finally, consider these two sequences (with some other elements in-between):

00111
01001
.....
10011

and

00111
01011
.....
10001

They are also equivalent; if you exchange first and second bit and then swap the order of them, you end up with the same. So this brings us to the last symmetry: If you introduce a new bit (or more generally N new bits), then you are not allowed to introduce later a value that is the same bit shifted more to the left and with the other bits being lower. So the second sequence would be outlawed.

Now, how do we do all of these tests efficiently? (In particular, the last symmetry, while it helped a lot in reducing the number of duplicate solutions, wasn't a speed win at first.) My first choice was to just generate code that did all the tests, and did them as fast as possible. This was actually quite efficient, although it took GCC several minutes to compile (and Clang even more, although the resulting code wasn't much faster). Amusingly, this code ended up with an IPC above 6 on my Zen 3 (5950X); no need for hyperthreading here! I don't think I've ever seen real-life code this taxing on the execution units, even though this code is naturally extremely branch-heavy. Modern CPUs are amazing beasts.

It's a bit wasteful that we have 64-bit ALUs (and 256-bit SIMD ALUs) and use them to do AND/OR on 12 bits at a time. So I tried various tricks with packing the values to do more tests at a time, but unfortunately, it only led to slowdowns. So eventually, I settled on a very different solution: Bitsets. At any given time, we have a 4096-bit set of valid future values for the inner for loops. Whenever we decide on a value, we look up in a set of pregenerated tables and just AND them into our set. For instance, if we just picked the value 3 (00011), we look up into the “3” table and it will instantly tell us that values like 7 (00111), 11 (01011), and many others are going to be invalid for all inner iterations and we can just avoid considering them altogether. (Iterating over only the set bits in a bitset is pretty fast in general, using only standard tricks.) This saves us from testing any further value against these illegals, so it's super-fast. The resulting tables are large (~4 GB), since we need to look up pairs of values into it, so this essentially transforms our high-ALU problem into a memory-bound problem, but it's still easily worth it (I think it gave a speedup of something like 80x). The actual ANDing is easily done with AVX2, 256 bits at a time.

This optimization not only made the last symmetry-breaking feasible, but also sped up the entire process enough (you essentially get O(n) bitset intersections instead of O(n²) new tests per level) that it went from a “multiple machines, multiple months” project to running comfortably within a day on my 5950X (~6 core-days). I guess maybe a bit anticlimactic; I had to move the database I used for work distribution locally to the machine or else the latency would be killing me. It found the five different solutions very quickly and then a couple of thousand duplicates of them (filtering those out efficiently is a kind of tricky problem in itself!), and then confirmed there were no others. I submitted it to OEIS, and it should hopefully go through the editing process fairly fast.

The obvious next question is: Can we calculate a(13) in the same way? Unfortunately, it seems the answer is no. Recompiling the same code with 13-bit parameters (taking the LUTs up to ~31 GB, still within the amount of RAM I've got) and making a 25-deep instead of 20-level deep for loop, and then running for a while, it seems that we're looking at roughly 4–5000 core years. Which is infeasible unless you've got a lot of money to burn (with spot VMs on GCE, you're talking about roughly half a million dollars, give or take) on something that isn't a very important problem in computer science.

In theory, there's still hope, though: The fact that we're still finding the same solution ~1000x (down from ~100000x before the last symmetries were added!) indicates that there's some more symmetry that we could in theory exploit and break (and that factor 1000 is likely to be much larger for 25 elements than for 20). So if someone more creative than me could invent code for identifying them—or some other way of rejecting elements early—we could perhaps identify a(13). But I don't think that's happening anytime soon. Brute force found its sweet spot and I'm happy about that, but it doesn't scale forever. :-)

08 July, 2025 07:34PM

Scarlett Gately Moore

KDE Applications snaps 25.04.3 released, plus new snaps and fixes!

I have released 25.04.3 and upgraded the Qt6 content snap to 6.9! I also fixed a bug in the kde-neon* extensions with the CMake prefix path.

New snaps!

Audex: A CD ripping application.

GCompris – An excellent children's education application

Labplot – Scientific plotting

Digikam – 8.7.0 with exiftool bug fixed https://bugs.kde.org/show_bug.cgi?id=501424

Krita – 5.2.11 – Excellent graphic art platform (compares to Photoshop)

kgraphviewer – Graphviz .dot file viewer

I am happy to report my arm is mostly functional! Unfortunately, maintaining all these snaps is an enormous amount of work, with time I don’t have! Please consider a donation for the time I should be spending job hunting / getting a website business off the ground. Thank you for your consideration!

08 July, 2025 03:25PM by sgmoore

hackergotchi for David Bremner

David Bremner

Hibernate on the pocket reform 3/n

Context

Serial console hardware

  • Manual is unclear about name of connector (J16 in schematics, J17 in manual).
  • Also numbering of pins is not given afaict.
  • Clone https://source.mnt.re/reform/pocket-reform.git
  • Look at pocket-reform-motherboard.kicad_pcb
  • From the PCB I can confirm J16 and pins numbered left (sysctl) to right.
  • attach "dtech" prolific PL2303 based serial to usb cable per serial console section of PR manual
  • lsusb shows ID 067b:23a3 Prolific Technology, Inc. ATEN Serial Bridge
  • install tio
  • add my user to group dialout
  • newgrp dialout
  • tio /dev/ttyUSB0 -b 1500000
  • A closer look at the PCB in kicad makes me realize the pin labels in the manual are wrong. 4 = GND, 5 = UART1_RX, 6 = UART1_TX. With that change I have U-boot output on boot.

Serial console software

With some help from minute on ircs://irc.libera.chat:6697/#mnt-reform, I got the kernel boot arguments right to have not just u-boot output but linux kernel output on the serial console. In consfigurator notation

(on-change
      (file:has-content "/etc/flash-kernel/ubootenv.d/01reform2_serial_console"
        "setenv bootargs \"${bootargs} console=ttyS2,1500000 keep_bootcon\"")
    (cmd:single "flash-kernel"))
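The same thing in plain shell, for non-consfigurator users (note the single quotes, so that ${bootargs} is expanded by U-Boot rather than by the shell):

printf '%s\n' 'setenv bootargs "${bootargs} console=ttyS2,1500000 keep_bootcon"' \
    > /etc/flash-kernel/ubootenv.d/01reform2_serial_console
flash-kernel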

previous episode|next episode

08 July, 2025 10:29AM

Sahil Dhiman

Five Years of Writing - Now What?

Okay, here’s the deal, I pushed my first post on Reimagined Doodle - Alias Command, five years ago on July 8th, 2020. Don’t think I ever mentioned that post started out as a Github Gist which I later transferred here seeking a more long-term home on an independent platform.

Writing about writings, motivations, and the blog itself has been a recurring theme here over the years. 1 2 3 4 5 6 7 8 9

I’m unsure how I sustained expressing myself and writing here for this long. Now and then, I go months without any thought of writing, and then all of a sudden I start in bursts with sequential posts one after another. There isn’t a pattern per se in topics other than whatever burning question I have at the moment.

So here’s to a milestone and then some.

08 July, 2025 03:11AM

hackergotchi for Junichi Uekawa

Junichi Uekawa

Updated my timezone tool.

Updated my timezone tool. Hovering the mouse will now change the color. Trying to make it more visible to me.

08 July, 2025 01:56AM by Junichi Uekawa

July 07, 2025

Sahil Dhiman

Let's Talk About AI

Recently, Seth Godin wrote Productivity, AI and pushback:

Typesetters did not like the laser printer. Wedding photographers still hate the iphone. And some musicians are outraged that AI is now making mediocre pop music.

In the article, Seth connected how AI is increasing productivity and how anything that improves productivity always wins.

Nowadays, large language models (LLMs) have become synonymous with AI, while AI is a broader field. AI has brought a shift in how things are done. Use cases might vary, but it’s helping in ways like quickly summarizing huge knowledge bases to answer questions or, in my case, helping understand the contextual meaning of complex word (or sentence) usage in language and literature in both English and Hindi, which was sometimes not easy to comprehend with simple web search results.

Even if you or I don’t really like “AI in everything”, we can’t deny the fact that AI is here to stay. This doesn’t take away from the fact that AI needs to become ethical, regulated, and environmentally sustainable.

07 July, 2025 04:42PM

Thorsten Alteholz

My Debian Activities in June 2025

Debian LTS

This was my hundred-thirty-second month that I did some work for the Debian LTS initiative, started by Raphael Hertzog at Freexian. During my allocated time I uploaded or worked on:

  • [DLA 4221-1] libblockdev security update to fix one embargoed CVE related to obtaining full root privileges.
  • [hardening udisks2] uploaded new version of udisks2 with a hardening patch related to DLA 4221-1
  • [DLA 4235-1] sudo security update to fix one embargoed CVE related to prevent a local privilege escalation.
  • [#1106867] got permission to upload kmail-account-wizard; the package was marked as accepted in July.

This month I also did a week of FD duties and attended the monthly LTS/ELTS meeting.

Debian ELTS

This month was the eighty-third ELTS month. During my allocated time I uploaded or worked on:

  • [ELA-1465-1] libblockdev security update to fix one embargoed CVE in Buster, related to obtaining full root privileges.
  • [ELA-1475-1] gst-plugins-good1.0 security update to fix 16 CVEs in Stretch. This also includes cherry-picking other commits to make these fixes possible.
  • [ELA-1476-1] sudo security update to fix one embargoed CVE in Buster, Stretch and Jessie. The fix is related to prevent a local privilege escalation.

This month I also did a week of FD duties and attended the monthly LTS/ELTS meeting.

Debian Printing

This month I uploaded bugfix versions of:

  • lprng to update translations.
  • mtink to update translations
  • cups to fix a FTBFS introduced by changes to systemd

Thanks a lot again to the Release Team who quickly handled all my unblock bugs!

This work is generously funded by Freexian!

Debian Astro

This month I uploaded bugfix versions of:

  • siril (sponsored upload to experimental)
  • calceph (sponsored upload to experimental)

Debian Mobcom

Unfortunately I didn’t find any time to work on this topic.

misc

This month I uploaded bugfix versions of:

Unfortunately I stumbled over a discussion about RFPs. One part of those involved wanted to automatically close older RFPs, the other part just wanted to keep them. But nobody suggested actually taking care of those RFPs. Why is it easier to spend time talking about something instead of solving the real problem? Anyway, I had a look at those open RFPs. Some of them can just be closed because they were not closed when the corresponding package was uploaded. For some others the corresponding software has not seen any upstream activity for several years and depends on older software no longer in Debian (like Python 2). Such bugs can just be closed. Some requested software only works together with long-gone technology (for example the open Twitter API). Such bugs can just be closed. Last but not least, even the old RFPs contain nice software that is still maintained upstream and useful. One example is ta-lib, which I uploaded in June. So, please, let’s put our money where our mouths are. My diary of closed RFP bugs is on people.d.o. If only ten people follow suit, all bugs can be closed within a year.

FTP master

It is still this time of the year when just a few packages arrive in NEW: it is Hard Freeze. So please don’t hold it against me that I enjoy the sun more than processing packages in NEW. This month I accepted 104 and rejected 13 packages. The overall number of packages that got accepted was 105.

07 July, 2025 09:40AM by alteholz

Birger Schacht

Debian on Framework 12

For some time now I had been looking for a device to replace my Thinkpad. It's a 14" device, but that's too big for my taste. I am a big fan of small notebooks, so when frame.work announced their 12" laptop, I took the chance and ordered one right away.

I was in one of the very early batches and got my package a couple of days ago. When ordering, I chose the DIY edition, but in the end there was not that much DIY to do: I had to plug in the storage and the memory, put the keyboard in and tighten some screws. There are very detailed instructions with a lot of photos that tell you which part to put where, which is nice.

Image of the Framework 12 laptop, assembled but powered off

My first impressions of the device are good - it is heavier than I anticipated, but very well made. It is very easy to assemble and disassemble and it feels like it can take a hit.

When I started it the first time it took some minutes to boot because of the new memory module, but then it told me right away that it could not detect an operating system. As usual when I want to install a new system, I created a GRML live USB system and tried to boot from this USB device. But the Framework BIOS did not want to let me boot GRML, telling me it is blocked by the current security policy. So I started to look in the BIOS for the SecureBoot configuration, but there was no such setting anywhere. I then resorted to a Debian Live image, which was allowed to boot.

Image of the screen of the Framework 12 laptop, saying it could not detect an operating system

I only learned later, that the SecureBoot setting is in a separate section that is not part of the main BIOS configuration dialog. There is an “Administer Secure Boot” icon which you can choose when starting the device, but apparently only before you try to load an image that is not allowed.

I always use my personal minimal install script to install my Debian systems, so it did not make that much of a difference to use Debian Live instead of GRML. I only had to apt install debootstrap before running the script.
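The heart of such a script is essentially a debootstrap run plus bootloader setup; a rough sketch (not my actual script, and skipping partitioning, fstab and the like):

# bootstrap a minimal trixie into /mnt (filesystems already mounted there)
debootstrap trixie /mnt http://deb.debian.org/debian
for fs in dev proc sys; do mount --rbind "/$fs" "/mnt/$fs"; done
# kernel plus the signed boot chain for SecureBoot systems
chroot /mnt apt install linux-image-amd64 grub-efi-amd64 shim-signed
chroot /mnt grub-install && chroot /mnt update-grub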

I updated the install script to default to trixie and to also install shim-signed, and after a successful installation I booted into Debian 13 on the Framework 12. Everything seems to work fine so far. WiFi works. For sway to start I had to install firmware-intel-graphics. The touchscreen works without me having to configure anything (though I don’t have the frame.work stylus, as they are not yet available), and changing the brightness of the screen also worked right away. The keyboard feels very nice, likewise the touchpad, which I configured to allow tap-to-click using the tap enabled option of sway-input.
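
For reference, the sway part is tiny; a minimal sketch of that configuration (assuming the default config location, ~/.config/sway/config):

$ cat >> ~/.config/sway/config <<'EOF'
# tap-to-click for all touchpads, see sway-input(5)
input type:touchpad {
    tap enabled
}
EOF
$ swaymsg reload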

Image of the a Framework 12 laptop, showing the default Sway background image

One small downside of the keyboard is that it does not have a backlight, which was a surprise. But given that this is a frame.work laptop, there are chances that a future generation of the keyboard will have backlight support.

The screen of the laptop can be turned all the way around to the back of the laptop’s body, so it can be used as a tablet. In this mode the keyboard gets disabled to prevent accidentally pressing keys.

For online meetings I still prefer using wired headphones over Bluetooth ones, so I’m glad that the laptop has a headphone jack on the side.

Above the screen there are a camera and a microphone, which both have separate physical switches to disable them.

I ordered a couple of expansion cards, in the current setup I use two USB-C, one HDMI and one USB-A. I also ordered a 1TB expansion card and only used this to transfer my /home, but I soon realized that the card got rather hot, so I probably won’t use it as a permanent expansion.

I cannot yet say a lot about how long the battery lasts, but I will bring the laptop to DebConf 25, so I guess I’ll find out there. I might also have a chance to test whether the screen is bright enough to be usable outdoors ;)

07 July, 2025 05:28AM

July 05, 2025

hackergotchi for Bits from Debian

Bits from Debian

Bits from the DPL

Dear Debian community,

This is bits from the DPL for June.

The Challenge of Mentoring Newcomers

In June there was an extended discussion about the ongoing challenges around mentoring newcomers in Debian. As many of you know, this is a topic I’ve cared about deeply--long before becoming DPL. In my view, the issue isn’t just a matter of lacking tools or needing to “try harder” to attract contributors. Anyone who followed the discussion will likely agree that it’s more complex than that.

I sometimes wonder whether Debian’s success contributes to the problem. From the outside, things may appear to “just work”, which can lead to the impression: “Debian is doing fine without me--they clearly have everything under control.” But that overlooks how much volunteer effort it takes to keep the project running smoothly.

We should make it clearer that help is always needed--not only in packaging, but also in writing technical documentation, designing web pages, reaching out to upstreams about license issues, finding sponsors, or organising events. (Speaking from experience, I would have appreciated help in patiently explaining Free Software benefits to upstream authors.) Sometimes we think too narrowly about what newcomers can do, and also about which tasks could be offloaded from overcommitted contributors.

In fact, one of the most valuable things a newcomer can contribute is better documentation. Those of us who’ve been around for years may be too used to how things work--or make assumptions about what others already know. A person who just joined the project is often in the best position to document what’s confusing, what’s missing, and what they wish they had known sooner.

In that sense, the recent "random new contributor’s experience" posts might be a useful starting point for further reflection. I think we can learn a lot from positive user stories, like this recent experience of a newcomer adopting the courier package. I'm absolutely convinced that those who just found their way into Debian have valuable perspectives--and that we stand to learn the most from listening to them.

We should also take seriously what Russ Allbery noted in the discussion: "This says bad things about the project's sustainability and I think everyone knows that." Volunteers move on--that’s normal and expected. But it makes it all the more important that we put effort into keeping Debian's contributor base at least stable, if not growing.

Project-wide LLM budget for helping people

Lucas Nussbaum has volunteered to handle the paperwork and submit a request on Debian’s behalf to LLM providers, aiming to secure project-wide access for Debian Developers. If successful, every DD will be free to use this access--or not--according to their own preferences.

Kind regards Andreas.

05 July, 2025 10:00PM by Andreas Tille

Sergio Cipriano

How I finally tracked my Debian uploads correctly

A long time ago, I became aware of UDD (Ultimate Debian Database), which gathers various Debian data into a single SQL database.

At that time, we were trying to do something simple: list the contributions (package uploads) of our local community, Debian Brasília. We ended up with a script that counted uploads to unstable and experimental.

I was never satisfied with the final result because some uploads were always missing. Here is an example:

debci (3.0) experimental; urgency=medium
...
   [ Sergio de almeida cipriano Junior ]
   * Fix Style/GlovalVars issue
   * Rename blacklist to rejectlist
...

I made changes in debci 3.0, but the upload was done by someone else. This kind of contribution cannot be tracked by that script.

Then, a few years ago, I learned about Minechangelogs, which allows us to search through the changelogs of all Debian packages currently published.

Today, I decided to explore how this was done, since I couldn't find anything useful for that kind of query in UDD's tables.

That's when I came across ProjectB. It was my first time hearing about it. ProjectB is a database that stores all the metadata about the packages in the Debian archive, including the changelogs of those packages.

Now that I'm a Debian Developer, I have access to this database. If you also have access and want to try some queries, you can do this:

$ ssh <username>@mirror.ftp-master.debian.org -N -L 15434:danzi.debian.org:5435
$ psql postgresql://guest@localhost:15434/projectb?sslmode=allow

In the end, it finally solved my problem.

Using the code below, with UDD, I get 38 uploads:

import psycopg2

contributor = 'almeida cipriano'

try:
    connection = psycopg2.connect(
        user="udd-mirror",
        password="udd-mirror",
        host="udd-mirror.debian.net",
        port="5432",
        database="udd"
    )

    cursor = connection.cursor()

    query = f"SELECT source,version,date,distribution,signed_by_name \
FROM public.upload_history \
WHERE changed_by_name ILIKE '%{contributor}%' \
ORDER BY date;"

    cursor.execute(query)
    records = cursor.fetchall()

    print(f"I have {len(records)} uploads.")

    cursor.close()
    connection.close()

except (Exception, psycopg2.Error) as error:
    print("Error while fetching data from PostgreSQL", error)

Using the code below, with ProjectB, I get 43 uploads (the correct amount):

import psycopg2

contributor = 'almeida cipriano'

try:
    # SSH tunnel is required to access the database:
    # ssh <username>@mirror.ftp-master.debian.org -N -L 15434:danzi.debian.org:5435
    connection = psycopg2.connect(
        user="guest",
        host="localhost",
        port="15434",
        database="projectb",
        sslmode="allow"
    )
    connection.set_client_encoding('UTF8')

    cursor = connection.cursor()

    query = f"SELECT c.source, c.version, c.changedby \
FROM changes c \
JOIN changelogs ch ON ch.id = c.changelog_id \
WHERE c.source != 'debian-keyring' \
  AND (\
    ch.changelog ILIKE '%{contributor}%' \
    OR c.changedby ILIKE '%{contributor}%' \
  )\
ORDER BY c.seen;"

    cursor.execute(query)
    records = cursor.fetchall()

    print(f"I have {len(records)} uploads.")

    cursor.close()
    connection.close()

except (Exception, psycopg2.Error) as error:
    print("Error while fetching data from PostgreSQL", error)

It feels good to finally solve this itch I've had for years.

05 July, 2025 01:28PM

Taavi Väänänen

Tracking my train travel by parsing tickets in emails

Rumour has it that I might be a bit of a train nerd. At least I want to collect various nerdy data about my travels. Historically that data has lived in manual form in several places,1 but over the past year and a half I've been working on a toy project to collect most of that information into a custom tool.

That toy project2 uses various sources to get information about trains to fill up its database: for example, in Finland Fintraffic, the organization responsible for railway traffic management, publishes very comprehensive open data about almost everything that's moving on the Finnish railway network. Unfortunately, I cannot be on all of the trains.3 Thus I need to tell the system details about my journeys.

The obvious solution is to make a form that lets me save that data. Which I did, but I got very quickly bored of filling out that form, and as regular readers of this blog know, there is no reason to settle for a simple but boring solution when the alternative is to make something that is ridiculously overengineered.

Parsing data out of my train tickets

Finnish long-distance trains generally require train-specific seat reservations, which means VR (the train company) knows which trains I am on. We just need to find a way to extract that information in some machine-readable format. So my plan for the ridiculously overengineered solution was to parse the booking emails to get the details I need.

Now, VR ticket emails include the data I want in a couple of different formats: as text in the HTML email body, in the embedded calendar invite, as text in the included PDF ticket, and encoded in the Aztec code on the included PDF ticket. I chose to parse the last option, in the hope of building something that could be ported to parse other operators' tickets with relative ease.

Example Aztec code

After a bit of digging (thank you to the KDE Itinerary people for documenting this!) I stumbled upon a European Union Agency for Railways PDF titled ELECTRONIC SEAT/BERTH RESERVATION AND ELECTRONIC PRODUCTION OF TRANSPORT DOCUMENTS - TRANSPORT DOCUMENTS (RCT2 STANDARD) which, in its Appendix C.1, describes how the information is encoded in the code.4 (As a side note, various sources call these codes SSB version 1 codes, although that term isn't used in this specification. So maybe there are more specifications about the format that I haven't discovered yet!)

I then wrote a parser in Go for the binary data embedded in these codes. So far it works, although I wouldn't be surprised if there are some edge cases that it doesn't handle. In particular, the spec specifies a custom lookup table for converting between text and binary data, and that only has support for characters 0-9 and A-Z. But Finnish railway station codes can also use Ä and Ö… maybe I need to buy a ticket to a station with one of those.

Extracting barcodes out of emails

A parser just for the binary format isn't enough here if the intended source input is the emails that VR sends upon making a booking. Time to write a single-purpose email server! In short, the logic in the server, again written in Go and with the help of go-smtp and go-message, is:

  • Accept any mail with a reasonable body size
  • Process through all body parts
  • For all PDF parts, extract all images (see the sketch after this list)
  • For all images, run them through ZXing
  • For all decoded barcodes, try to parse them with my new ticket parsing library I mentioned earlier
  • If any tickets are found, send the data from them and any metadata to the main backend, which will save them to a database
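
The PDF-to-image step is easy to reproduce by hand with poppler-utils, which is handy for debugging; a sketch (ticket.pdf is a placeholder name, and the actual barcode decoding is still left to ZXing):

$ pdfimages -all ticket.pdf barcode   # dump embedded images as barcode-NNN.*
$ ls barcode-*                        # candidates to feed to the barcode decoder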

The custom mail server exposes an LMTP interface over TCP for my internet-facing mail servers to forward to. I chose LMTP for this because it seemed like a better fit in theory than normal (E)SMTP. I've since discovered that curl doesn't support LMTP, which makes development much harder; and in practice there's no benefit to LMTP here, as all mails are sent to the backend in a single request regardless of the number of recipients. So maybe I'll migrate it to regular SMTP at some point.

Side quest time

The last missing part is automatically forwarding the ticket mails to the new service. I've routed a dedicated subdomain to the new service, and the backend is configured to allocate addresses like i2v44g2pygkcth64stjgyuqz@somedomain.example for each user. That would be fine if we wanted to manually forward mails to the service, but we can go one step beyond that: I created a dedicated email alias in my mail server config that routes both to my regular mailbox and to the service address. That way I can update my VR account to use the alias and have mails automatically processed while still receiving backup copies of the tickets (and any other important mail that VR might send me).
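
In Postfix-style /etc/aliases terms, the idea is something like this (a sketch; vr-tickets is a made-up alias name and my actual setup differs):

$ echo 'vr-tickets: taavi, i2v44g2pygkcth64stjgyuqz@somedomain.example' | \
    sudo tee -a /etc/aliases
$ sudo newaliases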

Unfortunately that last part turned out to be easier said than done. Logging in on the website, I'm greeted by this text stating I need to contact customer service by phone to change the address associated with my account.5 After a bit of digging, I noticed that the mobile app suggests filling out a feedback form in order to change the address. So I filled it out, and after a day or two I got a "confirm you want to change your email" mail. Success!


  1. Including (but not limited to): a page of this website, the notes app on my phone, and an uMap map. ↩︎

  2. Which I'm not directly naming here because I still think it needs a lot more work before being presentable, but if you're really interested it's not that hard to find out. ↩︎

  3. Someone should invent human cloning so that we can fix this. ↩︎

  4. People who know much more about railway ticketing than I do were surprised when I told them this format is still in use somewhere. So, uh, sorry if you were expecting a nice universal worldwide standard! ↩︎

  5. In case you have not guessed yet, I do not like making phone calls. ↩︎

05 July, 2025 12:00AM by Taavi Väänänen (hi@taavi.wtf)

July 04, 2025

Russell Coker

Function Keys

For at least 12 years laptops have been defaulting to not having the traditional PC 101-key keyboard function key functionality, instead having other functions like controlling the volume, with a key labelled Fn to toggle between the two. It’s been a BIOS option to control whether traditional function keys or controls for volume etc are the default, and for at least 12 years I’ve configured all my laptops to have the traditional function keys as the default.

Recently I’ve been working in corporate IT and having exposure to many laptops with the default BIOS settings for those keys to change volume etc and no reasonable option for addressing it. This has made me reconsider the options for configuring these things.

Here’s a page listing the standard uses of function keys [1]. Here is a summary of the relevant part of that page:

  • The F1 key launches help, which doesn’t seem to get much use. The main help option in practice is Google (I anticipate controversy about this and welcome comments) and all the software vendors are investigating LLM options for help, which probably won’t involve F1.
  • F2 is for renaming files but doesn’t get much use. Probably most people who use graphical file managers use the right mouse button for it. I use it when sorting a selection of photos.
  • F3 is for launching a search (which is CTRL-F in most programs).
  • ALT-F4 is for closing a window which gets some use, although for me the windows I close are web browsers (via CTRL-W) and terminals (via CTRL-D).
  • F5 is for reloading a page which is used a lot in web browsers.
  • F6 moves the input focus to the URL field of a web browser.
  • F8 is for moving a file which in the degenerate case covers the rename functionality of F2.
  • F11 is for full-screen mode in browsers which is sometimes handy.

The keys F1, F3, F4, F7, F9, F10, and F12 don’t get much use for me and for the people I observe. The F2 and F8 keys aren’t useful in most programs, F6 is only really used in web browsers – but the web browser counts as “most programs” nowadays.

Here’s the description of Thinkpad Fn keys [2]. I use Thinkpads for fun and Dell laptops for work, so it would be nice if they both worked in similar ways but of course they don’t. Dell doesn’t document how their Fn keys are laid out, but the relevant bit is that F1 to F4 are the same as on Thinkpads which is convenient as they are the ones that are likely to be commonly used and needed in a hurry.

I have used the KDE settings on my Thinkpad to map the F1 to F3 keys to their Fn equivalents (F1 to mute audio, F2 for volume down, and F3 for volume up) so I can use them without holding down the Fn key, while other function keys such as F5 and F6 keep their usual GUI functionality. Now I have to train myself to use F8 in situations where I usually use F2, at least when using a laptop.

The only other Fn combinations I use are F5 and F6 for controlling screen brightness, but that’s not something I use much.

It’s annoying that the laptop manufacturers forced me to this. Having a Fn key to get extra functions and not need 101+ keys on a laptop-sized device is a reasonable design choice. But they could have done away with the PrintScreen key to make space for something else. Also for Thinkpads the touchpad is something that could obviously be removed to gain some extra space, as the TrackPoint does all that’s needed in that regard.

04 July, 2025 11:44AM by etbe

Sahil Dhiman

Secondary Authoritative Name Server Options for Self-Hosted Domains

In the past few months, I have moved the authoritative name servers (NS) of two of my domains (sahilister.net and sahil.rocks) in house using PowerDNS. Subdomains of sahilister.net see roughly 320,000 hits/day across my IN and DE mirror nodes, so adding secondary name servers with good availability (in addition to my own) was one of my first priorities.

I explored the following options for my secondary NS, which also didn’t cost me anything:

1984 Hosting

Hurricane Electric

Afraid.org

Puck

NS-Global

Asking friends

Two of my friends and fellow mirror hosts have their own authoritative name server setups: Shrirang (i.e. albony) and Luke. Shrirang gave me another POP in IN, and through Luke (who does have an insane number of in-house NS, see dig ns jing.rocks +short) I added a JP POP.

If we know each other, I would be glad to host a secondary NS for you (in IN and/or DE locations).

Some notes

  • Adding a third-party secondary is putting trust that the third party would serve your zone right.

  • Hurricane Electric and 1984 Hosting provide multiple NS. One can use some or all of them. Ideally, you can get away with just your own NS plus the full set from either of these two. Play around with adding and removing secondaries to see what gives you the best results. Using everyone is overkill anyway, unless you have specific reasons for it.

  • Moving NS in-house isn’t that hard, though be prepared to get it wrong a few times (and some more); a serial check like the one sketched after this list helps catch stale secondaries. I have already faced partial outages because:

    • Recursive resolvers (RR) in the wild behave in weird ways and cache the wrong NS response for longer than the TTL.
    • NS expiry took more time than expected. 2 out of 3 of my Netim’s NS (my domain registrar) had stopped serving my domain, while RRs in the wild hadn’t picked up my new in-house NS. I couldn’t really do anything about it, though.
    • The dot at the end is pretty important.
    • With HE.net, I forgot to delegate my domain on their panel and just added it to my NS set, thinking I had already done so (which I had, but for another domain), leading to a lame server situation.
  • In terms of serving traffic, there’s no distinction between primary and secondary NS. RRs don’t really care which server they send the query to, so one can have a hidden primary too.

  • I initially thought of adding periodic RIPE Atlas measurements from the global set but decided against it, as I already host a termux mirror, which brings in thousands of queries from around the world and thus a diverse set of RRs querying my domain.

  • In most cases, query resolution time will increase with out-of-zone NS servers (which external secondaries most likely are): one query vs. two. Pay close attention to the ADDITIONAL SECTION in Shrirang’s case, followed by mine:

$ dig ns albony.in

; <<>> DiG 9.18.36 <<>> ns albony.in
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 60525
;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 9

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;albony.in.			IN	NS

;; ANSWER SECTION:
albony.in.		1049	IN	NS	ns3.albony.in.
albony.in.		1049	IN	NS	ns4.albony.in.
albony.in.		1049	IN	NS	ns2.albony.in.
albony.in.		1049	IN	NS	ns1.albony.in.

;; ADDITIONAL SECTION:
ns3.albony.in.		1049	IN	AAAA	2a14:3f87:f002:7::a
ns1.albony.in.		1049	IN	A	82.180.145.196
ns2.albony.in.		1049	IN	AAAA	2403:44c0:1:4::2
ns4.albony.in.		1049	IN	A	45.64.190.62
ns2.albony.in.		1049	IN	A	103.77.111.150
ns1.albony.in.		1049	IN	AAAA	2400:d321:2191:8363::1
ns3.albony.in.		1049	IN	A	45.90.187.14
ns4.albony.in.		1049	IN	AAAA	2402:c4c0:1:10::2

;; Query time: 29 msec
;; SERVER: 127.0.0.53#53(127.0.0.53) (UDP)
;; WHEN: Fri Jul 04 07:57:01 IST 2025
;; MSG SIZE  rcvd: 286

vs mine

$ dig ns sahil.rocks

; <<>> DiG 9.18.36 <<>> ns sahil.rocks
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 64497
;; flags: qr rd ra; QUERY: 1, ANSWER: 11, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;sahil.rocks.			IN	NS

;; ANSWER SECTION:
sahil.rocks.		6385	IN	NS	ns5.he.net.
sahil.rocks.		6385	IN	NS	puck.nether.net.
sahil.rocks.		6385	IN	NS	colin.sahilister.net.
sahil.rocks.		6385	IN	NS	marvin.sahilister.net.
sahil.rocks.		6385	IN	NS	ns2.afraid.org.
sahil.rocks.		6385	IN	NS	ns4.he.net.
sahil.rocks.		6385	IN	NS	ns2.albony.in.
sahil.rocks.		6385	IN	NS	ns3.jing.rocks.
sahil.rocks.		6385	IN	NS	ns0.1984.is.
sahil.rocks.		6385	IN	NS	ns1.1984.is.
sahil.rocks.		6385	IN	NS	ns-global.kjsl.com.

;; Query time: 24 msec
;; SERVER: 127.0.0.53#53(127.0.0.53) (UDP)
;; WHEN: Fri Jul 04 07:57:20 IST 2025
;; MSG SIZE  rcvd: 313

  • Theoretically speaking, a small increase/decrease in resolution time would occur based on the chosen TLD and the popularity of the TLD in the query originator’s area (already cached vs. fresh recursion).
  • One can get away with having only 3 NS (or be like Google and have 4 anycast NS or like Amazon and have 8 or like Verisign and make it 13 :P).
  • Nowhere is it written that your NS needs to be called dns* or ns1, ns2, etc. Get creative with naming your NS; be deceptive with the naming :D.
  • A good understanding of RR behavior can help engineer a good authoritative NS system.
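
A practical check worth running after any NS change: make sure every listed NS serves the same zone serial. A minimal shell sketch, reusing the zone from the dig output above:

$ for ns in $(dig +short NS sahil.rocks); do
    printf '%-25s ' "$ns"
    dig +short @"$ns" SOA sahil.rocks | awk '{print $3}'   # third field is the serial
  done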

Further reading

04 July, 2025 02:36AM

Valhalla's Things

Emergency Camisole

Posted on July 4, 2025
Tags: madeof:atoms, craft:sewing, FreeSoftWear

A camisole of white linen fabric; the sides have two vertical strips of filet cotton lace, about 5 cm wide, the top of the front is finished with another lace with triangular points and the straps are made with another insertion lace, about 2 cm wide.

And this is the time when one realizes that she only has one white camisole left. And it’s summer, so I’m wearing a lot of white shirts, and I always wear a white camisole under a white shirt (unless I’m wearing a full chemise).

Not a problem, I have a good pattern for a well fitting camisole that I’ve done multiple times, I don’t even need to take my measurements and draft things, I can get some white jersey from the stash and quickly make a few.

From the stash. Where I have a roll of white jersey and one of off-white jersey. It’s in the inventory. With the “position” field set to a place that no longer exists. uooops.

But I have some leftover lightweight (woven) linen fabric. Surely if I cut the pattern as is with 2 cm of allowance and then sew it with just 1 cm of allowance it will work even in a woven fabric, right?

Wrong.

I mean, it would have probably fit, but it was too tight to squeeze into, and would have required adding maybe a button closure to the front. Feasible, but not something I wanted.

But that’s nothing that can’t be solved with the Power of Insertion Lace, right?

One dig through the Lace Stash1 and some frantic zig-zag sewing later, I had a tube wide enough for me to squiggle in, with lace on the sides not because it was the easiest place for me to put it, but because it was the right place for it to preserve my modesty, of course.

Encouraged by this, I added a bit of lace to the front, for the look of it, and used some more insertion lace for the straps, instead of making them out of fabric.

And, it looks like it can work. I plan to wear it tonight, so that I can find out whether there is something that chafes or anything, but from a quick test it feels reasonable.

a detail of the side of the camisole, showing the full pattern of the filet lace (alternating Xs and Os), the narrow hem on the back (done with a hemming foot) and the fact that the finishing isn't very neat (but should be stable enough for long term use).

At bust level it’s now a bit too wide, and it gapes a bit under the arms, but I don’t think that it’s going to cause significant problems, and (other than everybody on the internet) nobody is going to see it, so it’s not a big deal.

I still have some linen, but I don’t think I’m going to make another one with the same pattern: maybe I’ll try to do something with a front opening, but I’ll see later on, also after I’ve been looking for the missing jersey in a few more potential places.

As for now, the number of white camisoles I have has doubled, and this is progress enough for today.


  1. with many thanks to my mother’s friend who gave me quite a bit of vintage cotton lace.↩︎

04 July, 2025 12:00AM

July 03, 2025

Matthias Geiger

Using the debputy language server in Debian (with neovim)

Since some time now, debputy has been available in the archive. It is a declarative build system for Debian packages, but it also includes a Language Server (LS) part. An LS is a binary that can hook into any client (editor) supporting the LSP (Language Server Protocol) and deliver syntax highlighting, completions, warnings and …

03 July, 2025 10:00PM by Matthias Geiger

Russell Coker

The Fuss About “AI”

There are many negative articles about “AI” (which is not about actual Artificial Intelligence, also known as “AGI”), most of which I think are overblown and often ridiculous.

Resource Usage

Complaints about resource usage are common, training Llama 3.1 could apparently produce as much pollution as “10,000 round trips by car between Los Angeles and New York City”. That’s not great but when you compare to the actual number of people doing such drives in the US and the number of people taking commercial flights on that route it doesn’t seem like such a big deal. Apparently commercial passenger jets cause CO2 emissions per passenger about equal to a car with 2 people. Why is it relevant whether pollution comes from running servers, driving cars, or steel mills? Why not just tax polluters for the damage they do and let the market sort it out? People in the US make a big deal about not being communist, so why not have a capitalist solution, make it more expensive to do undesirable things and let the market sort it out?

ML systems are a less bad use of compute resources than Bitcoin, at least ML systems give some useful results while Bitcoin has nothing good going for it.

The Dot-Com Comparison

People often complain about the apparent impossibility of “AI” companies doing what investors think they will do. But this isn’t anything new, that all happened before with the “dot com boom”. I’m not the first person to make this comparison, The Daily WTF (a high quality site about IT mistakes) has an interesting article making this comparison [1]. But my conclusions are quite different.

The result of that was a lot of Internet companies going bankrupt, the investors in those companies losing money, and other companies then buying up their assets and building profitable businesses on them. The cheap Internet we now have was built on the hardware from bankrupt companies, which was sold for far less than the manufacturing cost. That allowed it to scale up from modem speeds to ADSL without the users paying enough to cover the purchase of the infrastructure. In the early 2000s I worked for two major Dutch ISPs that went bankrupt (not my fault) and one of them continued operations in the identical manner after having the stock price go to zero (I didn’t get to witness what happened with the other one). As far as I’m aware random Dutch citizens and residents didn’t suffer from this and employees just got jobs elsewhere.

There are good things being done with ML systems and when companies like OpenAI go bankrupt other companies will buy the hardware and do good things.

NVidia isn’t ever going to have the future sales that would justify a market capitalisation of almost 4 Trillion US dollars. This market cap can support paying for new research and purchasing rights to patented technology in a similar way to the high stock price of Google supported buying YouTube, DoubleClick, and Motorola Mobility which are the keys to Google’s profits now.

The Real Upsides of ML

Until recently I worked for a company that used ML systems to analyse drivers for signs of fatigue, distraction, or other inappropriate things (smoking which is illegal in China, using a mobile phone, etc). That work was directly aimed at saving human lives with a significant secondary aim of saving wear on vehicles (in the mining industry drowsy drivers damage truck tires and that’s a huge business expense).

There are many applications of ML in medical research such as recognising cancer cells in tissue samples.

There are many less important uses for ML systems, such as recognising different types of pastries to correctly bill bakery customers – technology that was apparently repurposed for recognising cancer cells.

The ability to recognise objects in photos is useful. It can be used for people who want to learn about random objects they see and could be used for helping young children learn about their environment. It also has some potential for assistance for visually impaired people, it wouldn’t be good for safety critical systems (don’t cross a road because a ML system says there are no cars coming) but could be useful for identifying objects (is this a lemon or a lime). The Humane AI pin had some real potential to do good things but there wasn’t a suitable business model [2], I think that someone will develop similar technology in a useful way eventually.

Even without trying to do what the Humane AI Pin attempted, there are many ways for ML based systems to assist phone and PC use.

ML systems allow analysing large quantities of data and giving information that may be correct. When used by a human who knows how to recognise good answers this can be an efficient way of solving problems. I personally have solved many computer problems with the help of LLM systems while skipping over many results that were obviously wrong to me. I believe that any expert in any field that is covered in the LLM input data could find some benefits from getting suggestions from an LLM. It won’t necessarily allow them to solve problems that they couldn’t solve without it but it can provide them with a set of obviously wrong answers mixed in with some useful tips about where to look for the right answers.

Jobs and Politics

Noema Magazine has an insightful article about how “AI” can allow different models of work which can enlarge the middle class [3].

I don’t think it’s reasonable to expect ML systems to make as much impact on society as the industrial revolution, and the agricultural revolutions which took society from more than 90% farm workers to less than 5%. That doesn’t mean everything will be fine but it is something that can seem OK after the changes have happened. I’m not saying “apart from the death and destruction everything will be good”, the death and destruction are optional. Improvements in manufacturing and farming didn’t have to involve poverty and death for many people, improvements to agriculture didn’t have to involve overcrowding and death from disease. This was an issue of political decisions that were made.

The Real Problems of ML

Political decisions that are being made now have the aim of making the rich even richer and leaving more people in poverty and in many cases dying due to being unable to afford healthcare. The ML systems that aim to facilitate such things haven’t been as successful as evil people have hoped but it will happen and we need appropriate legislation if we aren’t going to have revolutions.

There are documented cases of suicide being inspired by ChatGPT systems [4]. There have been people inspired towards murder by ChatGPT systems but AFAIK no-one has actually succeeded in such a crime yet. There are serious issues that need to be addressed with the technology and with legal constraints about how people may use it. It’s interesting to consider the possible uses of ChatGPT systems for providing suggestions to a psychologist; maybe ChatGPT systems could be used to alleviate mental health problems.

The use of LLM systems for cheating on assignments etc isn’t a real issue. People have been cheating on assignments since organised education was invented.

There is a real problem of ML systems based on biased input data issuing decisions that are the average of the bigotry of the people who provided the input. That isn’t going to be worse than the current situation of bigoted humans making decisions based on hate and preconceptions, but it will be more insidious. It is possible to test for this: for example, a bank could test its mortgage approval ML system by changing one factor at a time (name, gender, age, address, etc) and seeing if it changes the answer. If it turns out that the ML system is biased on names then the input data could have names removed. If it turns out to be biased about address then there could be weights put in to oppose that.

For a long time there has been excessive trust in computers. Computers aren’t magic they just do maths really fast and implement choices based on the work of programmers – who have all the failings of other humans. Excessive trust in a rule based system is less risky than excessive trust in a ML system where no-one really knows why it makes the decisions it makes.

Self driving cars kill people, this is the truth that Tesla stock holders don’t want people to know.

Companies that try to automate everything with “AI” are going to be in for some nasty surprises. Getting computers to do everything that humans do in any job would amount to a large portion of an actual intelligent computer, which, if achieved, will raise an entirely different set of problems.

I’ve previously blogged about ML Security [5]. I don’t think this will be any worse than all the other computer security problems in the long term, although it will be more insidious.

How Will It Go?

Companies spending billions of dollars without firm plans for how to make money are going to go bankrupt no matter what business they are in. Companies like Google and Microsoft can waste some billions of dollars on AI chat systems and still keep going as successful businesses. Companies like OpenAI that do nothing other than such chat systems won’t fare as well. But their assets can be used by new companies when sold at less than 10% of the purchase price.

Companies like NVidia that have high stock prices based on the supposed ongoing growth in use of their hardware will have their stock prices crash. But the new technology they develop will be used by other people for other purposes. If hospitals can get cheap diagnostic ML systems because of unreasonable investment into “AI” then that could be a win for humanity.

Companies that bet their entire business on AI even when it’s not necessarily their core business (as Tesla has done with self driving) will have their stock price crash dramatically at a minimum and have the possibility of bankruptcy. Having Tesla go bankrupt is definitely better than having people try to use them as self driving cars.

03 July, 2025 10:21AM by etbe

Sergio Cipriano

Disable sleep on lid close

I am using an old laptop in my homelab, but I want to do everything from my personal computer, with ssh. The default behavior in Debian is to suspend when the laptop lid is closed, but it's easy to change that, just edit

/etc/systemd/logind.conf

and change the line

#HandleLidSwitch=suspend

to

HandleLidSwitch=ignore

then

$ sudo systemctl restart systemd-logind

That's it.
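
Alternatively, to leave the packaged logind.conf untouched, a drop-in should work just as well (logind.conf(5) documents the logind.conf.d directories; the file name here is arbitrary):

$ sudo mkdir -p /etc/systemd/logind.conf.d
$ printf '[Login]\nHandleLidSwitch=ignore\n' | \
    sudo tee /etc/systemd/logind.conf.d/ignore-lid.conf
$ sudo systemctl restart systemd-logind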

03 July, 2025 01:49AM

July 02, 2025

hackergotchi for Dirk Eddelbuettel

Dirk Eddelbuettel

RcppArmadillo 14.6.0-1 on CRAN: New Upstream Minor Release

armadillo image

Armadillo is a powerful and expressive C++ template library for linear algebra and scientific computing. It aims towards a good balance between speed and ease of use, has a syntax deliberately close to Matlab, and is useful for algorithm development directly in C++, or quick conversion of research code into production environments. RcppArmadillo integrates this library with the R environment and language–and is widely used by (currently) 1241 other packages on CRAN, downloaded 40.4 million times (per the partial logs from the cloud mirrors of CRAN), and the CSDA paper (preprint / vignette) by Conrad and myself has been cited 634 times according to Google Scholar.

Conrad released minor version 14.6.0 yesterday, which offers new accessors for non-finite values. And despite being in Beautiful British Columbia on vacation, I had wrapped up two rounds of reverse dependency checks preparing his 14.6.0 release, and shipped this to CRAN this morning where it passed with flying colours and no human intervention—even with over 1200 reverse dependencies. The changes since the last CRAN release are summarised below.

Changes in RcppArmadillo version 14.6.0-1 (2025-07-02)

  • Upgraded to Armadillo release 14.6.0 (Caffe Mocha)

    • Added balance() to transform matrices so that column and row norms are roughly the same

    • Added omit_nan() and omit_nonfinite() to extract elements while omitting NaN and non-finite values

    • Added find_nonnan() for finding indices of non-NaN elements

    • Added standalone replace() function

  • The fastLm() help page now mentions that options to solve() can control its behavior.

Courtesy of my CRANberries, there is a diffstat report relative to previous release. More detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the Rcpp R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can sponsor me at GitHub.

02 July, 2025 09:21PM

Rcpp 1.1.0 on CRAN: C++11 now Minimum, Regular Semi-Annual Update

rcpp logo

With a friendly Canadian hand wave from vacation in Beautiful British Columbia, and speaking on behalf of the Rcpp Core Team, I am excited to share that the (regularly scheduled bi-annual) update to Rcpp just brought version 1.1.0 to CRAN. Debian builds have been prepared and uploaded, Windows and macOS builds should appear at CRAN in the next few days, as will builds for different Linux distributions, and of course r2u should catch up tomorrow as well.

The key highlight of this release is the switch to C++11 as the minimum standard. R itself did so in release 4.0.0 more than half a decade ago; if someone is really tied to an older version of R and an equally old compiler then using an older Rcpp with it has to be acceptable. Our own tests (using continuous integration at GitHub) still go back all the way to R 3.5.* and work fine (with a new-enough compiler). In the previous release post, we commented that we had only one reverse dependency (falsely) come up in the tests by CRAN; this time there were none among the well over 3000 packages using Rcpp at CRAN. Which really is quite amazing, and possibly also a testament to our rigorous continued testing of our development and snapshot releases on the key branch.

This release continues with the six-months January-July cycle started with release 1.0.5 in July 2020. As just mentioned, we do of course make interim snapshot ‘dev’ or ‘rc’ releases available. While we no longer regularly update the Rcpp drat repo, the r-universe page and repo now really fill this role admirably (and with many more builds besides just source). We continue to strongly encourage their use and testing; I run my systems with these versions, which tend to work just as well, and are of course also fully tested against all reverse dependencies.

Rcpp has long established itself as the most popular way of enhancing R with C or C++ code. Right now, 3038 packages on CRAN depend on Rcpp for making analytical code go faster and further. On CRAN, 13.6% of all packages depend (directly) on Rcpp, and 61.3% of all compiled packages do. From the cloud mirror of CRAN (which is but a subset of all CRAN downloads), Rcpp has been downloaded 100.8 million times. The two published papers (also included in the package as preprint vignettes) have, respectively, 2023 (JSS, 2011) and 380 (TAS, 2018) citations, while the book (Springer useR!, 2013) has another 695.

As mentioned, this release switches to C++11 as the minimum standard. The diffstat display in the CRANberries comparison to the previous release shows how several (generated) sources files with C++98 boilerplate have now been removed; we also flattened a number of if/else sections we no longer need to cater to older compilers (see below for details). We also managed more accommodation for the demands of tighter use of the C API of R by removing DATAPTR and CLOENV use. A number of other changes are detailed below.

The full list below details all changes, their respective PRs and, if applicable, issue tickets. Big thanks from all of us to all contributors!

Changes in Rcpp release version 1.1.0 (2025-07-01)

  • Changes in Rcpp API:

    • C++11 is now the required minimal C++ standard

    • The std::string_view type is now covered by wrap() (Lev Kandel in #1356 as discussed in #1357)

    • A last remaining DATAPTR use has been converted to DATAPTR_RO (Dirk in #1359)

    • Under R 4.5.0 or later, R_ClosureEnv is used instead of CLOENV (Dirk in #1361 fixing #1360)

    • Use of lsInternal switched to lsInternal3 (Dirk in #1362)

    • Removed compiler detection macro in a header cleanup setting C++11 as the minimum (Dirk in #1364 closing #1363)

    • Variadic templates are now used unconditionally given C++11 (Dirk in #1367 closing #1366)

    • Remove RCPP_USING_CXX11 as a #define as C++11 is now a given (Dirk in #1369)

    • Additional cleanup for __cplusplus checks (Iñaki in #1371 fixing #1370)

    • Unordered set construction no longer needs a macro for the pre-C++11 case (Iñaki in #1372)

    • Lambdas are supported in Rcpp Sugar functions (Iñaki in #1373)

    • The Date(time)Vector classes now have a default ctor (Dirk in #1385 closing #1384)

    • Fixed an issue where Rcpp::Language would duplicate its arguments (Kevin in #1388, fixing #1386)

  • Changes in Rcpp Attributes:

    • The C++26 standard now has plugin support (Dirk in #1381 closing #1380)
  • Changes in Rcpp Documentation:

    • Several typos were corrected in the NEWS file (Ben Bolker in #1354)

    • The Rcpp Libraries vignette mentions PACKAGE_types.h to declare types used in RcppExports.cpp (Dirk in #1355)

    • The vignettes bibliography file was updated to current package versions, and now uses doi references (Dirk in #1389)

  • Changes in Rcpp Deployment:

    • Rcpp.package.skeleton() creates ‘URL’ and ‘BugReports’ if given a GitHub username (Dirk in #1358)

    • R 4.4.* has been added to the CI matrix (Dirk in #1376)

    • Tests involving NA propagation are skipped under linux-arm64 as they are under macos-arm (Dirk in #1379 closing #1378)

Thanks to my CRANberries, you can also look at a diff to the previous release. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page. Bug reports are welcome at the GitHub issue tracker as well (where one can also search among open or closed issues).

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can sponsor me at GitHub.

02 July, 2025 08:05PM

hackergotchi for Junichi Uekawa

Junichi Uekawa

Japan is now very hot.

Japan is now very hot. If you are coming to Banpaku, be prepared.

02 July, 2025 01:12AM by Junichi Uekawa

July 01, 2025

hackergotchi for Ben Hutchings

Ben Hutchings

FOSS activity in June 2025

01 July, 2025 07:08PM by Ben Hutchings

hackergotchi for David Bremner

David Bremner

Hibernate on the pocket reform 2/n

Context

Testing continued

  • following a suggestion of gordon1, unload the mediatek module first. The following seems to work, either from the console or under sway
echo devices >  /sys/power/pm_test
echo reboot > /sys/power/disk
rmmod mt76x2u
echo disk >  /sys/power/state
modprobe mt76x2u
  • It even works via ssh (on wired ethernet) if you are a bit more patient for it to come back.
  • replacing "reboot" with "shutdown" doesn't seem to affect test mode.
  • replacing "devices" with "platform" (or "processors") leads to unhappiness.
    • under sway, the screen goes blank, and it does not resume
    • same on console

previous episode|next episode

01 July, 2025 10:29AM

hackergotchi for Guido Günther

Guido Günther

Free Software Activities June 2025

Another short status update of what happened on my side last month. Phosh 0.48.0 is out with nice improvements, phosh.mobi e.V. is alive, helped a bit to get cellbroadcastd out, osk bugfixes and some more:

See below for details on the above and more:

phosh

  • Fix crash triggered by our mpris player refactor (MR)
  • Generate vapi file for libphosh (MR)
  • Backport fixes for 0.47 (MR)
  • Media players lockscreen plugin (MR), bugfix
  • Fix lockscreen clock when am/pm is localized (MR)
  • Another round of CI cleanups (MR)
  • Proper life cycle for MetainfoCache in app-grid button tests (MR)
  • Enable cell broadcast display by default (MR)
  • Release 0.48~rc1, 0.48.0

phoc

  • Unify output config updates and support adaptive sync (MR)
  • Avoid crash on shutdown (MR)
  • Avoid use after free in gtk-shell (MR)
  • Simplify CI (MR)
  • Release 0.48~rc1, 0.48.0

phosh-mobile-settings

stevia (formerly phosh-osk-stub)

  • Release 0.48~rc1, 0.48.0
  • Reject non-UTF-8 dictionaries for hunspell to avoid a broken completion bar (MR)
  • Output tracking (MR) as prep for future work
  • Handle non-UTF-8 dictionaries for hunspell for input and output (MR)
  • Fix some leaks (MR)
  • Handle default completer changes right away (MR)

phosh-osk-data

  • Handle stevia rename (MR)
  • Supply ru presage data

phosh-vala-plugins

  • Add example plugin (MR)

pfs

  • Fix initial empty state (MR)
  • Use GNOME's mirror for fdo templates (MR)

xdg-desktop-portal-phosh

xdg-desktop-portal

  • Fix categories for cell broadcasts (MR)
  • Relax app-id requirement in app-chooser portal (MR)

phosh-debs

  • Switch from osk-stub to stevia (MR)

meta-phosh

  • Make installing from sid and experimental convenient (MR)

feedbackd

feedbackd-device-themes

gmobile

  • Release 0.4.0
  • Make gir and doc build warning free (MR)

GNOME clocks

  • Use libfeedback instead of GTK's media api: (MR). This way alarms become more recognizable and users can tweak alarm sounds.
  • Fix flatpak build and CI in our branch that carries the needed patches for mobile

Debian

  • meta-phosh: Switch to 0.47 (MR)
  • libmbim: Upload 1.33.1 to experimental
  • libqmi: Upload 1.37.1 to experimental
  • modemmanager: Upload 1.23.1 to experimental
  • Update mobile-broadband-provider-info to 20250613 (MR) in experimental
  • Upload phoc 0.48~rc1, 0.48.0 to experimental
  • Upload gmobile 0.4.0 to experimental
  • Upload phosh-mobile-settings 0.48~rc1, 0.48.0 to experimental
  • Upload xdg-desktop-portal-phosh 0.48~rc1, 0.48.0 to experimental
  • Prepare stevia 0.48~rc1 and upload 0.48.0 to experimental
  • Upload feedbackd 0.8.3 to experimental
  • Upload feedbackd-device-themes 0.8.4 to experimental

Mobian

  • Add feedbackd and wakeup timer support (MR)

ModemManager

  • Release 1.25.1
  • Test and warning fixes (MR)
  • run asan in ci (MR) and fix more leaks

libmbim

libqmi

mobile-broadband-provider-info

Cellbroadcastd

  • Better handle empty operator (MR)
  • Use GApplication (MR)
  • Fix library init (MR)
  • Add desktop file (MR)
  • Allow to send notifications for cell broadcast messages (MR)
  • Build introspection data (MR)
  • Only indicate Cell Broadcast support for MM >= 1.25 (MR)
  • Implement duplication detection (MR)
  • Reduce API surface (MR)
  • Add symbols file (MR)
  • Support vala (MR)

iio-sensor-proxy

  • Add minimal gio dependency (MR)

twenty-twenty-hugo

  • Support Mastodon (MR)

gotosocial

  • Explain STARTTLS behavior in docs (MR)

Reviews

This is not code by me but reviews on other peoples code. The list is (as usual) slightly incomplete. Thanks for the contributions!

  • cellbroadcastd: Message store (MR)
  • cellbroadcastd: Print severity (MR)
  • cellbroadcastd: Packaging (MR)
  • cellbroadcastd: Rename from cbd (MR)
  • cellbroadcastd: Release 0.0.1 (MR)
  • cellbroadcastd: Release 0.0.2 (MR)
  • cellbroadcastd: Close file descriptors (MR)
  • cellbroadcastd: Sort messages by timestamp (MR)
  • meta-phosh: Ignore subprojects in format check (MR)
  • p-m-s: pmOS tweaks ground work (MR)
  • p-m-s: osk popover switch (MR)
  • p-m-s: Add panel search (MR)
  • p-m-s: Add cellbroadcastd message history (MR)
  • phosh: Add search daemon and command line tool to query search results (MR)
  • phosh: App-grid: Set max-width entries (MR)
  • chatty: Keyboard navigation improvements (MR)
  • phosh: LTR QuickSettings and fix LTR in screenshot tests (MR)
  • iio-sensor-proxy: improve buffer sensor discovery: (MR)
  • Calls: allow favorites to ring (MR)
  • feedbackd: More haptic udev rules (MR)
  • feedbackd: Simplify udev rules (MR)
  • feedbackd: Support legacy LED naming scheme (MR)
  • gmobile: FLX1 wakeup key support (MR)
  • gmobile: FP6 support (MR)

Help Development

If you want to support my work see donations.

Comments?

Join the Fediverse thread

01 July, 2025 08:47AM

Paul Wise

FLOSS Activities June 2025

Focus

This month I didn't have any particular focus. I just worked on issues in my info bubble.

Changes

Issues

Review

Sponsors

All work was done on a volunteer basis.

01 July, 2025 01:55AM

June 30, 2025

hackergotchi for Colin Watson

Colin Watson

Free software activity in June 2025

My Debian contributions this month were all sponsored by Freexian. This was a very light month; I did a few things that were easy or that seemed urgent for the upcoming trixie release, but otherwise most of my energy went into Debusine. I’ll be giving a talk about that at DebConf in a couple of weeks; this is the first DebConf I’ll have managed to make it to in over a decade, so I’m pretty excited.

You can also support my work directly via Liberapay or GitHub Sponsors.

PuTTY

After reading a bunch of recent discourse about X11 and Wayland, I decided to try switching my laptop (a Framework 13 AMD running Debian trixie with GNOME) over to Wayland. I don’t remember why it was running X; I think I must have either inherited some configuration from my previous laptop (in which case it could have been due to anything up to ten years ago or so), or else I had some initial problem while setting up my new laptop and failed to make a note of it. Anyway, the switch was hardly noticeable, which was great.

One problem I did notice is that my preferred terminal emulator, pterm, crashed after the upgrade. I run a slightly-modified version from git to make some small terminal emulation changes that I really must either get upstream or work out how to live without one of these days, so it took me a while to notice that it only crashed when running from the packaged version, because the crash was in code that only runs when pterm has a set-id bit. I reported this upstream, they quickly fixed it, and I backported it to the Debian package.

groff

Upstream bug #67169 reported URLs being dropped from PDF output in some cases. I investigated the history both upstream and in Debian, identified the correct upstream patch to backport, and uploaded a fix.

libfido2

I upgraded libfido2 to 1.16.0 in experimental.

Python team

I upgraded pydantic-extra-types to a new upstream version, and fixed some resulting fallout in pendulum.

I updated python-typing-extensions in bookworm-backports, to help fix python3-tango: python3-pytango from bookworm-backports does not work (10.0.2-1~bpo12+1).

I upgraded twisted to a new upstream version in experimental.

I fixed or helped to fix a few release-critical bugs:

30 June, 2025 11:30PM by Colin Watson

hackergotchi for Gunnar Wolf

Gunnar Wolf

Get your personalized map of DebConf25 in Brest

As I often do, this year I have also prepared a set of personalized maps for your OpenPGP keysigning in DebConf25, in Brest!

What is that, dare you ask?

Partial view of my OpenPGP map

One of the not-to-be-missed traditions of DebConf is a Key-Signing Party (KSP) that spans the whole conference! Travelling from all the corners of the world to a single, large group gathering, we have the ideal opportunity to spread some trust (and hopefully no communicable diseases) on your peers’ identities and strengthen Debian’s OpenPGP keyring.

But whom should you approach for keysigning?

Go find yourself in the nice listing I have prepared. By clicking on your long keyid (in my case, the link labeled 0x2404C9546E145360), anybody can download your certificate (public key + signatures). The SVG and PNG links will yield a graphic version of your position within the DC25 keyring, and the TXT link will give you a textual explanation of it. (of course, your links will differ, yada yada…)

Please note this is still a preview of our KSP information: you will notice there are several outstanding things for me to fix before marking the file as final. First, some names have encoding issues I will fix. Second, some keys might be missing — if you submitted your key as part of the conference registration form but it is not showing, it must be because my scripts didn’t find it in any of the queried keyservers. My scripts are querying the following servers:

hkps://keyring.debian.org/
hkps://keys.openpgp.org/
hkps://keyserver.computer42.org/
hkps://keyserver.ubuntu.com/
hkps://pgp.mit.edu/
hkps://pgp.pm/
hkps://pgp.surf.nl/
hkps://pgpkeys.eu/
hkps://the.earth.li/

Make sure your key is available in at least some of them; I will try to do a further run on Friday, before travelling, or shortly after arriving in France.
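
If you need to (re)publish your certificate to one of them, gpg can do it directly; for example, with my own keyid from above:

$ gpg --keyserver hkps://keyserver.ubuntu.com --send-keys 0x2404C9546E145360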

If you didn’t submit your key in time, but you will be at DC25, please mail me stating [DC25 KSP] in your mail title, and I will manually add it to the list.

On (hopefully!) Friday, I’ll post the final, canonical KSP coordination page, which you should download and whose SHA256 checksum you should calculate. We will have printed out convenience sheets to help you do your keysigning at the front desk.

30 June, 2025 11:07PM

hackergotchi for David Bremner

David Bremner

Hibernate on the pocket reform 1/n

Configuration

  • script: https://docs.kernel.org/power/basic-pm-debugging.html

  • kernel is 6.15.4-1~exp1+reform20250628T170930Z

State of things

  • normal reboot works

  • Either from the console or from sway, the initial test of reboot-mode hibernate fails. In both cases it looks very similar to halting.

    • the screen is dark (but not completely black)
    • the keyboard is still illuminated
    • the system controller still seems to work, although I need to power off before I can power on again, and any "hibernation state" seems lost.

Running tests

  • this is 1a from above
  • freezer test passes
  • devices test from console
    • console comes back (including input)
    • networking (both wired and wifi) seems wedged.
    • console is full of messages from mt76x2u about vendor request 06 and 07 failing. This seems related to https://github.com/morrownr/7612u/issues/17
    • at some point the console becomes non-responsive, except for the aforementioned messages from the wifi module.
  • devices test under sway
    • display comes back
    • keyboard/mouse seem disconnected
    • network down / disconnected?
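For reference, a minimal sketch of how these tests are driven, per the basic-pm-debugging document linked above (shown here for the devices-level test in reboot mode):

# select the hibernation test level: none, core, processors, platform, devices, freezer
echo devices > /sys/power/pm_test
# use the reboot mode of hibernation
echo reboot > /sys/power/disk
# start the test; the machine should resume on its own after a short while
echo disk > /sys/power/state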

next episode

30 June, 2025 06:13PM

Otto Kekäläinen

Corporate best practices for upstream open source contributions

This post is based on presentation given at the Validos annual members’ meeting on June 25th, 2025.

When I started getting into Linux and open source over 25 years ago, the majority of the software development in this area was done by academics and hobbyists. The number of companies participating in open source has since exploded in parallel with the growth of mobile and cloud software, the majority of which is built on top of open source. For example, Android powers most mobile phones today and is based on Linux. Almost all software used to operate large cloud provider data centers, such as AWS or Google, is either open source or made in-house by the cloud provider.

Pretty much all companies, regardless of the industry, have been using open source software at least to some extent for years. However, the degree to which they collaborate with the upstream origins of the software varies. I encourage all companies in a technical industry to start contributing upstream. There are many benefits to having a good relationship with your upstream open source software vendors, both for the short term and especially for the long term. Moreover, with the rollout of the CRA in the EU in 2025-2027, the law will require software companies to contribute security fixes upstream to the open source projects their products use.

To ensure the process is well managed, business-aligned and legally compliant, there are a few do’s and don’ts that are important to be aware of.

Maintain your SBOMs

For every piece of software, regardless of whether the code was done in-house, from an open source project, or a combination of these, every company needs to produce a Software Bill of Materials (SBOM). The SBOMs provide a standardized and interoperable way to track what software and which versions are used where, what software licenses apply, who holds the copyright of which component, which security fixes have been applied and so forth.

A catalog of SBOMs, or equivalent, forms the backbone of software supply-chain management in corporations.
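As a concrete illustration (the tool choice and all names here are mine, not part of the original presentation), generating an SPDX SBOM for a container image with syft might look like this:

$ syft registry.example.com/myapp:1.2.3 -o spdx-json > myapp-1.2.3.spdx.json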

Identify your strategic upstream vendors

The SBOMs are likely to reveal that for any piece of non-trivial software, there are hundreds or thousands of upstream open source projects in use. Few organizations have resources to contribute to all of their upstreams.

If your organization is just starting to organize upstream contribution activities, identify the key projects that have the largest impact on your business and prioritize forming a relationship with them first. Organizations with a mature contribution process will be collaborating with tens or hundreds of upstreams.

An upstream contribution policy typically covers things such as who decides, from a business point of view, what can be contributed upstream; which licenses are allowed and which to avoid; how to document copyright; how to deal with projects that require signing copyright assignments (e.g. contributor license agreements); and other legal guidelines to follow. Additionally, the technical steps for preparing a contribution should be outlined, including how to internally review and re-review them, who the technical approvers are that ensure high quality and good reputation, and so on.

The policy does not have to be static or difficult to produce. Start with a small policy and a few trusted senior developers following it, and update its contents as you run into new situations that need internal company alignment. For example, don’t require staff to create new GitHub accounts merely for the purpose of doing one open source contribution. Initially, do things with minimal overhead and add requirements to the policy only if they have clear and strong benefits. The purpose of a policy should be to make it obvious and easy for employees to do the right thing, not to add obstacles and stop progress or encourage people to break the policy.

Appoint an internal coordinator and champions

Having a written policy on how to contribute upstream will help ensure a consistent process and avoid common pitfalls. However, a written policy alone does not automatically translate into a well-running process. It is highly recommended to appoint at least one internal coordinator who is knowledgeable about how open source communities work, how software licensing and patents work, and is senior enough to have a good sense of what business priorities to optimize for. In small organizations it can be a single person, while larger organizations typically have a full Open Source Programs Office.

This coordinator should oversee the contribution process, track all contributions made across the organization, and further optimize the process by working with stakeholders across the business, including legal experts, business owners and CTOs. The marketing and recruiting folks should also be involved, as upstream contributions will have a reputation-building aspect as well, which can be enhanced with systematic tracking and publishing of activities.

Additionally, at least in the beginning, the organization should also appoint key staff members as open source champions. Implementing a new process always includes some obstacles and occasional setbacks, which may discourage employees from putting in the extra effort to reap the full long-term benefits for the company. Having named champions will empower them to make the first few contributions themselves, setting a good example and encouraging and mentoring others to contribute upstream as well.

Avoid excessive approvals

To maintain a high quality bar, it is always good to have all outgoing submissions reviewed by at least one or two people. Two or three pairs of eyeballs are significantly more likely to catch issues that might slip by someone working alone. The review also slows down the process by a day or two, which gives the author time to “sleep on it”, which usually helps to ensure the final submission is well-thought-out by the author.

Do not require more than one or two reviewers. The marginal utility goes quickly to zero beyond a few reviewers, and at around four or five people the effect becomes negative, as the weight of each approval decreases and the reviewers begin to take less personal responsibility. Having too many people in the loop also makes each feedback round slow and expensive, to the extent that the author will hesitate to make updates and ask for re-reviews due to the costs involved.

If the organization experiences setbacks due to mistakes slipping through the review process, do not respond by adding more reviewers, as it will just grind the contribution process to a halt. If there are quality concerns, invest in training for engineers, CI systems and perhaps an internal certification program for those making public upstream code submissions. A typical software engineer is more likely to put serious effort into becoming proficient, pass a one-off certification exam and then make multiple high-quality contributions, than to improve, or even want to keep contributing upstream, while burdened by a heavy review process on every submission.

Don’t expect upstream to accept all code contributions

Sure, identifying the root cause of and fixing a tricky bug or writing a new feature requires significant effort. While an open source project will certainly appreciate the effort invested, it doesn’t mean it will always welcome all contributions with open arms. Occasionally, the project won’t agree that the code is correct or the feature is useful, and some contributions are bound to be rejected.

You can minimize the chance of experiencing rejections by having a solid internal review process that includes assessing how the upstream community is likely to understand the proposal. Sometimes how things are communicated is more important than how they are coded. Polishing inline comments and git commit messages helps ensure high-quality communication, along with a commitment to respond quickly to review feedback and to conduct regular follow-ups until a contribution is finalized and accepted.

Start small to grow expertise and reputation

In addition to keeping the open source contribution policy lean and nimble, it is also good to start practical contributions with small issues. Don’t aim to contribute massive features until you have a track record of being able to make multiple small contributions.

Keep in mind that not all open source projects are equal. Each has its own culture, written and unwritten rules, development process, documented requirements (which may be outdated) and more. Starting with a tiny contribution, even just a typo fix, is a good way to validate how code submissions, reviews and approvals work in a particular project. Once you have staff who have successfully landed smaller contributions, you can start planning larger proposals. The exact same proposal might be unsuccessful when proposed by a new person, and successful when proposed by a person who already has a reputation for prior high-quality work.

Embrace all and any publicity you get

Some companies have concerns about their employees working in the open. Indeed, every email and code patch an employee submits, and all related discussions become public. This may initially sound scary, but is actually a potential source of good publicity. Employees need to be trained on how to conduct themselves publicly, and the discussions about code should contain only information strictly related to the code, without any references to actual production environments or other sensitive information. In the long run most employees contributing have a positive impact and the company should reap the benefits of positive publicity. If there are quality issues or employee judgment issues, hiding the activity or forcing employees to contribute with pseudonyms is not a proper solution. Instead, the problems should be addressed at the root, and bad behavior addressed rather than tolerated.

When people are working publicly, there tends to also be some degree of additional pride involved, which motivates people to try their best. Contributions need to be public for the sponsoring corporation to later be able to claim copyright or licenses. Considering that thousands of companies participate in open source every day, the prevalence of bad publicity is quite low, and the benefits far exceed the risks.

Scratch your own itch

When choosing what to contribute, select things that benefit your own company. This is not purely about being selfish - often people working on resolving a problem they suffer from are the same people with the best expertise of what the problem is and what kind of solution is optimal. Also, the issues that are most pressing to your company are more likely to be universally useful to solve than any random bug or feature request in the upstream project’s issue tracker.

Remember there are many ways to help upstream

While submitting code is often considered the primary way to contribute, please keep in mind there are also other highly impactful ways to contribute. Submitting high-quality bug reports will help developers quickly identify and prioritize issues to fix. Providing good research, benchmarks, statistics or feedback helps guide development and the project make better design decisions. Documentation, translations, organizing events and providing marketing support can help increase adoption and strengthen long-term viability for the project.

In some of the largest open source projects there are already far more pending contributions than the core maintainers can process. Therefore, developers who contribute code should also get into the habit of contributing reviews. As Linus’ law states, given enough eyeballs, all bugs are shallow. Reviewing other contributors’ submissions will help improve quality, and also alleviate the pressure on core maintainers who are the only ones providing feedback. Reviewing code submitted by others is also a great learning opportunity for the reviewer. The reviewer does not need to be “better” than the submitter - any feedback is useful; merely posting review feedback is not the same thing as making an approval decision.

Many projects are also happy to accept monetary support and sponsorships. Some offer specific perks in return. By human nature, the largest sponsors always get their voice heard in important decisions, as no open source project wants to take actions that scare away major financial contributors.

Starting is the hardest part

Long-term success in open source comes from a positive feedback loop of an ever-increasing number of users and collaborators. As seen in the examples of countless corporations contributing open source, the benefits are concrete, and the process usually runs well after the initial ramp-up and organizational learning phase has passed.

In open source ecosystems, contributing upstream should be as natural as paying vendors in any business. If you are using open source and not contributing at all, you likely have latent business risks without realizing it. You don’t want to wake up one morning to learn that your top talent left because they were forbidden from participating in open source for the company’s benefit, or that you were fined due to CRA violations and mismanagement in sharing security fixes with the correct parties. The faster you start with the process, the less likely those risks will materialize.

30 June, 2025 12:00AM

June 29, 2025

Matthias Geiger

Hello world

I finally got around to setting up a blog with pelican as SSG, so here I will be posting about my various Debian-related activities.

29 June, 2025 10:00PM by Matthias Geiger

Sergio Cipriano

How I deployed this Website

How I deployed this Website

I will describe the step-by-step process I followed to make this static website accessible on the Internet.

DNS

I bought this domain on NameCheap and am using their DNS for now, where I created these records:

Record Type   Host                 Value
A             sergiocipriano.com   201.54.0.17
CNAME         www                  sergiocipriano.com

Virtual Machine

I am using Magalu Cloud for hosting my VM, since employees have free credits.

Besides creating a VM with a public IP, I only needed to set up a Security Group with the following rules:

Type          Protocol   Port   Direction   CIDR
IPv4 / IPv6   TCP        80     IN          Any IP
IPv4 / IPv6   TCP        443    IN          Any IP

Firewall

The first thing I did in the VM was enabling ufw (Uncomplicated Firewall).

Enabling ufw without pre-allowing SSH is a common pitfall and can lock you out of your VM. I did this once :)

A safe way to enable ufw:

$ sudo ufw allow OpenSSH      # or: sudo ufw allow 22/tcp
$ sudo ufw allow 'Nginx Full' # or: sudo ufw allow 80,443/tcp
$ sudo ufw enable

To check if everything is ok, run:

$ sudo ufw status verbose
Status: active
Logging: on (low)
Default: deny (incoming), allow (outgoing), disabled (routed)
New profiles: skip

To                           Action      From
--                           ------      ----
22/tcp (OpenSSH)             ALLOW IN    Anywhere                  
80,443/tcp (Nginx Full)      ALLOW IN    Anywhere                  
22/tcp (OpenSSH (v6))        ALLOW IN    Anywhere (v6)             
80,443/tcp (Nginx Full (v6)) ALLOW IN    Anywhere (v6) 

Reverse Proxy

I'm using Nginx as the reverse proxy. Since I use the Debian package, I just needed to add this file:

/etc/nginx/sites-enabled/sergiocipriano.com

with this content:

server {
    listen 443 ssl;      # IPv4
    listen [::]:443 ssl; # IPv6

    server_name sergiocipriano.com www.sergiocipriano.com;

    root /path/to/website/sergiocipriano.com;
    index index.html;

    location / {
        try_files $uri /index.html;
    }
}

server {
    listen 80;
    listen [::]:80;

    server_name sergiocipriano.com www.sergiocipriano.com;

    # Redirect all HTTP traffic to HTTPS
    return 301 https://$host$request_uri;
}

TLS

It's really easy to set up TLS thanks to Let's Encrypt:

$ sudo apt-get install certbot python3-certbot-nginx
$ sudo certbot install --cert-name sergiocipriano.com
Saving debug log to /var/log/letsencrypt/letsencrypt.log
Deploying certificate
Successfully deployed certificate for sergiocipriano.com to /etc/nginx/sites-enabled/sergiocipriano.com
Successfully deployed certificate for www.sergiocipriano.com to /etc/nginx/sites-enabled/sergiocipriano.com

Certbot will edit the nginx configuration with the path to the certificate.

HTTP Security Headers

I decided to use wapiti, a web application vulnerability scanner, and the report found these problems:

  1. CSP is not set
  2. X-Frame-Options is not set
  3. X-XSS-Protection is not set
  4. X-Content-Type-Options is not set
  5. Strict-Transport-Security is not set

I'll explain them one by one:

  1. The Content-Security-Policy header prevents XSS and data injection by restricting sources of scripts, images, styles, etc.
  2. The X-Frame-Options header prevents a website from being embedded in iframes (clickjacking).
  3. The X-XSS-Protection header is deprecated. It is recommended that CSP is used instead of XSS filtering.
  4. The X-Content-Type-Options header stops MIME-type sniffing to prevent certain attacks.
  5. The Strict-Transport-Security header informs browsers that the host should only be accessed using HTTPS, and that any future attempts to access it using HTTP should automatically be upgraded to HTTPS. Additionally, on future connections to the host, the browser will not allow the user to bypass secure connection errors, such as an invalid certificate. HSTS identifies a host by its domain name only.

I added these security headers inside both the HTTPS and HTTP server blocks, outside the location block, so they apply globally to all responses. Here's what the Nginx config looks like:

add_header Content-Security-Policy "default-src 'self'; style-src 'self';" always;
add_header X-Frame-Options "DENY" always;
add_header X-Content-Type-Options "nosniff" always;
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;

I added always to ensure that nginx sends the header regardless of the response code.

To add the Content-Security-Policy header I had to move the CSS to a separate file, because browsers block inline styles under a strict CSP unless you allow them explicitly ('unsafe-inline'). Moving the styles to a separate file and linking it avoids that:

<link rel="stylesheet" href="./assets/header.css">

29 June, 2025 06:57PM

June 27, 2025

Jonathan Dowland

Viva

On Monday I had my Viva Voce (PhD defence), and passed (with minor corrections).

Post-viva refreshment

It's a relief to have passed after 8 years of work. I'm not quite done of course, as I have the corrections to make! Once those are accepted I'll upload my thesis here.

27 June, 2025 02:00PM

Reproducible Builds (diffoscope)

diffoscope 300 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 300. This version includes the following changes:

[ "Alex" ]
* Fix a regression and add a test so that diffoscope picks up differences
  in metadata for identical files again. (Closes: reproducible-builds/diffoscope#411)

You can find out more by visiting the project homepage.

27 June, 2025 12:00AM

June 26, 2025

Bits from Debian

AMD Platinum Sponsor of DebConf25

We are pleased to announce that AMD has committed to sponsor DebConf25 as a Platinum Sponsor.

The AMD ROCm platform includes programming models, tools, compilers, libraries, and runtimes for AI and HPC solution development on AMD GPUs. Debian is an officially supported platform for AMD ROCm and a growing number of components are now included directly in the Debian distribution.

For more than 55 years AMD has driven innovation in high-performance computing, graphics and visualization technologies. AMD is deeply committed to supporting and contributing to open-source projects, foundations, and open-standards organizations, taking pride in fostering innovation and collaboration within the open-source community.

With this commitment as Platinum Sponsor, AMD is contributing to the annual Debian Developers’ Conference, directly supporting the progress of Debian and Free Software. AMD contributes to strengthening the worldwide community that collaborates on Debian projects year-round.

Thank you very much, AMD, for your support of DebConf25!

Become a sponsor too!

DebConf25 will take place from 14 to 20 July 2025 in Brest, France, and will be preceded by DebCamp, from 7 to 13 July 2025.

DebConf25 is accepting sponsors! Interested companies and organizations may contact the DebConf team through sponsors@debconf.org, and visit the DebConf25 website at https://debconf25.debconf.org/sponsors/become-a-sponsor/.

26 June, 2025 09:37PM by Daniel Lange

June 25, 2025

Tollef Fog Heen

Pronoun support in userdir-ldap

Debian uses LDAP for storing information about users, hosts and other objects. The wrapping around this is called userdir-ldap, or ud-ldap for short. It provides a mail gateway, web UI and a couple of schemas for different object types.

Back in late 2018 and early 2019, we (DSA) removed support for ISO5218 in userdir-ldap, and removed the corresponding data. This made some people upset, since they were using that information, as imprecise as it was, to infer people’s pronouns. ISO5218 has four values for sex: unknown, male, female and N/A. This might have been acceptable when the standard was new (in 1976), but it wasn’t acceptable any longer in 2018.

A couple of days ago, I finally got around to adding support to userdir-ldap to let people specify their pronouns. As it should be, it’s a free-form text field. (We don’t have localised fields in LDAP, so it probably makes sense for people to put the English version of their pronouns there, but the software does not try to control that.)

So far, it’s only exposed through the LDAP gateway, not in the web UI.

If you’re a Debian developer, you can set your pronouns using

echo "pronouns: he/him" | gpg --clearsign | mail changes@db.debian.org

I see that four people have already done so in the time I’ve taken to write this post.

25 June, 2025 08:00PM

June 24, 2025

Evgeni Golov

Using LXCFS together with Podman

JP was puzzled that using podman run --memory=2G … would not result in the 2G limit being visible inside the container. While we were able to identify this as a visualization problem — tools like free(1) only look at /proc/meminfo and that is not virtualized inside a container, you'd have to look at /sys/fs/cgroup/memory.max and friends instead — I couldn't leave it at that. And then I remembered there is actually something that can provide a virtual (cgroup-aware) /proc for containers: LXCFS!

But does it work with Podman?! I always used it with LXC, but there is technically no reason why it wouldn't work with a different container solution — cgroups are cgroups after all.

As we all know: there is only one way to find out!

Take a fresh Debian 12 VM, install podman and verify things behave as expected:

user@debian12:~$ podman run -ti --rm --memory=2G centos:stream9
bash-5.1# grep MemTotal /proc/meminfo
MemTotal:        6067396 kB
bash-5.1# cat /sys/fs/cgroup/memory.max
2147483648

And after installing (and starting) lxcfs, we can use the virtual /proc/meminfo it generates by bind-mounting it into the container (LXC does that part automatically for us):

user@debian12:~$ podman run -ti --rm --memory=2G --mount=type=bind,source=/var/lib/lxcfs/proc/meminfo,destination=/proc/meminfo centos:stream9
bash-5.1# grep MemTotal /proc/meminfo
MemTotal:        2097152 kB
bash-5.1# cat /sys/fs/cgroup/memory.max
2147483648

The same of course works with all the other proc entries lxcfs provides (cpuinfo, diskstats, loadavg, meminfo, slabinfo, stat, swaps, and uptime here), just bind-mount them.
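For example, extending the command above to a few more of the entries (the paths are those used above, as shipped by the Debian lxcfs package):

user@debian12:~$ podman run -ti --rm --memory=2G \
    --mount=type=bind,source=/var/lib/lxcfs/proc/cpuinfo,destination=/proc/cpuinfo \
    --mount=type=bind,source=/var/lib/lxcfs/proc/meminfo,destination=/proc/meminfo \
    --mount=type=bind,source=/var/lib/lxcfs/proc/uptime,destination=/proc/uptime \
    centos:stream9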

And yes, free(1) now works too!

bash-5.1# free -m
               total        used        free      shared  buff/cache   available
Mem:            2048           3        1976           0          67        2044
Swap:              0           0           0

Just don't blindly mount the whole /var/lib/lxcfs/proc over the container's /proc. It did work (as in: "bash and free didn't crash") for me, but with /proc/$PID etc missing, I bet things will go south pretty quickly.

24 June, 2025 07:46PM by evgeni

Dirk Eddelbuettel

RcppRedis 0.2.6 on CRAN: Extensions

A new minor release 0.2.6 of our RcppRedis package arrived on CRAN today. RcppRedis is one of several packages connecting R to the fabulous Redis in-memory data structure store (and much more). It works equally well with the newer fork Valkey. RcppRedis does not pretend to be feature complete, but it may do some things faster than the other interfaces, and also offers an optional coupling with MessagePack binary (de)serialization via RcppMsgPack. The package has been “deployed in production” as a risk / monitoring tool on a trading floor for several years. It also supports pub/sub dissemination of streaming market data as per this earlier example.

This update brings new functions del, lrem, and lmove (for the matching Redis / Valkey commands) which may be helpful in using Redis (or Valkey) as a job queue. We also extended the publish accessor by supporting text (i.e. string) mode along with raw or rds (the prior default which always serialized R objects), just as listen already worked with these three cases. The change makes it possible to publish from R to subscribers not running R, as they cannot rely on the R deserializer. An example is provided by almm, a live market monitor, which we introduced in this blog post. Apart from that, the continuous integration script received another mechanical update.
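As a rough sketch of the job-queue idea (shown with plain redis-cli rather than the RcppRedis API; the key names are invented for illustration):

$ redis-cli lpush jobs:pending '{"id": 42}'             # enqueue a job
$ redis-cli lmove jobs:pending jobs:active RIGHT LEFT   # atomically claim the oldest job
$ redis-cli lrem jobs:active 1 '{"id": 42}'             # remove it once processed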

The detailed changes list follows.

Changes in version 0.2.6 (2025-06-24)

  • The commands DEL, LREM and LMOVE have been added

  • The continuous integration setup was updated once more

  • The pub/sub publisher now supports a type argument similar to the listener; this allows string message publishing for non-R subscribers

Courtesy of my CRANberries, there is also a diffstat report for this release. More information is on the RcppRedis page and at the repository and its issue tracker.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. If you like this or other open-source work I do, you can sponsor me at GitHub.

24 June, 2025 04:23PM

Uwe Kleine-König

Temperature and humidity sensor on OpenWrt

I have a SHT3x humidity and temperature sensor connected to the i2c bus of my Turris Omnia that runs OpenWrt.

To make it produce nice graphs shown in the webif I installed the packages collectd-mod-sensors, luci-app-statistics and kmod-hwmon-sht3x.

To make the sht3x driver bind to the device I added

echo 'sht3x 0x44' > /sys/bus/i2c/devices/0-0070/channel-6/new_device

to /etc/rc.local. After that I only had to enable the Sensors plugin below Statistics -> Setup -> General plugins and check 'Monitor all except specified' in its "Configure" dialog.
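To quickly verify the driver bound and the sensor reads, the standard hwmon sysfs attributes can be read directly (hwmonX varies per system; temp1_input and humidity1_input are the attribute names the kernel's sht3x driver exposes):

grep . /sys/class/hwmon/hwmon*/name         # find which hwmon device is the sht3x
cat /sys/class/hwmon/hwmonX/temp1_input     # temperature in millidegrees Celsius
cat /sys/class/hwmon/hwmonX/humidity1_input # relative humidity in milli-percent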

24 June, 2025 08:22AM

Matthew Garrett

Why is there no consistent single signon API flow?

Single signon is a pretty vital part of modern enterprise security. You have users who need access to a bewildering array of services, and you want to be able to avoid the fallout of one of those services being compromised and your users having to change their passwords everywhere (because they're clearly going to be using the same password everywhere), or you want to be able to enforce some reasonable MFA policy without needing to configure it in 300 different places, or you want to be able to disable all user access in one place when someone leaves the company, or, well, all of the above. There's any number of providers for this, ranging from it being integrated with a more general app service platform (eg, Microsoft or Google) or a third party vendor (Okta, Ping, any number of bizarre companies). And, in general, they'll offer a straightforward mechanism to either issue OIDC tokens or manage SAML login flows, requiring users present whatever set of authentication mechanisms you've configured.

This is largely optimised for web authentication, which doesn't seem like a huge deal - if I'm logging into Workday then being bounced to another site for auth seems entirely reasonable. The problem is when you're trying to gate access to a non-web app, at which point consistency in login flow is usually achieved by spawning a browser and somehow managing submitting the result back to the remote server. And this makes some degree of sense - browsers are where webauthn token support tends to live, and it also ensures the user always has the same experience.

But it works poorly for CLI-based setups. There are basically two options - you can use the device code authorisation flow, where you perform authentication on what is nominally a separate machine to the one requesting it (but in this case is actually the same) and as a result end up with a straightforward mechanism to have your users socially engineered into giving Johnny Badman a valid auth token despite webauthn nominally being unphishable (as described years ago), or you reduce that risk somewhat by spawning a local server and POSTing the token back to it - which works locally but doesn't work well if you're dealing with trying to auth on a remote device. The user experience for both scenarios sucks, and it reduces a bunch of the worthwhile security properties that modern MFA supposedly gives us.

There's a third approach, which is in some ways the obviously good approach and in other ways is obviously a screaming nightmare. All the browser is doing is sending a bunch of requests to a remote service and handling the response locally. Why don't we just do the same? Okta, for instance, has an API for auth. We just need to submit the username and password to that and see what answer comes back. This is great until you enable any kind of MFA, at which point the additional authz step is something that's only supported via the browser. And basically everyone else is the same.

Of course, when we say "That's only supported via the browser", the browser is still just running some code of some form and we can figure out what it's doing and do the same. Which is how you end up scraping constants out of Javascript embedded in the API response in order to submit that data back in the appropriate way. This is all possible but it's incredibly annoying and fragile - the contract with the identity provider is that a browser is pointed at a URL, not that any of the internal implementation remains consistent.

I've done this. I've implemented code to scrape an identity provider's auth responses to extract the webauthn challenges and feed those to a local security token without using a browser. I've also written support for forwarding those challenges over the SSH agent protocol to make this work with remote systems that aren't running a GUI. This week I'm working on doing the same again, because every identity provider does all of this differently.

There's no fundamental reason all of this needs to be custom. It could be a straightforward "POST username and password, receive list of UUIDs describing MFA mechanisms, define how those MFA mechanisms work". That even gives space for custom auth factors (I'm looking at you, Okta Fastpass). But instead I'm left scraping JSON blobs out of Javascript and hoping nobody renames a field, even though I only care about extremely standard MFA mechanisms that shouldn't differ across different identity providers.
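To make that concrete, here is a purely hypothetical sketch of the kind of flow being asked for (every endpoint and field below is invented for illustration, not any real provider's API):

POST /authn HTTP/1.1
Content-Type: application/json

{"username": "jdoe", "password": "hunter2"}

HTTP/1.1 200 OK
Content-Type: application/json

{"mfa": [{"id": "e7f3c0d2", "type": "webauthn", "challenge": "/authn/mfa/e7f3c0d2"},
         {"id": "91c04b77", "type": "totp",     "challenge": "/authn/mfa/91c04b77"}]}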

Someone, please, write a spec for this. Please don't make it be me.

24 June, 2025 06:03AM

June 23, 2025

Gunnar Wolf

Private key management • Oh, the humanity...

If we ever thought a couple of years or decades of constant use would get humankind to understand how an asymmetric key pair is to be handled… It’s time we moved back to square one.

I had to do an online procedure (trámite) with the Mexican federal government to get a statement certifying I successfully finished my studies, and I found this jewel of user interface:

E.firma

So… I have to:

  1. Submit the asymmetric key I use for tax purposes, as that’s the ID the government has registered for me. OK, I didn’t expect it to be used for this purpose as well, but I’ll accept it. Of course, in our tax system many people don’t require having a public key generated (“easier” regimes are authenticated by password only), but all professionals with a cédula profesional (everybody obtaining a university degree) are now compelled to do this step.
  2. Not only do I have to submit my certificate (public key)… but also the private part (and, of course, the password that secures it).

    I understand I’m interacting with a Javascript thingie that runs only client-side, and I trust it is not shipping my private key to their servers. But given it is an opaque script, I have no assurance about it. And, of course, this irks me because I am who I am and because I’ve spent several years thinking about cryptography. But for regular people, it just looks as a stupid inconvenience: they have to upload two weird files with odd names and provide a password. What for?

This is beyond stupid. I’m baffled.

(of course, I did it, because I need the fsckin’ document. Oh, and of course, I paid my MX$1770, ≈€80, for it… which does not make me too happy for a procedure that’s not even shuffling papers, only storing the right bits in the right corner of the right datacenter, but anyhow…)

23 June, 2025 07:40PM

Russell Coker

PFAs

For some time I’ve been noticing news reports about PFAs [1]. I hadn’t thought much about that issue, I grew up when leaded petrol was standard, when almost all thermometers had mercury, when all small batteries had mercury, and I had generally considered that I had already had so many nasty chemicals in my body that as long as I don’t eat bottom feeding seafood often I didn’t have much to worry about. I already had a higher risk of a large number of medical issues than I’d like due to decisions made before I was born and there’s not much to do about it given that there are regulations restricting the emissions of lead, mercury etc.

I just watched a Veritasium video about Teflon and the PFA poisoning related to its production [2]. This made me realise that it’s more of a problem than I had thought, and a problem that’s getting worse. PFA levels in the parts-per-trillion range in the environment can cause parts-per-billion levels in the body, which increases the risks of several cancers and causes other health problems. Fortunately there is some work being done on water filtering: you can get home-scale filters now, and they are working on filters that can work at a sufficient scale for a city water plant.

There is a map showing PFAs in the environment in Australia which shows some sites with concerning levels that are near residential areas [3]. One of the major causes for that in Australia is fire retardant foam – Australia has never had much if any Teflon manufacturing AFAIK.

Also they noted that donating blood regularly can decrease levels of PFAs in the bloodstream. So presumably people who have medical conditions that require receiving donated blood regularly will have really high levels.

23 June, 2025 12:26PM by etbe

June 22, 2025

Iustin Pop

Coding, as we knew it, has forever changed

Back when I was terribly naïve

When I was younger, and definitely naïve, I was so looking forward to AI, which would help us write lots of good, reliable code faster. Well, principally me, not thinking about what impact it would have industry-wide. Other, more general concerns, like societal issues, the role of humans in the future and so on, were totally not on my radar.

At the same time, I didn’t expect this would actually happen. Even years later, things didn’t change dramatically. Even the first release of ChatGPT a few years back didn’t click for me, as the limitations were still significant.

Hints of serious change

The first hint of the change, for me, was when a few months ago (yes, behind the curve), I asked ChatGPT to re-explain a concept to me, and it just wrote a lot of words, but without a clear explanation. On a whim, I asked Grok—then recently launched, I think—to do the same. And for the first time, the explanation clicked and I felt I could have a conversation with it. Of course, now I forgot again that theoretical CS concept, but the first step was done: I can ask an LLM to explain something, and it will, and I can have a back and forth logical discussion, even if on some theoretical concept. Additionally, I learned that not all LLMs are the same, and that means there’s real competition and that leap frogging is possible.

Another thing I tried to adopt early but failed to get mileage out of was GitHub Copilot (in VSC). I tried, it helped, but didn’t feel any speed-up at all. Then more recently, in May, I asked Grok what’s the state of the art in AI-assisted coding. It said either Claude in a browser tab, or in VSC via the continue.dev extension.

The continue.dev extension/tooling is a bit of a strange/interesting thing. It seems to want to be a middle-man between the user and actual LLM services, i.e. you pay a subscription to continue.dev, not to Anthropic itself, and they manage the keys/APIs, for whatever backend LLMs you want to use. The integration with Visual Studio Code is very nice, but I don’t know if long-term their business model will make sense. Well, not my problem.

Claude: reverse engineering my old code and teaching new concepts

So I installed the latter and subscribed, thinking 20 CHF for a month is good for testing. I skipped the tutorial model/assistant, created a new one from scratch, just enabled Claude 3.7 Sonnet, and started using it. And then, my mind was blown - not just by the LLM, but by the ecosystem. As said, I’ve used GitHub Copilot before, but it didn’t seem effective. I don’t know if a threshold has been reached, or Claude (3.7 at that time) is just better than ChatGPT.

I didn’t use the AI to write (non-trivial) code for me, at most boilerplate snippets. But I used it both as a partner for discussion - “I want to do x, what do you think, A or B?” - and as a teacher, especially for frontend topics, which I’m not familiar with.

Since May, in mostly fragmented sessions, I’ve achieved more than in the last two years. Migration from old school JS to ECMA modules, a webpacker (reducing bundle size by 50%), replacing an old Javascript library with hand written code using modern APIs, implementing the zoom feature together with all of keyboard, mouse, touchpad and touchscreen support, simplifying layout from manually computed to automatic layout, and finding a bug in webkit for which it also wrote a cool minimal test (cool, as in, way better than I’d have ever, ever written, because for me it didn’t matter that much). And more. Could I have done all this? Yes, definitely, nothing was especially tricky here. But hours and hours of reading MDN, scouring Stack Overflow and Reddit, and lots of trial and error. So doable, but much more toily.

This, to me, feels like cheating. 20 CHF per month to make me 3x more productive is free money—well, except that I don’t make money on my code which is written basically for myself. However, I don’t get stuck anymore searching the web for hours for guidance; I ask my question, and I get at least a direction if not an answer, and I’m finished way earlier. I can now actually juggle more hobbies in the same amount of time, if my personal code takes less time or, differently said, if I’m more efficient at it.

Not all is roses, of course. Once, it did write code with such an endearing error that it made me laugh. It was so blatantly obvious that you shouldn’t keep other state in the array that holds pointer status, because that confuses the calculation of “how many pointers are down” - probably obvious to itself too, had I asked. But I didn’t, since it felt a bit embarrassing to point out such a dumb mistake. Yes, I’m anthropomorphising again, because this is the easiest way to deal with things.

In general, it does an OK-to-good-to-sometimes-awesome job, and the best thing is that it summarises documentation and all of Reddit and Stack Overflow. And gives links to those.

Now, I have no idea yet what this means for the job of a software engineer. If on open source code, my own code, it makes me 3x faster—reverse engineering my code from 10 years ago is no small feat—for working on large codebases, it should do at least the same, if not more.

As an example of how open-ended the assistance can be, at one point, I started implementing a new feature—threading a new attribute to a large number of call points. This is not complex at all: just add a new field to a Haskell record, and modify everything to take it into account, populate it, merge it when merging the data structures, etc. The code is not complex, tending toward boilerplate a bit, and I was wondering about a few possible choices for implementation, so, with just a few lines of code written that were not even compiling, I asked “I want to add a new feature, should I do A or B if I want it to behave like this”, and the answer was something along the lines of “I see you want to add the specific feature I was working on, but the implementation is incomplete, you still need to do X, Y and Z”. My mind was blown at this point, as I thought, if the code doesn’t compile, surely the computer won’t be able to parse it, but this is not a program, this is an LLM, so of course it could read it kind of as a human would. Again, the code complexity is not great, but the fact that it was able to read a half-written patch, understand what I was working towards, and reason about it, was mind-blowing, and scary. Like always.

Non-code writing

Now, after all this, while writing a recent blog post, I thought—this is going to be public anyway, so let me ask Claude what it thinks about it. And I was very surprised, again: gone was all the pain of rereading my post three times to catch typos (easy) or phrasing structure issues. It gave me very clear points, and helped me cut 30-40% of the total time. So not only coding, but word smithing too is changed. If I were an author, I’d be delighted (and scared). Here is the overall reply it gave me:

  • Spelling and grammar fixes, all of them on point except one mistake (I claimed I didn’t capitalize one word, but I did). To the level of a good grammar checker.
  • Flow Suggestions, which was way beyond normal spelling and grammar. It felt like a teacher telling me to do better in my writing, i.e. nitpicking on things that actually were true even if they’d still work. I.e. lousy phrase structure, still understandable, but lousy nevertheless.
  • Other notes: an overall summary. This was mostly just praising my post 😅. I wish LLMs were not so focused on “praise the user”.

So yeah, this speeds me up to about 2x on writing blog posts, too. It definitely feels not fair.

Whither the future?

After all this, I’m a bit flabbergasted. Gone are the 2000s with code without unit tests, gone are the 2010s without CI/CD, and now, mid-2020s, gone is the lone programmer who scours the internet to learn new things, alone?

What this all means for our skills in software development, I have no idea, except I know things have irreversibly changed (a butlerian jihad aside). Do I learn better with a dedicated tutor even if I don’t fight with the problem for so long? Or is struggling in finding good docs the main method of learning? I don’t know yet. I feel like I understand the topics I’m discussing with the AI, but who knows in reality what it will mean long term in terms of “stickiness” of learning. For the better, or for worse, things have changed. After all the advances over the last five centuries in mechanical sciences, it has now come to some aspects of the intellectual work.

Maybe this is the answer to the ever-growing complexity of tech stacks? I.e. a return of the lone programmer that builds things end-to-end, but with AI taming the complexity added in the last 25 years? I can dream, of course, but this also means that the industry overall will increase in complexity even more, because large companies tend to do that, so maybe a net effect of not much…

One thing I did learn so far is that my expectation that AI (at this level) will only help junior/beginner people, i.e. it would flatten the skills band, is not true. I think AI can speed up at least the middle band, likely the middle top band, I don’t know about the 10x programmers (I’m not one of them). So, my question about AI now is how to best use it, not to lament how all my learning (90% self learning, to be clear) is obsolete. No, it isn’t. AI helps me start and finish one migration (that I delayed for ages), then start the second, in the same day.

At the end of this—a bit rambling—reflection on the past month and a half, I still have many questions about AI and humanity. But one has been answered: yes, “AI”, quotes or no quotes, already has changed this field (producing software), and we’ve not seen the end of it, for sure.

22 June, 2025 09:33PM

Steinar H. Gunderson

Superimposed codes

I had a peculiar question at work recently, and it went off on a tangent that was way too long and somewhat interesting, so I wanted to share.

The question is: Can you create a set of N-bit numbers (codes), so that

a) Neither is a subset of each other, and
b) Neither is a subset of the OR of two of the others?

Of course, you can trivially do this (e.g., for N=5, choose 10000, 01000, 00100 and so on), but how many can you make for a given N? This is seemingly an open question, but at least I found that they are called (1,2) superimposed codes and have history at least back to this 1964 paper. They present a fairly elegant (but definitely non-optimal) way of constructing them for certain N; let me show an example for N=25:

We start by counting 3-digit numbers (k=3) in base 5 (q=5):

  • 000
  • 001
  • 002
  • 003
  • 004
  • 010
  • 011
  • etc…

Now we have 5^3 numbers. Let's set out to give them the property that we want.

This code (set of numbers) trivially has distance 1; that is, every number differs from every other number by at least one digit. We'd like to increase that distance so that it is at least as large as k. Reed-Solomon gives us an optimal way of doing that; for every number, we add two checksum digits and R-S will guarantee that the resulting code has distance 3. (Just trust me on this, I guess. It only works for q >= (k+1)/2, though, and q must be a power of an odd prime because otherwise the group theory doesn't work out.)

We now have a set of 5-digit numbers with distance 3. But if we now take any three numbers from this set, there is at least one digit where all three must differ, since the distance is larger than half the number of digits: Two numbers A and B differ from each other in at least 3 of the 5 digits, and A and C also have to differ from each other in at least 3 of the 5 digits. There just isn't room for A and B to be the same in all the places that A differs from C.

To modify this property into the one that we want, we encode each digit into binary using one-hot encoding (00001, 00010, 00100, etc.). Now our 5-digit numbers are 25-bit numbers. And due to the "all different" property in the previous paragraph, we also have our superimposition property; there's at least one 5-bit group where A|B shares no bits with C. So this gives us a 25-bit set with 125 different values and our desired property.
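Since the construction is compact, here is a quick Python sketch (my own code, not from the paper) that builds the 125 codewords using polynomial-evaluation Reed-Solomon over GF(5) and brute-force checks the (1,2)-superimposed property:

from itertools import product, combinations

q, k = 5, 3

def rs_encode(msg):
    # Evaluate the degree-<k polynomial with coefficients msg at x = 0..q-1 over GF(5).
    # This is an MDS code of length 5, so its minimum distance is 5 - 3 + 1 = 3.
    return [sum(c * pow(x, i, q) for i, c in enumerate(msg)) % q for x in range(q)]

codes = []
for msg in product(range(q), repeat=k):
    bits = 0
    for digit in rs_encode(msg):  # one-hot encode each base-5 digit into 5 bits
        bits = (bits << q) | (1 << digit)
    codes.append(bits)

assert len(set(codes)) == q ** k  # 125 distinct 25-bit codewords

# (1,2) property: no codeword is a subset of another, nor of the OR of two others.
for a in codes:
    others = [x for x in codes if x != a]
    assert all(a & ~b for b in others)
    assert all(a & ~(b | c) for b, c in combinations(others, 2))

print("ok: 125 codewords of 25 bits with the (1,2)-superimposed property")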

This isn't necessarily an optimal code (and the authors are very clear on that), but it's at least systematic and easy to extend to larger sizes. (I used a SAT solver to extend this to 170 different values, just by keeping the 125 first and asking for 45 more that were not in conflict. 55 more was evidently hard.) The paper has tons more information, including some stuff based on Steiner systems that I haven't tried to understand. And of course, there are tons more later papers, including one by Erdős. :-)

I've applied for an account at OEIS so I can add a sequence for the maximum number of possible codes for each N. It doesn't have many terms known yet, because the SAT solver struggles hard with this (at least in my best formulation), but at least it will give the next person something to find when they are searching. :-)

22 June, 2025 11:45AM

Sahil Dhiman

Case of (broken) maharashtra.gov.in Authoritative Name Servers

Maharashtra is a state here in India, which has Mumbai, the financial capital of India, as its capital. maharashtra.gov.in is the official website of the State Government of Maharashtra. We’re going to talk about the authoritative name servers serving it (and a bunch of child zones under maharashtra.gov.in).

Here’s a simple trace for the main domain:

$ dig +trace maharashtra.gov.in

; <<>> DiG 9.18.33-1~deb12u2-Debian <<>> +trace maharashtra.gov.in
;; global options: +cmd
.            33128    IN    NS    j.root-servers.net.
.            33128    IN    NS    h.root-servers.net.
.            33128    IN    NS    l.root-servers.net.
.            33128    IN    NS    k.root-servers.net.
.            33128    IN    NS    i.root-servers.net.
.            33128    IN    NS    g.root-servers.net.
.            33128    IN    NS    f.root-servers.net.
.            33128    IN    NS    e.root-servers.net.
.            33128    IN    NS    b.root-servers.net.
.            33128    IN    NS    d.root-servers.net.
.            33128    IN    NS    c.root-servers.net.
.            33128    IN    NS    m.root-servers.net.
.            33128    IN    NS    a.root-servers.net.
.            33128    IN    RRSIG    NS 8 0 518400 20250704050000 20250621040000 53148 . pGxGZftwj+6VNTSQtstTKVN95Z7/b5Q8GSjRCXI68GoVYbVai9HNelxs OGIRKL4YmSrsiSsndXuEsBuvL9QvQ+qbybNLkekJUAiicKYNgr3KM3+X 69rsS9KxHgT2T8/oqG8KN8EJLJ8VkuM2PJ2HfSKijtF7ULtgBbERNQ4i u2I/wQ7elOyeF2M76iEOa7UGhgiBHSBqPulsbpnB//WbKL71yyFhWSk0 tiFEPuZM+iLrN2qBsElriF4kkw37uRHq8sSGcCjfBVdkpbb3/Sb3sIgN /zKU17f+hOvuBQTDr5qFIymqGAENA5UZ2RQjikk6+zK5EfBUXNpq1+oo 2y64DQ==
;; Received 525 bytes from 9.9.9.9#53(9.9.9.9) in 3 ms

in.            172800    IN    NS    ns01.trs-dns.com.
in.            172800    IN    NS    ns01.trs-dns.net.
in.            172800    IN    NS    ns10.trs-dns.org.
in.            172800    IN    NS    ns10.trs-dns.info.
in.            86400    IN    DS    48140 8 2 5EE4748C2069B99C98BC39A56881A64AF17CC78711E6297D43AC5A4F 4B5BB6E5
in.            86400    IN    RRSIG    DS 8 1 86400 20250704050000 20250621040000 53148 . jkCotYosapreoKKPvr9zPOEDECYVe9OtJLjkQbFfTin8uYbm/kdWzieW CkN5sabif5IHTFU4FEVOShfu4DFeUolhNav56TPKjGqEGjQ7qCghpqTj dNN4iY2s8BcJ2ujHwhm6HRfdbQRVoKYQ73UUZ+oWSute6lXWHE9+Snk2 1ZCAYPdZ2s1s7NZhrZW2YXVw/nHIcRl/rHqWIQ9sgUlsd6MwmahcAAG+ v15HG9Q48rCG1A2gJlJPbxWpVe0EUEu8LzDsp+ORqy1pHhzgJynrJHJz qMiYU0egv2j7xVPSoQHXjx3PG2rsOLNnqDBYCA+piEXOLsY3d+7c1SZl w9u66g==
;; Received 679 bytes from 199.7.83.42#53(l.root-servers.net) in 3 ms

maharashtra.gov.in.    900    IN    NS    ns8.maharashtra.gov.in.
maharashtra.gov.in.    900    IN    NS    ns9.maharashtra.gov.in.
maharashtra.gov.in.    900    IN    NS    ns10.maharashtra.gov.in.
maharashtra.gov.in.    900    IN    NS    ns18.maharashtra.gov.in.
maharashtra.gov.in.    900    IN    NS    ns20.maharashtra.gov.in.
npk19skvsdmju264d4ono0khqf7eafqv.gov.in. 300 IN    NSEC3 1 1 0 - P0KKR4BMBGLJDOKBGBI0KDM39DSM0EA4 NS SOA MX TXT RRSIG DNSKEY NSEC3PARAM
npk19skvsdmju264d4ono0khqf7eafqv.gov.in. 300 IN    RRSIG NSEC3 8 3 300 20250626140337 20250528184339 48544 gov.in. Khcq3n1Jn34HvuBEZExusVqoduEMH6DzqkWHk9dFkM+q0RVBYBHBbW+u LsSnc2/Rqc3HAYutk3EZeS+kXVF07GA/A486dr17Hqf3lHszvG/MNT/s CJfcdrqO0Q8NZ9NQxvAwWo44bCPaECQV+fhznmIaVSgbw7de9xC6RxWG ZFcsPYwYt07yB5neKa99RlVvJXk4GHX3ISxiSfusCNOuEKGy5cMxZg04 4PbYsP0AQNiJWALAduq2aNs80FQdWweLhd2swYuZyfsbk1nSXJQcYbTX aONc0VkYFeEJzTscX8/wNbkJeoLP0r/W2ebahvFExl3NYpb7b2rMwGBY omC/QA==
npk19skvsdmju264d4ono0khqf7eafqv.gov.in. 300 IN    RRSIG NSEC3 13 3 300 20250718144138 20250619135610 22437 gov.in. mbj7td3E6YE7kIhYoSlDTZR047TXY3Z60NY0aBwU7obyg5enBQU9j5nl GUxn9zUiwVUzei7v5GIPxXS7XDpk7g==
6bflkoouitlvj011i2mau7ql5pk61sks.gov.in. 300 IN    NSEC3 1 1 0 - 78S0UO5LI1KV1SVMH1889FHUCNC40U6T TXT RRSIG
6bflkoouitlvj011i2mau7ql5pk61sks.gov.in. 300 IN    RRSIG NSEC3 8 3 300 20250626133905 20250528184339 48544 gov.in. M2yPThQpX0sEf4klooQ06h+rLR3e3Q/BqDTSFogyTIuGwjgm6nwate19 jGmgCeWCYL3w/oxsg1z7SfCvDBCXOObH8ftEBOfLe8/AGHAEkWFSu3e0 s09Ccoz8FJiCfBJbbZK5Vf4HWXtBLfBq+ncGCEE24tCQLXaS5cT85BxZ Zne6Y6u8s/WPgo8jybsvlGnL4QhIPlW5UkHDs7cLLQSwlkZs3dwxyHTn EgjNWClhghGXP9nlvOlnDjUkmacEYeq5ItnCQjYPl4uwh9fBJ9CD/8LV K+Tn3+dgqDBek6+2HRzjGs59NzuHX8J9wVFxP7/nd+fUgaSgz+sST80O vrXlHA==
6bflkoouitlvj011i2mau7ql5pk61sks.gov.in. 300 IN    RRSIG NSEC3 13 3 300 20250718141148 20250619135610 22437 gov.in. raWzWsQnPkXYtr2v1SRH/fk2dEAv/K85NH+06pNUwkxPxQk01nS8eYlq BPQ41b26kikg8mNOgr2ULlBpJHb1OQ==
couldn't get address for 'ns18.maharashtra.gov.in': not found
couldn't get address for 'ns20.maharashtra.gov.in': not found
;; Received 1171 bytes from 2620:171:813:1534:8::1#53(ns10.trs-dns.org) in 0 ms

;; communications error to 10.187.202.24#53: timed out
;; communications error to 10.187.202.24#53: timed out
;; communications error to 10.187.202.24#53: timed out
;; communications error to 10.187.202.28#53: timed out
;; communications error to 10.187.203.201#53: timed out
;; no servers could be reached

Quick takeaways:

  • 5 authoritative NSes are listed, i.e.:

    • ns8.maharashtra.gov.in.
    • ns9.maharashtra.gov.in.
    • ns10.maharashtra.gov.in.
    • ns18.maharashtra.gov.in.
    • ns20.maharashtra.gov.in.
  • No address (no A/AAAA records) could be found for ns18.maharashtra.gov.in and ns20.maharashtra.gov.in. Internet Archive snapshots of bgp.tools at the time of writing: NS18 and NS20.

  • “communications error to 10.187.202.24#53: timed out”, “communications error to 10.187.202.28#53: timed out” and “communications error to 10.187.203.201#53: timed out” are likely due to RFC 1918 addresses in the NS records. Of course, they will never respond on the public internet.

  • Not in the trace, but NS10 has a private or empty A/AAAA record against it (detailed further down).

  • The query resolution failed with “no servers could be reached”, i.e. we didn’t receive any A/AAAA record for that query.

It’s hit or miss for this DNS query resolution.

Looking at in zone data

Let’s look at the NS records in the zone itself (queried via 9.9.9.9):

$ dig ns maharashtra.gov.in

; <<>> DiG 9.18.33-1~deb12u2-Debian <<>> ns maharashtra.gov.in
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 172
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 3

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;maharashtra.gov.in.        IN    NS

;; ANSWER SECTION:
maharashtra.gov.in.    300    IN    NS    ns8.maharashtra.gov.in.
maharashtra.gov.in.    300    IN    NS    ns9.maharashtra.gov.in.

;; ADDITIONAL SECTION:
ns9.maharashtra.gov.in.    300    IN    A    10.187.202.24
ns8.maharashtra.gov.in.    300    IN    A    10.187.202.28

;; Query time: 180 msec
;; SERVER: 9.9.9.9#53(9.9.9.9) (UDP)
;; WHEN: Sat Jun 21 23:00:49 IST 2025
;; MSG SIZE  rcvd: 115

Pay special attention to the “ADDITIONAL SECTION”. Running dig ns9.maharashtra.gov.in and dig ns8.maharashtra.gov.in returns RFC 1918, i.e. private, addresses. This is coming from the zone itself, so the in-zone A records of NS8 and NS9 point to 10.187.202.28 and 10.187.202.24 respectively.

Cloudflare’s 1.1.1.1 has a slightly different version:

$ dig ns maharashtra.gov.in @1.1.1.1

; <<>> DiG 9.18.33-1~deb12u2-Debian <<>> ns maharashtra.gov.in @1.1.1.1
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 36005
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;maharashtra.gov.in.        IN    NS

;; ANSWER SECTION:
maharashtra.gov.in.    300    IN    NS    ns8.
maharashtra.gov.in.    300    IN    NS    ns10.maharashtra.gov.in.
maharashtra.gov.in.    300    IN    NS    ns9.

;; Query time: 7 msec
;; SERVER: 1.1.1.1#53(1.1.1.1) (UDP)
;; WHEN: Sun Jun 22 10:38:30 IST 2025
;; MSG SIZE  rcvd: 100

Interesting response here for sure :D.

The reason for the difference between the responses from 1.1.1.1 and 9.9.9.9 is covered in the next section.

Looking at parent zone

gov.in is the parent zone here. Tucows is the operator for gov.in as well as for the .in ccTLD zone:

$ dig ns gov.in +short
ns01.trs-dns.net.
ns01.trs-dns.com.
ns10.trs-dns.org.
ns10.trs-dns.info.

Let’s take a look at what the parent zone (NS) holds:

$ dig ns maharashtra.gov.in @ns01.trs-dns.net.

; <<>> DiG 9.18.36 <<>> ns maharashtra.gov.in @ns01.trs-dns.net.
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 56535
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 5, ADDITIONAL: 6
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: f13027aa39632404010000006856fa2a9c97d6bbc973ba4f (good)
;; QUESTION SECTION:
;maharashtra.gov.in.        IN    NS

;; AUTHORITY SECTION:
maharashtra.gov.in.    900    IN    NS    ns8.maharashtra.gov.in.
maharashtra.gov.in.    900    IN    NS    ns18.maharashtra.gov.in.
maharashtra.gov.in.    900    IN    NS    ns10.maharashtra.gov.in.
maharashtra.gov.in.    900    IN    NS    ns9.maharashtra.gov.in.
maharashtra.gov.in.    900    IN    NS    ns20.maharashtra.gov.in.

;; ADDITIONAL SECTION:
ns20.maharashtra.gov.in. 900    IN    A    52.183.143.210
ns18.maharashtra.gov.in. 900    IN    A    35.154.30.166
ns10.maharashtra.gov.in. 900    IN    A    164.100.128.234
ns9.maharashtra.gov.in.    900    IN    A    103.23.150.89
ns8.maharashtra.gov.in.    900    IN    A    103.23.150.88

;; Query time: 28 msec
;; SERVER: 64.96.2.1#53(ns01.trs-dns.net.) (UDP)
;; WHEN: Sun Jun 22 00:00:02 IST 2025
;; MSG SIZE  rcvd: 248

The ADDITIONAL SECTION gives a completely different picture (different from the in-zone NSes). Maybe this was how it was supposed to be, but none of the IPs listed for NS10, NS18 and NS20 respond to any DNS query.

Assuming NS8 is 103.23.150.88 and NS9 is 103.23.150.89, checking the SOA on each gives the following:

$ dig soa maharashtra.gov.in @103.23.150.88 +short
ns8.maharashtra.gov.in. postmaster.maharashtra.gov.in. 2013116777 1200 600 1296000 300

$ dig soa maharashtra.gov.in @103.23.150.89 +short
ns8.maharashtra.gov.in. postmaster.maharashtra.gov.in. 2013116757 1200 600 1296000 300

NS8 (which is marked as primary in the SOA) has serial 2013116777, while NS9 is on serial 2013116757, so it looks like the sync (IXFR/AXFR) between primary and secondary is broken. That’s why NS8 and NS9 are serving different responses, as is evident from the following:

$ dig ns8.maharashtra.gov.in @103.23.150.88 +short
103.23.150.88

$ dig ns8.maharashtra.gov.in @103.23.150.89 +short
10.187.202.28

$ dig ns9.maharashtra.gov.in @103.23.150.88 +short
103.23.150.89

$ dig ns9.maharashtra.gov.in @103.23.150.89 +short
10.187.202.24

$ dig ns maharashtra.gov.in @103.23.150.88 +short
ns9.
ns8.
ns10.maharashtra.gov.in.

$ dig ns maharashtra.gov.in @103.23.150.89 +short
ns9.maharashtra.gov.in.
ns8.maharashtra.gov.in.

$ dig ns10.maharashtra.gov.in @103.23.150.88 +short
10.187.203.201

$ dig ns10.maharashtra.gov.in @103.23.150.89 +short

# No/empty response ^

This is the reason for the difference between the 1.1.1.1 and 9.9.9.9 responses in the previous section.
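A quick way to spot this kind of drift is to compare the SOA serial from every NS address in one go. A minimal sketch, using the addresses from the parent zone’s glue:

$ for ip in 103.23.150.88 103.23.150.89 164.100.128.234 35.154.30.166 52.183.143.210; do
    echo -n "$ip: "
    dig +short +time=2 +tries=1 soa maharashtra.gov.in @$ip | awk '{print $3}'
  done

The unreachable addresses simply time out, and any serial mismatch between the reachable ones points at broken zone transfers.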

To summarize:

  • Primary and secondary NS aren’t in sync: the serials don’t match, and NS8 and NS9 respond differently to the same queries.
  • NSes have A records with private addresses, not reachable on the internet, i.e. lame servers.
  • Incomplete NS names, not even FQDNs in some cases.
  • Mismatch between the NS set delegated in the parent zone and the NS set added in the actual zone.
  • Name resolution only works in a very particular order (my initial trace failed).

Initially, I thought of citing RFCs, but I don’t really think that’s even required. 1.1.1.1, 8.8.8.8 and 9.9.9.9 are handling this problem (lame servers) well, handing out the A record for the main website, so dig maharashtra.gov.in would mostly pass. That is why I started this post with +trace: to recurse the complete zone and show the problem.

For later reference:

$ dig maharashtra.gov.in @8.8.8.8 +short
103.8.188.109

Email to SOA address

I have sent the following email to the address listed in the SOA:


Subject - maharashtra.gov.in authoritative DNS servers not reachable

Hello,

I wanted to highlight the confusing state of maharashtra.gov.in authoritative DNS servers.

The parent zone lists the following name servers for your DNS zone:

  • ns8.maharashtra.gov.in.
  • ns18.maharashtra.gov.in.
  • ns10.maharashtra.gov.in.
  • ns9.maharashtra.gov.in.
  • ns20.maharashtra.gov.in.

Out of these, ns18 and ns20 don’t have public A/AAAA records and are thus not reachable. ns10 keeps shuffling between no A record and 10.187.203.201 (a private, unreachable address). ns8 keeps shuffling between 103.23.150.88 and 10.187.202.28 (a private, unreachable address). ns9 keeps shuffling between 103.23.150.89 and 10.187.202.24 (a private, unreachable address).

These are leading to long, broken, or no DNS resolution for the website(s). Can you take a look at the problem?

Regards, Sahil


I’ll update here if I get a response. Hopefully, they’ll listen and fix their problem.

22 June, 2025 10:01AM

June 21, 2025

Ravi Dwivedi

Getting Brunei visa

In December 2024, my friend Badri and I were planning a trip to Southeast Asia. At this point, we were planning to visit Singapore, Malaysia and Vietnam. My Singapore visa had already been approved, and Malaysia was visa-free for us. For Vietnam, we had to apply for an e-visa online.

We considered adding Brunei to our itinerary. I saw some videos of the Brunei visa process and got the impression that we needed to go to the Brunei embassy in Kuching, Malaysia in person.

However, when I happened to search for Brunei on Organic Maps[1], I stumbled upon the Brunei Embassy in Delhi. It seemed to be somewhere in Hauz Khas. As I was going to Delhi to collect my Singapore visa the next day, I figured I’d also visit the Brunei Embassy to get information about the visa process.

The next day I went to the location displayed by Organic Maps. It was next to the embassy of Madagascar, and a sign on the road divider confirmed that I was at the right place.

That said, it actually looked like someone’s apartment. I entered and asked for directions to the Brunei embassy, but the people inside did not seem to understand my query. After some back and forth, I realized that the embassy wasn’t there.

I now searched for the Brunei embassy on the Internet, and this time I got an address in Vasant Vihar. It seemed like the embassy had been moved from Hauz Khas to Vasant Vihar. Going by the timings mentioned on the web page, the embassy was closing in an hour.

I took a Metro from Hauz Khas to Vasant Vihar. After deboarding at the Vasant Vihar metro station, I took an auto to reach the embassy. The address listed on the webpage got me into the correct block. However, the embassy was still nowhere to be seen. I asked around, but security guards in that area pointed me to the Burundi embassy instead.

After some more looking around, I did end up finding the embassy. I spoke to the security guards at the gate and told them that I would like to know the visa process. They dialled a number and asked that person to tell me the visa process.

I spoke to a lady on the phone. She listed the documents required for the visa process and mentioned that the timings for visa applications were from 9 o’clock to 11 o’clock in the morning. She also informed me that the visa fee was ₹1000.

I also asked about the process for Badri, who lives far away in Tamil Nadu and cannot report to the embassy physically. She told me that I could submit a visa application on his behalf, along with an authorization letter.

Finding the embassy in Delhi was a huge relief. The other plan - going to Kuching, Malaysia - was a bit uncertain, and we didn’t know how much time it would take. Getting our passports submitted at an embassy in a foreign country was also not ideal.

A few days later, Badri sent me all the documents required for his visa. I went to the embassy and submitted both applications. The lady who collected our visa submissions asked me for our flight reservations from Delhi to Brunei, whereas ours were (in keeping with our itinerary) from Kuala Lumpur. She said that she might contact me later if required.

For reference, here is the list of documents we submitted -

  • Visa application form
  • Passport
  • A photocopy of passport
  • Authorization letter from Badri (authorizing me to submit his application on his behalf)
  • Airline ticket itinerary
  • Hotel bookings
  • Cover letter
  • 2 photos
  • Proof of employment
  • 6 months bank statement (they specifically asked for ₹1,00,000 or more in bank balance)

I then asked about the procedure to collect the passports and visa results. Usually, embassies will tell you that they will contact you when they have decided on your applications. However, here I was informed that if they didn’t contact me within 5 days, I could come and collect our passports and visa results between 13:30 and 14:30 on the fifth day. That was strange :)

I did visit the embassy to collect our visa results on the fifth day. However, the lady scolded me for not bringing the receipt she gave me. I was afraid that I might have to go all the way back home and bring the receipt to get our passports. The travel date was close, and it would take some time for Badri to receive his passport via courier as well.

Fortunately, she gave me our passports (with the visas attached) and asked me to share a scanned copy of the receipt via email after I got home.

We were elated that our visas were approved. Now we could focus on booking our flights.

If you are going to Brunei, remember to fill out their arrival card on the website within 48 hours of your arrival!

Thanks to Badri and Contrapunctus for reviewing the draft before publishing the article.


  1. Nowadays, I prefer using Comaps instead of Organic Maps and recommend you do the same. Organic Maps had some issues with its governance, and the community’s concerns weren’t being addressed. ↩︎

21 June, 2025 08:00AM

June 20, 2025

Sven Hoexter

Terraform: Validation Condition Cycles

Some time ago, Terraform 1.9 introduced the capability to reference other variables in an input variable's validation condition, not only the variable you're validating.

What does not work is having two variables that validate each other, e.g.:

variable "nat_min_ports" {
  description = "Minimal amount of ports to allocate for 'min_ports_per_vm'"
  default     = 32
  type        = number
  validation {
    condition = (
      var.nat_min_ports >= 32 &&
      var.nat_min_ports <= 32768 &&
      var.nat_min_ports < var.nat_max_ports
    )
    error_message = "Must be between 32 and 32768 and less than 'nat_max_ports'"
  }
}

variable "nat_max_ports" {
  description = "Maximal amount of ports to allocate for 'max_ports_per_vm'"
  default     = 16384
  type        = number
  validation {
    condition = (
      var.nat_max_ports >= 64 &&
      var.nat_max_ports <= 65536 &&
      var.nat_max_ports > var.nat_min_ports
    )
    error_message = "Must be between 64 and 65536 and above 'nat_min_ports'"
  }
}

That led directly to the following rather opaque error message: Received an error Error: Cycle: module.gcp_project_network.var.nat_max_ports (validation), module.gcp_project_network.var.nat_min_ports (validation)

I removed the somewhat redundant check var.nat_max_ports > var.nat_min_ports on nat_max_ports to break the cycle, as sketched below.
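For illustration, the corrected nat_max_ports block would then look roughly like this (a sketch; only the cross-variable comparison is dropped):

variable "nat_max_ports" {
  description = "Maximal amount of ports to allocate for 'max_ports_per_vm'"
  default     = 16384
  type        = number
  validation {
    # The cross-variable check now lives only in nat_min_ports,
    # so the validation dependency graph stays acyclic.
    condition = (
      var.nat_max_ports >= 64 &&
      var.nat_max_ports <= 65536
    )
    error_message = "Must be between 64 and 65536"
  }
}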

20 June, 2025 09:34AM

Matthew Garrett

My a11y journey

23 years ago I was in a bad place. I'd quit my first attempt at a PhD for various reasons that were, with hindsight, bad, and I was suddenly entirely aimless. I lucked into picking up a sysadmin role back at TCM where I'd spent a summer a year before, but that's not really what I wanted in my life. And then Hanna mentioned that her PhD supervisor was looking for someone familiar with Linux to work on making Dasher, one of the group's research projects, more usable on Linux. I jumped.

The timing was fortuitous. Sun were pumping money and developer effort into accessibility support, and the Inference Group had just received a grant from the Gatsby Foundation that involved working with the ACE Centre to provide additional accessibility support. And I was suddenly hacking on code that was largely ignored by most developers, supporting use cases that were irrelevant to most developers. Being in a relatively green field space sounds refreshing, until you realise that you're catering to actual humans who are potentially going to rely on your software to be able to communicate. That's somewhat focusing.

This was, uh, something of an on-the-job learning experience. I had to catch up with a lot of new technologies very quickly, but that wasn't the hard bit - what was difficult was realising I had to cater to people who were dealing with use cases that I had no experience of whatsoever. Dasher was extended to allow text entry into applications without needing to cut and paste. We added support for introspection of the current application's UI so menus could be exposed via the Dasher interface, allowing people to fly through menu hierarchies and pop open file dialogs. Text-to-speech was incorporated so people could rapidly enter sentences and have them spoken out loud.

But what sticks with me isn't the tech, or even the opportunities it gave me to meet other people working on the Linux desktop and forge friendships that still exist. It was the cases where I had the opportunity to work with people who could use Dasher as a tool to increase their ability to communicate with the outside world, whose lives were transformed for the better because of what we'd produced. Watching someone use your code and realising that you could write a three line patch that had a significant impact on the speed they could talk to other people is an incomparable experience. It's been decades and in many ways that was the most impact I've ever had as a developer.

I left after a year to work on fruitflies and get my PhD, and my career since then hasn't involved a lot of accessibility work. But it's stuck with me - every improvement in that space is something that has a direct impact on the quality of life of more people than you expect, but is also something that goes almost unrecognised. The people working on accessibility are heroes. They're making all the technology everyone else produces available to people who would otherwise be blocked from it. They deserve recognition, and they deserve a lot more support than they have.

But when we deal with technology, we deal with transitions. A lot of the Linux accessibility support depended on X11 behaviour that is now widely regarded as a set of misfeatures. It's not actually good to be able to inject arbitrary input into an arbitrary window, and it's not good to be able to arbitrarily scrape out its contents. X11 never had a model to permit this for accessibility tooling while blocking it for other code. Wayland does, but suffers from the surrounding infrastructure not being well developed yet. We're seeing that happen now, though - Gnome has been performing a great deal of work in this respect, and KDE is picking that up as well. There isn't a full correspondence between X11-based Linux accessibility support and Wayland, but for many users the Wayland accessibility infrastructure is already better than with X11.

That's going to continue improving, and it'll improve faster with broader support. We've somehow ended up with the bizarre politicisation of Wayland as being some sort of woke thing while X11 represents the Roman Empire or some such bullshit, but the reality is that there is no story for improving accessibility support under X11 and sticking to X11 is going to end up reducing the accessibility of a platform.

When you read anything about Linux accessibility, ask yourself whether you're reading something written by either a user of the accessibility features, or a developer of them. If they're neither, ask yourself why they actually care and what they're doing to make the future better.


20 June, 2025 08:48AM

Reproducible Builds (diffoscope)

diffoscope 299 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 299. This version includes the following changes:

[ Chris Lamb ]
* Add python3-defusedxml to the Build-Depends in order to include it in the
  Docker image. (Closes: #407)

You can find out more by visiting the project homepage.

20 June, 2025 12:00AM

June 19, 2025

Jonathan Carter

My first tag2upload upload

Tag2upload?

The tag2upload service has finally gone live for Debian Developers in an open beta.

If you’ve never heard of tag2upload before, here is a great primer presented by Ian Jackson and prepared by Ian Jackson and Sean Whitton.

In short, the world has moved on to hosting and working with source code in Git repositories. In Debian, we work with source packages that are used to generate the binary artifacts that users know as .deb files. In Debian, there is so much tooling and culture built around this. For example, our workflow passes what we call the island test – you could take every source package in Debian along with you to an island with no Internet, and you’ll still be able to rebuild or modify every package. When changing the workflows, you risk losing benefits like this, and over the years there have been a number of different ideas on how to move to a purely or partially git-based flow for Debian, none of which really managed to gain enough momentum or project-wide support.

Tag2upload makes a lot of sense. It doesn’t take away any of the benefits of the current way of working (whether technical or social), but it does make some aspects of Debian packaging significantly simpler and faster. Even so, if you’re a Debian Developer and more familiar with how the sausage is made, you’ll have noticed that this has been a very long road for the tag2upload maintainers; they’ve hit multiple speed bumps since 2019, but with a lot of patience, communication and persistence from all involved (and almost even a GR), it is finally materializing.

Performing my first tag2upload

So, first, I needed to choose which package I want to upload. We’re currently in hard freeze for the trixie release, so I’ll look for something simple that I can upload to experimental.

I chose bundlewrap; it’s quite a straightforward Python package, and updates are usually just as straightforward, so it’s probably a good package to work on without having to deal with extra complexities in learning how to use tag2upload.

So, I do the usual uscan and dch -i to update my package…

And then I realise that I still want to build a source package to test it in cowbuilder. Hmm, I remember that Helmut showed me that building a source package isn’t necessary with sbuild, but I have a habit of somehow breaking my sbuild configs, so I guess I should revisit that.

So, I do a dpkg-buildpackage -S -sa and test it out with cowbuilder, because that’s just how I roll (at least for now, fixing my local sbuild setup is yak shaving for another day, let’s focus!).

I end up with a binary that looks good, so I’m satisfied that I can upload this package to the Debian archives. So, time to configure tag2upload.

The first step is to set up the webhook in Salsa. I was surprised to find two webhooks already configured:

I know of KGB that posts to IRC; I didn’t know before that this was the mechanism it uses to do that. Nice! I also don’t know what the tagpending one does; I’ll go look into that some other time.

Configuring a tag2upload webhook is quite simple: add a URL, call the name tag2upload, and select only tag push events:

I run the test webhook, and it returned a code 400 message about a missing ‘message’ header, which the documentation says is normal.

Next, I install git-debpush from experimental.
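(That should just be a one-liner, assuming experimental is enabled in your APT sources:)

$ sudo apt install -t experimental git-debpush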

The wiki page simply states that you can use the git-debpush command to upload, but doesn’t give any examples of how to use it, and its manpage doesn’t either. And when I run just git-debpush I get:

jonathan@lapcloud:~/devel/debian/python-team/bundlewrap/bundlewrap-4.23.1$ git-debpush
git-debpush: check failed: upstream tag upstream/4.22.0 is not an ancestor of refs/heads/debian/master; probably a mistake ('upstream-nonancestor' check)
pristine-tar is /usr/bin/pristine-tar
git-debpush: some check(s) failed; you can pass --force to ignore them

I have no idea what that’s supposed to mean. I was also not sure whether I should tag anything to begin with, or whether some part of the tag2upload machinery does it automatically. I think I might have tagged debian/4.23-1 before tagging upstream/4.23, and perhaps it didn’t like that; I reverted and did it the other way around and got a new error message. Progress!

jonathan@lapcloud:~/devel/debian/python-team/bundlewrap/bundlewrap-4.23.1$ git-debpush
git-debpush: could not determine the git branch layout
git-debpush: please supply a --quilt= argument

Looking at the manpage, it looks like --quilt=baredebian matches my package best, so I try that:

jonathan@lapcloud:~/devel/debian/python-team/bundlewrap/bundlewrap-4.23.1$ git-debpush --quilt=baredebian
Enumerating objects: 70, done.
Counting objects: 100% (70/70), done.
Delta compression using up to 12 threads
Compressing objects: 100% (37/37), done.
Writing objects: 100% (37/37), 8.97 KiB | 2.99 MiB/s, done.
Total 37 (delta 30), reused 0 (delta 0), pack-reused 0 (from 0)
To salsa.debian.org:python-team/packages/bundlewrap.git
6f55d99..3d5498f debian/master -> debian/master

 * [new tag] upstream/4.23.1 -> upstream/4.23.1
 * [new tag] debian/4.23.1-1_exp1 -> debian/4.23.1-1_exp1

Ooh! That looked like it did something! And a minute later I received the notification of the upload in my inbox:

So, I’m not 100% sure that this makes things much easier for me than doing a dput, but it’s not any more difficult or more work either (once you know how it works), so I’ll be using git-debpush from now on, and I’m sure that as I get more used to the git workflow I’ll understand more of the benefits. And at last, my one last use case for using FTP is now properly dead. RIP FTP :)

19 June, 2025 07:49PM by jonathan

Debian Outreach Team

GSoC 2025 Introduction: Make Debian for Raspberry Pi Build Again

Hello everyone! I am Kurva Prashanth, interested in the lower-level workings of system software, CPUs/SoCs and hardware design. I was introduced to Open Hardware and Embedded Linux while studying electronics and embedded systems as part of robotics coursework. Initially, I did not pay much attention to it and quickly moved on. However, a short talk on “Liberating SBCs using Debian” by Yuvraj at MiniDebConf India, 2021 caught my interest. The talk focused on Open Hardware platforms such as Olimex and BeagleBone Black, as well as the Debian distributions tailored for these ARM-based single-board computers, and it intrigued me to delve deeper into the realm of Open Hardware and Embedded Linux.

These days I’m trying to improve my abilities to contribute to Debian and Linux kernel development. Before finding out about the Google Summer of Code project, I had already started my journey with Debian. I extensively used Debian system build tools (debootstrap, sbuild, dpkg-buildpackage, qemu-debootstrap) for building a Debian image for Bela Cape, a real-time OS for music making, to achieve extremely fast audio and sensor processing times. In 2023, I had the opportunity to attend DebConf23 in Kochi, India – thanks to Nilesh Patra (@nilesh) – and I met Hector Oron (@zumbi) over dinner at DebConf23. It was nice talking about his work at Debian on the armhf port and Debian system administration; that conversation got me interested in knowing more about Debian ARM and the installer. I found it fascinating that EmDebian was once an external project bringing Debian to embedded systems, while now Debian itself can run on many embedded systems. Also, during DebCamp I was introduced to PGP/GPG keys and the web of trust by Carlos Henrique Lima Melara (@charles), and I learned how to use and generate GPG keys. After DebConf23 I tried Debian packaging and miserably failed to get sponsorship for a Python library I packaged.

I came across the Debian project for this year’s Google Summer of Code and found the project titled Make Debian for Raspberry Pi Build Again quite interesting, so I applied. Gladly, on May 8th, I received an acceptance e-mail from GSoC. I was excited that I’d spend the summer working on something that I like doing.

I am thrilled to be part of this project and I am super excited for the summer of ’25. I’m looking forward to working on what I like most, to new connections, and to learning opportunities.

So, let me talk a bit more about my project. I will be working on making Debian build again for Raspberry Pi SBCs under the guidance of Gunnar Wolf (@gwolf). In this post, I will describe the project I will be working on.

Why make Debian for Raspberry Pi build again?

There is an available set of images for running Debian on Raspberry Pi computers (all models below the 5 series)! However, the maintainer severely lacks the time to take care of them; he called for somebody to adopt them, but without success. The image generation scripts might have bitrotted a bit, but it is mostly all done. And there is still a lot of interest and use in having the images freshly generated and decently tested! This GSoC project is about getting the Raspberry Pi Debian images site (https://raspi.debian.net/) working reliably, making daily-built images automatic again, ideally making it easily deployable to run on project machines, and migrating the existing hosting infrastructure to Debian.

How much does it differ from the Debian build process?

While the goal is to stay as close as possible to the Debian build process, Raspberry Pi boards require some platform-specific changes, primarily in the early boot sequence and firmware handling. Unlike typical Debian systems, Raspberry Pi boards depend on a non-standard bootloader and use non-free firmware (raspi-firmware), introducing some hardware-specific differences in the initialization process.

These differences are largely confined to the early boot and hardware initialization stages. Once the system boots, the userspace remains closely aligned with a typical Debian install, using Debian packages.

The current modifications are required due to the non-free firmware. However, several areas merit review, and there are a few parts that might be worth changing.

  1. Boot flow: Transitioning to a U-Boot based boot process (as used in Debian installer images for many other SBCs) would reduce divergence and better align with Debian Installer.

  2. Current scripts/workarounds: Some existing hacks may now be redundant with recent upstream support and could be removed.

  3. Board-specific images: Shifting to architecture-specific base images with runtime detection could simplify builds and reduce duplication.

Debian is already building SD card images for a wide range of SBCs (e.g., BeagleBone, BananaPi, OLinuXino, Cubieboard, etc.) via installer-arm64/images/u-boot and installer-armhf/images/u-boot; a similar approach for Raspberry Pi could improve maintainability and consistency with Debian’s broader SBC support.

Quoted from Mail Discussion Thread with Mentor (Gunnar Wolf)

"One direction we wanted to explore was whether we should still be building one image per family, or whether we could instead switch to one image per architecture (armel, armhf, arm64). There were some details to iron out as RPi3 and RPi4 were quite different, but I think it will be similar to the differences between the RPi 0 and 1, which are handled at first-boot time. To understand what differs between families, take a look at Cyril Brulebois’ generate-recipe (in the repo), which is a great improvement over the ugly mess I had before he contributed it"

In this project, I intend to build one image per architecture (armel, armhf, arm64) rather than continuing with the current model of building one image per board family. This change simplifies image management, reduces redundancy, and leverages dynamic configuration at boot time to support all boards within each architecture. By using U-Boot and flash-kernel, we can detect the board type and configure kernel parameters, DTBs, and firmware during the first boot, reducing duplication across images and simplifying the maintenance burden, while still supporting board-specific behavior at runtime. This method follows existing practices in the DebianInstaller team, aligns with Debian’s long-term maintainability goals, and better leverages upstream capabilities, ensuring a consistent and scalable boot experience.

To streamline and standardize the process of building bootable Debian images for Raspberry Pi devices, I proposed a new workflow that leverages the U-Boot and flash-kernel Debian packages. This provides a clean, maintainable, and reproducible way to generate images for armel, armhf and arm64 boards. The workflow is built around vmdb2, a lightweight, declarative tool designed to automate the creation of disk images. A typical vmdb2 recipe defines the disk layout, base system installation (via debootstrap), architecture-specific packages, and any custom post-install hooks; the image includes U-Boot (the u-boot-rpi package), flash-kernel, and a suitable Debian kernel package like linux-image-arm64 or linux-image-armmp.

U-Boot serves as the platform’s bootloader and is responsible for loading the kernel and initramfs. Unlike Raspberry Pi’s non-free firmware/proprietary bootloader, U-Boot provides an open and scriptable interface, allowing us to follow a more standard Debian boot process. It can be configured to boot using either an extlinux.conf or a boot.scr script generated automatically by flash-kernel. The role of flash-kernel is to bridge Debian’s kernel installation system with the specifics of embedded bootloaders like U-Boot. When installed, it automatically copies the kernel image, initrd, and device tree blobs (DTBs) to the /boot partition. It also generates the necessary boot.scr script if the board configuration demands it. To work correctly, flash-kernel requires that the target machine be identified via /etc/flash-kernel/machine, which must correspond to an entry in its internal machine database.

Once the vmdb2 build is complete, the resulting image will contain a fully configured bootable system with all necessary boot components correctly installed. The image can be flashed to an SD card and used to boot on the intended device without additional manual configuration. Because all key packages (U-Boot, kernel, flash-kernel) are managed through Debian’s package system, kernel updates and boot script regeneration are handled automatically during system upgrades.

Current Workflow: Builds one Image per family

The current vmdb2 recipe uses the Raspberry Pi GPU bootloader provided via the raspi-firmware package. This is the traditional boot process followed by Raspberry Pi OS, and it’s tightly coupled with firmware files like bootcode.bin, start.elf, and fixup.dat. These files are installed to /boot/firmware, which is mounted from a FAT32 partition labeled RASPIFIRM. The device tree files (*.dtb) are manually copied from /usr/lib/linux-image-*-arm64/broadcom/ into this partition.

The kernel is installed via the linux-image-arm64 package, and the boot arguments are injected by modifying /boot/firmware/cmdline.txt using sed commands. Booting depends on the root partition being labeled RASPIROOT, referenced through that file. There is no intermediate bootloader like UEFI or U-Boot involved; the Raspberry Pi firmware directly loads the kernel, which is standard for Raspberry Pi boards.

- apt: install
  packages:
    ...
    - raspi-firmware  

The boot partition contents and kernel boot setup are tightly controlled via scripting in the recipe.

Limitations of Current Workflow: While this setup works, it has several drawbacks:

  1. Proprietary and Raspberry Pi–specific – It relies on the closed-source GPU bootloader shipped in the raspi-firmware package, which is tightly coupled to specific Raspberry Pi models.

  2. Manual DTB handling – Device tree files are manually copied and hardcoded, making upgrades or board-specific changes error-prone.

  3. Not easily extendable to future Raspberry Pi boards – Any change in bootloader behavior (as seen in the Raspberry Pi 5, which introduces a more flexible firmware boot process) would require significant rework.

  4. No UEFI or U-Boot layer – The current method bypasses standard bootloader layers, making it inconsistent with other Debian ARM platforms and harder to maintain long-term.

As Raspberry Pi firmware and boot processes evolve, especially with the introduction of Pi 5 and potentially Pi 6, maintaining compatibility will require more flexibility - something best delivered by adopting U-Boot and flash-kernel.

New Workflow: Building Architecture-Specific Images with vmdb2, U-Boot, flash-kernel, and Debian Kernel

This workflow outlines an improved approach to generating architecture-specific bootable Debian images, using vmdb2, U-Boot, flash-kernel, and Debian kernels, and moving away from Raspberry Pi’s proprietary bootloader to a fully open-source boot process, which improves maintainability, consistency, and cross-board support.

New Method: Shift to U-Boot + flash-kernel

U-Boot (via Debian’s u-boot-rpi package) and flash-kernel bring the image building process closer to how Debian officially boots ARM devices. flash-kernel integrates with the system’s initramfs and kernel packages to install bootloaders, prepare boot.scr or extlinux.conf, and copy kernel/initrd/DTBs to /boot in a format that U-Boot expects. U-Boot will be used as a second-stage bootloader, loaded by the Raspberry Pi’s built-in firmware. Once U-Boot is in place, it will read standard boot scripts (boot.scr) generated by flash-kernel, providing a Debian-compatible and board-flexible solution.

Extending YAML spec for vmdb2 build with U-Boot and flash-kernel

I will extend an existing vmdb2 YAML spec (https://salsa.debian.org/raspi-team/image-specs/raspi_master.yaml) to integrate U-Boot, flash-kernel, and the architecture-specific Debian kernel into the image build process. By incorporating u-boot-rpi and flash-kernel from Debian packages, alongside the standard initramfs-tools, we align the image closer to Debian best practices while supporting both armhf and arm64 architectures.

Below are the key additions and adjustments needed in a vmdb2 YAML spec to support the workflow: install U-Boot, flash-kernel, initramfs-tools and the architecture-specific Debian kernel.

- apt: install
  packages:
    - u-boot-rpi
    - flash-kernel
    - initramfs-tools
    - linux-image-arm64 # or linux-image-armmp for armhf 
  tag: tag-root

Replace linux-image-arm64 with the correct kernel package for the specific target architecture. These packages should be added under the tag-root section in the YAML spec for the vmdb2 build recipe. This ensures that the necessary bootloader, kernel, and initramfs tools are included and properly configured in the image.

Configure Raspberry Pi firmware to Load U-Boot

Install the U-Boot binary as kernel.img in /boot/firmware. We could also download and build U-Boot from source, but Debian provides tested binaries.

- shell: |
    cp /usr/lib/u-boot/rpi_4/u-boot.bin ${ROOT?}/boot/firmware/kernel.img
    echo "enable_uart=1" >> ${ROOT?}/boot/firmware/config.txt
  root-fs: tag-root

This makes the RPi firmware load u-boot.bin instead of the Linux kernel directly.

Set Up flash-kernel for Debian-style Boot

flash-kernel integrates with initramfs-tools and writes a boot config suitable for U-Boot. We need to make sure the flash-kernel database (/etc/flash-kernel/db, or the shipped /usr/share/flash-kernel/db/all.db) contains an entry for the board (most Raspberry Pi boards are already supported in Bookworm).

Set up /etc/flash-kernel.conf with:

- create-file: /etc/flash-kernel.conf
  contents: |
    MACHINE="Raspberry Pi 4"
    BOOTPART="/dev/disk/by-label/RASPIFIRM"
    ROOTPART="/dev/disk/by-label/RASPIROOT"
  unless: rootfs_unpacked

This allows flash-kernel to write an extlinux.conf or boot.scr into /boot/firmware.
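As a sanity check, one can verify the machine entry and regenerate the boot files by hand. A sketch, assuming the Bookworm flash-kernel database already carries the board:

$ cat /proc/device-tree/model                        # what the running board reports
$ grep -n "Raspberry Pi 4" /usr/share/flash-kernel/db/all.db
$ sudo flash-kernel                                  # re-copies kernel/initrd/DTBs and regenerates the boot script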

Clean up Proprietary/Non-Free Firmware Bootflow

Remove the direct kernel loading flow:

- shell: |
    rm -f ${ROOT?}/boot/firmware/vmlinuz*
    rm -f ${ROOT?}/boot/firmware/initrd.img*
    rm -f ${ROOT?}/boot/firmware/cmdline.txt
  root-fs: tag-root

Let U-Boot and flash-kernel manage kernel/initrd and boot parameters instead.

Boot Flow After This Change

[SoC ROM] -> [start.elf] -> [U-Boot] -> [boot.scr] -> [Linux Kernel]

  1. This still depends on the Raspberry Pi firmware to start, but it only loads U-Boot, not the Linux kernel.

  2. U-Boot gives you more flexibility (e.g., networking, boot menus, signed boot).

  3. Using flash-kernel ensures kernel updates are handled the Debian Installer way.

  4. Test with a serial console (enable_uart=1) in case HDMI doesn’t show early boot logs; see the example below.
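For watching that early output, any terminal emulator attached to a USB-UART adapter will do, for example (the device name will vary):

$ screen /dev/ttyUSB0 115200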

Advantages of the New Workflow

  1. Replaces the proprietary Raspberry Pi bootloader with upstream U-Boot.

  2. Debian-native tooling – Uses flash-kernel and initramfs-tools to manage boot configuration.

  3. Consistent across boards – Works for both armhf and arm64, unifying the image build process.

  4. Easier to support new boards – Like the Raspberry Pi 5 and future models.

This transition will standardize the image-building process a bit, aligning it with upstream Debian Installer workflows.

vmdb2 configuration for arm64 using u-boot and flash-kernel

NOTE: This is a baseline example and may require tuning.

# Raspberry Pi arm64 image using U-Boot and flash-kernel

steps:
  # ... (existing mkimg, partitions, mount, debootstrap, etc.) ...

  # Install U-Boot, flash-kernel, initramfs-tools and architecture specific kernel
  - apt: install
    packages:
      - u-boot-rpi
      - flash-kernel
      - initramfs-tools
      - linux-image-arm64 # or linux-image-armmp for armhf
    tag: tag-root

  # Install U-Boot binary as kernel.img in firmware partition
  - shell: |
      cp /usr/lib/u-boot/rpi_arm64/u-boot.bin ${ROOT?}/boot/firmware/kernel.img
      echo "enable_uart=1" >> ${ROOT?}/boot/firmware/config.txt
    root-fs: tag-root

  # Configure flash-kernel for Raspberry Pi
  - create-file: /etc/flash-kernel.conf
    contents: |
      MACHINE="Generic Raspberry Pi ARM64"
      BOOTPART="/dev/disk/by-label/RASPIFIRM"
      ROOTPART="/dev/disk/by-label/RASPIROOT"
    unless: rootfs_unpacked

  # Remove direct kernel boot files from Raspberry Pi firmware
  - shell: |
      rm -f ${ROOT?}/boot/firmware/vmlinuz*
      rm -f ${ROOT?}/boot/firmware/initrd.img*
      rm -f ${ROOT?}/boot/firmware/cmdline.txt
    root-fs: tag-root

  # flash-kernel will manage boot scripts and extlinux.conf
  # Rest of image build continues...

Required Changes to Support Raspberry Pi Boards in Debian (flash-kernel + U-Boot)

Overview of Required Changes

| Component | Required Task |
| ------------------------- | ------------------- |
| Debian U-Boot Package | Add a build target for rpi_arm64 in u-boot-rpi. Optionally deprecate legacy 32-bit targets. |
| Debian flash-kernel Package | Add or verify entries in db/all.db for Pi 4, Pi 5, Zero 2W, CM4. Ensure boot script generation works via bootscr.uboot-generic. |
| Debian Kernel | Ensure DTBs are installed at /usr/lib/linux-image-<version>/ and available for flash-kernel to reference. |

flash-kernel

Already Supported Boards in flash-kernel Debian Package

https://sources.debian.org/src/flash-kernel/3.109/db/all.db/#L1700

| Model | Arch | DTB-Id |
| ------------------------- | ------- | ------------------------ |
| Raspberry Pi 1 A/B/B+, Rev2 | armel | bcm2835-* |
| Raspberry Pi CM1 | armel | bcm2835-rpi-cm1-io1.dtb |
| Raspberry Pi Zero/Zero W | armel | bcm2835-rpi-zero*.dtb |
| Raspberry Pi 2B | armhf | bcm2836-rpi-2-b.dtb |
| Raspberry Pi 3B/3B+ | arm64 | bcm2837-* |
| Raspberry Pi CM3 | arm64 | bcm2837-rpi-cm3-io3.dtb |
| Raspberry Pi 400 | arm64 | bcm2711-rpi-400.dtb |

uboot

Already Supported Boards in Debian U-Boot Package

https://salsa.debian.org/installer-team/flash-kernel/-/blob/master/db/all.db

arm64

| Model | Arch | Upstream Defconfig | Debian Target |
| ------------------------- | ------- | ------------------------ | ------------------- |
| Raspberry Pi 3B | arm64 | rpi_3_defconfig | rpi_3 |
| Raspberry Pi 4B | arm64 | rpi_4_defconfig | rpi_4 |
| Raspberry Pi 3B/3B+/CM3/CM3+/4B/CM4/400/5B/Zero 2W | arm64 | rpi_arm64_defconfig | rpi_arm64 |

armhf

| Model | Arch | Upstream Defconfig | Debian Target |
| ------------------------- | ------- | ------------------------ | ------------------- |
| Raspberry Pi 2 | armhf | rpi_2_defconfig | rpi_2 |
| Raspberry Pi 3B (32-bit) | armhf | rpi_3_32b_defconfig | rpi_3_32b |
| Raspberry Pi 4B (32-bit) | armhf | rpi_4_32b_defconfig | rpi_4_32b |

armel

| Model | Arch | Upstream Defconfig | Debian Target |
| ------------------------- | ------- | ------------------------ | ------------------- |
| Raspberry Pi | armel | rpi_defconfig | rpi |
| Raspberry Pi 1/Zero | armel | rpi_0_w | rpi_0_w |

These boards are already defined in debian/rules of the u-boot-rpi source package, which generates usable U-Boot binaries for the corresponding Raspberry Pi models.

To-Do: Add Missing Board Support to U-Boot and flash-kernel in Debian

Several Raspberry Pi models are missing from the Debian U-Boot and flash-kernel packages: even though upstream support and DTBs exist in the Debian kernel, the flash-kernel database lacks the entries needed to enable bootloader installation and initrd handling.

Boards Not Yet Supported in flash-kernel Debian Package

| Model | Arch | DTB-Id |
| ------------------------- | ------- | ------------------------ |
| Raspberry Pi 3A+ (32 & 64 bit) | armhf, arm64 | bcm2837-rpi-3-a-plus.dtb |
| Raspberry Pi 4B (32 & 64 bit) | armhf, arm64 | bcm2711-rpi-4-b.dtb |
| Raspberry Pi CM4 | arm64 | bcm2711-rpi-cm4-io.dtb |
| Raspberry Pi CM 4S | arm64 | - |
| Raspberry Zero 2 W | arm64 | bcm2710-rpi-zero-2-w.dtb |
| Raspberry Pi 5 | arm64 | bcm2712-rpi-5-b.dtb |
| Raspberry Pi CM5 | arm64 | - |
| Raspberry Pi 500 | arm64 | - |

Boards Not Yet Supported in Debian U-Boot Package

| Model | Arch | Upstream defconfig(s) |
| ------------------------- | ------- | ------------------------ |
| Raspberry Pi 3A+/3B+ | arm64 | -, rpi_3_b_plus_defconfig |
| Raspberry Pi CM 4S | arm64 | - |
| Raspberry Pi 5 | arm64 | - |
| Raspberry Pi CM5 | arm64 | - |
| Raspberry Pi 500 | arm64 | - |

So, what next?

During the Community Bonding Period, I got hands-on with workflow improvements, set up test environments, and began reviewing Raspberry Pi support in Debian’s U-Boot and flash-kernel packages. These are the logs of the project, where I provide weekly reports on the work done. You can check them here: Community Bonding Period logs.

My next steps include submitting patches to the u-boot and flash-kernel packages to ensure all missing Raspberry Pi entries are built and shipped, confirming the kernel DTB installation paths, and making sure the necessary files are included for all Raspberry Pi variants. Finally, I plan to validate the changes with test builds on Raspberry Pi hardware.

In parallel, I’m organizing my tasks and setting up my environment to contribute more effectively. It’s been exciting to explore how things work under the hood and to prepare for a summer of learning and contributing to this great community.

19 June, 2025 03:53AM by Kurva Prashanth

June 18, 2025

Sergio Durigan Junior

GCC, glibc, stack unwinding and relocations – A war story

I’ve been meaning to write a post about this bug for a while, so here it is (before I forget the details!).

First, I’d like to thank a few people:

  • My friend Gabriel F. T. Gomes, who helped with debugging and simply talking about the issue. I love doing some pair debugging, and I noticed that he also had a great time diving into the internals of glibc and libgcc.
  • My teammate Dann Frazier, who always provides invaluable insights and was there to motivate me to push a bit further in order to figure out what was going on.
  • The upstream GCC and glibc developers who finally drove the investigation to completion and came up with an elegant fix.

I’ll probably forget some details because it’s been more than a week (and life at $DAYJOB moves fast), but we’ll see.

The background story

Wolfi OS takes security seriously, and one of the things we have is a package which sets the hardening compiler flags for C/C++ according to the best practices recommended by OpenSSF. At the time of this writing, these flags are (in GCC’s spec file parlance):

*self_spec:
+ %{!O:%{!O1:%{!O2:%{!O3:%{!O0:%{!Os:%{!0fast:%{!0g:%{!0z:-O2}}}}}}}}} -fhardened -Wno-error=hardened -Wno-hardened %{!fdelete-null-pointer-checks:-fno-delete-null-pointer-checks} -fno-strict-overflow -fno-strict-aliasing %{!fomit-frame-pointer:-fno-omit-frame-pointer} -mno-omit-leaf-frame-pointer

*link:
+ --as-needed -O1 --sort-common -z noexecstack -z relro -z now

The important part for our bug is the usage of -z now and -fno-strict-aliasing.

As I was saying, these flags are set for almost every build, but sometimes things don’t work as they should and we need to disable them. Unfortunately, one of these problematic cases has been glibc.

There was an attempt to enable hardening while building glibc, but that introduced a strange breakage to several of our packages and had to be reverted.

Things stayed pretty much the same until a few weeks ago, when I started working on one of my roadmap items: figure out why hardening glibc wasn’t working, and get it to work as much as possible.

Reproducing the bug

I started off by trying to reproduce the problem. It’s important to mention this because I often see young engineers forgetting to check if the problem is even valid anymore. I don’t blame them; the anxiety to get the bug fixed can be really blinding.

Fortunately, I already had one simple test to trigger the failure. All I had to do was install the py3-matplotlib package and then invoke:

$ python3 -c 'import matplotlib'

This would result in an abort with a coredump.

I followed the steps above, and readily saw the problem manifesting again. OK, first step is done; I wasn’t getting out easily from this one.

Initial debug

The next step is to actually try to debug the failure. In an ideal world you get lucky and are able to spot what’s wrong after just a few minutes. Or even better: you can also devise a patch to fix the bug and contribute it to upstream.

I installed GDB, and then ran the py3-matplotlib command inside it. When the abort happened, I issued a backtrace command inside GDB to see where exactly things had gone wrong. I got a stack trace similar to the following:

#0  0x00007c43afe9972c in __pthread_kill_implementation () from /lib/libc.so.6
#1  0x00007c43afe3d8be in raise () from /lib/libc.so.6
#2  0x00007c43afe2531f in abort () from /lib/libc.so.6
#3  0x00007c43af84f79d in uw_init_context_1[cold] () from /usr/lib/libgcc_s.so.1
#4  0x00007c43af86d4d8 in _Unwind_RaiseException () from /usr/lib/libgcc_s.so.1
#5  0x00007c43acac9014 in __cxxabiv1::__cxa_throw (obj=0x5b7d7f52fab0, tinfo=0x7c429b6fd218 <typeinfo for pybind11::attribute_error>, dest=0x7c429b5f7f70 <pybind11::reference_cast_error::~reference_cast_error() [clone .lto_priv.0]>)
    at ../../../../libstdc++-v3/libsupc++/eh_throw.cc:93
#6  0x00007c429b5ec3a7 in ft2font__getattr__(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) [clone .lto_priv.0] [clone .cold] () from /usr/lib/python3.13/site-packages/matplotlib/ft2font.cpython-313-x86_64-linux-gnu.so
#7  0x00007c429b62f086 in pybind11::cpp_function::initialize<pybind11::object (*&)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >), pybind11::object, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, pybind11::name, pybind11::scope, pybind11::sibling>(pybind11::object (*&)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >), pybind11::object (*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&)::{lambda(pybind11::detail::function_call&)#1}::_FUN(pybind11::detail::function_call&) [clone .lto_priv.0] ()
   from /usr/lib/python3.13/site-packages/matplotlib/ft2font.cpython-313-x86_64-linux-gnu.so
#8  0x00007c429b603886 in pybind11::cpp_function::dispatcher(_object*, _object*, _object*) () from /usr/lib/python3.13/site-packages/matplotlib/ft2font.cpython-313-x86_64-linux-gnu.so
...

Huh. Initially this didn’t provide me with much information. There was something strange seeing the abort function being called right after _Unwind_RaiseException, but at the time I didn’t pay much attention to it.

OK, time to expand our horizons a little. Remember when I said that several of our packages would crash with a hardened glibc? I decided to look for another problematic package so that I could make it crash and get its stack trace. My thinking here was that maybe if I compared both traces, something would come up.

I happened to find an old discussion where Dann Frazier mentioned that Emacs was also crashing for him. He and I share the Emacs passion, and I totally agreed with him when he said that “Emacs crashing is priority -1!” (I’m paraphrasing).

I installed Emacs, ran it, and voilà: the crash happened again. OK, that was good. When I ran Emacs inside GDB and asked for a backtrace, here’s what I got:

#0  0x00007eede329972c in __pthread_kill_implementation () from /lib/libc.so.6
#1  0x00007eede323d8be in raise () from /lib/libc.so.6
#2  0x00007eede322531f in abort () from /lib/libc.so.6
#3  0x00007eede262879d in uw_init_context_1[cold] () from /usr/lib/libgcc_s.so.1
#4  0x00007eede2646e7c in _Unwind_Backtrace () from /usr/lib/libgcc_s.so.1
#5  0x00007eede3327b11 in backtrace () from /lib/libc.so.6
#6  0x000059535963a8a1 in emacs_backtrace ()
#7  0x000059535956499a in main ()

Ah, this backtrace is much simpler to follow. Nice.

Hmmm. Now the crash is happening inside _Unwind_Backtrace. A pattern emerges! This must have something to do with stack unwinding (or so I thought… keep reading to discover the whole truth). You see, the backtrace function (yes, it’s a function) and C++’s exception handling mechanism use similar techniques to do their jobs, and it pretty much boils down to unwinding frames from the stack.

I looked into Emacs’ source code, specifically the emacs_backtrace function, but could not find anything strange over there. This bug was probably not going to be an easy fix…

The quest for a minimal reproducer

Being able to easily reproduce the bug is awesome and really helps with debugging, but even better is being able to have a minimal reproducer for the problem.

You see, py3-matplotlib is a huge package and pulls in a bunch of extra dependencies, so it’s not easy to ask other people to “just install this big package plus these other dependencies, and then run this command…”, especially if we have to file an upstream bug and talk to people who may not even run the distribution we’re using. So I set out to try and come up with a smaller recipe to reproduce the issue, ideally something that’s not tied to a specific package from the distribution.

Having all the information gathered from the initial debug session, especially the Emacs backtrace, I thought that I could write a very simple program that just invoked the backtrace function from glibc in order to trigger the code path that leads to _Unwind_Backtrace. Here’s what I wrote:

#include <execinfo.h>

int
main(int argc, char *argv[])
{
  void *a[4096];
  backtrace (a, 100);
  return 0;
}

After compiling it, I determined that yes, the problem did happen with this small program as well. There was only a small nuisance: the manifestation of the bug was not deterministic, so I had to execute the program a few times until it crashed. But that’s much better than what I had before, and a small price to pay. Having a minimal reproducer pretty much allows us to switch our focus to what really matters. I wouldn’t need to dive into Emacs’ or Python’s source code anymore.
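Driving the reproducer in a loop is enough to catch the non-deterministic crash. Something like this, assuming the file was saved as backtrace-test.c:

$ gcc -o backtrace-test backtrace-test.c
$ for i in $(seq 1 50); do ./backtrace-test || { echo "crashed on run $i"; break; }; done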

At the time, I was sure this was a glibc bug. But then something else happened.

GCC 15

I had to stop my investigation efforts because something more important came up: it was time to upload GCC 15 to Wolfi. I spent a couple of weeks working on this (it involved rebuilding the whole archive, filing hundreds of FTBFS bugs, patching some programs, etc.), and by the end of it the transition went smoothly. When the GCC 15 upload was finally done, I switched my focus back to the glibc hardening problem.

The first thing I did was to… yes, reproduce the bug again. It had been a few weeks since I had touched the package, after all. So I built a hardened glibc with the latest GCC and… the bug did not happen anymore!

Fortunately, the very first thing I thought was “this must be GCC”, so I rebuilt the hardened glibc with GCC 14, and the bug was there again. Huh, unexpected but very interesting.

Diving into glibc and libgcc

At this point, I was ready to start some serious debugging. And then I got a message on Signal. It was one of those moments where two minds think alike: Gabriel decided to check how I was doing, and I was thinking about him because this involved glibc, and Gabriel had contributed to the project for many years. I explained what I was doing, and he promptly offered to help. Yes, there are more people who love low-level debugging!

We spent several hours going through disassemblies of certain functions (because we didn’t have any debug information in the beginning), trying to make sense of what we were seeing. There was some heavy GDB involved; unfortunately I completely lost the session’s history because it was done inside a container running inside an ephemeral VM. But we learned a lot. For example:

  • It was hard to actually understand the full stack trace leading to uw_init_context_1[cold]. _Unwind_Backtrace obviously didn’t call it directly (it called uw_init_context_1, but what was that [cold] doing?). We had to investigate the disassembly of uw_init_context_1 in order to determine where uw_init_context_1[cold] was being called.

  • The [cold] suffix is a GCC function attribute that can be used to tell the compiler that the function is unlikely to be reached. When I read that, my mind immediately jumped to “this must be an assertion”, so I went to the source code and found the spot.

  • We were able to determine that the return code of uw_frame_state_for was 5, which means _URC_END_OF_STACK. That’s why the assertion was triggering; a GDB sketch of the check follows.
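The check itself boils down to a few GDB commands; a sketch (uw_frame_state_for is internal to libgcc, so in practice this needs a build with symbols, like the one described below):

$ gdb ./backtrace-test
(gdb) break uw_frame_state_for
(gdb) run
(gdb) finish    # "Value returned" shows 5, i.e. _URC_END_OF_STACK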

After finding these facts without debug information, I decided to bite the bullet and recompiled GCC 14 with -O0 -g3, so that we could debug what uw_frame_state_for was doing. After banging our heads a bit more, we found that fde is NULL at this excerpt:

// ...
  fde = _Unwind_Find_FDE (context->ra + _Unwind_IsSignalFrame (context) - 1,
                          &context->bases);
  if (fde == NULL)
    {
#ifdef MD_FALLBACK_FRAME_STATE_FOR
      /* Couldn't find frame unwind info for this function.  Try a
         target-specific fallback mechanism.  This will necessarily
         not provide a personality routine or LSDA.  */
      return MD_FALLBACK_FRAME_STATE_FOR (context, fs);
#else
      return _URC_END_OF_STACK;
#endif
    }
// ...

We’re debugging on amd64, which means that MD_FALLBACK_FRAME_STATE_FOR is defined and therefore is called. But that’s not really important for our case here, because we had established before that _Unwind_Find_FDE would never return NULL when using a non-hardened glibc (or a glibc compiled with GCC 15). So we decided to look into what _Unwind_Find_FDE did.

The function is complex because it deals with .eh_frame, but we were able to pinpoint the exact location where find_fde_tail (one of the functions called by _Unwind_Find_FDE) returns NULL:

if (pc < table[0].initial_loc + data_base)
  return NULL;

We looked at the addresses of pc and table[0].initial_loc + data_base, and found that the former fell within libgcc’s text section, while the latter fell within the text of /lib/ld-linux-x86-64.so.2.

At this point, we were already too tired to continue. I decided to keep looking at the problem later and see if I could get any further.

Bisecting GCC

The next day, I woke up determined to find what changed in GCC 15 that caused the bug to disappear. Unless you know GCC’s internals like the back of your hand (which I definitely don’t), the best way to do that is to git bisect the commits between GCC 14 and 15.

I spent a few days running the bisect. It took me more time than I’d have liked to find the right range of commits to pass git bisect (because of how branches and tags are done in GCC’s repository), and I also had to write some helper scripts that:

  • Modified the gcc.yaml package definition to make it build with the commit being bisected.
  • Built glibc using the GCC that was just built.
  • Ran tests inside a Docker container (with the recently built glibc installed) to determine whether the bug was present.

In the end, I had a commit to point to:

commit 99b1daae18c095d6c94d32efb77442838e11cbfb
Author: Richard Biener <rguenther@suse.de>
Date:   Fri May 3 14:04:41 2024 +0200

    tree-optimization/114589 - remove profile based sink heuristics

Makes sense, right?! No? Well, it didn’t for me either. Even after reading what was changed in the code and the upstream bug fixed by the commit, I was still clueless as to why this change “fixed” the problem (I say “fixed” because it may very well be an unintended consequence of the change, and some other problem might have been introduced).

Upstream takes over

After obtaining the commit that possibly fixed the bug, I talked to Dann and explained what I had done, and he suggested that I file an upstream bug and check with them. Great idea, of course.

I filed the following upstream bug:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120653

It’s a bit long, very dense and complex, but ultimately upstream was able to find the real problem and have a patch accepted in just two days. Nothing like knowing the code base. The initial bug became:

https://sourceware.org/bugzilla/show_bug.cgi?id=33088

In the end, the problem was indeed related to how the linker defines __ehdr_start, which glibc relies on, as the following code (from elf/dl-support.c) explains:

if (_dl_phdr == NULL)
  {
    /* Starting from binutils-2.23, the linker will define the
       magic symbol __ehdr_start to point to our own ELF header
       if it is visible in a segment that also includes the phdrs.
       So we can set up _dl_phdr and _dl_phnum even without any
       information from auxv.  */

    extern const ElfW(Ehdr) __ehdr_start attribute_hidden;
    assert (__ehdr_start.e_phentsize == sizeof *GL(dl_phdr));
    _dl_phdr = (const void *) &__ehdr_start + __ehdr_start.e_phoff;
    _dl_phnum = __ehdr_start.e_phnum;
  }
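
As an aside, the same linker-provided symbol is available to ordinary programs, too. Here’s a small standalone sketch of the trick the excerpt above relies on (assuming binutils >= 2.23, and mirroring the hidden visibility that glibc’s attribute_hidden expands to):

#include <link.h>    /* ElfW */
#include <stdio.h>

/* Defined by the linker when the ELF header is visible in a segment
   that also contains the program headers.  */
extern const ElfW(Ehdr) __ehdr_start
  __attribute__ ((visibility ("hidden")));

int main (void)
{
  const ElfW(Phdr) *phdr = (const ElfW(Phdr) *)
    ((const char *) &__ehdr_start + __ehdr_start.e_phoff);
  printf ("%u program headers, first at %p\n",
          (unsigned int) __ehdr_start.e_phnum, (const void *) phdr);
  return 0;
}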

But the following definition is the problematic one (from elf/rtld.c):

extern const ElfW(Ehdr) __ehdr_start attribute_hidden;

This symbol (along with its counterpart, __ehdr_end) was being run-time relocated when it shouldn’t have been. The fix that was pushed added optimization barriers to prevent the compiler from performing the relocations.
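
The usual shape of such a barrier is an empty inline asm that makes a value opaque to the compiler; a sketch of the common idiom (not the actual glibc fix) looks like this:

/* The empty asm forces the compiler to treat the pointer as an
   opaque run-time value, so it can no longer fold, reorder, or
   otherwise "see through" accesses based on it.  */
static inline const void *
optimization_barrier (const void *p)
{
  __asm__ ("" : "+r" (p));
  return p;
}

/* Hypothetical usage:
   const void *ehdr = optimization_barrier (&__ehdr_start);  */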

I don’t claim to fully understand what was done here, and Jakub’s analysis is a thing to behold, but I was able to confirm that the patch fixed the bug. And in the end, it was indeed a glibc bug.

Conclusion

This was an awesome bug to investigate. It’s one of those that deserve a blog post, even though some of the final details of the fix flew over my head.

I’d like to start blogging more about this sort of bug, because I’ve encountered my fair share of them throughout my career. And it was great being able to do some debugging with another person, exchange ideas, learn things together, and ultimately share that deep satisfaction when we figure out why a crash is happening.

I have at least one more bug on my TODO list to write about (another one involving glibc, but this time I was able to get to the bottom of it and come up with a patch). Stay tuned.

P.S.: After publishing this post, I realized that I had forgotten to explain why the -z now and -fno-strict-aliasing flags were important.

-z now is the flag that I determined to be triggering the breakage. If I compiled glibc with every hardening flag except -z now, everything worked. So initially I thought that the problem had to do with how ld.so was resolving symbols at runtime. As it turns out, this ended up being more of a symptom than the real cause of the bug.

As for -fno-strict-aliasing, a Gentoo developer who commented on the GCC bug above mentioned that this OpenSSF bug had a good point against using this flag for hardening. I still have to do a deep dive on what was discussed in the issue, but this is certainly something to take into consideration. There’s this very good write-up about strict aliasing in general if you’re interested in understanding it better.
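
If you want a quick taste before the deep dive, here’s a classic strict-aliasing violation (my own minimal example, not taken from the discussion above):

#include <stdio.h>
#include <string.h>

float f = 1.0f;

/* Undefined behavior: the float object is read through an int lvalue.
   With -fstrict-aliasing (implied by -O2) the compiler may assume that
   int and float accesses never alias; -fno-strict-aliasing disables
   that assumption.  */
int bits_wrong (void)
{
  return *(int *) &f;
}

/* The well-defined alternative: copy the object representation.  */
int bits_ok (void)
{
  int i;
  memcpy (&i, &f, sizeof i);
  return i;
}

int main (void)
{
  /* bits_wrong may print anything (or worse) at higher optimization
     levels; bits_ok always prints the bit pattern of 1.0f.  */
  printf ("%d %d\n", bits_wrong (), bits_ok ());
  return 0;
}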

18 June, 2025 03:29AM