strager.net

In software, abstraction is removing details which are unimportant, leaving what is important.

For example, the list abstraction removes usually-unimportant details such as growth rate and pointer arithmetic for element access, leaving important details such as order and accessing elements by index. For some applications, the usually-unimportant details of growth rate and pointer arithmetic are essential, but for most applications, these details obscure what matters.
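As a rough illustration, here is a minimal sketch of a list whose growth policy and backing storage stay hidden from the caller. The class name GrowableList, the initial capacity of 4, and the doubling growth rate are all assumptions made up for this example, not a description of any real list implementation.

```python
class GrowableList:
    """A list that hides its growth policy behind append() and indexing."""

    def __init__(self):
        self._capacity = 4                     # hidden detail: initial capacity
        self._slots = [None] * self._capacity  # hidden detail: backing storage
        self._size = 0

    def append(self, value):
        if self._size == self._capacity:
            self._grow()
        self._slots[self._size] = value
        self._size += 1

    def _grow(self):
        # Hidden detail: the growth rate (doubling is an arbitrary choice).
        self._capacity *= 2
        new_slots = [None] * self._capacity
        new_slots[: self._size] = self._slots[: self._size]
        self._slots = new_slots

    def __len__(self):
        return self._size

    def __getitem__(self, index):
        if not 0 <= index < self._size:
            raise IndexError(index)
        return self._slots[index]


items = GrowableList()
for n in range(10):
    items.append(n * n)

# The caller only sees what the abstraction keeps: order and access by index.
print(len(items), items[3])
```

An application that cares about growth rate would have to reach past this interface, which is exactly the case where the abstraction stops helping.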

Graph a timeline of your Ninja build with this gnuplot script:

boxWidth = 0.8

set style fill solid
set yrange [0:*]
set ytic noenhanced
unset key

set xlabel "time (milliseconds)"
set ylabel "build target"

plot '.ninja_log' \
  using 1:0:1:2:($0-boxWidth/2.):($0+boxWidth/2.):($0+1):ytic(4) \
  with boxxyerror lc var
$ gnuplot -e 'set term png size 2560, 1600 truecolor; set output "plot.png"' plot.gnuplot

Ninja build timeline
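For a quick text summary instead of a plot, the same log can be read from Python. This is a sketch which assumes the .ninja_log layout the gnuplot script relies on: tab-separated lines with the start time in column 1, the end time in column 2, and the target in column 4, after a comment header. The function name slowest_targets and the sample data are made up for illustration.

```python
import io


def slowest_targets(log_file, count=3):
    """Return the `count` longest-running (duration_ms, target) pairs."""
    durations = []
    for line in log_file:
        if line.startswith("#"):
            continue  # skip the "# ninja log vN" header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # not a build record; ignore it
        start_ms, end_ms, target = int(fields[0]), int(fields[1]), fields[3]
        durations.append((end_ms - start_ms, target))
    return sorted(durations, reverse=True)[:count]


# Made-up sample data standing in for a real .ninja_log file.
sample_log = io.StringIO(
    "# ninja log v5\n"
    "0\t120\t0\tfoo.o\tabc\n"
    "5\t900\t0\tbar.o\tdef\n"
    "900\t1000\t0\tapp\t123\n"
)
print(slowest_targets(sample_log, count=2))
```

With a real build, pass open(".ninja_log") instead of the sample.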

Takeaways from J. B. Rainsberger's Integrated Tests are a Scam talk:

Four categories of tests:

Mocks of collaboration tests are the same as actions of contract tests.

Assertions of contract tests are the same as stubs of collaboration tests.

                            Collaboration test  Contract test
                                                Arrange
Call to interface method    Arrange (mock)      Act
Result of interface method  Arrange (stub)      Assert
                            Act
                            Assert
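The correspondence can be sketched with Python's unittest.mock. Everything named here (InMemoryUserStore, find_name, greeting) is hypothetical and exists only to show the pairing; it is not taken from the talk.

```python
from unittest import mock


class InMemoryUserStore:
    """Hypothetical implementation of the interface under contract test."""

    def __init__(self, names):
        self._names = dict(names)

    def find_name(self, user_id):
        return self._names[user_id]


def greeting(store, user_id):
    """Hypothetical client code that collaborates with a user store."""
    return f"Hello, {store.find_name(user_id)}!"


def test_collaboration():
    store = mock.Mock()
    store.find_name.return_value = "Ada"         # Arrange (stub): result of the interface method
    result = greeting(store, 42)                 # Act
    # The mocked expectation: the call to the interface method.
    # unittest.mock verifies it after the fact instead of during Arrange.
    store.find_name.assert_called_once_with(42)
    assert result == "Hello, Ada!"               # Assert


def test_contract():
    store = InMemoryUserStore({42: "Ada"})       # Arrange
    name = store.find_name(42)                   # Act: the call the collaboration test mocked
    assert name == "Ada"                         # Assert: the result the collaboration test stubbed


test_collaboration()
test_contract()
```

The call find_name(42) and the result "Ada" each appear once in both tests, which is the pairing the table describes.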

I wrote a one-off script to migrate backup archives from Duplicity to Borg. It alternates between duplicity restore and borg create for the last Duplicity backup of each month. I'm sharing it here just in case I need it in the future. It's definitely full of spaghetti.

Rough instructions:

  1. Run TZ=UTC duplicity collection-status file:///path/to/archive >status-archive
  2. Run gpg --export-secret-keys --armor key | GNUPGHOME="${PWD}/gpghome" gpg --import --passphrase-file /dev/null
  3. Run python doit.py archive
doit.py
import contextlib
import datetime
import json
import logging
import os
import pathlib
import pipes
import pprint
import re
import shutil
import subprocess
import sys
import time
import typing

logger = logging.getLogger(__name__)

directory = sys.argv[1]

logging.basicConfig(format="%(message)s", level=logging.INFO)

timestamp_to_duplicity_timestamp = {}

def parse_duplicity_timestamp(timestamp: str) -> datetime.datetime:
    return datetime.datetime(*time.strptime(timestamp, "%a %b %d %H:%M:%S %Y")[:6], tzinfo=datetime.timezone.utc)

def parse_duplicity_backup_timestamps(status_file) -> typing.Iterator[datetime.datetime]:
    for line in status_file:
        match = re.match(r"^\s*(?:Full|Incremental)\s+(?P<timestamp>.*?)\s+\d+$", line)
        if match is not None:
            duplicity_timestamp = match.group("timestamp")
            timestamp = parse_duplicity_timestamp(duplicity_timestamp)
            timestamp_to_duplicity_timestamp[timestamp] = duplicity_timestamp
            yield timestamp

def run_borg_command(
    command: typing.Sequence[str],
    key_file: typing.Optional[pathlib.Path] = None,
    keys_dir: typing.Optional[pathlib.Path] = None,
    passphrase: typing.Optional[str] = None,
    passphrase_file: typing.Optional[pathlib.Path] = None,
    new_passphrase: typing.Optional[str] = None,
    **subprocess_run_kwargs,
) -> subprocess.CompletedProcess:
    with contextlib.ExitStack() as cleanups:
        extra_env = {}
        pass_fds = []

        if key_file is not None:
            extra_env["BORG_KEY_FILE"] = str(key_file)
        if keys_dir is not None:
            extra_env["BORG_KEYS_DIR"] = str(keys_dir)

        assert (
            passphrase is None or passphrase_file is None
        ), "passphrase and passphrase_file are mutually exclusive"
        if passphrase is not None:
            # TODO(strager): Always give passphrases using files.
            extra_env["BORG_PASSPHRASE"] = passphrase
        if passphrase_file is not None:
            passphrase_fd = os.open(passphrase_file, os.O_RDONLY)
            cleanups.callback(lambda: os.close(passphrase_fd))
            extra_env["BORG_PASSPHRASE_FD"] = str(passphrase_fd)
            pass_fds.append(passphrase_fd)

        if new_passphrase is not None:
            # TODO(strager): Always give passphrases using files. Unfortunately,
            # Borg does not have a way to give a file or fd or command for the
            # new passphrase.
            extra_env["BORG_NEW_PASSPHRASE"] = new_passphrase

        logger.info(
            f"$ {command_string(command=command, extra_env=extra_env)}"
        )

        return subprocess.run(
            command,
            env=dict(os.environ, **extra_env),
            pass_fds=pass_fds,
            **subprocess_run_kwargs,
        )

_secret_environment_variables = {"BORG_PASSPHRASE", "BORG_NEW_PASSPHRASE"}


def command_string(
    command: typing.Sequence[str], extra_env: typing.Dict[str, str] = {}
) -> str:
    command_string = ""
    for (env_name, env_value) in extra_env.items():
        if env_name in _secret_environment_variables:
            env_value_str = "--REDACTED--"
        else:
            env_value_str = pipes.quote(env_value)
        command_string += f"{pipes.quote(env_name)}={env_value_str} "
    command_string += " ".join(map(pipes.quote, command))
    return command_string

with open(f"status-{directory}", "r") as status_file:
    timestamps = list(parse_duplicity_backup_timestamps(status_file))

last_timestamp_per_month = {}
for timestamp in timestamps:
    key = (timestamp.year, timestamp.month)
    last_timestamp_in_month = last_timestamp_per_month.get(key)
    if last_timestamp_in_month is None or last_timestamp_in_month < timestamp:
        last_timestamp_per_month[key] = timestamp

os.environ["TZ"] = "UTC"
os.environ["GNUPGHOME"] = str(pathlib.Path("gpghome").absolute())
os.environ["PASSPHRASE"] = ""

timestamps_to_restore = list(sorted(last_timestamp_per_month.values()))
for timestamp in timestamps_to_restore:
    print(f"--------------------------------------------------------------------------------")
    print(f"--------------------------------------------------------------------------------")
    print(f"restoring {timestamp_to_duplicity_timestamp[timestamp]}\n")
#    if timestamp < datetime.datetime(year=2020, month=3, day=1, tzinfo=datetime.timezone.utc):
#        print("ignoring")
#        continue

    duplicity_timestamp = timestamp_to_duplicity_timestamp[timestamp]
    borg_archive_name = f"duplicity-{directory}-{duplicity_timestamp}"

    list_command = run_borg_command([
        "borg",
        "list",
        "--json",
        f"/Users/strager/borg-backups/straddler",
    ], key_file="/Users/strager/borg-keys/straddler-archive.borgkey",
       passphrase_file="/Users/strager/borg-keys/straddler-archive.borgkey.passphrase",
       capture_output = True
    )
    list_command.check_returncode()
    archives = json.loads(list_command.stdout)
    if [a for a in archives['archives'] if a['archive'] == borg_archive_name]:
        print("already archived; skipping")
        continue

    extract_directory = pathlib.Path(directory)
    if extract_directory.exists():
        shutil.rmtree(extract_directory)

    command = [
        "duplicity",
        "restore",

        "--time", str(int(timestamp.timestamp())),

        f"file:///Users/strager/duplicity-backups/straddler/{directory}",
        str(extract_directory),
    ]
    logger.info(f"$ {command_string(command)}")
    subprocess.check_call(command)

    create_command = run_borg_command([
        "borg",
        "create",

        "--files-cache", "rechunk,ctime",

        "--one-file-system",
        "--info",
        "--show-version",
        "--stats",
        "--noatime",
        "--nobsdflags",

        "--timestamp",
        timestamp.strftime("%Y-%m-%dT%H:%M:%S"),

        f"/Users/strager/borg-backups/straddler::{borg_archive_name}",
        str(extract_directory) + "/",
    ], key_file="/Users/strager/borg-keys/straddler-archive.borgkey",
       passphrase_file="/Users/strager/borg-keys/straddler-archive.borgkey.passphrase",
    )
    create_command.check_returncode()

Colorize your manual pages with lolcat:

$ COLUMNS=`tput cols` PAGER='sh -c "col -bpx|lolcat -f|less -R"' man ls
LS(1)                  User Commands                 LS(1)

NAME
       ls - list directory contents

SYNOPSIS
       ls [OPTION]... [FILE]...

DESCRIPTION
       List  information  about  the  FILEs  (the  current
       directory by default).  Sort entries alphabetically
       if none of -cftuvSUX nor --sort is specified.

       Mandatory  arguments  to long options are mandatory
       for short options too.

       -a, --all
              do not ignore entries starting with .

       -A, --almost-all

Yesterday, I decided to finally debug an issue on my home network. This post documents my investigations.

Scenario

I have three relevant nodes on my network:

Router
A Netgear router running OpenWRT (Linux).
straglum (192.168.2.206)
My workstation. Attached to the br-lan or eth0.1 interface on my router via Wi-Fi.
strager-nas (192.168.3.89)
My network-attached storage. Attached to the eth0.2 interface on my router via a cable.

Desired state: straglum can connect to strager-nas via SSH. strager-nas cannot connect to straglum (or any other node on the network, or the public internet).

Current state: straglum cannot connect to strager-nas via SSH. I don't know what strager-nas can connect to.

Investigation

First, I made sure the strager-nas machine was powered on. It wasn't, so I hit the power button.

Hypothesis
strager-nas is entirely offline or has a broken network configuration.
Test
On the router, run ping strager-nas and ssh strager-nas.
Results
At first, ping showed no pongs. After waiting a minute, ping finally showed pongs. ssh failed with No matching algo kex.
Conclusion
strager-nas is online. strager-nas' network is configured properly. strager-nas' SSH service is running.

I still can't talk to strager-nas from straglum, so turning on strager-nas was insufficient.

Hypothesis
The router is failing to route TCP and ICMP packets from straglum to strager-nas.
Test
On the router, run tcpdump host strager-nas. On straglum, run ping strager-nas and ssh strager-nas.
Results
tcpdump showed the TCP and ICMP packets coming from straglum destined for strager-nas. tcpdump also showed the TCP and ping responses, coming from strager-nas destined for straglum.
Conclusion
The router successfully routes TCP and ICMP packets from straglum to strager-nas. Additionally, strager-nas is responding to ping and SSH requests.

According to tcpdump, everything is working fine. Perhaps tcpdump is capturing packets before the firewall drops them.

Hypothesis
The iptables rules which should permit traffic from strager-nas to straglum are not taking effect. The firewall is blocking traffic.
Test
Add probe rules to the router's iptables, attempt to connect from straglum, and observe iptables' counters:
$ iptables -I FORWARD -o eth0.2 -m comment --comment "strager test A" -m conntrack --ctstate NEW
$ iptables -I FORWARD -i eth0.2 -d 192.168.2.206 -m comment --comment "strager test B" -m conntrack --ctstate ESTABLISHED
$ iptables -I FORWARD -i eth0.2 -d 192.168.2.206 -m comment --comment "strager test C"

$ iptables -vL FORWARD
Results
While pinging, the strager test A and strager test C rules matched the ping packets, but the strager test B rule matched no packets.
Conclusion
The existing ctstate RELATED,ESTABLISHED rules are indeed not taking effect. We have a probing rule which does match response packets.

At this point, I decided to manually allow all traffic from strager-nas to straglum:

$ iptables -I FORWARD -i eth0.2 -d 192.168.2.206 -j ACCEPT -m comment --comment "strager test D"

After adding this rule, I started receiving ping responses, and was able to connect to strager-nas from straglum via SSH. We have made progress!

However, I was able to connect to straglum's SSH server from strager-nas. This is undesirable; strager-nas shouldn't be able to initiate connections to other hosts. We need a way to allow new connections into strager-nas but not new connections from strager-nas.

Hypothesis
For the strager-nas-to-straglum packets being blocked by the router's iptables, the connection state is not ESTABLISHED. The state must be something else.
Test
Add probe rules to the router's iptables, attempt to ping strager-nas from straglum, and observe iptables' counters:
$ iptables -I FORWARD -o eth0.2 -m comment --comment "strager test from-NEW" -m conntrack --ctstate NEW
$ iptables -I FORWARD -o eth0.2 -m comment --comment "strager test from-ESTABLISHED" -m conntrack --ctstate ESTABLISHED
$ iptables -I FORWARD -o eth0.2 -m comment --comment "strager test from-INVALID" -m conntrack --ctstate INVALID
$ iptables -I FORWARD -o eth0.2 -m comment --comment "strager test from-RELATED" -m conntrack --ctstate RELATED
$ iptables -I FORWARD -o eth0.2 -m comment --comment "strager test from-SNAT" -m conntrack --ctstate SNAT
$ iptables -I FORWARD -o eth0.2 -m comment --comment "strager test from-DNAT" -m conntrack --ctstate DNAT
$ iptables -I FORWARD -i eth0.2 -d 192.168.2.206 -m comment --comment "strager test to-NEW" -m conntrack --ctstate NEW
$ iptables -I FORWARD -i eth0.2 -d 192.168.2.206 -m comment --comment "strager test to-ESTABLISHED" -m conntrack --ctstate ESTABLISHED
$ iptables -I FORWARD -i eth0.2 -d 192.168.2.206 -m comment --comment "strager test to-INVALID" -m conntrack --ctstate INVALID
$ iptables -I FORWARD -i eth0.2 -d 192.168.2.206 -m comment --comment "strager test to-RELATED" -m conntrack --ctstate RELATED
$ iptables -I FORWARD -i eth0.2 -d 192.168.2.206 -m comment --comment "strager test to-SNAT" -m conntrack --ctstate SNAT
$ iptables -I FORWARD -i eth0.2 -d 192.168.2.206 -m comment --comment "strager test to-DNAT" -m conntrack --ctstate DNAT

$ iptables -vL FORWARD
Results
None of the strager test to- rules matched any packets. However, the strager test from- rules did match packets.
Conclusion
Either the TCP packets coming from strager-nas have no connection state, or they have an undocumented connection state.

In order to observe the connection states, I ran ssh straglum on strager-nas, then grepped the /proc/net/nf_conntrack file on the router:

$ grep 192.168.3.89 /proc/net/nf_conntrack
ipv4     2 tcp      6 107 SYN_SENT src=192.168.2.206 dst=192.168.3.89 sport=37670 dport=22 packets=1 bytes=60 [UNREPLIED] src=192.168.3.89 dst=192.168.2.206 sport=22 dport=37670 packets=0 bytes=0 mark=0 use=2

The output looks odd. [UNREPLIED] looks very suspicious. Perhaps this is the cause? I think connection tracking may be turned off somehow.
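To make the odd parts of that entry easier to see, here is a small sketch that splits the line into its flags and its two directions. It assumes the layout of the single line shown above (two src=... groups followed by trailing fields such as mark= and use=); the function name parse_conntrack_line is made up, and other entry types may need more handling.

```python
import re

# Keys that belong to a direction; trailing fields such as mark= and use= are skipped.
DIRECTION_KEYS = {"src", "dst", "sport", "dport", "packets", "bytes"}


def parse_conntrack_line(line):
    """Split one nf_conntrack line into flags and per-direction fields."""
    flags = re.findall(r"\[(\w+)\]", line)
    directions = []
    for token in line.split():
        key, separator, value = token.partition("=")
        if not separator or key not in DIRECTION_KEYS:
            continue
        if key == "src":
            directions.append({})  # each direction starts with its src= field
        directions[-1][key] = value
    original, reply = directions
    return {"flags": flags, "original": original, "reply": reply}


SAMPLE_LINE = (
    "ipv4     2 tcp      6 107 SYN_SENT src=192.168.2.206 dst=192.168.3.89 "
    "sport=37670 dport=22 packets=1 bytes=60 [UNREPLIED] src=192.168.3.89 "
    "dst=192.168.2.206 sport=22 dport=37670 packets=0 bytes=0 mark=0 use=2"
)
entry = parse_conntrack_line(SAMPLE_LINE)
print(entry["flags"], entry["original"]["packets"], entry["reply"]["packets"])
```

Seen this way, the router has counted one packet in the original direction and none in the reply direction, even though tcpdump showed replies on the wire.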

Hypothesis
Connection tracking is disabled.
Test
Run iptables -t raw -vL and see if any packets have tracking disabled.
Results
$ iptables -t raw -vL
Chain PREROUTING (policy ACCEPT 937 packets, 570K bytes)
 pkts bytes target            prot opt in     out     source               destination
 486K  409M delegate_notrack  all  --  any    any     anywhere             anywhere

Chain OUTPUT (policy ACCEPT 416 packets, 492K bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain delegate_notrack (1 references)
 pkts bytes target                prot opt in     out     source               destination
  259 65085 zone_lan_nas_notrack  all  --  eth0.2 any     anywhere             anywhere

Chain zone_lan_nas_notrack (1 references)
 pkts bytes target     prot opt in     out     source               destination
   66 18466 CT         all  --  any    any     anywhere             anywhere             CT notrack
Conclusion
Tracking is indeed disabled for strager-nas' interface on the router. Tracking is enabled for other interfaces.

Something inside OpenWRT must have created the notrack rule and the zone_lan_nas_notrack chain. I looked through the LuCI web interface, and sure enough, the "Force connection tracking" checkbox for the lan_nas firewall zone is unchecked.

Solution

I forced connection tracking for the lan_nas zone in the UI, turned off my manual ACCEPT rule, and pinged. I got a response. Success!

In C, you can write a switch statement without braces. The following program prints first then wat:

#include <stdio.h>

int main() {
  switch (1)
  case 1:
    puts("first");

  switch (0)
  case 1:
    puts("second");

  puts("wat");
}

Idea for a tutorial for teaching programming to beginners:

  1. Learn I/O programming with no logic. Maybe CLI or Arduino.
  2. Learn logic programming with no I/O. Create tic-tac-toe. Code for a GUI is provided for you.
  3. Swap the pre-made GUI of the tic-tac-toe program with an Arduino-based UI (also pre-made).

The goal of this tutorial is to encourage learners to think of logic and I/O as separate things. The two components can be developed independently. Logic can be decoupled from its user-visible interface.
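A minimal sketch of what that separation could look like for tic-tac-toe. The function names (new_board, place, winner, render_text) are assumptions for illustration. The logic half never prints or reads anything, so the text renderer could be swapped for a GUI or an Arduino front end without touching it.

```python
# --- Logic: pure functions, no I/O ---
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def new_board():
    return [" "] * 9


def place(board, index, player):
    """Return a new board with `player` at `index`; the input board is untouched."""
    if board[index] != " ":
        raise ValueError("square already taken")
    return board[:index] + [player] + board[index + 1:]


def winner(board):
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


# --- I/O: the only code that knows how the board is shown ---
def render_text(board):
    rows = ["|".join(board[r:r + 3]) for r in (0, 3, 6)]
    return "\n-+-+-\n".join(rows)


board = new_board()
for index, player in [(0, "X"), (3, "O"), (1, "X"), (4, "O"), (2, "X")]:
    board = place(board, index, player)
print(render_text(board))
print("winner:", winner(board))
```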

Remove elements from the end of a JavaScript array by manipulating its length property:

const eggs = ['a', 'b', 'c', 'd'];
eggs.length -= 1;
console.log(eggs); // [ 'a', 'b', 'c' ]

What are some overheads for different operating system kernel implementation techniques?

Overheads of OS kernels
Operation | Traditional ring0/ring3 kernel | ring0 single address-space | ring0 machine code sandbox (e.g. NaCl) | ring0 virtual machine code sandbox (e.g. PNaCl, WebAssembly)
I/O system call | context switch | function call | function call | (none; inlinable)
Load code | (none) | (none) | overhead | overhead
Access memory | (none) | (none) | overhead | overhead
Jump indirect | (none) | (none) | overhead | overhead
Switch thread | context switch | (none) | (none) | (none)

The extract method refactoring is symmetrical to the inline function optimization. What other symmetries are there between refactoring and optimization?

Brainstorm of refactorings which relate to optimizations
Refactoring Optimization
Extract function
Inline function
Inline function
Extract variable Common sub-expression elimination
Inline variable Constant folding
Remove dead code Dead code elimination
Slide statements Loop-invariant code motion
Split loop Loop fusion
Loop fission
Loop peeling
Loop un-switching
Replace conditional with polymorphism Devirtualization
Replace control flag with break Dead store elimination
Replace temp with query Common sub-expression elimination
Introduce parameter object Scalar replacement of aggregates
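To make one row concrete, here is a sketch of the extract variable refactoring next to what common sub-expression elimination would do. The function names are hypothetical. The refactoring is done by hand for readability; an optimizing compiler that performs common sub-expression elimination would make the same change automatically for speed.

```python
import math


def distance_report_before(x, y):
    # The sub-expression math.hypot(x, y) is written (and evaluated) twice.
    return f"{math.hypot(x, y):.1f} units ({math.hypot(x, y) / 1000:.4f} km)"


def distance_report_after(x, y):
    # "Extract variable": name the repeated sub-expression and compute it once,
    # which is the transformation common sub-expression elimination performs.
    distance = math.hypot(x, y)
    return f"{distance:.1f} units ({distance / 1000:.4f} km)"


# Both versions produce the same text; only the structure differs.
assert distance_report_before(3, 4) == distance_report_after(3, 4)
print(distance_report_after(3, 4))
```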

In C++, initializing a const reference automatic variable might make a reference to an unnamed temporary:

#include <cassert>
#include <cstring>
#include <string>

struct s {
    int b: 6;
};
void f() {
    s my_s;
    my_s.b = 6;
    const int& b = my_s.b; // Copy!
    assert(b == 6);

    my_s.b += 1;
    assert(my_s.b == 7);
    // b was a copy of my_s.b (and not a reference to
    // my_s.b) and remains unchanged.
    assert(b == 6);
}
void g() {
    char a[] = "hello world";
    char* cs = a;
    const std::string& s = cs; // Copy!
    assert(s == "hello world");

    a[0] = 'y';
    assert(strcmp(cs, "yello world") == 0);
    // s was a copy of cs (not a reference to cs) and
    // remains unchanged.
    assert(s == "hello world");
}

I think this behavior is familiar for function parameters, but is fragile for automatic variables.

Tentative conclusion: avoid const reference automatic variables.

Follow-up questions for further research:

Possible solutions:

Idea for a less ambiguous syntax for C++ forwarding references (i.e. universal references):

template<class T>
auto begin2(decltype(T) x)  // C++17: auto begin2(T&& x)
{
  return std::forward<T>(x);
}

When exporting to PNG, gnuplot ignores the transparency of objects by default:

set term png size 1280, 1024
set output "plot2.png"

To fix this, set the truecolor option:

set term png size 1280, 1024 truecolor
set output "plot2.png"