Snippets

LLMs don't do what they're usually presented as doing

LLMs, the technology underpinning the current AI hype wave, don’t do what they’re usually presented as doing. They have no innate understanding, they do not think or reason, and they have no way of knowing if a response they provide is truthful or, indeed, harmful. They work based on statistical continuation of token streams, and everything else is a user-facing patch on top.

Tags

JavaScript template literal as object property name

This will throw an error in every JavaScript engine:

Code language: JavaScript

{
  `mouseenter.${eventNamespace}`: handler,
}

Why? Because template literals are not actually literals, as confusing as that name is - they’re expressions. However, it is possible to (ab)use JavaScript’s zany, vibes-based type coercion to make this valid by wrapping the template literal in an array like so:

Code language: JavaScript

{
  [`mouseenter.${eventNamespace}`]: handler,
}

This causes the JavaScript engine to evaluate the template literal into a string, then it coerces the array containing that one string into a string, which is then valid as the property name.

Callback hell

Asynchronous JavaScript, or JavaScript that uses callbacks, is hard to get right intuitively. A lot of code ends up looking like this: 

Code language: JavaScript

fs.readdir(source, function (err, files) {
  if (err) {
    console.log('Error finding files: ' + err)
  } else {
    files.forEach(function (filename, fileIndex) {
      console.log(filename)
      gm(source + filename).size(function (err, values) {
        if (err) {
          console.log('Error identifying file size: ' + err)
        } else {
          console.log(filename + ' : ' + values)
          aspect = (values.width / values.height)
          widths.forEach(function (width, widthIndex) {
            height = Math.round(width / aspect)
            console.log('resizing ' + filename + 'to ' + height + 'x' + height)
            this.resize(width, height).write(dest + 'w' + width + '_' + filename, function(err) {
              if (err) console.log('Error writing file: ' + err)
            })
          }.bind(this))
        }
      })
    })
  }
})

See the pyramid shape and all the }) at the end? Eek! This is affectionately known as callback hell.

Please stop externalizing your costs directly into my face

If you think [LLM] crawlers respect robots.txt then you are several assumptions of good faith removed from reality. These bots crawl everything they can find, robots.txt be damned, including expensive endpoints like git blame, every page of every git log, and every commit in every repo, and they do so using random User-Agents that overlap with end-users and come from tens of thousands of IP addresses – mostly residential, in unrelated subnets, each one making no more than one HTTP request over any time period we tried to measure – actively and maliciously adapting and blending in with end-user traffic and avoiding attempts to characterize their behavior or block their traffic.

We are experiencing dozens of brief outages per week, and I have to review our mitigations several times per day to keep that number from getting any higher. When I do have time to work on something else, often I have to drop it when all of our alarms go off because our current set of mitigations stopped working. Several high-priority tasks at SourceHut have been delayed weeks or even months because we keep being interrupted to deal with these bots, and many users have been negatively affected because our mitigations can’t always reliably distinguish users from bots.

All of my sysadmin friends are dealing with the same problems. I was asking one of them for feedback on a draft of this article and our discussion was interrupted to go deal with a new wave of LLM bots on their own server.

[…]

Please stop legitimizing LLMs or AI image generators or GitHub Copilot or any of this garbage. I am begging you to stop using them, stop talking about them, stop making new ones, just stop. If blasting CO2 into the air and ruining all of our freshwater and traumatizing cheap laborers and making every sysadmin you know miserable and ripping off code and books and art at scale and ruining our fucking democracy isn’t enough for you to leave this shit alone, what is?

Mirroring a GitLab repository to GitHub

We recently migrated Neurocracy from GitHub to GitLab but I wanted to keep our numerous repositories on GitHub up to date without having to push to both. Repository mirroring is relatively simple to set up once you know the exact steps required, and it allows us to push only to GitLab yet have our GitHub repositories also receive the same commits nearly instantly.

GitLab

First, we need to set up your GitLab project. Copy your project’s GitHub HTTPS Git URL (not the SSH URL); for example, for Omnipedia it would be:

https://github.com/neurocracy/omnipedia.git

Now head to your GitLab project:

  1. Go to “Settings” → “Repository” and expand “Mirroring repositories”
  2. Click “Add new”
  3. Paste the GitHub URL from earlier into “Git repository URL”
  4. Edit the URL to replace https at the start with ssh; in the Omnipedia example, the URL would now be ssh://github.com/neurocracy/omnipedia.git
  5. Now click “Detect host keys” and wait for it to finish - this is important
  6. Set “Authentication method” to “SSH public key”
  7. Enter “git” as the user name
  8. Click “Mirror repository”
  9. Lastly, click the “Copy SSH public key” button on the newly configured repository - you’ll need this for GitHub

GitHub

Now in your GitHub project:

  1. Go to “Settings” → “Deploy keys”
  2. Click “Add deploy key”
  3. Give your key a title; I recommend something like “GitLab mirror key”
  4. Paste the SSH public key you copied from GitLab previously
  5. Check “Allow write access” - this is important
  6. Click “Add key”

Testing

Mirroring should now be fully configured, so you can test it in one of two ways:

  1. If you have new commits locally that you haven’t pushed yet, you can push now to your GitLab project and it should mirror that to the GitHub counterpart.
  2. If you don’t want to add new commits, you can tell GitLab to mirror manually by clicking the “Update now” button in your GitLab project’s “Settings” → “Repository” → “Mirroring repositories”

Once you’ve done either one, you can refresh the GitHub project’s “Settings” → “Deploy keys” page, after which you should see green text under the deploy key stating something along the lines of “Last used within the last week”.

I wasted a day on CSS selector performance to make a website load 2ms faster

For a few minutes, I felt like Indiana Jones staring at the holy grail. 230ms of savings from some CSS selector changes. This felt significant. And yet, the niggling phrase of “CSS selector efficiency is not something to worry about in 2024” kept ringing through my head.

I ran the before/after through WebPageTest and to my surprise (and lack of surprise), nothing significant had changed.

Programmers' mental models keep software alive

What keeps the software alive are the programmers who have an accurate mental model (theory) of how it is built and works. That mental model can only be learned by having worked on the project while it grew or by working alongside somebody who did, who can help you absorb the theory. Replace enough of the programmers, and their mental models become disconnected from the reality of the code, and the code dies. That dead code can only be replaced by new code that has been ‘grown’ by the current programmers.