Performance

I wasted a day on CSS selector performance to make a website load 2ms faster

For a few minutes, I felt like Indiana Jones staring at the holy grail. 230ms of savings from some CSS selector changes. This felt significant. And yet, the niggling phrase of “CSS selector efficiency is not something to worry about in 2024” kept ringing through my head.

I ran the before/after through WebPageTest and to my surprise (and lack of surprise), nothing significant had changed.

Don't attach tooltips to document.body

Instead of attaching tooltips directly to document.body, attach them to a predefined div in document.body.

BAD

Code language: HTML

<body>
    <!-- temporary div, vanishes when tooltips vanishes -->
    <div>my tooltip</div>
<body>

GOOD

Code language: HTML

<body>
    <!-- this div stays forever, just for attaching tooltips -->
    <div id="tooltips-container">
        <!-- temporary div, vanishes when tooltips vanishes -->
        <div>my tooltip</div>
    </div>
<body>

Introduction

Tooltips in our app were taking >80ms. And during this time, the main thread was blocked, you couldn’t interact with anything.

Other components like modal, popover, dropdown had similar performance issues. In some cases, a modal took more than 1 second to appear while making the UI unresponsive.

Read on in the source link for details.

Avoid Large, Complex Layouts and Layout Thrashing

Layout is where the browser figures out the geometric information for elements: their size and location in the page. Each element will have explicit or implicit sizing information based on the CSS that was used, the contents of the element, or a parent element. The process is called Layout in Chrome, Opera, Safari, and Internet Explorer. In Firefox it’s called Reflow, but effectively the process is the same.

Similarly to style calculations, the immediate concerns for layout cost are:

  1. The number of elements that require layout.
  2. The complexity of those layouts.

TL;DR

  • Layout is normally scoped to the whole document.
  • The number of DOM elements will affect performance; you should avoid triggering layout wherever possible.
  • Assess layout model performance; new Flexbox is typically faster than older Flexbox or float-based layout models.
  • Avoid forced synchronous layouts and layout thrashing; read style values then make style changes.

See the source link for details and solutions.

Idle Until Urgent

A few weeks ago I was looking at some of the performance metrics for my site. Specifically, I wanted to see how I was doing on our newest metric, first input delay (FID). My site is just a blog (and doesn’t run much JavaScript), so I expected to see pretty good results.

Input delay that’s less than 100 milliseconds is typically perceived as instant by users, so the performance goal we recommend (and the numbers I was hoping to see in my analytics) is FID < 100ms for 99% of page loads.

To my surprise, my site’s FID was 254ms at the 99th percentile. And while that’s not terrible, the perfectionist in me just couldn’t let that slide. I had to fix it!

[…]

[W]hile I was trying to solve my issue I stumbled upon a pretty interesting performance strategy that I want to share (it’s the primary reason I’m writing this article).

I’m calling the strategy: idle until urgent.

My performance problem

First input delay (FID) is a metric that measures the time between when a user first interacts with your site (for a blog like mine, that’s most likely them clicking a link) and the time when the browser is able to respond to that interaction (make a request to load the next page).

The reason there might be a delay is if the browser’s main thread is busy doing something else (usually executing JavaScript code). So to diagnose a higher-than-expected FID, you should start by creating a performance trace of your site as it’s loading (with CPU and network throttling enabled) and look for individual tasks on the main thread that take a long time to execute. Then once you’ve identified those long tasks, you can try to break them up into smaller tasks.

Here’s what I found when doing a performance trace of my site:

So what’s taking so long to run?

Well, if you look at the tails of this flame chart, you won’t see any single functions that are clearly taking up the bulk of the time. Most individual functions are run in less than 1ms, but when you add them all up, it’s taking more than 100ms to run them in a single, synchronous call stack.

This is the JavaScript equivalent of death by a thousand cuts.

Since the problem is all these functions are being run as part of a single task, the browser has to wait until this task finishes to respond to user interaction. So clearly the solution is to break up this code into multiple tasks[.]

[…]

A perfect example of a component that really needs to have its initialization code broken up can be illustrated by zooming closer down into this performance trace. Mid-way through the main() function, you’ll see one of my components uses the Intl.DateTimeFormat API[.]

[…]

Creating this object took 13.47 milliseconds!

The thing is, the Intl.DateTimeFormat instance is created in the component’s constructor, but it’s not actually used until it’s needed by other components that reference it to format dates. However, this component doesn’t know when it’s going to be referenced, so it’s playing it safe and instantiating the Int.DateTimeFormat object right away.

[…]

Idle Until Urgent

After spending a lot of time thinking about this problem, I realized that the evaluation strategy I really wanted was one where my code would initially be deferred to idle periods but then run immediately as soon as it’s needed. In other words: idle-until-urgent.

Idle-until-urgent sidesteps most of the downsides I described in the previous section. In the worst case, it has the exact same performance characteristics as lazy evaluation, and in the best case it doesn’t block interactivity at all because execution happens during idle periods.

What forces layout/reflow: a comprehensive list by Paul Irish

All of the below properties or methods, when requested/called in JavaScript, will trigger the browser to synchronously calculate the style and layout*. This is also called reflow or layout thrashing, and is common performance bottleneck.

Generally, all APIs that synchronously provide layout metrics will trigger forced reflow / layout. Read on for additional cases and details.

The self-fulfilling prophecy of poor mobile performance

There are plenty of stories floating around about how some organization improved performance and suddenly saw an influx of traffic from places they hadn’t expected. This is why. We build an experience that is completely [unusable] for them, and is completely invisible to our data. We create, what Kat Holmes calls, a “mismatch”. So we look at the data and think, “Well, we don’t get any of those low-end Android devices so I guess we don’t have to worry about that.” A self-fulfilling prophecy.

I’m a big advocate for ensuring you have robust performance monitoring in place. But just as important as analyzing what’s in the data, is considering what’s not in the data, and why that might be.