Lab 7: Visualizing quantitative data with D3

In this lab, we will learn:

How to draw visualizations for quantitative data, such as bar charts and scatter plots, using D3

How to show tooltips on hover as a way to provide more information about the data

How to compute summary statistics about our data in a structured way

Note at the outset that this lab is significantly longer than previous labs, so we suggest you start early. We know that it might seem tedious at points, but please keep going and make the best use of the available office hours. It is important that you understand the concepts we work through here so that you have a strong basis for your future FP progress.

Table of contents

Check-off
Lab 7 Rubric
Slides (or lack thereof)
Step 0: Setting up
- Step 0.1: Data Setup
- Step 0.2: Getting Rid of Warnings
Step 1: Displaying summary stats
Step 2: Visualizing time and day of commits in a scatterplot
Step 3: Adding a tooltip
Step 4: Communicating lines edited via the size of the dots (optional)
Step 5: Clickable Commits
- Step 5.1: Making commits clickable
- Step 5.2: Adding the Visuals
Step 6: Bar Chart
Step 7: Brushing (Optional)

Check-off

You need to come to TA Office Hours to get checked off for this lab (any of them, no appointment needed), OR submit your work asynchronously by filling out this form.

If you choose to submit your work asynchronously and have an incorrect or incomplete part of the lab, you will not receive any credit for the lab (we do not offer partial credit on labs). You may not resubmit this form nor ask for a synchronous check off for the same lab.

Lab 7 Rubric

To successfully complete this lab check-off, ensure your work meets all of the following requirements:

General

Same functionality from Labs 4-6.
Successfully deployed to GitHub Pages.
When executing the Svelte development server locally, there are no warnings

Meta Page & GitHub Commit Statistics

GitHub workflow has been updated to generate loc.csv
Meta page is reachable from the navbar (and the current navbar tab gets highlighted like the others)
CSV is successfully loaded
One aggregated stat (other than total LOC (Lines of Code)) is computed and displayed

Scatterplot

Commits show as separate data points
Axes are aligned and visible
Horizontal grid lines are visible but don’t interfere with the actual plot (opacity and/or color adapted)

Tooltip

On mouseenter and mouseleave, the tooltip shows and hides accordingly
The circles of the data points are scaled using the d3.scaleSqrt scale
The circles of the data points increase in size when hovered (no change in the x and y positions though)

Click Selection

All commits are clickable and thus selectable
The circles change color upon click and remain colored as long as they are selected

Bar Chart

Bar chart component renders on meta page
Selecting commits changes the data underlying the chart and thus also the chart
Colors and positions of the Bar chart remain unchanged no matter which commits are selected
When no commits are selected, the language statistics of all are shown

Slides (or lack thereof)

Just like the previous lab, there are no slides for this lab! Since the topic was covered in last Monday’s lecture, it can be helpful for you to review the material from it.

This lab is a little more involved than most of the previous labs, because it’s introducing the core technical material around data visualization. A robust understanding of these concepts will be invaluable as you work on your final projects, so time spent practicing them for the lab will be time well spent.

Step 0: Setting up

Step 0.1: Data Setup

If you haven’t completed Step 7 of Lab 6 yet, do it now.

Step 0.2: Getting Rid of Warnings

You might have noticed that we’ve been carrying a warning since Lab 5. Specifically, when running npm run dev -- --open to launch your development server locally, you probably see this:

Avoid calling `fetch` eagerly during server side rendering — put your `fetch` calls inside `onMount` or a `load` function instead

This warning tells us that we’re trying to access a resource preemptively, which can cause issues in a production (i.e., public-facing) setting. Such warnings can be hard to debug since they don’t throw errors—they’re simply internal Svelte warnings. The best way to track down the issue is to search for where we’re calling fetch.

A quick global search reveals the culprit in routes/+page.svelte. The problematic code looks like this:

{#await fetch("https://api.github.com/users/YOUR_USERNAME")}
	<p>Loading...</p>
{:then response}
	{#await response.json()}
		<p>Decoding...</p>
	{:then data}
		<section>
			...

To prevent fetch from being called eagerly during server-side rendering, we use Svelte’s onMount hook. Add the following logic to your <script> element (replace YOUR_USERNAME to match the URL you have been previously using):

import { onMount } from "svelte";

let githubData = null;
let loading = true;
let error = null;

onMount(async () => {
	try {
		const response = await fetch("https://api.github.com/users/YOUR_USERNAME");
		githubData = await response.json();
	} catch (err) {
		error = err;
	}
	loading = false;
});

Now, instead of fetching data directly in the HTML markup, we do it when the component mounts. This approach is safer, since onMount callbacks run only in the browser and never during server-side rendering. The only thing left to do is to display “Loading...” while loading is still true, and otherwise show what githubData holds.

Update the code that displays your GitHub statistics like so:

{#if loading}
    <p>Loading...</p>
{:else if error}
    <p class="error">Something went wrong: {error.message}</p>
{:else}
    <section>
        <h2>My GitHub Stats</h2>
        <dl>
            <dt>Followers</dt>
            <dd>{githubData.followers}</dd>
            <dt>Following</dt>
            <dd>{githubData.following}</dd>
            <dt>Public Repositories</dt>
            <dd>{githubData.public_repos}</dd>
        </dl>
    </section>
{/if}

Note that in the process, we’ve significantly improved readability. When you restart your dev server, you should now see no more warnings. Yay!
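One case the onMount snippet above does not cover: fetch() only rejects on network failures, so an HTTP error response (for example, a 403 when the GitHub API rate limit is hit) still resolves, and response.json() will parse the error body as if it were your stats. Below is a minimal sketch of a stricter loader. The helper name loadJson and its fetchFn parameter are assumptions made for this example (fetchFn exists only so the sketch can be tried without network access); they are not part of the lab code.

```javascript
// Hypothetical helper (not part of the lab): fetch JSON and treat HTTP errors as failures.
// fetchFn defaults to the global fetch; it is a parameter only so the sketch can run offline.
async function loadJson(url, fetchFn = globalThis.fetch) {
	const response = await fetchFn(url);
	if (!response.ok) {
		// fetch() resolves even for 4xx/5xx responses, so we check the status ourselves
		throw new Error(`Request failed with status ${response.status}`);
	}
	return response.json();
}
```

Inside onMount you could then write githubData = await loadJson(url) within the existing try/catch, so that the error branch of the markup also covers HTTP errors.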

Step 1: Displaying summary stats

Step 1.1: Reading the CSV file in D3

In our routes/meta/+page.svelte file, we will now read the CSV file. Thankfully, we don’t have to reinvent the wheel and parse CSV files ourselves: D3 has a built-in function for that.

Add a <script> element to the Meta page, and import D3, like you did in the previous lab:

import * as d3 from "d3";

We will be using the d3.csv() function from the d3-fetch module, which provides helper functions for fetching data.

Now, continuing in the <script> element, let’s read the CSV file:

import { onMount } from "svelte";

let data = [];

onMount(async () => {
	data = await d3.csv("/loc.csv");
});

and let’s print out the total lines of code in our repo in the HTML part of the page to make sure it worked:

<p>Total lines of code: {data.length}</p>

If everything went well, you’ll be seeing something like this (I added a title and another paragraph just to make it pretty!):

To see the structure of these objects, add a console.log(data) right after the statement (within onMount) that sets the variable, then check your console. We place this inside onMount to ensure that the statement is triggered once the data is loaded.

You should be seeing something like this (I’ve expanded the first row):

Screenshot of the output

Note that everything is a string, including the numbers and dates. That can be quite a footgun when handling data (as an anecdote, I spent about an hour debugging an issue caused by using + to add two numbers together, which instead concatenated them as strings while developing this very lab!). To fix it, we adapt our data fetching logic to include a row conversion function:

data = await d3.csv("/loc.csv", row => ({
	...row,
	line: Number(row.line), // or just +row.line
	depth: Number(row.depth),
	length: Number(row.length),
	date: new Date(row.date + "T00:00" + row.timezone),
	datetime: new Date(row.datetime)
}));

It should now look like this:

Screenshot of the output

Don’t forget to delete the console.log line now that we’re done — we don’t want to clutter our console with debug info!
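To make the string footgun concrete, here is a small stand-alone sketch that does not need D3 or the CSV file. The row values are made up, and convertRow is a name chosen for this example; it simply mirrors the numeric and datetime conversions from the row conversion function above.

```javascript
// A raw row as a CSV parser hands it over: every value is a string (values are made up).
const rawRow = { line: "12", depth: "2", length: "40", datetime: "2024-03-05T14:30:00+01:00" };

// The footgun: + concatenates strings instead of adding numbers.
const wrongSum = rawRow.line + rawRow.depth; // "122"

// Convert the fields we want to compute with (same idea as the d3.csv() row function).
function convertRow(row) {
	return {
		...row,
		line: Number(row.line),
		depth: Number(row.depth),
		length: Number(row.length),
		datetime: new Date(row.datetime),
	};
}

const cleanRow = convertRow(rawRow);
const rightSum = cleanRow.line + cleanRow.depth; // 14
```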

Step 1.2: Computing commit data

Notice that while this data includes information about each commit¹ (that still has an effect on the codebase), it’s not in a format we can easily access, but mixed in with the data about each line (this is called denormalized data).

Let’s extract this data about commits in a separate object for easy access. We will compute this inside onMount after reading the CSV file.

First, define a commits variable outside onMount:

let commits = [];

Then, inside onMount, we will use the d3.groups() method to group the data by the commit property.

commits = d3.groups(data, d => d.commit);

This will give us an array where each element is an array with two values:

The first value is the unique commit identifier
The second value is an array of objects for lines that have been modified by that commit.

Print it out with {JSON.stringify(commits, null, "\t")} to see what it looks like!
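If the nested-array shape is hard to picture, the following plain-JavaScript sketch mimics it on a tiny made-up dataset. groupsBy is a hypothetical stand-in written for this example, not D3's implementation; it only reproduces the [key, rows] output shape described above.

```javascript
// Hypothetical stand-in for d3.groups(rows, key): returns [[key, rows], ...] in first-seen order.
function groupsBy(rows, key) {
	const map = new Map();
	for (const row of rows) {
		const k = key(row);
		if (!map.has(k)) map.set(k, []);
		map.get(k).push(row);
	}
	return Array.from(map.entries());
}

// Made-up rows: one object per line of code, tagged with the commit that last touched it.
const sampleLines = [
	{ commit: "a1b2", file: "src/app.js", line: 1 },
	{ commit: "c3d4", file: "src/app.js", line: 2 },
	{ commit: "a1b2", file: "src/lib.js", line: 1 },
];

const grouped = groupsBy(sampleLines, d => d.commit);
// grouped[0] is ["a1b2", [two row objects]]
// grouped[1] is ["c3d4", [one row object]]
```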

Next, we transform this into an array of objects, one per commit, with a totalLines property holding the number of lines modified by that commit and a (non-enumerable) lines property holding the lines themselves:

commits = d3.groups(data, d => d.commit).map(([commit, lines]) => {
	let first = lines[0];
	let {author, date, time, timezone, datetime} = first;
	let ret = {
		id: commit,
		url: "https://github.com/vis-society/lab-7/commit/" + commit,
		author, date, time, timezone, datetime,
		hourFrac: datetime.getHours() + datetime.getMinutes() / 60,
		totalLines: lines.length
	};

	// Like ret.lines = lines
	// but non-enumerable so it doesn’t show up in JSON.stringify
	Object.defineProperty(ret, "lines", {
		value: lines,
		configurable: true,
		writable: true,
		enumerable: false,
	});

	return ret;
});
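The Object.defineProperty() call is the least familiar part of that snippet, so here is a stand-alone sketch of its effect on a made-up object: the property is fully usable from code, but it is skipped by JSON.stringify() and Object.keys().

```javascript
// Made-up commit object with one regular and one non-enumerable property.
const commitInfo = { id: "a1b2", totalLines: 2 };

Object.defineProperty(commitInfo, "lines", {
	value: [{ line: 1 }, { line: 2 }],
	configurable: true,
	writable: true,
	enumerable: false, // hide it from enumeration and serialization
});

const asJson = JSON.stringify(commitInfo);   // '{"id":"a1b2","totalLines":2}'
const visibleKeys = Object.keys(commitInfo); // ["id", "totalLines"]
const lineCount = commitInfo.lines.length;   // 2, the property is still there
```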

Check it out by adding console.log(commits) after setting it. In my case it looks like this:

Step 1.3: Displaying the stats

Let’s get our feet wet with this data by displaying two more stats. Use a <dl> list that reuses the same formatting as in the stats on your homepage.

Avoid copy-pasting the CSS. You can either create a class and define the styling for dl.stats and its children in your style.css file, or create a <Stats> Svelte component that wraps it (I went with the former for simplicity, but the “proper” way is the latter).

Replace the paragraph we added previously with the following, such that data.length is now the first stat we display:

<dl class="stats">
	<dt>Total <abbr title="Lines of code">LOC</abbr></dt>
	<dd>{data.length}</dd>
</dl>

You can display the total number of commits as the second statistic.

What other aggregate stats can you calculate about the whole codebase? Here are a few ideas (pick 1 from the list below, or come up with your own that you then show us during the lab check-off):

Number of files in the codebase
Maximum file length (in lines)
Longest file
Average file length (in lines)
Average line length (in characters)
Longest line length
Longest line
Maximum depth
Deepest line
Average depth
Average file depth
Time of day (morning, afternoon, evening, night) that most work is done
Day of the week that most work is done

You will find the d3-array module very helpful for these kinds of computations.

Following is some advice on how to calculate these stats depending on their category.

Other stats that you can (but don't have to) consider :)

Aggregates over the whole dataset

The following code belongs in the <script> section of your +page.svelte file. Compute these values once data has been loaded (i.e. once its length is greater than 0), then reference the resulting variables in the HTML portion of your file.

These measures compute an aggregate (e.g. sum, mean, min, max) over a property across the whole dataset.

Examples:

Average line length
Longest line
Maximum depth
Average depth

These variables involve using one of the data summarization methods over the whole dataset, mapping to the property you want to summarize, and then applying the method. For example, to calculate the maximum depth, you’d use d3.max(data, d => d.depth). To calculate the average depth, you’d use d3.mean(data, d => d.depth).

Number of distinct values

These compute the number of distinct values of a property across the whole dataset.

Examples:

Number of files
Number of authors
Number of days worked on site

To calculate these, you’d use d3.group() / d3.groups() to group the data by the property you want to count the distinct values of, and then use result.size / result.length respectively to get the number of groups.

For example, the number of files would be d3.group(data, d => d.file).size (or d3.groups(data, d => d.file).length).
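If all you need is the count, a plain Set gives the same number without building the groups. This is an alternative sketch on made-up rows, not the approach the lab prescribes; countDistinct is a name chosen for this example.

```javascript
// Made-up rows, one per line of code.
const locRows = [
	{ file: "src/app.js", author: "Ada" },
	{ file: "src/app.js", author: "Ada" },
	{ file: "src/lib.js", author: "Grace" },
	{ file: "README.md", author: "Ada" },
];

// Count distinct values of whatever the accessor returns.
const countDistinct = (rows, accessor) => new Set(rows.map(accessor)).size;

const numFiles = countDistinct(locRows, d => d.file);     // 3
const numAuthors = countDistinct(locRows, d => d.author); // 2
```

One caveat: a Set compares objects by identity, so for Date values you would map to a primitive first (for example, d => +d.date) before counting distinct days.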

Grouped aggregates

These are very interesting stats, but also the most involved of the bunch. These compute an aggregate within a group, and then a different aggregate across all groups.

Examples:

Average file length (in lines)
Average file depth (average of max depth per file)

First, we use d3.rollup() / d3.rollups() to compute the aggregate within each group. If it seems familiar, it’s because we used it in the previous lab to calculate projects per year. For example, to calculate the average file length, we’d first use d3.rollups() to calculate the length of every file:

$: fileLengths = d3.rollups(data, v => d3.max(v, v => v.line), d => d.file);

Then, to find the average of those, we’d use d3.mean() on the result:

$: averageFileLength = d3.mean(fileLengths, d => d[1]);

Note that these reactive statements (marked with $:) are placed outside of onMount. They re-run whenever a variable they depend on changes, so they update automatically once data is loaded. If we placed them inside onMount, they would only run once (when the component is first mounted).
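The two-step idea (aggregate within each group, then aggregate across groups) can be checked by hand on a tiny made-up dataset. The sketch below uses plain JavaScript instead of d3.rollups() and d3.mean(), purely so that every intermediate value is visible.

```javascript
// Made-up rows: src/app.js has 3 lines, src/lib.js has 1.
const fileRows = [
	{ file: "src/app.js", line: 1 },
	{ file: "src/app.js", line: 2 },
	{ file: "src/app.js", line: 3 },
	{ file: "src/lib.js", line: 1 },
];

// Step 1: within each file, the highest line number is that file's length.
const maxLineByFile = new Map();
for (const { file, line } of fileRows) {
	maxLineByFile.set(file, Math.max(maxLineByFile.get(file) ?? 0, line));
}

// Step 2: average those per-file lengths across all files.
const lengths = [...maxLineByFile.values()]; // [3, 1]
const avgFileLength = lengths.reduce((sum, n) => sum + n, 0) / lengths.length; // 2
```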

Min/max value

These involve finding not the min/max of a property itself, but another property of the row with the min/max value. This can apply both to the whole dataset and to groups.

Examples:

Longest file
Longest line
Deepest line
Time of day (morning, afternoon, evening, night) that most work is done
Day of the week that most work is done

For example, let’s try to calculate the time of day that the most work is done. We’d use date.toLocaleString() to get the time of day and use that as the grouping value:

$: workByPeriod = d3.rollups(data, v => v.length, d => d.datetime.toLocaleString("en", {dayPeriod: "short"}))

Then, to find the period with the most work, we’d use d3.greatest() instead of d3.max() to get the entire element, then access the name of the period with [0] (the ?. guards against an empty array, for which d3.greatest() returns undefined):

$: maxPeriod = d3.greatest(workByPeriod, (d) => d[1])?.[0];
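The difference between "the maximum value" and "the element that has the maximum value" is easy to see in a plain-JavaScript sketch. Everything here is an assumption made for illustration: the hours are made up, and periodOfHour uses hand-picked boundaries, whereas toLocaleString() with dayPeriod chooses its own.

```javascript
// Hypothetical bucketing of hours into periods (boundaries picked for this example only).
function periodOfHour(hour) {
	if (hour < 6) return "night";
	if (hour < 12) return "morning";
	if (hour < 18) return "afternoon";
	return "evening";
}

// Made-up hours at which work happened.
const workHours = [9, 10, 14, 15, 16, 22];

// Count entries per period, like d3.rollups(rows, v => v.length, keyFn).
const counts = new Map();
for (const hour of workHours) {
	const period = periodOfHour(hour);
	counts.set(period, (counts.get(period) ?? 0) + 1);
}
const periodCounts = [...counts]; // [["morning", 2], ["afternoon", 3], ["evening", 1]]

// Keep the whole [period, count] pair with the largest count, then read its name.
const busiestPair = periodCounts.reduce(
	(best, cur) => (best === undefined || cur[1] > best[1] ? cur : best),
	undefined
);
const busiestPeriod = busiestPair?.[0]; // "afternoon"
```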

Step 2: Visualizing time and day of commits in a scatterplot

Now let’s visualize our edits in a scatterplot with the time of day as the Y axis and the date as the X axis.

Step 2.1: Drawing the dots

First, let’s define a width and height for our coordinate space in our <script> block (just below your imports, outside any function such as onMount):

let width = 1000, height = 600;

Then, in the HTML we add an <svg> element to hold our chart, and a suitable heading (e.g. “Commits by time of day” in a <h3> element):

<svg viewBox="0 0 {width} {height}">
	<!-- scatterplot will go here -->
</svg>

Add the following to the <style> element:

<style>
	svg {
		overflow: visible;
	}
</style>

Now, as shown in the Web-based visualization lecture, specifically when we discussed the x- and y-scales, we need to create scales to map our data to the coordinate space using the d3-scale module.

We will need to create two scales: a Y scale for the times of day, and an X scale for the dates.

The Y scale (yScale variable) is a standard linear scale that maps the hour of day (0 to 24) to the Y axis (0 to height).

But for the X scale (xScale variable), there are a few things to unpack:

Instead of a linear scale, which is meant for any type of quantitative data, we use a time scale which handles dates and times automagically. It works with JS Date objects, which we already have in the datetime property of each commit.
We can use d3.extent() to find the minimum and maximum date in our data in one fell swoop instead of computing it separately via d3.min() and d3.max().
We can use scale.nice() to extend the domain to the nearest “nice” values (e.g. multiples of 5, 10, 15, etc. for numbers, or round dates to the nearest day, month, year, etc. for dates).

Define both scales as reactive variables ($:) in the <script> element based on the linked resources. One subtlety: the date property of each commit is midnight at the start of that day, while the dots are positioned by datetime. If the X domain ended at the date of the latest commit, any commit made later than 00:00 on that day would land past the end of the axis. To avoid this, we create two variables holding the minimum (earliest) and maximum (latest + 1 day) date, using d3.min() and d3.max() and adding one day to the maximum. This allows us to create the correct xScale extent.

Check your answers here!

// Thanks to Nathanael Jenkins for flagging this to us!
$: minDate = d3.min(commits.map(d => d.date));
$: maxDate = d3.max(commits.map(d => d.date));
$: maxDatePlusOne = new Date(maxDate);
$: maxDatePlusOne.setDate(maxDatePlusOne.getDate() + 1);

$: xScale = d3.scaleTime()
              .domain([minDate, maxDatePlusOne])
              .range([0, width])
              .nice();

$: yScale = d3.scaleLinear()
              .domain([24, 0])
              .range([height, 0]);
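To build intuition for what a scale computes, here is a plain-JavaScript sketch of the underlying linear interpolation. linearScale is a hypothetical helper written for this example; it leaves out everything else D3 scales offer (ticks, nice(), clamping). A time scale maps values the same way, just on timestamps, which is why the second half uses millisecond values.

```javascript
// Hypothetical helper: maps a value from [d0, d1] to [r0, r1] by linear interpolation.
function linearScale([d0, d1], [r0, r1]) {
	return value => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

// Same domain and range as the yScale above, with height = 600.
const chartHeight = 600;
const hourToY = linearScale([24, 0], [chartHeight, 0]);
const yMidnight = hourToY(0); // 0
const yNoon = hourToY(12);    // 300
const yEndOfDay = hourToY(24); // 600

// Dates work the same way once converted to timestamps (made-up 10-day domain).
const chartWidth = 1000;
const domainStart = Date.UTC(2024, 0, 1);
const domainEnd = Date.UTC(2024, 0, 11);
const dateToX = linearScale([domainStart, domainEnd], [0, chartWidth]);
const xHalfway = dateToX(Date.UTC(2024, 0, 6)); // 500
```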

Once we have both scales, we can draw the scatter plot by drawing circles with the appropriate coordinates inside our <svg> element. Note that we already define an index count here to be able to use it later for the tooltip in Step 3.

<g class="dots">
{#each commits as commit, index }
	<circle
		cx={ xScale(commit.datetime) }
		cy={ yScale(commit.hourFrac) }
		r="5"
		fill="steelblue"
	/>
{/each}
</g>

The group (<g>) element is not necessary, but it helps keep the SVG structure a bit more organized once we start adding other visual elements.

If we preview at this point, we’ll get something like this:

That was a bit anti-climactic! We did all this work and all we got was a bunch of dots?

Indeed, without axes, a scatterplot does not even look like a chart. Let’s add them!

Step 2.2: Adding axes

As shown in lecture, the first step to add axes is to create space for them. We define margins in our JS:

let margin = {top: 10, right: 10, bottom: 30, left: 20};

Now we would need to adjust our scales to account for these margins by changing:

The range of the X scale from [0, width] to [margin.left, width - margin.right]
The range of the Y scale from [height, 0] to [height - margin.bottom, margin.top]

However, for readability and convenience, you can also define a usableArea variable to hold these bounds, since we’ll later need them for other things too:

let usableArea = {
	top: margin.top,
	right: width - margin.right,
	bottom: height - margin.bottom,
	left: margin.left
};
usableArea.width = usableArea.right - usableArea.left;
usableArea.height = usableArea.bottom - usableArea.top;

Now the ranges become much more readable. Update the arguments of .range in xScale and yScale accordingly.

[usableArea.left, usableArea.right] for the X scale
[usableArea.bottom, usableArea.top] for the Y scale
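As a sanity check, here are the numbers you should get for the width, height, and margins used in this lab. computeUsableArea is a name chosen for this sketch; it performs the same arithmetic as the usableArea definition above.

```javascript
// Same arithmetic as the usableArea definition, wrapped in a function for this example.
function computeUsableArea(width, height, margin) {
	const area = {
		top: margin.top,
		right: width - margin.right,
		bottom: height - margin.bottom,
		left: margin.left,
	};
	area.width = area.right - area.left;
	area.height = area.bottom - area.top;
	return area;
}

const area = computeUsableArea(1000, 600, { top: 10, right: 10, bottom: 30, left: 20 });
// area is { top: 10, right: 990, bottom: 570, left: 20, width: 970, height: 560 }
```

So the X range becomes [20, 990] and the Y range becomes [570, 10].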

Then we create xAxis and yAxis variables in our JS to hold our axes:

let xAxis, yAxis;

and <g> elements within our <svg> that we bind to them:

<g transform="translate(0, {usableArea.bottom})" bind:this={xAxis} />
<g transform="translate({usableArea.left}, 0)" bind:this={yAxis} />

Make sure these elements come before your dots, since SVG paints elements in the order they appear in the document, and you want your dots to be painted over anything else.

Then we use d3.select() below our xAxis and yAxis definitions to select these elements and apply the axes to them via d3-axis functions:

$: {
	d3.select(xAxis).call(d3.axisBottom(xScale));
	d3.select(yAxis).call(d3.axisLeft(yScale));
}

If we view our scatterplot now, we’ll see something like this:

Much better, right?

But how does it work? Right click one of the tick labels on the axes and select “Inspect Element”. You will notice that the ticks are actually <g> elements with <text> elements inside them. So D3 has auto-magically generated these elements and added them to our visualization.

Dev tools screenshot

The only thing that remains is to actually format the Y axis to look like actual times. We can do that using the axis.tickFormat() method:

d3.select(yAxis).call(d3.axisLeft(yScale).tickFormat(d => String(d % 24).padStart(2, "0") + ":00"));

What is this function actually doing? Let’s break it down:

d % 24 uses the remainder operator (%) to get 0 instead of 24 for midnight (we could have done d === 24 ? 0 : d instead)

String(d % 24) converts the number to a string

string.padStart() formats it as a two digit number

Finally, we append ":00" to it to make it look like a time.

D3 provides a host of date/time formatting helpers in the d3-time-format module, however for this case, simple string manipulation is actually easier.
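To see the formatter in isolation, here it is pulled out into a named function and applied to a few tick values. The name formatHour is chosen for this sketch; the body is the same expression passed to tickFormat() above.

```javascript
// Same expression as the tickFormat callback, given a name for this example.
const formatHour = d => String(d % 24).padStart(2, "0") + ":00";

const labelMidnight = formatHour(0);  // "00:00"
const labelMorning = formatHour(9);   // "09:00"
const labelEvening = formatHour(18);  // "18:00"
const labelWrapped = formatHour(24);  // "00:00", thanks to the remainder operator
```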

The result looks like this:

Screenshot of the scatter plot with formatted Y axis

Step 2.3: Adding horizontal grid lines

Axes already improved our plot tenfold (it now looks like an actual scatterplot for one!) but it’s still hard to see what X and Y values each dot corresponds to.

Let’s add grid lines to make it easier to read the plot at a glance.

When adding grid lines, there are a few tradeoffs to consider. You want to make them prominent enough to assist in reading the chart, but not so prominent that they add clutter and distract from the data itself. Err on the side of fewer, fainter grid lines rather than dense and darker ones.

We will only create horizontal grid lines for simplicity, but you can easily add vertical ones too if you want (but be extra mindful of visual clutter).

Conceptually, there is no D3 primitive specifically for grid lines. Grid lines are basically just axes with no labels and freakishly long ticks. 😁

So we add grid lines in a very similar way to how we added axes: we create a JS variable to hold the axis (I called it yAxisGridlines), and use a reactive statement that starts off identical to the one for our yAxis. First, we use axis.tickFormat() again, but this time to remove the text. Then, we use the axis.tickSize() method with a tick size of -usableArea.width to make the lines extend across the whole chart (the - is to flip them). Place this statement in the same reactive block as the ones for the X and Y axes.

$: {
	d3.select(yAxisGridlines).call(
		d3.axisLeft(yScale)
		  .tickFormat("")
		  .tickSize(-usableArea.width)
	);
}

We also need to create a <g> element to hold the grid lines. Let’s give it a class of gridlines so we can style it later:

<g class="gridlines" transform="translate({usableArea.left}, 0)" bind:this={yAxisGridlines} />

Make sure that your <g> element for the grid lines comes before the <g> element for the Y axis, as you want the grid lines to be painted under the axis, not over it.

If we look now, we already have grid lines, but they look a bit too prominent.

Let’s add some CSS to fix this:

.gridlines {
	stroke-opacity: .2;
}

Here is the comparison before and after:

The grid lines before and after `stroke-opacity`.

Do not use .gridlines line, .gridlines .tick line or any other descendant selector to style the lines: Svelte thinks it’s unused CSS and removes it!

Coloring each line based on the time of day, with bluer colors for night times and orangish ones for daytime? 😁

Step 3: Adding a tooltip

Even with the gridlines, it’s still hard to see what each dot corresponds to. Let’s add a tooltip that shows information about the commit when you hover over a dot.

Step 3.1: Static element

First, we’ll render the data in an HTML element, and once we’re sure everything works well, we’ll make it look like a tooltip.

Similarly to Step 5.2 of the previous lab, when we were selecting a pie wedge, we will now use a hoveredIndex variable to hold the index of the hovered commit, and a hoveredCommit variable that is reactively updated every time a commit is hovered and holds the data we want to display in the tooltip:

let hoveredIndex = -1;
$: hoveredCommit = commits[hoveredIndex] ?? hoveredCommit ?? {};
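The ?? chain in that reactive statement deserves a second look. commits[-1] is undefined, so when nothing is hovered the expression falls back to the previous hoveredCommit, and to an empty object before the first hover ever happens. The sketch below simulates this with a plain function; nextHoveredCommit and the sample commits are made up for this example.

```javascript
// Made-up commits for the simulation.
const demoCommits = [{ id: "a1b2" }, { id: "c3d4" }];

// Same expression as the reactive statement, with the previous value passed in explicitly.
function nextHoveredCommit(commits, hoveredIndex, previous) {
	return commits[hoveredIndex] ?? previous ?? {};
}

// Before any hover: index is -1 and there is no previous value, so we get {}.
const beforeHover = nextHoveredCommit(demoCommits, -1, undefined);

// Hovering the second dot: we get that commit.
const whileHovering = nextHoveredCommit(demoCommits, 1, beforeHover);

// Leaving the dot: index is -1 again, so the previous commit is kept.
const afterLeaving = nextHoveredCommit(demoCommits, -1, whileHovering);
```

Keeping the previous commit around is what lets the tooltip content stay in place while it fades out, rather than going blank the moment the mouse leaves.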

Then, in our SVG, we add mouseenter and mouseleave event listeners on each circle element:

<circle
	on:mouseenter={evt => hoveredIndex = index}
	on:mouseleave={evt => hoveredIndex = -1}
	<!-- Your other elements ... -->
/>

You may notice that you have a yellow squiggly here indicating an Accessibility warning. These built-in warnings are great reminders from Svelte to think about how accessible our applications are when we make them. We will come back to address warnings like these in Lab 10!

Now add an element to display data about the hovered commit:

<dl class="info tooltip">
	<dt>Commit</dt>
	<dd><a href="{ hoveredCommit.url }" target="_blank">{ hoveredCommit.id }</a></dd>

	<dt>Date</dt>
	<dd>{ hoveredCommit.datetime?.toLocaleString("en", {dateStyle: "full"}) }</dd>

	<!-- Add: Time, author, lines edited -->
</dl>

In the CSS, we add two rules:

dl.info with grid layout so that the <dt>s are in the 1st column and the <dd>s in the 2nd; also remove their default margins, and apply some styling to make the labels less prominent than the values.
.tooltip with position: fixed, plus top: 1em and left: 1em, to place it at the top left of the viewport so we can see it regardless of scroll status.

Why not just add everything on a single CSS rule? Because this way we can reuse the .info class for other <dl>s that are not tooltips and the .tooltip class for other tooltips that are not <dl>s.

What’s the difference between fixed and absolute positioning? position: fixed positions the element relative to the viewport, while position: absolute positions it relative to the nearest positioned ancestor (or the root element if there is none). The position offsets are specified via top, right, bottom, and left properties (or their shorthand, inset). In practice, it means that position: fixed elements stay in the same place even when you scroll, while position: absolute elements scroll with the rest of the page.

We should also apply some hover styles on the dots. For example, to smoothly make them bigger when hovered, we can do something like this:

circle {
	transition: 200ms;

	&:hover {
		transform: scale(1.5);
	}
}

If you preview now, you will see some weirdness (slowed down by 10x):

This is because in SVG by default the origin of transforms is the top left corner of the coordinate system. To fix that and set the origin to the center of the dot itself, we need to add two properties to the circle CSS rule:

transform-origin: center;
transform-box: fill-box;

The hover effect now looks far more reasonable:

Overall, at the end of this step, we should have something like this:

Step 3.2: Styling it like a tooltip

Seeing this info is already useful, but it’s not really a tooltip yet. There are three components to making our <dl> an actual tooltip:

Styling it like a tooltip (e.g. giving it a shadow that makes it look raised from the page)
Making it only appear when we are hovering over a dot (Step 3.3)
Positioning it near the mouse cursor (Step 3.4)

FYI, these steps can be done in any order since they are basically independent tasks.

In terms of styling, you should definitely give it a background-color as otherwise the text will be hard to read. You can either go for a solid color (e.g. white) or a semi-transparent color (e.g. oklch(100% 0% 0 / 80%)) that will show some of the chart behind it.

A few other useful CSS properties are:

box-shadow for shadows. Avoid overly prominent shadows: you are trying to make it look elevated, not to decorate it. The shadow should not be distracting, but just enough to make it look like it’s floating above the page. Generally, the larger the blur radius and the more transparent the color, the more raised the element will look. Experiment with different values to see what looks best for your design.
border-radius for rounded corners
backdrop-filter to blur what’s underneath (frosted glass effect). This is only relevant if you have a semi-transparent background color.

You would also probably want to add some spacing between its content and the edges of the tooltip, i.e. padding.

Step 3.3: Making it only appear when we are hovering over a dot

Currently, our tooltip appears even when it has no content, which is quite jarring. It also appears when we are not hovering over any dot, and just shows the previous content. That’s not too bad when it’s fixed at the top left of the viewport, but can you picture how annoying this would be if it was an actual tooltip that just won’t take a hint and go away?

We could wrap the whole tooltip with an {#if hoveredIndex > -1 }...{/if} block and it would work. However, that’s not very flexible. It makes it hard to use transition effects when the tooltip disappears (because it’s gone immediately), make it disappear with a delay to allow users to interact with it, or not disappear at all if users are actively interacting with it (hovering it or focusing elements within it).

Instead, we will use the HTML hidden attribute:

<dl class="info tooltip" hidden={hoveredIndex === -1}>

and add some CSS to hide the element by fading it out:

dl.info {
	/* ... other styles ... */
	transition-duration: 500ms;
	transition-property: opacity, visibility;

	&[hidden]:not(:hover, :focus-within) {
		opacity: 0;
		visibility: hidden;
	}
}

It should now behave like this:

Step 3.4: Positioning the tooltip near the mouse cursor

Now, the final piece of the puzzle to make this element into an actual tooltip!

Our tooltip is currently positioned at the top left corner of the viewport (actually 1em from the top and 1em from the left) in a hardcoded way, via the top and left properties. To position it near the mouse cursor instead, we need to set these properties dynamically based on the mouse position.

Thankfully, the event object on mouse events has several properties that give us the mouse position relative to different things. To get the mouse position relative to the viewport, we can use the x and y properties of the event object.

We will declare a new variable in our JS and use it to store the last recorded mouse position:

let cursor = {x: 0, y: 0};

Then, we will update it in our mouseenter event listener:

<circle
	on:mouseenter={evt => {
		hoveredIndex = index;
		cursor = {x: evt.x, y: evt.y};
	}}
	<!-- Other attributes/directives that you already have in this element -->
/>

Print it out in your HTML via {JSON.stringify(cursor, null, "\t")} and move the mouse around to make sure it works!

As with all these debug statements, don’t forget to remove it once you verify it works.

Now let’s use these to set top and left on the tooltip:

<dl class="info tooltip" hidden={hoveredIndex === -1} style="top: {cursor.y}px; left: {cursor.x}px">

This is the result:

While we directly set top and left for simplicity, we usually want to avoid setting CSS properties directly. It’s more flexible to set custom properties that we then use in our CSS. For example, assume you wanted to subtly move the shadow as the mouse pointer moves to create more sense of depth (parallax). If we had custom properties with the mouse coordinates, we could just use them in other properties too, whereas here we’d have to set the box-shadow with inline styles too.

Step 3.5: Bulletproof positioning

Our naïve approach to positioning the tooltip near the mouse cursor by setting the top and left CSS properties works well if the tooltip is small and the mouse is near the center of the viewport. However, if the tooltip is near the edges of the viewport, it falls apart.

Try it yourself: dock the dev tools at the bottom of the window and make them tall enough that you can scroll the page. Now hover over a dot near the bottom of the page. Can you see the tooltip?

Solving this on our own is actually an incredibly complicated problem in the general case. Thankfully, there are many wonderful packages that solve it for us. We will use Floating UI here.

First, we install it via npm:

npm install @floating-ui/dom

Then, we import the three functions we will need from it:

import {
	computePosition,
	autoPlacement,
	offset,
} from '@floating-ui/dom';

Just like D3, Floating UI is not Svelte-specific and works with DOM elements. Therefore, just like we did for the axes in Step 2.2, we will use bind:this to bind a variable to the tooltip element:

let commitTooltip;

<!-- Other attributes omitted for brevity -->
<dl class="info tooltip" bind:this={commitTooltip}>

Then, we will use computePosition() to compute the position of the tooltip based on the position of the hovered dot and the size of the tooltip. This function returns a Promise that resolves to an object with properties like x and y that we can use in our CSS instead of cursor. Therefore, let’s create a new variable to hold the position of the tooltip, which we will update in our mouseenter event listener:

let tooltipPosition = {x: 0, y: 0};

Since the code of this event listener is growing way beyond a single line expression, it’s time to move it to a function.

We’ll try something different this time: instead of creating separate functions for each event, we will invoke the same function for all events, and read evt.type to determine what to do. For this, create a new dotInteraction() function in your JS that takes the index of the dot and the event object as arguments:

function dotInteraction (index, evt) {
	if (evt.type === "mouseenter") {
		// dot hovered
	}
	else if (evt.type === "mouseleave") {
		// dot unhovered
	}
}

Move your existing event listener code (i.e. the code we added to the circle element in the svg to handle on:mouseenter and on:mouseleave) into the dotInteraction() function. We can now update the event listeners on circle to just call the dotInteraction() function instead:

<circle
	on:mouseenter={evt => dotInteraction(index, evt)}
	on:mouseleave={evt => dotInteraction(index, evt)}
	<!-- Other attributes/directives that you already have in this element -->
/>

Back to the dotInteraction() function, we can use evt.target to get the dot that was hovered over:

let hoveredDot = evt.target;

Now, in the block that handles the mouseenter events, we will use computePosition() to compute the position of the tooltip based on the position of the dot. To do so, let’s first mark the function as async, which allows us to use await to wait for asynchronous code to resolve. This is helpful because computePosition() returns a Promise that resolves to the position of the tooltip.

This is what your function should look like now:

async function dotInteraction (index, evt) {
	let hoveredDot = evt.target;
	if (evt.type === "mouseenter") {
		hoveredIndex = index;
		cursor = {x: evt.x, y: evt.y};
		tooltipPosition = await computePosition(hoveredDot, commitTooltip, {
			strategy: "fixed", // because we use position: fixed
			middleware: [
				offset(5), // spacing from tooltip to dot
				autoPlacement() // see https://floating-ui.com/docs/autoplacement
			],
		});
	}
	else if (evt.type === "mouseleave") {
		hoveredIndex = -1;
	}
}

We won’t go into much detail on the API of Floating UI, so it’s ok to just copy the code above. However, if you want to learn more, their docs are excellent.
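If async and await are new to you, the following minimal sketch shows the mechanics in isolation. fakeComputePosition() is a made-up stand-in that merely returns a Promise; it is not the real Floating UI function and does no actual positioning.

```javascript
// Stand-in for computePosition(): NOT the real Floating UI function, just
// something that returns a Promise resolving to an {x, y} object.
function fakeComputePosition(x, y) {
	return Promise.resolve({ x: x + 5, y: y + 5 });
}

async function positionTooltip(x, y) {
	// Without await, `position` would be the Promise itself, not {x, y}.
	const position = await fakeComputePosition(x, y);
	return position;
}

// An async function always returns a Promise, so callers either await it
// or use .then() to read the result.
const pending = positionTooltip(10, 20);
pending.then(position => console.log(position.x, position.y));
```

This is why dotInteraction() had to become async: only inside an async function can we await the result of computePosition() before assigning it to tooltipPosition.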

Lastly, replace cursor with tooltipPosition in the style attribute of the tooltip dl element (not the CSS in the <style> block) to actually use this new object.

If you preview now, you should see that the tooltip is always visible and positioned near the hovered dot, regardless of where it is relative to the viewport.

At this point you can also remove the cursor variable and the code setting it since we don’t need it anymore, unless there are other things you want to do where knowing the mouse position is useful.

Step 4: Communicating lines edited via the size of the dots (optional)

Note that the tiniest of edits are currently represented by the same size of dot as the largest of edits. It is common to use the size of the dots to communicate a third variable, in this case the number of lines each commit edited.

Step 4.1: Calculating our scale

We will need to define a new scale to map the number of lines edited to the radius of the dots. This means we need to:

1. Decide on the minimum and maximum radii we want to allow. Here, we can edit the circle r attribute and play around with different radii to decide. I settled on 2 and 30.
2. Calculate the range of values for the number of lines edited by a single commit. For this one, we can use d3.extent(), just like we did for the scatterplot scales in Step 2.1, to find the minimum and maximum value in one go.

Then define a new linear scale (I called it rScale) using d3.scaleLinear(), mapping the domain of the number of lines edited to the range of radii we decided on.
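To see what such a scale computes, here is a plain-JavaScript sketch of the arithmetic. The extent() and linearScale() functions below are simplified stand-ins written for this example, not D3’s own implementations, and sampleCommits is made-up data. In your page you would use d3.extent() and d3.scaleLinear() instead.

```javascript
// Hypothetical sample data; in the lab, `commits` comes from your own repo.
const sampleCommits = [
	{ id: "a", totalLines: 4 },
	{ id: "b", totalLines: 52 },
	{ id: "c", totalLines: 100 },
];

// Roughly what d3.extent() gives us for an array of numbers: [min, max].
function extent(values) {
	return [Math.min(...values), Math.max(...values)];
}

// What a linear scale does: map [d0, d1] proportionally onto [r0, r1].
function linearScale([d0, d1], [r0, r1]) {
	return value => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

const linesExtent = extent(sampleCommits.map(d => d.totalLines));
const radiusScale = linearScale(linesExtent, [2, 30]);
console.log(sampleCommits.map(d => radiusScale(d.totalLines)));
```

The smallest commit gets the minimum radius, the largest gets the maximum, and everything else is interpolated in between.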

Now, in our HTML, instead of a hardcoded r=5, set the circle radius to r={ rScale(commit.totalLines) }.

If everything went well, you should now see dots of different sizes, depending on the number of lines each commit edited!

As one last tweak, apply fill-opacity to the dots to make them more transparent, since the larger they are, the more likely they are to overlap. You could also apply it only when the dots are not hovered, as an extra cue.


Step 4.2: Area, not radius

Hover over a few circles and pay attention to the number of lines they correspond to. What do you notice? The size of the dots is not very good at communicating the number of lines edited. This is because the area of a circle is proportional to the square of its radius (A = πr²), so a commit with double the edits appears four times as large!

To fix this, we will use a different type of scale: a square root scale. A square root scale is a type of power scale that uses the square root of the input domain value to calculate the output range value. Thankfully, the API is very similar to the linear scale we used before, so all we need to do to fix the issue is to just change the function name.
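The effect is easy to check numerically. The sketch below is a simplified model, assuming a domain that starts at 0 and a range that starts at radius 0; the function names are made up for this example. It compares how much larger the area gets when the number of lines doubles. In D3, the square root scale is created with d3.scaleSqrt().

```javascript
// Simplified scales with domain [0, max] and range [0, maxRadius].
function linearRadius(value, max, maxRadius) {
	return (value / max) * maxRadius;
}

function sqrtRadius(value, max, maxRadius) {
	return Math.sqrt(value / max) * maxRadius;
}

const circleArea = r => Math.PI * r * r;

const max = 100;
const maxRadius = 30;

// How many times larger is the dot for 100 lines than the dot for 50 lines?
const linearRatio =
	circleArea(linearRadius(100, max, maxRadius)) /
	circleArea(linearRadius(50, max, maxRadius));
const sqrtRatio =
	circleArea(sqrtRadius(100, max, maxRadius)) /
	circleArea(sqrtRadius(50, max, maxRadius));

console.log(linearRatio, sqrtRatio); // approximately 4 and 2
```

With the square root scale, doubling the lines doubles the area, which is what our eyes compare. With a non-zero minimum radius, such as the 2 used above, the proportionality is no longer exact, but it is still much closer than with a linear scale.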

Step 4.3: Paint smaller dots over larger ones

You may notice that when dots are overlapping, it’s sometimes harder to hover over the smaller ones, if they happen to be painted underneath the larger one.

One way to fix this is to sort commits in descending order of totalLines, which will ensure the smaller dots are painted last. To do that, we can use the d3.sort() method. This would go in your onMount() callback:

commits = d3.sort(commits, d => -d.totalLines);

Why the minus? Because d3.sort() sorts in ascending order by default, and we want descending order, and that’s shorter than writing a custom comparator function.
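Here is a small sketch of the negation trick using made-up data. sortBy() is a plain-JavaScript stand-in that behaves like d3.sort() with an accessor for numeric values; it is not D3 code.

```javascript
// Hypothetical sample data.
const sampleCommits = [
	{ id: "a", totalLines: 12 },
	{ id: "b", totalLines: 150 },
	{ id: "c", totalLines: 3 },
];

// Stand-in for d3.sort(iterable, accessor): returns a NEW array sorted in
// ascending order of the accessor's return value.
function sortBy(items, accessor) {
	return [...items].sort((a, b) => accessor(a) - accessor(b));
}

const ascending = sortBy(sampleCommits, d => d.totalLines);
const descending = sortBy(sampleCommits, d => -d.totalLines);

console.log(ascending.map(d => d.id));
console.log(descending.map(d => d.id));
```

Negating the value flips the order: the commit with the most lines has the smallest negated value, so it comes first and is painted first, underneath the smaller dots.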

Step 5: Clickable Commits

Up until now, we have given users an overview of individual commits. However, we might also want to let them analyze aggregated data that spans multiple commits. For this, we need to make our commits selectable in some fashion. In this lab, we will make individual commits selectable by clicking on them. As an optional step later on, we will also look at brushing, but let’s take one step at a time!

Step 5.1: Making commits clickable

Before we can add the right event handling to make click selections possible, we need to make a few changes to our code. For one, we need a variable that holds all the selected commits. For this, we can simply define an array of values in our <script> element:

let clickedCommits = [];

Now, we just need to add commits when they are clicked, and remove commits that were already selected when they are clicked again. This kind of functionality is often referred to as a toggle.

If you remember, we have already done quite a bit of work with our <circle> svg elements and mouse interactions. Therefore, let’s leverage that! Let’s add to our async dotInteraction function the following check:

else if (evt.type === "click") {
	let commit = commits[index];
	if (!clickedCommits.includes(commit)) {
		// Add the commit to the clickedCommits array
		clickedCommits = [...clickedCommits, commit];
	}
	else {
		// Remove the commit from the array
		clickedCommits = clickedCommits.filter(c => c !== commit);
	}
}

Using clickedCommits.includes(commit), we check whether the specific commit is already selected. What this code then does is twofold. If the commit is not part of our array of selected commits, we create a new array that includes it. For this, we use JavaScript’s spread operator to unwrap (or spread) the existing array, add the new commit, and then store the result in place of the original array. If the commit is already there, we want to toggle it “off”, so we filter the array to keep all selected commits except the one clicked.
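To watch the toggle logic work in isolation, outside of Svelte, here is a small sketch. toggleCommit() is a hypothetical function written only for this demonstration, and commitA and commitB are made-up objects. You do not need to add any of it to your page.

```javascript
// Returns a NEW array with `commit` added if it was absent, or removed if
// it was present. The input array is never modified.
function toggleCommit(selected, commit) {
	return selected.includes(commit)
		? selected.filter(c => c !== commit)
		: [...selected, commit];
}

const commitA = { id: "a" };
const commitB = { id: "b" };

let selected = [];
selected = toggleCommit(selected, commitA); // a is now selected
selected = toggleCommit(selected, commitB); // a and b are selected
selected = toggleCommit(selected, commitA); // a is toggled off again

console.log(selected.map(c => c.id));
```

Note that includes() compares objects by identity, which works here because we always pass the very same commit objects. Creating a new array and assigning it, rather than mutating the old one in place, matters in Svelte because reactivity is triggered by assignments.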

The software engineer inside of you might be compelled to put clickedCommits.includes(commit) into a helper function as we will also reuse it in the circle SVG component in the next step. For your mental sanity: don’t, unless you make certain key components reactive. If you choose to do the optional brushing part of this lab, you will see another flavor using the reactive paradigm.

The last thing that remains is to add the dotInteraction function to also trigger on:click:

<circle
	on:click={ evt => dotInteraction(index, evt) }
	<!-- Other attributes/directives that you already have in this element -->
/>

To check whether this works, we can add a console log to the branch of our dotInteraction() function that handles the "click" event. Something like console.log(clickedCommits); should do the trick. If you try it out now, you should see something like this in your browser console:

Step 5.2: Adding the Visuals

So far, clicking only changes the state of an internal variable. However, since we are all about interactive data visualization, let’s add some visual feedback for the user. We do so via a class:selected directive on the <circle> element, reusing the same clickedCommits.includes(commit) check as before:

<circle
	class:selected={ clickedCommits.includes(commit) }
	<!-- Other attributes/directives that you already have in this element -->
/>

You know what to do now, right? We just define a CSS rule for that class like so (you can also specify a custom color, but this will give you nice consistency throughout your page):

.selected {
    fill: var(--color-accent);
}

It should now look something like so:

Step 6: Bar Chart

Now, let’s use the aggregated data to display insightful information about our commits. To achieve this, we’ll create a new bar chart component in src/lib/Bar.svelte.

While you’re welcome to build your own version, we provide a basic implementation here: Bar.svelte.

Step 6.1: Import the Bar Component

Once you have your bar chart component in src/lib/Bar.svelte, let’s import and use it in our routes/meta/+page.svelte file. Just like we did with the Pie component in the previous lab, add the following import statement:

import Bar from '$lib/Bar.svelte';

Step 6.2: Prepare the Data for the Bar Chart

Next, we need to structure the data so that Bar can properly render it. The data should be formatted as an array of arrays, where each sub-array consists of:

A programming language (as a string) in the first element.
The total number of lines of code (LOC) for that language in the second element.

Example:

[
    ["js", 154],
    ["css", 243],
    ["html", 14],
    ["svelte", 176]
]

To generate this data, we first extract all unique programming languages used in our project. We can achieve this using a Set, which inherently stores only unique values, and then cast it to an array:

$: allTypes = Array.from(new Set(data.map(d => d.type)));

Next, we determine which commits should be included - either the selected commits or, if none are selected, all commits:

$: selectedLines = (clickedCommits.length > 0 ? clickedCommits : commits).flatMap(d => d.lines);

Now, we use D3’s rollup function to aggregate the number of LOCs for each language:

$: selectedCounts = d3.rollup(
    selectedLines,
    v => v.length,
    d => d.type
);

Almost there! However, we need to ensure that all languages are included in the dataset, even if they have no corresponding LOCs in the current selection. This guarantees that colors and bar positions remain consistent as the user filters data:

$: languageBreakdown = allTypes.map(type => [type, selectedCounts.get(type) || 0]);
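To make the data flow tangible, here is a plain-JavaScript sketch of the same pipeline on made-up data. countBy() is a simplified stand-in for the d3.rollup() call above (it only counts items per key), and sampleLines and the hard-coded allTypes are hypothetical.

```javascript
// Hypothetical selection: five lines of code, none of them HTML.
const sampleLines = [
	{ type: "js" },
	{ type: "js" },
	{ type: "css" },
	{ type: "svelte" },
	{ type: "js" },
];
const allTypes = ["js", "css", "html", "svelte"];

// Stand-in for d3.rollup(lines, v => v.length, d => d.type):
// builds a Map from each key to the number of items with that key.
function countBy(items, key) {
	const counts = new Map();
	for (const item of items) {
		const k = key(item);
		counts.set(k, (counts.get(k) || 0) + 1);
	}
	return counts;
}

const selectedCounts = countBy(sampleLines, d => d.type);

// Languages missing from the selection fall back to 0 instead of undefined.
const languageBreakdown = allTypes.map(type => [type, selectedCounts.get(type) || 0]);
console.log(languageBreakdown);
```

Even though no HTML lines are in the selection, "html" still appears in the result with a count of 0, so the bar keeps its slot and its color.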

Step 6.3: Insert the Bar Chart

In the page body, right below our <svg> element where we draw the <circle> elements, add the Bar component:

<Bar data={languageBreakdown} width={width} />

Here, data={languageBreakdown} passes the processed data to the Bar component, and width={width} ensures that it spans the same width as our scatterplot. These are uni-directional bindings, meaning Bar will update when the data changes, but cannot modify the values itself.

Step 6.4: See It in Action

If you’ve followed along, your bar chart should now be integrated! It should dynamically update as different commits are selected, providing a visual breakdown of programming languages (don’t worry about your bar chart having a different width than in the video).

Pretty cool, right? But wait! There’s more for those who want to go even deeper into customization.

Step 7: Brushing (Optional)

In Step 5.1, we enabled clicking various commits and thus selecting them, allowing us to adapt other components (such as Bar) to the selection. As discussed in the A Tour through the Interaction Zoo lecture, brushing can be an effective interaction technique for selecting multiple data points in a visualization.

Once points are selected, we can further explore the dataset by displaying more data.

Step 7.1: Setting up the brush

Exactly because brushing is so fundamental to interactive charts, D3 provides a module called d3-brush to facilitate just that.

To use it, we need a reference to our <svg> element, so we use bind:this. More specifically, create a variable svg in your <script> and use bind:this={svg} in the top <svg> component (the one where we have also defined the viewBox).

We then create the brush through a reactive statement like this:

$: d3.select(svg).call(d3.brush());

Try it! You should already be able to drag a rectangle around the chart, even though it doesn’t do anything yet.

Step 7.2: Getting our tooltips back

Did you notice that now that we can brush, our tooltips disappeared and that we can’t select any data points? 😱 What happened?!

If you inspect the chart, you will find the culprit:

So what is happening here? To make the brush work, D3 adds a rectangle overlay over the entire chart that catches all mouse events. Because of this, our circles never get hovered, and thus our tooltips never show and no selection can take place.

Since SVG elements are painted in source order, to fix this we need the overlay to come before the dots in the DOM tree. D3 provides a selection.raise() method that moves one or more elements to the end of their parent, maintaining their relative order.

Therefore, to move the overlay to be before the dots, we will “raise” the dots and everything that comes after the overlay.

First, let’s convert the single-line reactive statement to a reactive block:

$: {
	d3.select(svg).call(d3.brush());
}

Then, inside the reactive block, after the brush is created, we raise the dots and everything after the overlay:

d3.select(svg).selectAll(".dots, .overlay ~ *").raise();

That’s a funny looking selector, isn’t it? The ~ is the CSS subsequent sibling combinator and it selects elements that come after the selector that precedes it (and share the same parent).

Try it: you should now see that the tooltips are back, and the brush still works!

Step 7.3: Styling the selection rectangle (optional)

The overlay is not the only element added by d3.brush(). For example, there is a <rect class="selection"> element that is used to depict the brush selection. This means you can use CSS to style it!

Just make sure to use the Svelte-specific :global() pseudo-class around .selection, otherwise Svelte will drop the whole rule, as it thinks it’s unused CSS.

Here’s what I did, but feel free to experiment with your own styles:

@keyframes marching-ants {
	to {
		stroke-dashoffset: -8; /* 5 + 3 */
	}
}

svg :global(.selection) {
	fill-opacity: 10%;
	stroke: black;
	stroke-opacity: 70%;
	stroke-dasharray: 5 3;
	animation: marching-ants 2s linear infinite;
}

Step 7.4: Making the brush actually select dots

So far we can draw a nicely animated selection box, but it neither does anything, nor does it look like it does anything.

The first step is to actually figure out what the user has selected, both in terms of visual shapes (dots) so we can style them as selected, as well as in terms of data (commits) so we can allow the user to use brushing instead of clicking on every single commit to select them.

d3.brush() returns a brush object, which actually fires events when the brush is moved. We can use .on() to listen to these events and do something when they happen.

Let’s start by simply logging them to the console. Let’s define a function called brushed() that takes an event object as an argument and logs it to the console:

function brushed (evt) {
	console.log(evt);
}

Then, we use .on() to call this function when the brush is moved:

d3.select(svg).call(d3.brush().on("start brush end", brushed));

This line can replace your existing d3.select(svg).call(d3.brush()) code.

Open your browser console (if it’s not already open) and try brushing again. You should see a flurry of events logged to the console, a bit like this:

Try exploring these objects by clicking on the ▸ icon next to them.

You may notice that the selection property of the event object is an array of two points. These points represent the top-left and bottom-right corners of the brush rectangle. This array is the key to understanding what the user has selected.

Let’s create a new reactive variable that stores this selection array. I called it brushSelection. Then, inside the brushed() function, we can remove the console.log statement and set brushSelection to evt.selection like so:

$: brushSelection = null;

function brushed (evt) {
	brushSelection = evt.selection;
}

Now, thinking back to what we’ve done with the manual selection of commits, we can piggyback off the clickedCommits array and make commit selection possible through both clicking and brushing! Thanks to the work we have done before, we get the selected styling of the circles, i.e. the difference in color, for free. What we still need is a function that tells us whether a given commit lies inside the brush selection. Here is a skeleton for it:

function isCommitBrushed (commit) {
	if (!brushSelection) {
		return false;
	}
	// TODO return true if commit is within brushSelection
	// and false if not
}

The core idea for the logic is to use our existing xScale and yScale scales to map the commit data to X and Y coordinates, and then check if these coordinates are within the brush selection bounds.

Another way to do it is to use the D3 scale.invert() to map the selection bounds to data, and then compare data values, which can be faster if you have a lot of data, since you only need to convert the bounds once.

Can you figure out how to do it?

Show solution

There are many ways to implement this logic, but here’s one:

let min = {x: brushSelection[0][0], y: brushSelection[0][1]};
let max = {x: brushSelection[1][0], y: brushSelection[1][1]};
let x = xScale(commit.date);
let y = yScale(commit.hourFrac);
return x >= min.x && x <= max.x && y >= min.y && y <= max.y;
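For comparison, here is a sketch of the alternative approach based on scale.invert(). It uses a hand-rolled linearScale() stand-in and made-up scales and field names (day, hourFrac), so treat it as an illustration of the idea rather than drop-in code. With real D3 scales you would call xScale.invert() and yScale.invert() directly, and a time scale’s invert() returns a Date.

```javascript
// Minimal stand-in for a D3 linear scale, including an invert() method.
function linearScale([d0, d1], [r0, r1]) {
	const scale = value => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
	scale.invert = pixel => d0 + ((pixel - r0) / (r1 - r0)) * (d1 - d0);
	return scale;
}

// Hypothetical scales: x maps a day number, y maps the hour of the day.
// The y range is flipped (bottom to top), as is typical for SVG charts.
const xScale = linearScale([0, 10], [0, 1000]);
const yScale = linearScale([0, 24], [600, 0]);

// Convert the brush rectangle from pixels to data values ONCE.
// Sorting guards against flipped ranges such as the one in yScale.
function selectionToDataBounds([[x0, y0], [x1, y1]]) {
	const xs = [xScale.invert(x0), xScale.invert(x1)].sort((a, b) => a - b);
	const ys = [yScale.invert(y0), yScale.invert(y1)].sort((a, b) => a - b);
	return { x: xs, y: ys };
}

// Each commit is then compared in data space; no per-commit scaling needed.
function isInBounds(commit, bounds) {
	return commit.day >= bounds.x[0] && commit.day <= bounds.x[1]
		&& commit.hourFrac >= bounds.y[0] && commit.hourFrac <= bounds.y[1];
}

const bounds = selectionToDataBounds([[100, 150], [500, 450]]);
console.log(bounds);
console.log(isInBounds({ day: 3, hourFrac: 12 }, bounds));
console.log(isInBounds({ day: 8, hourFrac: 12 }, bounds));
```

The trade-off is that the bounds are converted once per brush event instead of converting every commit, at the cost of having to remember that a flipped range also flips the minimum and maximum.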

We can make use of this function and get an array of brushedCommits by adding:

$: brushedCommits = brushSelection ? commits.filter(isCommitBrushed) : [];

This filters the commits down to the ones the brush encompasses, and gives us an empty array if there is no brush selection.

Now, we know which commits are click-selected (through clickedCommits) and which ones are brushed (through brushedCommits). Since we want to enable selection through both jointly, how can we combine these two? Why not simply merge the two arrays, making sure that every commit is present just once? Here we again make use of the spread operator and do the following:

$: selectedCommits = Array.from(new Set([...clickedCommits, ...brushedCommits]));
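The sketch below shows this merge on made-up commit objects, and also what the Set does and does not deduplicate.

```javascript
// Hypothetical commit objects.
const commitA = { id: "a" };
const commitB = { id: "b" };
const commitC = { id: "c" };

const clicked = [commitA, commitB];
const brushed = [commitB, commitC]; // commitB is both clicked and brushed

// A Set keeps each object once, in order of first appearance.
const merged = Array.from(new Set([...clicked, ...brushed]));
console.log(merged.map(c => c.id));

// Objects are compared by identity, not by content: a different object
// with the same fields counts as a separate entry.
const lookalike = { id: "a" };
const withLookalike = Array.from(new Set([commitA, lookalike]));
console.log(withLookalike.length);
```

This works for our page because clickedCommits and brushedCommits both hold references to the same objects from the commits array.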

Last but not least, we replace clickedCommits with selectedCommits wherever we want to consider both the brushed and the clicked circles. That means changing the following:

<script>
	// Omitting all the other code for clarity

	$: selectedLines = (clickedCommits.length > 0 ? clickedCommits : commits).flatMap(d => d.lines);
</script>

<circle
	class:selected={ clickedCommits.includes(commit) }
	<!-- Your other elements ... -->
/>

Changed to:

<script>
	// Omitting all the other code for clarity

	$: selectedLines = (selectedCommits.length > 0 ? selectedCommits : commits).flatMap(d => d.lines);
</script>

<circle
	class:selected={ selectedCommits.includes(commit) }
	<!-- Your other elements ... -->
/>

If everything comes together nicely, it should look somewhat similar to this:

Step 7.5: Showing count of selected commits

Lastly, you might find it a good idea to inform your user of the number of commits they have selected in total. That’s a quick one! Just add the following reactive statement:

$: hasSelection = selectedCommits && selectedCommits.length > 0;

Now let’s display the number of selected commits in the HTML, under the chart:

<p>{hasSelection ? selectedCommits.length : "No"} commits selected</p>

If it works, it should look a bit like this:

Wow! Well done! You deserve a break. Thanks for following along once again, and we’re looking forward to having you back for the next lab!

Actually, it will only include commits that still have an effect on the codebase, since it’s based on lines of code that are currently present in the codebase. Therefore if all a commit did was change lines that have since been edited by other commits, that commit will not show up here. If we wanted to include all commits, we’d need to process the output of git log instead, but that is outside the scope of this lab.
