Wrapping and truncating chart labels in NVD3 horizontal bar charts

Richard D Jones
Published in ITNEXT
14 min read, Aug 21, 2018

Some very very very very very very very long chart labels in my data

If your horizontal bar chart labels are too long for your left margin, by default in NVD3 the labels will simply overflow the space and disappear off to the left, which is annoying and looks unprofessional. On these charts, space for the vertical axis labels is at a premium, so to do a good job it’s not enough to implement word-level wrapping; we also need to incorporate hyphenated wrapping and label truncation.

One of the reasons that space can be limited for these labels is the temptation to make your horizontal bars around a single text line height. In that case label wrapping is not going to save you (though the truncation might). I’d recommend when dealing with data that has long labels that you consider also increasing your bar thickness to accommodate, with the bonus that I think this makes for better looking, more user-friendly charts.

This post describes an algorithm that can be used with NVD3, and indeed any SVG-based display, to take a long string, wrap it (using hyphens if needed) and truncate it when the space runs out.

(Note that Mike Bostock has a wrapping solution for this, though it wraps at the word level and does not truncate; the algorithm below uses similar concepts and extends them. In NVD3 there is also a wrapLabels:true option on the Discrete Bar Chart (which I assume uses Mike’s solution), but that is not available on the horizontal multibar.)

The Method

Before we get into the code, let’s consider the overall approach we’re going to take.

We can think about the problem in terms of fitting the maximum amount of text into a box with a certain width and height. Exactly how much text will fit in that box will depend on the text size, the line height, and — if using a variable-width font — exactly which letters are involved.

How much text can we fit in a box that’s the width of your margin and the height of your bar?

We can solve the problem by attacking the horizontal and vertical separately:

First, separate the text into lines which will fit horizontally into the space available. Second, distribute the text vertically (that is, place all the lines in sequence one above the other) and determine if they fit in the box. Finally, reduce the number of text lines until those that remain fit in the box.

Text that overflows vertically needs to be reduced until it fits in the space available
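As a rough back-of-the-envelope check on the vertical step, the number of lines that can fit is bounded by the box height divided by the height of one line. The sketch below assumes a font size in pixels and a line height in em units; the name estimateMaxLines and the sample values are hypothetical, and the actual code later in this post measures the rendered text rather than estimating.

```javascript
// Rough estimate only: assumes each line occupies fontSizePx * lineHeightEm pixels.
// estimateMaxLines is a hypothetical helper, not part of the wrapping function.
function estimateMaxLines(maxHeightPx, fontSizePx, lineHeightEm) {
    var linePx = fontSizePx * lineHeightEm;
    // at least one line is always kept, even if it overflows the box
    return Math.max(1, Math.floor(maxHeightPx / linePx));
}

console.log(estimateMaxLines(40, 12, 1.2)); // 40 / 14.4, rounded down: 2
```

With a 40px bar and 12px text we should expect around two lines, which is why truncation matters as much as wrapping.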

One of the key details we want to include in order to maximise the small amount of space available is to hyphenate words where possible. Unfortunately, hyphenation doesn’t always look right — you don’t want to do it after the first letter or two of the word, for example. Also, hyphenation introduces a new character into the text, which then needs to be accounted for in width calculations, so you can’t just throw a hyphen onto the end of a line which breaks a word.

Here’s a flowchart that defines a workable and horizontal-space-maximising rule for hyphenation:

An algorithm that hyphenates words in a reasonably user-friendly way

It works like this: at the end of our horizontal space, if we find ourselves in the middle of a word, find out how many letters it is back to the previous word break. If the distance to the word break is less than some minimum limit, then we don’t try to hyphenate, we just backtrack to the start of the word and break there. This prevents us from doing things like “m-e” or “yo-u”, which would look weird. We’ll set our minimum to something like 3.

If, on the other hand, the distance to the word break is greater than this minimum, then we should try to hyphenate. We do this by backtracking one letter and then adding a hyphen. Now we need to check if the line still fits horizontally, in case the hyphen is wider than the letter it replaced.

If the line fits, great, we’re done! Otherwise we remove the hyphen and keep backtracking and hyphenating until the line fits in the space provided. If we get to the point where the number of letters left at the start of the word drops below our minimum, we just wrap the line at the word boundary.

This algorithm is the simplest hyphenation strategy that I’ve found that provides reasonable results for highly constrained spaces. In reality hyphenation is complex, and in typesetting there are even rules about where to divide different kinds of words (like, at syllable boundaries), which are way too complex for us. Also, our algorithm only considers whether the start of the word looks right, and doesn’t care about the end of the word. So, we can see wraps like “lon-g” or “joi-n” which start well, but end weirdly. We’re going to overlook that for the time being, as fixing it can make the horizontal space usage less efficient.
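To make the rule concrete before we bring in the DOM, here is a minimal sketch of the same decision on plain strings. It makes a simplifying assumption: every character, including the hyphen, is exactly one unit wide, so the “is the hyphen wider than the letter it replaced” retry is never needed. The names breakLine and wrapAll are hypothetical and are not the functions used later in this post.

```javascript
// Sketch: break one line off the front of `text`, assuming a fixed-width font
// where maxChars characters fit on a line. Returns the line and the remainder.
function breakLine(text, maxChars, minChunk) {
    if (maxChars < 2) {
        throw new Error("maxChars must leave room for a letter and a hyphen");
    }
    if (text.length <= maxChars) {
        return { line: text, rest: "" };
    }
    var isBreak = function (ch) { return ch === " " || ch === "\t"; };

    // The cut falls on a word boundary, so no hyphen is needed.
    if (isBreak(text[maxChars - 1]) || isBreak(text[maxChars])) {
        return { line: text.slice(0, maxChars).trim(), rest: text.slice(maxChars).trim() };
    }

    // We are mid-word: find the start of the word that straddles the cut.
    var wordStart = maxChars;
    while (wordStart > 0 && !isBreak(text[wordStart - 1])) {
        wordStart--;
    }

    // Letters of that word that fit once one slot is reserved for the hyphen.
    var fits = (maxChars - 1) - wordStart;
    if (wordStart === 0 || fits >= minChunk) {
        var cut = maxChars - 1;
        return { line: text.slice(0, cut) + "-", rest: text.slice(cut) };
    }

    // Too few letters would precede the hyphen: wrap at the word boundary.
    return { line: text.slice(0, wordStart).trim(), rest: text.slice(wordStart) };
}

// Repeatedly break lines off until the text is used up.
function wrapAll(text, maxChars, minChunk) {
    var lines = [];
    var rest = text.trim();
    while (rest.length > 0) {
        var result = breakLine(rest, maxChars, minChunk);
        lines.push(result.line);
        rest = result.rest;
    }
    return lines;
}

console.log(breakLine("very long labels", 8, 3).line);   // "very" ("lo-" would be too short)
console.log(breakLine("a hyphenated label", 8, 3).line); // "a hyphe-"
```

A word that starts the line is always hyphenated in this sketch, since there is no earlier word boundary to fall back to.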

The Code

The full function for doing this can be found at the bottom of this post. We’ll step through the important bits so you can see how it works.

We’re going to do this as a single function, with other functions defined inside it. We’ll be passing in a bunch of parameters that will define the environment (the closure).

function wrapLabels(params) {
    var axisSelector = params.axisSelector;
    var maxWidth = params.maxWidth;
    var maxHeight = params.maxHeight;
    var lineHeight = params.lineHeight || 1.2;
    var wordBreaks = params.wordBreaks || [" ", "\t"];
    var minChunkSize = params.minChunkSize || 3;
    // implementation goes here
}
  • axisSelector — a selector string that can be used by d3 to select the axis whose labels we’re going to wrap.
  • maxWidth — the maximum width of the box the text needs to fit into. This is probably equal to the left margin that you give the bar chart
  • maxHeight — the maximum height of the box the text needs to fit into. This is probably equal to the bar thickness
  • lineHeight — the line height you want to use, which defines the separation required between lines of text
  • wordBreaks — a list of characters you regard as the break points between words. We default here to space and tab.
  • minChunkSize — the minimum number of letters allowed before a hyphen when breaking a word.

Once we’ve drawn our chart using NVD3 in the usual way, we then apply the wrapping as a post-render modification. To do that, we need to grab each axis label and perform our transformation on it. We’ll break the algorithm down exactly as we described above: separate, distribute, reduce:

d3.selectAll(axisSelector + " .tick text").each(function(i, e) {
    var text = d3.select(this);
    var tspans = separate(text);
    do {
        distribute(text, tspans);
    }
    while (reduce(text, tspans))
});

We grab the text for each label, and we use our separate function to convert this into a list of tspan objects. Then we distribute those tspans vertically and then apply a reduce function which returns true if we reduced the text, or false if the text now fits.

Separate

Separating the text is the most complex function, because it involves the hyphenation.

We’re defining a function which takes the text node of the axis label:

function separate(text) { ... }

First we grab the text content as an array of characters, then replace the text content with a tspan that we can use to check the text width:

var chars = text.text().trim().split("");
text.text(null);
var x = text.attr("x");
var tspan = text.append("tspan").attr("x", x).attr("y", 0);

Now we’re going to chomp through that chars array, building up each line until it is full, and applying our hyphenation algorithm:

var lines = [];
var currentLine = [];
while (chars.length > 0) {
    var char = chars.shift();
    ...
}

We have created two registries — one to record the list of lines and the other to record the progress on the currentLine. We then carry out a loop until the chars array is empty. We can’t for-each through the chars array because we’re going to be tracking forward and backward in that array, so we won’t necessarily consume only one character per iteration. Then our first act is to shift the first character off the front of the array.

The next step is simple, just add the first char to the currentLine, and put the currentLine into the tspan:

currentLine.push(char);
tspan.text(currentLine.join(""));

Now we check whether the line fits, and if not then make it fit and handle the hyphenation. We’ll look at the whole thing, and then discuss it below:

var maxed = false;
var hyphenated = false;
while(_isTooLong(tspan)) {
    maxed = true;
    if (hyphenated) {
        currentLine.splice(currentLine.length - 1);
        hyphenated = false;
    }
    _backtrack(1, currentLine, chars);
    if (_isMidWord(currentLine, chars)) {
        var toPrevSpace = _toPrevSpace(currentLine);
        if (toPrevSpace === -1 || toPrevSpace - 1 > minChunkSize) {
            _backtrack(1, currentLine, chars);
            currentLine.push("-");
            hyphenated = true;
        } else {
            _backtrack(toPrevSpace, currentLine, chars);
        }
    }
    currentLine = currentLine.join("").trim().split("");
    tspan.text(currentLine.join(""));
}

We start with a question: is this line too long? If it is not, none of this code executes, and the chars array continues to be consumed until the line is too long.

Ignoring maxed and hyphenated for the moment, the next thing that happens is we backtrack 1 character. This removes the last character from currentLine and places it back in chars.
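To see what backtracking does to the two arrays, here is the helper from the full listing at the end of this post, run against a hand-made line. The sample strings are made up for illustration.

```javascript
// Same logic as the _backtrack helper in the full listing: move `count`
// characters from the end of currentLine back onto the front of remainder.
function backtrack(count, currentLine, remainder) {
    for (var i = 0; i < count; i++) {
        remainder.unshift(currentLine.pop());
    }
}

var currentLine = "long lab".split("");
var chars = "els".split("");

backtrack(1, currentLine, chars);
console.log(currentLine.join("")); // "long la"
console.log(chars.join(""));       // "bels"
```

Nothing is lost: the character simply moves from one array to the other, so the next line picks it up.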

Now we ask another question: are we in the middle of a word? If not, then the currentLine and tspan are updated (in the final two lines of the loop), and then the loop will terminate on the next iteration as the line will no longer be too long. If we didn’t want to hyphenate, this would be enough, as we would have filled the line to the max with characters from our chars array.

If we are in the middle of a word, then apply our hyphenation algorithm: first work out how far it is to the previous space. This can be -1 if we get to the beginning of the current line before we find a space, or some number if we find a space. That number counts the space itself, so the number of letters of the word on the line is one less.

If that letter count is not greater than our minChunkSize we aren’t going to hyphenate (we’d end up with text like “m-e” otherwise), so we backtrack to the start of the word and finish there.

Otherwise, we backtrack 1 character, insert a “-” and continue, and note we set the tripwire hyphenated to true. This comes into play on the next iteration of the loop. If we start the loop with hyphenated set to true then the first thing we do is remove the final character (the hyphen) before carrying on. This deals with the possibility that the hyphen is wider than the character it replaced, and allows us to backtrack as many letters as we need to get the line to fit in the tspan.
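The distance check is easy to get off by one, so here is a small sketch of it on plain arrays. toPrevSpace follows the counting used by _toPrevSpace in the full listing, but uses Array.prototype.indexOf instead of jQuery’s $.inArray so it runs anywhere; shouldHyphenate is a hypothetical name for the condition inside the loop.

```javascript
var wordBreaks = [" ", "\t"];
var minChunkSize = 3;

// Distance from the end of the line back to the previous word break,
// counting the break character itself; -1 if the line has no break.
function toPrevSpace(currentLine) {
    for (var i = currentLine.length - 1; i >= 0; i--) {
        if (wordBreaks.indexOf(currentLine[i]) !== -1) {
            return currentLine.length - i;
        }
    }
    return -1;
}

// The condition used in the loop to decide whether to hyphenate.
function shouldHyphenate(currentLine) {
    var n = toPrevSpace(currentLine);
    return n === -1 || n - 1 > minChunkSize;
}

console.log(toPrevSpace("a lab".split("")));      // 4: "lab" plus the space
console.log(shouldHyphenate("a lab".split("")));  // false: 3 letters is not more than 3
console.log(shouldHyphenate("a labe".split(""))); // true: 4 letters on the line
console.log(shouldHyphenate("label".split("")));  // true: no space on the line
```

Because one more letter is backtracked before the hyphen is added, four letters on the line leaves three before the hyphen, which matches the minimum.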

This fully implements our hyphenation algorithm above. I won’t go into detail on the other functions used here except _isTooLong as that contains a key bit of detail about how width checking is done. Here’s the function:

function _isTooLong(tspan) {
    return tspan.node().getComputedTextLength() >= maxWidth;
}

It calls getComputedTextLength on the tspan node and checks whether the result reaches or exceeds maxWidth (taken from the closure in this case). We have to determine overflow this way because we can only compute the width of a DOM element; we cannot compute the width of the text before it goes into the DOM.

We finish up the loop over the chars array with some termination conditions:

if (!maxed && chars.length > 0) {
    continue;
}
if (maxed || chars.length === 0) {
    lines.push(currentLine);
    currentLine = [];
}

If we didn’t max out the line yet, and there are still characters to consume, then carry on. If we did max out the line, or there are no characters left to consume, then record the currentLine in the list of lines and reset the currentLine to empty, ready to be populated with the next line of characters.

We now pop out of the while-loop over the chars array, and construct our full list of tspans to give back to the caller.

while (chars.length > 0) {
    var char = chars.shift();
    // see above for detail...
}
tspan.remove();
var tspans = [];
for (var i = 0; i < lines.length; i++) {
    tspan = text.append("tspan").attr("x", x).attr("y", 0);
    tspan.text(lines[i].join(""));
    tspans.push(tspan);
}
return tspans;

Note that we remove the original tspan we created here — we were only using that element to measure the width of the text, so we clean it up and start fresh once we have our list of lines.

Distribute

Now we have a set of tspan elements containing text which fits horizontally into the space provided, we can think about whether the text fits vertically.

Our separate function aligns the tspans it created with the original x position of the text it replaces, but it leaves the y set to 0, which means all the elements are sitting on top of each other. The distribute function fixes this by arranging the lines vertically, with the centre of the group centred in the space available (which will in turn centre it with respect to the bar in the chart).

function distribute(text, tspans) {
    var pmax = lineHeight * (tspans.length - 1);
    var dy = parseFloat(text.attr("dy"));

    for (var j = 0; j < tspans.length; j++) {
        var pos = (lineHeight * j) - (pmax / 2.0) + dy;
        var tspan = tspans[j];
        tspan.attr("dy", pos + "em");
    }
}

We start by working out the maximum height of the element set (pmax) when they are distributed, and grabbing the dy attribute of the original text element. This tells us the original offset NVD3 used to centre a line of text with the bar. It’ll be something like 0.32em.

Then we simply calculate the dy position of each tspan following a simple formula:

var pos = (lineHeight * j) - (pmax / 2.0) + dy;
  • The line height times the tspan number tells us the raw offset from the top of the set of tspans that we want to move to
  • The maximum height of the block over 2 tells us how far back up to shift the text so that it is in the right relative position to the bar
  • Adding dy reinstates the original offset of the text element to align the centre of the text with the centre of the bar

Once this pos is known, then we simply set the dy of the tspan appropriately.
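To check the arithmetic, here is the same formula pulled out into a standalone function that returns the offsets instead of setting them on tspans. The name lineOffsets and the rounding for display are my own additions for illustration; the values 1.2 and 0.32 are the defaults discussed above.

```javascript
// Sketch of the offset arithmetic in distribute(), without touching the DOM.
function lineOffsets(lineCount, lineHeight, dy) {
    var pmax = lineHeight * (lineCount - 1);
    var offsets = [];
    for (var j = 0; j < lineCount; j++) {
        var pos = (lineHeight * j) - (pmax / 2.0) + dy;
        offsets.push(pos.toFixed(2) + "em"); // rounded for display only
    }
    return offsets;
}

console.log(lineOffsets(3, 1.2, 0.32).join(", ")); // -0.88em, 0.32em, 1.52em
console.log(lineOffsets(1, 1.2, 0.32).join(", ")); // 0.32em
```

Note how the middle of three lines lands on the original 0.32em, and a single line is left exactly where NVD3 put it.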

Reduce

With a set of tspans that fit horizontally, and are distributed vertically, we can now finally determine if they fit into the box.

Our reduce function looks at the used space vs the allotted space, and removes the final element if there is an overflow. As a bonus it also replaces the final letters of the previous line with an ellipsis to indicate truncation.

function reduce(text, tspans) {
    var reduced = false;
    var box = text.node().getBBox();
    if (box.height > maxHeight && tspans.length > 1) {
        tspans[tspans.length - 1].remove();
        tspans.pop();
        var line = tspans[tspans.length - 1].text();
        if (line.length > 3) {
            line = line.substring(0, line.length - 3) + "...";
        }
        tspans[tspans.length - 1].text(line);
        reduced = true;
    }
    return reduced;
}

This uses the SVG method getBBox, which tells us the width, height and position of an element’s bounding box. If the box height is greater than our maximum height and there is more than one tspan left, then we remove the last one from the DOM and from the list in memory. We then replace the final 3 letters of the new last line with an ellipsis (three dots), update the UI with the new text, and return true. If no elements were removed we return false, and this is what makes the function suitable for the do ... while loop we introduced near the start.
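The truncation behaviour can be tried out on plain strings. In this sketch a hypothetical maxLines limit stands in for the getBBox height check, and reduceLines is an illustrative name rather than the function used in the chart code.

```javascript
// Sketch of the reduce step on an array of strings. Returns true if a line
// was removed (so the caller should loop again), false once the lines fit.
function reduceLines(lines, maxLines) {
    if (lines.length > maxLines && lines.length > 1) {
        lines.pop();
        var last = lines[lines.length - 1];
        if (last.length > 3) {
            last = last.substring(0, last.length - 3) + "...";
        }
        lines[lines.length - 1] = last;
        return true;
    }
    return false;
}

var lines = ["Some very", "very long", "chart", "labels"];
do {
    // distribute(text, tspans) would run here in the real loop
} while (reduceLines(lines, 2));

console.log(lines.join(" | ")); // Some very | very l...
```

Each pass drops one line and marks the new last line as truncated, so an intermediate “ch...” line is itself removed on the next pass.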

Bringing it all together

We pull all the code into a single function, which itself contains the other functions we defined, giving us a nice closure:

function wrapLabels(params) {
    var axisSelector = params.axisSelector;
    var maxWidth = params.maxWidth;
    var maxHeight = params.maxHeight;
    var lineHeight = params.lineHeight || 1.2;
    var wordBreaks = params.wordBreaks || [" ", "\t"];
    var minChunkSize = params.minChunkSize || 3;

    function _isMidWord(currentLine, remainder) {...}
    function _toPrevSpace(currentLine) {...}
    function _backtrack(count, currentLine, remainder) {...}
    function _isTooLong(tspan) {...}
    function separate(text) {...}
    function distribute(text, tspans) {...}
    function reduce(text, tspans) {...}

    d3.selectAll(axisSelector + " .tick text").each(function(i, e) {
        var text = d3.select(this);
        var tspans = separate(text);
        do {
            distribute(text, tspans);
        }
        while (reduce(text, tspans))
    });
}

Then all we need to do is call this function each time the chart is updated (and remember to call it the first time the chart is rendered):

function updateChart() {
    chart.update();
    wrapLabels({
        axisSelector: "#mychart .nv-x.nv-axis",
        maxWidth: 200, // the left margin
        maxHeight: 40 // the bar height
    });
}
updateChart();
nv.utils.windowResize(updateChart);

Here the only new important bit is to understand how to select the right axis. On a Horizontal Multibar this is .nv-x.nv-axis and we localise this to #mychart so that we don’t apply the wrapping to all the charts in the page.

In an ideal world chart labels should be short, as this is easiest for the user. Sometimes, though, it’s not possible, and when you are designing generic visualisations that present data that you don’t control you don’t get much choice. In those cases you need to do something to improve on the default “overflow the box” approach that you get from NVD3 and this label wrapping, hyphenating and truncation approach is highly suitable.

Richard is Founder and Senior Partner at Cottage Labs, a software development consultancy specialising in all aspects of the data lifecycle. He’s occasionally on Twitter at @richard_d_jones

PS — here’s the full code snippet

function wrapLabels(params) {
    var axisSelector = params.axisSelector;
    var maxWidth = params.maxWidth;
    var maxHeight = params.maxHeight;
    var lineHeight = params.lineHeight || 1.2;
    var wordBreaks = params.wordBreaks || [" ", "\t"];
    var minChunkSize = params.minChunkSize || 3;

    // note: $.inArray requires jQuery to be loaded on the page
    function _isMidWord(currentLine, remainder) {
        var leftChar = $.inArray(currentLine[currentLine.length - 1], wordBreaks) === -1;
        var rightChar = $.inArray(remainder[0], wordBreaks) === -1;
        return leftChar && rightChar;
    }

    function _toPrevSpace(currentLine) {
        for (var i = currentLine.length - 1; i >= 0; i--) {
            var char = currentLine[i];
            if ($.inArray(char, wordBreaks) !== -1) {
                return currentLine.length - i;
            }
        }
        return -1;
    }

    function _backtrack(count, currentLine, remainder) {
        for (var i = 0; i < count; i++) {
            remainder.unshift(currentLine.pop());
        }
    }

    function _isTooLong(tspan) {
        return tspan.node().getComputedTextLength() >= maxWidth;
    }

    function separate(text) {
        // get the current content then clear the text element
        var chars = text.text().trim().split("");
        text.text(null);

        // set up a registry for the text lines that we will create
        var lines = [];

        // create a tspan for working in - we need it to calculate line widths dynamically
        var x = text.attr("x");
        var tspan = text.append("tspan").attr("x", x).attr("y", 0);

        // record the current line
        var currentLine = [];

        // for each character in the text, push to the current line, assign to the tspan, and then
        // check if we have exceeded the allowed max width
        while (chars.length > 0) {
            var char = chars.shift();
            currentLine.push(char);
            tspan.text(currentLine.join(""));

            var maxed = false;
            var hyphenated = false;
            while(_isTooLong(tspan)) {
                // record that we pushed the tspan to the limit
                maxed = true;

                // if we already added a hyphen, remove it
                if (hyphenated) {
                    currentLine.splice(currentLine.length - 1);
                    hyphenated = false;
                }

                // if we have exceeded the max width back-track 1
                _backtrack(1, currentLine, chars);

                if (_isMidWord(currentLine, chars)) {
                    var toPrevSpace = _toPrevSpace(currentLine);

                    if (toPrevSpace === -1 || toPrevSpace - 1 > minChunkSize) {
                        _backtrack(1, currentLine, chars);
                        currentLine.push("-");
                        hyphenated = true;
                    } else {
                        _backtrack(toPrevSpace, currentLine, chars);
                    }
                }

                currentLine = currentLine.join("").trim().split("");
                tspan.text(currentLine.join(""));
            }

            // if we didn't yet fill the tspan, continue adding characters
            if (!maxed && chars.length > 0) {
                continue;
            }

            // otherwise, move on to the next line
            if (maxed || chars.length === 0) {
                lines.push(currentLine);
                currentLine = [];
            }
        }

        // create all the tspans
        tspan.remove();
        var tspans = [];
        for (var i = 0; i < lines.length; i++) {
            tspan = text.append("tspan").attr("x", x).attr("y", 0);
            tspan.text(lines[i].join(""));
            tspans.push(tspan);
        }

        return tspans;
    }

    function distribute(text, tspans) {
        var imax = tspans.length;
        var pmax = lineHeight * (imax - 1);
        var dy = parseFloat(text.attr("dy"));

        for (var j = 0; j < tspans.length; j++) {
            var pos = (lineHeight * j) - (pmax / 2.0) + dy;
            var tspan = tspans[j];
            tspan.attr("dy", pos + "em");
        }
    }

    function reduce(text, tspans) {
        var reduced = false;
        var box = text.node().getBBox();
        if (box.height > maxHeight && tspans.length > 1) {
            tspans[tspans.length - 1].remove();
            tspans.pop();
            var line = tspans[tspans.length - 1].text();
            if (line.length > 3) {
                line = line.substring(0, line.length - 3) + "...";
            }
            tspans[tspans.length - 1].text(line);
            reduced = true;
        }
        return reduced;
    }

    d3.selectAll(axisSelector + " .tick text").each(function(i, e) {
        var text = d3.select(this);
        var tspans = separate(text);
        do {
            distribute(text, tspans);
        }
        while (reduce(text, tspans))
    });
}

All things data: capture, management, sharing, viz. All-round information systems person. Founder at Cottage Labs. https://cottagelabs.com