Loading...

3 Boost C++ libraries you might not know about

1 year ago

If you’re new to C++, you may not know how the Boost libraries have helped prototype many of the features you now take for granted in the standard.

Before C++ 11, using the Boost Libraries was a must if you wanted to use shared pointers, static assertions, or even create a thread in a portable way!

Today, you may be using Boost because of Asio, Spirit, or Containers. Boost has many, many libraries, and I thought it would be fun to zoom in on three libraries that may be less well-known.

Accumulators

Let’s say you have a bunch of integers (or floats), and you’d like to run stats on them. You know, nothing crazy, count them, average them, min, max. Easy. Then maybe, you’d want the median. Skewness. P99. Mmm. That’s a bit harder. Wait. P99 is not that trivial to get right. And it sounds like there are many optimization opportunities. 

Doing this right sounds like a mission. And we don’t have time for that. We got Rick & Morty episodes to watch.

Lucky for you, Boost has a fantastic library for that: Accumulators. You might have been scared by the documentation or the compilation error messages, but fear not fren, I will show you how easy it is.

Accumulators, is what its name says: an accumulator for statistics. You create an accumulator for a selection of statistics, inject your data as you receive it, and compute values at will.

Let’s start with a simple example: max, average, count:

#include <boost/accumulators/accumulators.hpp>
#include <boost/accumulators/statistics/count.hpp>
#include <boost/accumulators/statistics/max.hpp>
#include <boost/accumulators/statistics/mean.hpp>
#include <boost/accumulators/statistics/stats.hpp>

// order does not matter, just tell accumulator which
// stats you're interested in
using stats_accumulators =
    boost::accumulators::stats<boost::accumulators::tag::count,
    boost::accumulators::tag::mean, 
    boost::accumulators::tag::max>;

// here we'll use double floating points for computation
using stats_accumulators_set = 
    boost::accumulators::accumulator_set<double, stats_accumulators>;

// declare object
stats_accumulators_set acc;

// add bunch of values that would likely come from your program
acc(1.0);
acc(3.0);

// extract count, max, and mean, results will be double
auto mean  = boost::accumulators::mean(acc);
auto max   = boost::accumulators::max(acc);
auto count = boost::accumulators::count(acc);

Now, it gets a bit trickier when you begin to do medians and P99, as you need to configure the cache size properly (the library uses a, configurable, statistical algorithm, thus median is not 100% accurate).

Pro tip: make sure your cache size isn’t too large, as performance can suffer:

#include <boost/accumulators/accumulators.hpp>
#include <boost/accumulators/numeric/functional/complex.hpp>
#include <boost/accumulators/numeric/functional/valarray.hpp>
#include <boost/accumulators/numeric/functional/vector.hpp>
#include <boost/accumulators/statistics/count.hpp>
#include <boost/accumulators/statistics/max.hpp>
#include <boost/accumulators/statistics/mean.hpp>
#include <boost/accumulators/statistics/stats.hpp>
#include <boost/accumulators/statistics/tail_quantile.hpp>

// extract the greatest value
using right_quantile =
    boost::accumulators::tag::tail_quantile<
        boost::accumulators::right>;

// compared to before, 
// we just throw in the right_quantile and can keep the others
using stats_accumulators = boost::accumulators::stats<
    boost::accumulators::tag::count, 
    boost::accumulators::tag::mean,
    boost::accumulators::tag::max, 
    right_quantile>;

using stats_accumulators_set = 
    boost::accumulators::accumulator_set<double, stats_accumulators>;

// configure the cache size, 
// very important for statistical algorithms like median and quantiles
// here we will use a cache of 1,000, 
// I advise going above 10,000 for performance reasons
stats_accumulators_set acc{boost::accumulators::right_tail_cache_size = 1'000};

acc(1.0);
acc(3.0);

// hey Boost.Accumulators, give me the threshold below which 99% of my values are
auto p99 = boost::accumulators::quantile(acc,
    boost::accumulators::quantile_probability = 0.99);

Accumulator delivers good performance, will get the math right and has a comprehensive statistical toolbox. 

Admittedly, because it uses the Boost.Fusion, compilation error messages can be exceedingly arcane just because you forgot an include, and compilation times are not great.

Pros

  • Good performance
  • Comprehensive statistical toolbox
  • Great API

Cons

  • Compilation times 
  • Spartan documentation
  • Examples don’t always compile on modern compilers

Personal notes

I use Accumulator every time I need a bunch of stats and metrics, and I don’t need “insane” performance (in which case, you might need a completely different strategy). Make sure you remove 2 humanity points from your character sheet after successfully using the library for the first time.

Multi Index

We previously published an article about maps in C++, and I’d like to discuss a more advanced use case.

What if you need to access a collection of elements via different keys? Take this “player” class, for example:

struct player
{
    std::string name;
    std::uint32_t uid;
    std::uint8_t level;
};

You may be interested in looking up a player based on their name, UID, or level.

One way would be to have a class with multiple dictionaries and do the necessary mapping. The downside is that it will undoubtedly be inefficient and likely wrong.

Multi-Index allows you to create containers that can be acceded through multiple keys.

Let’s say we want to have our entries indexed by name, but as well ordered by name, as we may be interested in listing players in alphabetical order, and a hash table based on the uid for a quicklookup:

#include <boost/multi_index_container.hpp>
#include <boost/multi_index/member.hpp>
#include <boost/multi_index/hashed_index.hpp>
#include <boost/multi_index/ordered_index.hpp>

struct player_uid{};
struct player_name{};

// we potentially have different players with the same name and we want them ordered inside our set
using name_index = boost::multi_index::ordered_non_unique<
    boost::multi_index::tag<player_name>, 
        BOOST_MULTI_INDEX_MEMBER(player, std::string,name)>;

// make a hash index for uid
using uid_index = boost::multi_index::hashed_unique<
    boost::multi_index::tag<player_uid>, 
        BOOST_MULTI_INDEX_MEMBER(player,std::uint32_t, uid)>;

// now the set
using players_set =
  boost::multi_index_container<player,
      boost::multi_index::indexed_by<name_index, uid_index>>;


players_set s;

s.insert(player{"Bob", 2, 1});
s.insert(player{"Alice", 1, 1});
s.insert(player{"Oscar", 3, 1});

// all players names, sorted alphabetically
{
    const auto & r = boost::multi_index::get<player_name>(s);
    for(auto it = r.begin(); it != r.end(); ++it)
    {
        std::cout << it->name << std::endl;
    }
}

// look up for player uid 2
{
    // get the index
    const auto & r = boost::multi_index::get<player_uid>(s);
    // lookup behavior is like a map
    const auto it = r.find(2);
    if (it != r.end())
    {
        std::cout << it->name << std::endl;
    }
}

Pros

  • Can be (uniquely or not) ordered and unordered 
  • Good spatial and time efficiency
  • Supports node extraction

Cons

  • Indexing makes modifying entries more involved than with regular containers
  • No concurrent updates
  • Hash table isn’t state of the art performance

Personal notes

When I need to access a structure through different keys, I default to MultiIndex, and use it as a performance baseline.

Dynamic bitset

One day you wanted to store a set of bits. std::bitset is excellent, but the size is set at compile time. So you went for std::vector<bool>.

Little did you know that this container is only used to open a portal to the abyss and summon Prince Demogorgon (note: summoned tanar’ri is compiler-dependent). 

Worry not, boost has got you covered with an easy-to-use library: Dynamic Bitset. You can configure the underlying integer to use as well as the allocator, and it supports all fancy operations you might want to do on a bitset.

#include <boost/dynamic_bitset.hpp>

// 16 bits, all set to 0
boost::dynamic_bitset<std::uint64_t> x{16};

assert(x.none());
assert(!x.all());

// set first and third bit to 1
x[0] = x[2] = 1;

assert(!x.none());
assert(!x.all());
assert(x.any());

// add 1 bit to the end
x.push_back(1);

// flip the 2nd bit
x.flip(1);

// set it all back to 0
x.reset();

Dynamic bitset has native support for Serialization. It’s possible to open up access to the internal bitfield to write your zero-copy serialization code using the macro BOOST_DYNAMIC_BITSET_DONT_USE_FRIENDS.

Pros

  • Straightforward, clear, unsurprising API
  • Comprehensive
  • Good out-of-the-box performance
  • Base storage integer is customizable
  • Fast compilation

Cons

  • Custom serialization requires the use of an undocumented macro
  • No SIMD optimizations

Personal notes

A dynamic bitset might sounds like a mundane library, but it’s much more work than it looks to implement correctly. I like Dynamic Bitset because it does the job, it’s included in Boost, and it saved me a lot of time. 

To be continued

My goal in writing this article was to share some of Boost’s hidden gems.

Your organization may be conservative about adding new libraries, so if Boost is already accepted, you might want to be sure you make the most of it!

And you? What would be your three favorite boost libraries?

If you love C++ and are looking for cool problems to work on, this is your reminder that we are hiring!


Related Tags:
Top