Cumulative Weighted Choice

Specification

Function Template: cumulative_weighted_choice(weights, values) -> Value

Primary Feature. Takes two vectors by const reference, weights and values. Returns a random value when called, where the probability of any given value is based on its corresponding weight. Cumulative weights must be unique and sorted in ascending order. The size of the two vectors must be the same. Value type can be almost anything. Weight type must be double.

@param const vector<double>& weights.
@param const vector<Value>& values.
@return :: Weighted random Value from values.

Function Template: cumulative_from_relative(rel_weights) -> vector<Weight>

Utility Function. Takes a reference to a vector of relative weights and returns a new vector of cumulative weights.

@param const vector<Weight> & rel_weights :: relative weights, can be any numeric type.
@return :: vector<Weight>, cumulative weights. Each weight is guaranteed to be unique and in order from low to high.

Function Template: relative_from_cumulative(cum_weights) -> vector<Weight>

Utility Function. Takes a reference to a vector of cumulative weights and returns a new vector of relative weights.

@param const vector<Weight>& cum_weights :: cumulative weights, can be any numeric type.
@return :: vector<Weight>, relative weights.

CWC Implementation

// File: WeightedChoice.hpp
// Author: Robert Sharp
// Standard: C++17
#pragma once
#include <random>     // random_device, mt19937_64, uniform_real_distribution
#include <vector>     // vector
#include <algorithm>  // lower_bound, partial_sum, adjacent_difference
#include <iterator>   // distance

namespace WeightedChoice {

template<typename Value>
auto cumulative_weighted_choice(const std::vector<double> & weights, const std::vector<Value> & values) -> Value {
    static std::random_device hardware_seed;
    static std::mt19937_64 engine { hardware_seed() };
    assert(weights.size() == values.size());
    const auto max_weight { weights.back() };
    std::uniform_real_distribution<double> distribution { 0.0, max_weight };
    const auto raw_weight { distribution(engine) };
    const auto valid_weight { std::lower_bound(weights.cbegin(), weights.cend(), raw_weight) };
    const auto result_idx { std::distance(weights.cbegin(), valid_weight) };
    return values[result_idx];
}

template<typename Weight>
auto cumulative_from_relative(const std::vector<Weight> & rel_weights) -> std::vector<Weight> {
    std::vector<Weight> cum_weights(rel_weights.size());
    std::partial_sum(rel_weights.cbegin(), rel_weights.cend(), cum_weights.begin());
    return cum_weights;
}

template<typename Weight>
auto relative_from_cumulative(const std::vector<Weight> & cum_weights) -> std::vector<Weight> {
    std::vector<Weight> rel_weights(cum_weights.size());
    std::adjacent_difference(cum_weights.cbegin(), cum_weights.cend(), rel_weights.begin());
    return rel_weights;
}

} // end WeightedChoice namespace

CWC Example Usage

// File main.cpp
// Author: Robert Sharp
// Standard: C++17
#include <iostream>
#include <string>
#include "WeightedChoice.hpp"


int main() {
    const std::vector<double> rel_weights { 0.4, 0.3, 0.2, 0.05, 0.01, 0.01, 0.01, 0.01, 0.01 };
    const std::vector<double> cum_weights { WeightedChoice::cumulative_from_relative(rel_weights) };
    const std::vector<std::string> values {
        "A: Common 40%",
        "B: Uncommon 30%",
        "C: Rare 20%",
        "D: Epic 5%",
        "E: Legendary1 1%",
        "F: Legendary2 1%",
        "G: Legendary3 1%",
        "H: Legendary4 1%",
        "I: Legendary5 1%",
    };

    std::cout << "20 Random Weighted Values:" << '\n';
    for (auto i {0}; i < 20; ++i) {
         std::cout << WeightedChoice::cumulative_weighted_choice(cum_weights, values) << '\n';
    }
    std::cout << std::endl;
}

Typical Output

$ clang++ main.cpp -std=c++17 -stdlib=libc++ -Ofast -o test.out
$ ./test.out
20 Random Weighted Values:
C: Rare 20%
B: Uncommon 30%
A: Common 40%
D: Epic 5%
A: Common 40%
B: Uncommon 30%
A: Common 40%
A: Common 40%
A: Common 40%
B: Uncommon 30%
A: Common 40%
B: Uncommon 30%
A: Common 40%
A: Common 40%
A: Common 40%
A: Common 40%
D: Epic 5%
C: Rare 20%
B: Uncommon 30%
A: Common 40%

Typical Distribution of 1,000,000 Samples

A: Common 40%: 39.9805%
B: Uncommon 30%: 30.0106%
C: Rare 20%: 19.9437%
D: Epic 5%: 5.0529%
E: Legendary1 1%: 1.0041%
F: Legendary2 1%: 1.0095%
G: Legendary3 1%: 0.9878%
H: Legendary4 1%: 1.0092%
I: Legendary5 1%: 1.0017%

Typical Timing: Small Set

Testbed: 2016 Intel Core i7 Skylake 2.7GHz Quad, 16GB RAM, 1TB SSD
Microbench, average time per call:

cumulative_weighted_choice(cum_weights, values): 40 ± 8 ns
cumulative_from_relative(rel_weights): 90 ± 32 ns
relative_from_cumulative(cum_weights): 90 ± 32 ns

Theoretical Complexity

cumulative_weighted_choice(): Same as std::lower_bound plus some small constant. The number of comparisons performed is logarithmic in the number of weighted values. O(log2(n))

cumulative_from_relative(): Same as std::partial_sum: O(n)

relative_from_cumulative(): Same as std::adjacent_difference: O(n)