## Monday, June 30, 2014

### Chapter 6, part 7 of 8: Monte Carlo simulations

Hello everybody.

This is the seventh in a series of posts that started here and covers chapter 6 of my book; that is, the Monte Carlo framework.

A week or two ago, I issued a call to arms on the QuantLib mailing list. We have quite a few old items in our bug and patch trackers, and we should see if they're still relevant. A few have been closed already, but if you want in on the action and help triage the rest, please write to the mailing list.

Do I still need to tell you about my Introduction to QuantLib Development course? Apparently I do. It's three days of lectures and exercises based on Implementing QuantLib, it's in London from September 22nd to 24th, and more information, a brochure and registration form are available here.

Follow me on Twitter if you want to be notified of new posts, or add me to your circles, or subscribe via RSS: the widgets for that are in the sidebar, at the top right of the page. Also, make sure to check my Training page.

### Monte Carlo simulations

Up one step in complexity from the Monte Carlo model of last post and we get to the McSimulation class template, sketched in listing 6.15. Its job is to drive the simple-minded MonteCarloModel towards a goal, be it a required accuracy or number of samples. It can be used (and it was designed) as a starting point to build a pricing engine.

Listing 6.15: Sketch of the McSimulation class template.
    template <template <class> class MC, class RNG,
              class S = Statistics>
    class McSimulation {
      public:
        typedef typename MonteCarloModel<MC,RNG,S>::path_generator_type
            path_generator_type;
        typedef typename MonteCarloModel<MC,RNG,S>::path_pricer_type
            path_pricer_type;
        typedef typename MonteCarloModel<MC,RNG,S>::stats_type
            stats_type;
        typedef typename MonteCarloModel<MC,RNG,S>::result_type
            result_type;
        virtual ~McSimulation() {}
        result_type value(Real tolerance,
                          Size maxSamples = QL_MAX_INTEGER,
                          Size minSamples = 1023) const;
        result_type valueWithSamples(Size samples) const;
        void calculate(Real requiredTolerance,
                       Size requiredSamples,
                       Size maxSamples) const;
        const stats_type& sampleAccumulator(void) const;
      protected:
        McSimulation(bool antitheticVariate,
                     bool controlVariate);
        virtual shared_ptr<path_pricer_type> pathPricer() const = 0;
        virtual shared_ptr<path_generator_type> pathGenerator() const = 0;
        virtual TimeGrid timeGrid() const = 0;
        virtual shared_ptr<path_pricer_type>
        controlPathPricer() const;
        virtual shared_ptr<path_generator_type>
        controlPathGenerator() const;
        virtual result_type controlVariateValue() const;
        virtual shared_ptr<PricingEngine> controlPricingEngine() const;
        mutable shared_ptr<MonteCarloModel<MC,RNG,S> > mcModel_;
        bool antitheticVariate_, controlVariate_;
    };


McSimulation follows the Template Method pattern: it asks the user to implement a few pieces of behavior, and in return it provides the generic logic to instantiate a Monte Carlo model and run a simulation. Derived classes must define at least the pathPricer method, which returns the path pricer to be used; the pathGenerator method, which returns the path generator; and the timeGrid method, which returns the grid describing the nodes of the simulation. Other methods, returning the objects to be used for control variates, might or might not be overridden; McSimulation provides default implementations that return null values, so derived classes that don't want to use the control-variate technique can simply ignore them.

Based on such methods, McSimulation provides most of the behavior needed by an engine. Its constructor takes two boolean flags specifying whether it should use either antithetic or control variates; the second will only matter if the derived class implements the required methods.

The value method adds samples to the underlying model until the estimated error matches a required tolerance. (Of course, this needs an error estimate, so it won't work with low-discrepancy methods.)
    template <template <class> class MC, class RNG, class S>
    typename McSimulation<MC,RNG,S>::result_type
    McSimulation<MC,RNG,S>::value(Real tolerance,
                                  Size maxSamples,
                                  Size minSamples) const {
        ...
        Real error = mcModel_->sampleAccumulator().errorEstimate();
        while (error > tolerance) {
            QL_REQUIRE(sampleNumber < maxSamples, ...);
            Real order = (error*error)/(tolerance*tolerance);
            Size nextBatch =
                std::max<Size>(
                    static_cast<Size>(sampleNumber*order*0.8)-sampleNumber,
                    minSamples);
            nextBatch = std::min(nextBatch, maxSamples-sampleNumber);
            sampleNumber += nextBatch;
            mcModel_->addSamples(nextBatch);
            error = mcModel_->sampleAccumulator().errorEstimate();
        }
        return mcModel_->sampleAccumulator().mean();
    }

It looks at the current number of samples $$n$$ and the current error $$\epsilon$$, estimates the number of samples $$N$$ that will be needed to reach the given tolerance $$\tau$$ as $$N = n \times \epsilon^2/\tau^2$$ (since of course $$\epsilon \propto 1/\sqrt{n}$$), and adds a new batch of $$N-n$$ samples that gets the total closer to the estimated number; then it assesses the error again, and repeats the process as needed.

The valueWithSamples method just adds a batch of samples so that their total number matches the required one; its implementation is not shown here because of space constraints, but is simple enough.

Finally, the calculate method runs a complete simulation.
    template <template <class> class MC, class RNG, class S>
    void McSimulation<MC,RNG,S>::calculate(Real requiredTolerance,
                                           Size requiredSamples,
                                           Size maxSamples) const {
        if (!controlVariate_) {
            mcModel_ = shared_ptr<MonteCarloModel<MC,RNG,S> >(
                new MonteCarloModel<MC,RNG,S>(
                    pathGenerator(), pathPricer(),
                    S(), antitheticVariate_));
        } else {
            ... // same as above, but passing the control-variate args, too.
        }

        if (requiredTolerance != Null<Real>())
            this->value(requiredTolerance, maxSamples);
        else
            this->valueWithSamples(requiredSamples);
    }

It takes as arguments either a required tolerance or a required number of samples (one of the arguments must be null, but not both) as well as a maximum number of samples; it instantiates a MonteCarloModel instance, with the controlVariate_ flag determining whether to pass the control-variate arguments to the constructor; and it calls either the value or the valueWithSamples method, depending on what goal was required.

In next post, I'll show an example of how to use the McSimulation class to build a pricing engine; but before that, let me point out a few ways in which it could be improved.

First of all, it currently implements logic to run a simulation based on two criteria (accuracy or total number of samples) but of course, more criteria could be added. For instance, one could run a simulation until a given clock time is elapsed; or again, one could add several batches of samples, look at the result after each one, and stop when convergence seems to be achieved. This suggests that the hard-coded value and valueWithSamples methods could be replaced by an instance of the Strategy pattern, and the switch between them in the calculate method by just a call to the stored strategy.

In turn, this would also remove the current ugliness in calculate: instead of passing the whole set of arguments for both calculations and setting most of them to null values (as in the good old days, when languages could not overload methods) one would pass only the required arguments to the strategy object, and then the strategy object to calculate.

Finally, the presence of the controlPricingEngine method is a bit of a smell. The implementation of McSimulation doesn't use it, so it's not strictly a part of the Template Method pattern and probably shouldn't be here. However, a couple of other classes (both inheriting from McSimulation, but otherwise unrelated) declare it and use it to implement the controlVariateValue method; therefore, leaving it here might not be the purest of designs, but it prevents some duplication.

#### Aside: synchronized walking

Both the MonteCarloModel and McSimulation class templates allow one to define an alternate path generator for control variates. Note, however, that this feature should be used with some caution. For this variance-reduction technique to work, the control-variate paths returned from the second generator must be correlated with the regular ones, which basically means that the two path generators must use two identical random-number generators: same kind, dimensionality, and seed (if any).

Unfortunately, this constraint rules out quite a few possibilities. For instance, if you're using a stochastic volatility model, such as the Heston model, you might be tempted to use the Black-Scholes model as control variate. No such luck: for $$n$$ time steps, the Heston process requires $$2n$$ random numbers ($$n$$ for the stock price and $$n$$ for its volatility) while the Black-Scholes process just needs $$n$$. This makes it impossible to keep the two corresponding path generators in sync.

In practice, you'll have a use case for the alternate path generator if you have a process with a number of parameters which is not analytically tractable in the generic case, but has a closed-form solution for your option value if some of the parameters are null. If you're so lucky, you can use a fully calibrated process to instantiate the main path generator, and another instance of the process with the null parameters to generate control-variate paths.

## Monday, June 23, 2014

### Chapter 6, part 6 of 8: Monte Carlo models

Welcome back.

This is the sixth in a series of posts covering chapter 6 of my book (here is the first one). This week: Monte Carlo models.

And of course, you can still register for my Introduction to QuantLib Development course: click on the link for more information.

Follow me on Twitter if you want to be notified of new posts, or add me to your circles, or subscribe via RSS: the widgets for that are in the sidebar, at the top right of the page. Also, make sure to check my Training page.

### Putting it all together

Having read all about the different pieces of a model in the previous posts, you'd expect me to start assembling them and finally get a working Monte Carlo model. However, before doing so, we still have the task of picking a specific set of pieces among those available in our toolbox. This will be the subject of the next subsection—after which we'll build the model, developer's honor.

#### Monte Carlo traits

As I mentioned before, there's a number of choices we have to make in order to implement a Monte Carlo model. Depending on the dimension of the problem and the stochastic process we choose, we can use one- or multi-dimensional paths and generators; or we can use pseudo-random numbers or low-discrepancy sequences (and even though I haven't listed them, there's quite a few algorithms available for either type).

For at least one of those choices, dynamic polymorphism is not an option: one-dimensional and multi-dimensional generators can't have a common interface, since their methods have different input and return types. Therefore, we'll go with static polymorphism and supply the required classes to the model as template arguments.

The problem is, this can result very quickly in unwieldy declarations; the sight of something like
    MonteCarloModel<
        MultiPathGenerator<
            InverseCumulativeRsg<SobolRsg,InverseCumulativeNormal> > >

can bring shudders to the most seasoned developer. Sure, typedefs can help: by using them, a user might assign mnemonics to default choices and shorten the declaration. Default template arguments might alleviate the pain, too. However, we went for a mechanism that, on the one hand, allowed us to define mnemonics for commonly used groups of related choices; and on the other hand, let us add helper methods to the mix. To do this, we decided to use traits [1].

Listing 6.12: Example of random-number generation traits.
    template <class URNG, class IC>
    struct GenericPseudoRandom {
        typedef URNG urng_type;
        typedef InverseCumulativeRng<urng_type,IC> rng_type;
        typedef RandomSequenceGenerator<urng_type> ursg_type;
        typedef InverseCumulativeRsg<ursg_type,IC> rsg_type;
        enum { allowsErrorEstimate = 1 };
        static rsg_type make_sequence_generator(Size dimension,
                                                BigNatural seed) {
            ursg_type g(dimension, seed);
            return rsg_type(g);
        }
    };

    typedef GenericPseudoRandom<MersenneTwisterUniformRng,
                                InverseCumulativeNormal>
        PseudoRandom;

    template <class URSG, class IC>
    struct GenericLowDiscrepancy {
        typedef URSG ursg_type;
        typedef InverseCumulativeRsg<ursg_type,IC> rsg_type;
        enum { allowsErrorEstimate = 0 };
        static rsg_type make_sequence_generator(Size dimension,
                                                BigNatural seed) {
            ursg_type g(dimension, seed);
            return rsg_type(g);
        }
    };

    typedef GenericLowDiscrepancy<SobolRsg,
                                  InverseCumulativeNormal>
        LowDiscrepancy;

Listing 6.12 shows the default traits classes for pseudo-random and low-discrepancy number generation. First comes the GenericPseudoRandom class template. It takes as template arguments the type of a uniform pseudo-random number generator and that of an inverse-cumulative function object, and builds a number of other types upon them. The type of the passed generator itself is defined as urng_type—emphasis on "u" for uniform and "n" for number. Based on this type, it defines rng_type, no longer uniform since it uses the inverse-cumulative function to return numbers according to its distribution; ursg_type, where the "n" is replaced by "s" for sequence; and finally rsg_type, which generates sequences of numbers according to the passed distribution. The compile-time constant allowsErrorEstimate, written as an enumeration to satisfy older compilers (it should really be a static const bool), tells us that this generator allows us to estimate the Monte Carlo error as a function of the number of samples; and the helper function make_sequence_generator makes it easier to create a sequence generator based on the passed inputs.

Then, we instantiate the class template with our weapons of choice. For the basic generator, that would be the MersenneTwisterUniformRng class; for the function object, the InverseCumulativeNormal class, since we'll most often want normally distributed numbers. The resulting traits class will be our default for pseudo-random generation; fantasy being not our strong suit, we defined it as the PseudoRandom class.

The GenericLowDiscrepancy class template is defined in a similar way, but with two differences. Since low-discrepancy numbers are generated in sequences, the types for single-number generation are missing; and the enumeration tells us that we can't forecast the statistical error we'll get. We define the LowDiscrepancy traits class as the one obtained by selecting the SobolRsg class as our generator and, again, the InverseCumulativeNormal class as our function object.

Finally, we defined a couple of traits classes to hold types related to specific Monte Carlo functionality, such as the types of used paths, path generators, and path pricers. They are shown in listing 6.13: the SingleVariate class holds the types we need for 1-D models, while the MultiVariate class holds the types for multi-dimensional ones. They are both template classes, and take as their template argument a traits class for random-number generation.

Listing 6.13: Example of Monte Carlo traits.
    template <class RNG = PseudoRandom>
    struct SingleVariate {
        typedef RNG rng_traits;
        typedef Path path_type;
        typedef PathPricer<path_type> path_pricer_type;
        typedef typename RNG::rsg_type rsg_type;
        typedef PathGenerator<rsg_type> path_generator_type;
        enum { allowsErrorEstimate = RNG::allowsErrorEstimate };
    };

    template <class RNG = PseudoRandom>
    struct MultiVariate {
        typedef RNG rng_traits;
        typedef MultiPath path_type;
        typedef PathPricer<path_type> path_pricer_type;
        typedef typename RNG::rsg_type rsg_type;
        typedef MultiPathGenerator<rsg_type> path_generator_type;
        enum { allowsErrorEstimate = RNG::allowsErrorEstimate };
    };

By combining the provided RNG and Monte Carlo traits (or any traits classes that one might want to define, if one wants to use some particular type) not only can we provide a model with all the necessary information, but we can do it with a simpler and more mnemonic syntax, such as
    MonteCarloModel<SingleVariate, LowDiscrepancy>;

the idea being to move some complexity from users to developers. We have to use a few template tricks to get the above to work, but when it does, it's a bit more readable (and writable) for users. But that's for the next section, in which we finally assemble a Monte Carlo model.

#### The Monte Carlo model

Listing 6.14 shows the MonteCarloModel class, which is the low-level workhorse of Monte Carlo simulations.

Listing 6.14: Implementation of the MonteCarloModel class template.
    template <template <class> class MC, class RNG,
              class S = Statistics>
    class MonteCarloModel {
      public:
        typedef MC<RNG> mc_traits;
        typedef RNG rng_traits;
        typedef typename MC<RNG>::path_generator_type
            path_generator_type;
        typedef typename MC<RNG>::path_pricer_type path_pricer_type;
        typedef typename path_generator_type::sample_type
            sample_type;
        typedef typename path_pricer_type::result_type result_type;
        typedef S stats_type;

        MonteCarloModel(
            const boost::shared_ptr<path_generator_type>&
                                                   pathGenerator,
            const boost::shared_ptr<path_pricer_type>& pathPricer,
            const stats_type& sampleAccumulator,
            bool antitheticVariate,
            const boost::shared_ptr<path_pricer_type>& cvPathPricer
                = boost::shared_ptr<path_pricer_type>(),
            result_type cvOptionValue = result_type(),
            const boost::shared_ptr<path_generator_type>& cvGenerator
                = boost::shared_ptr<path_generator_type>());
        const stats_type& sampleAccumulator(void) const;
      private:
        boost::shared_ptr<path_generator_type> pathGenerator_;
        boost::shared_ptr<path_pricer_type> pathPricer_;
        stats_type sampleAccumulator_;
        bool isAntitheticVariate_;
        boost::shared_ptr<path_pricer_type> cvPathPricer_;
        result_type cvOptionValue_;
        bool isControlVariate_;
        boost::shared_ptr<path_generator_type> cvPathGenerator_;
    };

    template <template <class> class MC, class RNG, class S>
    MonteCarloModel<MC,RNG,S>::MonteCarloModel(
        const boost::shared_ptr<path_generator_type>& pathGenerator,
        ...other arguments...)
    : pathGenerator_(pathGenerator), ...other data... {
        if (!cvPathPricer_)
            isControlVariate_ = false;
        else
            isControlVariate_ = true;
    }

    template <template <class> class MC, class RNG, class S>
    void MonteCarloModel<MC,RNG,S>::addSamples(Size samples) {
        for (Size j = 1; j <= samples; j++) {
            sample_type path = pathGenerator_->next();
            result_type price = (*pathPricer_)(path.value);

            if (isControlVariate_) {
                if (!cvPathGenerator_) {
                    price += cvOptionValue_-(*cvPathPricer_)(path.value);
                } else {
                    sample_type cvPath = cvPathGenerator_->next();
                    price +=
                        cvOptionValue_-(*cvPathPricer_)(cvPath.value);
                }
            }

            if (isAntitheticVariate_) {
                path = pathGenerator_->antithetic();
                result_type price2 = (*pathPricer_)(path.value);
                if (isControlVariate_) {
                    ... // adjust the second price as above
                }
                sampleAccumulator_.add((price+price2)/2.0, path.weight);
            } else {
                sampleAccumulator_.add(price, path.weight);
            }
        }
    }

It brings together path generation, pricing and statistics, and as such takes template arguments defining the types involved: an MC traits class defining types related to the simulation, an RNG traits class describing random-number generation, and a statistics class S defaulting to the Statistics class. (Don't worry, I'm not going off on another tangent, even though an early outline of this chapter had a section devoted to statistics. If you're interested, the Statistics class is in appendix A of the book.) The MC class is a template template argument, so that it can be fed the RNG traits (as shown in the previous section; see for instance the MultiVariate class).

The class defines aliases for a few frequently used types; most of them are extracted from the traits, by instantiating the MC class template with the RNG class. The resulting class provides the types of the path generator and the path pricer to be used; from those, in turn, the type of the sample paths and that of the returned prices can be obtained.

The constructor takes the pieces that will be made to work together; at least a path generator (well, a pointer to one, but you'll forgive me for not spelling out all of them), a path pricer, and an instance of the statistics class, as well as a boolean flag specifying whether to use antithetic variates. Then, there are a few optional arguments related to control variates: another path pricer, the analytic value of the control variate, and possibly another path generator. Optional arguments might not be the best choice, since they make it possible to pass a control variate path pricer and not the corresponding analytic value; it would have been safer to have a constructor with no control variate arguments, and another constructor with both path pricer and analytic value being mandatory and with an optional path generator. However, the current version saves a few lines of code. The constructor copies the passed arguments into the corresponding data members, and sets another boolean flag based on the presence or the lack of the control-variate arguments.

The main logic is implemented in the addSamples method. It's just a loop: draw a path, price, add the result to the statistics; but it includes a bit of complication in order to take care of variance reduction. It takes the number of samples to add; for each of them, it asks the path generator for a path, passes the path to the pricer, and gets back a price. In the simplest case, that's all there is to it; the price can just be added to the statistics (together with the corresponding weight, also returned from the generator) and the loop can start the next iteration. If the user passed control-variate data, instead, things get more interesting. If no path generator were specified, we pass to the alternate pricer the same path we used for the main one; otherwise, we ask the second generator for a path and we use that one. In both cases, we adjust the baseline price by subtracting the simulated price of the control and adding its analytic value.

It's not over yet. If the user also asked for antithetic variates, we repeat the same dance (this time asking the generator, or the generators, for the paths antithetic to the ones we just used) and we add to the statistics the average of the regular and antithetic prices; if not, we just add the price we obtained on the original paths. Lather, rinse, and repeat until the required number of samples is reached.

Finally, the full results (mean price and whatnot) can be obtained by calling the sampleAccumulator method, which returns a reference to the stored statistics. "Accumulator" is STL lingo; we should probably have used a method name taken from the financial domain instead. Such as, I don't know, "statistics." Oh well.

Next time: Monte Carlo simulations.

#### Bibliography

[1] N.C. Myers, Traits: a new and useful template technique. In The C++ Report, June 1995.

## Monday, June 16, 2014

### Chapter 6, part 5 of 8: path generators

Welcome back.

This week, back to some content from my book after the screencast of last week (and the announcement of the QuantLib Notebooks series; leave your feedback if you want the transcripts to be published). This post is the fifth in a series that started here and covers chapter 6, that is, the Monte Carlo framework in QuantLib.

Mandatory plug: you can still register for my Introduction to QuantLib Development course (London, September 24th to 26th): three days of lectures and exercises on the architecture of the library.

Follow me on Twitter if you want to be notified of new posts, or add me to your circles, or subscribe via RSS: the widgets for that are in the sidebar, at the top right of the page. Also, make sure to check my Training page.

### Path generation

Before tackling path generation proper, we need a last basic component: a structure to hold the path itself. Listing 6.9 shows the Path class, which models a random path for a single variable. It contains the TimeGrid instance and the Array holding the node times and values, respectively, and provides methods to access and iterate over them. The method names are usually chosen to follow those of the standard containers, allowing for a more familiar interface; methods with domain-based names such as time are also available.

Listing 6.9: Interface of the Path and MultiPath classes.
    class Path {
      public:
        Path(const TimeGrid& timeGrid, const Array& values);
        bool empty() const;
        Size length() const;
        Real operator[](Size i) const;
        Real at(Size i) const;
        Real& operator[](Size i);
        Real& at(Size i);
        Real front() const;
        Real& front();
        Real back() const;
        Real& back();
        Real value(Size i) const;
        Real& value(Size i);
        Time time(Size i) const;
        const TimeGrid& timeGrid() const;
        typedef Array::const_iterator iterator;
        typedef Array::const_reverse_iterator reverse_iterator;
        iterator begin() const;
        iterator end() const;
        reverse_iterator rbegin() const;
        reverse_iterator rend() const;
      private:
        TimeGrid timeGrid_;
        Array values_;
    };

    class MultiPath {
      public:
        MultiPath(const std::vector<Path>& multiPath);
        // ...more constructors...
        Size assetNumber() const;
        Size pathSize() const;
        const Path& operator[](Size j) const;
        const Path& at(Size j) const;
        Path& operator[](Size j);
        Path& at(Size j);
      private:
        std::vector<Path> multiPath_;
    };

The MultiPath class, also shown in listing 6.9, holds paths for a number of underlying variables. It is basically a glorified container, holding a vector of Path instances and providing a few additional inspectors to uphold the law of Demeter. (For instance, if p is a MultiPath instance, the additional methods allow us to write p.pathSize() instead of p[0].length(). Besides being uglier, the latter might also suggest that p[1].length() could be different.) The simplicity of the implementation is at the expense of some space: by storing a vector of Path instances, we're storing N identical copies of the time grid.

#### Aside: access patterns

The basic inspectors in the Path class, such as operator[] and the iterator interface, return the values $$S_i$$ of the underlying variable rather than the pairs $$(t_i, S_i)$$ including the node times. While it could well be expected that p[i] return the whole node, we thought that in the most common cases a user would want just the value, so we optimized such access (it's some kind of Huffman-coding principle, in which the most common operations should require the least typing; the Perl language shines at this, or so I'm told).

On a sad note about reuse (or lack thereof): if we were to add to the Path class a few methods returning whole nodes, we probably wouldn't use std::pair<Time,Real> to hold the data, but an inner Node struct. This is not because we instinctively leap at the most complex implementation—even though it may seem so at times—but because code such as p.node(i).second is not self-describing (wait, is second the time or the value?). The advantage of writing the above as p.node(i).value would offset the cost of defining a Node struct.

What remains to be done for path generation is now simply to connect the dots between the classes that I described in the last few posts. The logic for doing so is implemented in the MultiPathGenerator class template, sketched in listing 6.10; I'll leave it to you to decide whether this makes it an instance of the Factory pattern.

Listing 6.10: Sketch of the MultiPathGenerator class template.
    template <class GSG>
    class MultiPathGenerator {
      public:
        typedef Sample<MultiPath> sample_type;
        MultiPathGenerator(const shared_ptr<StochasticProcess>&,
                           const TimeGrid&,
                           GSG generator,
                           bool brownianBridge = false);
        const sample_type& next() const { return next(false); }
        const sample_type& antithetic() const { return next(true); }
      private:
        const sample_type& next(bool antithetic) const;
        bool brownianBridge_;
        shared_ptr<StochasticProcess> process_;
        GSG generator_;
        mutable sample_type next_;
    };

    template <class GSG>
    const typename MultiPathGenerator<GSG>::sample_type&
    MultiPathGenerator<GSG>::next(bool antithetic) const {
        if (brownianBridge_) {
            QL_FAIL("Brownian bridge not supported");
        } else {
            typedef typename GSG::sample_type sequence_type;
            const sequence_type& sequence_ =
                antithetic ? generator_.lastSequence()
                           : generator_.nextSequence();
            Size m = process_->size(), n = process_->factors();
            MultiPath& path = next_.value;
            Array asset = process_->initialValues();
            for (Size j=0; j<m; j++)
                path[j].front() = asset[j];
            Array temp(n);
            next_.weight = sequence_.weight;

            TimeGrid timeGrid = path[0].timeGrid();
            Time t, dt;
            for (Size i = 1; i < path.pathSize(); i++) {
                Size offset = (i-1)*n;
                t = timeGrid[i-1];
                dt = timeGrid.dt(i-1);
                if (antithetic)
                    std::transform(sequence_.value.begin()+offset,
                                   sequence_.value.begin()+offset+n,
                                   temp.begin(),
                                   std::negate<Real>());
                else
                    std::copy(sequence_.value.begin()+offset,
                              sequence_.value.begin()+offset+n,
                              temp.begin());

                asset = process_->evolve(t, asset, dt, temp);
                for (Size j=0; j<m; j++)
                    path[j][i] = asset[j];
            }
            return next_;
        }
    }

Its constructor takes and stores the stochastic process followed by the variable to be modeled; the time grid to be used for generating the path nodes; a generator of random Gaussian sequences, which must be of the right dimension for the job (that is, at each draw it must return $$N \times M$$ numbers for $$N$$ factors and $$M$$ steps); and a boolean flag specifying whether or not to use Brownian bridging, which at this time cannot be set to true and with which we're unfortunately stuck for backward compatibility.

The interface of the class provides a next method, returning a new random path (well, multipath, but cut me some slack here); and an antithetic method, returning the antithetic path of the one last returned by next. (Unfortunately, the correctness of the antithetic method depends on its being called at the right time, that is, after a call to next; but I can't think of any acceptable way out of this.) Both methods forward to a private next(bool) method, which implements the actual logic. If the brownianBridge flag was set to true, the method bails out by raising an exception, since the functionality is not yet implemented. Following the general principles of C++, which suggest failing as early as possible, we should probably move the check to the constructor (or better yet, implement the feature; send me a patch if you do). As it is now, one can build a path generator and have the constructor succeed, only to find out later that the instance is unusable.

If Brownian bridging is not required, the method gets to work. First, it retrieves the random sequence it needs: a new one if we asked for a new path, or the latest sequence we used if we asked for an antithetic path. Second, it performs some setup. It gets a reference to the path to be built (which is a data member, not a temporary; more on this later); retrieves the array of the initial values of the variables and copies them at the beginning of the path; creates an array to hold the subsets of the random variables that will be used at each step; and retrieves the time grid to be used. Finally, it puts the pieces together. At each step, it takes as starting point the values at the previous one (it uses the same array where it stored the initial values, so the first step is covered, too); it retrieves from the time grid the current time t and the interval dt over which to evolve the variables; it copies the subset of the random variables that it needs (the first N for the first step, the second N for the second step, and so on, N being the number of factors; also, it negates them if it must generate an antithetic path); and to close the circle, it calls the evolve method of the process and copies the returned values at the correct places in the paths.

As I mentioned, the returned path is stored as a data member of the generator instance. This saves quite a few allocations and copies; but of course it also makes it impossible to share a path generator between threads, since they would race for the same storage. However, that's just the last nail in the coffin. Multithreading was pretty much out of the picture as soon as we stored the Gaussian number generator as a data member; the path generator can only be as thread-safe as its RNG—that is, not much, since most (if not all) of them store state. A much easier strategy to distribute work among threads would be to use different MultiPathGenerator instances, each with its own RNG. The problem remains of creating RNGs returning non-overlapping sequences; for algorithms that allow it (such as Sobol's) one can tell each instance to skip ahead a given number of draws.

#### Aside: stepping on one's own toes.

Unfortunately, there's quite a bit of memory allocation going on during multi-path generation. The obvious instances (the two Arrays being created to store the current variable values and the current subset of the random numbers) could be avoided by storing them as data members; but on the one hand, we would have yet more mutable members, which is something we should use sparingly; and on the other hand, the allocation of the two arrays only happens once in a whole path generation, so it might not make a lot of difference.

Instead, the worst offender might be the underlying process, which at each step creates an Array instance to be returned from its evolve method. How can we fix this? In this case, I wouldn't return a reference to a data member if I can help it: the StochasticProcess class might just work in a multithreaded environment (well, if market data don't change during the calculation, or if at least the process is kept frozen) and I'd rather not ruin it. Another possibility might be for the evolve method to take a reference to the output array as an argument; but then, we (and by "we" I mean whoever will implement a stochastic process) would have to be very careful about aliasing.

The sad fact is that the natural way to write the evolve method would usually be something like (in pseudocode, and in the case of, say, two variables)
    void evolve(const Array& x, Time t, Time dt,
                const Array& w, Array& y) {
        y[0] = f(x[0], x[1], t, dt, w[0], w[1]);
        y[1] = g(x[0], x[1], t, dt, w[0], w[1]);
    }

where x is the input array and y is the output. However, it would be just as natural for a user to write
    p->evolve(x,t,dt,w,x);

(that is, to pass the same x as both input and output) meaning to replace the old values of the variables with the new ones.

Well, you see the catch. When the first instruction in evolve writes into y[0], it actually writes into x[0]. The second instruction then calls g with a value of x[0] which is not the intended one. Hilarity ensues.

Fixing this requires either the developer to make each process safe from aliasing, by writing
    a = f(...); b = g(...); y[0] = a; y[1] = b;

or something to this effect; or the user to be mindful of the risk of aliasing, and never to pass the same array as both input and output. Either solution requires more constant care than I credit myself with.
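The problem can be shown with plain doubles standing in for the Array elements (f and g here are arbitrary functions, and the names are mine, not the library's):

```cpp
#include <cassert>

double f(double x0, double x1) { return x0 + x1; }
double g(double x0, double x1) { return x0 * x1; }

// The natural version: writes y[0] before g reads x[0] again,
// so it breaks when x and y alias.
void evolveNaive(const double* x, double* y) {
    y[0] = f(x[0], x[1]);
    y[1] = g(x[0], x[1]);
}

// The alias-safe version: compute into temporaries, then write.
void evolveSafe(const double* x, double* y) {
    double a = f(x[0], x[1]);
    double b = g(x[0], x[1]);
    y[0] = a;
    y[1] = b;
}
```

Calling `evolveNaive(x, x)` with x = {2, 3} yields {5, 15} instead of the intended {5, 6}, because g is fed the freshly overwritten value 5; the safe version gives the intended result.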

So, where does this leave us? For the time being, the current interface might be an acceptable compromise. One future possibility is that we take away the allocation cost at the origin, by using some kind of allocation pool inside the Array class. As usual, we're open to suggestions.

To close the section, I'll mention that the library also provides a PathGenerator class for 1-D stochastic processes. It has some obvious differences (it calls the StochasticProcess1D interface and it creates Paths instead of MultiPaths; also, it implements Brownian bridging, which is a lot easier in the 1-D case) but other than that, it has the same logic as MultiPathGenerator and doesn't warrant being described in detail here.

#### Pricing on a path

There's not much to say on the PathPricer class template, shown in listing 6.11. It defines the interface that must be implemented by a path pricer in order to be used in a Monte Carlo model: an operator() taking a path and returning some kind of result. It is basically a function interface: in fact, if we were to write the framework now instead of ten years ago, I'd just use boost::function and dispense with PathPricer altogether.

Listing 6.11: Interface of the PathPricer class template.
    template <class PathType, class ValueType = Real>
    class PathPricer : public unary_function<PathType,ValueType> {
      public:
        virtual ~PathPricer() {}
        virtual ValueType operator()(const PathType& path) const = 0;
    };

The template arguments generalize on both the type of the argument, which can be a Path, a MultiPath, or a user-defined type if needed (for instance, when pricing instruments with an early-termination feature, one might want to save time by using a path class that generates nodes lazily and only if they are actually used); and the type of the result, which defaults to a simple Real (usually the value of the instrument) but might be an array of results, or some kind of structure. On the one hand, one might like to have the choice: an Array would be more appropriate for returning a set of homogeneous results, such as the amounts of a series of coupons, while a structure with named fields would be better suited for inhomogeneous results such as the value and the several Greeks of an instrument. On the other hand, the results will need to be combined, and therefore they'll need some basic algebra; the Array class already provides it, which makes it ready to use, whereas a results structure would need to define some operators in order to play well with the facilities described in the next section.
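To show the shape of the interface, here is a minimal concrete pricer for a European call, with a plain vector standing in for the Path class (the real thing would take a QuantLib Path and obtain the discount factor from a term structure):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Stand-in for the Path class: just the sequence of asset values.
typedef std::vector<double> SimplePath;

// A concrete path pricer: reads the terminal value off the path and
// returns the discounted call payoff.
class EuropeanCallPathPricer {
    double strike_, discount_;
  public:
    EuropeanCallPathPricer(double strike, double discount)
    : strike_(strike), discount_(discount) {}
    double operator()(const SimplePath& path) const {
        double terminal = path.back();
        return discount_ * std::max(terminal - strike_, 0.0);
    }
};
```

A Monte Carlo model would call such an instance on each generated path and feed the returned values to its statistics accumulator.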

## Monday, June 9, 2014

### QuantLib notebook: random numbers

Hello everybody.

This week, a screencast of another IPython notebook (like the one I showed a few weeks ago). Since I'm about to tackle path generation in the next posts, I thought I'd make a couple of points first about random numbers and dimensionality. Set the thing to full screen and enjoy.

I have a few more screencasts planned, so I'm turning them into a series (with a channel page on Vimeo and everything). Also, I'm thinking about publishing the transcripts as an ebook: if you're interested, check this page on Leanpub and leave your feedback.

Finally, there are still places available for my Introduction to QuantLib development course—that is, the course that I'll teach in London based on the contents of my book. It's three days (September 22nd to 24th) half of which is me explaining the architecture of the library and the other half is you doing exercises on your laptop. The experience is rather intense, I'm told.

Follow me on Twitter if you want to be notified of new posts, or add me to your circles, or subscribe via RSS: the widgets for that are in the sidebar, at the top right of the page. Also, make sure to check my Training page.

## Monday, June 2, 2014

### Chapter 6, part 4 of 8: more processes

Hello everybody.

This is the fourth in a series of posts that cover chapter 6 of my book (that is, the Monte Carlo framework) and started here. This week, a couple more short examples.

In other news, you can still register for my Introduction to QuantLib Development course; I went on and on about it already, so I'll leave it at that.

Follow me on Twitter if you want to be notified of new posts, or add me to your circles, or subscribe via RSS: the widgets for that are in the sidebar, at the top right of the page. Also, make sure to check my Training page.

### Example: more processes

In the last post I exposed all the flaws in our implementation of the Black-Scholes process. For an example of a well-behaved process, you can look at listing 6.7 instead, which shows the OrnsteinUhlenbeckProcess class.

Listing 6.7: Implementation of the OrnsteinUhlenbeckProcess class.
    class OrnsteinUhlenbeckProcess : public StochasticProcess1D {
      public:
        OrnsteinUhlenbeckProcess(Real speed,
                                 Volatility vol,
                                 Real x0 = 0.0,
                                 Real level = 0.0);
        Real x0() const;
        ... // other inspectors
        Real drift(Time, Real x) const {
            return speed_ * (level_ - x);
        }
        Real diffusion(Time, Real) const {
            return volatility_;
        }
        Real expectation(Time t0, Real x0, Time dt) const {
            return level_ + (x0 - level_) * std::exp(-speed_*dt);
        }
        Real stdDeviation(Time t0, Real x0, Time dt) const {
            return std::sqrt(variance(t0, x0, dt));
        }
        Real variance(Time t0, Real x0, Time dt) const {
            if (speed_ < std::sqrt(QL_EPSILON)) {
                return volatility_*volatility_*dt;
            } else {
                return 0.5*volatility_*volatility_/speed_*
                       (1.0 - std::exp(-2.0*speed_*dt));
            }
        }
      private:
        Real x0_, speed_, level_;
        Volatility volatility_;
    };

The Ornstein-Uhlenbeck process is a simple one, whose feature of interest here is that its mean-reverting drift term $$\theta(\mu - x)$$ and its constant diffusion term $$\sigma$$ can be integrated exactly. Therefore, besides the mandatory drift and diffusion methods, the class also overrides the expectation and stdDeviation methods so that they implement the formulas for their exact results. The variance method (in terms of which stdDeviation is implemented) has two branches in order to prevent numerical instabilities; for small $$\theta$$, the formula for the variance is replaced by its limit for $$\theta \to 0$$.
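The two branches can be checked against each other. The helpers below (my names, with plain doubles instead of the library's Real and Volatility types) reproduce the exact variance $$\frac{\sigma^2}{2\theta}(1 - e^{-2\theta \Delta t})$$ and its small-\(\theta\) limit $$\sigma^2 \Delta t$$; for a tiny but nonzero speed the two agree to within the neglected higher-order terms, while for a sizable speed they differ noticeably:

```cpp
#include <cassert>
#include <cmath>

// Exact variance of the Ornstein-Uhlenbeck process over a step dt.
double ouVarianceExact(double speed, double vol, double dt) {
    return 0.5*vol*vol/speed * (1.0 - std::exp(-2.0*speed*dt));
}

// Its limit for speed -> 0, used to avoid the numerically unstable
// 0/0 form of the exact expression.
double ouVarianceLimit(double vol, double dt) {
    return vol*vol*dt;
}
```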

Finally, for an example of a multi-dimensional process, we'll have a look at the StochasticProcessArray class, sketched in listing 6.8.

Listing 6.8: Partial implementation of the StochasticProcessArray class.
    class StochasticProcessArray : public StochasticProcess {
      public:
        StochasticProcessArray(
            const std::vector<shared_ptr<StochasticProcess1D> >& ps,
            const Matrix& correlation)
        : processes_(ps), sqrtCorrelation_(pseudoSqrt(correlation)) {
            for (Size i=0; i<processes_.size(); i++)
                registerWith(processes_[i]);
        }
        // ...
        Disposable<Array> drift(Time t, const Array& x) const {
            Array tmp(size());
            for (Size i=0; i<size(); ++i)
                tmp[i] = processes_[i]->drift(t, x[i]);
            return tmp;
        }
        Disposable<Matrix> diffusion(Time t, const Array& x) const {
            Matrix tmp = sqrtCorrelation_;
            for (Size i=0; i<size(); ++i) {
                Real sigma = processes_[i]->diffusion(t, x[i]);
                std::transform(tmp.row_begin(i), tmp.row_end(i),
                               tmp.row_begin(i),
                               std::bind2nd(std::multiplies<Real>(),
                                            sigma));
            }
            return tmp;
        }
        Disposable<Array> expectation(Time t0, const Array& x0,
                                      Time dt) const;
        Disposable<Matrix> stdDeviation(Time t0, const Array& x0,
                                        Time dt) const;
        Disposable<Array> evolve(Time t0, const Array& x0,
                                 Time dt, const Array& dw) const {
            const Array dz = sqrtCorrelation_ * dw;
            Array tmp(size());
            for (Size i=0; i<size(); ++i)
                tmp[i] = processes_[i]->evolve(t0, x0[i], dt, dz[i]);
            return tmp;
        }
      private:
        std::vector<shared_ptr<StochasticProcess1D> > processes_;
        Matrix sqrtCorrelation_;
    };

This class doesn't model a specific process, but rather the composition in a single entity of N correlated one-dimensional processes. I'll use it here to show how correlation information can be included in a process.

Its constructor takes the vector of 1-D processes for the underlyings and their correlation matrix. The processes are stored in the corresponding data member, whereas the correlation is not: instead, the process precomputes and stores its square root. (That would be its matricial square root; that is, $$\sqrt{A} = B$$ if $$B B^T = A$$.) The constructor also registers with each process, in order to forward any notifications they might send.

Most other methods, such as initialValues, drift, or expectation, simply loop over the stored processes, calling their corresponding methods and collecting the results in an array. The diffusion method also loops over the processes, but combines the results with the correlation: it multiplies each row of its square root by the diffusion term of the corresponding process, and returns the result (if you multiply it by its transpose, you'll recover the familiar terms $$\sigma_i \rho_{ij} \sigma_j$$ of the covariance matrix). The stdDeviation method does the same, but uses the standard deviations of the underlying processes, which also include the passed $$\Delta t$$.
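The algebra can be verified in the two-factor case, where the Cholesky square root of the correlation matrix $$\begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$$ is known in closed form. The helpers below (my names; plain 2x2 arrays instead of the Matrix class) scale each row by the corresponding volatility, as the diffusion method does, and multiply the result by its transpose:

```cpp
#include <cassert>
#include <cmath>

// Row i of the Cholesky square root [[1, 0], [rho, sqrt(1-rho^2)]]
// of the 2x2 correlation matrix, scaled by sigma_i.
void scaledSqrtCorrelation(double rho, double s1, double s2,
                           double D[2][2]) {
    D[0][0] = s1;        D[0][1] = 0.0;
    D[1][0] = s2 * rho;  D[1][1] = s2 * std::sqrt(1.0 - rho*rho);
}

// C = D * D^T; this should reproduce the covariance entries
// sigma_i * rho_ij * sigma_j.
void covariance(const double D[2][2], double C[2][2]) {
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            C[i][j] = D[i][0]*D[j][0] + D[i][1]*D[j][1];
}
```

With, say, rho = 0.5, sigma_1 = 0.2 and sigma_2 = 0.3, the product gives back 0.04 and 0.09 on the diagonal and 0.03 off-diagonal, as expected.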

This leaves us with the evolve method. If we knew that all the processes behaved reasonably (i.e., by adding the calculated variations to the values of their variables) we might just inherit the default implementation, which takes the results of the expectation and stdDeviation methods, multiplies the latter by the array of random variates, and adds the two terms. However, we have no such guarantee, and the method needs to be implemented in a different way. First, it deals with the correlation by multiplying the Gaussian variates by its square root, thus obtaining an array of correlated random variates. Then, it calls the evolve method of each one-dimensional process with the respective arguments and collects the results.
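The two-step logic can be sketched as follows, with vectors instead of the Array and Matrix classes and a toy arithmetic "process" (x + sigma*dz) standing in for the 1-D evolve calls; the names are illustrative, not the library's:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Sketch of StochasticProcessArray::evolve: first correlate the
// independent draws with the square root of the correlation matrix,
// then let each one-dimensional process evolve with its own draw.
std::vector<double> evolveArray(
        const std::vector<double>& x0,
        const std::vector<std::vector<double> >& sqrtCorr,
        const std::vector<double>& sigma,
        const std::vector<double>& dw) {
    std::size_t n = x0.size();
    std::vector<double> dz(n, 0.0), x(n);
    // dz = sqrtCorr * dw: the correlated variates
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            dz[i] += sqrtCorr[i][j] * dw[j];
    // each toy 1-D process evolves with its own correlated variate
    for (std::size_t i = 0; i < n; ++i)
        x[i] = x0[i] + sigma[i]*dz[i];
    return x;
}
```

Note that each 1-D process sees only a scalar draw and needs no knowledge of the correlation, which is entirely handled by the array class.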