Monday, June 30, 2014

Chapter 6, part 7 of 8: Monte Carlo simulations

Hello everybody.

This is the seventh in a series of posts that started here and covers chapter 6 of my book; that is, the Monte Carlo framework.

A week or two ago, I've been issuing a call to arms on the QuantLib mailing list. We have quite a few old items in our bug tracker and patch tracker, and we should see if they're still relevant. A few have been closed already, but if you want to get in the action and help triaging the rest, please write to the mailing list.

Do I still need to tell you about my Introduction to QuantLib Development course? Apparently I do. It's three days of lectures and exercises based on Implementing QuantLib, it's in London from September 22nd to 24th, and more information, a brochure and registration form are available here.

Follow me on Twitter if you want to be notified of new posts, or add me to your circles, or subscribe via RSS: the widgets for that are in the sidebar, at the top right of the page. Also, make sure to check my Training page.

Monte Carlo simulations

Up one step in complexity from the Monte Carlo model of last post and we get to the McSimulation class template, sketched in listing 6.15. Its job is to drive the simple-minded MonteCarloModel towards a goal, be it a required accuracy or number of samples. It can be used (and it was designed) as a starting point to build a pricing engine.

Listing 6.15: Sketch of the McSimulation class template.
    template <template <class> class MC, class RNG,
class S = Statistics>
class McSimulation {
public:
typedef typename MonteCarloModel<MC,RNG,S>::path_generator_type
            path_generator_type;
typedef typename MonteCarloModel<MC,RNG,S>::path_pricer_type
path_pricer_type;
typedef typename MonteCarloModel<MC,RNG,S>::stats_type
stats_type;
typedef typename MonteCarloModel<MC,RNG,S>::result_type
result_type;
virtual ~McSimulation() {}
result_type value(Real tolerance,
Size maxSamples = QL_MAX_INTEGER,
Size minSamples = 1023) const;
result_type valueWithSamples(Size samples) const;
void calculate(Real requiredTolerance,
Size requiredSamples,
Size maxSamples) const;
const stats_type& sampleAccumulator(void) const;
protected:
McSimulation(bool antitheticVariate,
bool controlVariate);
virtual shared_ptr<path_pricer_type> pathPricer() const = 0;
virtual shared_ptr<path_generator_type> pathGenerator() const = 0;
virtual TimeGrid timeGrid() const = 0;
virtual shared_ptr<path_pricer_type>
controlPathPricer() const;
virtual shared_ptr<path_generator_type>
controlPathGenerator() const;
virtual result_type controlVariateValue() const;
virtual shared_ptr<PricingEngine> controlPricingEngine() const;
mutable shared_ptr<MonteCarloModel<MC,RNG,S> > mcModel_;
bool antitheticVariate_, controlVariate_;
};


McSimulation follows the Template Method pattern. It asks the user to implement a few pieces of behavior, and in return it provides generic logic to instantiate a Monte Carlo model and run a simulation. Derived classes must define at least the pathPricer method, that returns the path pricer to be used; the pathGenerator method, that returns the path generator; and the timeGrid method, that returns the grid describing the nodes of the simulation. Other methods, returning the objects to be used for control variates, might or might not be McSimulation provides default implementations that return null values, so derived classes that don't want to use the control variate technique can just forget about it.

Based on such methods, McSimulation provides most of the behavior needed by an engine. Its constructor takes two boolean flags specifying whether it should use either antithetic or control variates; the second will only matter if the derived class implements the required methods.

The value method adds samples to the underlying model until the estimated error matches a required tolerance. (Of course, this needs an error estimate, so it won't work with low-discrepancy methods.)
    template <template <class> class MC, class RNG, class S>
typename McSimulation<MC,RNG,S>::result_type
McSimulation<MC,RNG,S>::value(Real tolerance,
Size maxSamples,
Size minSamples) const {
...
Real error = mcModel_->sampleAccumulator().errorEstimate();
while (error > tolerance) {
QL_REQUIRE(sampleNumber < maxSamples, ...);
Real order = (error*error)/(tolerance*tolerance);
Size nextBatch =
std::max<Size>(sampleNumber*order*0.8-sampleNumber,
minSamples));
nextBatch = std::min(nextBatch, maxSamples-sampleNumber);
sampleNumber += nextBatch;
error = mcModel_->sampleAccumulator().errorEstimate();
}
return mcModel_->sampleAccumulator().mean();
}

It looks at the current number of samples $$n$$ and the current error $$\epsilon$$, estimates the number of samples $$N$$ that will be needed to reach the given tolerance $$\tau$$ as $$N = n \times \epsilon^2/\tau^2$$ (since of course $$\epsilon \propto 1/\sqrt{n}$$), and adds a new batch of $$N-n$$ samples that gets the total closer to the estimated number; then it assesses the error again, and repeats the process as needed.

The valueWithSamples method just adds a batch of samples so that their total number matches the required one; its implementation is not shown here because of space constraints, but is simple enough.

Finally, the calculate method runs a complete simulation.
    template <template <class> class MC, class RNG, class S>
void McSimulation<MC,RNG,S>::calculate(Real requiredTolerance,
Size requiredSamples,
Size maxSamples) const {
if (!controlVariate_) {
mcModel_ = shared_ptr<MonteCarloModel<MC,RNG,S> >(
new MonteCarloModel<MC,RNG,S>(
pathGenerator(), pathPricer(),
S(), antitheticVariate_));
} else {
... // same as above, but passing the control-variate args, too.
}

if (requiredTolerance != Null<Real>())
this->value(requiredTolerance, maxSamples);
else
this->valueWithSamples(requiredSamples);
}

It takes as arguments either a required tolerance or a required number of samples (one of the arguments must be null, but not both) as well as a maximum number of samples; it instantiates a MonteCarloModel instance, with the controlVariate_ flag determining whether to pass the control-variate arguments to the constructor; and it calls either the value or the valueWithSamples method, depending on what goal was required.

In next post, I'll show an example of how to use the McSimulation class to build a pricing engine; but before that, let me point out a few ways in which it could be improved.

First of all, it currently implements logic to run a simulation based on two criteria (accuracy or total number of samples) but of course, more criteria could be added. For instance, one could run a simulation until a given clock time is elapsed; or again, one could add several batches of samples, look at the result after each one, and stop when convergence seems to be achieved. This suggests that the hard-coded value and valueWithSamples methods could be replaced by an instance of the Strategy pattern, and the switch between them in the calculate method by just a call to the stored strategy.

In turn, this would also remove the current ugliness in calculate: instead of passing the whole set of arguments for both calculations and giving most of them to a null value (like in the good old days, where languages could not overload methods) one would pass only the required arguments to the strategy object, and then the strategy object to calculate.

Finally, the presence of controlPricingEngine method is a bit of a smell. The implementation of McSimulation doesn't use it, so it's not strictly a part of the Template Method pattern and probably shouldn't be here. However, a couple of other classes (both inheriting from McSimulation, but otherwise unrelated) declare it and use it to implement the controlVariateValue method; therefore, leaving it here might not be the purest of designs but prevents some duplication.

Aside: synchronized walking

Both the MonteCarloModel and McSimulation class templates allow one to define an alternate path generator for control variates. Note, however, that this feature should be used with some caution. For this variance-reduction technique to work, the control-variate paths returned from the second generator must be correlated with the regular ones, which basically means that the two path generators must use two identical random-number generators: same kind, dimensionality, and seed (if any).

Unfortunately, this constraint rules out quite a few possibilities. For instance, if you're using a stochastic volatility model, such as the Heston model, you might be tempted to use the Black-Scholes model as control variate. No such luck: for $$n$$ time steps, the Heston process requires $$2n$$ random numbers ($$n$$ for the stock price and $$n$$ for its volatility) while the Black-Scholes process just needs $$n$$. This makes it impossible to keep the two corresponding path generators in sync.

In practice, you'll have a use case for the alternate path generator if you have a process with a number of parameters which is not analytically tractable in the generic case, but has a closed-form solution for your option value if some of the parameters are null. If you're so lucky, you can use a fully calibrated process to instantiate the main path generator, and another instance of the process with the null parameters to generate control-variate paths.