---
title: "An R Package for Fast Sampling from von Mises Fisher Distribution"
author: "Aristide Houndetoungan"
date: "`r Sys.Date()`"
output:
  pdf_document:
    citation_package: natbib
    number_sections: true
  bookdown::pdf_book:
    citation_package: biblatex
bibliography: ["References.bib", "Packages.bib"]
biblio-style: "apalike"
link-citations: true
urlcolor: blue
vignette: >
  %\VignetteIndexEntry{An R Package for Fast Sampling from von Mises Fisher Distribution}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

<style>
body {
text-align: justify}
</style>

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

# Introduction{-#over}

The package **vMF** simulates from the von Mises-Fisher distribution ($\mathcal{M}$). Unlike the package [**movMF**](https://CRAN.R-project.org/package=movMF) [@hornik2014movmf], which simulates from and estimates mixtures of $\mathcal{M}$, **vMF** performs fast sampling, as its source code is written in C++. **vMF** also computes the density and the normalization constant of $\mathcal{M}$.

The von Mises-Fisher distribution is used to model coordinates on a hypersphere of dimension $p \ge 2$. Roughly speaking, it is the analogue of the normal distribution on a hypersphere. Like the normal distribution, $\mathcal{M}$ is characterized by two parameters: the location (or mean direction) parameter $\boldsymbol{\mu}$, around which the draws are concentrated, and the intensity (or concentration) parameter $\eta$, which measures how strongly the draws concentrate around $\boldsymbol{\mu}$. The higher $\eta$, the more concentrated the draws are around $\boldsymbol{\mu}$. By analogy with the normal distribution, $\boldsymbol{\mu}$ is similar to the mean parameter and $1/\eta$ is similar to the standard deviation.
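
As a quick illustration, larger values of $\eta$ pull the draws toward $\boldsymbol{\mu}$. The sketch below assumes, consistently with the examples later in this vignette, that `rvMF(n, theta)` is parameterized by $\boldsymbol{\theta} = \eta\boldsymbol{\mu}$.

```{r etaeffect, echo = TRUE, eval = TRUE}
library(vMF)
set.seed(123)
mu    <- c(0, 0, 1)
weak  <- rvMF(500, 1 * mu)   # eta = 1: widely dispersed draws
tight <- rvMF(500, 50 * mu)  # eta = 50: draws close to mu
# The average cosine similarity with mu increases with eta
c(mean(weak %*% mu), mean(tight %*% mu))
```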

There are several definitions of the density function of $\mathcal{M}$. In this package, the density is normalized with respect to the uniform distribution on the hypersphere, without loss of generality; the same convention is used in @mardia2009directional and @hornik2013conjugate.

Let $\mathbf{z} \sim \mathcal{M}\left(\eta,\boldsymbol{\mu}\right)$. The density of $\mathbf{z}$ is given by

$$f_p(\mathbf{z}|\eta, \boldsymbol{\mu}) =C_p(\eta) e^{\eta\mathbf{z}'\boldsymbol{\mu}},$$
where $\displaystyle C_p(x) = \left(\frac{x}{2}\right)^{\frac{p}{2}-1}\frac{1}{\Gamma\left(\frac{p}{2}\right)I_{\frac{p}{2}-1}(x)}$ is the normalization constant and $I_{\cdot}(\cdot)$ is the modified Bessel function of the first kind, defined by $$\displaystyle I_{\alpha}(x) = \sum_{m=0}^{\infty}\frac{\left(\frac{x}{2}\right)^{2m+\alpha}}{m!\,\Gamma(m+\alpha + 1)}.$$
\noindent The normalization with respect to the uniform distribution implies $C_p(0)=1$.
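
As a numerical check, $C_p(x)$ can be evaluated directly in base R with `besselI()`. This is a minimal sketch; the commented call to `CpvMF` refers to the package's normalization-constant function, whose interface (dimension first, then intensity) I treat as an assumption here.

```{r cpcheck, echo = TRUE, eval = TRUE}
# Evaluate C_p(x) from its definition using base R's besselI()
Cp <- function(p, x) {
  if (x == 0) return(1)  # limiting value: normalization w.r.t. the uniform
  (x / 2)^(p / 2 - 1) / (gamma(p / 2) * besselI(x, p / 2 - 1))
}
Cp(3, 2)
# CpvMF(3, 2)  # the package's normalization constant (assumed interface)
```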

# Simulation from von Mises Fisher distribution{-#sim}
The following algorithm provides a rejection sampling scheme for drawing from $\mathcal{M}$ with mean direction parameter $\boldsymbol{\mu} = (0, \ldots, 0, 1)$ and concentration (intensity) parameter $\eta \ge 0$ [see Section 2.1 in @hornik2014movmf].

* Step 1. Calculate $b$ using

$$b = \dfrac{p - 1}{2\eta + \sqrt{4\eta^2 + (p - 1)^2}}.$$ Let $x_0 = (1 - b)/(1 + b)$ and $c = \eta x_0 + (p - 1)\log\left(1 - x_0^2\right)$.

* Step 2. Generate $Z \sim \mathrm{Beta}\left((p-1)/2,(p-1)/2\right)$ and $U \sim \mathrm{Unif}([0, 1])$, and calculate

$$W = \dfrac{1-(1+b)Z}{1-(1-b)Z}.$$

* Step 3. If

$$\eta W+(p-1)\log(1-x_0W)-c<\log(U),$$ go to Step 2; otherwise, proceed to Step 4.

* Step 4. Generate a uniform $(p-1)$-dimensional unit vector $\mathbf{V}$ and return

$$\mathbf{X} =\left(\sqrt{1- W^2}\mathbf{V}^{\prime},W\right)^{\prime}.$$

The uniform $(p-1)$-dimensional unit vector $\mathbf{V}$ can be generated by simulating $p-1$ independent
standard normal random variables and normalizing them so that $\lVert \mathbf{V} \rVert_2 = 1$. To sample
from $\mathcal{M}$ with an arbitrary mean direction parameter $\boldsymbol{\mu}$, $\mathbf{X}$ is multiplied
from the left by an orthogonal matrix whose first $p-1$ columns form a basis of the subspace orthogonal
to $\boldsymbol{\mu}$ and whose last column is equal to $\boldsymbol{\mu}$.
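
For concreteness, here is a plain-R sketch of this sampler, for illustration only: the package implements the algorithm in C++, and the helper name `rvmf1` as well as the QR-based rotation are mine, not the package's code.

```{r algosketch, echo = TRUE, eval = TRUE}
# Plain-R sketch of the rejection sampler above (illustration only;
# vMF implements it in C++). mu is the mean direction, eta >= 0.
rvmf1 <- function(mu, eta) {
  p   <- length(mu)
  mu  <- mu / sqrt(sum(mu^2))                    # ensure a unit vector
  # Step 1
  b   <- (p - 1) / (2 * eta + sqrt(4 * eta^2 + (p - 1)^2))
  x0  <- (1 - b) / (1 + b)
  cst <- eta * x0 + (p - 1) * log(1 - x0^2)
  # Steps 2 and 3: rejection loop
  repeat {
    Z <- rbeta(1, (p - 1) / 2, (p - 1) / 2)
    U <- runif(1)
    W <- (1 - (1 + b) * Z) / (1 - (1 - b) * Z)
    if (eta * W + (p - 1) * log(1 - x0 * W) - cst >= log(U)) break
  }
  # Step 4: draw for the north pole (0, ..., 0, 1)
  V <- rnorm(p - 1)
  V <- V / sqrt(sum(V^2))
  X <- c(sqrt(1 - W^2) * V, W)
  # Rotate so the north pole maps to mu: build an orthogonal matrix
  # whose last column is mu (QR determines its first column up to sign)
  Qf <- qr.Q(qr(matrix(mu, ncol = 1)), complete = TRUE)
  A  <- cbind(Qf[, -1, drop = FALSE], sign(sum(Qf[, 1] * mu)) * Qf[, 1])
  c(A %*% X)
}

rvmf1(c(0, 0, 1), eta = 10)  # one draw on the unit sphere in R^3
```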

# Comparison of vMF and movMF{-#comp}
In this section, I compare **vMF** and **[movMF](https://CRAN.R-project.org/package=movMF)**. 

<!--First, I compare draws from both packages to make sure that they are the same.

```{r comp1, echo = TRUE, eval = TRUE}
library(vMF)
library(movMF)

n      <- 5 # Number of draws
set.seed(123)
xvMF   <- rvMF(n,c(1,0,0))
set.seed(123)
xmovMF <- rmovMF(n,c(1,0,0))
all.equal(c(xvMF), c(xmovMF))
xvMF
xmovMF
```
The outputs are surprisingly different although the seed is fixed. After studying the source code of **[movMF](https://CRAN.R-project.org/package=movMF)**, I noticed that the two packages do not implement their sampling functions in the same way. The function `rmovMF` is more general: it is designed to draw samples from a mixture of $\mathcal{M}$, and it first generates $n$ (the sample size) random numbers from the uniform distribution before running the algorithm described above. This extra simulation advances the state of the random number generator. One can therefore obtain the same output by first simulating $n$ numbers from the uniform distribution before calling the function `rvMF`.

```{r comp2, echo = TRUE, eval = TRUE}
n      <- 30
set.seed(123)
ddpcr::quiet(runif(n))
xvMF   <- rvMF(n,c(1,0,0))
set.seed(123)
xmovMF <- rmovMF(n,c(1,0,0))
all.equal(c(xvMF), c(xmovMF))
xvMF[1:5,]
xmovMF[1:5,]
```

I now compare the performance of both packages.-->

I benchmark `rvMF` against `rmovMF` using the **rbenchmark** package, for different numbers of draws $n$.
```{r ex1, echo = TRUE, eval = TRUE}
library(vMF)
library(movMF)
library(rbenchmark)

fcompare <- function(n) {
  benchmark("vMF" = rvMF(n, c(1, 0, 0)), "movMF" = rmovMF(n, c(1, 0, 0)))
}

fcompare(1)
fcompare(10)
fcompare(100)
```
**vMF** outperforms **[movMF](https://CRAN.R-project.org/package=movMF)**. The gain is largest when only a few draws are requested; when the sample size is large, the two packages require approximately the same running time.

```{r ex11a, echo = FALSE, eval = TRUE}
#load data to save time during the building
load("out.Rdata")
```
```{r ex11b, echo = TRUE, eval = FALSE}
out  <- unlist(lapply(1:200, function(x) {
  bench <- fcompare(x)
  # benchmark() orders rows alphabetically by test name, so row 1 is
  # movMF: the ratio is movMF's elapsed time over vMF's
  bench$elapsed[1] / bench$elapsed[2]
}))
```
```{r ex11c, echo = TRUE, eval = TRUE, fig.height = 4, fig.align = "center"}
library(ggplot2)
ggplot(data = data.frame(n = 1:200, ratio = out), aes(x = n, y = ratio)) +
  geom_point(col = "blue") + geom_hline(yintercept = 1, col = 2) +
  labs(x = "Number of draws n", y = "Running-time ratio (movMF/vMF)")
```

Many papers use draws from the von Mises-Fisher distribution within a Markov chain Monte Carlo (MCMC) process, where a single draw is performed at each iteration; this is for example the case in @boucher2019partial, @breza2020using, and @mccormick2015latent. In such a context, using **vMF** takes much less time than **[movMF](https://CRAN.R-project.org/package=movMF)**.
As an example, I consider the process $\left(\mathbf{z}_t\right)_{t\in\mathbb{N}}$ following a von Mises-Fisher random walk. The starting point $\mathbf{z}_0$ is set randomly on the hypersphere in $\mathbb{R}^4$, and $\mathbf{z}_t \sim \mathcal{M}\left(1,\mathbf{z}_{t-1}\right)$ for all $t > 0$. Simulating this process has about the same complexity as using von Mises-Fisher draws in an MCMC.

```{r ex2a, echo = TRUE, eval = FALSE}
set.seed(123)
P                  <- 4
initial            <- rmovMF(1, rep(0, P))
# Function based on vMF to simulate the process
SamplevMF          <- function(n) {
  output           <- matrix(0, n + 1, P)
  output[1, ]      <- initial
  for (i in 1:n) {
    output[i + 1,] <- rvMF(1, output[i,])
  }
  return(output)
}

# Function based on movMF to simulate the process
SamplemovMF        <- function(n) {
  output           <- matrix(0, n + 1, P)
  output[1, ]      <- initial
  for (i in 1:n) {
    output[i + 1,] <- rmovMF(1, output[i,])
  }
  return(output)
}
benchmark("vMF" = SamplevMF(1000), "movMF" = SamplemovMF(1000))
```
```{r ex2b, echo = FALSE, eval = TRUE}
print(outbench)
```
The comparison of the running times confirms that **vMF** is less time-consuming than **[movMF](https://CRAN.R-project.org/package=movMF)** when draws are performed sequentially, as in an MCMC.