{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Scaling - Generating power laws via SSRP\n",
"In this exercise we will see how Sample Space Reducing Processes (SSRP) can lead to power law distributions. \n",
"We will simulate the model seen in the lecture for a \"ball jumping down a stairway\". In this model, a ball jumps down from state $j$ to any state $i$ such that $i < j$, until the ball reaches the bottom. The process can be restarted at each step with probability $1 - \\lambda$ (where $\\lambda$ is called the driving rate), while a vector $q$ determines prior probabilities for visiting each state. \n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import matplotlib.pyplot as plt"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First, we'll implement a sampler from $p(i|j)_{q, \\lambda}$, where $q$ is a vector of prior probabilities (so that visiting state $i$ occurs with probability proportional to $q_i$), and $\\lambda$ is the driving rate, so that with probability $1-\\lambda$ you restart the process (and jump to any other state).\n",
"\n",
"$$ \n",
"p(i|j)_{q, \\lambda} = \n",
" \\begin{cases}\n",
" \\lambda \\frac{q_i}{g_{j-1}} + (1-\\lambda)q_i & \\quad\\text{if } i < j\\\\\n",
" (1-\\lambda)q_i &\\quad\\text{otherwise.} \\\\ \n",
" \\end{cases}\n",
"$$\n",
"\n",
"Where $g_{j} = \\sum_{k=1}^{j} q_k$ is the equivalent of normalizing the subvector of the first $j$ elements of $q$ so that the probabilities in this reduced sample space sum to 1. If you want, use the function `normalize_q` to generate the vector of ratios $\\frac{q_i}{g_{j-1}}$.\n",
"\n",
"*Hint:* you can use `np.random.choice` to obtain a sample given a vector of probabilities"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def sample_i(j, lmb, N, q):\n",
" \"\"\"\n",
" Obtain a new state i given that the current state is j\n",
" j: current state\n",
" lmb: driving rate (1-probability of restatring the process)\n",
" N: total number of states\n",
" q: prior state probabilities\n",
" \n",
" return i, the new state\n",
" \"\"\"\n",
" return 0"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We give a sample function for `normalize_q`. Depending on your implementation of `sample_i`, you might need to adjust it. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def normalize_q(q, j):\n",
" \"\"\"\n",
" Given a an np.array of probabilities q and the current state j, \\\n",
" obtain a new vector of probabilities q^j with the first j \\\n",
" elements of q, and normalize it so it so sum(q) = 1. \n",
" \"\"\"\n",
" q_j = q[:j].copy()\n",
" g_j = q_j.sum()\n",
" q_j /= g_j\n",
" return q_j"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, the function `run_ssrp` obtains a sample from this model. We will empirically confirm that this process generates samples that behave like power laws. \n",
"As a reminder, this process starts from state $i=N$ (the \"top of the stairway\"), and visits lower states (analogous to the \"ball falling\") with probabilities proportional to the vector $q$. The process restarts either with probability $1-\\lambda$ or when it reaches the bottom of the state. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def run_ssrp(N=30, lmb=1, n=100, q=None):\n",
" \"\"\"\n",
" N: number of states (size of staircase)\n",
" lmb: driving rate (1-probability of restarting the process at any step)\n",
" n: number of processes to run (equivalent to sample size) \n",
" q: vector of prior states probabilities\n",
" \"\"\"\n",
" if q is None:\n",
" # If q not provided, make it a unifrom probability\n",
" q = np.ones(N-1)\n",
" q /= q.sum() \n",
" l_samples = []\n",
" i = N\n",
" for k in range(n):\n",
" while i > 1:\n",
" i = sample_i(i, lmb, N, q)\n",
" l_samples.append(i)\n",
" i = N\n",
" return l_samples\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def plot(samples, lmb=1, N=30, line=True):\n",
" \"\"\"\n",
" Sorry for the bad visualization\n",
" \"\"\"\n",
" plt.hist(samples, N-1, density=True)\n",
" if line:\n",
" x = range(1, N)\n",
" y = [(1./i)**(lmb) for i in x]\n",
" y = [i/sum(y) for i in y]\n",
" plt.plot(x, y)\n",
" plt.xlabel('State i')\n",
" plt.ylabel('p(i)')\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now let's explore whether it's true that the visiting distribution for each state is a power law $p(i) \\propto i^{-\\alpha}$. Our plotting function provides a power law reference (note that it might not hold for the initial states, but the decay should). In the first case let's have $q$ as a constant vector and $\\lambda = 1$. Fast parameters for the number of states and the number of processes to run can be $N=30$ and $n=150$."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"lmb=1\n",
"N = 30\n",
"samples = run_ssrp(N=N, lmb=lmb, n=150)\n",
"plot(samples, lmb=lmb, N=N, line=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next let's set $\\lambda = .2$. What happens?"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"lmb=.2\n",
"N = 30\n",
"samples = run_ssrp(N=N, lmb=lmb, n=150)\n",
"plot(samples, lmb=lmb, N=N, line=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, what happens when $q(i) \\propto i^3$? Create such a vector $q$ and see if the power law behaviour remains. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"q = np.array([]) #create priors\n",
"q /= q.sum() # remember to normalize so they sum to 1\n",
"lmb = 1\n",
"N = 30\n",
"samples = run_ssrp(q=q, N=N, n=150, lmb=lmb)\n",
"plot(samples, lmb=lmb, line=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Last, let's have extremely large prior probabilties for the larger states, such as $q(i) \\propto e^i$. Are the visiting probabilities still a power law?"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"q = np.array([np.exp(i) for i in range(1, 30)])\n",
"q /= q.sum()\n",
"N = 30\n",
"samples = run_sampler(q=q, N=N, n=150)\n",
"plot(samples, line=False)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.2"
}
},
"nbformat": 4,
"nbformat_minor": 2
}