{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"provenance": []
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
}
},
"cells": [
{
"cell_type": "markdown",
"source": [
"#**Lab 2**\n",
"\n",
"**Name: (your name)**\n",
"\n"
],
"metadata": {
"id": "6VpZ01tyznHo"
}
},
{
"cell_type": "markdown",
"source": [
"Some remarks before you start:\n",
"* Import the Python packages below each time you start a session.\n",
"* Before submitting your report, remove all ouputs by clicking on **Edit🠦Clear all ouputs**. Then download your report by clicking on **File🠦Download🠦.ipynb**. This is the file you will submit on Canvas.\n",
"\n",
"In this lab, you will practice:\n",
"* perform matrix operations (addition, subtraction, multiplication)\n",
"* find stationary vector of a matrix\n",
"* rank web pages using PageRank algorithm\n",
"* solve a problem in ecology using PageRank algorithm\n",
"\n",
"\\begin{array}{|c| c|}\n",
" \\hline\n",
" \\text{Problems} & \\text{Points}\\\\\n",
" \\hline\\hline\n",
" 1, 3 & 2.5 \\\\\n",
" \\hline\n",
" 2, 4, 5, 6, 7 & 4 \\\\\n",
" \\hline\n",
" \\text{Readability of your report} & 3 \\\\\n",
" \\hline\n",
" \\text{Total: 7} & \\text{Total: 28} \\\\\n",
" \\hline\n",
"\\end{array}\n",
"\n"
],
"metadata": {
"id": "3cVoL2qvzufn"
}
},
{
"cell_type": "markdown",
"source": [
"##**I. Import necessary Python packages**\n",
"Execute the following code each time you start a session."
],
"metadata": {
"id": "bQagzUIHzzb_"
}
},
{
"cell_type": "code",
"source": [
"from sympy import* # import everything from the standard Python symbolic-computing package (SymPy)\n",
"from scipy import* # import everything from the standard Python scientific-computing package (SciPy)"
],
"metadata": {
"id": "sGBvQoGSzoih"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"## **II. Matrix operations**\n",
"\n",
"It is more convenient to perform matrix operations with the SymPy package than with the Numpy package. As a tutorial example, let us define two matrices\n",
"\n",
"$\n",
"A=\\left[\n",
" \\begin{matrix}\n",
" 1&2\\\\\n",
" 3&2\n",
" \\end{matrix}\n",
"\\right]\n",
"$ and $\n",
"B=\\left[\n",
" \\begin{matrix}\n",
" -2&0\\\\\n",
" 2&1\n",
" \\end{matrix}\n",
"\\right]\n",
"$"
],
"metadata": {
"id": "yGFzT_PXz431"
}
},
{
"cell_type": "markdown",
"source": [
"using the command *Matrix* (rather than *array*):"
],
"metadata": {
"id": "QqnmIFJyDe1-"
}
},
{
"cell_type": "code",
"source": [
"A = Matrix([[1,2],[3,2]])\n",
"B = Matrix([[-2,0],[2,1]])"
],
"metadata": {
"id": "_fLg6iO5z5sG"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"There are a few different ways to render the matrix $A+B$. Try each of the following commands."
],
"metadata": {
"id": "PZSxjPbZFM2L"
}
},
{
"cell_type": "code",
"source": [
"A + B"
],
"metadata": {
"id": "ndThAdz3nzYV"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"print(A+B)"
],
"metadata": {
"id": "6zhz2n8kHHj_"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"pprint(A+B)"
],
"metadata": {
"id": "XrzEPvV-HJ36"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"Besides numeric matrices, you can also work with symbolic matrices. Try the following:"
],
"metadata": {
"id": "FQTMu6f5MMvZ"
}
},
{
"cell_type": "code",
"source": [
"C = Matrix([['a'],['b']])\n",
"C"
],
"metadata": {
"id": "20jODW9PM1u6"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"###**Exercise 1**\n",
"Run each of the following commands and explain what each of them does."
],
"metadata": {
"id": "DIQOwXFQHl-b"
}
},
{
"cell_type": "code",
"source": [
"A - 2*B"
],
"metadata": {
"id": "0BPjroOYUfj8"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"A*C"
],
"metadata": {
"id": "_gNZ8qABUhqj"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"A**2"
],
"metadata": {
"id": "p_9hNl3kUi9L"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"A**2*B"
],
"metadata": {
"id": "djWZwCJzUkHb"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"A**2*B**2"
],
"metadata": {
"id": "PvgC4YjPUlTt"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"###**Exercise 2**\n",
"Let $\n",
"A=\\left[\n",
" \\begin{matrix}\n",
" 1&2\\\\\n",
" 3&4\n",
" \\end{matrix}\n",
"\\right]\n",
"$. Find all $2\\times 2$ matrices $B$ such that $AB=BA$.\n",
"\n",
"Hint: if you write $\n",
"B=\\left[\n",
" \\begin{matrix}\n",
" a&b\\\\\n",
" c&d\n",
" \\end{matrix}\n",
"\\right]\n",
"$, then the equation $AB=BA$ is equivalent to a linear system of 4 equations with 4 unknowns $a,b,c,d$. You then write an associated matrix of this system and solve it using **rref** (review Lab 1 if you forget how to use it)."
],
"metadata": {
"id": "Go2IWVSOKeW_"
}
},
{
"cell_type": "markdown",
"source": [
"##**III. Stationary vectors of a matrix**\n",
"Consider matrix $\n",
"A=\\left[\n",
" \\begin{matrix}\n",
" -1&0\\\\\n",
" 2&3\n",
" \\end{matrix}\n",
"\\right]\n",
"$. A *stationary vector* of $A$ is a nonzero vector $v$ such that $Av=v$. To find $v$, you rewrite this equation as $$(A-I_2)v=0$$ where $I_2$ is the $2\\times 2$ identity matrix. Write $\n",
"v=\\left[\n",
" \\begin{matrix}\n",
" x\\\\\n",
" y\n",
" \\end{matrix}\n",
"\\right]$.\n",
"\n",
"The above equation becomes\n",
"$$\\left(\\left[\n",
" \\begin{matrix}\n",
" -1&0\\\\\n",
" 2&3\n",
" \\end{matrix}\n",
"\\right]-\\left[\n",
" \\begin{matrix}\n",
" 1&0\\\\\n",
" 0&1\n",
" \\end{matrix}\n",
"\\right]\\right)\\left[\n",
" \\begin{matrix}\n",
" x\\\\\n",
" y\n",
" \\end{matrix}\n",
"\\right]=\\left[\n",
" \\begin{matrix}\n",
" 0\\\\\n",
" 0\n",
" \\end{matrix}\n",
"\\right]$$\n",
"\n",
"which is equivalent to\n",
"\n",
"$$\\left[\n",
" \\begin{matrix}\n",
" -2&0\\\\\n",
" 2&2\n",
" \\end{matrix}\n",
"\\right]\\left[\n",
" \\begin{matrix}\n",
" x\\\\\n",
" y\n",
" \\end{matrix}\n",
"\\right]=\\left[\n",
" \\begin{matrix}\n",
" 0\\\\\n",
" 0\n",
" \\end{matrix}\n",
"\\right]$$\n",
"\n",
"This equation is equivalent to a linear system of 2 equations and 2 unknowns $x,y$. Its associated matrix can be obtained simply by attaching the zero column on the right hand side to the matrix on the left hand side, as follows:\n",
"$$\\left[\n",
" \\begin{matrix}\n",
" -2&0&0\\\\\n",
" 2&2&0\n",
" \\end{matrix}\n",
"\\right]$$"
],
"metadata": {
"id": "Z_m2zyE8PBDJ"
}
},
{
"cell_type": "code",
"source": [
"A = Matrix([[-1,0],[2,3]])\n",
"B = A - eye(2) # B equals A minus the identity matrix\n",
"C = zeros(2,1) # C is a 2x1 matrix of all zeros\n",
"B = B.row_join(C) # attach column 0 on the right of matrix B\n",
"B.rref(lambda x: abs(x) < 10**-10)[0] # any number within 10^-10 of 0 to be regarded as 0"
],
"metadata": {
"id": "0x6z6B0_XHkG"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"From here, you can see that $x=y=0$. Therefore, matrix $A$ has no stationary vectors."
],
"metadata": {
"id": "F_ezhh0EZjX_"
}
},
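{
"cell_type": "markdown",
"source": [
"The tutorial matrix above turned out to have no stationary vectors. As a contrasting sketch, the matrix `M` below is a made-up example (not taken from the lab) that does have some. The code uses SymPy's `nullspace` method as one possible alternative to reading the solution off the reduced matrix; the names `M` and `basis` are chosen only for this illustration."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"from sympy import Matrix, eye\n",
"\n",
"# Made-up example matrix (an assumption for illustration, not from the lab)\n",
"M = Matrix([[2, 0], [1, 1]])\n",
"\n",
"# Stationary vectors solve (M - I)v = 0, so they are the nonzero vectors\n",
"# in the null space of M - I\n",
"basis = (M - eye(2)).nullspace()\n",
"basis  # every nonzero multiple of this basis vector is a stationary vector of M"
],
"metadata": {},
"execution_count": null,
"outputs": []
},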
{
"cell_type": "markdown",
"source": [
"###**Exercise 3**\n",
"Find all stationary vectors of the matrix\n",
"$$A=\\left[\n",
" \\begin{matrix}\n",
" 1/2&1/3&1/4\\\\\n",
" 1/2&1/3&1/2\\\\\n",
" 0&1/3&1/4\n",
" \\end{matrix}\n",
"\\right]$$\n",
"Which of them is a probability vector?"
],
"metadata": {
"id": "J9xAgbPfOvHc"
}
},
{
"cell_type": "markdown",
"source": [
"###**Exercise 4**\n",
"There is a simple numerical method to solve approximately for the probability stationary vector of an $n\\times n$ *stochastic matrix* $A$. The idea is as follows. Let $v_0$ be an arbitrary *probability* vector of length $n$. For each positive integer $k$, let $v_k=A^kv_0$. You can see that\n",
"\n",
"$$v_{k+1}=A^{k+1}v_0=AA^{k}v_0=Av_k$$\n",
"\n",
"Perron-Frobenius theorem guarantees that the sequence of vectors $v_0,v_1,v_2,v_3,...$ converges to some vector $v$. In the limit $k\\to\\infty$, the above equation yields $v=Av$. Vector $v$ is therefore the probability stationary vector of $A$.\n",
"\n",
"With the matrix $A$ given in [Exercise 3](#stocmat),\n",
"* Choose any probability vector $v_0$\n",
"* Compute $v_{k}=A^kv_0$ for $k=5,10,20,40$\n",
"* What is the approximate probability stationary vector of $A$?\n",
"* Now choose a different probability vector $v_0$ and repeat Step 2 and 3. Do you get a different probability stationary vector?"
],
"metadata": {
"id": "YCjhpEj1b0kf"
}
},
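{
"cell_type": "markdown",
"source": [
"The iteration $v_{k+1}=Av_k$ described in Exercise 4 can be sketched on a smaller example. The $2\\times 2$ stochastic matrix `P` and the starting vector `v0` below are made up for illustration (they are not the matrix from Exercise 3), and the name `iterates` is an assumption of this sketch. Exact fractions are entered with SymPy's `Rational` so that no rounding occurs until the values are printed."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"from sympy import Matrix, Rational\n",
"\n",
"# Made-up column-stochastic matrix (each column sums to 1); an assumption for illustration\n",
"P = Matrix([[Rational(9, 10), Rational(1, 2)],\n",
"            [Rational(1, 10), Rational(1, 2)]])\n",
"v0 = Matrix([1, 0])  # an arbitrary probability vector\n",
"\n",
"# Compute v_k = P^k v0 for a few values of k and print decimal approximations\n",
"iterates = {k: P**k * v0 for k in (1, 2, 3, 20)}\n",
"for k, vk in iterates.items():\n",
"    print(k, vk.evalf(6).T)\n",
"\n",
"# In this example the iterates approach (5/6, 1/6), which satisfies P*v = v"
],
"metadata": {},
"execution_count": null,
"outputs": []
},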
{
"cell_type": "markdown",
"source": [
"##**IV. PageRank algorithm**"
],
"metadata": {
"id": "yTs3lEbvc53r"
}
},
{
"cell_type": "markdown",
"source": [
"###**Exercise 5**\n",
"Consider the following internet\n",
"\n",
""
],
"metadata": {
"id": "PoZ9vnh11OPp"
}
},
{
"cell_type": "markdown",
"source": [
"* Intuitively (without any calculation), how would you rank these web pages?\n",
"* Write the transition matrix (without damping) of the internet.\n",
"* Find the PageRank vector of the internet. *Recall:* it is the probability stationary vector of the transition matrix.\n",
"* Does the order of the pages based on the PageRank agree with the order that you guessed earlier?"
],
"metadata": {
"id": "oFr-NkJH2I7i"
}
},
{
"cell_type": "markdown",
"source": [
"###**Exercise 6**\n",
"If the internet has disconnected clusters, the PageRank probability stationary vector may not be unique. In that case, it is ambiguous how to order the pages. Brin and Page's solution is to introduce a damping parameter $d\\in[0,1]$ and adjust the transition matrix to\n",
"$$A=dA'+(1-d)\\frac{1}{n}J$$\n",
"where $n$ is the number of pages, $A'$ is the regular transition matrix (without damping), and $J$ is an $n\\times n$ matrix in which all entries are equal to $1$. To see the rationale behind the damping parameter, imagine that you freely navigate from one web page to another in the following manner: with probability $d$, you click on any link on the page that you are on (all outbound links still have the same chance of being clicked on). With probability $1-d$, you randomly jump to any page on the internet. This is a clever idea to \"connect\" all disconnected clusters.\n",
"\n",
"With the damping parameter $d=0.85$, determine the PageRank vector of the following internet."
],
"metadata": {
"id": "tUzvkc6u3MWQ"
}
},
{
"cell_type": "markdown",
"source": [
""
],
"metadata": {
"id": "7zEnoRxr65BB"
}
},
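{
"cell_type": "markdown",
"source": [
"The damped transition matrix $A=dA'+(1-d)\\frac{1}{n}J$ from Exercise 6 can be sketched as follows. The four-page internet used here is made up for illustration (pages 1 and 2 link only to each other, as do pages 3 and 4); it is not the internet of the exercise. The names `A_prime`, `A_damped` and `pagerank` are assumptions of this sketch. SymPy's `ones` builds $J$, and `nullspace` is used as one possible way to find the stationary vectors."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"from sympy import Matrix, Rational, eye, ones\n",
"\n",
"# Made-up internet with two disconnected clusters: 1 <-> 2 and 3 <-> 4\n",
"A_prime = Matrix([[0, 1, 0, 0],\n",
"                  [1, 0, 0, 0],\n",
"                  [0, 0, 0, 1],\n",
"                  [0, 0, 1, 0]])\n",
"n = A_prime.rows\n",
"d = Rational(85, 100)  # damping parameter d = 0.85 as an exact fraction\n",
"\n",
"# Without damping, the stationary vectors form a 2-dimensional space (not unique)\n",
"print(len((A_prime - eye(n)).nullspace()))\n",
"\n",
"# With damping: A = d*A' + (1-d)*(1/n)*J, where J is the all-ones matrix\n",
"A_damped = d * A_prime + (1 - d) * ones(n, n) / n\n",
"basis = (A_damped - eye(n)).nullspace()\n",
"pagerank = basis[0] / sum(basis[0])  # rescale so that the entries sum to 1\n",
"pagerank"
],
"metadata": {},
"execution_count": null,
"outputs": []
},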
{
"cell_type": "markdown",
"source": [
"###**Exercise 7**\n",
"Surprisingly, the PageRank algorithm used by Google to organize the web pages from a Google search result can be used to model food webs. Consider a mini food web in the below figure."
],
"metadata": {
"id": "RUrzT4qp7Ytx"
}
},
{
"cell_type": "markdown",
"source": [
""
],
"metadata": {
"id": "MaadbT4y79BJ"
}
},
{
"cell_type": "markdown",
"source": [
"Although *Grass* is not an animal, it is still a node of the food web. The arrow $A\\to B$ indicates that $A$ feeds $B$. For example, *Lion* feeds *Grass* by its manure. Unlike in the PageRank model, the “authority” of a node was not determined by incoming links to that node, but rather by outgoing links. The more other species that a given species supports with the nutrients passing though it, the more important it is to the ecosystem. Regarding the \"quality\" of links, unlike in the PageRank model where all outbound links at a given node are regarded as equal, animals typically prefer certain food/preys over others. Assume the following:\n",
"\n",
"* *Fox*'s preference is $1/3$ for *Cricket*, $2/3$ for *Vole*.\n",
"* *Grass*'s preference is $1/5$ for *Fox*, $1/3$ for *Lion*, $1/3$ for *Warthog*, and $1/30$ for each other species. Think of it this way: *Lion* and *Warthog* produce the most manure, so *Grass* prefers them more.\n",
"* *Lion*'s preference is $1/6$ for *Fox*, $5/6$ for *Warthog*.\n",
"* *Owl* and *Snake* have no preference among their preys.\n",
"* *Warthog*'s preference is $1/6$ for *Cricket*, $5/6$ for *Grass*.\n",
"\n",
"Find the PageRank vector to determine the importance level of each species in the ecosystem.\n"
],
"metadata": {
"id": "6S1vAIbp8pqC"
}
}
]
}