{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "### DS4420: A note on multivariate differentiation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the in-class exercise from 1/9, one step involved finding $\\nabla_{\\bf x} z$ Where $z = {\\bf x}^t {\\bf x}$ and ${\\bf x} \\in \\mathcal{R}^d$.\n", "Many of you broke up the dot product, trying to do something like: $\\sum_i^d \\frac{d}{dx} {\\bf x}_i$. I think folks mostly had the right idea, but ended up collapsing into a single dimension by summing over $2 \\cdot x_i$ terms. \n", "\n", "This is understandable; you may not be used to multivariate calc (or rusty in general) -- so don't worry! Just remember that gradients of ${\\bf x}$ need to have the same dimension as ${\\bf x}$! Hence the warning: Mind your dimensions!\n", "\n", "Recall from lecture that when we are interested in $\\nabla_{\\bf x} f({\\bf x})$, we are looking for a *vector* (collection) of partial derivatives for each ${\\bf x}_i$.\n", "\n", "Consider ${\\bf x}_1$: $\\frac{\\partial z}{\\partial {\\bf x}_1}$. What is this? $\\frac{\\partial}{\\partial {\\bf x}_1} {\\bf x}^t {\\bf x} = \\sum_i^d \\frac{\\partial}{\\partial {\\bf x}_1} {\\bf x}^t_i {\\bf x}_i$; the only term that won't be zeroed out here is when $i=1$: $\\frac{\\partial}{\\partial {\\bf x}_1}{\\bf x}^t_1 {\\bf x}_1 = \\frac{\\partial}{\\partial {\\bf x}_1}{\\bf x}_1^2 = 2 {\\bf x}_1$. Similarly, $\\frac{\\partial}{\\partial {\\bf x}_2}{\\bf x}^t {\\bf x} = \\frac{\\partial}{\\partial {\\bf x}_2}{\\bf x}^t_2 {\\bf x}_2 = \\frac{\\partial}{\\partial {\\bf x}_2}{\\bf x}_2^2 = 2 {\\bf x}_2$, and so on. \n", "\n", "So we assemble each of these into a vector $\\nabla_{\\bf x}$, where entry $i$ ends up being $2{\\bf x}_i$: $\\nabla_{\\bf x} = [2{\\bf x}_1, ..., 2{\\bf x}_d]$. 
Hence we have $\\nabla_{\\bf x} z = 2{\\bf x}$.\n", "\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import torch\n", "import torch.optim as optim\n", "import torch.nn as nn" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "tensor([0.7032, 0.8064, 1.0841, 1.2363, 1.7031, 0.4266, 0.2907, 1.7737, 1.2545,\n", " 0.6804], dtype=torch.float64)\n" ] } ], "source": [ "D = 10  # arbitrary dimension\n", "x = torch.tensor(np.random.random(D), requires_grad=True)\n", "z = torch.dot(x, x)  # z = x^T x\n", "z.backward()\n", "x_grad = x.grad  # should equal 2 * x\n", "print(x_grad)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "torch.Size([10])\n" ] } ], "source": [ "print(x_grad.shape)  # same shape as x, as expected" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 2 }
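The torch cells above verify the derivation with autograd. As an independent sanity check, the same result can be sketched with a plain NumPy finite-difference approximation; the `finite_difference_grad` helper below is illustrative (not part of the notebook), but it confirms numerically that the gradient of $z = {\bf x}^t {\bf x}$ is $2{\bf x}$ and has the same shape as ${\bf x}$:

```python
import numpy as np

def finite_difference_grad(f, x, eps=1e-6):
    # Central-difference approximation of the gradient of f at x:
    # grad_i ~ (f(x + eps * e_i) - f(x - eps * e_i)) / (2 * eps)
    grad = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        grad[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
x = rng.random(10)  # same setup as the torch cell: a random vector in R^10

num_grad = finite_difference_grad(lambda v: v @ v, x)  # z = x^T x

# The derivation says the gradient is 2x, one entry per x_i.
print(num_grad.shape)                           # prints (10,)
print(np.allclose(num_grad, 2 * x, atol=1e-5))  # prints True
```

Because $z$ is quadratic, the central difference recovers $2{\bf x}_i$ almost exactly in each coordinate, so the agreement holds to well below the `atol` used here.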