{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "### DS4420: A note on multivariate differentiation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the in-class exercise from 1/9, one step involved finding $\\nabla_{\\bf x} z$ Where $z = {\\bf x}^t {\\bf x}$ and ${\\bf x} \\in \\mathcal{R}^d$.\n", "Many of you broke up the dot product, trying to do something like: $\\sum_i^d \\frac{d}{dx} {\\bf x}_i$. I think folks mostly had the right idea, but ended up collapsing into a single dimension by summing over $2 \\cdot x_i$ terms. \n", "\n", "This is understandable; you may not be used to multivariate calc (or rusty in general) -- so don't worry! Just remember that gradients of ${\\bf x}$ need to have the same dimension as ${\\bf x}$! Hence the warning: Mind your dimensions!\n", "\n", "Recall from lecture that when we are interested in $\\nabla_{\\bf x} f({\\bf x})$, we are looking for a *vector* (collection) of partial derivatives for each ${\\bf x}_i$.\n", "\n", "Consider ${\\bf x}_1$: $\\frac{\\partial z}{\\partial {\\bf x}_1}$. What is this? $\\frac{\\partial}{\\partial {\\bf x}_1} {\\bf x}^t {\\bf x} = \\sum_i^d \\frac{\\partial}{\\partial {\\bf x}_1} {\\bf x}^t_i {\\bf x}_i$; the only term that won't be zeroed out here is when $i=1$: $\\frac{\\partial}{\\partial {\\bf x}_1}{\\bf x}^t_1 {\\bf x}_1 = \\frac{\\partial}{\\partial {\\bf x}_1}{\\bf x}_1^2 = 2 {\\bf x}_1$. Similarly, $\\frac{\\partial}{\\partial {\\bf x}_2}{\\bf x}^t {\\bf x} = \\frac{\\partial}{\\partial {\\bf x}_2}{\\bf x}^t_2 {\\bf x}_2 = \\frac{\\partial}{\\partial {\\bf x}_2}{\\bf x}_2^2 = 2 {\\bf x}_2$, and so on. \n", "\n", "So we assemble each of these into a vector $\\nabla_{\\bf x}$, where entry $i$ ends up being $2{\\bf x}_i$: $\\nabla_{\\bf x} = [2{\\bf x}_1, ..., 2{\\bf x}_d]$. 
Hence we have $\\nabla_{\\bf x} z = 2{\\bf x}$.\n", "\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import torch\n", "import torch.optim as optim\n", "import torch.nn as nn" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "tensor([0.7032, 0.8064, 1.0841, 1.2363, 1.7031, 0.4266, 0.2907, 1.7737, 1.2545,\n", " 0.6804], dtype=torch.float64)\n" ] } ], "source": [ "D = 10  # arbitrary dimension\n", "x = torch.tensor(np.random.random(D), requires_grad=True)\n", "z = torch.dot(x, x)  # z = x^T x\n", "z.backward()\n", "x_grad = x.grad  # should equal 2 * x\n", "print(x_grad)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "torch.Size([10])\n" ] } ], "source": [ "print(x_grad.shape)  # same shape as x, as expected" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 2 }
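The torch cells above verify the derivation with autograd. As an independent sanity check, the same result can be sketched with a plain NumPy finite-difference approximation; the `finite_difference_grad` helper below is illustrative (not part of the notebook), but it confirms numerically that the gradient of $z = {\bf x}^t {\bf x}$ is $2{\bf x}$ and has the same shape as ${\bf x}$:

```python
import numpy as np

def finite_difference_grad(f, x, eps=1e-6):
    # Central-difference approximation of the gradient of f at x:
    # grad_i ~ (f(x + eps * e_i) - f(x - eps * e_i)) / (2 * eps)
    grad = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        grad[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
x = rng.random(10)  # same setup as the torch cell: a random vector in R^10

num_grad = finite_difference_grad(lambda v: v @ v, x)  # z = x^T x

# The derivation says the gradient is 2x, one entry per x_i.
print(num_grad.shape)                           # prints (10,)
print(np.allclose(num_grad, 2 * x, atol=1e-5))  # prints True
```

Because $z$ is quadratic, the central difference recovers $2{\bf x}_i$ almost exactly in each coordinate, so the agreement holds to well below the `atol` used here.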