masters-thesis/plots/plot_histograms.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "# Description of Functionality\n",
    "This script loads csv files with the following information:\n",
    "* Time the client enqueues the packet (`enq`) or time the clien actually sends the packet (`send`)\n",
    "* Time the client gets a Work Completion (`send_wc`)\n",
    "* Time the server receives the packet (`recv`)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Algorithm\n",
    "\n",
    "## Load settings from JSON file\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "from sys import argv\n",
    "rootdir = argv[1]\n",
    "\n",
    "#############################\n",
    "#      FOR NOTEBOOK USE     #\n",
    "#     SET DIRECTORY HERE    #\n",
    "#                           #\n",
    "#rootdir = \"\"\n",
    "#                           #\n",
    "#############################\n",
    "\n",
    "print(\"Using root directory: {}\".format(rootdir))\n",
    "\n",
    "subdirs = sorted([ name for name in os.listdir('{}'.format(rootdir)) if os.path.isdir(os.path.join('{}'.format(rootdir), name)) ])\n",
    "\n",
    "print(\"Available subdirs: {}\".format(subdirs))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "from sys import exit\n",
    "\n",
    "try:\n",
    "    with open(\"{}/settings.json\".format(rootdir)) as json_file:\n",
    "        settings = json.load(json_file)\n",
    "except:\n",
    "    print(\"Please define a correct JSON file!\")\n",
    "    exit()\n",
    "\n",
    "print(\"Succesfully loaded JSON file\")"
   ]
  },
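  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The remaining cells of this notebook read three keys from `settings.json`: `compare_tests`, `labels`, and `dimensions.legend`. As a minimal sketch, a settings file could be built as shown here. Only the key names are taken from this notebook; the concrete values (label texts, legend size) are illustrative assumptions.\n",
    "\n",
    "```python\n",
    "import json\n",
    "\n",
    "# Hypothetical example values; only the key names come from this notebook.\n",
    "example_settings = {\n",
    "    'compare_tests': False,              # False: plot recv and send_wc deltas of one test\n",
    "    'labels': ['recv', 'send_wc'],       # text labels for datasets[0] and datasets[1]\n",
    "    'dimensions': {'legend': [4, 0.5]},  # figsize of the separate legend figure, in inches\n",
    "}\n",
    "\n",
    "# Round-trip through JSON to show the on-disk form of the file.\n",
    "settings_text = json.dumps(example_settings, indent=2)\n",
    "loaded = json.loads(settings_text)\n",
    "print(settings_text)\n",
    "```"
   ]
  },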
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Import\n",
    "First, import the numpy library, initialize the arrays, and finally load the csv files. \n",
    "\n",
    "Because of the way the C script dumps the variables, the last character of the csv-file will be a comma and thus the last value of the `*_times` arrays will be `NaN`. Hence, the last value has to be eliminated."
   ]
  },
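  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To illustrate the trailing-comma effect described above, the following minimal sketch parses an in-memory line with `np.genfromtxt` and drops the resulting `NaN`. The sample values are made up and do not come from the measurements.\n",
    "\n",
    "```python\n",
    "import io\n",
    "import numpy as np\n",
    "\n",
    "# Made-up sample line that ends with a comma, like the dumped CSV files.\n",
    "raw = np.genfromtxt(io.StringIO('100,250,400,'), delimiter=',')\n",
    "\n",
    "# The empty field after the last comma is parsed as NaN.\n",
    "print(raw)\n",
    "\n",
    "# Drop the last element, as done for every loaded array in this notebook.\n",
    "cleaned = np.delete(raw, -1)\n",
    "print(cleaned)\n",
    "```"
   ]
  },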
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# Initialize arrays\n",
    "enq_send = []\n",
    "recv = []\n",
    "\n",
    "if not settings['compare_tests']:\n",
    "    send_wc = []\n",
    "\n",
    "# Load all data and remove the last comma.\n",
    "# This for loop distinguish between tests which measure the enqueue time and tests which\n",
    "# measure the actual send time.\n",
    "for i, subdir in enumerate(subdirs):\n",
    "    enq_send.append(np.genfromtxt('{}/{}/enq_send_times.csv'.format(rootdir, subdir), delimiter=','))\n",
    "    recv.append(np.genfromtxt('{}/{}/recv_times.csv'.format(rootdir, subdir), delimiter=','))\n",
    "    \n",
    "    # Remove last comma\n",
    "    enq_send[i] = np.delete(enq_send[i], -1)\n",
    "    recv[i] = np.delete(recv[i], -1)\n",
    "    \n",
    "    if not settings['compare_tests']:\n",
    "        send_wc.append(np.genfromtxt('{}/{}/send_wc_times.csv'.format(rootdir, subdir), delimiter=','))\n",
    "        \n",
    "        # Remove last comma\n",
    "        send_wc[i] = np.delete(send_wc[i], -1)\n",
    "\n",
    "    #Print number of datapoints\n",
    "    print('Loaded {} + {} datapoints from {}.'.format(np.size(enq_send[i]), np.size(recv[i]), subdir))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "## Process data\n",
    "Now, the data is processed. First, a check for overflows have to be performed. The timestamps are determined with the function `clock_gettime(clockid_t clk_id, const struct timespec *tp)`. Both the `struct tp`, as well as the function are showed below\n",
    "\n",
    "```\n",
    "struct timespec {\n",
    "  time_t   tv_sec;        /* seconds */\n",
    "  long     tv_nsec;       /* nanoseconds */\n",
    "} tp;\n",
    "\n",
    "clock_gettime(CLOCK_MONOTONIC, &tp);\n",
    "```\n",
    "\n",
    "The application only sends the `long tv_nsec` value, which goes from 999999999ns to 0ns. Since transmissions cannot take longer than 1 second, this overflow is resolved by adding 1000000000ns to the receive timestamps and the send confirmation timestamps, if they are smaller than the enqueue- or send timestamps.\n",
    "\n",
    "Subsequentely, the deltas between the enqueue- or send time and the receive time, and the delta between the enqueue- or send time and the send confirmation time are calculated."
   ]
  },
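  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The wrap-around correction can be sketched on a small made-up example. The helper name `delta_ns` is hypothetical and not part of this notebook; it assumes, as stated above, that no transmission takes longer than one second.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "NS_PER_SEC = 1_000_000_000\n",
    "\n",
    "def delta_ns(start, end):\n",
    "    # Return end - start in ns, assuming at most one tv_nsec wrap-around per sample.\n",
    "    start = np.asarray(start, dtype=np.int64)\n",
    "    end = np.array(end, dtype=np.int64)  # copy, so the caller's data is not modified\n",
    "    end[end < start] += NS_PER_SEC       # the later timestamp wrapped past 999999999 ns\n",
    "    return end - start\n",
    "\n",
    "# Second sample: sent at 999999900 ns, received at 50 ns of the next second.\n",
    "print(delta_ns([1000, 999_999_900], [1300, 50]))\n",
    "```\n",
    "\n",
    "Working on a copy keeps the raw timestamps intact, whereas the cell below corrects the loaded arrays in place."
   ]
  },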
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Initialize arrays\n",
    "enq_send_recv_d = []\n",
    "\n",
    "if not settings['compare_tests']:\n",
    "    enq_send_send_wc_d = []\n",
    "\n",
    "#Resolve overflow issues and then calculate deltas\n",
    "for i in range(0, len(subdirs)):\n",
    "    recv[i][recv[i] < enq_send[i]] += 1000000000\n",
    "    enq_send_recv_d.append(recv[i] - enq_send[i])\n",
    "    \n",
    "    if not settings['compare_tests']:\n",
    "        send_wc[i][send_wc[i] < enq_send[i]] += 1000000000\n",
    "        enq_send_send_wc_d.append(send_wc[i] - enq_send[i])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Plotting\n",
    "\n",
    "The data will now be plotted."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Define \"find nearest\" function\n",
    "def find_nearest(array, value):\n",
    "    array = np.asarray(array)\n",
    "    idx = (np.abs(array - value)).argmin()\n",
    "    return array[idx], idx"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "from matplotlib.font_manager import FontProperties\n",
    "\n",
    "x_limit = 10000\n",
    "plots_saved = 0\n",
    "\n",
    "#Start creating plots\n",
    "for i in range(0, len(subdirs)):\n",
    "    datasets = []    \n",
    "\n",
    "    if settings['compare_tests']:\n",
    "        if i % 2 == 1:\n",
    "            continue\n",
    "        \n",
    "        datasets.append(enq_send_recv_d[i])\n",
    "        datasets.append(enq_send_recv_d[i+1])\n",
    "    else:\n",
    "        datasets.append(enq_send_recv_d[i])\n",
    "        datasets.append(enq_send_send_wc_d[i])\n",
    "\n",
    "    medians = []\n",
    "    medians.append(np.median(datasets[0]))\n",
    "    medians.append(np.median(datasets[1]))\n",
    "    \n",
    "    # Determine correction, in case figure needs to be bigger\n",
    "    correction = 0\n",
    "    if abs(medians[1] - medians[0]) < 200:\n",
    "        correction = 0.2\n",
    "\n",
    "    fig = plt.figure(num=None, figsize=(11, 2.7 + correction), dpi=500, facecolor='w', edgecolor='k')\n",
    "\n",
    "    # Add plot and set title\n",
    "    ax = fig.add_subplot(111)\n",
    "\n",
    "    # Set grid\n",
    "    ax.set_axisbelow(True)\n",
    "    ax.grid(True, linestyle='--')\n",
    "\n",
    "    bins = np.arange(0, x_limit+1, 100.0)\n",
    "\n",
    "    # Data in plot\n",
    "    # http://www.color-hex.com/color-palette/33602\n",
    "    ax.hist(datasets[1], label=settings['labels'][1], edgecolor='black', bins=bins, color='#00549f', zorder=1)\n",
    "    ax.axvline(medians[1], color='red', linestyle='-', linewidth=1, zorder=2, alpha=0.85)\n",
    "\n",
    "    ax.hist(datasets[0], label=settings['labels'][0], edgecolor='black', bins=bins, color='#8ebae5', zorder=3, alpha=0.75)\n",
    "    ax.axvline(medians[0], color='red', linestyle='-', linewidth=1, zorder=4)\n",
    "\n",
    "    # Set axis\n",
    "    plt.xlim([0,x_limit])\n",
    "     \n",
    "    # Calculate how many values are larger than the x_limit\n",
    "    errors = []\n",
    "    errors.append((np.size(datasets[0][datasets[0] > x_limit]) / np.size(datasets[0])) * 100)\n",
    "    errors.append((np.size(datasets[1][datasets[1] > x_limit]) / np.size(datasets[1])) * 100)\n",
    " \n",
    "    errors[0] = round(errors[0], 4)\n",
    "    errors[1] = round(errors[1], 4)\n",
    "    \n",
    "    # Set ticks\n",
    "    ticks_unmodified = ticks = np.arange(0, x_limit+1, 1000.0)\n",
    "\n",
    "    nearest = [None] * 2\n",
    "    nearest_idx = [None] * 2\n",
    "    \n",
    "    nearest[0], nearest_idx[0] = find_nearest(ticks, medians[0])\n",
    "    nearest[1], nearest_idx[1] = find_nearest(ticks, medians[1])\n",
    "    \n",
    "    if medians[0] < medians[1]:\n",
    "        ticks = np.append(ticks, medians[0])\n",
    "        ticks = np.append(ticks, medians[1])\n",
    "    else:\n",
    "        ticks = np.append(ticks, medians[1])\n",
    "        ticks = np.append(ticks, medians[0])\n",
    "\n",
    "    # Explicitly set labels\n",
    "    labels = []\n",
    "    \n",
    "    for value in ticks:\n",
    "        if value == nearest[0] and np.abs(nearest[0] - medians[0]) < 200:\n",
    "            labels.append(\"\")\n",
    "        elif value == nearest[1] and np.abs(nearest[1] - medians[1]) < 200:\n",
    "            labels.append(\"\")\n",
    "        else:\n",
    "            labels.append(str(int(value)))\n",
    "\n",
    "    # Set xticks\n",
    "    plt.xticks(ticks, labels, fontsize=10, family='monospace', rotation=30, horizontalalignment='right', rotation_mode=\"anchor\")\n",
    "    \n",
    "    # Color median values red\n",
    "    first_median_is_set = False\n",
    "    \n",
    "    for j, value in enumerate(ax.get_xticklabels()):\n",
    "        try:\n",
    "            if float(value.get_text()) == int(medians[0]) or float(value.get_text()) == int(medians[1]):\n",
    "                value.set_color('red')\n",
    "                \n",
    "                if abs(medians[0] - medians[1]) < 170 and first_median_is_set:\n",
    "                    value.set_y(-0.07)\n",
    "\n",
    "                    nearest, nearest_idx = find_nearest(ticks_unmodified, float(value.get_text()))\n",
    "                    \n",
    "                    if abs(nearest - float(value.get_text())) < 350:\n",
    "                        ax.get_xticklabels()[nearest_idx].set_y(-0.07)\n",
    "                \n",
    "                first_median_is_set = True\n",
    "    \n",
    "        except ValueError:\n",
    "            # We got some empty values. Ignore them\n",
    "            pass\n",
    "        \n",
    "    # Set yticks\n",
    "    plt.yticks(fontsize=10, family='monospace')\n",
    "\n",
    "    #Labels\n",
    "    font_text = FontProperties()\n",
    "    font_text.set_size(9.5)\n",
    "    font_text.set_family('monospace')\n",
    "    \n",
    "    ax.set_xlabel('latencies [ns]', fontsize=10, family='monospace', labelpad = 4 - 2 * correction)\n",
    "    ax.set_ylabel('frequency', fontsize=10, family='monospace', labelpad = 6)\n",
    "\n",
    "    test  = settings['labels'][1] + '$\\mathtt{{> {}\\/ns: }}${: >7.4f}%  (max: {:8} ns)\\n'.format(x_limit, errors[1], max(datasets[1]))\n",
    "    test += settings['labels'][0] + '$\\mathtt{{> {}\\/ns: }}${: >7.4f}%  (max: {:8} ns)'.format(x_limit, errors[0], max(datasets[0]))\n",
    "   \n",
    "    # bbox accepts FancyBboxPatch prop dict\n",
    "    x_position_box = 0.99 if medians[1] < 6000 else 0.38\n",
    "    \n",
    "    ax.text(x_position_box, 0.95, test,\n",
    "            verticalalignment='top', horizontalalignment='right',\n",
    "            transform=ax.transAxes, zorder=5,\n",
    "            color='black', fontproperties = font_text,\n",
    "            bbox={'facecolor':'white', 'alpha':0.85, 'pad':0.30, 'boxstyle':'round',\n",
    "                  'edgecolor':'#dbdbdb'})\n",
    "\n",
    "    # Show plot\n",
    "    plt.yscale('log')\n",
    "    plt.tight_layout()\n",
    "    \n",
    "    # Save plot\n",
    "    fig.savefig('{}/plot_{}.pdf'.format(rootdir, plots_saved), dpi=600, format='pdf')\n",
    "    plots_saved += 1\n",
    "    \n",
    "    if i == 0:\n",
    "        # Create and save legend\n",
    "        import pylab\n",
    "    \n",
    "        # create a second figure for the legend\n",
    "        figLegend = pylab.figure(figsize = settings['dimensions']['legend'])\n",
    "\n",
    "        # produce a legend for the objects in the other figure\n",
    "        pylab.figlegend(*ax.get_legend_handles_labels(), loc = 'upper left',\n",
    "                        prop={'family':'monospace', 'size':'8'}, ncol=2)\n",
    "        figLegend.savefig(\"{}/legend.pdf\".format(rootdir), format='pdf')"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}