Fossil

Check-in [847423ed]

Overview
Comment: Polishing pass on www/collisions.ipynb: improved docs, simplified the "sd" parameter as "spread", removed an empty cell, and renumbered the cells.
SHA3-256: 847423edfaae1dd2d9baa767b95194f12cc700c94b595ba8f3372a73d7b158d7
User & Date: wyoung 2019-06-16 20:16:31
Context
2019-06-17
07:09 The touch command no longer treats missing files as a fatal error, instead emitting a warning message. check-in: 56530e9b user: stephan tags: trunk
2019-06-16
20:16 Polishing pass on www/collisions.ipynb: improved docs, simplified the "sd" parameter as "spread", removed an empty cell, and renumbered the cells. check-in: 847423ed user: wyoung tags: trunk
19:22 Removed the "or --allow-fork" advice output from "fossil checkin" when forcing the checkin would fork the branch. It's good for Fossil to have this option, especially for automated tooling that needs to just bull forward blindly, but it's bad advice to give to interactive users. Let them discover it via --help, if they learn of it at all. check-in: 66d55e9b user: wyoung tags: trunk
Changes

Changes to www/collisions.ipynb.
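
The main functional change in this diff is the parameter rename: the old `sd` argument took the standard deviation directly, while the new `spread` argument is a divisor applied to `workSec` to compute it. The R sketch below is not part of the notebook. The helper names `spread_to_sd` and `outside_fraction` are hypothetical, introduced here only to illustrate how the new default `spread = 4` maps onto the old default `sd = workSec / 4`, and how much of the normal distribution would fall outside the working day for a few `spread` values.

```r
# Illustrative sketch only; these helpers are hypothetical and do not
# appear in www/collisions.ipynb.

# Convert the new "spread" factor into the standard deviation that the
# old "sd" parameter took directly.
spread_to_sd <- function(spread, workSec = 8 * 60 * 60) {
    workSec / spread
}

# Fraction of normally-distributed check-in times expected to land
# outside the interval [0, workSec], with the mean at mid-day.
outside_fraction <- function(spread, workSec = 8 * 60 * 60) {
    mean <- workSec / 2
    sd <- spread_to_sd(spread, workSec)
    pnorm(0, mean, sd) + pnorm(workSec, mean, sd, lower.tail = FALSE)
}

for (spread in c(3, 4, 5)) {
    cat("spread =", spread,
        " sd =", spread_to_sd(spread), "sec",
        " outside working day =",
        round(100 * outside_fraction(spread), 2), "%\n")
}
```

With the default `spread = 4`, the start and end of the working day sit two standard deviations from the mid-day mean, so roughly 4.6% of generated times would land outside it. A smaller `spread` widens the distribution and raises that share; a larger one concentrates check-ins around mid-day.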

Old version (left side of the diff):
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Pull in the packages we need"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
................................................................................
    "# Parameters\n",
    "\n",
    "* **cpd** - Checkins per day. Defaults to 20.\n",
    "\n",
    "* **days** - Number of days to simulate.  Defaults to 10000, or roughly\n",
    "    40 working years.\n",
    "\n",
    "* **winSz** - Size of the commit window in percentage of the working day.\n",
    "    Defaults to 0.01% with the default for workSec, which is roughly 3\n",
    "    seconds with the default `workSec` value, a long-ish commit time for Fossil.\n",
    "\n",
    "* **workSec** - Seconds in working day, defaulting to 8 hours.  This value\n",
    "    only affects the reporting output, not the results of the underlying\n",
    "    simulation.  It's a scaling parameter to humanize the results.\n",
    "    \n",
    "* **sd** - The standard deviation to use for our normally-distributed random\n",
    "    numbers. The default gives \"nice\" distributions, spread over the working\n",
    "    day with a low chance of generating values outside the working day.\n",
    "    \n",
    "    The larger this value gets, the more spread out the checkins, and so the\n",
    "    lower the chance of collisions.  You might want to increase it a bit to\n",
    "    simulate early and late workers.  (e.g. `workSec / 3`)\n",
    "    \n",
    "    As you decrease this value — e.g. `workSec / 5` — you're simulating a work\n",
    "    environment where people tend to check their work in closer and closer\n",
    "    to mid-day, which increases the chance of collision.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "collisions <- function(\n",
    "        cpd = 20,\n",
    "        days = 10000,\n",
    "        winSz = 0.01 / 100,\n",
    "        workSec = 8 * 60 * 60,\n",
    "        sd = workSec / 4)\n",
    "{\n",
    "    cat(\"Running simulation...\\n\")\n",
    "\n",
    "    day = 0\n",
    "    collisions = 0\n",
    "    winSec = workSec * winSz\n",
    "    mean = workSec / 2\n",
    "\n",
    "    while (day < days) {\n",
    "        # Create the commit time vector as random values in a normal\n",
    "        # distribution.\n",
    "        times = sort(rnorm(cpd, mean, sd))\n",
    "\n",
    "        # Are there any pairs in the time vector that are within the\n",
................................................................................
    "    }\n",
    "    \n",
    "    cat(\"Found\", collisions, \"collisions in\", days, (workSec / 3600),\n",
    "        \"hour working days with\", winSec, \"second windows.\\n\")\n",
    "}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Running simulation...\n",
      "Found 423 collisions in 10000 8 hour working days with 2.88 second windows.\n"
     ]
    }
   ],
   "source": [
    "collisions()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "R",
   "language": "R",
   "name": "ir"

New version (right side of the diff):
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*Pull in the packages we need:*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
................................................................................
    "# Parameters\n",
    "\n",
    "* **cpd** - Checkins per day. Defaults to 20.\n",
    "\n",
    "* **days** - Number of days to simulate.  Defaults to 10000, or roughly\n",
    "    40 working years.\n",
    "\n",
    "* **winSz** - Size of the commit window as a fraction of `workSec`. Defaults\n",
    "    to 0.01%, which is roughly 3 seconds for the default 8-hour work day, a\n",
    "    a long-ish commit time for Fossil.\n",
    "\n",
    "* **workSec** - Seconds in working day, defaulting to 8 hours.  This value\n",
    "    only affects the reporting output, not the results of the underlying\n",
    "    simulation.  It's a scaling parameter to humanize the results.\n",
    "    \n",
    "* **spread** - Adjustment factor in computing the standard deviation for our \n",
    "    normally-distributed random numbers. The default gives \"nice\" distributions,\n",
    "    spread over the working day with a low chance of generating values outside\n",
    "    the working day.\n",
    "    \n",
    "    The smaller this value gets (&lt;4), the more spread out the checkins, and\n",
    "    so the lower the chance of collisions.  You might want to decrease it a bit\n",
    "    to simulate early and late workers.\n",
    "    \n",
    "    As you increase this value (&gt;4) you're simulating a work environment\n",
    "    where people tend to check their work in closer and closer to mid-day,\n",
    "    which increases the chance of collision."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "collisions <- function(\n",
    "        cpd = 20,\n",
    "        days = 10000,\n",
    "        winSz = 0.01 / 100,\n",
    "        workSec = 8 * 60 * 60,\n",
    "        spread = 4)\n",
    "{\n",
    "    cat(\"Running simulation...\\n\")\n",
    "\n",
    "    day = 0\n",
    "    collisions = 0\n",
    "    winSec = workSec * winSz\n",
    "    mean = workSec / 2\n",
    "    sd = workSec / spread\n",
    "\n",
    "    while (day < days) {\n",
    "        # Create the commit time vector as random values in a normal\n",
    "        # distribution.\n",
    "        times = sort(rnorm(cpd, mean, sd))\n",
    "\n",
    "        # Are there any pairs in the time vector that are within the\n",
................................................................................
    "    }\n",
    "    \n",
    "    cat(\"Found\", collisions, \"collisions in\", days, (workSec / 3600),\n",
    "        \"hour working days with\", winSec, \"second windows.\\n\")\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*Run the following cell, possibly changing parameters documented above:*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Running simulation...\n",
      "Found 422 collisions in 10000 8 hour working days with 2.88 second windows.\n"
     ]
    }
   ],
   "source": [
    "collisions()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "R",
   "language": "R",
   "name": "ir"