
Commit

differences for PR #20
actions-user committed Mar 5, 2024
1 parent 5e04984 commit 7abdb76
Showing 11 changed files with 5,799 additions and 28 deletions.
5,745 changes: 5,745 additions & 0 deletions fig/stack.ai

Large diffs are not rendered by default.

Binary file added fig/stack.png
Binary file added files/schelling_out.prof
Binary file not shown.
14 changes: 7 additions & 7 deletions md5sum.txt
Original file line number Diff line number Diff line change
@@ -4,17 +4,17 @@
"config.yaml" "b413b2dfbce4f70e178cae4d6d2d6311" "site/built/config.yaml" "2024-02-08"
"index.md" "3a6d3683998a6b866c134a818f1bb46e" "site/built/index.md" "2024-02-15"
"links.md" "8184cf4149eafbf03ce8da8ff0778c14" "site/built/links.md" "2024-01-03"
"episodes/profiling-introduction.md" "a0163cbc57865b4fad063468ac4c0a41" "site/built/profiling-introduction.md" "2024-02-08"
"episodes/profiling-functions.md" "4ea67773010619ae5fbaa2dc69ecc4f6" "site/built/profiling-functions.md" "2024-02-08"
"episodes/profiling-lines.md" "8bd8cf015fcc38cdb004edf5fad75a65" "site/built/profiling-lines.md" "2024-02-08"
"episodes/profiling-introduction.md" "7dae558b7851344dcb1746141b6fdf0a" "site/built/profiling-introduction.md" "2024-03-05"
"episodes/profiling-functions.md" "494cc5948b1e6e5d0b8c3403cb505517" "site/built/profiling-functions.md" "2024-03-05"
"episodes/profiling-lines.md" "0717d028251a5ed792dde97e4f4abd43" "site/built/profiling-lines.md" "2024-03-05"
"episodes/profiling-conclusion.md" "340969a321636eb94fff540191a511e7" "site/built/profiling-conclusion.md" "2024-01-29"
"episodes/optimisation-introduction.md" "aff88de80645a433161ad48231f6fa7f" "site/built/optimisation-introduction.md" "2024-02-15"
"episodes/optimisation-data-structures-algorithms.md" "75dbff01d990fa1e99beec4b24b2b0ad" "site/built/optimisation-data-structures-algorithms.md" "2024-02-08"
"episodes/optimisation-data-structures-algorithms.md" "764571551ef896f596585ac0a2e0c62f" "site/built/optimisation-data-structures-algorithms.md" "2024-03-05"
"episodes/optimisation-minimise-python.md" "12d5c57fb3c31439d39c0d4997bdd323" "site/built/optimisation-minimise-python.md" "2024-02-15"
"episodes/optimisation-use-latest.md" "829f7a813b0a9a131fa22e6dbb534cf7" "site/built/optimisation-use-latest.md" "2024-02-08"
"episodes/optimisation-memory.md" "52c4b2884410050c9646cf987d2aa50e" "site/built/optimisation-memory.md" "2024-02-08"
"episodes/optimisation-conclusion.md" "1d608c565c199cea5e00dc5209f3da1b" "site/built/optimisation-conclusion.md" "2024-02-15"
"episodes/optimisation-memory.md" "69eb84dfc419083ff12856a80750a618" "site/built/optimisation-memory.md" "2024-03-05"
"episodes/optimisation-conclusion.md" "ccd780c447f0b0ce97b8da1b2572b9c1" "site/built/optimisation-conclusion.md" "2024-03-05"
"instructors/instructor-notes.md" "cae72b6712578d74a49fea7513099f8c" "site/built/instructor-notes.md" "2024-01-03"
"learners/setup.md" "50d49ff7eb0ea2d12d75773ce1decd45" "site/built/setup.md" "2024-01-29"
"learners/setup.md" "57c429eb0ded96a76813366158678bfb" "site/built/setup.md" "2024-03-05"
"learners/acknowledgements.md" "c4064263d442f147d3796cb3dfa7b351" "site/built/acknowledgements.md" "2024-02-08"
"profiles/learner-profiles.md" "60b93493cf1da06dfd63255d73854461" "site/built/learner-profiles.md" "2024-01-03"
2 changes: 1 addition & 1 deletion optimisation-conclusion.md
@@ -29,7 +29,7 @@ This course's website can be used as a reference manual when profiling your own

::::::::::::::::::::::::::::::::::::: keypoints

Data Structures & Algorithms
- Data Structures & Algorithms
- List comprehension should be preferred when constructing lists.
- Where appropriate, Tuples and Generator functions should be preferred over Python lists.
- Dictionaries and sets are appropriate for storing a collection of unique data with no intrinsic order for random access.
2 changes: 1 addition & 1 deletion optimisation-data-structures-algorithms.md
@@ -224,7 +224,7 @@ If that index doesn't already contain another key, the key (and any associated v
When the index isn't free, a collision strategy is applied. CPython's [dictionary](https://github.com/python/cpython/blob/main/Objects/dictobject.c) and [set](https://github.com/python/cpython/blob/main/Objects/setobject.c) both use a form of open addressing whereby a hash is mutated and corresponding indices probed until a free one is located.
When the hashing data structure exceeds a given load factor (e.g. 2/3 of indices have been assigned keys), the internal storage must grow. This process requires every item to be re-inserted which can be expensive, but reduces the average probes for a key to be found.

![A visual explanation of linear probing; CPython uses an advanced form of this.](episodes/fig/hash_linear_probing.png){alt='A diagram demonstrating how the keys (hashes) 37, 64, 14, 94, 67 are inserted into a hash table with 11 indices. This is followed by the insertion of 59, 80 and 39 which require linear probing to be inserted due to collisions.'}
![A visual explanation of linear probing; CPython uses an advanced form of this.](episodes/fig/hash_linear_probing.png){alt="A diagram demonstrating how the keys (hashes) 37, 64, 14, 94, 67 are inserted into a hash table with 11 indices. This is followed by the insertion of 59, 80 and 39 which require linear probing to be inserted due to collisions."}

To retrieve or check for the existence of a key within a hashing data structure, the key is hashed again and a process equivalent to insertion is repeated. However, now the key at each index is checked for equality with the one provided. If an empty index is found before an equivalent key, then the key must not be present in the data structure.
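The insertion and lookup process described above can be sketched as a toy hash table with plain linear probing. This is a deliberate simplification: CPython's real open addressing perturbs the hash rather than probing strictly linearly, and this toy table never grows, so all names here are illustrative.

```python
# A toy open-addressing hash table using plain linear probing.
# CPython's dict/set use a more advanced probing scheme.
class ToyHashTable:
    def __init__(self, size=11):
        self.slots = [None] * size  # None marks a free index

    def insert(self, key):
        index = hash(key) % len(self.slots)
        # Probe linearly until a free slot (or the same key) is found.
        # Note: this sketch assumes the table never completely fills.
        while self.slots[index] is not None and self.slots[index] != key:
            index = (index + 1) % len(self.slots)
        self.slots[index] = key

    def contains(self, key):
        index = hash(key) % len(self.slots)
        # An empty slot found before a matching key means the key is absent.
        while self.slots[index] is not None:
            if self.slots[index] == key:
                return True
            index = (index + 1) % len(self.slots)
        return False

table = ToyHashTable()
for k in [37, 64, 14, 94, 67, 59, 80, 39]:  # the keys from the figure above
    table.insert(k)
print(table.contains(59), table.contains(99))  # → True False
```

Inserting 59 collides with 37 at index 4 and probes to index 5, mirroring the linear probing shown in the figure.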

4 changes: 2 additions & 2 deletions optimisation-memory.md
@@ -29,7 +29,7 @@ Modern computers typically have a single processor (CPU), within this processor
Data held in memory by running software exists in RAM; this memory is faster to access than hard drives (and solid-state drives).
But the CPU has much smaller on-board caches, to make accessing the most recently used variables even faster.

![An annotated photo of a computer's hardware.](episodes/fig/annotated-motherboard.jpg){alt='An annotated photo of inside a desktop computer's case. The CPU, RAM, power supply, graphics cards (GPUs) and harddrive are labelled.'}
![An annotated photo of a computer's hardware.](episodes/fig/annotated-motherboard.jpg){alt="An annotated photo of inside a desktop computer's case. The CPU, RAM, power supply, graphics cards (GPUs) and harddrive are labelled."}

<!-- Read/operate on variable ram->cpu cache->registers->cpu -->
When reading a variable, to perform an operation with it, the CPU will first look in its registers. These exist per core; they are where computation is actually performed. Accessing them is incredibly fast, but there is only enough storage for around 32 variables (a typical number, each e.g. 4 bytes).
@@ -160,7 +160,7 @@ An even greater overhead would apply.

Latency can have a big impact on the speed at which a program executes, as the below graph demonstrates. Note the log scale!

![A graph demonstrating the wide variety of latencies a programmer may experience when accessing data.](episodes/fig/latency.png){alt='A horizontal bar chart displaying the relative latencies for L1/L2/L3 cache, RAM, SSD, HDD and a packet being sent from London to California and back. These latencies range from 1 nanosecond to 140 milliseconds and are displayed with a log scale.'}
![A graph demonstrating the wide variety of latencies a programmer may experience when accessing data.](episodes/fig/latency.png){alt="A horizontal bar chart displaying the relative latencies for L1/L2/L3 cache, RAM, SSD, HDD and a packet being sent from London to California and back. These latencies range from 1 nanosecond to 140 milliseconds and are displayed with a log scale."}

Typically, the lower the latency, the higher the effective bandwidth: L1 and L2 cache reach ~1 TB/s, RAM ~100 GB/s, SSDs up to 32 GB/s and HDDs up to 150 MB/s, making large memory transactions on the slower devices even more costly.
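Those bandwidth figures translate into a back-of-envelope sense of transfer cost; the sketch below simply divides 1 GB by each rate quoted above (the figures are approximate upper bounds, not measurements).

```python
# Back-of-envelope transfer times for 1 GB of data at the effective
# bandwidths quoted above.
GB = 1e9  # bytes
bandwidth_bytes_per_s = {
    "L1/L2 cache": 1e12,  # ~1 TB/s
    "RAM": 100e9,         # ~100 GB/s
    "SSD": 32e9,          # up to 32 GB/s
    "HDD": 150e6,         # up to 150 MB/s
}
for name, bps in bandwidth_bytes_per_s.items():
    print(f"{name}: {GB / bps * 1e3:.2f} ms per GB")
```

At these rates, copying 1 GB takes roughly 1 ms from cache, 10 ms from RAM, but several seconds from a hard drive.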

52 changes: 39 additions & 13 deletions profiling-functions.md
@@ -42,10 +42,12 @@ In this episode we will cover the usage of the function-level profiler `cProfile

The call stack keeps track of the active hierarchy of function calls and their associated variables.

As a stack it is last-in first-out (LIFO) data structure.
As a stack it is a last-in first-out (LIFO) data structure.

![A diagram of a call stack](fig/stack.png){alt="A greyscale diagram showing a (call) stack, containing 5 stack frames. Two additional stack frames are shown outside the stack, one is marked as entering the call stack with an arrow labelled push and the other is marked as exiting the call stack labelled pop."}

When a function is called, a frame to track its variables and metadata is pushed to the call stack.
When that same function finishes and returns, it is popped from the stack and variables local the function are dropped.
When that same function finishes and returns, it is popped from the stack and variables local to the function are dropped.
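The push/pop behaviour can be mimicked with a plain Python list; the frame names below are illustrative.

```python
# Modelling the last-in first-out behaviour of a call stack with a list.
call_stack = []
call_stack.append("<module>")  # push: execution starts at global scope
call_stack.append("a()")       # push: global scope calls a()
call_stack.append("b2()")      # push: a() calls b2()
print(call_stack[-1])          # the active frame is the most recent push: b2()
call_stack.pop()               # pop: b2() returns, its local variables are dropped
print(call_stack)              # → ['<module>', 'a()']
```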

If you've ever seen a stack overflow error, this refers to the call stack becoming too large.
These are typically caused by recursive algorithms, whereby a function calls itself, without exiting early enough.
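In CPython a true stack overflow is usually pre-empted by the interpreter's recursion limit; this sketch shows the resulting `RecursionError`.

```python
# Runaway recursion exhausting the call stack. CPython guards the stack
# with a recursion limit and raises RecursionError when it is exceeded.
import sys

def recurse(depth=0):
    # No base case: a new frame is pushed for every call.
    return recurse(depth + 1)

print(sys.getrecursionlimit())  # the default frame limit (commonly 1000)
try:
    recurse()
except RecursionError as err:
    print("RecursionError:", err)
```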
@@ -71,7 +73,9 @@ def c():
a()
```

Prints the following call stack:
Here we can see that the printing of the stack trace is called in `c()`, which is called by `b2()`, which is called by `a()`, which is called from global scope.

Hence, this prints the following call stack:

```output
File "C:\call_stack.py", line 13, in <module>
@@ -84,7 +88,11 @@ Prints the following call stack:
traceback.print_stack()
```

In this instance the base of the stack is printed first, other visualisations of call stacks may use the reverse ordering.
The first line states the file and line number where `a()` was called from (the last line of code in the file shown). The second line states that it was the function `a()` that was called; this could include its arguments. The third line then repeats this pattern, stating the line number where `b2()` was called inside `a()`. This continues until the call to `traceback.print_stack()` is reached.

You may see stack traces like this when an unhandled exception is thrown by your code.

*In this instance the base of the stack has been printed first; other visualisations of call stacks may use the reverse ordering.*

:::::::::::::::::::::::::::::::::::::::::::::

@@ -152,6 +160,21 @@ This output can often exceed the terminal's buffer length for large programs and

## snakeviz

:::::::::::::::::::::::::::::::::: instructor

It can help to demonstrate these examples by running `snakeviz` live.
For the worked example you may wish to also show the code (e.g. in split screen).

Demonstrate features such as moving up/down the call-stack by clicking the boxes and changing the depth and cutoff via the dropdown.

Download pre-generated profile reports:

* snakeviz example screenshot: <a href="files/schelling_out.prof" download>files/schelling_out.prof</a>

* Worked example: <a href="files/snakeviz-worked-example/out.prof" download>files/snakeviz-worked-example/out.prof</a>

:::::::::::::::::::::::::::::::::::::::::::::

<!-- what is snakeviz/how is it installed-->
[`snakeviz`](https://jiffyclub.github.io/snakeviz/) is a web browser based graphical viewer for `cProfile` output files.
<!--TODO is covering pip here redundant as it's covered in the user setup file? -->
@@ -166,6 +189,7 @@ Once installed, you can visualise a `cProfile` output file such as `out.prof` vi
```sh
python -m snakeviz out.prof
```

This should open your web browser displaying a page similar to that below.

![An example of the default 'icicle' visualisation provided by `snakeviz`.](episodes/fig/snakeviz-home.png){alt='A web page, with a central diagram representing a call-stack, with the root at the top and the horizontal axis representing the duration of each call. Below this diagram is the top of a table detailing the statistics of individual methods.'}
@@ -185,6 +209,12 @@ As you hover each box, information to the left of the diagram updates specifying

## Worked Example

:::::::::::::::::::::::::::::::::: instructor

Demonstrate this!

:::::::::::::::::::::::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::: callout

## Follow Along
@@ -196,12 +226,6 @@ python -m cProfile -o out.prof example.py
python -m snakeviz out.prof
```

:::::::::::::::::::::::::::::::::: instructor

It can help to run the worked example by executing `snakeviz` live and explaining the visualisation with the code visible in split screen.

:::::::::::::::::::::::::::::::::::::::::::::

:::::::::::::::::::::::::::::::::::::::::::::

To more clearly demonstrate how an execution hierarchy maps to the icicle diagram, the below toy example Python script has been implemented.
@@ -282,7 +306,7 @@ This provides the same information as "Icicle", however the rows are instead cir
The sunburst visualisation displays less text on the boxes, so it can be harder to interpret. However, it increases the visibility of boxes further from the root call.

<!-- TODO: Alt text here is redundant? -->
![A sunburst visualisation provided by `snakeviz` for the worked example's Python code.](episodes/fig/snakeviz-worked-example-sunburst.png){alt='The snakeviz sunburst visualisation for the worked example Python code.'}
![A sunburst visualisation provided by `snakeviz` for the worked example's Python code.](episodes/fig/snakeviz-worked-example-sunburst.png){alt="The snakeviz sunburst visualisation for the worked example Python code."}

:::::::::::::::::::::::::::::::::::::::::::::

@@ -328,7 +352,9 @@ Other boxes within the diagram correspond to the initialisation of imports, or i

## Exercise 2: Predator Prey

Download and profile <a href="files/pred-prey/predprey.py" download>the Python predator prey model</a>, try to locate the function call(s) where the majority of execution time is being spent.
Download and profile <a href="files/pred-prey/predprey.py" download>the Python predator prey model</a>, try to locate the function call(s) where the majority of execution time is being spent.

*This exercise uses the package `numpy`, it can be installed via `pip install numpy`.*

> The predator prey model is a simple agent-based model of population dynamics. Predators and prey co-exist in a common environment and compete over finite resources.
>
@@ -352,7 +378,7 @@ If the table is ordered by `ncalls`, it can be identified as the joint 4th most

If you checked `predprey_out.png` (shown below), you should notice that there are significantly more `Grass` agents than `Predators` or `Prey`.

![`predprey_out.png` as produced by the default configuration of `predprey.py`.](episodes/fig/predprey_out.png){alt='A line graph plotting population over time through 250 steps of the pred prey model. Grass/20, shown in green, has a brief dip in the first 30 steps, but recovers holding steady at approximately 240 (4800 agents). Prey, shown in blue, starts at 200, quickly drops to around 185, before levelling off for steps and then slowly declining to a final value of 50. The data for predators, shown in red, has significantly more noise. There are 50 predators to begin, this rises briefly before falling to around 10, from here it noisily grows to around 70 by step 250 with several larger declines during the growth.'}
![`predprey_out.png` as produced by the default configuration of `predprey.py`.](episodes/fig/predprey_out.png){alt="A line graph plotting population over time through 250 steps of the pred prey model. Grass/20, shown in green, has a brief dip in the first 30 steps, but recovers holding steady at approximately 240 (4800 agents). Prey, shown in blue, starts at 200, quickly drops to around 185, before levelling off for steps and then slowly declining to a final value of 50. The data for predators, shown in red, has significantly more noise. There are 50 predators to begin, this rises briefly before falling to around 10, from here it noisily grows to around 70 by step 250 with several larger declines during the growth."}

Similarly, `Grass::eaten()` has a `percall` time in line with other agent functions such as `Prey::flock()` (from `predprey.py:67`).
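The `ncalls` ordering used above can also be reproduced outside `snakeviz` with the standard library's `pstats` module. The sketch below profiles a trivial statement so it is self-contained; `out.prof` is a throwaway file name, and for the exercise you would instead load the profile generated from `predprey.py`.

```python
# Sorting a cProfile output file by call count with pstats.
# For the exercise you would generate the profile with e.g.:
#   python -m cProfile -o out.prof predprey.py
import cProfile
import pstats

cProfile.run("sum(i * i for i in range(1000))", "out.prof")  # throwaway profile

stats = pstats.Stats("out.prof")
stats.sort_stats("ncalls").print_stats(5)  # print the 5 most-called entries
```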

2 changes: 1 addition & 1 deletion profiling-introduction.md
@@ -175,7 +175,7 @@ By highlighting individual function calls, patterns relating to how performance
[`viztracer`](https://viztracer.readthedocs.io/en/latest/) is an example of a timeline profiler for Python, however we won't be demonstrating timeline profiling on this course.


![An example timeline visualisation provided by `viztracer`/`vizviewer`.](episodes/fig/viztracer-example.png){alt='A viztracer timeline of the execution of the Pred-Prey exercise from later in the course. There is a shallow repeating pattern on the left side which corresponds to model steps, the right side instead has a range of 'icicles' which correspond to the deep call hierarchies of matplotlib generating a graph.'}
![An example timeline visualisation provided by `viztracer`/`vizviewer`.](fig/viztracer-example.png){alt="A viztracer timeline of the execution of the Pred-Prey exercise from later in the course. There is a shallow repeating pattern on the left side which corresponds to model steps, the right side instead has a range of 'icicles' which correspond to the deep call hierarchies of matplotlib generating a graph."}

### Hardware Metric Profiling

2 changes: 1 addition & 1 deletion profiling-lines.md
@@ -238,7 +238,7 @@ Therefore it can be seen in this example, how the time spent executing each line

The `-r` argument passed to `kernprof` (or `line_profiler`) enables rich output, if you run the profile locally it should look similar to this. *This requires the optional package `rich`, it will have been installed if `[all]` was specified when installing `line_profiler` with `pip`.*

![Rich (highlighted) console output provided by `line_profiler` for the above FizzBuzz profile code.](episodes/fig/line_profiler-worked-example.png){alt='A screenshot of the `line_profiler` output from the previous code block, where the code within the line contents column has basic highlighting.'}
![Rich (highlighted) console output provided by `line_profiler` for the above FizzBuzz profile code.](episodes/fig/line_profiler-worked-example.png){alt="A screenshot of the `line_profiler` output from the previous code block, where the code within the line contents column has basic highlighting."}

:::::::::::::::::::::::::::::::::::::::::::::

4 changes: 2 additions & 2 deletions setup.md
@@ -22,10 +22,10 @@ This course uses Python and was developed using Python 3.11, therefore it is rec

<!-- Todo suggest using a venv?-->

The non-core Python packages required by the course are `pytest`, `snakeviz` and `line_profiler` which can be installed via `pip`.
The non-core Python packages required by the course are `pytest`, `snakeviz`, `line_profiler` and `numpy` which can be installed via `pip`.

```sh
pip install pytest snakeviz line_profiler[all]
pip install pytest snakeviz line_profiler[all] numpy
```

:::::::::::::::::::::::::::::::::::::::::::::::::::
