# Lecture 6 – Packaging and Shipping Code

### Overview

Getting code to work on your machine is step one. Getting it to run reliably on any machine — your colleague's laptop, a CI server, a cloud VM, a customer's environment — is the real challenge. This gap between "works on my machine" and "works everywhere" is what packaging and shipping code is all about.

This lecture covers the full pipeline from source code to deployed software: how dependencies are declared and isolated, how artifacts are built and versioned, how containers eliminate environmental drift, and how packages are published to the world. Whether you're shipping a Python library, a CLI tool, or a multi-service web application, the same underlying patterns apply.

#### Key Takeaways

* A **dependency** is any library your code needs that isn't part of your language's standard library. Every dependency introduces a version constraint — and potentially a conflict.
* **Virtual environments** isolate each project's dependencies from the system and from each other. Use one per project, every time.
* **`uv`** is the modern, significantly faster replacement for `pip`. Prefer it whenever possible.
* An **artifact** is the packaged, distributable output of your source code. For Python: wheels (`.whl`). For Rust: compiled binaries. For containers: Docker images.
* **`pyproject.toml`** is the modern, canonical Python project manifest — prefer it over legacy `setup.py` or bare `requirements.txt`.
* **Semantic Versioning** (MAJOR.MINOR.PATCH) is a contract with your users: PATCH = safe upgrade, MINOR = safe upgrade with new features, MAJOR = may break.
* **Lock files** pin the exact resolved version of every transitive dependency for reproducible installs. Commit them for applications; don't commit them for libraries.
* **Containers** (Docker) package an application with its entire filesystem, eliminating environmental drift. They are lighter than full VMs because they share the host kernel.
* **Docker Compose** orchestrates multi-container applications — your app, database, cache, and message queue all declared in a single YAML file.
* **Configuration belongs outside code** — use environment variables or config files. Secrets must never be committed to version control.

***

### Core Concepts

#### Dependencies and the Dependency Graph

Modern software is built on layers. Your code uses `requests`. `requests` uses `urllib3`, `certifi`, `charset-normalizer`, and `idna`. Those packages can have their own dependencies. This is the **dependency graph**: a directed graph (not a tree, because two packages can share the same dependency) of "X needs Y at version Z" relationships.

When you run `pip install requests`, the package manager must:

1. Find `requests` on PyPI
2. Read its dependency declarations
3. Find compatible versions of all *transitive* dependencies
4. Download and install everything in the right order

```bash
pip install requests
# Resolving: requests → urllib3, certifi, charset-normalizer, idna
# Downloading and installing all five packages
```
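The "right order" in step 4 is a topological ordering of the dependency graph. A minimal sketch, assuming a hand-written and simplified graph (the `DEPENDENCY_GRAPH` dictionary and `install_order` helper are illustrative, and real package metadata also carries version constraints that are omitted here):

```python
from graphlib import TopologicalSorter

# Hypothetical, simplified graph: package -> packages it depends on.
DEPENDENCY_GRAPH = {
    "requests": {"urllib3", "certifi", "charset-normalizer", "idna"},
    "urllib3": set(),
    "certifi": set(),
    "charset-normalizer": set(),
    "idna": set(),
}


def install_order(graph: dict[str, set[str]]) -> list[str]:
    """Return an order in which every package comes after its dependencies."""
    return list(TopologicalSorter(graph).static_order())


order = install_order(DEPENDENCY_GRAPH)
print(order)  # the four dependencies in some order, then "requests" last
```

`TopologicalSorter` raises `CycleError` if the graph contains a cycle, which is one reason circular dependencies are a problem for installers.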

**Dependency hell** happens when two packages in your project require mutually incompatible versions of the same dependency. For example:

```
tensorflow==2.3.0  requires  numpy>=1.16.0,<1.19.0
your-other-package requires  numpy>=1.19.0
→ No valid numpy version exists that satisfies both
```

The standard solution: **isolation**. Give each project its own environment with its own independent set of installed packages.

***

#### Virtual Environments

A virtual environment is a lightweight, self-contained directory with its own `python` executable (a link to, or thin copy of, a base interpreter) and its own isolated package installation directory. By default it does not inherit packages from the global installation.

```bash
# Create a virtual environment
python -m venv venv

# Activate it (modifies PATH so 'python' and 'pip' use the venv)
source venv/bin/activate          # Linux/macOS
venv\Scripts\activate             # Windows

# Verify isolation
which python     # → /home/you/project/venv/bin/python
pip list         # → only pip itself (clean slate)

# Install project dependencies (isolated to this venv)
pip install requests flask

# Deactivate when done
deactivate
```

> **Never modify the system Python.** Many operating systems depend on their bundled Python for internal tools. Always use a virtual environment for project work.

**`uv`** — the modern, fast package manager:

`uv` is a Rust-based reimplementation of `pip` and `venv` that its authors benchmark at 10–100× faster than `pip`. Its `uv pip` subcommands mirror most of pip's command-line interface:

```bash
# Install uv
pip install uv

# Create a venv and activate
uv venv
source .venv/bin/activate

# Install packages (same syntax as pip, much faster)
uv pip install requests

# Create a venv with a specific Python version
uv venv --python 3.12 venv312
```

With a warm cache, `uv` can resolve and install a handful of packages in tens of milliseconds where `pip` typically takes seconds; a cold install is still bounded by download time. Use it by default for new projects.

***

#### Artifacts and Packaging

**Source code** is what developers read and write. An **artifact** is the packaged, distributable output ready for installation or deployment without requiring a development setup.

| Artifact type       | Example                              | Install command           |
| ------------------- | ------------------------------------ | ------------------------- |
| Python wheel        | `requests-2.32.3-py3-none-any.whl`   | `pip install package.whl` |
| Source distribution | `requests-2.32.3.tar.gz`             | Build + install           |
| Compiled binary     | `ripgrep-14.1.0-x86_64-linux.tar.gz` | Extract and use           |
| Container image     | `python:3.12-slim`                   | `docker run`              |
| Debian package      | `nginx_1.24.0_amd64.deb`             | `apt install`             |

**Python wheels** (`.whl`) are zip archives with a specific structure. The filename encodes platform compatibility:

```
requests-2.32.3-py3-none-any.whl
│               │   │    └── platform tag: any OS, any architecture
│               │   └─────── ABI tag: none (no compiled-extension ABI required)
│               └─────────── Python tag: any Python 3
└─────────────────────────── package name and version

numpy-2.2.1-cp312-cp312-macosx_14_0_arm64.whl
            │     │     └── platform tag: macOS 14+ on Apple Silicon (arm64)
            │     └──────── ABI tag: CPython 3.12 ABI
            └────────────── Python tag: CPython 3.12 only
```

A pure-Python wheel (like `requests`) works everywhere. A compiled wheel (like `numpy`, which includes C extensions) is platform-specific.
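The three trailing fields of a wheel filename are the Python tag, the ABI tag, and the platform tag. A minimal sketch of splitting a filename into those parts, assuming the standard `name-version[-build]-python-abi-platform.whl` layout (the `WheelTags` type and both helper functions are illustrative names, not part of any packaging library):

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    name: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename into its name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # Layout: name-version[-build]-python-abi-platform (build tag is optional).
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename!r}")
    name, version = parts[0], parts[1]
    python_tag, abi_tag, platform_tag = parts[-3:]
    return WheelTags(name, version, python_tag, abi_tag, platform_tag)


def is_pure_python(tags: WheelTags) -> bool:
    """ABI 'none' plus platform 'any' means the wheel has no compiled code."""
    return tags.abi_tag == "none" and tags.platform_tag == "any"


pure = parse_wheel_filename("requests-2.32.3-py3-none-any.whl")
compiled = parse_wheel_filename("numpy-2.2.1-cp312-cp312-macosx_14_0_arm64.whl")
print(pure, is_pure_python(pure))
print(compiled, is_pure_python(compiled))
```

Splitting on `-` is safe here because hyphens in a project name are replaced by underscores when the wheel filename is generated.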

***

#### `pyproject.toml` — The Project Manifest

`pyproject.toml` is the modern, canonical way to declare a Python project's metadata, dependencies, and build configuration. It replaced the older `setup.py` and bare `requirements.txt` approaches.

```toml
[project]
name = "greeting"
version = "0.1.0"
description = "A simple greeting library"
requires-python = ">=3.11"
dependencies = [
    "requests==2.32.3",       # exact pin
    "click>=8.0",             # minimum version
    "numpy>=1.24,<2.0",       # bounded range
    "pandas~=2.1.0",          # compatible release (>=2.1.0,<2.2.0)
]

# Entry point: creates a 'greet' CLI command after installation
[project.scripts]
greet = "greeting:main"

# Build backend (setuptools, hatch, flit — your choice)
[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"
```

To build and distribute:

```bash
uv build
# → dist/greeting-0.1.0.tar.gz          (source distribution)
# → dist/greeting-0.1.0-py3-none-any.whl  (wheel)

# Anyone can now install it
uv pip install ./dist/greeting-0.1.0-py3-none-any.whl
greet Alice
# → Hello, Alice!
```

***

#### Semantic Versioning

Semantic Versioning (SemVer) is the dominant convention for communicating compatibility across software releases: **MAJOR.MINOR.PATCH**.

| Part      | Example         | Contract                                             |
| --------- | --------------- | ---------------------------------------------------- |
| **PATCH** | `1.2.3 → 1.2.4` | Bug fixes only. Safe to upgrade.                     |
| **MINOR** | `1.2.3 → 1.3.0` | New features, backwards-compatible. Safe to upgrade. |
| **MAJOR** | `1.2.3 → 2.0.0` | Breaking changes. Review before upgrading.           |

This is a social contract between library maintainers and users, not a technical enforcement. Users who specify `requests>=2.0,<3.0` are trusting that any `2.x` release is backwards-compatible.
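The table above can be read as a small classification rule. A minimal sketch, assuming plain three-part numeric versions with no pre-release or build suffixes (`parse_version` and `bump_kind` are illustrative helpers):

```python
def parse_version(text: str) -> tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def bump_kind(old: str, new: str) -> str:
    """Classify an upgrade by the most significant part that changed."""
    old_v, new_v = parse_version(old), parse_version(new)
    if new_v <= old_v:
        return "not an upgrade"
    if new_v[0] != old_v[0]:
        return "major"  # may break: review before upgrading
    if new_v[1] != old_v[1]:
        return "minor"  # new features, backwards-compatible
    return "patch"      # bug fixes only


print(bump_kind("1.2.3", "1.2.4"))  # patch
print(bump_kind("1.2.3", "1.3.0"))  # minor
print(bump_kind("1.2.3", "2.0.0"))  # major
```

The function only reports what the version number promises; whether a release actually keeps that promise is up to the maintainers.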

**Version specifiers in `pyproject.toml`:**

```toml
dependencies = [
    "requests==2.32.3",    # Exact: only this version
    "click>=8.0",           # Minimum: 8.0 or any newer
    "numpy>=1.24,<2.0",     # Range: between 1.24 and 2.0 (exclusive)
    "pandas~=2.1.0",        # Compatible: >=2.1.0 and <2.2.0
]
```

**Calendar Versioning (CalVer)** is an alternative used by Ubuntu (`24.04`, `24.10`), where versions encode release dates. Useful for communicating recency, but says nothing about compatibility.

***

#### Reproducibility and Lock Files

A `pyproject.toml` with version ranges like `requests>=2.0` is flexible — it works with many versions. But two developers installing at different times might get different versions, leading to "works on my machine" bugs.

The solution: **lock files**. A lock file records the exact resolved version (and content hash) of every package in the dependency tree — including all transitive dependencies.

```bash
# Generate the lock file
uv lock
# → uv.lock (commit this to git for applications)

# Install exactly what the lock file specifies (reproducible);
# --locked fails if uv.lock is out of date with pyproject.toml
uv sync --locked
```

A lock file excerpt:

```toml
[[package]]
name = "requests"
version = "2.32.3"
source = { registry = "https://pypi.org/simple" }
wheels = [
    { url = "https://files.../requests-2.32.3-py3-none-any.whl", hash = "sha256:70761cfe03c773ceb22aa2f671b4757976145175cdfca038c02654d061d6dcc6" },
]
```
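The `hash` field is what makes an install verifiable: the installer hashes the downloaded file and refuses it if the digest differs from the pinned one. A minimal sketch of that check using only the standard library (the function names and the sample bytes are illustrative, and this is not how `uv` is implemented internally):

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the digest in the 'sha256:<hex>' form used in lock files."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def verify_artifact(data: bytes, pinned_hash: str) -> bool:
    """True only if the downloaded bytes match the hash recorded at lock time."""
    return sha256_of(data) == pinned_hash


# Stand-in for a downloaded wheel; a real check would read the file's bytes.
original = b"example wheel contents"
pinned = sha256_of(original)

print(verify_artifact(original, pinned))         # True
print(verify_artifact(original + b"!", pinned))  # False: contents changed
```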

**Library vs. application:** This distinction is critical:

|                     | Library                                       | Application / Service                 |
| ------------------- | --------------------------------------------- | ------------------------------------- |
| Dependency versions | **Ranges** — maximize ecosystem compatibility | **Pinned** — maximize reproducibility |
| Lock file           | Usually not committed                         | Always committed                      |
| Why                 | Users may combine your library with others    | You control the full environment      |

For **maximum reproducibility** beyond package versions (pinning compilers, system libraries, even the build environment itself), tools like [Nix](https://nixos.org/) and [Bazel](https://bazel.build/) provide hermetic builds where every input is content-addressed.

***

#### Containers and Docker

**The problem containers solve:**

Even with a lock file, your Python application might depend on system libraries (`libpq-dev` for PostgreSQL, CUDA drivers for GPU), a specific OS version, or other packages outside any Python package manager's scope. Containers package the *entire filesystem* of an application — system libraries, configs, binaries, and all — into a single portable artifact.

|              | Virtual Machine                | Container                           |
| ------------ | ------------------------------ | ----------------------------------- |
| Isolation    | Full OS (kernel + userspace)   | Userspace only (shares host kernel) |
| Size         | GBs                            | MBs to low GBs                      |
| Startup time | Minutes                        | Seconds or less                     |
| Overhead     | High (separate kernel)         | Low                                 |
| Security     | Strong (fully isolated kernel) | Weaker (shared kernel)              |

**Key Docker concepts:**

* **Image**: A packaged, read-only template. Like a class in OOP.
* **Container**: A running instance of an image. Like an object.
* **Dockerfile**: Instructions for building an image (a step-by-step recipe).
* **Layer**: Each filesystem-changing instruction in a Dockerfile (`RUN`, `COPY`, `ADD`) creates a cached, immutable layer.
* **Registry**: A server that stores and distributes images (Docker Hub, GHCR).

```bash
# Run an existing image (downloads if needed)
docker run -it python:3.12 python

# Build an image from a Dockerfile in current directory
docker build -t myapp:1.0 .

# Run your image
docker run -p 8080:8080 myapp:1.0

# List running containers
docker ps

# Stop a container
docker stop <container-id>

# List local images
docker images
```

***

#### Writing a Good Dockerfile

A naive Dockerfile works, but creates large, slow, insecure images:

```dockerfile
# ❌ Naive — large, slow, unoptimized
FROM python:3.12
RUN apt-get update
RUN apt-get install -y gcc
RUN apt-get install -y libpq-dev
RUN pip install numpy
RUN pip install pandas
COPY . /app
WORKDIR /app
RUN pip install .
```

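Problems: the full `python:3.12` base image is large (\~1 GB); `apt-get update` runs in its own layer, so a stale cached package index can be reused by the later `apt-get install` layers; the apt package lists are never cleaned up; and the unpinned `pip install` lines make rebuilds non-reproducible.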

```dockerfile
# ✅ Optimized: slim base, layered correctly, cache-efficient
FROM python:3.12-slim

# Copy the uv binary from its official image
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv

# Install system deps in one layer with cleanup
RUN apt-get update && \
    apt-get install -y --no-install-recommends gcc libpq-dev && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Install Python deps BEFORE copying app code
# (these layers are cached unless pyproject.toml/uv.lock change).
# uv.lock is not a requirements file, so export it to one first.
COPY pyproject.toml uv.lock ./
RUN uv export --frozen --no-emit-project -o requirements.txt && \
    uv pip install --system -r requirements.txt

# Copy application code (changes frequently, so it goes last)
COPY . /app
```

**The key layering principle:** Put things that change rarely near the top of the Dockerfile (stable base image, system packages, dependency installation) and things that change often (your application code) near the bottom. Docker caches layers from the top; a change invalidates only that layer and everything below it.

***

#### Configuration Management

Configuration should live outside the codebase and be provided at runtime. The same Docker image should be deployable to dev, staging, and production with only configuration changes.

**Environment variables** — the most common pattern:

```python
import os

# With a default value (optional setting)
DATABASE_URL = os.environ.get("DATABASE_URL", "sqlite:///local.db")
DEBUG = os.environ.get("DEBUG", "false").lower() == "true"
PORT = int(os.environ.get("PORT", "8080"))

# Without a default (required setting — raises KeyError if missing)
API_KEY = os.environ["API_KEY"]
```
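For anything beyond a few settings, it can help to read the environment once at startup and fail fast if something required is missing. A minimal sketch, assuming the same four settings as above (the `Settings` dataclass and `load_settings` function are illustrative names; passing the environment as a mapping keeps the function easy to test):

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    database_url: str
    debug: bool
    port: int
    api_key: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build a Settings object from environment variables, failing fast."""
    required = ("API_KEY",)
    missing = [name for name in required if name not in env]
    if missing:
        raise RuntimeError(f"missing required settings: {', '.join(missing)}")
    return Settings(
        database_url=env.get("DATABASE_URL", "sqlite:///local.db"),
        debug=env.get("DEBUG", "false").lower() == "true",
        port=int(env.get("PORT", "8080")),
        api_key=env["API_KEY"],
    )


# Same code, different environments: only the mapping changes.
dev = load_settings({"API_KEY": "dev-key", "DEBUG": "true"})
prod = load_settings({"API_KEY": "prod-key", "PORT": "443",
                      "DATABASE_URL": "postgresql://db/myapp"})
print(dev)
print(prod)
```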

**Config files** — good for complex structured configuration:

```yaml
# config.yaml
database:
  url: "postgresql://localhost/myapp"
  pool_size: 5
server:
  host: "0.0.0.0"
  port: 8080
  debug: false
```

**Secrets handling — golden rules:**

1. **Never commit secrets to version control** — not even in `.env` files
2. Store them in a secrets manager (AWS Secrets Manager, HashiCorp Vault, GitHub Secrets)
3. Inject at runtime via environment variables
4. Add `.env` to `.gitignore` globally

```bash
# In .gitignore — always
.env
*.key
*.pem
secrets.json
credentials.yaml
```

***

#### Docker Compose: Multi-Service Orchestration

Real applications are rarely a single process. A typical web application needs: a web server, a database, a cache, possibly a background worker. Docker Compose lets you declare all these services in one file and start them together.

```yaml
# docker-compose.yml
services:
  web:
    build: .
    ports:
      - "8080:8080"          # host:container port mapping
    environment:
      - DATABASE_URL=postgresql://db:5432/myapp
      - REDIS_URL=redis://cache:6379
    depends_on:              # controls start order only, not readiness
      - db
      - cache

  db:
    image: postgres:16-alpine
    environment:
      - POSTGRES_DB=myapp
      - POSTGRES_PASSWORD=secret   # demo value only; inject real secrets at runtime
    volumes:
      - postgres_data:/var/lib/postgresql/data   # persistent storage

  cache:
    image: redis:7-alpine
    volumes:
      - redis_data:/data

volumes:
  postgres_data:
  redis_data:
```

```bash
docker compose up          # start all services (logs to terminal)
docker compose up -d       # start in background (detached)
docker compose logs -f web # follow logs for the 'web' service
docker compose down        # stop and remove containers
docker compose down -v     # also remove volumes (wipes data)
```

Docker's internal DNS resolves service names automatically — the `web` container can reach PostgreSQL at `db:5432` and Redis at `cache:6379`, without any manual IP configuration.
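From the application's point of view, the service name is simply the hostname inside the connection URL. A small sketch that pulls the host and port out of the URLs used in the Compose file, using only the standard library (`describe_endpoint` is an illustrative helper):

```python
from urllib.parse import urlsplit


def describe_endpoint(url: str) -> tuple[str, str, int]:
    """Return (scheme, hostname, port) for a service URL."""
    parts = urlsplit(url)
    return parts.scheme, parts.hostname, parts.port


print(describe_endpoint("postgresql://db:5432/myapp"))  # ('postgresql', 'db', 5432)
print(describe_endpoint("redis://cache:6379"))          # ('redis', 'cache', 6379)
```

Inside the Compose network, `db` and `cache` resolve to the matching containers; outside it, the same names would not resolve, which is why these URLs are supplied as configuration rather than hard-coded.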

***

#### Publishing Packages

Once your package works, you may want to share it. The Python ecosystem uses **PyPI** (Python Package Index) as its central registry.

```bash
# Build the distribution artifacts
uv build
# → dist/greeting-0.1.0.tar.gz
# → dist/greeting-0.1.0-py3-none-any.whl

# Test on TestPyPI first (safe staging environment)
uv publish --publish-url https://test.pypi.org/legacy/

# Install from TestPyPI to verify it works
uv pip install --index-url https://test.pypi.org/simple/ greeting

# When ready, publish to real PyPI
uv publish
```

**Package registries by ecosystem:**

| Language        | Registry                                                        | Install tool             |
| --------------- | --------------------------------------------------------------- | ------------------------ |
| Python          | [PyPI](https://pypi.org/)                                       | `pip` / `uv`             |
| JavaScript/Node | [npm](https://npmjs.com/)                                       | `npm` / `pnpm`           |
| Rust            | [crates.io](https://crates.io/)                                 | `cargo`                  |
| Ruby            | [RubyGems](https://rubygems.org/)                               | `gem`                    |
| Go              | GitHub / module proxy                                           | `go get` (decentralized) |
| Containers      | [Docker Hub](https://hub.docker.com/), [GHCR](https://ghcr.io/) | `docker pull`            |

**Installing from GitHub directly** (before formal release):

```bash
# Install Python package from git (builds from source)
pip install git+https://github.com/psf/requests.git

# Install from a specific tag
pip install git+https://github.com/psf/requests.git@v2.32.3
```

***

### Mental Model

#### The Artifact as a Contract

Think of an artifact not just as a file, but as a **contract** with an environment. The artifact says: "I need these dependencies within these version ranges, on this OS, with this architecture, with these system libraries available."

```
Source Code + pyproject.toml
         │
         ▼ (uv build)
  Wheel artifact
  [declares: needs Python >=3.11, numpy >=1.24]
         │
         ▼ (pip install)
  Installed in an environment
  [satisfies: Python 3.12, numpy 1.26.3 installed]
         │
         ▼ (python -c "import mylib")
  Works ✅
```

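When a declared part of the contract is violated (wrong Python version, incompatible platform, conflicting dependency), the installation fails loudly rather than failing silently at runtime. Requirements that the metadata cannot express, such as a missing system library, can still surface only at import time.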

***

#### Containers: Shipping the Environment, Not Just the Code

The traditional deployment problem:

```
Dev machine:     Ubuntu 22.04, Python 3.11, libpq 14.x, glibc 2.35
Production:      Ubuntu 20.04, Python 3.10, libpq 12.x, glibc 2.31

Result: "it worked in dev but fails in prod"
```

Containers invert this model. Instead of adapting your code to the environment, you **ship the environment with the code**:

```
Container image: Ubuntu 22.04 + Python 3.12 + libpq 16 + your app
  ↓
Runs on any Linux host with a matching CPU architecture (or macOS/Windows with Docker Desktop)
  ↓
Same userspace environment everywhere; only the host kernel differs
```

The host OS kernel is shared (containers are not VMs), but everything else — filesystem, system libraries, runtimes — is bundled in the image.

***

#### The Dependency Resolution Problem

Package managers must solve a constraint satisfaction problem: given a set of packages, each with version constraints on their dependencies, find a single set of package versions that simultaneously satisfies every constraint.

```
Your project needs:
  package-A >= 1.0
  package-B >= 2.0

package-A 1.5 needs: shared-lib >= 3.0, < 4.0
package-B 2.3 needs: shared-lib >= 3.5

Resolution: shared-lib must be >= 3.5 and < 4.0 → use 3.9 ✅

But if package-B 2.3 needed: shared-lib >= 4.0
→ No valid version exists → dependency hell ❌
```

This is why version constraint syntax (`>=`, `<`, `~=`, `==`) matters: each constraint encodes which region of the version space is acceptable, and the resolver must find a version inside the intersection of all of them.
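The worked example above can be reproduced in a few lines. This is a deliberately tiny sketch, assuming purely numeric versions and only the `>=`, `<=`, `==`, `>`, `<` operators (the helper names and the candidate list are hypothetical; real resolvers also handle `~=`, pre-releases, and backtracking across many packages):

```python
import operator
from typing import Optional

OPERATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_version(text: str) -> tuple[int, ...]:
    return tuple(int(part) for part in text.split("."))


def satisfies(version: str, specifier: str) -> bool:
    """Check a version against a comma-separated specifier like '>=3.0,<4.0'."""
    for clause in specifier.split(","):
        clause = clause.strip()
        # Try two-character operators before one-character ones.
        for symbol in (">=", "<=", "==", ">", "<"):
            if clause.startswith(symbol):
                bound = parse_version(clause[len(symbol):])
                if not OPERATORS[symbol](parse_version(version), bound):
                    return False
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


def resolve(candidates: list[str], specifiers: list[str]) -> Optional[str]:
    """Return the newest candidate satisfying every specifier, or None."""
    valid = [v for v in candidates if all(satisfies(v, s) for s in specifiers)]
    return max(valid, key=parse_version, default=None)


# Hypothetical published versions of shared-lib.
SHARED_LIB_VERSIONS = ["3.0", "3.5", "3.9", "4.0", "4.1"]

print(resolve(SHARED_LIB_VERSIONS, [">=3.0,<4.0", ">=3.5"]))  # 3.9
print(resolve(SHARED_LIB_VERSIONS, [">=3.0,<4.0", ">=4.0"]))  # None
```

A result of `None` is the "dependency hell" case: the constraints have an empty intersection, so no single installed version can satisfy both packages.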

***

#### Layer Caching: Why Order Matters in Dockerfiles

Docker builds are incremental. Each layer is cached and reused unless it (or anything before it) changes.

```
Dockerfile order:         Change frequency:
  FROM python:3.12-slim   ← almost never
  RUN apt-get install ... ← rarely
  COPY requirements.txt   ← when deps change
  RUN pip install -r ...  ← when deps change
  COPY . /app             ← every code change ← put code last!
```

If you copy your entire application before installing dependencies:

```dockerfile
COPY . /app               # ← changes on every code edit
RUN pip install -r requirements.txt   # ← cache invalidated every time!
```

A single line-of-code change forces Docker to reinstall all dependencies. Reversing the order means dependency layers are only rebuilt when `requirements` files actually change — not on every code edit.
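The invalidation rule is simple enough to simulate. The sketch below is a toy model, not Docker's actual cache logic: each layer is an instruction plus the set of files it reads, and the first layer whose inputs changed is a cache miss along with every layer after it (`rebuilt_layers` and the two example orders are illustrative).

```python
def rebuilt_layers(layers: list[tuple[str, set[str]]], changed: set[str]) -> list[str]:
    """Return the instructions that must re-run after `changed` files are edited."""
    for index, (_, inputs) in enumerate(layers):
        if inputs & changed:
            # Cache miss here invalidates this layer and all later ones.
            return [instruction for instruction, _ in layers[index:]]
    return []


BAD_ORDER = [
    ("FROM python:3.12-slim", set()),
    ("COPY . /app", {"requirements.txt", "app.py"}),
    ("RUN pip install -r requirements.txt", set()),
]

GOOD_ORDER = [
    ("FROM python:3.12-slim", set()),
    ("COPY requirements.txt .", {"requirements.txt"}),
    ("RUN pip install -r requirements.txt", set()),
    ("COPY . /app", {"requirements.txt", "app.py"}),
]

# Editing only app.py:
print(rebuilt_layers(BAD_ORDER, {"app.py"}))   # COPY and pip install both re-run
print(rebuilt_layers(GOOD_ORDER, {"app.py"}))  # only the final COPY re-runs
```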

***

### Commands and Syntax

#### Virtual Environments with `uv`

```bash
# Create a venv (uses .venv by default)
uv venv
uv venv --python 3.12 .venv-py312   # specific Python version

# Activate
source .venv/bin/activate           # Linux/macOS
.venv\Scripts\activate              # Windows

# Deactivate
deactivate

# Install packages
uv pip install requests flask
uv pip install -r requirements.txt
uv pip install -e .                 # editable install (development mode)

# Show installed packages
uv pip list
uv pip show requests                # details about a specific package

# Uninstall
uv pip uninstall requests
```

***

#### Lock Files with `uv`

```bash
# Generate lock file from pyproject.toml
uv lock

# Install exactly what the lock file specifies
uv sync --locked                    # installs missing packages, removes extraneous ones

# Show dependency tree
uv tree

# Update a specific package and regenerate lock
uv lock --upgrade-package requests
```

***

#### Building Packages

```bash
# Build both wheel and source distribution
uv build

# Build only wheel
uv build --wheel

# Inspect wheel contents
unzip -l dist/greeting-0.1.0-py3-none-any.whl

# Install locally for testing
uv pip install dist/greeting-0.1.0-py3-none-any.whl
```

***

#### Docker Commands

```bash
# Run a container interactively
docker run -it python:3.12 python
docker run -it --rm ubuntu:22.04 bash   # --rm removes container on exit

# Run with environment variables
docker run -e API_KEY=abc123 myapp

# Run with port mapping (host:container)
docker run -p 8080:8080 myapp

# Run with volume mount (host:container)
docker run -v "$(pwd)/data":/app/data myapp

# Build image from Dockerfile
docker build -t myapp:1.0 .
docker build -t myapp:1.0 -f Dockerfile.prod .

# Push to a registry
docker push ghcr.io/username/myapp:1.0

# View layers and sizes
docker history myapp:1.0

# Remove unused images/containers
docker system prune
docker image prune -a
```

***

#### Docker Compose Commands

```bash
docker compose up               # start all services (logs to terminal)
docker compose up -d            # detached (background)
docker compose up --build       # rebuild images before starting
docker compose down             # stop and remove containers
docker compose down -v          # also remove named volumes (destroys data)
docker compose ps               # list service status
docker compose logs web         # logs for 'web' service
docker compose logs -f web      # follow logs live
docker compose exec web bash    # open shell in running container
docker compose run web pytest   # run a one-off command in a service
docker compose pull             # pull latest images for all services
```

***

#### Publishing to PyPI

```bash
# Build first
uv build

# Publish to TestPyPI (safe to experiment)
uv publish --publish-url https://test.pypi.org/legacy/

# Install from TestPyPI to verify
uv pip install --index-url https://test.pypi.org/simple/ greeting

# Publish to real PyPI (requires account + API token)
uv publish
```

***

### Command Flow Diagrams

#### The Packaging Pipeline: From Source to Installation

```mermaid
graph LR
    SRC["Source Code\n+ pyproject.toml"]
    BUILD["uv build"]
    WHEEL["wheel .whl\n+ sdist .tar.gz"]
    PYPI["PyPI / TestPyPI\n(registry)"]
    INSTALL["pip / uv pip install"]
    ENV["Virtual Environment\n(installed packages)"]
    RUN["python myapp.py ✅"]

    SRC --> BUILD --> WHEEL
    WHEEL -->|"uv publish"| PYPI
    PYPI -->|"uv pip install"| ENV
    WHEEL -->|"uv pip install ./wheel"| ENV
    ENV --> RUN

    style WHEEL fill:#4a2d00,color:#fff
    style PYPI fill:#1a3a5c,color:#fff
    style ENV fill:#2d4a22,color:#fff
```

***

#### Dependency Resolution

```mermaid
graph TD
    PROJ["Your Project\npyproject.toml"]

    A["package-A 1.5\nneeds: shared-lib >=3.0,<4.0"]
    B["package-B 2.3\nneeds: shared-lib >=3.5"]
    SHARED["shared-lib\n(must satisfy BOTH)\n→ >=3.5 and <4.0\n→ pick 3.9 ✅"]

    PROJ -->|"requires A>=1.0"| A
    PROJ -->|"requires B>=2.0"| B
    A -->|"depends on"| SHARED
    B -->|"depends on"| SHARED

    LOCK["uv.lock\nshared-lib==3.9 (pinned)\npackage-A==1.5 (pinned)\npackage-B==2.3 (pinned)"]
    SHARED --> LOCK

    style LOCK fill:#2d4a22,color:#fff
    style SHARED fill:#4a2d00,color:#fff
```

***

#### VM vs. Container Architecture

```mermaid
graph TB
    subgraph "Virtual Machine"
        HW1["Physical Hardware"]
        HYP["Hypervisor"]
        OS1["Guest OS 1\n(full Linux kernel)"]
        OS2["Guest OS 2\n(full Linux kernel)"]
        APP1["App + Deps"]
        APP2["App + Deps"]
        HW1 --> HYP
        HYP --> OS1 --> APP1
        HYP --> OS2 --> APP2
    end

    subgraph "Containers"
        HW2["Physical Hardware"]
        HOST["Host OS\n(Linux kernel — shared)"]
        CR["Container Runtime\n(containerd/Docker)"]
        C1["Container 1\n(userspace + app + deps)"]
        C2["Container 2\n(userspace + app + deps)"]
        HW2 --> HOST --> CR
        CR --> C1
        CR --> C2
    end

    style HYP fill:#4a1a5c,color:#fff
    style CR fill:#1a3a5c,color:#fff
```

***

#### Docker Layer Caching: Good vs. Bad Order

```mermaid
graph TB
    subgraph "❌ Bad Order — deps reinstall on every code change"
        B1["FROM python:3.12-slim"]
        B2["COPY . /app  ← changes every edit"]
        B3["RUN pip install  ← cache always invalidated!"]
        B1 --> B2 --> B3
    end

    subgraph "✅ Good Order — deps cached, only app layer rebuilds"
        G1["FROM python:3.12-slim  ← stable"]
        G2["RUN apt-get install  ← stable"]
        G3["COPY pyproject.toml uv.lock  ← changes when deps change"]
        G4["RUN uv pip install  ← cached unless lock changes"]
        G5["COPY . /app  ← changes on every code edit"]
        G1 --> G2 --> G3 --> G4 --> G5
    end

    style B2 fill:#4a0000,color:#fff
    style B3 fill:#4a0000,color:#fff
    style G4 fill:#2d4a22,color:#fff
    style G5 fill:#1a3a5c,color:#fff
```

***

#### Docker Compose Multi-Service Architecture

```mermaid
graph LR
    CLIENT["Browser / API Client"]

    subgraph "docker-compose.yml"
        WEB["web container\n:8080\n(your app)"]
        DB["db container\n:5432\n(postgres:16-alpine)"]
        CACHE["cache container\n:6379\n(redis:7-alpine)"]
        WORKER["worker container\n(background tasks)"]
    end

    subgraph "Persistent Volumes"
        VOL1["postgres_data"]
        VOL2["redis_data"]
    end

    CLIENT -->|"HTTP :8080"| WEB
    WEB -->|"SQL queries\npostgresql://db:5432"| DB
    WEB -->|"cache reads/writes\nredis://cache:6379"| CACHE
    WEB -->|"task queue"| WORKER

    DB --- VOL1
    CACHE --- VOL2

    style WEB fill:#2d4a22,color:#fff
    style DB fill:#1a3a5c,color:#fff
    style CACHE fill:#4a2d00,color:#fff
```

***

#### SemVer Decision Tree

```mermaid
graph TD
    CHANGE["You made a change.\nWhat kind?"]
    BUG["Bug fix — no API change"]
    FEAT["New feature — backwards compatible"]
    BREAK["API change — users must update code"]

    PATCH["Bump PATCH\n1.2.3 → 1.2.4"]
    MINOR["Bump MINOR\n1.2.3 → 1.3.0"]
    MAJOR["Bump MAJOR\n1.2.3 → 2.0.0"]

    SAFE1["✅ Safe for users on >=1.2.x"]
    SAFE2["✅ Safe for users on >=1.x"]
    WARN["⚠️ Users must review changelog\nbefore upgrading"]

    CHANGE --> BUG --> PATCH --> SAFE1
    CHANGE --> FEAT --> MINOR --> SAFE2
    CHANGE --> BREAK --> MAJOR --> WARN
```

***

### Command Pipeline Examples

#### Set Up a Python Project from Scratch

```bash
# 1. Create project directory and venv
mkdir myproject && cd myproject
uv venv
source .venv/bin/activate

# 2. Create pyproject.toml
cat > pyproject.toml << 'EOF'
[project]
name = "myproject"
version = "0.1.0"
requires-python = ">=3.11"
dependencies = ["click>=8.0", "requests>=2.32"]

[project.scripts]
myapp = "myproject.cli:main"

[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"
EOF

# 3. Install deps and lock
uv pip install -e .
uv lock

# 4. Inspect what was resolved
uv tree
```

***

#### Build, Tag, and Push a Docker Image to GHCR

```bash
# 1. Login to GitHub Container Registry
echo "$GITHUB_TOKEN" | docker login ghcr.io -u USERNAME --password-stdin

# 2. Build once and apply both tags (version and latest)
docker build -t ghcr.io/username/myapp:1.0.0 -t ghcr.io/username/myapp:latest .

# 3. Push both tags
docker push ghcr.io/username/myapp:1.0.0
docker push ghcr.io/username/myapp:latest

# 4. Verify the image
docker run --rm ghcr.io/username/myapp:1.0.0 myapp --version
```

***

#### Inspect What Changed Between Environments

```bash
# Capture environment before activating venv
printenv | sort > before.txt

# Activate and capture after
source .venv/bin/activate
printenv | sort > after.txt

# See what changed
diff before.txt after.txt
# Most significant change: PATH now has .venv/bin prepended
# That's why 'python' and 'pip' resolve to the venv versions
```
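The same check can be made from inside Python. A minimal sketch using only the standard library: in a virtual environment `sys.prefix` points at the venv directory while `sys.base_prefix` still points at the base installation (`in_virtualenv` is an illustrative helper name).

```python
import sys


def in_virtualenv() -> bool:
    """True when the running interpreter belongs to a virtual environment."""
    return sys.prefix != sys.base_prefix


print("interpreter:", sys.executable)
print("prefix:     ", sys.prefix)
print("base prefix:", sys.base_prefix)
print("in a venv:  ", in_virtualenv())
```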

***

#### Run a Multi-Service Stack and Debug It

```bash
# Start everything
docker compose up -d

# Check all services are healthy
docker compose ps

# Follow logs for one service
docker compose logs -f web

# Open a shell in the running web container
docker compose exec web bash

# Inside the container — inspect the environment
env | grep DATABASE
python -c "import mylib; print('imports work')"

# Stop everything cleanly
docker compose down
```

***

### Real World Workflows

#### Reproducing "Works on My Machine" with Docker

```bash
# 1. Colleague reports a bug that only happens on Linux
# You're on macOS. Run a comparable Linux environment:
docker run -it --rm \
  -v "$(pwd)":/app \
  -w /app \
  python:3.12-slim \
  bash

# Inside the container (now on Linux):
pip install -e .
python reproduce_bug.py
# ← Bug reproduced on your machine in a Linux environment
```

***

#### Setting Up a Complete Python Package for PyPI

```bash
# Project structure
mypackage/
├── src/
│   └── mypackage/
│       ├── __init__.py
│       ├── cli.py
│       └── core.py
├── tests/
│   └── test_core.py
├── pyproject.toml
├── README.md
└── uv.lock

# pyproject.toml (complete)
[project]
name = "mypackage"
version = "1.0.0"
description = "My useful package"
readme = "README.md"
requires-python = ">=3.11"
license = { text = "MIT" }
authors = [{ name = "Alice", email = "alice@example.com" }]
dependencies = ["click>=8.0"]

[project.urls]
Repository = "https://github.com/alice/mypackage"

[project.scripts]
mypackage = "mypackage.cli:main"

[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"
```

```bash
# Test locally
uv pip install -e .
pytest

# Build
uv build

# Publish to TestPyPI, verify, then publish for real
uv publish --publish-url https://test.pypi.org/legacy/
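uv pip install --index-url https://test.pypi.org/simple/ mypackage
# ← install and test from TestPyPI
# (dependencies such as click may be missing or outdated on TestPyPI;
#  if resolution fails, install them from PyPI first)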
uv publish  # publish to real PyPI
```

***

#### Inject Secrets at Runtime (Never in Code)

```bash
# .env file (gitignored — never committed)
DATABASE_URL=postgresql://user:pass@localhost/mydb
API_KEY=sk-live-abc123secret
DEBUG=false

# Load in docker compose via env_file
services:
  web:
    build: .
    env_file:
      - .env   # Docker reads this, injects each line as an env var

# Load in development
set -a; source .env; set +a  # manual; set -a exports each variable so child processes see it
# or use python-dotenv:
# pip install python-dotenv
# from dotenv import load_dotenv; load_dotenv()

# In production — inject from a secrets manager, not a file
# AWS: aws secretsmanager get-secret-value | jq -r .SecretString | ...
# GitHub Actions: map ${{ secrets.API_KEY }} to an env var in the workflow's env: block (not exposed automatically)
```
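To make "injects each line as an env var" concrete, here is a minimal Python sketch of what a `.env` loader does. It is an illustration under simplifying assumptions, not how Docker or `python-dotenv` is implemented: it handles only plain `KEY=VALUE` lines and ignores quoting, `export` prefixes, and variable expansion. The function names `parse_env_file` and `inject` are hypothetical.

```python
def parse_env_file(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; skip blank lines and '#' comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so values may themselves contain '='.
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=VALUE line; ignored in this sketch
        values[key.strip()] = value.strip()
    return values


def inject(values: dict[str, str], environ: dict[str, str]) -> None:
    """Add parsed values to an environment mapping, keeping keys that are already set."""
    for key, value in values.items():
        environ.setdefault(key, value)


EXAMPLE = """\
# .env file (gitignored)
DATABASE_URL=postgresql://user:pass@localhost/mydb
DEBUG=false
"""

env = {"DEBUG": "true"}  # pretend the caller already set DEBUG
inject(parse_env_file(EXAMPLE), env)
print(env)
```

Keeping already-set keys is a design choice in this sketch: it lets a value passed explicitly on the command line win over the file. In a real program you would pass `os.environ` instead of a plain dict.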

***

### Productivity Tricks

#### `uv` Makes Everything Faster

```bash
# Old workflow (pip)
python -m venv venv && source venv/bin/activate && pip install -r requirements.txt

# New workflow (uv: typically 10-100× faster, most noticeably with a warm cache)
uv venv && source .venv/bin/activate && uv pip install -r requirements.txt

# Even faster — skip the separate venv step
uv run python myscript.py  # auto-creates venv, installs deps, runs
```

***

#### Multi-Stage Docker Builds: Keep Images Small

**Multi-stage builds** give you a full build environment during compilation while shipping only the minimal runtime in the final image:

```dockerfile
# Stage 1: Build (has compilers, dev tools)
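FROM python:3.12 AS builder
WORKDIR /build
COPY pyproject.toml uv.lock ./
# uv.lock is not a requirements file, so export it to one before installing
RUN pip install uv \
    && uv export --frozen --no-dev --no-emit-project -o requirements.txt \
    && uv pip install --system -r requirements.txt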

# Stage 2: Runtime (minimal — only what's needed to run)
FROM python:3.12-slim
COPY --from=builder /usr/local/lib/python3.12/site-packages /usr/local/lib/python3.12/site-packages
COPY . /app
WORKDIR /app
CMD ["python", "-m", "myapp"]
# Result: no build tools, no caches, much smaller image
```

***

#### Check for Outdated Dependencies

```bash
# See which packages have newer versions available
uv pip list --outdated

# Dependabot (GitHub) — automated PRs when dependencies have new versions
# Add .github/dependabot.yml to your repo:
# version: 2
# updates:
#   - package-ecosystem: "pip"
#     directory: "/"
#     schedule:
#       interval: "weekly"
```

***

#### Use `.dockerignore` to Avoid Bloating Images

```dockerignore
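# .dockerignore: syntax is similar to .gitignore, but patterns match from the build-context root
.git/
.venv/
**/__pycache__/
**/*.pyc
*.egg-info/
dist/
.env
.DS_Store
tests/
docs/
*.md
# Keep the README if pyproject.toml references it
!README.md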
```

Without this, `COPY . /app` copies your entire git history, your venv (often hundreds of megabytes), your `.env` secrets, and your test files into the image. With it, only production-relevant files are copied.

***

### Common Mistakes

#### ❌ Installing Packages Globally Instead of in a Venv

**Wrong:**

```bash
pip install requests flask pandas   # installs globally
```

**Correct:**

```bash
uv venv && source .venv/bin/activate
uv pip install requests flask pandas   # isolated to this project
```

**Why:** Global installs pollute every Python environment on the machine, cause dependency conflicts across projects, and modify OS-managed Python on many systems.

***

#### ❌ Committing `venv/` or `.venv/` to Git

**Wrong:** `git add venv/`, which adds thousands of installed-package files and virtual environment internals (including absolute paths specific to your machine) to your repository.

**Correct:**

```bash
# In .gitignore
.venv/
venv/
__pycache__/
*.egg-info/
dist/
```

Share `pyproject.toml` and `uv.lock` instead. Anyone can recreate the exact environment with `uv sync`, which creates `.venv` and installs the locked versions.

***

#### ❌ Pinning Everything in a Library's `pyproject.toml`

**Wrong (for a library):**

```toml
dependencies = [
    "numpy==1.26.3",   # forces users to use exactly this version
    "requests==2.32.3",
]
```

**Correct (for a library):**

```toml
dependencies = [
    "numpy>=1.24,<3.0",   # works with a range of versions
    "requests>=2.28",
]
```

**Why:** Your library will be installed alongside other packages. Pinning exact versions in a library causes conflicts with any user who has even a slightly different version. Reserve exact pinning for applications and lock files.
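The conflict can be sketched in a few lines of Python. This is a deliberately simplified checker, not a PEP 440 implementation: it assumes purely numeric `X.Y.Z` versions and only the operators shown, and the list of available versions is made up for illustration. Real tools use the `packaging` library and a full resolver.

```python
def parse(version: str) -> tuple[int, ...]:
    """Turn '1.26.3' into (1, 26, 3), padding '3.0' to (3, 0, 0)."""
    parts = [int(part) for part in version.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def satisfies(version: str, spec: str) -> bool:
    """Check a version against a comma-separated spec such as '>=1.24,<3.0'."""
    ops = {
        "==": lambda a, b: a == b,
        ">=": lambda a, b: a >= b,
        "<=": lambda a, b: a <= b,
        ">": lambda a, b: a > b,
        "<": lambda a, b: a < b,
    }
    for clause in spec.split(","):
        clause = clause.strip()
        # Try two-character operators first so '>=' is not read as '>'.
        op = next(o for o in ("==", ">=", "<=", ">", "<") if clause.startswith(o))
        if not ops[op](parse(version), parse(clause[len(op):])):
            return False
    return True


def compatible(candidates: list[str], specs: list[str]) -> list[str]:
    """Return the candidate versions that satisfy every spec at once."""
    return [v for v in candidates if all(satisfies(v, s) for s in specs)]


available = ["1.24.0", "1.26.3", "1.26.4", "2.1.0"]  # hypothetical release list

# Two libraries that each pin numpy exactly: no single version satisfies both.
print(compatible(available, ["==1.26.3", "==1.26.4"]))

# The same two libraries using ranges: several versions satisfy both.
print(compatible(available, [">=1.24,<3.0", ">=1.26"]))
```

The first call prints an empty list, which is what an installer reports as an unresolvable conflict. The second leaves the resolver room to pick a version that works for everyone.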

***

#### ❌ Putting `COPY . /app` Before Dependency Installation

**Wrong:**

```dockerfile
COPY . /app              # invalidated on every code change
RUN pip install -r requirements.txt  # reinstalls everything every build
```

**Correct:**

```dockerfile
COPY requirements.txt .  # only changes when deps change
RUN pip install -r requirements.txt  # cached layer
COPY . /app              # invalidated only on code changes
```

**Why:** Docker invalidates a layer and all subsequent layers when any file in a `COPY` changes. Copying application code before installing dependencies means every single code edit forces a full reinstall of all packages.
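The invalidation rule can be modelled with a short Python sketch. This is a toy model of the rule stated above, not Docker's actual cache implementation: each layer's key is assumed to depend on its parent's key, the instruction text, and (for `COPY`) the copied content. The helper names are hypothetical.

```python
import hashlib

Step = tuple[str, str]  # (instruction, content the instruction reads)


def layer_keys(steps: list[Step]) -> list[str]:
    """Compute one cache key per layer from the parent key, instruction, and inputs."""
    keys: list[str] = []
    parent = ""
    for instruction, content in steps:
        digest = hashlib.sha256(f"{parent}|{instruction}|{content}".encode()).hexdigest()
        keys.append(digest)
        parent = digest  # a changed layer changes every key after it
    return keys


def rebuilt(before: list[Step], after: list[Step]) -> list[str]:
    """List the instructions whose cache key changed between two builds."""
    old, new = layer_keys(before), layer_keys(after)
    return [step[0] for step, a, b in zip(after, old, new) if a != b]


def wrong_order(code: str, reqs: str) -> list[Step]:
    return [("COPY . /app", code + reqs), ("RUN pip install", "")]


def right_order(code: str, reqs: str) -> list[Step]:
    return [
        ("COPY requirements.txt .", reqs),
        ("RUN pip install", ""),
        ("COPY . /app", code + reqs),
    ]


# Edit application code only; requirements stay the same.
print(rebuilt(wrong_order("v1", "flask"), wrong_order("v2", "flask")))
print(rebuilt(right_order("v1", "flask"), right_order("v2", "flask")))
```

In the wrong order, a code edit changes the first key, so the install step is rebuilt too. In the right order, only the final `COPY` is rebuilt and the install layer is reused.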

***

#### ❌ Hardcoding Secrets in Code or Dockerfiles

**Wrong:**

```python
API_KEY = "sk-live-abc123secret"   # committed to git
```

```dockerfile
ENV API_KEY=sk-live-abc123secret   # baked into the image layer
```

**Correct:**

```python
import os
API_KEY = os.environ["API_KEY"]    # injected at runtime
```

```dockerfile
# Don't set secrets in Dockerfile at all
# Pass them at runtime:
# docker run -e API_KEY=... myapp
# or via docker compose env_file
```

**Why:** Secrets in code or Dockerfiles persist in git history and image layers forever, even after "deletion". They can be extracted from any copy of the repo or image.

***

#### ❌ Using `:latest` Tag in Production

**Wrong:**

```dockerfile
FROM python:latest       # today: 3.12; next month: 3.13 — different!
```

```yaml
# docker-compose.yml
cache:
  image: redis:latest    # unpredictable version
```

**Correct:**

```dockerfile
FROM python:3.12-slim    # specific, reproducible
```

```yaml
cache:
  image: redis:7-alpine  # specific version
```

**Why:** `:latest` means "whatever version is newest right now". Your image rebuild next month might pull a different base that breaks your application. Always pin to a specific version in production.

***

### Exercises

#### Beginner Exercises

1. **Observe virtualenv activation:**

   ```bash
   printenv | sort > before.txt
   python -m venv .venv && source .venv/bin/activate
   printenv | sort > after.txt
   diff before.txt after.txt
   ```

   What changed? Why does `which python` now point to `.venv/bin/python`? What is `deactivate` actually doing? (Hint: `which deactivate` then `type deactivate`.)
2. **Reproduce a `ModuleNotFoundError`:** Create `fetch.py` with `import requests; print(requests.get("https://example.com").status_code)`. Run it outside a venv (if requests isn't installed). Observe the error. Create a venv, install requests, run again. Explain why it failed before and works now.
3. **Explore SemVer:** Look at the releases page for a popular package (e.g., [requests on PyPI](https://pypi.org/project/requests/#history)). Find an example of a PATCH release, a MINOR release, and a MAJOR release. Read the changelog entries for each.
4. **Inspect a wheel:** Download a wheel file:

   ```bash
   pip download requests --no-deps --dest /tmp/wheels
   ls /tmp/wheels
   unzip -l /tmp/wheels/requests-*.whl
   ```

   What files are inside? What is in the `METADATA` file?
5. **Run Docker interactively:**

   ```bash
   docker run -it --rm python:3.12-slim bash
   # Inside the container:
   python --version
   pip install requests
   python -c "import requests; print('installed!')"
   exit
   # Run again — is requests still installed?
   docker run -it --rm python:3.12-slim python -c "import requests"
   ```

   Why did requests disappear? What does this tell you about container state?

***

#### Intermediate Exercises

6. **Create a Python package:**
   * Write a small library with one useful function
   * Write a `pyproject.toml` with the correct metadata
   * Add a CLI entry point using `click`
   * Build it with `uv build`
   * Install the wheel in a fresh venv and verify the CLI works
   * Create a `uv.lock` file and inspect its contents
7. **Write and optimize a Dockerfile:**
   * Write a naive Dockerfile for your Python app (all RUN commands separate)
   * Build it and note the image size: `docker images`
   * Rewrite it following the best practices (slim base, combined RUN, correct COPY order, uv for installs)
   * Build again. How much smaller is it? How much faster is a rebuild after a code change?
8. **Docker Compose stack:** Write a `docker-compose.yml` for a web application with:

   * A Python web service (`build: .`)
   * A Redis cache (`redis:7-alpine`)
   * Environment variable passing (`REDIS_URL=redis://cache:6379`)
   * Port mapping so you can reach the web service from your browser

   Start it with `docker compose up`, verify both containers are running with `docker compose ps`, then tear it down with `docker compose down`.
9. **Lock file comparison:** Create a project with version ranges (`requests>=2.0`). Generate a lock file. Inspect it — what exact version was resolved? Now change to `requests>=2.31,<2.32` and regenerate. What changed in the lock file?
10. **Configuration via environment variables:** Write a Python script that reads `DATABASE_URL` (with a default), `DEBUG` (as a boolean), and `API_KEY` (required). Run it:
    * Without `API_KEY` set (should fail with a clear error)
    * With `API_KEY=test DEBUG=true python script.py`
    * Injected via Docker: `docker run -e API_KEY=test myimage python script.py`

***

#### Advanced Challenge

11. **Publish to TestPyPI end-to-end:**
    * Create a package (use a unique name like `yourname-greeting`)
    * Write `pyproject.toml` with correct metadata
    * Build with `uv build`
    * Create an account on [test.pypi.org](https://test.pypi.org/)
    * Publish with `uv publish --publish-url https://test.pypi.org/legacy/`
    * Install from TestPyPI in a fresh venv to verify it works
    * Build a Docker image that installs your package from TestPyPI and runs it
12. **Multi-stage Docker build:** Take an application with a compiled dependency (e.g., a C extension, or anything that needs `gcc`). Write a multi-stage Dockerfile:

    * Stage 1 (`builder`): full image with build tools, compile the package
    * Stage 2 (final): `python:3.12-slim`, copy only the compiled artifacts

    Measure the size difference between the single-stage and multi-stage versions with `docker images`.
13. **Reproduce a dependency conflict:** Create two separate `pyproject.toml` files, each requiring incompatible versions of the same transitive dependency. Document the exact error that `uv pip install` produces. Then resolve it using a virtual environment to isolate each project.
14. **GitHub Pages:** Create a static website using any static site generator (e.g., Jekyll, MkDocs, or even plain HTML). Configure GitHub Pages in the repository settings to serve from the `gh-pages` branch or the `docs/` folder. Add a GitHub Actions workflow that automatically rebuilds and deploys the site on every push to `main`.

***

### Summary

| Topic                    | Core Idea                                                                           |
| ------------------------ | ----------------------------------------------------------------------------------- |
| **Dependencies**         | Your code needs other packages. Package managers resolve the full dependency graph. |
| **Dependency hell**      | Incompatible version requirements across packages. Solved by isolation.             |
| **Virtual environments** | Isolated package installations per project. Use one always.                         |
| **`uv`**                 | Modern, fast (10-100×) Python package manager. Replaces `pip` and `venv`.           |
| **Artifacts**            | Packaged, distributable outputs: wheels, binaries, container images                 |
| **`pyproject.toml`**     | Modern Python project manifest. Declares metadata, deps, scripts.                   |
| **SemVer**               | MAJOR.MINOR.PATCH — a compatibility contract between maintainers and users          |
| **Lock files**           | Pin every transitive dependency for reproducibility. Commit for apps, not libs.     |
| **Containers**           | Package app + entire filesystem. Eliminates "works on my machine".                  |
| **Dockerfile**           | Declarative recipe for building container images. Order layers by change frequency. |
| **Docker Compose**       | Orchestrate multi-service apps (web + db + cache) from one YAML file.               |
| **Configuration**        | Separate from code. Use env vars or config files. Never commit secrets.             |
| **Publishing**           | PyPI for Python, crates.io for Rust, npm for JS, Docker Hub for images.             |

#### Most Important Commands from This Lecture

```
uv venv / source .venv/bin/activate   – create and activate virtual environment
uv pip install package                – install packages (fast)
uv lock                               – generate lock file
uv build                              – build wheel + sdist
uv publish                            – publish to PyPI
uv tree                               – visualize dependency tree

docker build -t name .                – build a Docker image
docker run -it image                  – run a container interactively
docker run -p 8080:8080 -e KEY=val image  – run with port + env
docker images / docker ps             – list images / containers

docker compose up -d                  – start all services in background
docker compose down                   – stop and remove containers
docker compose exec service bash      – shell into running service
docker compose logs -f service        – follow service logs
```

#### What's Next

In **Lecture 7 – Agentic Coding**, you'll learn how AI coding agents work, how to use tools like Claude Code to delegate multi-step engineering tasks, and how to think about supervising autonomous systems that write and run code on your behalf.

***

Source: [MIT Missing Semester – Packaging and Shipping Code](https://missing.csail.mit.edu/2026/shipping-code/). Licensed under [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0).


