How to Automatically Generate requirements.txt from Python Imports: A Guide for GitHub Projects

If you’ve ever contributed to or maintained a Python project on GitHub, you’ve probably encountered the requirements.txt file. This humble text file lists the dependencies needed to run the project, ensuring that anyone cloning the repository can replicate the environment with a simple pip install -r requirements.txt. But manually curating requirements.txt is a recipe for frustration: missing dependencies, unused packages cluttering the file, or version mismatches that break reproducibility.

The good news? You don’t have to do it by hand. In this guide, we’ll explore automated tools and best practices to generate requirements.txt directly from your Python imports. Whether you’re a solo developer or part of a team, these techniques will save you time, reduce errors, and make your GitHub project more accessible to collaborators.

What is requirements.txt and Why It Matters

requirements.txt is a plain text file that lists Python packages (and their versions) required to run a project. It’s used by pip, Python’s package installer, to automate dependency installation:

pip install -r requirements.txt  

For GitHub projects, requirements.txt is critical for:

  • Reproducibility: Ensures everyone working on the project uses the same dependency versions, preventing "it works on my machine" issues.
  • Onboarding: Simplifies setup for new contributors or users (no manual package installation).
  • CI/CD Pipelines: Enables automated testing and deployment by specifying the exact environment needed.

Challenges with Manual Requirements Management

Manually updating requirements.txt is error-prone. Common issues include:

  • Missing Dependencies: Forgetting to add a new package after pip install.
  • Unused Packages: Leaving behind packages that were removed from the codebase, bloating the environment.
  • Version Confusion: Guessing version numbers or using latest, leading to unexpected breaking changes.
  • Environment Pollution: Accidentally including system-wide packages not relevant to the project.

Tools to Automatically Generate requirements.txt

Let’s explore the most popular tools to automate requirements.txt generation, along with their pros, cons, and step-by-step usage.

1. pipreqs: Scan Imports, Generate Requirements

How it works: pipreqs parses your Python files to detect imported packages, then generates requirements.txt containing only the dependencies your code actually uses. It ignores unused packages, making it ideal for lean projects.

Installation

pip install pipreqs  

Usage

Run pipreqs in your project root (replace ./my_project with your project path):

pipreqs ./my_project  

Example

Suppose your project has a file app.py with:

import requests  
from pandas import DataFrame  

pipreqs will generate requirements.txt:

requests==2.31.0  
pandas==2.1.4  

(Versions depend on the latest available at runtime; see "Pinning Versions" below.)
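The static scan pipreqs performs can be approximated with the standard library's ast module. The following is a simplified sketch of the technique, not pipreqs' actual implementation:

```python
# Simplified sketch of static import scanning, the technique pipreqs uses
# (not pipreqs' actual code): walk a file's AST and collect the top-level
# package name of every import statement.
import ast

def scan_imports(source: str) -> set[str]:
    packages = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                packages.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Skip relative imports (level > 0): they refer to the project itself.
            packages.add(node.module.split(".")[0])
    return packages

code = "import requests\nfrom pandas import DataFrame\n"
print(sorted(scan_imports(code)))  # ['pandas', 'requests']
```

Real tools additionally filter out standard-library modules and map import names to PyPI distribution names (e.g. import cv2 comes from opencv-python), which is where most of the complexity lives.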

Pros

  • Focuses on used imports: Ignores unused packages, keeping requirements.txt clean.
  • Lightweight: No extra configuration or dependencies.
  • Works with any project: No need for a specific project structure.

Cons

  • Misses dynamic imports: Fails to detect imports loaded via importlib or __import__().
  • Versions pinned from PyPI, not your environment: By default, pipreqs pins each package to the latest release on PyPI, which may not match what you actually have installed (see the workaround below).

Pro Tip: Controlling Versions with pipreqs

By default, pipreqs writes exact pins (==) based on the latest PyPI releases. Pass --use-local to pin to the versions installed in your environment instead, or --mode no-pin to omit versions entirely; --force overwrites an existing requirements.txt. For stricter control, combine pipreqs with pip freeze (see Section 2).

2. pip freeze: Capture the Entire Environment

How it works: pip freeze outputs all packages installed in your current Python environment (including their versions). Redirecting this output to requirements.txt captures your environment snapshot.

Usage

First, activate a clean virtual environment (critical to avoid capturing unrelated packages):

python -m venv .venv  
source .venv/bin/activate  # Linux/macOS  
.venv\Scripts\activate     # Windows  

Install only your project’s dependencies, then run:

pip freeze > requirements.txt  

Example

If your environment has requests==2.31.0 and pandas==2.1.4 installed, requirements.txt will contain:

pandas==2.1.4  
requests==2.31.0  
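pip freeze is essentially a report over the installed-distribution metadata that the standard library also exposes via importlib.metadata; a minimal freeze-like listing, as a sketch:

```python
# Minimal sketch of what pip freeze reports: every installed distribution
# in the active environment, pinned to its exact version.
from importlib.metadata import distributions

lines = sorted(f"{dist.metadata['Name']}=={dist.version}" for dist in distributions())
print("\n".join(lines))
```

Because this walks the whole environment, anything installed for unrelated experiments shows up too, which is exactly why a clean virtual environment matters here.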

Pros

  • Simple: No extra tools—built into pip.
  • Version precision: Includes exact versions, ensuring reproducibility.

Cons

  • Includes all environment packages: Captures unused packages and transitive dependencies (e.g., numpy, pulled in by pandas) alongside your direct ones.
  • Environment-dependent: Relies on your virtual environment being "clean" (no leftover packages).

Pro Tip: Use pip freeze with pipreqs

For the best of both worlds:

  1. Use pipreqs to get a list of used imports.
  2. Install those imports in a clean virtual environment.
  3. Run pip freeze to pin versions.
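Step 3's output can also be filtered down to just the packages pipreqs reported, keeping top-level pins while letting pip resolve the rest at install time. pin_used below is a hypothetical helper, not an existing tool:

```python
# Hypothetical helper (not an existing tool): keep only the pip freeze lines
# whose package name is in the set of used packages reported by pipreqs.
def pin_used(freeze_output: str, used: set[str]) -> list[str]:
    wanted = {pkg.lower() for pkg in used}
    return [
        line
        for line in freeze_output.splitlines()
        if line.split("==")[0].strip().lower() in wanted
    ]

freeze = "numpy==1.26.2\npandas==2.1.4\nrequests==2.31.0\nurllib3==2.1.0"
print(pin_used(freeze, {"pandas", "requests"}))  # ['pandas==2.1.4', 'requests==2.31.0']
```

Keep the unfiltered freeze output instead if you also want transitive dependencies (numpy, urllib3 above) pinned for full reproducibility.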

3. Poetry: A Modern Package Manager

How it works: Poetry is a full-featured package manager that uses pyproject.toml (instead of setup.py) to manage dependencies. It can export requirements.txt from its dependency graph.

Installation

curl -sSL https://install.python-poetry.org | python3 -  # Linux/macOS  
# Or via pip: pip install poetry  

Usage

  1. Initialize a Poetry project (or convert an existing one):

    poetry new my_project && cd my_project  
  2. Add dependencies (Poetry installs them and updates pyproject.toml):

    poetry add requests pandas  
  3. Export to requirements.txt (newer Poetry releases ship export as a separate plugin; if the command is missing, install it with poetry self add poetry-plugin-export):

    poetry export --format requirements.txt --output requirements.txt  

Example

pyproject.toml snippet:

[tool.poetry.dependencies]  
python = "^3.8"  
requests = "^2.31.0"  
pandas = "^2.1.4"  

Exported requirements.txt:

pandas==2.1.4; python_version >= "3.8" and python_version < "4.0"  
requests==2.31.0; python_version >= "3.8" and python_version < "4.0"  

Pros

  • Dependency resolution: Automatically handles version conflicts.
  • Virtual environment management: Creates and manages environments for you.
  • Dev vs. prod dependencies: Separate dev-dependencies (e.g., pytest) from production dependencies.

Cons

  • Steeper learning curve: More complex than pipreqs or pip freeze.
  • Overkill for small projects: Designed for packaging and distribution, not just requirements generation.

4. pip-tools: Compile Dependencies with Precision

How it works: pip-tools uses pip-compile to generate requirements.txt from a requirements.in file (a human-readable list of top-level dependencies). It resolves versions and pins all transitive dependencies.

Installation

pip install pip-tools  

Usage

  1. Create requirements.in with your top-level dependencies:

    requests  
    pandas  
  2. Run pip-compile to generate requirements.txt:

    pip-compile requirements.in -o requirements.txt  

Example

pip-compile resolves versions and outputs:

# This file is autogenerated by pip-compile with python 3.11  
# To update, run:  
# pip-compile requirements.in -o requirements.txt  
numpy==1.26.2  # via pandas  
pandas==2.1.4  
python-dateutil==2.8.2  # via pandas  
pytz==2023.3.post1  # via pandas  
requests==2.31.0  
six==1.16.0  # via python-dateutil  
urllib3==2.1.0  # via requests  

Pros

  • Explicit control: Separates "source" dependencies (requirements.in) from generated pins (requirements.txt).
  • Dev/prod separation: Use requirements-dev.in for development dependencies (e.g., pytest).

Cons

  • Extra setup: Requires maintaining .in files.
  • Less intuitive for beginners: Adds another layer of abstraction.

Best Practices for GitHub Projects

To maximize the effectiveness of requirements.txt in GitHub projects:

1. Commit requirements.txt to Version Control

Always include requirements.txt in your repo so collaborators can install dependencies immediately.

2. Use Virtual Environments

Never generate requirements.txt from a system-wide Python installation—use venv, conda, or Poetry to isolate dependencies.

3. Separate Dev and Production Dependencies

Create requirements-dev.txt for tools like pytest or black, and requirements.txt for production:

# requirements-dev.in  
-r requirements.txt  # Include production deps  
pytest==7.4.3  
black==23.12.1  

Then run pip-compile requirements-dev.in -o requirements-dev.txt.

4. Automate Updates with GitHub Actions

Add a pre-commit hook or GitHub Action to regenerate requirements.txt when dependencies change. For example, use a workflow to run pipreqs or poetry export on every push.
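A minimal workflow sketch along those lines (the file path, branch name, and auto-commit action are assumptions to adapt for your repo):

```yaml
# .github/workflows/requirements.yml -- hypothetical example
name: Regenerate requirements.txt
on:
  push:
    branches: [main]
jobs:
  requirements:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install pipreqs
      - run: pipreqs . --force
      - uses: stefanzweifel/git-auto-commit-action@v5
        with:
          commit_message: "chore: regenerate requirements.txt"
```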

5. Document the Process

In your README.md, explain how to generate/update requirements.txt (e.g., "Run poetry export to update dependencies").

Common Pitfalls and How to Avoid Them

1. Dynamic Imports

Problem: Static scanners like pipreqs cannot see imports performed at runtime, e.g. importlib.import_module("requests") or __import__("requests").
Fix: Add dynamically loaded packages to requirements.txt by hand (and re-add them after every regeneration, since pipreqs --force overwrites the file), or give each one an ordinary top-level import somewhere so the scanner can detect it.
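To see why static scanners miss these, compare what the AST contains with what actually happens at runtime (stdlib json stands in for a third-party package here):

```python
# A dynamic import leaves no Import/ImportFrom node in the AST, so a
# pipreqs-style static scan finds nothing -- yet the module loads at runtime.
import ast
import importlib

source = 'mod = importlib.import_module("json")'
nodes = [n for n in ast.walk(ast.parse(source))
         if isinstance(n, (ast.Import, ast.ImportFrom))]
print(nodes)  # [] -- invisible to a static scanner

mod = importlib.import_module("json")  # but it works at runtime
print(mod.dumps({"ok": True}))
```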

2. Environment Pollution

Problem: pip freeze includes unused packages from a messy virtual environment.
Fix: Always use a fresh virtual environment for generating requirements.txt.

3. Version Conflicts

Problem: A generated requirements.txt can pin mutually incompatible versions (e.g., requests==2.25.0 alongside urllib3==2.0.0, even though requests 2.25 requires urllib3<1.27).
Fix: Use Poetry or pip-tools for automatic dependency resolution.

4. Forgetting to Update

Problem: Adding a new dependency but not regenerating requirements.txt.
Fix: Use pre-commit hooks (e.g., pre-commit run pipreqs --all-files) to enforce updates.
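One way to wire that up is a local pre-commit hook; the configuration below is illustrative (no official pipreqs hook is assumed), and it requires pipreqs on your PATH:

```yaml
# .pre-commit-config.yaml -- illustrative local hook, not an official pipreqs hook
repos:
  - repo: local
    hooks:
      - id: pipreqs
        name: regenerate requirements.txt
        entry: pipreqs . --force
        language: system
        pass_filenames: false
        files: \.py$
```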

Conclusion

Automatically generating requirements.txt eliminates manual errors and ensures your GitHub projects are reproducible. The right tool depends on your workflow:

  • Small projects: Use pipreqs for simplicity.
  • Clean environments: Use pip freeze for version precision.
  • Modern workflows: Use Poetry or pip-tools for dependency management.

By combining these tools with best practices like virtual environments and CI/CD automation, you’ll make collaboration seamless and keep your project’s dependency hell at bay.
