GitSimulator: Production-Grade Git Simulation Engine
Update (Dec 2025): GitSimulator won 3rd Place at the GitKon 2025 Game Jam, organized by GitKraken.
Overview
GitSimulator is a production-ready Python CLI that acts as a "flight simulator" for Git commands. It simulates potentially destructive operations (rebase, merge, reset, cherry-pick) in a safe, read-only environment, providing visual before/after graphs, conflict prediction with three certainty levels (CERTAIN/LIKELY/POSSIBLE), and safety analysis with four danger levels (LOW/MEDIUM/HIGH/CRITICAL). Now available on PyPI: pipx install gitsimulator (v1.0.1). Repository: GitHub - GitSimulator.
Production Highlights
- 135+ Tests: Comprehensive test suite with 95%+ coverage (unit, integration, property-based)
- Automated CI/CD: GitHub Actions pipeline for linting, type checking, testing, and PyPI releases
- Interactive TUI: Textual-powered terminal UI with real-time command preview
- Plugin Architecture: Extensible hook system for custom simulators, formatters, and policies
- Pure Python Git: No git binary required; uses Dulwich for all operations
- Multi-Platform: Linux, macOS, Windows support (Python 3.11, 3.12, 3.13)
Why Build This?
Git's most powerful commands are also its most dangerous. A single mistyped git reset --hard or poorly planned git rebase can rewrite history, lose work, or create conflicts that take hours to resolve. GitSimulator addresses this by:
- Zero-Risk Preview: See exact outcomes without touching your repository
- Conflict Forecasting: Know which files will conflict before starting a merge/rebase
- Safety Guardrails: Automated risk scoring prevents catastrophic mistakes
- Educational Value: Learn Git internals through interactive explanations
- CI/CD Integration: Pre-validate dangerous operations in automation scripts
Architecture Deep Dive
System Design
GitSimulator follows a clean, modular architecture with strict separation of concerns:
┌───────────────── CLI Layer (Typer) ─────────────────┐
│ Commands │ Options │ Parsing │ Rich Formatting      │
└──────────────────────────┬──────────────────────────┘
                           ▼
┌─────────────── Simulation Dispatcher ───────────────┐
│ • Command routing                                   │
│ • Plugin hook execution (pre/override/post)         │
│ • Result validation                                 │
└──────────────────────────┬──────────────────────────┘
                           ▼
           ┌───────────────┼───────────────┐
           ▼               ▼               ▼
      ┌─────────┐     ┌─────────┐     ┌─────────┐
      │ Rebase  │     │  Merge  │     │  Reset  │
      │Simulator│     │Simulator│     │Simulator│
      └─────────┘     └─────────┘     └─────────┘
           └───────────────┼───────────────┘
                           ▼
┌─────────────────── Core Services ───────────────────┐
│ Repository (Dulwich) │ Diff Analyzer │ Conflict     │
│                      │               │ Detector     │
└──────────────────────────┬──────────────────────────┘
                           ▼
┌────────────── Git Repository (.git/) ───────────────┐
│ Read-only access via Dulwich                        │
└─────────────────────────────────────────────────────┘
Core Components
1. Repository Layer (core/repository.py)
Pure Python Git access via Dulwich; no git binary required:
- Short SHA Resolution: 7+ characters with ambiguity detection
- Relative Refs: HEAD~2, main^, etc.
- Topological Sorting: Proper commit graph display
- Lazy Loading: Only fetch commits needed for visualization
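To make the short SHA rule concrete, here is a minimal sketch of prefix resolution with ambiguity detection. It works on a plain list of full SHAs rather than a Dulwich repository, and the names `resolve_short_sha` and `AmbiguousShaError` are hypothetical, not taken from the GitSimulator codebase.

```python
class AmbiguousShaError(ValueError):
    """Raised when a short SHA prefix matches more than one commit."""


def resolve_short_sha(prefix: str, known_shas: list[str], min_length: int = 7) -> str:
    """Return the single full SHA that starts with `prefix`.

    Assumption: `known_shas` holds lowercase 40-character hex strings.
    """
    prefix = prefix.lower()
    if len(prefix) < min_length:
        raise ValueError(f"short SHA must be at least {min_length} characters")
    matches = [sha for sha in known_shas if sha.startswith(prefix)]
    if not matches:
        raise KeyError(f"no commit matches {prefix!r}")
    if len(matches) > 1:
        raise AmbiguousShaError(f"{prefix!r} is ambiguous: {len(matches)} candidates")
    return matches[0]
```

A real implementation would query the object store instead of scanning a list, but the three outcomes (no match, one match, ambiguous) stay the same.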
2. Simulation Engines (simulation/)
Each simulator implements a BaseSimulator[T] protocol:
from typing import Protocol, TypeVar

T = TypeVar("T", covariant=True)  # result type produced by a simulator

class BaseSimulator(Protocol[T]):
    def validate(self) -> tuple[list[str], list[str]]:
        """Pre-flight checks → (errors, warnings)"""
        ...

    def simulate(self) -> T:
        """Execute simulation → typed result"""
        ...

- Rebase: Find merge base → collect commits → replay onto target → detect conflicts per commit
- Merge: Three-way analysis (base vs ours vs theirs) → fast-forward detection → conflict accumulation
- Reset: Mode handling (SOFT/MIXED/HARD) → working directory simulation → reflog recovery instructions
- Cherry-Pick: Sequential commit application → cumulative conflict detection → parent preservation
3. Conflict Detection (simulation/conflict_detector.py)
Sophisticated heuristics engine with three certainty levels:
- CERTAIN (90%+ accuracy): Same file, overlapping line ranges; both sides modify identical lines; binary file conflicts
- LIKELY (70%+ accuracy): Changes within 3 lines of each other; same function/class modifications
- POSSIBLE (50%+ accuracy): Same file modified in different sections; semantic conflicts (imports)
Algorithm: For each file changed in both branches, parse diff hunks and check for overlaps, proximity, or same-file edits. False positives reduced via whitespace normalization and contiguous hunk collapsing.
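The overlap and proximity checks described above can be sketched as follows. This is a simplified illustration, assuming each side's changes to a file are already reduced to line ranges; the `Hunk` type and `classify` function are hypothetical names, and the real detector also handles binary files and semantic cases not shown here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Certainty(Enum):
    CERTAIN = "CERTAIN"
    LIKELY = "LIKELY"
    POSSIBLE = "POSSIBLE"


@dataclass(frozen=True)
class Hunk:
    start: int  # first modified line, 1-based
    end: int    # last modified line, inclusive


PROXIMITY_LINES = 3  # "within 3 lines of each other" from the LIKELY rule


def gap(a: Hunk, b: Hunk) -> int:
    """Distance between the nearest edges of two hunks; 0 or less means they share a line."""
    return max(a.start, b.start) - min(a.end, b.end)


def classify(ours: list[Hunk], theirs: list[Hunk]) -> Optional[Certainty]:
    """Classify one file that may have been changed on both sides."""
    if not ours or not theirs:
        return None  # file changed on one side only: no conflict predicted
    smallest = min(gap(a, b) for a in ours for b in theirs)
    if smallest <= 0:
        return Certainty.CERTAIN   # overlapping line ranges
    if smallest <= PROXIMITY_LINES:
        return Certainty.LIKELY    # nearby edits
    return Certainty.POSSIBLE      # same file, different sections
```

Comparing every pair of hunks is quadratic, which is acceptable for a sketch; sorted hunks would allow a linear sweep.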
4. Safety Analysis System
Aggregates weighted risk factors into danger levels:
risk = Σ (w_i * factor_i)
Levels:
  LOW - Easily reversible, minimal risk
  MEDIUM - Recoverable with reflog
  HIGH - History rewrite, force-push required
  CRITICAL - Data loss risk, shared branch rewrite
Factors include: rewrite depth, force-push requirement, orphaned commits, conflict density, working tree discard magnitude.
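As an illustration of the weighted sum, the sketch below maps normalized factors to a danger level. The weights and thresholds are invented for the example (the source does not publish the tuned values), and the factor keys are hypothetical names for the factors listed above.

```python
# Hypothetical weights; the project's tuned values are not given in this write-up.
WEIGHTS = {
    "rewrite_depth": 0.25,
    "force_push_required": 0.25,
    "orphaned_commits": 0.20,
    "conflict_density": 0.15,
    "working_tree_discard": 0.15,
}

# Hypothetical cut-offs, checked from the most severe level down.
THRESHOLDS = [(0.75, "CRITICAL"), (0.50, "HIGH"), (0.25, "MEDIUM")]


def risk_score(factors: dict[str, float]) -> float:
    """risk = sum of w_i * factor_i, with each factor clamped to [0, 1]."""
    return sum(
        WEIGHTS[name] * min(max(value, 0.0), 1.0)
        for name, value in factors.items()
    )


def danger_level(score: float) -> str:
    """Return the first level whose threshold the score reaches."""
    for threshold, level in THRESHOLDS:
        if score >= threshold:
            return level
    return "LOW"
```

For example, `danger_level(risk_score({"rewrite_depth": 1.0, "conflict_density": 1.0}))` scores roughly 0.4 under these made-up weights and lands in MEDIUM.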
Production Infrastructure
Testing Strategy (135+ Tests)
- Unit Tests: Each component in isolation (repository, conflict detector, simulators)
- Integration Tests: End-to-end command flows with real Git repositories
- Property Tests: Invariant checking (graphs are DAGs, simulations idempotent)
- Fixture-Based: Multiple repo states (linear, branched, merge commits, octopus merges)
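The "graphs are DAGs" invariant can be checked with a small helper like the one below. It is a sketch using Kahn's algorithm on a dict of parent links, not the project's actual test code; a property-based test would call it on generated graphs.

```python
def is_dag(parents: dict[str, list[str]]) -> bool:
    """Kahn's algorithm: the graph is acyclic if every commit can be peeled off."""
    # For each commit, count how many children point at it.
    child_count = {node: 0 for node in parents}
    for node_parents in parents.values():
        for parent in node_parents:
            child_count[parent] = child_count.get(parent, 0) + 1

    # Start from commits with no children (branch tips) and peel towards the roots.
    ready = [node for node, count in child_count.items() if count == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for parent in parents.get(node, []):
            child_count[parent] -= 1
            if child_count[parent] == 0:
                ready.append(parent)

    # Commits stuck on a cycle never reach a count of zero.
    return removed == len(child_count)
```

An octopus merge such as `{"m": ["a", "b", "c"], "a": ["r"], "b": ["r"], "c": ["r"], "r": []}` passes, while a two-commit cycle fails.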
pytest --cov=git_sim --cov-report=html
# Current: 95%+ coverage
CI/CD Pipeline (GitHub Actions)
Automated workflow on every push:
- Linting: Ruff for code style and formatting
- Type Checking: MyPy with strict mode
- Testing: pytest across Python 3.11, 3.12, 3.13 on Linux/macOS/Windows
- Coverage: Enforce 90%+ threshold
- PyPI Release: Automated publishing on version tags
Plugin Architecture
Extensible system with three plugin types:
- HookPlugin: Intercept simulation lifecycle (pre/override/post)
- SimulatorPlugin: Add custom simulation commands
- FormatterPlugin: Custom output rendering
Discovery via entry points in pyproject.toml:
[project.entry-points."git_sim.plugins"]
my_plugin = "my_package.plugin:MyPlugin"
Interactive TUI Mode
Textual-powered terminal UI with:
- Real-time command preview as you type
- Keyboard navigation of commit graphs
- Conflict hotspot highlighting
- Headless mode for scripting
gitsim tui # Launch interactive mode
gitsim tui --headless # Script-friendly output
Educational Features
gitsim explain <command> provides interactive learning:
- Plain-English explanations of Git internals
- Step-by-step algorithm breakdowns
- Risk assessment and common pitfalls
- Best practices and safety tips
- Alternative approaches
Real-World Example Output
$ gitsim rebase main
Simulating: git rebase main
╭────────── Rebase Summary ───────────╮
│ Source branch        feature        │
│ Target branch        main           │
│ Merge base           abc1234        │
│ Commits to replay    3              │
│ Predicted conflicts  1              │
╰─────────────────────────────────────╯
╭────────── Safety Analysis ──────────╮
│ Danger Level         🔴 HIGH        │
│ Reversible           Yes            │
│ Force Push Required  Yes            │
╰─────────────────────────────────────╯
Before Rebase:
* abc1234 (HEAD -> feature) Add authentication
* def5678 Update config
| * 123abcd (main) Fix security bug
|/
* 789xyz0 Initial commit
After Rebase (Simulated):
* new1234' (HEAD -> feature) Add authentication
* new5678' Update config
* 123abcd (main) Fix security bug
* 789xyz0 Initial commit
Potential Conflicts
┏━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━┓
┃ Severity ┃ File      ┃ Details ┃
┡━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━┩
│ CERTAIN  │ config.py │ Lines   │
│          │           │ 45-52   │
└──────────┴───────────┴─────────┘
Recovery: git reflog to restore, git reset --hard ORIG_HEAD
Performance Optimizations
- Lazy Graph Building: Only load commits needed for visualization
- Diff Caching: Memoize expensive diff operations
- Short SHA Indexing: Stop at first unique match
- Parallel-Safe: Pure functional core, no shared state
- Memory Efficient: Stream commits vs loading entire history
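Streaming commits instead of loading the whole history can be illustrated with a generator. This sketch walks first-parent links in a toy dict-based graph and is only an analogy for what a Dulwich-backed walker would do; `walk_first_parent` and `latest` are hypothetical names.

```python
from collections.abc import Iterator
from itertools import islice


def walk_first_parent(parents: dict[str, list[str]], head: str) -> Iterator[str]:
    """Yield commits one at a time, newest first, following first-parent links."""
    current: str | None = head
    while current is not None:
        yield current
        commit_parents = parents.get(current, [])
        current = commit_parents[0] if commit_parents else None


def latest(parents: dict[str, list[str]], head: str, limit: int) -> list[str]:
    """Materialize only the `limit` newest commits needed for display."""
    return list(islice(walk_first_parent(parents, head), limit))
```

Because the generator is lazy, asking for the newest 20 commits touches 20 graph entries regardless of how long the history is.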
Key Design Decisions
Read-Only Guarantee
GitSimulator never writes to your repository:
- Uses Dulwich's read-only API exclusively
- No git subprocess calls that modify state
- Simulations run entirely in memory
- Snapshot system uses separate .git/git-sim-snapshots/ directory
Deterministic Virtual Hashes
Virtual commit IDs are stable across runs (derived from parent hash + tree diff fingerprint), enabling:
- Consistent visualization for demos/screenshots
- Easier mental mapping of before/after states
- No collision with real Git object IDs
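A deterministic virtual ID along these lines could be derived as shown below. The use of SHA-256, the 12-character length, and the function name are assumptions for the sketch; only the inputs (parent hash plus a diff fingerprint) and the vreb prefix come from this write-up.

```python
import hashlib


def virtual_commit_id(parent_sha: str, diff_fingerprint: str, prefix: str = "vreb") -> str:
    """Derive a stable, clearly non-Git identifier for a simulated commit."""
    # A NUL separator keeps ("ab", "c") and ("a", "bc") from hashing identically.
    payload = f"{parent_sha}\0{diff_fingerprint}".encode()
    digest = hashlib.sha256(payload).hexdigest()
    # The prefix contains non-hex letters, so the ID can never be a real object ID.
    return f"{prefix}{digest[:12]}"
```

The same inputs always produce the same ID, which is what keeps simulated graphs identical between runs.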
Unified Command Interface
Three equivalent commands reduce friction:
gitsim rebase main
git-sim rebase main # Dash-style (git alias friendly)
gitsimulator rebase main # Full name
# Or unified dispatcher:
gitsim sim "rebase main --autosquash"
Development Struggles & Solutions
1. Indentation Noise in Conflict Detection
Problem: Early heuristic flagged pure whitespace changes as conflicts.
Solution: Added semantic diff filtering with whitespace normalization and comment-only diff detection.
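The whitespace part of that filter can be sketched in a few lines. This is a simplified stand-in for the project's semantic diff filtering: it does not cover comment-only detection, and treating indentation as noise is itself an assumption that would be wrong for indentation-sensitive languages such as Python or YAML.

```python
import re


def normalize(line: str) -> str:
    """Collapse runs of whitespace to one space and strip both ends."""
    return re.sub(r"\s+", " ", line).strip()


def is_whitespace_only_change(old: list[str], new: list[str]) -> bool:
    """True when two hunks differ only in spacing, indentation, or blank lines."""
    old_normalized = [text for text in map(normalize, old) if text]
    new_normalized = [text for text in map(normalize, new) if text]
    return old_normalized == new_normalized
```

Hunks for which this returns True can be dropped before the overlap checks run, which removes the false positives described in the problem statement.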
2. Virtual Hash Collisions
Problem: First design used truncated SHA-1, causing occasional collisions.
Solution: Switched to a deterministic hash derived from parent_sha + file_path + hunk_fingerprint, with a vreb prefix so virtual IDs are easy to tell apart from real ones.
3. Safety Scoring Calibration
Problem: Force-push weight produced too many HIGH scores for benign operations.
Solution: Tuned weights using real-world repositories, exposed factors in explain mode for transparency.
4. Binary File Handling
Problem: Lacked diff granularity for binary files (images, PDFs).
Solution: Mark binary conflicts as CERTAIN with explicit advisory to inspect manually.
Concrete Use Cases
1. Pre-Flight Rebase Audit
Before running git rebase main on a 50-commit feature branch, simulate to see:
- Which commits will be rewritten (new SHAs)
- Predicted conflicts per commit with line ranges
- Safety level (HIGH if force-push required)
2. CI Merge Gate
In GitHub Actions:
- name: Simulate merge
  env:
    HEAD_REF: ${{ github.head_ref }}  # pass via env so branch names cannot inject shell code
  run: |
    gitsim merge "$HEAD_REF"
    level=$(gitsim merge "$HEAD_REF" --json | jq -r '.safety.level')
    if [[ "$level" == "CRITICAL" ]]; then
      echo "Merge too risky - manual review required"
      exit 1
    fi
3. Educational Onboarding
New hires learn Git internals via gitsim explain:
gitsim explain rebase # How rebase works internally
gitsim explain merge # Three-way merge algorithm
gitsim explain reset # SOFT vs MIXED vs HARD
4. Complex Cherry-Pick Planning
Simulate picking 10 commits from feature-a onto hotfix:
gitsim cherry-pick abc123 def456 789xyz ... --onto hotfix
# Shows step-by-step conflicts and new commit order
Comparison with Alternatives
| Feature | GitSimulator | git log --graph | GitKraken | lazygit |
|---|---|---|---|---|
| Simulation | ✅ Full | ❌ No | ❌ No | ❌ No |
| Conflict Prediction | ✅ 3 levels | ❌ No | ⚠️ Basic | ❌ No |
| Safety Analysis | ✅ Yes | ❌ No | ❌ No | ❌ No |
| Educational Mode | ✅ Yes | ❌ No | ❌ No | ❌ No |
| No Git Binary | ✅ Pure Python | ❌ Requires Git | ❌ Requires Git | ❌ Requires Git |
| Plugin System | ✅ Yes | ❌ No | ✅ Yes | ❌ No |
| SSH-Friendly | ✅ Yes | ✅ Yes | ❌ GUI only | ✅ Yes |
| Automated CI Tests | ✅ 135+ | N/A | N/A | N/A |
Installation & Quick Start
# Install from PyPI (v1.0.1)
pipx install gitsimulator
# All three commands work identically:
gitsim rebase main
git-sim rebase main
gitsimulator rebase main
# Preview a merge with conflict detection
gitsim merge feature-branch
# Learn how Git commands work
gitsim explain rebase
# Save repository state for experiments
gitsim snapshot create before-rebase
Future Roadmap
- Completed: Interactive TUI, plugin system, 135+ tests, PyPI distribution, CI/CD automation
- In Progress: AST-based conflict refinement for language-specific detection
- Planned: Batch mode for CI matrix analysis, custom policy scoring, integration with Git hosting APIs (GitHub/GitLab)
Lessons Learned
Heuristics First, Precision Later
Fast approximations unlock value early. We shipped conflict detection with 70%+ accuracy instead of waiting for 95% precision via AST parsing. Users adopted it immediately, and we can layer in deeper analysis later.
Explainability Drives Trust
Users trust safety scores more when rationale is explicit. The explain mode showing why a rebase is HIGH risk (e.g., "3 commits rewritten + force-push required") builds confidence.
Testing Infrastructure Pays Off
135+ tests caught edge cases during plugin development (octopus merges, detached HEAD, ambiguous short SHAs). Automated CI prevented regressions across the full matrix of three Python versions on three operating systems.
Pure Python = Portability Win
Using Dulwich instead of shelling out to the git binary eliminated platform-specific bugs (Windows path handling, Git version mismatches) and made the tool usable in SSH-only environments.
Metrics & Impact
- PyPI Downloads: Tracked via pepy.tech (published v1.0.1 November 2025)
- Test Coverage: 95%+ across 135+ tests
- CI Success Rate: 99%+ (GitHub Actions badge)
- Supported Platforms: Linux, macOS, Windows × Python 3.11/3.12/3.13
- Educational Reach: Used in university Git workshops (anecdotal)
Technical Debt & Trade-offs
Current Limitations
- No Stash Simulation: Planned for v1.1.0
- Basic Rename Detection: Relies on Dulwich heuristics (70% threshold)
- English-Only Explanations: I18n deferred for post-1.0
Intentional Constraints
- Read-Only: Simulated operations are never executed automatically (safety by design)
- Terminal-First: No GUI planned (SSH-friendly focus)
- Heuristic Conflicts: Not 100% accurate (explicitly documented)
Conclusion
GitSimulator demonstrates how production-grade tooling (comprehensive testing, CI/CD, plugin architecture) can transform a developer experience problem into a polished open-source product. By combining Git internals expertise, sophisticated heuristics, and thoughtful UX (Rich formatting, interactive TUI), the project achieves its goal: making dangerous Git commands safe to explore.
GitHub: egekaya1/GitSimulator · PyPI: pipx install gitsimulator (v1.0.1) · CI: GitHub Actions