GitSimulator: Production-Grade Git Simulation Engine
Update (Dec 2025): GitSimulator won 3rd Place at the GitKon 2025 Game Jam, organized by GitKraken.
Overview
GitSimulator is a production-ready Python CLI that acts as a "flight simulator" for Git commands. It simulates potentially destructive operations (rebase, merge, reset, cherry-pick) in a safe, read-only environment, providing visual before/after graphs, conflict prediction with three certainty levels (CERTAIN/LIKELY/POSSIBLE), and safety analysis with four danger levels (LOW/MEDIUM/HIGH/CRITICAL). Now available on PyPI: pipx install gitsimulator (v1.0.1). Repository: egekaya1/GitSimulator on GitHub.
Production Highlights
- 135+ Tests: Comprehensive test suite with 95%+ coverage (unit, integration, property-based)
- Automated CI/CD: GitHub Actions pipeline for linting, type checking, testing, and PyPI releases
- Interactive TUI: Textual-powered terminal UI with real-time command preview
- Plugin Architecture: Extensible hook system for custom simulators, formatters, and policies
- Pure Python Git: No git binary required; uses Dulwich for all operations
- Multi-Platform: Linux, macOS, Windows support (Python 3.11, 3.12, 3.13)
Why Build This?
Git's most powerful commands are also its most dangerous. A single mistyped git reset --hard or poorly planned git rebase can rewrite history, lose work, or create conflicts that take hours to resolve. GitSimulator addresses this with:
- Read-only preview: See the exact outcome before touching your repository
- Conflict forecasting: Know which files will conflict before starting a merge or rebase
- Risk scoring: Automated safety analysis flags HIGH and CRITICAL operations with specific reasons
- Educational mode: Step-by-step explanations of what Git is actually doing internally
- CI integration: Pre-validate dangerous operations in automation pipelines
Architecture Deep Dive
System Design
GitSimulator follows a clean, modular architecture with strict separation of concerns:
┌───────────────── CLI Layer (Typer) ──────────────────┐
│ Commands │ Options │ Parsing │ Rich Formatting       │
└──────────────────────────┬───────────────────────────┘
                           ▼
┌─────────────── Simulation Dispatcher ────────────────┐
│ • Command routing                                    │
│ • Plugin hook execution (pre/override/post)          │
│ • Result validation                                  │
└──────────────────────────┬───────────────────────────┘
                           ▼
           ┌───────────────┼───────────────┐
           ▼               ▼               ▼
      ┌─────────┐     ┌─────────┐     ┌─────────┐
      │ Rebase  │     │  Merge  │     │  Reset  │
      │Simulator│     │Simulator│     │Simulator│
      └────┬────┘     └────┬────┘     └────┬────┘
           └───────────────┼───────────────┘
                           ▼
┌─────────────────── Core Services ────────────────────┐
│ Repository (Dulwich) │ Diff Analyzer │ Conflict      │
│                      │               │ Detector      │
└──────────────────────────┬───────────────────────────┘
                           ▼
┌─────────────── Git Repository (.git/) ───────────────┐
│ Read-only access via Dulwich                         │
└──────────────────────────────────────────────────────┘
Core Components
1. Repository Layer (core/repository.py)
Pure Python Git access via Dulwich, with no git binary required:
- Short SHA Resolution: 7+ characters with ambiguity detection
- Relative Refs: HEAD~2, main^, etc.
- Topological Sorting: Proper commit graph display
- Lazy Loading: Only fetch commits needed for visualization
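As an illustration of short-SHA resolution with ambiguity detection, here is a minimal sketch. It works on a plain list of full hex IDs rather than a Dulwich repository, and the names `resolve_short_sha` and `AmbiguousShaError` are hypothetical, not GitSimulator's actual API.

```python
class AmbiguousShaError(ValueError):
    """Raised when a short SHA matches more than one known commit."""


def resolve_short_sha(prefix, known_shas, min_length=7):
    """Expand a short SHA to the single full SHA it identifies.

    known_shas is any iterable of full hex IDs (an assumption for this sketch;
    a real implementation would query the object store instead).
    """
    prefix = prefix.lower()
    if len(prefix) < min_length:
        raise ValueError(f"short SHA needs at least {min_length} characters")
    matches = sorted(sha for sha in known_shas if sha.startswith(prefix))
    if not matches:
        raise KeyError(prefix)
    if len(matches) > 1:
        raise AmbiguousShaError(f"{prefix} matches {len(matches)} commits")
    return matches[0]
```

Raising a dedicated error for the ambiguous case lets the CLI layer ask the user for more characters instead of silently picking one commit.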
2. Simulation Engines (simulation/)
Each simulator implements a BaseSimulator[T] protocol:
from typing import Protocol, TypeVar

T = TypeVar("T", covariant=True)  # result type returned by simulate()

class BaseSimulator(Protocol[T]):
    def validate(self) -> tuple[list[str], list[str]]:
        """Pre-flight checks → (errors, warnings)"""

    def simulate(self) -> T:
        """Execute simulation → typed result"""

- Rebase: Find merge base → collect commits → replay onto target → detect conflicts per commit
- Merge: Three-way analysis (base vs ours vs theirs) → fast-forward detection → conflict accumulation
- Reset: Mode handling (SOFT/MIXED/HARD) → working directory simulation → reflog recovery instructions
- Cherry-Pick: Sequential commit application → cumulative conflict detection → parent preservation
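The first steps of the rebase and merge pipelines (find the merge base, collect commits, detect a fast-forward) can be sketched on a toy commit graph. This is a simplified illustration, not GitSimulator's code: the graph is a plain dict mapping each commit to its parent list, and all function names are hypothetical.

```python
from collections import deque


def ancestors(parents, tip):
    """Return every commit reachable from tip, including tip itself."""
    seen = set()
    queue = deque([tip])
    while queue:
        sha = queue.popleft()
        if sha in seen:
            continue
        seen.add(sha)
        queue.extend(parents.get(sha, []))
    return seen


def merge_base(parents, ours, theirs):
    """Nearest commit (breadth-first from ours) that theirs can also reach."""
    reachable = ancestors(parents, theirs)
    seen = set()
    queue = deque([ours])
    while queue:
        sha = queue.popleft()
        if sha in reachable:
            return sha
        if sha in seen:
            continue
        seen.add(sha)
        queue.extend(parents.get(sha, []))
    return None


def commits_to_replay(parents, tip, base):
    """Commits between base and tip, oldest first.

    Assumes base lies on tip's first-parent chain, as it does for a
    linear feature branch.
    """
    replay = []
    sha = tip
    while sha != base:
        replay.append(sha)
        sha = parents[sha][0]  # follow the first parent only
    replay.reverse()
    return replay


def is_fast_forward(parents, ours, theirs):
    """A merge is a fast-forward when ours is already an ancestor of theirs."""
    return ours in ancestors(parents, theirs)
```

With a graph such as `{"root": [], "main1": ["root"], "feat1": ["root"], "feat2": ["feat1"]}`, rebasing `feat2` onto `main1` would replay `feat1` and then `feat2` on top of `main1`.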
3. Conflict Detection (simulation/conflict_detector.py)
Sophisticated heuristics engine with three certainty levels:
- CERTAIN (90%+ accuracy): Same file, overlapping line ranges; both sides modify identical lines; binary file conflicts
- LIKELY (70%+ accuracy): Changes within 3 lines of each other; same function/class modifications
- POSSIBLE (50%+ accuracy): Same file modified in different sections; semantic conflicts (imports)
Algorithm: For each file changed in both branches, parse diff hunks and check for overlaps, proximity, or same-file edits. False positives reduced via whitespace normalization and contiguous hunk collapsing.
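The overlap and proximity checks can be sketched as follows. This is a minimal illustration under stated assumptions: hunks are reduced to inclusive `(start, end)` line ranges for a single file, the 3-line proximity window comes from the LIKELY definition above, and `classify_file` is a hypothetical name. Binary files, function-level grouping, and whitespace normalization are left out.

```python
from enum import Enum


class Certainty(Enum):
    CERTAIN = "CERTAIN"
    LIKELY = "LIKELY"
    POSSIBLE = "POSSIBLE"


PROXIMITY_LINES = 3  # "within 3 lines of each other"


def classify_file(ours, theirs):
    """Classify one file given the line ranges each side changed.

    ours and theirs are lists of inclusive (start, end) ranges.
    Returns None when only one side touched the file.
    """
    if not ours or not theirs:
        return None
    best = Certainty.POSSIBLE  # both sides edited the file somewhere
    for a_start, a_end in ours:
        for b_start, b_end in theirs:
            if a_start <= b_end and b_start <= a_end:
                return Certainty.CERTAIN  # ranges overlap
            # Distance from the end of the earlier hunk to the start of the later one.
            distance = max(a_start, b_start) - min(a_end, b_end)
            if distance <= PROXIMITY_LINES:
                best = Certainty.LIKELY
    return best
```

Returning as soon as an overlap is found keeps the strongest signal; otherwise the function reports the closest weaker match.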
4. Safety Analysis System
Aggregates weighted risk factors into danger levels:
risk = Σ (w_i * factor_i)
Levels:
LOW → Easily reversible, minimal risk
MEDIUM → Recoverable with reflog
HIGH → History rewrite, force-push required
CRITICAL → Data loss risk, shared branch rewrite
Factors include: rewrite depth, force-push requirement, orphaned commits, conflict density, working tree discard magnitude.
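A weighted sum of this kind might look like the sketch below. The factor names follow the list above, but the weights, the 0 to 1 factor scale, and the level thresholds are invented for illustration; the real calibration is not stated here.

```python
# Hypothetical weights (they sum to 100) and thresholds; not GitSimulator's real values.
WEIGHTS = {
    "rewrite_depth": 20,
    "force_push": 25,
    "orphaned_commits": 20,
    "conflict_density": 15,
    "worktree_discard": 20,
}
THRESHOLDS = [(75, "CRITICAL"), (50, "HIGH"), (25, "MEDIUM")]


def risk_score(factors):
    """Sum w_i * factor_i, where each factor is a value between 0 and 1."""
    return sum(WEIGHTS[name] * value for name, value in factors.items())


def danger_level(score):
    """Map a 0-100 score to the first level whose threshold it reaches."""
    for minimum, level in THRESHOLDS:
        if score >= minimum:
            return level
    return "LOW"
```

Keeping the per-factor contributions available, rather than only the final level, is what makes it possible to explain a score to the user.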
Production Infrastructure
Testing Strategy (135+ Tests)
- Unit Tests: Each component in isolation (repository, conflict detector, simulators)
- Integration Tests: End-to-end command flows with real Git repositories
- Property Tests: Invariant checking (graphs are DAGs, simulations idempotent)
- Fixture-Based: Multiple repo states (linear, branched, merge commits, octopus merges)
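As an example of the "graphs are DAGs" invariant, here is a property-style check written with only the standard library. It is a sketch, not the project's test code: `is_dag` and `random_history` are hypothetical helpers, and the actual suite may use a dedicated property-testing library instead of `random`.

```python
import random
from collections import deque


def is_dag(parents):
    """Kahn's algorithm: the graph is acyclic if every node can be ordered."""
    nodes = set(parents) | {p for ps in parents.values() for p in ps}
    pending = {n: len(set(parents.get(n, ()))) for n in nodes}
    children = {n: [] for n in nodes}
    for child, ps in parents.items():
        for parent in set(ps):
            children[parent].append(child)
    queue = deque(n for n, count in pending.items() if count == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for child in children[node]:
            pending[child] -= 1
            if pending[child] == 0:
                queue.append(child)
    return visited == len(nodes)


def random_history(rng, size):
    """Random commit graph in which each commit only points at older commits."""
    parents = {}
    for i in range(size):
        older = list(range(i))
        count = min(len(older), rng.randint(0, 2))
        parents[i] = rng.sample(older, count)
    return parents


def test_generated_histories_are_dags():
    for seed in range(50):
        assert is_dag(random_history(random.Random(seed), 20))
```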
pytest --cov=git_sim --cov-report=html
# Current: 95%+ coverage
CI/CD Pipeline (GitHub Actions)
Automated workflow on every push:
- Linting: Ruff for code style and formatting
- Type Checking: MyPy with strict mode
- Testing: pytest across Python 3.11, 3.12, 3.13 on Linux/macOS/Windows
- Coverage: Enforce 90%+ threshold
- PyPI Release: Automated publishing on version tags
Plugin Architecture
Extensible system with three plugin types:
- HookPlugin: Intercept simulation lifecycle (pre/override/post)
- SimulatorPlugin: Add custom simulation commands
- FormatterPlugin: Custom output rendering
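To make the pre/override/post lifecycle concrete, here is a minimal dispatcher sketch. The method names (`pre_simulate`, `override_simulate`, `post_simulate`) and the `dispatch` function are assumptions made for illustration; the real plugin interface may differ.

```python
class HookPlugin:
    """Base class with no-op hooks (hypothetical method names)."""

    def pre_simulate(self, command, args):
        pass

    def override_simulate(self, command, args):
        return None  # None means "do not override"

    def post_simulate(self, command, result):
        return result


def dispatch(command, args, simulators, plugins):
    """Run pre hooks, let one plugin override, then run post hooks."""
    for plugin in plugins:
        plugin.pre_simulate(command, args)
    result = None
    for plugin in plugins:
        result = plugin.override_simulate(command, args)
        if result is not None:
            break  # first override wins
    if result is None:
        result = simulators[command](args)
    for plugin in plugins:
        result = plugin.post_simulate(command, result)
    return result


class AuditPlugin(HookPlugin):
    """Example plugin: records commands and tags every result."""

    def __init__(self):
        self.seen = []

    def pre_simulate(self, command, args):
        self.seen.append(command)

    def post_simulate(self, command, result):
        return {**result, "audited": True}
```

Giving every hook a no-op default means a plugin only overrides the stages it cares about.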
Discovery via entry points in pyproject.toml:
[project.entry-points."git_sim.plugins"]
my_plugin = "my_package.plugin:MyPlugin"
Interactive TUI Mode
Textual-powered terminal UI with:
- Real-time command preview as you type
- Keyboard navigation of commit graphs
- Conflict hotspot highlighting
- Headless mode for scripting
gitsim tui # Launch interactive mode
gitsim tui --headless # Script-friendly output
Educational Features
gitsim explain <command> provides interactive learning:
- Plain-English explanations of Git internals
- Step-by-step algorithm breakdowns
- Risk assessment and common pitfalls
- Best practices and safety tips
- Alternative approaches
Real-World Example Output
$ gitsim rebase main
Simulating: git rebase main
╭────────── Rebase Summary ───────────╮
│ Source branch         feature       │
│ Target branch         main          │
│ Merge base            abc1234       │
│ Commits to replay     3             │
│ Predicted conflicts   1             │
╰─────────────────────────────────────╯
╭────────── Safety Analysis ──────────╮
│ Danger Level          🔴 HIGH       │
│ Reversible            Yes           │
│ Force Push Required   Yes           │
╰─────────────────────────────────────╯
Before Rebase:
* abc1234 (HEAD -> feature) Add authentication
* def5678 Update config
| * 123abcd (main) Fix security bug
|/
* 789xyz0 Initial commit
After Rebase (Simulated):
* new1234' (HEAD -> feature) Add authentication
* new5678' Update config
* 123abcd (main) Fix security bug
* 789xyz0 Initial commit
Potential Conflicts
┏━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Severity   ┃ File       ┃ Details   ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ CERTAIN    │ config.py  │ Lines     │
│            │            │ 45-52     │
└────────────┴────────────┴───────────┘
Recovery: git reflog to restore, git reset --hard ORIG_HEAD
Performance Optimizations
- Lazy Graph Building: Only load commits needed for visualization
- Diff Caching: Memoize expensive diff operations
- Short SHA Indexing: Stop at first unique match
- Parallel-Safe: Pure functional core, no shared state
- Memory Efficient: Stream commits vs loading entire history
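Diff caching of the kind listed above can be sketched with `functools.lru_cache`. This is an illustration only: the in-memory `BLOBS` store and the `diff_blobs` function are hypothetical stand-ins, and `difflib` is used here in place of whatever diff routine the project actually calls.

```python
import difflib
from functools import lru_cache

# Hypothetical blob store keyed by object ID; stands in for the real repository.
BLOBS = {
    "sha_a": "x = 1\ny = 2\n",
    "sha_b": "x = 1\ny = 3\n",
}


@lru_cache(maxsize=1024)
def diff_blobs(old_sha, new_sha):
    """Return the unified diff between two blobs as a tuple of lines.

    Blob IDs are content-addressed, so the (old_sha, new_sha) pair is a safe
    cache key: the same IDs always produce the same diff.
    """
    old = BLOBS[old_sha].splitlines()
    new = BLOBS[new_sha].splitlines()
    return tuple(difflib.unified_diff(old, new, old_sha, new_sha, lineterm=""))
```

Returning an immutable tuple prevents one caller from mutating a result that later callers will receive from the cache.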
Key Design Decisions
Read-Only Guarantee
GitSimulator never writes to your repository:
- Uses Dulwich's read-only API exclusively
- No git subprocess calls that modify state
- Simulations run entirely in memory
- Snapshot system uses separate .git/git-sim-snapshots/ directory
Deterministic Virtual Hashes
Virtual commit IDs are stable across runs (derived from parent hash + tree diff fingerprint), enabling:
- Consistent visualization for demos/screenshots
- Easier mental mapping of before/after states
- No collision with real Git object IDs
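One way to get stable, collision-free virtual IDs is sketched below. The inputs (parent hash plus a diff fingerprint) and the `vreb` prefix come from this write-up, but the use of SHA-256, the separator byte, and the 12-character digest length are assumptions, and `virtual_commit_id` is a hypothetical name.

```python
import hashlib


def virtual_commit_id(parent_sha, diff_fingerprint, prefix="vreb"):
    """Derive a stable ID for a simulated commit.

    The prefix contains non-hex letters, so the result can never be mistaken
    for a real (purely hexadecimal) Git object ID.
    """
    material = f"{parent_sha}\0{diff_fingerprint}".encode("utf-8")
    digest = hashlib.sha256(material).hexdigest()
    return f"{prefix}{digest[:12]}"
```

Because the ID depends only on its inputs, running the same simulation twice produces the same graph labels.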
Unified Command Interface
Three equivalent commands reduce friction:
gitsim rebase main
git-sim rebase main # Dash-style (git alias friendly)
gitsimulator rebase main # Full name
# Or unified dispatcher:
gitsim sim "rebase main --autosquash"
Development Struggles & Solutions
1. Indentation Noise in Conflict Detection
Problem: Early heuristic flagged pure whitespace changes as conflicts.
Solution: Added semantic diff filtering with whitespace normalization and comment-only diff detection.
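A simplified version of that filter might look like this. It is a sketch under assumptions: comments are taken to start with `#` (a naive rule that also cuts `#` inside string literals), and `is_noise_only` is a hypothetical name rather than the project's function.

```python
def normalize(line):
    """Collapse all runs of whitespace and trim both ends."""
    return " ".join(line.split())


def strip_comment(line):
    """Drop a trailing '#' comment (naive: ignores '#' inside strings)."""
    return line.split("#", 1)[0]


def meaningful_lines(lines):
    """Lines that still carry content after comments and whitespace are removed."""
    cleaned = (normalize(strip_comment(line)) for line in lines)
    return [line for line in cleaned if line]


def is_noise_only(old_lines, new_lines):
    """True when two hunks differ only in whitespace or comments."""
    return meaningful_lines(old_lines) == meaningful_lines(new_lines)
```

A hunk pair for which `is_noise_only` returns True can be dropped before the overlap check, so re-indented code no longer counts as a conflict.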
2. Virtual Hash Collisions
Problem: First design used truncated SHA-1, causing occasional collisions.
Solution: Switched to deterministic hash from parent_sha + file_path + hunk_fingerprint with vreb prefix for clarity.
3. Safety Scoring Calibration
Problem: Force-push weight produced too many HIGH scores for benign operations.
Solution: Tuned weights using real-world repositories, exposed factors in explain mode for transparency.
4. Binary File Handling
Problem: Lacked diff granularity for binary files (images, PDFs).
Solution: Mark binary conflicts as CERTAIN with explicit advisory to inspect manually.
Concrete Use Cases
1. Pre-Flight Rebase Audit
Before running git rebase main on a 50-commit feature branch, simulate to see:
- Which commits will be rewritten (new SHAs)
- Predicted conflicts per commit with line ranges
- Safety level (HIGH if force-push required)
2. CI Merge Gate
In GitHub Actions:
- name: Simulate merge
  run: |
    gitsim merge ${{ github.head_ref }}
    if [[ $(gitsim merge ${{ github.head_ref }} --json | jq -r '.safety.level') == "CRITICAL" ]]; then
      echo "Merge too risky - manual review required"
      exit 1
    fi
3. Educational Onboarding
New hires learn Git internals via gitsim explain:
gitsim explain rebase # How rebase works internally
gitsim explain merge # Three-way merge algorithm
gitsim explain reset # SOFT vs MIXED vs HARD
4. Complex Cherry-Pick Planning
Simulate picking 10 commits from feature-a onto hotfix:
gitsim cherry-pick abc123 def456 789xyz ... --onto hotfix
# Shows step-by-step conflicts and new commit order
Comparison with Alternatives
| Feature | GitSimulator | git log --graph | GitKraken | lazygit |
|---|---|---|---|---|
| Simulation | ✅ Full | ❌ No | ❌ No | ❌ No |
| Conflict Prediction | ✅ 3 levels | ❌ No | ⚠️ Basic | ❌ No |
| Safety Analysis | ✅ Yes | ❌ No | ❌ No | ❌ No |
| Educational Mode | ✅ Yes | ❌ No | ❌ No | ❌ No |
| No Git Binary | ✅ Pure Python | ❌ Requires Git | ❌ Requires Git | ❌ Requires Git |
| Plugin System | ✅ Yes | ❌ No | ✅ Yes | ❌ No |
| SSH-Friendly | ✅ Yes | ✅ Yes | ❌ GUI only | ✅ Yes |
| Automated CI Tests | ✅ 135+ | N/A | N/A | N/A |
Installation & Quick Start
# Install from PyPI (v1.0.1)
pipx install gitsimulator
# All three commands work identically:
gitsim rebase main
git-sim rebase main
gitsimulator rebase main
# Preview a merge with conflict detection
gitsim merge feature-branch
# Learn how Git commands work
gitsim explain rebase
# Save repository state for experiments
gitsim snapshot create before-rebase
Future Roadmap
- Completed: Interactive TUI, plugin system, 135+ tests, PyPI distribution, CI/CD automation
- In Progress: AST-based conflict refinement for language-specific detection
- Planned: Batch mode for CI matrix analysis, custom policy scoring, integration with Git hosting APIs (GitHub/GitLab)
Lessons Learned
Ship the approximation, refine later
Conflict detection shipped at 70%+ accuracy rather than waiting for AST-based analysis. Users adopted it immediately. The precision improvements can happen incrementally without breaking the workflow they already rely on.
Show the score's reasoning
A safety score of HIGH means nothing without context. The explain mode shows why (for example, "3 commits rewritten + force-push required to origin/main"), and users adopted the tool much faster once that was in place.
Testing Infrastructure Pays Off
135+ tests caught edge cases during plugin development (octopus merges, detached HEAD, ambiguous short SHAs). Automated CI prevented regressions across 6+ Python/OS combinations.
Pure Python = Portability Win
Using Dulwich instead of shelling out to the git binary eliminated platform-specific bugs (Windows path handling, Git version mismatches) and made the tool usable in SSH-only environments.
Metrics & Impact
- PyPI Downloads: Tracked via pepy.tech (published v1.0.1 November 2025)
- Test Coverage: 95%+ across 135+ tests
- CI Success Rate: 99%+ (GitHub Actions badge)
- Supported Platforms: Linux, macOS, Windows × Python 3.11/3.12/3.13
- Educational Reach: Used in university Git workshops (anecdotal)
Technical Debt & Trade-offs
Current Limitations
- No Stash Simulation: Planned for v1.1.0
- Basic Rename Detection: Relies on Dulwich heuristics (70% threshold)
- English-Only Explanations: I18n deferred for post-1.0
Intentional Constraints
- Read-Only: Never execute simulations automatically (safety by design)
- Terminal-First: No GUI planned (SSH-friendly focus)
- Heuristic Conflicts: Not 100% accurate (explicitly documented)
Conclusion
The core problem GitSimulator solves is simple: Git gives you no preview of destructive operations before you run them. The solution required reading Git internals with Dulwich, implementing heuristic conflict detection, and wrapping it in a CLI with enough polish to actually be useful. 135+ tests and a working CI pipeline were not optional extras; without them the project would have broken on the first Windows or Python 3.13 user.
GitHub: egekaya1/GitSimulator · PyPI: pipx install gitsimulator (v1.0.1) · CI: GitHub Actions