M2 Mac Mini Agentic Stack

A low-level agentic tooling stack running on M2 Mac Mini infrastructure, focused on deterministic command execution, tool orchestration, and safety guardrails.

M2 Mac Mini · TypeScript · Node.js · PTY Execution · JSON-RPC · Structured Logs · State Machine Design · Guardrails

Overview

The M2 Mac Mini Agentic Stack is a low-level agent runtime focused on tool orchestration, not chat UX. It provides deterministic execution loops for command tools, file edits, and workflow steps where every action is stateful, inspectable, and replayable.

The design goal is simple: make agent behavior predictable under real engineering constraints.

Core Runtime Model

  • A single task is represented as an explicit state machine (queued -> running -> blocked -> complete/failed)
  • Tool calls are first-class events with inputs, outputs, timestamps, and exit status
  • Command execution is isolated with scoped working directories and configurable timeouts
  • Long-running PTY sessions are resumable, so an agent can continue a multi-step process safely after an interruption
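
The task lifecycle and tool-call events described above could be modeled roughly as follows. This is a minimal sketch, not the project's actual code: the names TaskState, ToolCallEvent, Task, TRANSITIONS, and transition are hypothetical, and the transition table assumes that a blocked task may resume to running.

```typescript
// Hypothetical types; the state names come from the list above.
type TaskState = "queued" | "running" | "blocked" | "complete" | "failed";

// Assumed transition table: blocked tasks may resume, terminal states have no exits.
const TRANSITIONS: Record<TaskState, TaskState[]> = {
  queued: ["running"],
  running: ["blocked", "complete", "failed"],
  blocked: ["running", "failed"],
  complete: [],
  failed: [],
};

// A tool call recorded as a first-class event.
interface ToolCallEvent {
  tool: string;
  input: unknown;
  output: string;
  startedAt: number; // epoch milliseconds
  finishedAt: number; // epoch milliseconds
  exitStatus: number;
}

interface Task {
  id: string;
  state: TaskState;
  events: ToolCallEvent[];
}

// Returns a new task in the next state, or throws if the move is not allowed.
function transition(task: Task, next: TaskState): Task {
  if (TRANSITIONS[task.state].indexOf(next) === -1) {
    throw new Error("illegal transition: " + task.state + " -> " + next);
  }
  return { ...task, state: next };
}
```

Keeping the allowed transitions in one table means an illegal move, such as restarting a completed task, fails loudly instead of silently corrupting the task's state.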

Why Low-Level

Most "agent" demos hide reliability issues behind abstractions. This stack deliberately exposes the hard parts:

  • command boundary handling (&&, pipes, subshells)
  • partial failure recovery
  • non-interactive fallback behavior
  • deterministic re-entry after interruption

This makes it usable for real repo maintenance and ops tasks instead of only toy prompts.
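
To illustrate the first item, command boundary handling, here is a minimal sketch of splitting a command line at its top-level operators. The function splitCommandLine and the Segment type are hypothetical, and the parser is deliberately simplified: it respects quotes and parenthesized subshells but ignores backslash escapes, heredocs, the background operator (&), and command substitution.

```typescript
// Hypothetical segment shape: a command plus the operator that follows it.
type Operator = "&&" | "||" | "|" | ";";
type Segment = { command: string; operator: Operator | null };

function splitCommandLine(line: string): Segment[] {
  const segments: Segment[] = [];
  let current = "";
  let quote: string | null = null; // the open quote character, if any
  let depth = 0; // nesting depth of ( ... ) subshells
  let i = 0;

  while (i < line.length) {
    const ch = line.charAt(i);

    // Inside quotes, copy everything through to the closing quote.
    if (quote !== null) {
      if (ch === quote) quote = null;
      current += ch;
      i += 1;
      continue;
    }
    if (ch === "'" || ch === '"') {
      quote = ch;
      current += ch;
      i += 1;
      continue;
    }

    if (ch === "(") depth += 1;
    if (ch === ")") depth = Math.max(0, depth - 1);

    // Only split on operators that are outside any subshell.
    if (depth === 0) {
      const two = line.slice(i, i + 2);
      let op: Operator | null = null;
      if (two === "&&") op = "&&";
      else if (two === "||") op = "||";
      else if (ch === "|") op = "|";
      else if (ch === ";") op = ";";

      if (op !== null) {
        segments.push({ command: current.trim(), operator: op });
        current = "";
        i += op.length;
        continue;
      }
    }

    current += ch;
    i += 1;
  }

  if (current.trim() !== "") {
    segments.push({ command: current.trim(), operator: null });
  }
  return segments;
}
```

With this sketch, an && inside a quoted string or inside a subshell stays part of its segment, so each segment can be logged and retried as a unit.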

Safety and Control

  • Allow/deny execution policy by command prefix
  • Explicit escalation path for privileged actions
  • Structured logs for auditability and postmortems
  • Guardrails around destructive actions to reduce blast radius
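
A prefix-based policy with an escalation path might look like the sketch below. The ExecutionPolicy shape, the decide function, and the precedence order (deny, then escalate, then allow, with default-deny for anything unmatched) are assumptions made for illustration, not the stack's documented behavior.

```typescript
type Decision = "allow" | "deny" | "escalate";

// Hypothetical policy shape: three lists of command prefixes.
interface ExecutionPolicy {
  allow: string[];
  deny: string[];
  escalate: string[];
}

// Match on whole tokens so that the prefix "rm" does not match "rmdir".
function matchesPrefix(command: string, prefix: string): boolean {
  return command === prefix || command.indexOf(prefix + " ") === 0;
}

function decide(command: string, policy: ExecutionPolicy): Decision {
  // Normalize whitespace so "git  push" and "git push" are treated alike.
  const cmd = command.trim().replace(/\s+/g, " ");
  const hit = (prefixes: string[]): boolean =>
    prefixes.some((p) => matchesPrefix(cmd, p));

  if (hit(policy.deny)) return "deny"; // assumed: deny wins over everything
  if (hit(policy.escalate)) return "escalate"; // privileged: needs explicit approval
  if (hit(policy.allow)) return "allow";
  return "deny"; // assumed default: unknown commands are denied
}
```

Under these assumptions, a policy that allows "git" but denies "git push --force" lets "git status" through and blocks the force push, because the deny list is checked first.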

What It Demonstrates

  • Agent loop architecture under production-like failure modes
  • Tooling contracts that separate planning from execution
  • Reliable command/session management for iterative automation
  • Practical engineering tradeoffs between autonomy and safety

Current Direction

The roadmap is focused on:

  • improved retry semantics for flaky command chains
  • tighter typed schemas for tool I/O
  • better run summaries for human handoff
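
As one possible shape for the retry work, the sketch below retries only the failing step of a chain rather than re-running the whole chain. Everything here is hypothetical (Step, ChainResult, runChain, the retryable flag), and steps are modeled as synchronous functions returning an exit status purely to keep the example small.

```typescript
// Hypothetical step: run() returns an exit status, where 0 means success.
interface Step {
  name: string;
  run: () => number;
  retryable: boolean; // non-retryable steps (e.g. destructive ones) get one attempt
}

interface ChainResult {
  ok: boolean;
  failedStep: string | null;
  attempts: Record<string, number>; // attempts made per step, for the run summary
}

function runChain(steps: Step[], maxAttempts: number): ChainResult {
  const attempts: Record<string, number> = {};

  for (const step of steps) {
    const limit = step.retryable ? maxAttempts : 1;
    let status = -1;

    for (let n = 1; n <= limit; n++) {
      attempts[step.name] = n;
      status = step.run();
      if (status === 0) break; // step succeeded, move on to the next one
    }

    // Stop the chain at the first step that never succeeded.
    if (status !== 0) {
      return { ok: false, failedStep: step.name, attempts };
    }
  }

  return { ok: true, failedStep: null, attempts };
}
```

Recording per-step attempt counts also feeds the run-summary item: a human taking over can see which step was flaky and how many tries it took.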

This stack is intentionally infrastructure-first: less "assistant personality," more reliable machine behavior.
