Cloud workspace control plane

SpinUp

Cloud workspace control plane for browser-based developer environments.

SpinUp turns a browser project creation request into a real EC2-backed code-server workspace. The backend control plane creates project metadata, tracks lifecycle state in Postgres, uses Redis for locks and runtime mirrors, allocates or reuses EC2 capacity from an Auto Scaling Group, waits for VM agent and container readiness, restores project files from S3, and exposes the workspace through a browser IDE.

Product problem

Cloud IDE products look simple from the frontend, but the real challenge is runtime orchestration. The system must allocate compute, boot containers, restore files, expose a browser IDE, track progress, handle failure, and clean up infrastructure safely.

What I built

SpinUp models workspace creation as a control-plane workflow. A user creates a project, and the backend creates lifecycle state, acquires Redis locks, allocates or reuses an EC2 VM from an Auto Scaling Group, waits for VM agent health, starts a project-specific code-server container, restores files from S3, waits for readiness, and marks the workspace ready.

Architecture

System path

SpinUp has a Next.js control plane backed by Postgres and Redis. Postgres stores users, projects, project rooms, lifecycle status, runtime metadata, and project events. Redis handles distributed locks and runtime assignment mirrors. AWS EC2 ASG provides workspace machines. A VM agent on each instance controls Docker containers. vm-base-config turns code-server into a project-aware workspace image with S3 restore/sync behavior.

  1. User creates project
  2. Next.js control plane
  3. Postgres lifecycle state
  4. Redis locks and runtime mirror
  5. EC2 ASG allocation
  6. VM agent health
  7. Docker container boot
  8. S3 restore and sync
  9. code-server workspace

Visual proof

Screenshots and architecture from the source repos

These assets are pulled from the project repositories so the case studies show the actual product workflow, architecture diagrams, dashboards, and runtime proof instead of placeholder cards.

System architecture (diagram)

Control plane, lifecycle state, Redis locks, EC2 ASG, VM agent, Docker runtime, code-server, and S3 persistence.

Product workflow

How the product actually runs

  1. User signs in with Clerk
  2. User enters a project name and framework
  3. Backend creates or reuses the project row
  4. Control plane acquires create/runtime locks
  5. Project moves to ALLOCATING_VM
  6. EC2/ASG layer allocates or reuses an idle VM
  7. Backend waits for public IP and VM agent health
  8. Project moves to BOOTING_CONTAINER
  9. VM agent starts a deterministic project container
  10. vm-base-config restores or creates project files and starts code-server
  11. Backend waits for readiness and marks project READY
  12. UI shows Open IDE and workspace preview
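
The backend portion of this workflow (locking through READY) can be sketched as a single orchestration function. This is a minimal illustration, not the actual SpinUp code: the `ControlPlaneDeps` interface, its method names, and the lock key format are all hypothetical stand-ins for the Postgres, Redis, AWS, and VM agent clients.

```typescript
// Status values taken from the workflow above; FAILED is the failure state.
type Status = "ALLOCATING_VM" | "BOOTING_CONTAINER" | "READY" | "FAILED";

// Hypothetical dependencies; the real clients are not shown in the case study.
interface ControlPlaneDeps {
  acquireLock(key: string): Promise<boolean>;
  releaseLock(key: string): Promise<void>;
  setStatus(projectId: string, status: Status): Promise<void>;
  allocateVm(projectId: string): Promise<{ instanceId: string; publicIp: string }>;
  waitForAgent(publicIp: string): Promise<void>;
  startContainer(publicIp: string, containerName: string): Promise<void>;
  waitForWorkspace(publicIp: string): Promise<void>;
}

async function provisionWorkspace(deps: ControlPlaneDeps, projectId: string): Promise<Status> {
  const lockKey = `lock:project:${projectId}:create`; // assumed key format
  if (!(await deps.acquireLock(lockKey))) {
    throw new Error("provisioning already in progress");
  }
  try {
    await deps.setStatus(projectId, "ALLOCATING_VM");
    const vm = await deps.allocateVm(projectId);
    await deps.waitForAgent(vm.publicIp);

    await deps.setStatus(projectId, "BOOTING_CONTAINER");
    await deps.startContainer(vm.publicIp, `spinup-${projectId}`);
    await deps.waitForWorkspace(vm.publicIp);

    await deps.setStatus(projectId, "READY");
    return "READY";
  } catch (_err) {
    // Any failed step leaves a visible FAILED state instead of a stuck project.
    await deps.setStatus(projectId, "FAILED");
    return "FAILED";
  } finally {
    // The lock is released on both the success and failure paths.
    await deps.releaseLock(lockKey);
  }
}
```

Passing the dependencies in as an interface keeps the step order testable with fakes, without touching AWS or Docker.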

Why it is not trivial

The hard parts are system boundaries

  • Separates control-plane state from runtime workspace execution.
  • Uses lifecycle states to make provisioning visible and debuggable.
  • Coordinates VM capacity, container boot, and S3-backed persistence.

Subsystem deep dive

Control plane vs runtime plane

The control plane decides what should happen. The runtime plane is where the workspace actually runs. This separation keeps product orchestration, database state, locks, cloud allocation, and Docker runtime control from collapsing into one blob.

  • Control plane: Next.js API routes, project orchestration, Postgres state, Redis locks, AWS control logic, VM agent client.
  • Runtime plane: EC2 ASG instances, VM agent, Docker, project container, code-server, S3-backed restore/sync.
  • Storage plane: Postgres for durable state, Redis for fast coordination, S3 for workspace files.

Lifecycle state machine

Workspace provisioning is represented as explicit lifecycle state so the frontend can show truthful progress and the backend can recover from long-running operations.

  • Primary lifecycle: CREATED → ALLOCATING_VM → BOOTING_CONTAINER → READY.
  • Failure and cleanup states: FAILED, STOPPED, DELETING, DELETED.
  • Each transition writes a ProjectEvent for audit history.
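
One way to enforce these states is a transition table checked on every status change. The sketch below is an assumption-heavy illustration: only the primary path is stated in this case study, so the failure, retry, resume, and delete edges in `ALLOWED` are guesses, and the in-memory `ProjectState` stands in for the Postgres rows.

```typescript
type ProjectStatus =
  | "CREATED" | "ALLOCATING_VM" | "BOOTING_CONTAINER" | "READY"
  | "FAILED" | "STOPPED" | "DELETING" | "DELETED";

// Assumed edges: the primary path comes from the text, the rest are illustrative.
const ALLOWED: Record<ProjectStatus, readonly ProjectStatus[]> = {
  CREATED: ["ALLOCATING_VM", "FAILED", "DELETING"],
  ALLOCATING_VM: ["BOOTING_CONTAINER", "FAILED", "DELETING"],
  BOOTING_CONTAINER: ["READY", "FAILED", "DELETING"],
  READY: ["STOPPED", "FAILED", "DELETING"],
  FAILED: ["ALLOCATING_VM", "DELETING"], // retry
  STOPPED: ["ALLOCATING_VM", "DELETING"], // resume
  DELETING: ["DELETED", "FAILED"],
  DELETED: [],
};

interface TransitionEvent {
  projectId: string;
  from: ProjectStatus;
  to: ProjectStatus;
  at: Date;
}

interface ProjectState {
  id: string;
  status: ProjectStatus;
  events: TransitionEvent[];
}

// Returns a new state with the transition applied and one audit event appended.
function transition(project: ProjectState, to: ProjectStatus, now: Date = new Date()): ProjectState {
  if (ALLOWED[project.status].indexOf(to) === -1) {
    throw new Error(`illegal transition ${project.status} -> ${to}`);
  }
  const event: TransitionEvent = { projectId: project.id, from: project.status, to, at: now };
  return { ...project, status: to, events: [...project.events, event] };
}
```

Rejecting illegal jumps (for example CREATED straight to READY) is what lets the frontend trust the status it polls.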

EC2/ASG allocation

The runtime allocation path first tries to reuse an idle EC2 instance from the Auto Scaling Group. If none is available, the ASG layer computes capacity and tries to ensure idle capacity.

  • Idle VM allocation avoids cold launches where possible.
  • ASG capacity planning keeps idle capacity available, so the system behaves like a small runtime platform rather than a launcher for one-off instances.
  • Unhealthy or busy instances are tracked separately from available capacity.
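
The reuse-or-scale decision can be expressed as a pure function over the current instance list. This is a sketch under stated assumptions: the `Instance` shape, the `idleTarget` and `maxSize` parameters, and the capacity formula are hypothetical, not the actual SpinUp planner.

```typescript
type InstanceState = "IDLE" | "BUSY" | "UNHEALTHY";

interface Instance {
  instanceId: string;
  state: InstanceState;
}

type AllocationPlan =
  | { kind: "REUSE"; instanceId: string; desiredCapacity: number }
  | { kind: "SCALE_OUT"; desiredCapacity: number };

// Assumed policy: after serving this request, keep `idleTarget` idle instances
// available, never exceeding `maxSize`. Unhealthy instances are not counted.
function planAllocation(instances: Instance[], idleTarget: number, maxSize: number): AllocationPlan {
  const idle = instances.filter((i) => i.state === "IDLE");
  const busy = instances.filter((i) => i.state === "BUSY").length;

  const busyAfter = busy + 1; // this request occupies one more instance
  const desiredCapacity = Math.min(maxSize, busyAfter + idleTarget);

  const first = idle[0];
  if (first !== undefined) {
    // Warm path: hand out an idle VM and top the pool back up in the background.
    return { kind: "REUSE", instanceId: first.instanceId, desiredCapacity };
  }
  // Cold path: nothing idle, so the caller has to wait for new capacity.
  return { kind: "SCALE_OUT", desiredCapacity };
}
```

In this sketch the returned `desiredCapacity` is the value the ASG layer would request from Auto Scaling. Keeping the decision pure makes it easy to test without any AWS calls.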

VM agent and Docker boot

The backend does not directly run Docker commands on the VM. It waits for VM agent health and asks the agent to start, stop, or check containers.

  • VM agent health separates instance allocation from container runtime readiness.
  • Containers use deterministic names like spinup-<projectId>.
  • Backend waits for workspace HTTP readiness before marking the project ready.
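
Both waits (agent health and workspace HTTP readiness) follow the same bounded polling pattern. The sketch below assumes a hypothetical `waitUntil` helper with injected `check` and `sleep` functions; the actual health endpoints, retry counts, and delays used by SpinUp are not specified here.

```typescript
// Deterministic container name, following the spinup-<projectId> convention.
function containerNameFor(projectId: string): string {
  return `spinup-${projectId}`;
}

interface PollOptions {
  attempts: number; // maximum number of checks
  delayMs: number; // pause between checks
  sleep: (ms: number) => Promise<void>; // injected so tests do not need real timers
}

// Polls `check` until it returns true or the attempts run out.
// A thrown error (for example a refused connection) counts as "not ready yet".
async function waitUntil(check: () => Promise<boolean>, opts: PollOptions): Promise<boolean> {
  for (let attempt = 0; attempt < opts.attempts; attempt++) {
    try {
      if (await check()) return true;
    } catch (_err) {
      // Swallow and retry; the instance or container may still be booting.
    }
    if (attempt < opts.attempts - 1) {
      await opts.sleep(opts.delayMs);
    }
  }
  return false;
}
```

A `false` result is the signal to move the project to FAILED rather than leave it in BOOTING_CONTAINER indefinitely. Because the name is deterministic, a retry can look up the same container instead of creating a duplicate.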

S3 persistence and workspace image

vm-base-config turns a generic code-server image into a project-aware workspace image. It restores files from S3 or creates a project from a base template, installs dependencies, starts sync, and then starts code-server.

  • Project files survive container and VM restarts.
  • The workspace opens with the project's files and dependencies in place, not as a blank remote IDE.
  • S3 restore/sync makes the runtime durable enough for demos and future platform extension.
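
The startup order described above has one branch: restore from S3 when project files exist, otherwise create from the base template. A minimal sketch of that ordering follows, using hypothetical step names; the real vm-base-config entrypoint is not shown in this case study.

```typescript
// Hypothetical step labels for the container startup sequence.
type StartupStep =
  | "RESTORE_FROM_S3"
  | "CREATE_FROM_TEMPLATE"
  | "INSTALL_DEPENDENCIES"
  | "START_SYNC"
  | "START_CODE_SERVER";

// Returns the ordered steps for a container boot.
// `hasFilesInS3` is assumed to come from a check for existing project files in S3.
function planStartup(hasFilesInS3: boolean): StartupStep[] {
  const source: StartupStep = hasFilesInS3 ? "RESTORE_FROM_S3" : "CREATE_FROM_TEMPLATE";
  // code-server starts last so the IDE never opens on a half-restored project.
  return [source, "INSTALL_DEPENDENCIES", "START_SYNC", "START_CODE_SERVER"];
}
```

For example, `planStartup(false)` begins with `CREATE_FROM_TEMPLATE`, which corresponds to a project's first boot.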

Data model

Durable state behind the UI

  • User stores the Clerk-authenticated user mapped into the product database.
  • Project stores name, normalizedName, type, ownerId, lifecycle status, assignedInstanceId, containerName, publicIp, boot timestamps, heartbeat, cleanup timestamps, and last event data.
  • ProjectRoom connects users and projects and tracks VM state.
  • ProjectEvent stores lifecycle history such as PROJECT_CREATED, ALLOCATION_STARTED, INSTANCE_ASSIGNED, CONTAINER_BOOT_STARTED, CONTAINER_BOOT_SUCCEEDED, HEARTBEAT_OK, DELETE_STARTED, and DELETE_COMPLETED.
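
As an illustration, the Project and ProjectEvent records might be typed as follows. This covers only a subset of the fields listed above; the field types, nullability, and the `normalizeName` rule are assumptions, not the actual schema.

```typescript
type ProjectStatus =
  | "CREATED" | "ALLOCATING_VM" | "BOOTING_CONTAINER" | "READY"
  | "FAILED" | "STOPPED" | "DELETING" | "DELETED";

type ProjectEventType =
  | "PROJECT_CREATED" | "ALLOCATION_STARTED" | "INSTANCE_ASSIGNED"
  | "CONTAINER_BOOT_STARTED" | "CONTAINER_BOOT_SUCCEEDED"
  | "HEARTBEAT_OK" | "DELETE_STARTED" | "DELETE_COMPLETED";

interface Project {
  id: string;
  name: string;
  normalizedName: string;
  type: string; // framework chosen at creation
  ownerId: string;
  status: ProjectStatus;
  assignedInstanceId: string | null; // null until a VM is assigned
  containerName: string | null;
  publicIp: string | null;
  lastEventType: ProjectEventType | null;
}

interface ProjectEvent {
  projectId: string;
  type: ProjectEventType;
  createdAt: Date;
}

// Assumed normalization: lowercase, with runs of other characters collapsed to "-".
function normalizeName(name: string): string {
  return name.trim().toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, "");
}

// Builds the initial row plus its first audit event.
function newProject(
  id: string, ownerId: string, name: string, type: string, now: Date = new Date(),
): { project: Project; event: ProjectEvent } {
  const event: ProjectEvent = { projectId: id, type: "PROJECT_CREATED", createdAt: now };
  const project: Project = {
    id, name, normalizedName: normalizeName(name), type, ownerId,
    status: "CREATED",
    assignedInstanceId: null, containerName: null, publicIp: null,
    lastEventType: event.type,
  };
  return { project, event };
}
```

A normalized name gives the "create or reuse the project row" step a stable key to match on.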

Engineering decisions

Tradeoffs and reliability boundaries

  • Built SpinUp as a control-plane-first product.
  • Used lifecycle state in Postgres.
  • Used Redis for locks and runtime mirrors.
  • Used EC2 ASG instead of one-off instance launches.
  • Separated the control plane from the VM agent.
  • Used S3 for project persistence.
  • Kept V1 to one active runtime per user for cost and correctness.
  • Made runtime state visible in the UI.
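
The locking decision can be illustrated with an in-memory stand-in that models the semantics of the Redis `SET key value NX PX ttl` command plus an owner-checked release. It is a sketch only: `InMemoryLockStore`, the key format, and the explicit `now` parameter are hypothetical, and a real deployment would issue these operations against Redis.

```typescript
interface LockEntry {
  token: string; // identifies the owner
  expiresAt: number; // epoch milliseconds
}

class InMemoryLockStore {
  private entries = new Map<string, LockEntry>();

  // Models `SET key token NX PX ttlMs`: succeeds only if the key is absent or expired.
  acquire(key: string, token: string, ttlMs: number, now: number): boolean {
    const current = this.entries.get(key);
    if (current !== undefined && current.expiresAt > now) {
      return false;
    }
    this.entries.set(key, { token, expiresAt: now + ttlMs });
    return true;
  }

  // Compare-and-delete: only the current owner may release the lock.
  release(key: string, token: string): boolean {
    const current = this.entries.get(key);
    if (current === undefined || current.token !== token) {
      return false;
    }
    this.entries.delete(key);
    return true;
  }
}

// Assumed key format for the "one active runtime per user" rule.
function runtimeLockKey(userId: string): string {
  return `lock:user:${userId}:runtime`;
}
```

The TTL means a crashed request cannot hold the lock forever, and the token check stops a slow request from releasing a lock that has since passed to another owner.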

What makes it more than a demo

  • Real cloud runtime allocation.
  • EC2 Auto Scaling Group integration.
  • Idle VM reuse.
  • VM agent integration.
  • Docker/code-server boot.
  • S3 project restore/sync.
  • Create/resume/delete/retry flows.
  • Failure states and frontend lifecycle polling.

Next improvements

  • Add HTTPS routing in front of workspaces.
  • Add per-user subdomain routing instead of raw public IPs.
  • Improve workspace app preview on port 3000.
  • Add stronger worker reconciliation and richer event timeline UI.
  • Add team collaboration, permissions, production-grade secrets, and observability.

Proof links

Proof links connect to the source repos, demos, architecture assets, screenshots, and engineering journals used to build this case study.

Final takeaway

SpinUp proves cloud platform thinking by turning a simple browser action into a lifecycle-driven infrastructure workflow with allocation, locking, runtime boot, persistence, cleanup, and UI-visible state.