Core Concepts

This guide explains the core concepts of Kumiho Cloud and how they relate to the Python SDK.

Graph-Native Architecture

Kumiho Cloud is built on a graph database (Neo4j), which means relationships between assets are first-class citizens. Unlike traditional file-based systems, Kumiho tracks:

  • Dependencies: What assets does this asset depend on?

  • Lineage: What was this asset created from?

  • Usage: Where is this asset used?

┌───────────────────────────────────────────────────────┐
│                         PROJECT                       │
│  ┌─────────────────────────────────────────────────┐  │
│  │                        SPACE                    │  │
│  │  ┌─────────┐      ┌─────────┐     ┌─────────┐   │  │
│  │  │  ITEM   │────▶│  ITEM   │────▶│  ITEM   │   │  │
│  │  └────┬────┘      └────┬────┘     └────┬────┘   │  │
│  │       │                │               │        │  │
│  │  ┌────▼────┐      ┌────▼────┐     ┌────▼────┐   │  │
│  │  │REVISION │      │REVISION │     │REVISION │   │  │
│  │  │   v1    │      │   v1    │     │   v1    │   │  │
│  │  └────┬────┘      └────┬────┘     └────┬────┘   │  │
│  │       │                │               │        │  │
│  │  ┌────▼────┐      ┌────▼────┐     ┌────▼────┐   │  │
│  │  │ARTIFACT │      │ARTIFACT │     │ARTIFACT │   │  │
│  │  └─────────┘      └─────────┘     └─────────┘   │  │
│  └─────────────────────────────────────────────────┘  │
└───────────────────────────────────────────────────────┘

Entity Hierarchy

Project

A Project is the top-level container representing a production, show, or workspace. Each project is isolated within a tenant’s graph database.

project = kumiho.create_project(
    name="sci-fi-short",
    description="Sci-fi short film VFX assets"
)

Key attributes:

  • name: URL-safe identifier (used in Kref URIs)

  • description: Human-readable description

  • kref: Reference URI (kref://sci-fi-short)

Space

A Space organizes assets within a project. Common groupings include:

  • By type: characters, environments, props

  • By episode: ep01, ep02

  • By department: modeling, animation, lighting

space = project.create_space("characters")

Key attributes:

  • name: URL-safe identifier

  • path: Full path in the hierarchy (e.g., /sci-fi-short/characters)

  • metadata: Custom key-value metadata

Item

An Item represents a single creative asset or AI artifact. Items have a kind that indicates what type of asset they are.

item = space.create_item(
    item_name="hero-robot",
    kind="model"
)

Common item kinds:

  • model: 3D models

  • texture: Textures and materials

  • animation: Animation data

  • rig: Character rigs

  • composite: Compositing setups

  • ai_model: Trained AI models

  • dataset: Training datasets

  • prompt: AI prompts

Key attributes:

  • name: URL-safe identifier

  • kind: Category of the asset

  • kref: Reference URI (kref://sci-fi-short/characters/hero-robot.model)

Revision

A Revision represents a specific iteration of an item. Revisions are immutable once published.

# Simple revision creation
revision = item.create_revision()

# Revision with metadata
revision = item.create_revision(
    metadata={
        "artist": "jane",
        "render_engine": "arnold",
        "notes": "Added facial rigging for dialogue"
    }
)

Key attributes:

  • number: Auto-incrementing revision number

  • metadata: Custom key-value metadata

  • kref: Reference URI (kref://sci-fi-short/characters/hero-robot.model?r=3)

  • tags: List of tags (e.g., “latest”, “published”, “approved”)

Artifact

An Artifact is a file reference attached to a revision. Artifacts store metadata about files without uploading the actual data—files stay on your local/NAS storage.

artifact = revision.create_artifact(
    name="hero_robot_v3.fbx",
    location="smb://studio-nas/projects/scifi/hero_robot_v3.fbx"
)

# Add metadata after creation
artifact.set_metadata({
    "size": "52428800",
    "checksum": "sha256:a1b2c3...",
    "format": "fbx"
})

Key attributes:

  • name: Artifact identifier within the revision

  • location: URI pointing to the actual file

  • metadata: Custom key-value metadata
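
The size, checksum, and format values in the example above have to be computed somewhere on the client side. A minimal sketch of one way to do that with the standard library follows. The helper name describe_file is hypothetical and not part of the Kumiho SDK, and values are stringified only to match the example above.

```python
import hashlib
import os
import tempfile


def describe_file(path, chunk_size=1024 * 1024):
    """Return a metadata dict (all string values) for a local file.

    Hypothetical helper, not part of the Kumiho SDK.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large assets are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return {
        "size": str(os.path.getsize(path)),
        "checksum": f"sha256:{digest.hexdigest()}",
        "format": os.path.splitext(path)[1].lstrip(".").lower(),
    }


# Demonstrate on a throwaway file standing in for a real asset.
with tempfile.NamedTemporaryFile(suffix=".fbx", delete=False) as tmp:
    tmp.write(b"example bytes")
sample_metadata = describe_file(tmp.name)
os.remove(tmp.name)
print(sample_metadata)
```

The resulting dict has the same shape as the one passed to artifact.set_metadata() above.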

Edge

An Edge represents a relationship between revisions. Edges enable lineage tracking.

import kumiho

# Get the target revision
texture = kumiho.get_revision("kref://sci-fi-short/textures/metal.texture?r=2")

# Create edge with optional metadata
edge = revision.create_edge(
    target_revision=texture,
    edge_type=kumiho.DEPENDS_ON,
    metadata={"usage": "body material"}
)

Edge types:

  • kumiho.DEPENDS_ON: This revision depends on the target

  • kumiho.DERIVED_FROM: This revision was derived from the target

  • kumiho.REFERENCED: This revision references the target

  • kumiho.CONTAINS: This revision contains the target

Querying edges by direction:

# Get outgoing edges (default) - edges FROM this revision
outgoing = revision.get_edges(direction=kumiho.OUTGOING)

# Get incoming edges - edges TO this revision
incoming = revision.get_edges(direction=kumiho.INCOMING)

# Get all edges in both directions
all_edges = revision.get_edges(direction=kumiho.BOTH)
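
To make the three directions concrete, here is a toy in-memory model of the same filtering. It is only a sketch: ToyEdge, the local get_edges function, and the string values of the direction constants are illustrative assumptions, not the SDK's implementation.

```python
from dataclasses import dataclass

# Stand-ins for the SDK's direction constants (the values are assumptions).
OUTGOING, INCOMING, BOTH = "outgoing", "incoming", "both"


@dataclass(frozen=True)
class ToyEdge:
    source: str     # revision the edge starts from
    target: str     # revision the edge points to
    edge_type: str


def get_edges(edges, revision, direction=OUTGOING):
    """Filter a flat edge list by direction relative to one revision."""
    if direction == OUTGOING:
        return [e for e in edges if e.source == revision]
    if direction == INCOMING:
        return [e for e in edges if e.target == revision]
    if direction == BOTH:
        return [e for e in edges if revision in (e.source, e.target)]
    raise ValueError(f"unknown direction: {direction!r}")


toy_edges = [
    ToyEdge("model_r3", "texture_r2", "DEPENDS_ON"),
    ToyEdge("rig_r1", "model_r3", "DEPENDS_ON"),
]
print(get_edges(toy_edges, "model_r3", OUTGOING))  # model -> texture
print(get_edges(toy_edges, "model_r3", INCOMING))  # rig -> model
print(len(get_edges(toy_edges, "model_r3", BOTH)))
```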

Graph Traversal

Kumiho provides graph traversal methods for dependency analysis, path finding, and impact analysis:

# Find all dependencies (what this revision depends on)
deps = revision.get_all_dependencies(max_depth=5)
for kref in deps.revision_krefs:
    print(f"Depends on: {kref.uri}")

# Find all dependents (what depends on this revision)
dependents = revision.get_all_dependents(max_depth=5)
for kref in dependents.revision_krefs:
    print(f"Depended on by: {kref.uri}")

# Find shortest path between revisions
path = source_revision.find_path_to(target_revision)
if path:
    print(f"Path length: {path.total_depth}")
    for step in path.steps:
        print(f"  -> {step.revision_kref.uri} via {step.edge_type}")

# Analyze impact of changes (what would be affected)
impact = revision.analyze_impact()
for impacted in impact:
    print(f"Would affect: {impacted.revision_kref.uri} at depth {impacted.impact_depth}")
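
Conceptually, dependency and dependent lookups with a max_depth are bounded graph walks. The sketch below shows the idea on a plain dictionary with a breadth-first search. It is an illustration under that assumption, not a description of how the service executes these queries, and the names reachable and reverse are hypothetical.

```python
from collections import deque

# Toy DEPENDS_ON graph: revision -> revisions it depends on.
depends_on = {
    "shot_r1": ["model_r3", "rig_r1"],
    "rig_r1": ["model_r3"],
    "model_r3": ["texture_r2"],
    "texture_r2": [],
}


def reachable(graph, start, max_depth):
    """Breadth-first walk returning {node: depth} for nodes within max_depth."""
    depths = {}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbour in graph.get(node, []):
            if neighbour not in depths and neighbour != start:
                depths[neighbour] = depth + 1
                queue.append((neighbour, depth + 1))
    return depths


def reverse(graph):
    """Flip every edge so dependents can be found with the same walk."""
    flipped = {node: [] for node in graph}
    for node, targets in graph.items():
        for target in targets:
            flipped.setdefault(target, []).append(node)
    return flipped


dependencies = reachable(depends_on, "shot_r1", max_depth=5)
dependents = reachable(reverse(depends_on), "texture_r2", max_depth=5)
print(dependencies)
print(dependents)
```

The depth recorded for each dependent plays the same role as impact_depth in the impact analysis example: it says how many hops separate a change from the revision it would affect.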

Bundle

A Bundle is a special item kind that aggregates other items. Bundles are useful for grouping related assets (e.g., a character bundle with model, textures, and rig), and they keep a full revision-based audit trail of membership changes.

# Create a bundle via Project or Space
bundle = project.create_bundle("release-bundle")

# Or from a space
assets = project.get_space("assets")
char_bundle = assets.create_bundle("character-bundle")

# Add items to the bundle
hero_model = assets.get_item("hero", "model")
hero_rig = assets.get_item("hero", "rig")
bundle.add_member(hero_model)
bundle.add_member(hero_rig)

# Get current members
members = bundle.get_members()
for member in members:
    print(f"  {member.item_kref}")

# View change history (audit trail)
history = bundle.get_history()
for entry in history:
    print(f"v{entry.revision_number}: {entry.action} {entry.member_item_kref}")

Key characteristics:

  • Bundles use the reserved item kind "bundle"

  • Cannot be created via create_item() (use create_bundle())

  • Each membership change (add/remove) creates a new bundle revision

  • Full audit trail: who added/removed what item and when

  • Can query members at any specific revision in history
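
The characteristics above amount to an append-only membership log that can be replayed up to any revision. A toy model of that idea is sketched here. ToyBundle and members_at are hypothetical names, and the numbering (first change is revision 1) is an assumption made for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBundle:
    """Toy model: every membership change appends one history entry."""
    history: list = field(default_factory=list)  # (revision, action, item, author)

    def _record(self, action, item, author):
        revision_number = len(self.history) + 1
        self.history.append((revision_number, action, item, author))
        return revision_number

    def add_member(self, item, author):
        return self._record("added", item, author)

    def remove_member(self, item, author):
        return self._record("removed", item, author)

    def members_at(self, revision_number):
        """Replay the log up to and including the given bundle revision."""
        members = set()
        for number, action, item, _author in self.history:
            if number > revision_number:
                break
            if action == "added":
                members.add(item)
            else:
                members.discard(item)
        return members


toy_bundle = ToyBundle()
toy_bundle.add_member("hero.model", author="jane")    # bundle revision 1
toy_bundle.add_member("hero.rig", author="jane")      # bundle revision 2
toy_bundle.remove_member("hero.model", author="sam")  # bundle revision 3
print(sorted(toy_bundle.members_at(2)))
print(sorted(toy_bundle.members_at(3)))
```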

Metadata

All node types support custom metadata as key-value string pairs. Metadata can be set during creation (where supported) or updated afterward.

Setting Metadata

# During revision creation
revision = item.create_revision(metadata={
    "artist": "jane",
    "render_engine": "arnold",
    "frame_range": "1-100"
})

# During edge creation
edge = revision.create_edge(
    target_revision=texture,
    edge_type=kumiho.DEPENDS_ON,
    metadata={"channel": "diffuse"}
)

# Update metadata after creation (all node types)
space.set_metadata({"department": "modeling"})
item.set_metadata({"status": "approved"})
revision.set_metadata({"published_by": "supervisor"})
artifact.set_metadata({"checksum": "sha256:..."})

Granular Attribute Operations

For updating individual metadata fields without replacing the entire map:

# Set a single attribute
space.set_attribute("department", "modeling")
revision.set_attribute("status", "approved")

# Get a single attribute
dept = space.get_attribute("department")  # Returns "modeling" or None

# Delete a single attribute
space.delete_attribute("old_field")

This is more efficient than set_metadata() when you only need to change one field.
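
The difference between the two styles can be seen on a toy node. This sketch assumes, as the wording above suggests, that set_metadata() replaces the whole map. ToyNode is a hypothetical stand-in, not the SDK class.

```python
class ToyNode:
    """Toy stand-in for a node holding string metadata (not the SDK class)."""

    def __init__(self):
        self.metadata = {}

    def set_metadata(self, metadata):
        # Replaces the whole map: keys missing from `metadata` are dropped.
        self.metadata = dict(metadata)

    def set_attribute(self, key, value):
        # Touches one key and leaves the rest of the map alone.
        self.metadata[key] = value

    def get_attribute(self, key):
        return self.metadata.get(key)  # None when the key is absent

    def delete_attribute(self, key):
        self.metadata.pop(key, None)


node = ToyNode()
node.set_metadata({"department": "modeling", "old_field": "x"})
node.set_attribute("status", "approved")  # adds one key
node.delete_attribute("old_field")        # removes one key
print(node.metadata)
node.set_metadata({"status": "final"})    # drops "department" as well
print(node.get_attribute("department"))
```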

Common Metadata Patterns

Node Type  Common Keys
---------  -----------
Space      department, supervisor, deadline
Item       status, priority, assigned_to
Revision   artist, render_engine, notes, software_version
Artifact   size, checksum, format, resolution
Edge       usage, channel, relationship_notes

Kref URI Scheme

Kumiho uses Kref URIs as universal identifiers for all entities:

kref://project/space/item.kind?r=revision&a=artifact

URI                                                                            Resolves To
-----------------------------------------------------------------------------  -----------
kref://my-project                                                              Project
kref://my-project/chars                                                        Space
kref://my-project/chars/human                                                  Sub-Space(s)
kref://my-project/chars/human/hero.model                                       Item (latest revision)
kref://my-project/chars/human/hero.model?r=2                                   Specific revision
kref://my-project/chars/human/hero.model?r=2&a=mesh                            Specific artifact
kref://my-project/chars/human/hero.model?t=published                           Latest published revision
kref://my-project/chars/human/hero.model?t=published&a=mesh                    Latest published revision with specific artifact
kref://my-project/chars/human/hero.model?t=published&time=202406011330&a=mesh  Published revision at specific time with specific artifact
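
For tooling that needs to inspect Kref URIs without calling the service, the forms in the table can be split apart with the standard library. The function below is a sketch, not the SDK's parser. parse_kref is a hypothetical name, and it assumes that only the item segment contains a dot (name.kind).

```python
from urllib.parse import parse_qs, urlsplit


def parse_kref(uri):
    """Split a Kref URI into its parts (illustrative, not the SDK parser)."""
    parts = urlsplit(uri)
    if parts.scheme != "kref" or not parts.netloc:
        raise ValueError(f"not a kref URI: {uri!r}")
    segments = [s for s in parts.path.split("/") if s]
    item = kind = None
    # Assumption: only the item segment contains a dot (name.kind).
    if segments and "." in segments[-1]:
        item, _, kind = segments.pop().rpartition(".")
    query = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return {
        "project": parts.netloc,
        "spaces": segments,
        "item": item,
        "kind": kind,
        "revision": int(query["r"]) if "r" in query else None,
        "artifact": query.get("a"),
        "tag": query.get("t"),
        "time": query.get("time"),
    }


parsed = parse_kref("kref://my-project/chars/human/hero.model?r=2&a=mesh")
print(parsed["project"], parsed["spaces"], parsed["item"], parsed["kind"])
print(parsed["revision"], parsed["artifact"])
```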

Time-Based Revision Queries

One of Kumiho’s most powerful features is time-based revision lookup. This enables reproducible builds, historical debugging, and auditing by answering questions like: “What was the published version of this asset on June 1st?”

Why Time-Based Queries Matter

In production pipelines, you often need to:

  1. Reproduce past renders: Re-render a shot exactly as it was delivered months ago

  2. Debug regressions: Compare current assets against a known-good state from a specific date

  3. Audit changes: Understand what version was used when a decision was made

  4. Compliance: Prove what asset versions were in use at a particular milestone

Without time-based queries, you’d need to manually track revision numbers for every asset at every milestone—an error-prone and tedious process.

Using get_revision_by_time

The SDK provides get_revision_by_time() to find the revision that carried a given tag at a given point in time:

from datetime import datetime, timezone

# Get the "published" revision as of June 1st, 2024
june_1 = datetime(2024, 6, 1, tzinfo=timezone.utc)
revision = item.get_revision_by_time(
    time=june_1,
    tag="published"
)

print(f"On {june_1}, published revision was r{revision.number}")

The time parameter accepts multiple formats:

  • datetime object: datetime(2024, 6, 1, 13, 30, 45, tzinfo=timezone.utc) - full precision

  • ISO 8601 string: "2024-06-01T13:30:45+00:00" or "2024-06-01T13:30:45Z" - full precision

  • YYYYMMDDHHMM string: "202406011330" - minute-level precision (rounded to end of minute)

For historical auditing where events may happen within the same minute, use datetime objects or ISO strings for sub-second precision.

This is especially useful for the published tag, which marks revisions as immutable and approved for downstream consumption.
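
The three accepted time formats can be pictured as all normalizing to one timezone-aware timestamp. The sketch below shows one plausible normalization. normalize_time is a hypothetical helper, and two details are assumptions: "rounded to end of minute" is read as the last microsecond of that minute, and naive values are treated as UTC.

```python
from datetime import datetime, timedelta, timezone


def normalize_time(value):
    """Convert any of the three accepted forms into an aware UTC datetime."""
    if isinstance(value, datetime):
        parsed = value
    elif len(value) == 12 and value.isdigit():
        # YYYYMMDDHHMM: minute precision, rounded to the end of that minute
        # (assumed here to mean its last microsecond).
        start = datetime.strptime(value, "%Y%m%d%H%M")
        parsed = start + timedelta(minutes=1) - timedelta(microseconds=1)
    else:
        # ISO 8601; older Python versions do not accept a trailing "Z".
        if value.endswith("Z"):
            value = value[:-1] + "+00:00"
        parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Assumption: naive values are treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


print(normalize_time("202406011330"))
print(normalize_time("2024-06-01T13:30:45Z"))
print(normalize_time(datetime(2024, 6, 1, 13, 30, 45, tzinfo=timezone.utc)))
```

Under these assumptions, the compact form covers every event in its minute, while the other two forms can separate events that fall within the same minute.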

Time-Based Kref URIs

You can also use time-based queries directly in Kref URIs with the t= (tag) and time= parameters:

# Get published revision at a specific time via Kref
# Format: YYYYMMDDHHMM (e.g., 202406011330 = June 1, 2024 at 13:30)
kref = "kref://my-project/chars/hero.model?t=published&time=202406011330"
revision = kumiho.get_revision(kref)

# Resolve to artifact location at that point in time
location = kumiho.resolve(kref)

Kref time query parameters:

Parameter            Description
-------------------  -----------
t=<tag>              Find revision with this tag (e.g., t=published, t=approved)
time=<YYYYMMDDHHMM>  Point in time to query (e.g., time=202406011330)

When both t= and time= are provided, Kumiho finds the revision that:

  1. Had the specified tag at the given time

  2. Was the most recent such revision before or at that time
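
The two rules above can be sketched against a toy tag history. This is an illustration of the selection logic only. The history layout, the helper names, and the choice to rank candidates by when they were tagged are assumptions, not the service's implementation.

```python
from datetime import datetime, timezone


def utc(*args):
    """Shorthand for building timezone-aware UTC datetimes."""
    return datetime(*args, tzinfo=timezone.utc)


# Toy tag history for one item: (revision number, tagged at, untagged at or None).
published_history = [
    (1, utc(2024, 2, 1), utc(2024, 5, 15)),
    (2, utc(2024, 5, 15), None),
    (3, utc(2024, 6, 20), None),
]


def revision_at(history, when):
    """Return the revision number that held the tag at `when`, or None."""
    candidates = [
        (tagged_at, number)
        for number, tagged_at, untagged_at in history
        if tagged_at <= when and (untagged_at is None or when < untagged_at)
    ]
    # Of the revisions holding the tag, prefer the most recently tagged one.
    return max(candidates)[1] if candidates else None


print(revision_at(published_history, utc(2024, 3, 1)))
print(revision_at(published_history, utc(2024, 6, 1)))
print(revision_at(published_history, utc(2024, 6, 30)))
print(revision_at(published_history, utc(2024, 1, 1)))
```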

Practical Examples

Reproduce a past delivery:

# Find all assets as they were for the Q2 delivery
delivery_date = datetime(2024, 6, 30, 23, 59, 59, tzinfo=timezone.utc)

for item in space.get_items():
    rev = item.get_revision_by_time(time=delivery_date, tag="published")
    if rev:
        print(f"{item.name}: r{rev.number}")
        for artifact in rev.get_artifacts():
            print(f"  -> {artifact.location}")

Compare current vs historical:

# What changed between two milestones?
alpha_date = datetime(2024, 3, 1, tzinfo=timezone.utc)
beta_date = datetime(2024, 6, 1, tzinfo=timezone.utc)

alpha_rev = item.get_revision_by_time(time=alpha_date, tag="published")
beta_rev = item.get_revision_by_time(time=beta_date, tag="published")

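# Either lookup may find nothing if the asset was not yet published
if alpha_rev is None or beta_rev is None:
    print("Asset was not published at one of the milestones")
elif alpha_rev.number != beta_rev.number:
    print(f"Asset changed from r{alpha_rev.number} to r{beta_rev.number}")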

Pipeline integration with timestamps:

# In a render farm job, record the exact time for reproducibility
import json
from datetime import datetime, timezone

render_manifest = {
    "render_time": datetime.now(timezone.utc).isoformat(),
    "assets": []
}

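# required_assets: the item krefs this job needs (defined elsewhere)
for item_kref in required_assets:
    item = kumiho.get_item(item_kref)
    rev = item.get_revision_by_tag("published")
    render_manifest["assets"].append({
        "kref": rev.kref.uri,
        "revision": rev.number
    })

# Save the manifest so the render can be reproduced later
with open("render_manifest.json", "w") as f:
    json.dump(render_manifest, f, indent=2)

# Later, reproduce using the recorded timestamp
with open("render_manifest.json") as f:
    manifest = json.load(f)
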
render_time = datetime.fromisoformat(manifest["render_time"])
# Convert to YYYYMMDDHHMM format for kref
time_str = render_time.strftime("%Y%m%d%H%M")
for asset in manifest["assets"]:
    # Get exactly what was published at render time
    kref = f"{asset['kref'].split('?')[0]}?t=published&time={time_str}"
    revision = kumiho.get_revision(kref)

Tags and Time

The published tag is especially important for time-based queries because:

  1. Immutability: Published revisions cannot be modified or deleted

  2. Stability: Downstream consumers can rely on published revisions not changing

  3. Audit trail: Tag history is preserved, so you can query what was published when

Other common tags for time-based queries:

  • approved: Supervisor-approved versions

  • delivered: Versions sent to clients

  • milestone-alpha, milestone-beta: Project milestone snapshots

BYO Storage Philosophy

Kumiho follows a “Bring Your Own Storage” philosophy:

  1. Files stay local: Original files remain on your NAS, local disk, or on-prem storage

  2. Metadata in cloud: Only paths, hashes, and relationships are stored in the cloud

  3. No vendor lock-in: You can always access your files directly

# Artifact location is just a URI - files aren't uploaded
artifact = revision.create_artifact(
    name="hero.fbx",
    location="file:///mnt/studio/projects/hero.fbx"  # File stays here
)

Supported URI schemes:

  • file://: Local filesystem

  • smb://: Windows/Samba shares

  • nfs://: NFS mounts

  • s3://: Amazon S3 (for hybrid setups)

  • gs://: Google Cloud Storage (for hybrid setups)
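
Because a location is only a string, a pipeline may want to check its scheme before registering an artifact. The following is a client-side sketch using the list above. check_location is a hypothetical helper, and whether the service itself rejects other schemes is not stated here.

```python
from urllib.parse import urlsplit

# Schemes taken from the list above.
SUPPORTED_SCHEMES = {"file", "smb", "nfs", "s3", "gs"}


def check_location(location):
    """Return the scheme if it is in the supported list, else raise ValueError."""
    scheme = urlsplit(location).scheme.lower()
    if scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported location scheme: {scheme or '(none)'}")
    return scheme


print(check_location("file:///mnt/studio/projects/hero.fbx"))
print(check_location("smb://studio-nas/projects/scifi/hero_robot_v3.fbx"))
```

A bare path such as /mnt/studio/projects/hero.fbx has no scheme and would be rejected by this check, which is a reminder to write it as a file:// URI.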

Multi-Tenant Architecture

Kumiho Cloud is a multi-tenant SaaS:

  • Tenant: A studio or organization with its own isolated data

  • Region: Geographic location of the data (e.g., us-central, eu-west)

  • Control Plane: Global service for authentication and routing

  • Data Plane: Regional servers with Neo4j databases

The SDK handles tenant resolution automatically:

# SDK automatically routes to the correct region
kumiho.connect()  # Uses cached credentials and tenant info

Event Streaming

Kumiho supports real-time event streaming for reactive workflows. Events are emitted whenever assets change, enabling live dashboards, automated pipelines, and integrations.

Basic Usage

import kumiho

# Stream all events
for event in kumiho.event_stream():
    print(f"{event.routing_key}: {event.kref}")

# Filter by routing key (wildcards supported)
for event in kumiho.event_stream(routing_key_filter="revision.*"):
    if event.action == "created":
        print(f"New revision: {event.kref}")

# Filter by kref pattern (glob syntax)
for event in kumiho.event_stream(kref_filter="kref://my-project/**/*.model"):
    print(f"Model changed: {event.kref}")
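
To get a feel for how wildcard filters select events, the same kind of matching can be approximated locally with Python's fnmatch module. This is only a sketch: the exact matching rules the service applies are not specified here, and in fnmatch a * also matches dots and slashes, so ** behaves like *. The matches helper is hypothetical.

```python
from fnmatch import fnmatchcase


def matches(routing_key, kref, routing_key_filter=None, kref_filter=None):
    """Approximate event filtering locally with shell-style wildcards."""
    if routing_key_filter and not fnmatchcase(routing_key, routing_key_filter):
        return False
    if kref_filter and not fnmatchcase(kref, kref_filter):
        return False
    return True


# (routing key, kref) pairs standing in for incoming events.
sample_events = [
    ("revision.created", "kref://my-project/chars/hero.model"),
    ("revision.tagged", "kref://my-project/chars/hero.texture"),
    ("item.created", "kref://my-project/props/crate.model"),
]
revision_events = [e for e in sample_events if matches(*e, routing_key_filter="revision.*")]
model_events = [e for e in sample_events if matches(*e, kref_filter="kref://my-project/**/*.model")]
print(revision_events)
print(model_events)
```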

Event Object

Each event contains:

Attribute    Type      Description
-----------  --------  -----------
routing_key  str       Event type (e.g., revision.created, artifact.added)
kref         str       Kref URI of the affected resource
action       str       Action performed (created, updated, deleted, tagged)
timestamp    datetime  When the event occurred
metadata     dict      Additional event metadata
cursor       str       Cursor for resumable streaming (Creator+ tiers)

Event Types (Routing Keys)

Routing Key        Description
-----------------  -----------
space.created      New space was created
space.updated      Space metadata was updated
space.deleted      Space was deleted
item.created       New item was created
item.updated       Item metadata was updated
item.deleted       Item was deleted
revision.created   New revision was created
revision.tagged    Revision was tagged (e.g., “published”)
revision.untagged  Tag was removed from revision
artifact.added     Artifact was added to a revision
artifact.deleted   Artifact was removed
edge.created       New relationship was created
edge.deleted       Relationship was removed

Tier-Based Capabilities

Event streaming capabilities vary by subscription tier:

Tiers: Free, Creator, Studio, Enterprise

  • Real-time streaming

  • Routing key filters

  • Kref glob filters

  • Event persistence: 1 hour, 24 hours, or 30 days, depending on tier

  • Cursor-based resume

  • Replay from buffer

  • Consumer groups

  • BYO Kafka bridge (✅ Pro)

Note: Features for the Creator tier and above are coming soon. Currently, only the Free tier (real-time streaming) is available.

Query Tier Capabilities

Check your tenant’s streaming capabilities at runtime:

from kumiho import get_event_capabilities

caps = get_event_capabilities()
print(f"Tier: {caps.tier}")
print(f"Supports replay: {caps.supports_replay}")
print(f"Supports cursor: {caps.supports_cursor}")
print(f"Max retention: {caps.max_retention_hours} hours")
print(f"Buffer size: {caps.max_buffer_size} events")

Resumable Streaming (Coming Soon - Creator+)

For Creator tier and above, you can resume from where you left off:

import kumiho

# Save cursor for recovery
last_cursor = None

for event in kumiho.event_stream(routing_key_filter="revision.*"):
    process_event(event)
    last_cursor = event.cursor  # Persist this
    
# Later, resume from last position
for event in kumiho.event_stream(cursor=last_cursor):
    process_event(event)

Replay from Beginning (Coming Soon - Creator+)

Replay all events in the buffer:

# Replay entire buffer (useful for initial sync)
for event in kumiho.event_stream(from_beginning=True):
    sync_to_local_db(event)

Feature-Gated Streaming

Write code that adapts to your tier:

import kumiho

caps = kumiho.get_event_capabilities()

if caps.supports_cursor:
    # Creator+ tier: use cursor for reliability
    cursor = load_saved_cursor()
    stream = kumiho.event_stream(cursor=cursor)
else:
    # Free tier: real-time only
    stream = kumiho.event_stream()

for event in stream:
    process_event(event)
    if event.cursor:
        save_cursor(event.cursor)
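
The example above leaves load_saved_cursor() and save_cursor() undefined. One possible file-based version is sketched here. Both functions and the file location are illustrative assumptions rather than SDK features, and a database or key-value store would work equally well.

```python
import json
import os
import tempfile

# Hypothetical default location for the saved cursor.
CURSOR_PATH = os.path.join(tempfile.gettempdir(), "kumiho_event_cursor.json")


def save_cursor(cursor, path=CURSOR_PATH):
    """Write the cursor atomically so a crash never leaves a partial file."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump({"cursor": cursor}, handle)
    # os.replace swaps the finished file into place in one step.
    os.replace(tmp_path, path)


def load_saved_cursor(path=CURSOR_PATH):
    """Return the stored cursor, or None when nothing has been saved yet."""
    try:
        with open(path) as handle:
            return json.load(handle)["cursor"]
    except FileNotFoundError:
        return None
```

Returning None when no cursor has been saved lets a first run start without special handling, provided event_stream() treats cursor=None like an omitted cursor, which is an assumption here.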

Use Cases by Tier

Free Tier (Available Now):

  • Live dashboard updates

  • Real-time notifications during active sessions

  • Development and testing

Creator Tier (Coming Soon):

  • Overnight batch processing with morning resume

  • Intermittent connectivity scenarios

  • Small team collaboration with reliable delivery

Studio Tier (Coming Soon):

  • Integration with existing Kafka pipelines (BYO Kafka)

  • Render farm job triggering

  • Data warehouse ingestion

Enterprise Tier (Coming Soon):

  • Mission-critical production pipelines

  • Parallel processing with consumer groups

  • Full audit trail with 30-day retention

Next Steps