# Core Concepts

This guide explains the core concepts of Kumiho Cloud and how they relate to the Python SDK.

## Graph-Native Architecture

Kumiho Cloud is built on a **graph database (Neo4j)**, which means relationships between assets are first-class citizens. Unlike traditional file-based systems, Kumiho tracks:

- **Dependencies**: What assets does this asset depend on?
- **Lineage**: What was this asset created from?
- **Usage**: Where is this asset used?

```
┌───────────────────────────────────────────────────────┐
│                        PROJECT                        │
│  ┌─────────────────────────────────────────────────┐  │
│  │                      SPACE                      │  │
│  │  ┌─────────┐     ┌─────────┐     ┌─────────┐    │  │
│  │  │  ITEM   │────▶│  ITEM   │────▶│  ITEM   │    │  │
│  │  └────┬────┘     └────┬────┘     └────┬────┘    │  │
│  │       │               │               │         │  │
│  │  ┌────▼────┐     ┌────▼────┐     ┌────▼────┐    │  │
│  │  │REVISION │     │REVISION │     │REVISION │    │  │
│  │  │   v1    │     │   v1    │     │   v1    │    │  │
│  │  └────┬────┘     └────┬────┘     └────┬────┘    │  │
│  │       │               │               │         │  │
│  │  ┌────▼────┐     ┌────▼────┐     ┌────▼────┐    │  │
│  │  │ARTIFACT │     │ARTIFACT │     │ARTIFACT │    │  │
│  │  └─────────┘     └─────────┘     └─────────┘    │  │
│  └─────────────────────────────────────────────────┘  │
└───────────────────────────────────────────────────────┘
```

## Entity Hierarchy

### Project

A **Project** is the top-level container representing a production, show, or workspace. Each project is isolated within a tenant's graph database.

```python
import kumiho

project = kumiho.create_project(
    name="sci-fi-short",
    description="Sci-fi short film VFX assets"
)
```

**Key attributes:**

- `name`: URL-safe identifier (used in Kref URIs)
- `description`: Human-readable description
- `kref`: Reference URI (`kref://sci-fi-short`)

### Space

A **Space** organizes assets within a project.
Common groupings include:

- By type: `characters`, `environments`, `props`
- By episode: `ep01`, `ep02`
- By department: `modeling`, `animation`, `lighting`

```python
space = project.create_space("characters")
```

**Key attributes:**

- `name`: URL-safe identifier
- `path`: Full path in the hierarchy (e.g., `/sci-fi-short/characters`)
- `metadata`: Custom key-value metadata

### Item

An **Item** represents a single creative asset or AI artifact. Each item has a `kind` that indicates what type of asset it is.

```python
item = space.create_item(
    item_name="hero-robot",
    kind="model"
)
```

**Common item kinds:**

- `model`: 3D models
- `texture`: Textures and materials
- `animation`: Animation data
- `rig`: Character rigs
- `composite`: Compositing setups
- `ai_model`: Trained AI models
- `dataset`: Training datasets
- `prompt`: AI prompts

**Key attributes:**

- `name`: URL-safe identifier
- `kind`: Category of the asset
- `kref`: Reference URI (`kref://sci-fi-short/characters/hero-robot.model`)

### Revision

A **Revision** represents a specific iteration of an item. Revisions are immutable once published.

```python
# Simple revision creation
revision = item.create_revision()

# Revision with metadata
revision = item.create_revision(
    metadata={
        "artist": "jane",
        "render_engine": "arnold",
        "notes": "Added facial rigging for dialogue"
    }
)
```

**Key attributes:**

- `number`: Auto-incrementing revision number
- `metadata`: Custom key-value metadata
- `kref`: Reference URI (`kref://sci-fi-short/characters/hero-robot.model?r=3`)
- `tags`: List of tags (e.g., "latest", "published", "approved")

### Artifact

An **Artifact** is a file reference attached to a revision. Artifacts store metadata about files without uploading the actual data; the files stay on your local or NAS storage.

```python
artifact = revision.create_artifact(
    name="hero_robot_v3.fbx",
    location="smb://studio-nas/projects/scifi/hero_robot_v3.fbx"
)

# Add metadata after creation
artifact.set_metadata({
    "size": "52428800",
    "checksum": "sha256:a1b2c3...",
    "format": "fbx"
})
```

**Key attributes:**

- `name`: Artifact identifier within the revision
- `location`: URI pointing to the actual file
- `metadata`: Custom key-value metadata

### Edge

An **Edge** represents a relationship between revisions. Edges enable lineage tracking.

```python
import kumiho

# Get the target revision
texture = kumiho.get_revision("kref://sci-fi-short/textures/metal.texture?r=2")

# Create edge with optional metadata
edge = revision.create_edge(
    target_revision=texture,
    edge_type=kumiho.DEPENDS_ON,
    metadata={"usage": "body material"}
)
```

**Edge types:**

- `kumiho.DEPENDS_ON`: This revision depends on the target
- `kumiho.DERIVED_FROM`: This revision was derived from the target
- `kumiho.REFERENCED`: This revision references the target
- `kumiho.CONTAINS`: This revision contains the target

**Querying edges by direction:**

```python
# Get outgoing edges (default) - edges FROM this revision
outgoing = revision.get_edges(direction=kumiho.OUTGOING)

# Get incoming edges - edges TO this revision
incoming = revision.get_edges(direction=kumiho.INCOMING)

# Get all edges in both directions
all_edges = revision.get_edges(direction=kumiho.BOTH)
```

### Graph Traversal

Kumiho provides graph traversal methods for dependency analysis:

```python
# Find all dependencies (what this revision depends on)
deps = revision.get_all_dependencies(max_depth=5)
for kref in deps.revision_krefs:
    print(f"Depends on: {kref.uri}")

# Find all dependents (what depends on this revision)
dependents = revision.get_all_dependents(max_depth=5)
for kref in dependents.revision_krefs:
    print(f"Depended on by: {kref.uri}")

# Find shortest path between revisions
path = source_revision.find_path_to(target_revision)
if path:
    print(f"Path length: {path.total_depth}")
    for step in path.steps:
        print(f"  -> {step.revision_kref.uri} via {step.edge_type}")

# Analyze impact of changes (what would be affected)
impact = revision.analyze_impact()
for impacted in impact:
    print(f"Would affect: {impacted.revision_kref.uri} at depth {impacted.impact_depth}")
```

### Bundle

A **Bundle** is a special item kind that aggregates other items. Bundles are useful for grouping related assets together (e.g., a character bundle with model, textures, and rig), and they keep a full revision-based audit trail of membership changes.

```python
# Create a bundle via Project or Space
bundle = project.create_bundle("release-bundle")

# Or from a space
assets = project.get_space("assets")
char_bundle = assets.create_bundle("character-bundle")

# Add items to the bundle
hero_model = assets.get_item("hero", "model")
hero_rig = assets.get_item("hero", "rig")
bundle.add_member(hero_model)
bundle.add_member(hero_rig)

# Get current members
members = bundle.get_members()
for member in members:
    print(f"  {member.item_kref}")

# View change history (audit trail)
history = bundle.get_history()
for entry in history:
    print(f"v{entry.revision_number}: {entry.action} {entry.member_item_kref}")
```

**Key characteristics:**

- Bundles use the reserved item kind `"bundle"`
- Cannot be created via `create_item()` (use `create_bundle()`)
- Each membership change (add/remove) creates a new bundle revision
- Full audit trail: who added/removed what item and when
- Can query members at any specific revision in history

## Metadata

All node types support custom metadata as key-value string pairs. Metadata can be set during creation (where supported) or updated afterward.
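
Because metadata values are strings, numbers, flags, and lists have to be converted before they are stored. The helper below is a minimal sketch of one way to do that. `to_metadata_strings` is a hypothetical name rather than part of the Kumiho SDK, and encoding nested values as JSON text is an assumption about what a pipeline might want.

```python
import json


def to_metadata_strings(values):
    """Convert arbitrary values to a string-to-string map.

    Hypothetical helper for illustration; not part of the Kumiho SDK.
    """
    result = {}
    for key, value in values.items():
        if value is None:
            continue  # Skip unset fields rather than storing the text "None"
        if isinstance(value, str):
            result[str(key)] = value
        elif isinstance(value, bool):
            result[str(key)] = "true" if value else "false"
        elif isinstance(value, (dict, list, tuple)):
            # Assumption: nested values are stored as compact JSON text
            result[str(key)] = json.dumps(value, separators=(",", ":"))
        else:
            result[str(key)] = str(value)
    return result


metadata = to_metadata_strings({
    "size": 52428800,
    "approved": True,
    "frame_range": [1, 100],
    "notes": None,
})
print(metadata)
```

The resulting dictionary has only string keys and values, so it could be passed wherever the SDK accepts a metadata map.
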

### Setting Metadata

```python
# During revision creation
revision = item.create_revision(metadata={
    "artist": "jane",
    "render_engine": "arnold",
    "frame_range": "1-100"
})

# During edge creation
edge = revision.create_edge(
    target_revision=texture,
    edge_type=kumiho.DEPENDS_ON,
    metadata={"channel": "diffuse"}
)

# Update metadata after creation (all node types)
space.set_metadata({"department": "modeling"})
item.set_metadata({"status": "approved"})
revision.set_metadata({"published_by": "supervisor"})
artifact.set_metadata({"checksum": "sha256:..."})
```

### Granular Attribute Operations

For updating individual metadata fields without replacing the entire map:

```python
# Set a single attribute
space.set_attribute("department", "modeling")
revision.set_attribute("status", "approved")

# Get a single attribute
dept = space.get_attribute("department")  # Returns "modeling" or None

# Delete a single attribute
space.delete_attribute("old_field")
```

This is more efficient than `set_metadata()` when you only need to change one field.
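
The difference between the two styles can be pictured with a plain dictionary. The class below is an illustrative stand-in, not the SDK implementation. `FakeNode` is hypothetical, and it assumes that `set_metadata()` replaces the whole map, as the description above suggests.

```python
class FakeNode:
    """In-memory stand-in for a node with metadata (hypothetical, illustration only)."""

    def __init__(self):
        self.metadata = {}

    def set_metadata(self, metadata):
        # Assumption: the whole map is replaced, not merged
        self.metadata = dict(metadata)

    def set_attribute(self, key, value):
        self.metadata[key] = value

    def get_attribute(self, key):
        return self.metadata.get(key)

    def delete_attribute(self, key):
        self.metadata.pop(key, None)


node = FakeNode()
node.set_metadata({"department": "modeling", "supervisor": "sam"})

# Changing one field leaves the other keys untouched
node.set_attribute("department", "lighting")
print(node.metadata)

# Replacing the map drops any key that is not passed again
node.set_metadata({"department": "animation"})
print(node.get_attribute("supervisor"))
```

Under that assumption, a caller who only wants to change `department` would have to re-send every other key with `set_metadata()`, which is the case the granular operations avoid.
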

### Common Metadata Patterns

| Node Type | Common Keys |
|-----------|-------------|
| Space | `department`, `supervisor`, `deadline` |
| Item | `status`, `priority`, `assigned_to` |
| Revision | `artist`, `render_engine`, `notes`, `software_version` |
| Artifact | `size`, `checksum`, `format`, `resolution` |
| Edge | `usage`, `channel`, `relationship_notes` |

## Kref URI Scheme

Kumiho uses **Kref URIs** as universal identifiers for all entities:

```
kref://project/space/item.kind?r=revision&a=artifact
```

| URI | Resolves To |
|-----|-------------|
| `kref://my-project` | Project |
| `kref://my-project/chars` | Space |
| `kref://my-project/chars/human` | Sub-Space(s) |
| `kref://my-project/chars/human/hero.model` | Item (latest revision) |
| `kref://my-project/chars/human/hero.model?r=2` | Specific revision |
| `kref://my-project/chars/human/hero.model?r=2&a=mesh` | Specific artifact |
| `kref://my-project/chars/human/hero.model?t=published` | Latest published revision |
| `kref://my-project/chars/human/hero.model?t=published&a=mesh` | Latest published revision with specific artifact |
| `kref://my-project/chars/human/hero.model?t=published&time=202406011330&a=mesh` | Published revision at specific time with specific artifact |

## Time-Based Revision Queries

One of Kumiho's most powerful features is **time-based revision lookup**. This enables reproducible builds, historical debugging, and auditing by answering questions like: "What was the published version of this asset on June 1st?"

### Why Time-Based Queries Matter

In production pipelines, you often need to:

1. **Reproduce past renders**: Re-render a shot exactly as it was delivered months ago
2. **Debug regressions**: Compare current assets against a known-good state from a specific date
3. **Audit changes**: Understand what version was used when a decision was made
4. **Compliance**: Prove what asset versions were in use at a particular milestone

Without time-based queries, you'd need to manually track revision numbers for every asset at every milestone, which is error-prone and tedious.

### Using `get_revision_by_time`

The SDK provides `get_revision_by_time()` to find the revision that carried a specific tag at a given point in time:

```python
from datetime import datetime, timezone

# Get the "published" revision as of June 1st, 2024
june_1 = datetime(2024, 6, 1, tzinfo=timezone.utc)
revision = item.get_revision_by_time(
    time=june_1,
    tag="published"
)
print(f"On {june_1}, published revision was r{revision.number}")
```

The `time` parameter accepts multiple formats:

- **datetime object**: `datetime(2024, 6, 1, 13, 30, 45, tzinfo=timezone.utc)` - full precision
- **ISO 8601 string**: `"2024-06-01T13:30:45+00:00"` or `"2024-06-01T13:30:45Z"` - full precision
- **YYYYMMDDHHMM string**: `"202406011330"` - minute-level precision (rounded to end of minute)

For historical auditing where events may happen within the same minute, use datetime objects or ISO strings for sub-second precision.

Time-based lookup is especially useful for the `published` tag, which marks revisions as immutable and approved for downstream consumption.
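
To make the three accepted `time` formats concrete, the sketch below normalizes each of them to a timezone-aware UTC `datetime`. It is not the SDK's parser. `normalize_time` is a hypothetical helper, and it assumes that naive values are UTC and that "rounded to end of minute" means the last microsecond of that minute.

```python
from datetime import datetime, timezone


def normalize_time(value):
    """Normalize a datetime, ISO 8601 string, or YYYYMMDDHHMM string to aware UTC.

    Hypothetical helper for illustration; not part of the Kumiho SDK.
    """
    if isinstance(value, datetime):
        parsed = value
    elif len(value) == 12 and value.isdigit():
        # Minute-level precision: assume the query covers the whole minute
        parsed = datetime.strptime(value, "%Y%m%d%H%M").replace(
            second=59, microsecond=999999
        )
    else:
        # fromisoformat() only accepts a trailing "Z" from Python 3.11 onward
        iso = value[:-1] + "+00:00" if value.endswith("Z") else value
        parsed = datetime.fromisoformat(iso)
    if parsed.tzinfo is None:
        # Assumption: naive timestamps are interpreted as UTC
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


print(normalize_time("202406011330"))
print(normalize_time("2024-06-01T13:30:45Z"))
print(normalize_time(datetime(2024, 6, 1, 13, 30, 45, tzinfo=timezone.utc)))
```

Under these assumptions, a minute-level string resolves later than any ISO timestamp inside the same minute, which is why the full-precision formats are the safer choice when several tag changes happen close together.
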

### Time-Based Kref URIs

You can also use time-based queries directly in Kref URIs with the `t=` (tag) and `time=` parameters:

```python
# Get published revision at a specific time via Kref
# Format: YYYYMMDDHHMM (e.g., 202406011330 = June 1, 2024 at 13:30)
kref = "kref://my-project/chars/hero.model?t=published&time=202406011330"
revision = kumiho.get_revision(kref)

# Resolve to artifact location at that point in time
location = kumiho.resolve(kref)
```

**Kref time query parameters:**

| Parameter | Description |
|-----------|-------------|
| `t=` | Find revision with this tag (e.g., `t=published`, `t=approved`) |
| `time=` | Point in time to query (e.g., `time=202406011330`) |

When both `t=` and `time=` are provided, Kumiho finds the revision that:

1. Had the specified tag at the given time
2. Was the most recent such revision before or at that time

### Practical Examples

**Reproduce a past delivery:**

```python
from datetime import datetime, timezone

# Find all assets as they were for the Q2 delivery
delivery_date = datetime(2024, 6, 30, 23, 59, 59, tzinfo=timezone.utc)

for item in space.get_items():
    rev = item.get_revision_by_time(time=delivery_date, tag="published")
    if rev:
        print(f"{item.name}: r{rev.number}")
        for artifact in rev.get_artifacts():
            print(f"  -> {artifact.location}")
```

**Compare current vs historical:**

```python
from datetime import datetime, timezone

# What changed between two milestones?
alpha_date = datetime(2024, 3, 1, tzinfo=timezone.utc)
beta_date = datetime(2024, 6, 1, tzinfo=timezone.utc)

alpha_rev = item.get_revision_by_time(time=alpha_date, tag="published")
beta_rev = item.get_revision_by_time(time=beta_date, tag="published")

# Guard against dates at which nothing was published yet
if alpha_rev and beta_rev and alpha_rev.number != beta_rev.number:
    print(f"Asset changed from r{alpha_rev.number} to r{beta_rev.number}")
```

**Pipeline integration with timestamps:**

```python
# In a render farm job, record the exact time for reproducibility
import json
from datetime import datetime, timezone

import kumiho

render_manifest = {
    "render_time": datetime.now(timezone.utc).isoformat(),
    "assets": []
}

for item_kref in required_assets:
    item = kumiho.get_item(item_kref)
    rev = item.get_revision_by_tag("published")
    render_manifest["assets"].append({
        "kref": rev.kref,
        "revision": rev.number
    })

# Save the manifest so the render can be reproduced later
with open("render_manifest.json", "w") as f:
    json.dump(render_manifest, f)

# Later, reproduce using the recorded timestamp
with open("render_manifest.json") as f:
    manifest = json.load(f)

render_time = datetime.fromisoformat(manifest["render_time"])
# Convert to YYYYMMDDHHMM format for kref
time_str = render_time.strftime("%Y%m%d%H%M")

for asset in manifest["assets"]:
    # Get exactly what was published at render time
    kref = f"{asset['kref'].split('?')[0]}?t=published&time={time_str}"
    revision = kumiho.get_revision(kref)
```

### Tags and Time

The `published` tag is especially important for time-based queries because:

1. **Immutability**: Published revisions cannot be modified or deleted
2. **Stability**: Downstream consumers can rely on published revisions not changing
3. **Audit trail**: Tag history is preserved, so you can query what was published when

Other common tags for time-based queries:

- `approved`: Supervisor-approved versions
- `delivered`: Versions sent to clients
- `milestone-alpha`, `milestone-beta`: Project milestone snapshots

## BYO Storage Philosophy

Kumiho follows a **"Bring Your Own Storage"** philosophy:

1. **Files stay local**: Original files remain on your NAS, local disk, or on-prem storage
2. **Metadata in cloud**: Only paths, hashes, and relationships are stored in the cloud
3. **No vendor lock-in**: You can always access your files directly

```python
# Artifact location is just a URI - files aren't uploaded
artifact = revision.create_artifact(
    name="hero.fbx",
    location="file:///mnt/studio/projects/hero.fbx"  # File stays here
)
```

**Supported URI schemes:**

- `file://`: Local filesystem
- `smb://`: Windows/Samba shares
- `nfs://`: NFS mounts
- `s3://`: Amazon S3 (for hybrid setups)
- `gs://`: Google Cloud Storage (for hybrid setups)

## Multi-Tenant Architecture

Kumiho Cloud is a multi-tenant SaaS:

- **Tenant**: A studio or organization with its own isolated data
- **Region**: Geographic location of the data (e.g., `us-central`, `eu-west`)
- **Control Plane**: Global service for authentication and routing
- **Data Plane**: Regional servers with Neo4j databases

The SDK handles tenant resolution automatically:

```python
# SDK automatically routes to the correct region
kumiho.connect()  # Uses cached credentials and tenant info
```

## Event Streaming

Kumiho supports real-time event streaming for reactive workflows. Events are emitted whenever assets change, enabling live dashboards, automated pipelines, and integrations.

### Basic Usage

```python
import kumiho

# Stream all events
for event in kumiho.event_stream():
    print(f"{event.routing_key}: {event.kref}")

# Filter by routing key (wildcards supported)
for event in kumiho.event_stream(routing_key_filter="revision.*"):
    if event.action == "created":
        print(f"New revision: {event.kref}")

# Filter by kref pattern (glob syntax)
for event in kumiho.event_stream(kref_filter="kref://my-project/**/*.model"):
    print(f"Model changed: {event.kref}")
```

### Event Object

Each event contains:

| Attribute | Type | Description |
|-----------|------|-------------|
| `routing_key` | `str` | Event type (e.g., `revision.created`, `artifact.added`) |
| `kref` | `str` | Kref URI of the affected resource |
| `action` | `str` | Action performed (`created`, `updated`, `deleted`, `tagged`) |
| `timestamp` | `datetime` | When the event occurred |
| `metadata` | `dict` | Additional event metadata |
| `cursor` | `str` | Cursor for resumable streaming (Creator+ tiers) |

### Event Types (Routing Keys)

| Routing Key | Description |
|-------------|-------------|
| `space.created` | New space was created |
| `space.updated` | Space metadata was updated |
| `space.deleted` | Space was deleted |
| `item.created` | New item was created |
| `item.updated` | Item metadata was updated |
| `item.deleted` | Item was deleted |
| `revision.created` | New revision was created |
| `revision.tagged` | Revision was tagged (e.g., "published") |
| `revision.untagged` | Tag was removed from revision |
| `artifact.added` | Artifact was added to a revision |
| `artifact.deleted` | Artifact was removed |
| `edge.created` | New relationship was created |
| `edge.deleted` | Relationship was removed |

### Tier-Based Capabilities

Event streaming capabilities vary by subscription tier:

| Feature | Free | Creator | Studio | Enterprise |
|---------|------|---------|--------|------------|
| Real-time streaming | ✅ | ✅ | ✅ | ✅ |
| Routing key filters | ✅ | ✅ | ✅ | ✅ |
| Kref glob filters | ✅ | ✅ | ✅ | ✅ |
| Event persistence | ❌ | 1 hour | 24 hours | 30 days |
| Cursor-based resume | ❌ | ✅ | ✅ | ✅ |
| Replay from buffer | ❌ | ✅ | ✅ | ✅ |
| Consumer groups | ❌ | ❌ | ❌ | ✅ |
| BYO Kafka bridge | ❌ | ❌ | ✅ Pro | ✅ |

> **Note**: Features for the Creator tier and above are **Coming Soon**. Currently only the Free tier (real-time streaming) is available.

### Query Tier Capabilities

Check your tenant's streaming capabilities at runtime:

```python
from kumiho import get_event_capabilities

caps = get_event_capabilities()

print(f"Tier: {caps.tier}")
print(f"Supports replay: {caps.supports_replay}")
print(f"Supports cursor: {caps.supports_cursor}")
print(f"Max retention: {caps.max_retention_hours} hours")
print(f"Buffer size: {caps.max_buffer_size} events")
```

### Resumable Streaming (Coming Soon - Creator+)

For Creator tier and above, you can resume from where you left off:

```python
import kumiho

# Save cursor for recovery
last_cursor = None

for event in kumiho.event_stream(routing_key_filter="revision.*"):
    process_event(event)
    last_cursor = event.cursor  # Persist this

# Later, resume from last position
for event in kumiho.event_stream(cursor=last_cursor):
    process_event(event)
```

### Replay from Beginning (Coming Soon - Creator+)

Replay all events in the buffer:

```python
# Replay entire buffer (useful for initial sync)
for event in kumiho.event_stream(from_beginning=True):
    sync_to_local_db(event)
```

### Feature-Gated Streaming

Write code that adapts to your tier:

```python
import kumiho

caps = kumiho.get_event_capabilities()

if caps.supports_cursor:
    # Creator+ tier: use cursor for reliability
    cursor = load_saved_cursor()
    stream = kumiho.event_stream(cursor=cursor)
else:
    # Free tier: real-time only
    stream = kumiho.event_stream()

for event in stream:
    process_event(event)
    if event.cursor:
        save_cursor(event.cursor)
```

### Use Cases by Tier

**Free Tier** (Available Now):

- Live dashboard updates
- Real-time notifications during active sessions
- Development and testing

**Creator Tier** (Coming Soon):

- Overnight batch processing with morning resume
- Intermittent connectivity scenarios
- Small team collaboration with reliable delivery

**Studio Tier** (Coming Soon):

- Integration with existing Kafka pipelines (BYO Kafka)
- Render farm job triggering
- Data warehouse ingestion

**Enterprise Tier** (Coming Soon):

- Mission-critical production pipelines
- Parallel processing with consumer groups
- Full audit trail with 30-day retention

## Next Steps

- Try the [Getting Started](getting-started.md) tutorial
- Explore the [API Reference](api/kumiho.rst)
- Learn about [Authentication](authentication.md)