Card Data and Collection Streaming Architecture

This document describes the foundational architecture for streaming card data and collection information between the Commonplace server and client, covering the core data model, reference management, and basic collection operations before extending to advanced spatial features.

Core Data Model

Card Identity and References

Every card in Commonplace has a globally unique identity that remains constant throughout its lifetime. Each card exists as a single instance per user that can be referenced from multiple collections simultaneously.

Card Instance: The actual card data with a unique ID, containing fields, metadata, and version information. Card instances exist independently of any collection, although each is always referenced from at least one collection; when a card's reference count drops to zero, the card instance is deleted.

Card Reference: A pointer to a card instance that includes context-specific information such as schema_id (for efficient client-side caching), position within a collection, local metadata, and collection-specific overrides.

Reference Counting: The system tracks how many collections reference each card. For user convenience, when the last reference is removed, the card may move to a "Recently Deleted" collection rather than being immediately destroyed.
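The reference-counting behavior above can be sketched as follows. This is a minimal illustration, not the Commonplace implementation; the names `CardStore`, `CardInstance`, and `RECENTLY_DELETED` are assumptions made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class CardInstance:
    """The single per-user instance of a card (illustrative shape)."""
    card_id: str
    fields: dict = field(default_factory=dict)
    version: int = 1

# Hypothetical sentinel collection id for the "Recently Deleted" behavior.
RECENTLY_DELETED = "recently-deleted"

class CardStore:
    def __init__(self):
        self.cards: dict[str, CardInstance] = {}
        self.refs: dict[str, set] = {}  # card_id -> referencing collection ids

    def add_reference(self, card_id: str, collection_id: str) -> None:
        self.refs.setdefault(card_id, set()).add(collection_id)

    def remove_reference(self, card_id: str, collection_id: str) -> None:
        holders = self.refs.get(card_id, set())
        holders.discard(collection_id)
        # When the last ordinary reference disappears, park the card in
        # "Recently Deleted" rather than destroying it immediately.
        if not holders - {RECENTLY_DELETED}:
            holders.add(RECENTLY_DELETED)
```

The key design point is that removal of the final reference is a soft transition, preserving user data until the "Recently Deleted" collection itself is purged.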

Collection Structure

Collections are containers that organize card references rather than containing the cards themselves. This reference-based approach enables several key capabilities:

  • The same card can appear in multiple collections without duplication
  • Changes to a card can propagate to all collections containing it
  • Collections can be reorganized without affecting the underlying card data
  • Cards maintain their identity and history across collection changes

Collection Metadata: Each collection has its own properties including name, description, display mode preferences, sorting rules, and access permissions.

Reference Metadata: Each card reference within a collection includes schema_id (for client-side schema caching), position or ordering information (for spatial mode and for manually ordered list mode, respectively), display overrides, and collection-specific annotations.
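A card reference carrying this metadata might look like the following sketch; the field names are illustrative assumptions, not a fixed wire format.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class CardReference:
    card_id: str
    schema_id: str                       # enables client-side schema caching
    position: Optional[tuple] = None     # (x, y) for spatial mode
    sort_key: Optional[int] = None       # for manually ordered list mode
    overrides: dict = field(default_factory=dict)    # display overrides
    annotations: dict = field(default_factory=dict)  # collection-specific notes
```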

Card and Collection Links: Links between cards, or between a card and a child collection, are scoped to the collection in which they are defined.

Basic Collection Operations

Collection Listing and Navigation

The client needs efficient ways to browse collections and understand their structure:

Collection Hierarchy: Collections can contain other collections, creating navigable hierarchies. The client receives collection structure information separately from card data.

Collection Summaries: Basic collection information including name, card count, last modified date, and preview thumbnails loads quickly for navigation purposes.

Lazy Collection Loading: Full collection contents load only when the user navigates into a collection, not when browsing the collection hierarchy.
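One way to realize lazy loading is to pair the cheap summary with a deferred fetch, as in this sketch (the `LazyCollection` name and callback shape are assumptions):

```python
class LazyCollection:
    """Summary loads eagerly; full contents load on first navigation."""

    def __init__(self, summary: dict, fetch_contents):
        self.summary = summary        # name, card count, thumbnail, etc.
        self._fetch = fetch_contents  # called only when contents are needed
        self._contents = None

    @property
    def contents(self):
        if self._contents is None:    # full load deferred until first access
            self._contents = self._fetch()
        return self._contents
```

Browsing the hierarchy touches only `summary`; the server round trip for card references happens once, when the user actually opens the collection.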

Card Listing Within Collections

When a user opens a collection, the system provides card information in stages:

Initial Card List: The server sends a list of card references including schema_id and basic metadata (title, schema type, modification date) without full field data. This enables efficient client-side schema caching.

Pagination and Ranges: Large collections are divided into pages or ranges. The client requests specific ranges based on the current view mode and user navigation.

Sort and Filter Operations: Most sorting and filtering operations execute on the server to avoid transferring unnecessary data. The server maintains sorted indexes and can quickly return filtered results.
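The server-side filter/sort/paginate pipeline can be expressed as a single query function; this is a simplified in-memory sketch (a real server would use its indexes rather than scanning):

```python
def query_range(refs, *, predicate=None, sort_key=None, offset=0, limit=50):
    """Filter, sort, and page on the server so the client receives one page."""
    rows = [r for r in refs if predicate is None or predicate(r)]
    if sort_key is not None:
        rows.sort(key=sort_key)
    return rows[offset:offset + limit]
```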

Schema-Aware Loading Pattern

Efficient Schema Caching

The streaming architecture incorporates schema-aware loading for optimal performance:

Schema-Aware References: Card references include schema_id to enable efficient client-side schema caching and reduce redundant schema requests.

Client Schema Cache: Clients maintain a local cache of schema definitions, checking for missing schemas when loading collections and batch requesting unknown schemas.

Batch Schema Requests: Clients can request multiple schemas simultaneously, reducing network round trips when loading collections with diverse card types.

Schema Loading Flow:

  1. Client requests collection data
  2. Server returns card references with schema_id included
  3. Client identifies missing schemas from local cache
  4. Client batch requests unknown schemas before loading card content
  5. Client updates schema cache and loads cards with known schemas
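Steps 3 through 5 of this flow can be sketched as a client-side cache that batches its misses; the `SchemaCache` name and `fetch_batch` callback are assumptions for illustration:

```python
class SchemaCache:
    """Resolve schema ids locally; batch-fetch only the missing ones."""

    def __init__(self, fetch_batch):
        self._schemas = {}             # schema_id -> definition
        self._fetch_batch = fetch_batch  # one round trip for many schemas

    def ensure(self, schema_ids):
        missing = sorted(set(schema_ids) - self._schemas.keys())
        if missing:                    # steps 3-4: identify and batch-request
            self._schemas.update(self._fetch_batch(missing))
        # step 5: all requested schemas are now cached
        return {sid: self._schemas[sid] for sid in schema_ids}
```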

Server-Side Query Processing

Collection Indexing

The server maintains multiple indexes for each collection to support efficient queries:

Ordered Indexes: Pre-computed sort orders for common fields like creation date, modification date, title, and custom schema fields.

Filter Indexes: Inverted indexes for tags, schema types, field values, and other filterable attributes.

Search Indexes: Full-text search indexes for card content, supporting complex queries across card fields.

Spatial Indexes: For spatial mode, geometric indexes track card positions and enable efficient area-based queries.
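As a concrete example of the filter-index idea, an inverted index maps each attribute value to the set of cards carrying it; intersecting those sets answers multi-attribute queries without scanning. A minimal sketch:

```python
from collections import defaultdict

class FilterIndex:
    """Inverted index: attribute value -> set of card ids."""

    def __init__(self):
        self._index = defaultdict(set)

    def add(self, card_id, values):
        for v in values:
            self._index[v].add(card_id)

    def find(self, *values):
        # Intersection of per-value sets answers an AND query.
        sets = [self._index[v] for v in values]
        return set.intersection(*sets) if sets else set()
```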

Query Execution

When the client requests collection data, the server processes queries efficiently:

Query Planning: The server analyzes filter, sort, and range parameters to determine the most efficient execution plan.

Index Utilization: Queries use appropriate indexes to minimize data scanning and processing time.

Result Streaming: Large result sets are streamed to the client in chunks rather than loaded entirely into memory.

Cache Integration: Frequently accessed query results are cached to improve response times for repeated operations.
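Result streaming, in its simplest form, is a generator that yields fixed-size chunks so no full result set is ever materialized for the client at once:

```python
def stream_results(rows, chunk_size=100):
    """Yield query results in manageable chunks rather than one payload."""
    for start in range(0, len(rows), chunk_size):
        yield rows[start:start + chunk_size]
```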

Client-Server Communication Protocol

Request Types

The client sends various types of requests to retrieve and modify collection data:

Collection Structure Requests: Retrieve collection hierarchy, metadata, and basic organization information.

Card Range Requests: Request specific ranges of cards within a collection, with optional sorting and filtering.

Individual Card Requests: Fetch complete data for specific cards, typically triggered by user interaction. Schema information is already available from the initial reference load.

Schema Batch Requests: Fetch multiple schema definitions efficiently when the client identifies missing schemas from collection references.

Search Requests: Perform text searches or complex queries across collection contents.

Update Requests: Send card modifications, position changes, and collection structure updates to the server.

Response Streaming

The server responds with structured data optimized for client consumption:

Chunked Responses: Large collections are sent in manageable chunks to maintain responsiveness.

Progressive Detail: Card information is sent at different detail levels based on immediate display needs. Schema information is included in references to enable proper deserialization at all detail levels.

Delta Updates: When collections change, only the differences are transmitted rather than complete collection data.

Batch Operations: Multiple related operations are batched together to reduce network overhead.
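A delta update can be computed by comparing the reference set (with versions) the client holds against the server's current state; only the differences are transmitted. A sketch, assuming references are keyed by card id with a version number:

```python
def collection_delta(old_refs: dict, new_refs: dict) -> dict:
    """old_refs/new_refs map card_id -> version; return only the differences."""
    return {
        "added": sorted(set(new_refs) - set(old_refs)),
        "removed": sorted(set(old_refs) - set(new_refs)),
        "changed": sorted(cid for cid in set(new_refs) & set(old_refs)
                          if new_refs[cid] != old_refs[cid]),
    }
```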

Card Data Synchronization

Client-Side State Management

The client maintains local state for viewed collections and cards:

Collection State: Current sort order, filter settings, scroll position, and selection state.

Card Cache: Local copies of card data at various detail levels, managed with appropriate eviction policies.

Schema Cache: Local copies of schema definitions cached separately from card data, with longer retention since schemas change less frequently than card content.

Dirty State Tracking: Modified cards are marked as dirty and queued for synchronization with the server.
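Dirty tracking amounts to a coalescing queue: repeated edits to the same card collapse into one pending entry, and synchronization drains the whole batch. A minimal sketch (the `DirtyTracker` name is illustrative):

```python
class DirtyTracker:
    """Queue modified cards for sync, coalescing repeated edits per card."""

    def __init__(self):
        self._dirty = {}  # card_id -> latest local edit

    def mark(self, card_id, edit):
        self._dirty[card_id] = edit  # later edits supersede earlier ones

    def drain(self):
        """Take everything queued for synchronization in one batch."""
        batch, self._dirty = self._dirty, {}
        return batch
```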

Conflict Resolution: The system handles cases where the same card is modified simultaneously by multiple clients. CRDT algorithms are under consideration as a way to ensure consistency and avoid conflicts.

Update Propagation

When cards are modified, changes flow through the system efficiently:

Local Updates: The client immediately updates the local display while sending changes to the server asynchronously.

Server Validation: The server validates updates, handles conflicts, and persists changes to storage.

Change Notification: Other clients viewing the same collection receive notifications about changes.

Version Management: Each update creates a new card version, maintaining the complete change history.
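Version management with full history can be sketched as an append-only list of card states, where each update produces a new version number; the `VersionedCard` shape is an assumption for illustration:

```python
class VersionedCard:
    """Append-only version history: every update creates a new version."""

    def __init__(self, fields: dict):
        self.history = [dict(fields)]  # version 1

    @property
    def current(self) -> dict:
        return self.history[-1]

    def update(self, **changes) -> int:
        self.history.append({**self.current, **changes})
        return len(self.history)       # the new version number
```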

Collection Display Modes

List Mode Implementation

List mode provides traditional table-based viewing with rich interaction capabilities:

Virtual Scrolling: Only visible rows are rendered, with placeholder elements for off-screen content.
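The core of virtual scrolling is computing which row indices fall inside the viewport (plus a small overscan margin); everything outside that window stays a placeholder. A sketch assuming uniform row heights:

```python
def visible_range(scroll_top, viewport_height, row_height,
                  total_rows, overscan=5):
    """Return (first, last) row indices to render; the rest are placeholders."""
    first = max(0, scroll_top // row_height - overscan)
    last = min(total_rows,
               (scroll_top + viewport_height) // row_height + overscan + 1)
    return first, last
```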

Dynamic Column Management: Columns can be shown, hidden, resized, and reordered based on user preferences.

In-Place Editing: Simple field modifications can be performed directly within the list view.

Bulk Operations: Multiple cards can be selected for batch operations like tagging, moving, or deletion.

Summary Mode Implementation

Summary mode provides compact collection overviews and dashboard-style displays:

Aggregated Data: The server pre-computes summary statistics and sends aggregated information rather than individual card details.
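Server-side aggregation for summary mode might look like the following sketch; the statistics chosen here (count, per-schema breakdown, latest modification) are illustrative:

```python
from collections import Counter

def summarize(cards):
    """Pre-compute summary statistics instead of sending individual cards."""
    return {
        "count": len(cards),
        "by_schema": dict(Counter(c["schema_id"] for c in cards)),
        "last_modified": max((c["modified"] for c in cards), default=None),
    }
```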

Custom Visualizations: Collections can define custom summary displays using schema-attached rendering code.

Drill-Down Capability: Users can navigate from summary views to detailed list or spatial views seamlessly.

Real-Time Updates: Summary displays update automatically as underlying collection data changes.

Search and Filtering Architecture

Server-Side Search Processing

Search operations are primarily executed on the server for efficiency:

Query Parsing: Complex search queries are parsed and validated on the server.

Index Utilization: Multiple indexes can be combined to execute complex searches efficiently.

Relevance Ranking: Search results are ranked by relevance and returned in order of importance.

Faceted Search: The system supports filtering by multiple attributes simultaneously.

Client-Side Search Experience

The client provides responsive search interfaces:

Incremental Search: Search results update as the user types, with appropriate debouncing.
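Debouncing can be sketched as a trailing-edge timer: each keystroke records the latest query and resets a deadline, and the search fires only once the deadline passes with no further input. Time is injected here so the logic is deterministic; the class name is an assumption:

```python
class SearchDebouncer:
    """Trailing-edge debounce; `now` is passed in for testability."""

    def __init__(self, delay, search_fn):
        self.delay = delay
        self.search_fn = search_fn
        self._pending = None
        self._deadline = None

    def on_keystroke(self, query, now):
        self._pending = query
        self._deadline = now + self.delay  # each keystroke resets the timer

    def tick(self, now):
        if self._pending is not None and now >= self._deadline:
            query, self._pending = self._pending, None
            self.search_fn(query)          # only the final query runs
```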

Search History: Previous searches are remembered and can be quickly re-executed.

Saved Searches: Complex queries can be saved and reused for recurring analysis.

Search Suggestions: The system suggests completions and corrections for search queries.

Performance Considerations

Network Efficiency

The architecture minimizes network usage through several strategies:

Schema Caching: Client-side schema caching eliminates redundant schema requests. Schemas are cached separately from card data since they change less frequently, providing significant bandwidth savings for collections with many cards using the same schemas.

Batch Schema Requests: When multiple unknown schemas are discovered, they are requested in a single batch operation rather than individual requests, reducing network round trips.

Compression: All data transfers use appropriate compression algorithms.

Connection Reuse: Connections are maintained for ongoing communication.

Request Coalescing: Multiple small requests are combined into larger batch operations.

Conditional Requests: The client sends version information to avoid redundant data transfer.
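A conditional request reduces to comparing the versions the client already holds against the server's current versions and transferring only the stale cards; a minimal sketch with an assumed `fetch_card` callback:

```python
def fetch_if_modified(client_versions, server_versions, fetch_card):
    """Client sends known versions; only stale cards are transferred."""
    stale = [cid for cid, v in server_versions.items()
             if client_versions.get(cid) != v]
    return {cid: fetch_card(cid) for cid in stale}
```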

Server Scalability

The server architecture supports scaling to large collections and many concurrent users:

Index Optimization: Database indexes are carefully designed and maintained for query performance.

Query Caching: Frequently executed queries are cached at multiple levels.

Horizontal Scaling: The architecture supports distributing load across multiple server instances.

Background Processing: Expensive operations like reindexing are performed asynchronously.

Data Consistency and Synchronization

Consistency Models

The system provides appropriate consistency guarantees for different operations:

Strong Consistency: Critical operations like card creation and deletion require immediate consistency.

Eventual Consistency: Display updates and non-critical operations can use eventual consistency for better performance.

Conflict Resolution: The system handles concurrent modifications gracefully with clear resolution policies.

Transaction Support: Related operations are grouped into transactions to maintain data integrity.

Real-Time Updates

Changes propagate to connected clients in real-time:

Event Streaming: The server sends change notifications to interested clients immediately.

Selective Updates: Clients only receive notifications for collections and cards they are currently viewing.
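Selective updates follow a publish/subscribe pattern keyed by collection: a client registers interest in the collections it is viewing, and the server routes each change event only to those subscribers. A sketch (the `ChangeNotifier` name is illustrative):

```python
from collections import defaultdict

class ChangeNotifier:
    """Route change events only to clients watching the affected collection."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # collection_id -> callbacks

    def subscribe(self, collection_id, callback):
        self._subscribers[collection_id].append(callback)

    def publish(self, collection_id, event):
        # Clients not viewing this collection receive nothing.
        for callback in self._subscribers[collection_id]:
            callback(event)
```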

Update Batching: Rapid changes are batched together to avoid overwhelming client displays.

Offline Support: The system handles temporary network disconnections gracefully with local queuing.

This foundational streaming architecture provides the basis for all collection interactions in Commonplace, from simple list browsing to complex spatial arrangements, ensuring efficient data transfer and responsive user experiences across all collection modes and sizes.