Architecture Overview

This page gives a high-level overview of how EclipseStore and the Eclipse Serializer work together, how data flows from Java objects to persistent storage, and which main components make up the system.

EclipseStore and Eclipse Serializer

The overall system consists of two main projects:

Eclipse Serializer (org.eclipse.serializer) — the serialization engine that converts Java objects to a compact binary format and back. It can be used standalone or as the foundation for storage.
EclipseStore (org.eclipse.store) — the object-graph storage layer built on top of the serializer. It provides persistence, transactions, lazy loading, and storage target management.

Relationship

Your Application

EclipseStore Storage
(StorageManager, Lazy Loading, Housekeeping)

Eclipse Serializer
(Binary Persistence, Type Handling, Type Dictionary)

Storage Targets / AFS
(File System, S3, Azure, SQL, Redis, Kafka…)

The Eclipse Serializer can be used independently of EclipseStore for pure serialization tasks (e.g., network communication, caching, data transfer). EclipseStore depends on the serializer for all persistence operations.

Data Flow

When you store an object, the following steps occur:

Object Graph Marking
        │
        ▼
    Serialization
        │
        ▼
  Storage Channel
        │
        ▼
   Storage Target
(File System, S3, SQL, etc.)

Object graph marking — the storage manager marks all objects reachable from the stored root object that need to be persisted
Serialization — the Eclipse Serializer converts each object to its binary representation using type handlers
Channel distribution — the binary data is distributed across storage channels for parallel I/O
Storage target write — each channel writes its data to the configured storage target through the Abstract File System (AFS) layer
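The marking step can be pictured with a small, self-contained sketch. It is not EclipseStore code: the Node class, the mark method, and the starting ID value are hypothetical, and plain JDK collections stand in for the real object registry. It only illustrates the idea of visiting every instance reachable from the stored object exactly once and giving it an ID.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a toy object graph and a traversal that "marks" reachable instances.
class GraphMarkingSketch {

    // Hypothetical entity type that references other entities.
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    // Walks the graph from 'root' and assigns each reachable instance an ID exactly once.
    // Instance identity (not equals) decides whether a node was already seen,
    // so cyclic graphs terminate.
    static Map<Node, Long> mark(Node root) {
        Map<Node, Long> ids = new IdentityHashMap<>();
        Deque<Node> pending = new ArrayDeque<>();
        long nextId = 1_000L; // arbitrary starting value chosen for this sketch
        pending.push(root);
        while (!pending.isEmpty()) {
            Node current = pending.pop();
            if (ids.containsKey(current)) {
                continue; // already marked
            }
            ids.put(current, nextId++);
            for (Node child : current.children) {
                pending.push(child);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        a.children.add(b);
        b.children.add(a); // cycle back to the root

        Map<Node, Long> ids = mark(a);
        System.out.println("marked instances: " + ids.size());
    }
}
```

An instance that is not reachable from the starting object never enters the map, which mirrors the statement that only reachable objects are persisted.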

When loading data, the process runs roughly in reverse:

Storage target read — the requested data is read from the storage target through the AFS layer
Deserialization — the serializer uses the type dictionary to resolve type IDs back to Java classes and instantiates objects without calling constructors
Reference resolution — object references (stored as internal object IDs) are resolved to their corresponding Java objects, restoring the original object graph structure
Lazy reference handling — references wrapped in Lazy are not resolved immediately but loaded on first access
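Reference resolution is easier to follow with a concrete, simplified model. The sketch below assumes a hypothetical StoredRecord type in which references are kept as object IDs, and rebuilds the graph in two passes: first one instance per record, then the links between instances. The class and method names are made up for illustration and do not reflect the serializer's internal API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: rebuilding an object graph from records that reference each other by ID.
class ReferenceResolutionSketch {

    // Hypothetical stored form of one object: its own ID, a value, and the IDs it references.
    static final class StoredRecord {
        final long objectId;
        final String name;
        final List<Long> referencedIds;

        StoredRecord(long objectId, String name, List<Long> referencedIds) {
            this.objectId = objectId;
            this.name = name;
            this.referencedIds = referencedIds;
        }
    }

    // Hypothetical in-memory type with real Java references.
    static final class Node {
        final String name;
        final List<Node> references = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    static Map<Long, Node> load(List<StoredRecord> records) {
        Map<Long, Node> byId = new HashMap<>();
        // Pass 1: create one instance per record so that every ID has a target.
        for (StoredRecord record : records) {
            byId.put(record.objectId, new Node(record.name));
        }
        // Pass 2: replace each stored ID with the instance registered for it.
        for (StoredRecord record : records) {
            Node owner = byId.get(record.objectId);
            for (Long referencedId : record.referencedIds) {
                owner.references.add(byId.get(referencedId));
            }
        }
        return byId;
    }

    public static void main(String[] args) {
        List<StoredRecord> records = Arrays.asList(
            new StoredRecord(1L, "order", Arrays.asList(2L)),
            new StoredRecord(2L, "customer", Arrays.asList(1L)) // cyclic reference
        );
        Map<Long, Node> graph = load(records);
        System.out.println(graph.get(1L).references.get(0).name);
    }
}
```

Splitting creation and linking into two passes is what lets cyclic references resolve without special handling in this sketch.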

Key Components

Storage Manager

The EmbeddedStorageManager is the main entry point for application code. It provides methods for:

Starting and shutting down the storage
Storing objects and object graphs
Issuing backups
Managing the storage lifecycle

See Storage for details.

Serializer

The Serializer converts Java objects to and from binary format. It manages type registration, type dictionaries, and type evolution.

See Serializer for details.

Storage Channels

Storage channels are parallel I/O workers that distribute the load of reading and writing data. Each channel manages its own set of storage files and entity cache. Increasing the channel count can improve throughput on systems with fast storage.
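One simple way to spread entities across channels is to derive the channel index from the object ID. The sketch below assumes exactly that, together with a power-of-two channel count so that a bit mask can replace the modulo operation. Both are assumptions made for illustration, and the method names are hypothetical; the actual assignment logic is internal to the storage engine.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: assigning object IDs to channels with a bit mask.
class ChannelDistributionSketch {

    // Returns the channel index for an object ID.
    // Assumes channelCount is a power of two, so (channelCount - 1) is a valid bit mask.
    static int channelIndex(long objectId, int channelCount) {
        if (channelCount <= 0 || Integer.bitCount(channelCount) != 1) {
            throw new IllegalArgumentException("channelCount must be a power of two: " + channelCount);
        }
        return (int) (objectId & (channelCount - 1));
    }

    // Groups the given object IDs into one bucket per channel.
    static List<List<Long>> distribute(long[] objectIds, int channelCount) {
        List<List<Long>> channels = new ArrayList<>();
        for (int i = 0; i < channelCount; i++) {
            channels.add(new ArrayList<>());
        }
        for (long objectId : objectIds) {
            channels.get(channelIndex(objectId, channelCount)).add(objectId);
        }
        return channels;
    }

    public static void main(String[] args) {
        long[] ids = {1000L, 1001L, 1002L, 1003L, 1004L};
        List<List<Long>> channels = distribute(ids, 4);
        for (int i = 0; i < channels.size(); i++) {
            System.out.println("channel " + i + ": " + channels.get(i));
        }
    }
}
```

Because each ID always maps to the same channel, a channel can own its files and cache without coordinating with the others, which is the property the paragraph above relies on.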

See Using Channels for details.

Abstract File System (AFS)

The AFS is an abstraction layer that decouples the storage engine from the physical storage medium. All storage targets (file system, S3, SQL, etc.) implement the AFS interface, making the storage engine agnostic to where data is physically stored.

See Storage Targets for the available implementations.

Housekeeping

The housekeeping process runs in the background to:

Garbage-collect unreachable objects (data that was deleted from the object graph)
Compact storage files by removing dead data
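The two housekeeping tasks can be approximated by a classic mark-and-sweep pass over stored records. The sketch below is a simplified model, not the engine's implementation: the store is a hypothetical in-memory map from object ID to the IDs it references, and "compaction" is reduced to removing dead entries from that map.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative only: mark-and-sweep over records that reference each other by object ID.
class HousekeepingSketch {

    // Mark phase: collects every ID reachable from the root ID.
    static Set<Long> reachable(Map<Long, List<Long>> store, long rootId) {
        Set<Long> live = new HashSet<>();
        Deque<Long> pending = new ArrayDeque<>();
        pending.push(rootId);
        while (!pending.isEmpty()) {
            Long id = pending.pop();
            if (!store.containsKey(id) || !live.add(id)) {
                continue; // unknown ID or already visited
            }
            for (Long referencedId : store.get(id)) {
                pending.push(referencedId);
            }
        }
        return live;
    }

    // Sweep phase: removes all records that were not marked and returns their IDs.
    static Set<Long> sweep(Map<Long, List<Long>> store, long rootId) {
        Set<Long> dead = new TreeSet<>(store.keySet());
        dead.removeAll(reachable(store, rootId));
        store.keySet().removeAll(dead);
        return dead;
    }

    public static void main(String[] args) {
        Map<Long, List<Long>> store = new HashMap<>();
        store.put(1L, Arrays.asList(2L));               // root -> 2
        store.put(2L, Collections.<Long>emptyList());
        store.put(3L, Arrays.asList(2L));               // no longer referenced by anything live
        System.out.println("removed: " + sweep(store, 1L));
    }
}
```

Note that record 3 is removed even though it still points at a live record: what matters is whether something reachable from the root points at it, not what it points to.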

See Housekeeping for details.

Storage Root

The storage root is the entry point of the persisted object graph. Every object reachable from the root — directly or transitively — is part of the stored data. On startup, the storage manager loads the root instance and makes the entire graph accessible to the application.

See Root Instances for details.

Lazy Loading

Lazy references allow parts of the object graph to remain unloaded until they are actually accessed. A Lazy reference holds an object ID internally and only fetches and deserializes the referenced object on the first call to get(). This keeps memory usage low for large data sets while still providing transparent access.
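The mechanism described above can be sketched in a few lines of plain Java. The LazyRef class and its loader callback are hypothetical stand-ins used only to show the "hold an ID, load on first get()" pattern; they are not the library's Lazy type, and the loader here simply fabricates a string instead of reading from storage.

```java
import java.util.function.LongFunction;

// Illustrative only: a minimal load-on-first-access reference.
class LazyReferenceSketch {

    static final class LazyRef<T> {
        private final long objectId;          // ID of the referenced object in storage
        private final LongFunction<T> loader; // hypothetical callback that loads by ID
        private T value;
        private boolean loaded;

        LazyRef(long objectId, LongFunction<T> loader) {
            this.objectId = objectId;
            this.loader = loader;
        }

        // Loads the referenced object on the first call and returns the cached instance afterwards.
        synchronized T get() {
            if (!loaded) {
                value = loader.apply(objectId);
                loaded = true;
            }
            return value;
        }

        synchronized boolean isLoaded() {
            return loaded;
        }
    }

    public static void main(String[] args) {
        LazyRef<String> ref = new LazyRef<>(42L, id -> "entity-" + id);
        System.out.println("loaded before get(): " + ref.isLoaded());
        System.out.println("value: " + ref.get());
        System.out.println("loaded after get(): " + ref.isLoaded());
    }
}
```

Until get() is called, the only memory cost in this sketch is the ID and the loader reference, which is the reason lazy references keep large data sets cheap to hold.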

See Lazy Loading for details.

Type Dictionary

The type dictionary is a persistent mapping between type IDs and Java class definitions. It is stored alongside the data and used during deserialization to resolve binary records back to their corresponding Java types. When classes evolve (fields added, removed, or renamed), the type dictionary enables legacy type mapping to handle the transition.
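As a rough mental model, the type dictionary behaves like a two-way lookup between numeric type IDs and class names. The sketch below assumes that simplification: it ignores field definitions, persistence of the dictionary itself, and legacy type mapping, and all names and the starting ID value are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: a two-way mapping between type IDs and class names.
class TypeDictionarySketch {

    private final Map<Long, String> classNameByTypeId = new LinkedHashMap<>();
    private final Map<String, Long> typeIdByClassName = new LinkedHashMap<>();
    private long nextTypeId = 1_000_000L; // arbitrary starting value chosen for this sketch

    // Returns the existing type ID for a class, or assigns a new one on first registration.
    long register(Class<?> type) {
        Long existing = typeIdByClassName.get(type.getName());
        if (existing != null) {
            return existing;
        }
        long typeId = nextTypeId++;
        typeIdByClassName.put(type.getName(), typeId);
        classNameByTypeId.put(typeId, type.getName());
        return typeId;
    }

    // Resolves a type ID read from a binary record back to a Java class.
    Class<?> resolve(long typeId) {
        String className = classNameByTypeId.get(typeId);
        if (className == null) {
            throw new IllegalArgumentException("Unknown type ID: " + typeId);
        }
        try {
            return Class.forName(className);
        } catch (ClassNotFoundException e) {
            // In a real system this is where a renamed or removed class would need a mapping.
            throw new IllegalStateException("Class no longer available: " + className, e);
        }
    }

    public static void main(String[] args) {
        TypeDictionarySketch dictionary = new TypeDictionarySketch();
        long typeId = dictionary.register(String.class);
        System.out.println(typeId + " -> " + dictionary.resolve(typeId).getName());
    }
}
```

The ClassNotFoundException branch marks the spot where class evolution becomes visible: an ID that still exists in stored data but no longer resolves to a current class.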

See Type Handling for details.

Type Handling

The type handling system manages how Java types are mapped to binary representations. It supports:

Automatic type registration
Type evolution (legacy type mapping) when classes change
Custom type handlers for types requiring special treatment

Type handlers are generated automatically for almost any Java type. Classes do not need to implement java.io.Serializable, carry annotations, or provide a default constructor. Two groups of types are excluded: types tied to JVM internals (e.g. Thread, IO streams), and types whose class identity is unstable across runs, such as lambdas and anonymous inner classes, whose synthetic names ($$Lambda$… / $1) are generated by the JVM or compiler.

See Legacy Type Mapping and Type Handling for details.