Executing Queries

There are several ways to process the results of a query.

The query is executed automatically as soon as one of the following result-processing methods is called.

Entities touched by the query are loaded on demand during execution.

The GigaMap employs internal read-write locking, which makes it effectively thread-safe.

However, the read locks can only be released once iteration has ended, so every query iteration resource must be closed after use.

The default iterator closes automatically once fully traversed. Everything else must be closed manually.

All resources implement AutoCloseable, making them best suited for use in a try-with-resources block.
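
To illustrate why closing matters, here is a minimal sketch under stated assumptions; it is not GigaMap's actual implementation. The class `LockedIterator` is hypothetical. It assumes that a read lock is acquired when the iterator is created and released either on `close()` or once the iterator has been fully traversed.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a lock-holding query iterator; not part of the GigaMap API.
final class LockedIterator<T> implements Iterator<T>, AutoCloseable
{
    private final Iterator<T>            delegate;
    private final ReentrantReadWriteLock lock;
    private boolean                      closed;

    LockedIterator(final List<T> data, final ReentrantReadWriteLock lock)
    {
        this.delegate = data.iterator();
        this.lock     = lock;
        lock.readLock().lock(); // held until close() is called
    }

    @Override
    public boolean hasNext()
    {
        if(this.closed)
        {
            return false;
        }
        if(this.delegate.hasNext())
        {
            return true;
        }
        this.close(); // fully traversed: release automatically
        return false;
    }

    @Override
    public T next()
    {
        if(this.closed)
        {
            throw new NoSuchElementException();
        }
        return this.delegate.next();
    }

    @Override
    public void close()
    {
        if(!this.closed)
        {
            this.closed = true;
            this.lock.readLock().unlock(); // must run on the thread that acquired the lock
        }
    }

    public static void main(final String[] args)
    {
        final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        try(LockedIterator<String> it = new LockedIterator<>(List.of("a", "b", "c"), lock))
        {
            it.next(); // stop early; the try block still releases the lock
        }
        System.out.println(lock.getReadLockCount()); // prints 0
    }
}
```

In this sketch, an iterator that is abandoned early without `close()` keeps its read lock, which would block writers indefinitely.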

Iterators

GigaQuery extends Iterable, so all the usual options are available.

GigaQuery<Person> query = gigaMap.query(...);

/*
 * Not recommended:
 * if the loop exits early (break, return, or exception),
 * the iterator is never fully traversed and stays open.
 */
for(Person person : query)
{
    // do something
}

/*
 * Always use try-with-resources
 * to ensure that the iterator is closed.
 */
try(GigaIterator<Person> iterator = query.iterator())
{
    while(iterator.hasNext())
    {
        Person person = iterator.next();
    }
}

// Internal iteration works as well.
try(GigaIterator<Person> iterator = query.iterator())
{
    iterator.forEachRemaining(person -> ...);
}

Streams

Java’s Streams API offers many methods for further filtering, mapping, aggregating, and collecting once the query has been executed.

Be aware that stream operations do not influence the query itself; they are applied to its results afterward.

try(Stream<Person> stream = query.stream())
{
    // ...
}
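
As a sketch of why the try-with-resources block matters for streams: in the standard Streams API, a terminal operation such as `collect` does not close the stream. Only `close()` runs the handlers registered via `onClose`. The example below uses a plain `Stream` and a hypothetical `released` flag standing in for the query's resources; how GigaMap wires this internally is an assumption.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch using only the standard Streams API; not part of the GigaMap API.
final class StreamCloseSketch
{
    // Stands in for "query resources released".
    static final AtomicBoolean released = new AtomicBoolean(false);

    static Stream<String> openStream()
    {
        released.set(false);
        return Stream.of("Alice", "Bob", "Carol").onClose(() -> released.set(true));
    }

    static List<String> namesStartingWith(final String prefix)
    {
        // The close handler runs when the try block exits, not when collect() finishes.
        try(Stream<String> stream = openStream())
        {
            return stream
                .filter(name -> name.startsWith(prefix))
                .collect(Collectors.toList());
        }
    }

    public static void main(final String[] args)
    {
        System.out.println(namesStartingWith("B"));
        System.out.println(released.get());
    }
}
```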

Collectors

The most convenient way to get the results of a query is through collectors. There’s no need to worry about closing the resource; it is handled internally.

GigaQuery<Person> query = gigaMap.query(...);

// get the complete result as a list
List<Person> all = query.toList();

// limit the result to 100 entities
List<Person> first100 = query.toList(100);

// get the third page with 100 entities
List<Person> thirdPage = query.toList(
    200, // offset
    100  // limit
);

// Set collectors are also available
Set<Person> set = query.toSet(...);
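
The offset in the paging example follows from the page number: for 1-based pages, the offset is `(page - 1) * pageSize`. The following is a minimal sketch with a hypothetical helper class `Paging`. A plain in-memory list stands in for the query result, so the slicing only emulates the offset/limit semantics.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical helper; not part of the GigaMap API.
final class Paging
{
    // Offset of the first entity on a 1-based page.
    static int offset(final int page, final int pageSize)
    {
        if(page < 1 || pageSize < 1)
        {
            throw new IllegalArgumentException("page and pageSize must be positive");
        }
        return Math.multiplyExact(page - 1, pageSize);
    }

    // Emulates an offset/limit slice on an in-memory list.
    static <T> List<T> page(final List<T> all, final int page, final int pageSize)
    {
        final int from = Math.min(offset(page, pageSize), all.size());
        final int to   = (int)Math.min((long)from + pageSize, all.size());
        return all.subList(from, to);
    }

    public static void main(final String[] args)
    {
        final List<Integer> data = IntStream.range(0, 250).boxed().collect(Collectors.toList());
        System.out.println(offset(3, 100));             // prints 200
        System.out.println(page(data, 3, 100).size());  // prints 50
    }
}
```

With the values from the paging example, `Paging.offset(3, 100)` yields the offset 200 that is passed as the first argument.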

Optional first entity

GigaQuery<Person> query = gigaMap.query(...);
Optional<Person> optFirst = query.findFirst();
optFirst.ifPresent(person -> ...);

Count results

GigaQuery<Person> query = gigaMap.query(...);
long count = query.count();

Multithreaded Query Execution

For large datasets, query execution can be parallelized across multiple threads. This is done by passing an IterationThreadProvider when creating the query.

Thread Count Providers

A ThreadCountProvider determines how many threads are used for parallel execution.

// Fixed number of threads
ThreadCountProvider fixed = ThreadCountProvider.Fixed(4);

// Adaptive: uses up to the number of available processors
ThreadCountProvider adaptive = ThreadCountProvider.Adaptive();

// Adaptive with a custom maximum
ThreadCountProvider adaptiveCapped = ThreadCountProvider.Adaptive(8);

The adaptive provider calculates the thread count based on the number of bitmap result segments and the specified maximum, ensuring threads are not wasted on small datasets.
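
The idea behind the adaptive calculation can be sketched as follows. This is an assumption about its general shape (the smaller of segment count and maximum, but at least one thread), not the library's actual formula. The class `AdaptiveCountSketch` and its methods are hypothetical.

```java
// Hypothetical sketch of an adaptive thread count; not the GigaMap implementation.
final class AdaptiveCountSketch
{
    // Never more threads than segments, never more than the maximum, never fewer than one.
    static int threadCount(final int segmentCount, final int maxThreads)
    {
        if(maxThreads < 1)
        {
            throw new IllegalArgumentException("maxThreads must be positive");
        }
        return Math.max(1, Math.min(segmentCount, maxThreads));
    }

    // Variant without an explicit maximum: capped at the number of available processors.
    static int threadCount(final int segmentCount)
    {
        return threadCount(segmentCount, Runtime.getRuntime().availableProcessors());
    }

    public static void main(final String[] args)
    {
        System.out.println(threadCount(2, 8));   // prints 2
        System.out.println(threadCount(100, 8)); // prints 8
    }
}
```

Under this assumption, a result with only two segments would use two threads even when eight are allowed.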

Iteration Thread Providers

An IterationThreadProvider manages the lifecycle of the threads used during iteration. There are two implementations:

Creating — Creates fresh threads for each query execution. Simple and suitable for infrequent queries.

IterationThreadProvider provider = IterationThreadProvider.Creating(
    ThreadCountProvider.Adaptive()
);

Pooling — Maintains a pool of reusable threads. Avoids the overhead of thread creation for frequent queries. The pool grows as needed and threads are returned to the pool after each query completes.

IterationThreadProvider provider = IterationThreadProvider.Pooling(
    4, // initial reserved thread count
    ThreadCountProvider.Adaptive()
);

// Shut down the pool when no longer needed
provider.shutdown();

Parallel Execution with Multiple Consumers

The recommended way to use multithreaded execution is with the execute(Consumer...) method. Each consumer processes a portion of the result set in its own thread.

IterationThreadProvider provider = IterationThreadProvider.Creating(
    ThreadCountProvider.Adaptive()
);

GigaQuery<Person> query = gigaMap.query(provider)
    .and(PersonIndices.city.is("Berlin"));

// Execute with multiple consumers - each runs in its own thread
query.execute(
    person -> processPartA(person),
    person -> processPartB(person)
);

This approach is internally simpler and faster than iterator-based multithreading, because it avoids the complexity of coordinating an Iterator across threads. The bitmap result is partitioned, and each thread processes its assigned segments independently.
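
A minimal sketch of the partitioning idea, assuming a simple round-robin assignment of segments to consumers. The real assignment strategy is not described here. `PartitionSketch` and the list-of-lists segment representation are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

// Hypothetical sketch of segment-wise parallel consumption; not the GigaMap implementation.
final class PartitionSketch
{
    @SafeVarargs
    static <T> void execute(final List<List<T>> segments, final Consumer<? super T>... consumers)
    {
        final List<Thread> threads = new ArrayList<>();
        for(int c = 0; c < consumers.length; c++)
        {
            final int index = c;
            // Assumed round-robin assignment: consumer i gets segments i, i + n, i + 2n, ...
            final Thread thread = new Thread(() ->
            {
                for(int s = index; s < segments.size(); s += consumers.length)
                {
                    segments.get(s).forEach(consumers[index]);
                }
            });
            threads.add(thread);
            thread.start();
        }
        // Wait until every consumer thread has finished its segments.
        for(final Thread thread : threads)
        {
            try
            {
                thread.join();
            }
            catch(final InterruptedException e)
            {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for consumers", e);
            }
        }
    }

    public static void main(final String[] args)
    {
        final List<List<Integer>> segments = List.of(List.of(1, 2), List.of(3, 4), List.of(5));
        final AtomicLong total = new AtomicLong();
        execute(segments, n -> total.addAndGet(n), n -> total.addAndGet(n));
        System.out.println(total.get()); // prints 15
    }
}
```

Because each consumer sees only its own portion of the result, the sketch combines partial results through a thread-safe accumulator (`AtomicLong`).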

Multithreaded Iterators

Alternatively, a multithreaded query can be used with the standard iterator API. When a query is created with an IterationThreadProvider, calling iterator() produces a ThreadedIterator that processes bitmap result segments in parallel across worker threads.

IterationThreadProvider provider = IterationThreadProvider.Pooling(
    4,
    ThreadCountProvider.Fixed(4)
);

GigaQuery<Person> query = gigaMap.query(provider)
    .and(PersonIndices.lastName.is("Smith"));

try(GigaIterator<Person> iterator = query.iterator())
{
    while(iterator.hasNext())
    {
        Person person = iterator.next();
        // process
    }
}

The multi-consumer execute() approach is preferred over multithreaded iterators for performance-critical workloads, as it avoids the overhead inherent in coordinating an Iterator across threads.
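
The coordination overhead can be illustrated with a minimal sketch in which worker threads hand every element to the consuming thread through a queue. This is only one possible design, chosen for illustration. `QueueBackedIterator` is hypothetical and makes no claim about how the library's ThreadedIterator works internally.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch; elements must be non-null. Not the library's ThreadedIterator.
final class QueueBackedIterator<T> implements Iterator<T>, AutoCloseable
{
    private static final Object DONE = new Object(); // marks one finished worker

    private final BlockingQueue<Object> queue   = new LinkedBlockingQueue<>();
    private final List<Thread>          workers = new ArrayList<>();
    private int                         activeWorkers;
    private Object                      buffered;

    QueueBackedIterator(final List<List<T>> segments, final int threadCount)
    {
        this.activeWorkers = threadCount;
        for(int w = 0; w < threadCount; w++)
        {
            final int first = w;
            final Thread worker = new Thread(() ->
            {
                // Worker w handles segments w, w + threadCount, ... until done or interrupted.
                for(int s = first; s < segments.size() && !Thread.currentThread().isInterrupted(); s += threadCount)
                {
                    this.queue.addAll(segments.get(s));
                }
                this.queue.add(DONE);
            });
            this.workers.add(worker);
            worker.start();
        }
    }

    @Override
    public boolean hasNext()
    {
        try
        {
            // Every element passes through the queue: this hand-off is the coordination cost.
            while(this.buffered == null && this.activeWorkers > 0)
            {
                final Object taken = this.queue.take();
                if(taken == DONE)
                {
                    this.activeWorkers--;
                }
                else
                {
                    this.buffered = taken;
                }
            }
        }
        catch(final InterruptedException e)
        {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for results", e);
        }
        return this.buffered != null;
    }

    @Override
    @SuppressWarnings("unchecked")
    public T next()
    {
        if(!this.hasNext())
        {
            throw new NoSuchElementException();
        }
        final T result = (T)this.buffered;
        this.buffered = null;
        return result;
    }

    @Override
    public void close()
    {
        this.workers.forEach(Thread::interrupt); // ask workers that are still running to stop
    }

    public static void main(final String[] args)
    {
        final List<List<Integer>> segments = List.of(List.of(1, 2), List.of(3), List.of(4, 5));
        long sum = 0;
        try(QueueBackedIterator<Integer> it = new QueueBackedIterator<>(segments, 2))
        {
            while(it.hasNext())
            {
                sum += it.next();
            }
        }
        System.out.println(sum); // prints 15; the element order itself is not deterministic
    }
}
```

In contrast to the multi-consumer sketch style, where each thread consumes its segments directly, every element here crosses a thread boundary before the caller sees it.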