JVector Configuration
The VectorIndexConfiguration builder provides various options to tune the index for your specific use case.
JVM Configuration
To enable SIMD acceleration via the Panama Vector API, configure your JVM with the following parameters.
Full Example with Performance Tuning
java --add-modules jdk.incubator.vector \
     -Djvector.physical_core_count=8 \
     -jar your-app.jar
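Whether the flag actually took effect can be checked from inside the application. The following is a minimal sketch that uses only the standard `ModuleLayer` API; the class name `VectorModuleCheck` is hypothetical and not part of JVector.

```java
// Sketch: report whether the incubating Vector API module was resolved at JVM startup.
final class VectorModuleCheck {

    /** Returns true if jdk.incubator.vector is part of the boot module layer. */
    static boolean isVectorModuleEnabled() {
        return ModuleLayer.boot().findModule("jdk.incubator.vector").isPresent();
    }

    public static void main(String[] args) {
        if (isVectorModuleEnabled()) {
            System.out.println("jdk.incubator.vector is enabled");
        } else {
            System.out.println("jdk.incubator.vector is missing; "
                + "start the JVM with --add-modules jdk.incubator.vector");
        }
    }
}
```

Running this class with and without `--add-modules jdk.incubator.vector` shows whether the JVM arguments reach the process, which is useful when they are set indirectly through a build tool or a launcher script.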
Maven Configuration
For Maven projects, configure the Surefire and Failsafe plugins:
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-surefire-plugin</artifactId>
    <configuration>
        <argLine>--add-modules jdk.incubator.vector</argLine>
    </configuration>
</plugin>
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-failsafe-plugin</artifactId>
    <configuration>
        <argLine>--add-modules jdk.incubator.vector</argLine>
    </configuration>
</plugin>
Gradle Configuration
For Gradle projects:
tasks.withType<JavaExec> {
    jvmArgs("--add-modules", "jdk.incubator.vector")
}
tasks.withType<Test> {
    jvmArgs("--add-modules", "jdk.incubator.vector")
}
Physical Core Configuration
Intensive SIMD operations can saturate memory bandwidth during indexing and PQ computation. JVector mitigates this by using a PhysicalCoreExecutor that limits concurrency to physical cores rather than logical cores (hyperthreads).
| Property | Default | Description |
|---|---|---|
| jvector.physical_core_count | Half of available processors | Override the detected physical core count for parallel operations |
The Vector API (module jdk.incubator.vector) is an incubator feature in Java 17-21 and remains one in Java 22 and later, so the JVM does not resolve it by default. JVector's multi-release JAR handles the API differences between Java versions automatically, but the module must still be explicitly enabled.
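The default and the override described in this section can be sketched as follows. This is an illustration of the documented behavior (half of the available processors unless `jvector.physical_core_count` is set), not JVector's actual implementation; the class name and the handling of missing or non-positive values are assumptions.

```java
// Sketch: resolve the parallelism used for SIMD-heavy work.
final class PhysicalCoreCountSketch {

    static final String PROPERTY = "jvector.physical_core_count";

    /**
     * Returns the override from the system property if it is a positive integer,
     * otherwise half of the logical processors (at least 1).
     */
    static int resolvePhysicalCoreCount() {
        // Integer.getInteger returns null when the property is unset or not a number.
        Integer override = Integer.getInteger(PROPERTY);
        if (override != null && override > 0) {
            return override;
        }
        return Math.max(1, Runtime.getRuntime().availableProcessors() / 2);
    }

    public static void main(String[] args) {
        System.out.println("Parallelism: " + resolvePhysicalCoreCount());
    }
}
```

Halving the logical processor count is only a heuristic for hyperthreaded CPUs. On machines without hyperthreading, or in containers with CPU limits, setting the property explicitly is likely the more reliable choice.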
Basic HNSW Parameters
| Parameter | Default | Description |
|---|---|---|
| dimension | (required) | Vector dimensionality. Must match the size of your embedding vectors. |
| similarityFunction |  | Similarity metric (for example VectorSimilarityFunction.COSINE). |
| maxDegree | 16 | Maximum connections per node in HNSW graph. Higher values improve recall but increase memory usage. |
| beamWidth | 100 | Search beam width during index construction. Higher values improve recall during construction. |
|  | 1.2 | Overflow factor for neighbor lists. |
|  | 1.2 | Pruning parameter for the HNSW graph. |
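To get a feel for how dimension and maxDegree drive memory usage, a back-of-the-envelope estimate can help. The sketch below assumes 4-byte floats for vector components and one 4-byte node id per connection with fully populated neighbor lists. It ignores upper graph layers, the overflow factor, and JVM object overhead, so treat the numbers as rough orders of magnitude rather than JVector's real footprint.

```java
// Sketch: rough memory estimate for raw vectors and neighbor lists.
final class HnswMemoryEstimate {

    /** Bytes for raw float vectors: one 4-byte float per dimension. */
    static long vectorBytes(long vectorCount, int dimension) {
        return vectorCount * dimension * Float.BYTES;
    }

    /** Bytes for neighbor lists, assuming maxDegree 4-byte ids per node (an assumption). */
    static long neighborBytes(long vectorCount, int maxDegree) {
        return vectorCount * maxDegree * Integer.BYTES;
    }

    public static void main(String[] args) {
        long n = 1_000_000L;
        System.out.println("vectors (768 dims): " + vectorBytes(n, 768) + " bytes");
        System.out.println("graph (maxDegree 16): " + neighborBytes(n, 16) + " bytes");
        System.out.println("graph (maxDegree 32): " + neighborBytes(n, 32) + " bytes");
    }
}
```

Under these assumptions, one million 768-dimensional vectors take about 3.07 GB, while doubling maxDegree from 16 to 32 raises the neighbor-list estimate from 64 MB to 128 MB. For high-dimensional embeddings the raw vectors dominate.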
On-Disk Storage
For datasets that exceed available memory, enable on-disk storage to use memory-mapped files.
| Parameter | Default | Description |
|---|---|---|
| onDisk |  | Enable on-disk graph storage. |
| indexDirectory |  | Directory for index files. Required if on-disk storage is enabled. |
|  |  | Enable Product Quantization compression. |
|  |  | Number of PQ subspaces (0 = auto: dimension/4). |
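The "0 = auto" rule for PQ subspaces is simple arithmetic, sketched below. The method and class names are hypothetical. The compressed-size estimate additionally assumes one byte per subspace (256 centroids per codebook), which is a common Product Quantization setup but is not stated in this document.

```java
// Sketch: resolve the number of PQ subspaces and estimate the per-vector size.
final class PqSubspaceSketch {

    /** A value of 0 means "auto", which resolves to dimension / 4 (integer division). */
    static int resolveSubspaces(int dimension, int pqSubspaces) {
        return pqSubspaces == 0 ? dimension / 4 : pqSubspaces;
    }

    /** Uncompressed size: one 4-byte float per dimension. */
    static int rawBytesPerVector(int dimension) {
        return dimension * Float.BYTES;
    }

    /** Compressed size, assuming one byte per subspace (an assumption). */
    static int compressedBytesPerVector(int subspaces) {
        return subspaces;
    }

    public static void main(String[] args) {
        int subspaces = resolveSubspaces(768, 0);
        System.out.println("subspaces: " + subspaces);
        System.out.println("raw bytes: " + rawBytesPerVector(768));
        System.out.println("compressed bytes: " + compressedBytesPerVector(subspaces));
    }
}
```

For 768-dimensional vectors the automatic setting yields 192 subspaces. Under the one-byte-per-subspace assumption, that is 192 bytes per vector instead of 3072, a 16x reduction.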
Background Persistence
Enable automatic asynchronous persistence to avoid blocking operations during writes.
Setting persistenceIntervalMs to a value greater than 0 enables background persistence.
| Parameter | Default | Description |
|---|---|---|
| persistenceIntervalMs |  | Check interval in milliseconds. A value > 0 enables background persistence, 0 disables it. |
| minChangesBetweenPersists |  | Minimum changes before persisting. |
| persistOnShutdown |  | Persist pending changes on close(). |
Example
VectorIndexConfiguration config = VectorIndexConfiguration.builder()
    .dimension(768)
    .similarityFunction(VectorSimilarityFunction.COSINE)
    .onDisk(true)
    .indexDirectory(Path.of("/data/vectors"))
    .persistenceIntervalMs(30_000)   // Enable, check every 30 seconds
    .minChangesBetweenPersists(100)  // Only persist if >= 100 changes
    .persistOnShutdown(true)         // Persist on close()
    .build();
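How the interval and the change threshold interact can be illustrated with a small decision function. This is a sketch of the documented semantics only (an interval greater than 0 enables the feature, and a check persists only once enough changes have accumulated). The class and method names are hypothetical and do not reflect the library's internals.

```java
// Sketch: the decision a periodic persistence check would make.
final class PersistenceTriggerSketch {

    /** Background persistence is enabled only for a positive interval. */
    static boolean isEnabled(long persistenceIntervalMs) {
        return persistenceIntervalMs > 0;
    }

    /** At each check, persist only if enabled and enough changes are pending. */
    static boolean shouldPersist(long persistenceIntervalMs,
                                 int minChangesBetweenPersists,
                                 int pendingChanges) {
        return isEnabled(persistenceIntervalMs)
            && pendingChanges >= minChangesBetweenPersists;
    }

    public static void main(String[] args) {
        System.out.println(shouldPersist(30_000, 100, 99));   // below threshold
        System.out.println(shouldPersist(30_000, 100, 100));  // threshold reached
        System.out.println(shouldPersist(0, 100, 500));       // feature disabled
    }
}
```

With the configuration from the example, a check runs every 30 seconds but writes nothing until at least 100 changes are pending. The optimizationIntervalMs and minChangesBetweenOptimizations parameters follow the same pattern.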
Background Optimization
Enable periodic graph optimization to maintain query performance as the index grows.
Setting optimizationIntervalMs to a value greater than 0 enables background optimization.
| Parameter | Default | Description |
|---|---|---|
| optimizationIntervalMs |  | Check interval in milliseconds. A value > 0 enables background optimization, 0 disables it. |
| minChangesBetweenOptimizations |  | Minimum changes before optimizing. |
| optimizeOnShutdown |  | Optimize pending changes on close(). |
Example
VectorIndexConfiguration config = VectorIndexConfiguration.builder()
    .dimension(768)
    .similarityFunction(VectorSimilarityFunction.COSINE)
    .onDisk(true)
    .indexDirectory(Path.of("/data/vectors"))
    .optimizationIntervalMs(60_000)        // Enable, check every 60 seconds
    .minChangesBetweenOptimizations(1000)  // Only optimize if >= 1000 changes
    .optimizeOnShutdown(false)             // Skip for faster shutdown
    .build();
Parameter Guidelines
The following table provides recommended parameter values based on dataset size.
| Use Case | maxDegree | beamWidth | Notes |
|---|---|---|---|
| Small dataset (<10K) | 8-16 | 50-100 | Lower values sufficient |
| Medium dataset (10K-1M) | 16-32 | 100-200 | Balanced trade-off |
| Large dataset (>1M) | 32-64 | 200-400 | Higher for better recall |
| High precision required | 48-64 | 400-500 | Maximum recall |
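The guideline table can be encoded as a small lookup, for example to pick starting values programmatically. The sketch below is only a transcription of the table; the class, record, and method names are hypothetical. Treating exactly 1,000,000 vectors as "medium" is a choice made here for the boundary case.

```java
// Sketch: look up the recommended parameter ranges from the guideline table.
final class ParameterGuidelines {

    /** Inclusive range of recommended values. */
    record Range(int min, int max) {}

    /** Recommended ranges for maxDegree and beamWidth, plus the table's note. */
    record Recommendation(Range maxDegree, Range beamWidth, String note) {}

    static Recommendation forDataset(long vectorCount, boolean highPrecision) {
        if (highPrecision) {
            return new Recommendation(new Range(48, 64), new Range(400, 500), "Maximum recall");
        }
        if (vectorCount < 10_000) {
            return new Recommendation(new Range(8, 16), new Range(50, 100), "Lower values sufficient");
        }
        if (vectorCount <= 1_000_000) {
            return new Recommendation(new Range(16, 32), new Range(100, 200), "Balanced trade-off");
        }
        return new Recommendation(new Range(32, 64), new Range(200, 400), "Higher for better recall");
    }

    public static void main(String[] args) {
        System.out.println(forDataset(250_000, false));
    }
}
```

The returned ranges are starting points. Recall and latency should still be measured on representative data before settling on final values.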