Improving Vald Search Performance through Parameter Tuning

vald.vdaas.org
ITNEXT
Published in
8 min readApr 25, 2024

--

In our initial validation for Hybrid Search, we collectively ran a standalone search quality test using Vald and discovered a surprisingly low. With low search quality, it is not easy to pinpoint how much Hybrid Search impacts the overall search results. As a team, we decided first to optimize Vald’s parameters and improve its search quality to gain a clearer understanding. We followed the Tuning Guideline focused on search performance to achieve this.

Our users, the driving force behind our improvements, shared feedback that Vald’s default parameters yield lower search quality than other libraries and engines. While data-dependent variations in search quality can sometimes negate the need for tuning, we saw this as a valuable challenge. We incorporated default parameter optimization into our validation process to enhance Vald’s user-friendliness and broaden its applicability across various use cases.

Search Parameter Tuning

Parameter tuning is performed as follows:

  1. Search quality validation with Linear Search (full scan/k-Nearest Neighbor)
  2. Parameter adjustment to improve search quality
  3. Parameter adjustment to meet the required search speed (Tuning was not performed this time because sufficient search quality was obtained under the condition of timeout=3s)

Our initial validation step involved search quality testing with Linear Search. This method, essentially k-Nearest Neighbor (k-NN) in Vald Agent, calculates the distance between all indexed data points and queries. It then predicts the k points, ranked by their distance from the query. Since Linear Search examines every data point, it provides the highest achievable search quality compared to Approximate Nearest Neighbor (ANN) methods.

Therefore, we recommend Linear Search as the first step in parameter tuning. Before moving on, this helps identify potential issues with vectorization or other quality-related factors. Once Linear Search confirms sufficient search quality, we can fine-tune the parameters affecting Vald’s ANN search for further optimization.

Search Quality Validation

Our validation utilized the MS MARCO passage dataset (train: 8.84M, val: 6980) and Sentence-BERT for vectorization, mirroring the setup used in Hybrid Search validation.

Vald Agent (NGT) relies on six critical parameters for search operations: radius, epsilon, k, timeout, create_edge_size, and search_edge_size. Let’s break down what each parameter controls:

  • Radius (radius): Defines the search range. Data points within this distance from the query and predicted as the top-k nearest neighbors in the index graph are retrieved. A value of -1 signifies the maximum float value.
  • Epsilon (epsilon): Fine-tune search quality. During the search, it expands the search circle radius by a factor of (1 + epsilon), impacting the explored area within the graph. Higher epsilon values can improve precision but increase search time. The maximum search circle radius aligns with the radius parameter when the maximum number of search results is retrieved.
  • Top-K (k): Specifies the number of results returned. Data exceeding the specified radius might be excluded, potentially resulting in less than k items.
  • Creation Edge Size (create_edge_size): Defines the number of outgoing edges assigned to each node when building the index graph. A larger value can enhance precision but lengthen index creation time.
  • Search Edge Size (search_edge_size): Determines the number of edges explored within the index graph during the search. Similar to create_edge_size, a larger value can improve precision but slows down search time. We are specifying 0 for all edges connected to each node during the search.
  • Timeout (timeout): Defines the maximum execution time for the search operation. If the search time is longer than the specified time, the search will be stopped midway and the results up to that point will be returned to the request destination. Tuning was not performed this time because sufficient search quality was obtained under the condition of timeout=3s.

For further details on these parameters, refer to the NGT wiki.

In this validation, we focused on tuning three parameters for Vald (v1.7.8): epsilon, create_edge_size, and search_edge_size. The tuning process was divided into two stages.

We began by assessing the performance of Vald across a range of create_edge_size and search_edge_size: 200/100/50/25 (Validation 1). This initial evaluation aimed to identify promising parameter combinations for further refinement. Building upon the insights gained from Validation 1, we delved deeper into fine-tuning the selected parameters (Validation 2).

For epsilon, we evaluated three values: 0.01, 0.05, and 0.1.

Validation 1

In Validation 1, we will roughly change the create_edge_size and search_edge_size and evaluate each search result. The search parameters and edge size pairs used for the validation are as follows:

pair of {radius, epsilon, timeout, k}

  • {-1, 0.01, 3s, 1000}
  • {-1, 0.05, 3s, 1000}
  • {-1, 0.1, 3s, 1000}

pair of {create_edge_size and search_edge_size}

  • {25, 25}
  • {25,200}
  • {50, 50}
  • {50, 200}
  • {100, 100}
  • {100, 200}
  • {200, 25}
  • {200, 50}
  • {200, 100}
  • {200, 200}

The results of the validation are shown in the following graphs.

Left: Time / Recall (validation 1), Right: Search Edge / Recall (validation 1)
Left: Time / NDCG (validation 1), Right: Search Edge / NDCG (validation 1)

The above results showed that edge sizes between 50 and 100 are sufficient to achieve high search quality.

Validation 2

In Validation 2, we will further refine the validation by narrowing the range of edge sizes based on the results of Validation 1. The parameter set is as follows:

pair of {radius, epsilon, timeout, k}

  • {-1, 0.01, 3s, 1000}
  • {-1, 0.05, 3s, 1000}
  • {-1, 0.1, 3s, 1000}

pair of {create_edge_size and search_edge_size}

  • {50, 50}
  • {50, 75}
  • {75, 50}
  • {75, 75}
  • {75, 100}
  • {100, 75}
  • {100, 100}

The results are shown in the following graphs.

Left: Time / Recall, Right: Time / NDCG (validation 2)
Left: Search Edge / Recall, Right: Creation Edge/ Recall (validation 2)
Left: Search Edge / NDCG, Right: Creation Edge/ NDCG (validation 2)
Left: Search Edge / Time, Right: Creation Edge/ Time (validation 2)

From the above results, it was confirmed that the pairs of create_edge_size and search_edge_size {50, 75}, {75, 100} are superior in terms of search quality and search speed.

In addition, as a result of this validation, we set epsilon=0.05, create_edge_size=50, and search_edge_size=50 as the new default parameters, considering search quality, search speed, and the required resources.

Discussion

Our analysis revealed that increasing the number of edges (create_edge_size + search_edge_size) improves search quality. However, unsurprisingly, this also leads to longer search times as the number of nodes explored during the search increases.

Interestingly, the results for c50s75 and c75s100 (create_edge_size: 50/75, search_edge_size: 75/100) displayed superior search quality and time performance compared to c100s100. This suggests that maximizing the total number of edges isn’t always optimal.

Creation Edge vs. Search Edge: A Balancing Act

Looking at the Search Edge/Time graph, we observed a significant overlap between lines for the same search_edge_size. This implies that create_edge_size has minimal impact on search time, while search_edge_size appears to be the dominant factor.

The Search Edge/Recall (NDCG) graphs further highlight this point. They show that larger search edges generally lead to higher search quality, with minimal influence from epsilon variations. This suggests that using a sufficiently large search edge with a small epsilon value can balance search quality and search time well.

In contrast, the Creation Edge/Time graph reveals less overlap and a more significant impact of create_edge_size on search time for the same search edge. The NDCG graphs for Search Edge/Recall demonstrate that increasing the create_edge_size can potentially raise the lower limit of search quality. While the size of search_edge_size has a significant impact on accuracy, generating too many edges may compromise the lower accuracy bound.

Finding the Sweet Spot: c50s75 and c75s100 Shine

Despite focusing on increasing both creation and search edges, the experiment identified c50s75 and c75s100 as clear winners in terms of both precision and time efficiency. Notably, c50s75 performed surprisingly well.

This can be attributed to both parameters reaching a “sweet spot” in these cases. Further increasing the number of edges beyond this point would likely lead to longer search times due to more nodes to explore and potentially introduce noise into the results, thereby impacting precision.

The “sweet spot” depends on factors like the dataset, model, and vectorization methods. This experiment’s optimal setting was a minimum creation edge of 50 and a creation-to-search edge ratio of 2:3.

c200s50: Outperforming Linear Search in Specific Cases

The Time/NDCG graph revealed an interesting outlier: c200s50 sometimes outperformed Linear Search in NDCG. While other parameter combinations trended upwards with Linear Search as the baseline, c200s50 stood out.

This might be due to potential discrepancies between “correct” results based on vectors and distance functions and the “true” relevance determined by the dataset. In this case, the search results obtained with c200s50 might have aligned better with the dataset’s relevance criteria (as measured by NDCG) despite ranking differently from a linear search. Additionally, the fact that precision continues to improve with c200s50 compared to s100c50 suggests further optimization potential within this setting.

Conclusion

This article provided a practical example of parameter tuning to enhance Vald’s (NGT) search performance. The validation process revealed a proportional increase in search quality with the number of edges. However, search time also increased with larger search_edge_size and epsilon values. The search_edge_size emerged as the most significant factor influencing both precision and search time.

Beyond the validation results, we tackled the challenge of defining new default parameters. Considering various factors like Recall or NDCG, search speed, user resource constraints, and initial user-friendliness, we selected epsilon=0.5, creation_edge_size=50, and search_edge_size=50 as the new defaults, offering a well-balanced cost-performance ratio.

Building on these insights, we explored the relationships between epsilon, create_edge_size, and search_edge_size within Vald. While this is just one example, we hope it is a valuable resource for your parameter-tuning endeavors.

Please refer to this article and the comprehensive Tuning Guideline for further guidance.

If you found this article helpful, feel free to leave a reaction (and we wouldn’t mind a GitHub Star or two!).

If you have any questions or requests, please get in touch with us!

We’re waiting for your contributions!

See you next post :)

Other Post

--

--

A highly scalable distributed fast approximate nearest neighbor dense vector search engine.