Best practices for picking number of shards for Aggregate Event Tags

Best practices for picking number of shards for Aggregate Event Tags

Lagom's  documentation states that you should not change the number of shards for a given AggregateEventTag that you have sharded. What is a good baseline number of shards to pick? What sort of impact does the number of shards have in terms of scaling a lagom application up/down? What impact does the number of shards picked have on the message broker?

Deciding the number of shards for your event stream really comes down to how much pressure your data store can handle. Each shard is going to have a process polling the database. So the more shards, the more database pollers, and this can exert a sizable load on the database. We’ve seen other users report very high latent CPU usage on their database servers (although the polling frequency can be tuned accordingly), so realistically you probably want to keep this value on the lower side. That said, the best way to size it is to do some load tests – find out what the maximum throughput that you’re able to handle in a read side processor or Kafka topic publisher, work out what the maximum throughput that you want to scale to is, and then divide the latter number by the former number to get the number of shards that you’ll need. On the alternative side, as far as the effect on scaling and the broker, it could be a limiting factor such that if set too low then you could have wasted resources.

Are there any best practices for dealing with modifying the number of shards for an AggregateEventTag?

The primary answer to this is that you can’t modify it, as you risk data order or loss. However, this is common enough of a request that an issue was created to discuss it, and James Roper even suggested a possible path to accomplish the change. Again, this is not a suggested path at this time, however it could change in the future.

    • Related Articles

    • Error Handling Best Practices

      The best overview of the error handling options in the standard library we are aware of is tersesystems.com/2012/12/27/error-handling-in-scala, by Will Sargent, a former Lightbender. Digression: One topic he doesn’t cover is why checked exceptions ...
    • How to implement versioning in Lagom microservices?

      Lagom doesn't provide any solution, plugin or feature that enables API versioning. However you can use the following to decide and pick what works best for you: When versioning you must keep in mind both the service endpoints and payloads and the ...
    • Should Kafka act as the source of truth in Lagom?

      > We are using Lagom with event sourcing and CQRS. Would you say that keeping the source of truth on the Kafka side is an option? Pragmatism always wins the day, and with that being said this idea has plenty of merit. The only concern we have with ...
    • Can I customize error handling in Lagom?

      Yes, the best way to customize error handling in Lagom is to use a custom ExceptionSerializer.  This Knoldus blog post explains it well: https://blog.knoldus.com/2018/02/05/exception-serializer-in-lagom/ That example, however, is more focused on how ...
    • Enum Best Practices

      There are different options to implement enums in Scala. The landscape is roughly: 1. scala.Enumeration Pros: Library code only. Does not create a class per enum value for simple enums without behaviour. Cons: need to use dependent types to refer to ...