Designing scalable systems

 

Q: How do you design scalable applications?
A: “I apply hexagonal architecture and domain-driven design, as seen in NAV Station and PDS projects. I’ve used design patterns like Adapter and Chain of Responsibility to improve flexibility and maintainability.”

Microservices: Break the application down into small, domain-focused modules deployed in a microservices style.

Load distribution: Use load balancing, with circuit breakers and similar patterns to contain failures.

Event driven: Implement an event-driven architecture; use Kafka or another messaging framework.

Caching: Implement a caching mechanism, such as Redis or Caffeine.

Virtual threads: Build multithreaded applications using virtual threads.

Database optimization:

          Write optimized queries to achieve higher performance.

          Use a connection pool such as HikariCP for database access.

          Implement indexing, partitioning, and sharding.

Java code optimization: Apply multithreading practices, parallel streams, and other Java practices such as using ConcurrentHashMap instead of HashMap for shared data.

Batching and pagination:

  • Batch inserts/updates to reduce transaction overhead.
  • Use pagination for large result sets to avoid memory issues.

Optional

Code quality:
  Hexagonal Architecture: Structure each module in the hexagonal architecture style.

  Clean code: Write clean code, add unit tests, and aim for coverage above 90%.

Implement CI/CD, with pipelines that run the tests in parallel.

Set up Git hooks that run the ArchUnit tests to make sure we are maintaining the architecture, and that also run the Sonar coverage checks.

For documentation we have been using the devdocs.




Full detail:

Improving performance in a highly scalable Java application and its database involves a combination of architectural decisions, code-level optimizations, and infrastructure tuning. Here's a breakdown of strategies for both:


🔧 Java Application Performance

1. Concurrency and Thread Management

  • Use non-blocking I/O; we used reactive programming (Project Reactor) in NAV Station and Prisym.
  • Leverage ExecutorService or ForkJoinPool for parallel processing.
  • Leverage parallel streams and CompletableFuture (supplyAsync or runAsync).
  • Avoid thread contention and excessive synchronization.

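import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Submit each batch to a fixed pool of 4 worker threads.
ExecutorService executor = Executors.newFixedThreadPool(4);
List<List<String>> batches = ...; // divide your data into batches
for (List<String> batch : batches) {
    executor.submit(() -> processBatch(batch)); // each batch runs on a pool thread
}
executor.shutdown(); // stop accepting new tasks; already submitted batches still finish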
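The CompletableFuture option mentioned above could look roughly like this. It is only a sketch: the class name AsyncBatchSketch and the processBatch stand-in (which just returns the batch size) are made up for illustration.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

final class AsyncBatchSketch {

    // Hypothetical stand-in for real work: returns the number of items in the batch.
    static int processBatch(List<String> batch) {
        return batch.size();
    }

    // Starts one async task per batch on the given pool, then adds up the results.
    static int processAll(List<List<String>> batches, ExecutorService pool) {
        List<CompletableFuture<Integer>> futures = batches.stream()
                .map(batch -> CompletableFuture.supplyAsync(() -> processBatch(batch), pool))
                .collect(Collectors.toList());
        // join() waits for each result; the tasks themselves were started up front.
        return futures.stream().mapToInt(CompletableFuture::join).sum();
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<List<String>> batches =
                    List.of(List.of("a", "b"), List.of("c"), List.of("d", "e", "f"));
            System.out.println(processAll(batches, pool)); // prints 6
        } finally {
            pool.shutdown();
        }
    }
}
```

Passing an explicit pool to supplyAsync keeps the work off the shared ForkJoinPool.commonPool(), which is usually what you want for blocking tasks.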


2. JVM Tuning

  • Profile with tools like JVisualVM, YourKit, or Java Flight Recorder, or use Datadog to monitor JVM usage and tune performance accordingly.
  • Tune Garbage Collection (GC) based on workload (e.g., G1GC for low pause times).
  • Set appropriate heap sizes (-Xms, -Xmx) and thread stack sizes.
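To confirm that heap and GC flags actually took effect, the running JVM can report them. A minimal sketch using only the standard library; the class name JvmSettingsSketch is made up for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.stream.Collectors;

final class JvmSettingsSketch {

    // Maximum heap the JVM will try to use, in MiB (roughly what -Xmx resolved to).
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    // Names of the garbage collectors active in this JVM.
    static List<String> collectorNames() {
        return ManagementFactory.getGarbageCollectorMXBeans().stream()
                .map(GarbageCollectorMXBean::getName)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println("Max heap (MiB): " + maxHeapMiB());
        System.out.println("Collectors: " + collectorNames());
    }
}
```

Running it with different -Xmx values or GC flags shows how the reported numbers and collector names change.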

3. Efficient Data Structures and Algorithms

  • Use appropriate collections (ConcurrentHashMap, ArrayList, etc.).
  • For a thread-safe list, wrap it in a synchronized list: List<String> list = Collections.synchronizedList(new ArrayList<>());
  • Avoid unnecessary object creation and boxing/unboxing.
  • Minimize memory footprint by reusing objects (e.g., object pools).
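As a small illustration of picking the right collection, here is a sketch of a shared counter built on ConcurrentHashMap. The class and method names are made up for illustration; merge is an atomic read-modify-write, so no increments are lost.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class ConcurrentCounterSketch {

    // Several threads increment the same key; returns the final map.
    static Map<String, Integer> countConcurrently(String key, int threads, int incrementsPerThread) {
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counts.merge(key, 1, Integer::sum); // atomic per key
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countConcurrently("hits", 4, 1000)); // prints {hits=4000}
    }
}
```

The same loop against a plain HashMap could lose updates or corrupt the map, because HashMap is not safe for concurrent writes.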

4. Caching

  • Use in-memory caches like Caffeine, Ehcache, or Guava Cache.
  • For distributed caching, use Redis or Hazelcast.
  • Cache expensive computations and frequently accessed data.
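To show the idea behind a bounded in-memory cache without pulling in a library, here is a tiny LRU cache on top of LinkedHashMap. This is a standard-library stand-in, not how Caffeine or Ehcache are implemented, and it is not thread-safe; the class name is made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class LruCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    private final int maxEntries;

    LruCacheSketch(int maxEntries) {
        // accessOrder = true keeps entries ordered from least to most recently used.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insert; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCacheSketch<String, Integer> cache = new LruCacheSketch<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // "a" is now the most recently used entry
        cache.put("c", 3); // evicts "b", the least recently used entry
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```

A production cache would also need expiry, statistics, and thread safety, which is what the libraries listed above provide.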

5. Asynchronous Processing

  • Use message queues (Kafka, RabbitMQ) for decoupling and async workflows.
  • Offload heavy tasks to background workers.
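The producer and background-worker split can be sketched in a single process with a BlockingQueue. This is only an in-memory stand-in for a real broker such as Kafka or RabbitMQ; the class name, the STOP sentinel, and the "handled:" prefix are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class BackgroundWorkerSketch {

    // Hypothetical sentinel value that tells the worker to stop.
    private static final String STOP = "__STOP__";

    // Enqueues the tasks, lets one worker thread handle them, and returns the results.
    static List<String> runOnce(List<String> tasks) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(100); // bounded queue
        List<String> handled = new ArrayList<>(); // written only by the worker thread

        Thread worker = new Thread(() -> {
            try {
                for (String task = queue.take(); !task.equals(STOP); task = queue.take()) {
                    handled.add("handled:" + task); // the "heavy" work would go here
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();

        try {
            for (String task : tasks) {
                queue.put(task); // blocks if the queue is full (back-pressure)
            }
            queue.put(STOP);
            worker.join(); // after join(), the worker's writes are visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return handled;
    }

    public static void main(String[] args) {
        System.out.println(runOnce(List.of("resize-image", "send-email")));
    }
}
```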

6. Microservices and Load Distribution

  • Break monoliths into microservices for better scalability.
  • Use API gateways, service registries, and load balancers.
  • Inside each service, use virtual threads, parallel streams, and CompletableFuture for concurrent work.
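For the virtual-threads point, a minimal sketch, assuming Java 21 or newer (virtual threads are not available in older releases). The class name and the simulated 10 ms blocking call are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class VirtualThreadSketch {

    // Runs one virtual thread per simulated request and sums the returned ids.
    static int fetchAll(int requests) {
        List<Future<Integer>> results = new ArrayList<>();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < requests; i++) {
                final int id = i;
                results.add(executor.submit(() -> {
                    Thread.sleep(10); // simulated blocking I/O; parks only the virtual thread
                    return id;
                }));
            }
        } // close() waits for all submitted tasks to finish

        int sum = 0;
        for (Future<Integer> result : results) {
            try {
                sum += result.get();
            } catch (InterruptedException | ExecutionException e) {
                throw new IllegalStateException(e);
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(fetchAll(100)); // prints 4950
    }
}
```

Virtual threads suit blocking, I/O-heavy work like this; for CPU-bound work a fixed pool sized to the number of cores is usually the better fit.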

🗄️ Database Performance

1. Indexing

  • Create indexes on frequently queried columns.
  • Use composite indexes for multi-column filters.
  • Monitor and remove unused indexes to reduce write overhead.
  • Creating an index in Oracle Database improves query performance by letting the database engine locate rows quickly. The basic syntax is:

CREATE INDEX index_name
ON table_name (column1, column2, ...);

 

2. Query Optimization

  • Use EXPLAIN PLAN to analyze query performance.
  • Avoid SELECT *; fetch only needed columns.
  • Use prepared statements to reduce parsing overhead.

3. Connection Pooling

  • Use tools like HikariCP or Apache DBCP for efficient connection management.
  • Tune pool size based on application load.
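As a starting point for pool size, the HikariCP wiki page "About Pool Sizing" quotes the rule of thumb connections = (core_count * 2) + effective_spindle_count, where the core count is that of the database server. A small sketch of that formula; the class and method names are made up for illustration, and the result should be treated as a first guess to validate under load.

```java
final class PoolSizeSketch {

    // Rule-of-thumb pool size: (database server cores * 2) + effective spindle count.
    static int suggestedPoolSize(int coreCount, int effectiveSpindleCount) {
        if (coreCount < 1 || effectiveSpindleCount < 0) {
            throw new IllegalArgumentException("coreCount must be >= 1 and spindle count >= 0");
        }
        return coreCount * 2 + effectiveSpindleCount;
    }

    public static void main(String[] args) {
        // A 4-core database server with a single disk.
        System.out.println(suggestedPoolSize(4, 1)); // prints 9
    }
}
```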

Why Use HikariCP?

  • High performance: Minimal overhead and fast connection acquisition.
  • Lightweight: Small footprint and low latency.
  • Reliable: Actively maintained and widely adopted (used by Spring Boot by default).
  • Advanced tuning: Offers fine-grained control over pool behavior.

Maven dependency:

<dependency>
    <groupId>com.zaxxer</groupId>
    <artifactId>HikariCP</artifactId>
    <version>5.0.1</version> <!-- Check for latest version -->
</dependency>

Spring Boot configuration (application.properties):

spring.datasource.url=jdbc:mysql://localhost:3306/mydb
spring.datasource.username=user
spring.datasource.password=password
spring.datasource.hikari.maximum-pool-size=10
spring.datasource.hikari.connection-timeout=30000

 

 

4. Partitioning and Sharding

  • Partition large tables to improve query performance.
  • Use sharding for horizontal scaling in distributed databases.
    Partitioning in databases is a technique used to divide large tables into smaller, more manageable pieces while maintaining the logical integrity of the data. It improves performance, scalability, and maintenance, especially in high-volume transactional or analytical systems.

1. Horizontal Partitioning (Sharding)

Splits rows across multiple tables or databases.

Common in distributed systems.

Example: Users from different regions stored in separate databases.

2. Vertical Partitioning

Splits columns into separate tables.

Useful when some columns are accessed more frequently than others.

Example: Separating user profile info from login credentials.

 Benefits of Partitioning

  •  Improved Query Performance: Queries scan only relevant partitions.
  •  Faster Maintenance: Easier to archive, delete, or backup partitions.
  •  Parallelism: Queries and operations can run concurrently on different partitions.
  •  Scalability: Supports large datasets without degrading performance.
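To make sharding concrete, here is a minimal sketch of a router that picks a shard from the hash of a key. The class name and the shard names are made up for illustration. Note that plain modulo hashing remaps most keys when the number of shards changes, which is why consistent hashing is a common alternative.

```java
import java.util.List;

final class ShardRouterSketch {

    private final List<String> shards;

    ShardRouterSketch(List<String> shards) {
        if (shards.isEmpty()) {
            throw new IllegalArgumentException("at least one shard is required");
        }
        this.shards = List.copyOf(shards);
    }

    // The same key always maps to the same shard while the shard list is unchanged.
    String shardFor(String key) {
        // floorMod keeps the index non-negative even when hashCode() is negative.
        int index = Math.floorMod(key.hashCode(), shards.size());
        return shards.get(index);
    }

    public static void main(String[] args) {
        ShardRouterSketch router =
                new ShardRouterSketch(List.of("shard-0", "shard-1", "shard-2"));
        System.out.println(router.shardFor("user-42"));
    }
}
```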

 

5. Caching Layer

  • Use Redis or Memcached to cache frequent queries.
  • Implement read-through or write-through caching strategies.
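The two strategies can be sketched with plain maps: one map plays the cache, another plays the database. Everything here (class name, the storeReads counter, the in-memory store) is made up for illustration; with Redis or Memcached the cache map would be replaced by client calls.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

final class ReadWriteThroughCacheSketch {

    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Map<String, String> store; // stand-in for the database
    private final AtomicInteger storeReads = new AtomicInteger();

    ReadWriteThroughCacheSketch(Map<String, String> store) {
        this.store = store;
    }

    // Read-through: on a miss, load from the store and keep the value in the cache.
    String get(String key) {
        return cache.computeIfAbsent(key, k -> {
            storeReads.incrementAndGet();
            return store.get(k); // a null result is simply not cached
        });
    }

    // Write-through: update the store first, then the cache, in the same call.
    void put(String key, String value) {
        store.put(key, value);
        cache.put(key, value);
    }

    // How many reads actually reached the store.
    int storeReads() {
        return storeReads.get();
    }

    public static void main(String[] args) {
        Map<String, String> store = new ConcurrentHashMap<>();
        store.put("user:1", "Alice");
        ReadWriteThroughCacheSketch cache = new ReadWriteThroughCacheSketch(store);
        cache.get("user:1");
        cache.get("user:1");
        System.out.println(cache.storeReads()); // prints 1
    }
}
```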

6. Replication and Read Scaling

  • Use read replicas to distribute read load.
  • Ensure replication lag is monitored and managed.

7. Batching and Pagination

  • Batch inserts/updates to reduce transaction overhead.
  • Use pagination for large result sets to avoid memory issues.
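Both ideas can be sketched in memory. The helper names are made up for illustration; against a real database the batches would go through JDBC batch statements, and paging would be pushed into the query rather than done on a loaded list.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchAndPageSketch {

    // Splits items into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> toBatches(List<T> items, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    // Returns one zero-based page; pages past the end are empty.
    static <T> List<T> page(List<T> items, int pageNumber, int pageSize) {
        if (pageNumber < 0 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber must be >= 0 and pageSize >= 1");
        }
        long from = (long) pageNumber * pageSize; // long avoids int overflow
        if (from >= items.size()) {
            return List.of();
        }
        int to = (int) Math.min(from + pageSize, items.size());
        return new ArrayList<>(items.subList((int) from, to));
    }

    public static void main(String[] args) {
        List<Integer> ids = List.of(1, 2, 3, 4, 5);
        System.out.println(toBatches(ids, 2)); // prints [[1, 2], [3, 4], [5]]
        System.out.println(page(ids, 1, 2));   // prints [3, 4]
    }
}
```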

🧪 Monitoring & Profiling Tools

  • Java: JProfiler, JMC, Prometheus + Grafana, Micrometer
  • Database: pg_stat_statements (PostgreSQL), Performance Schema (MySQL), AWR (Oracle)
