Designing Scalable Systems

Q: How do you design scalable applications?
A: “I apply hexagonal architecture and domain-driven design, as seen in NAV Station and PDS projects. I’ve used design patterns like Adapter and Chain of Responsibility to improve flexibility and maintainability.”

Microservices: Break the application down into small, domain-focused modules in the microservices style.

Load distribution: Spread traffic with load balancers and protect calls to downstream services with circuit breakers.

Event-driven: Implement an event-driven architecture; use Kafka or another messaging framework.

Caching: Implement a caching mechanism such as Redis or Caffeine.

Virtual threads: Build multithreaded applications using virtual threads.

Database optimization:

Write optimized queries so that we achieve higher performance.

Use a HikariCP connection pool for database access.

Implement indexing, partitioning, and sharding.

Java code optimization: Apply multithreading practices, parallel streams, and thread-safe collections such as ConcurrentHashMap instead of HashMap where data is shared between threads.

Batching and pagination:

Batch inserts/updates to reduce transaction overhead.
Use pagination for large result sets to avoid memory issues.
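
The circuit breaker mentioned under load distribution above can be sketched in plain Java. This is a minimal illustration, not a production library: the class name SimpleCircuitBreaker, its threshold, and its cool-down parameter are all assumptions made for the example.

```java
import java.util.function.Supplier;

// Hypothetical minimal circuit breaker: opens after N consecutive failures,
// then fails fast until a cool-down period has passed.
final class SimpleCircuitBreaker {
    private final int failureThreshold;
    private final long openMillis;
    private int consecutiveFailures = 0;
    private long openedAt = 0L;
    private boolean open = false;

    SimpleCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    // Runs the action, or returns the fallback while the breaker is open.
    // Synchronized for simplicity; this serializes calls through the breaker.
    synchronized <T> T call(Supplier<T> action, T fallback) {
        if (open) {
            if (System.currentTimeMillis() - openedAt < openMillis) {
                return fallback; // fail fast, do not touch the downstream service
            }
            open = false; // cool-down elapsed: allow one trial call
        }
        try {
            T result = action.get();
            consecutiveFailures = 0; // success resets the failure count
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                open = true;
                openedAt = System.currentTimeMillis();
            }
            return fallback;
        }
    }

    synchronized boolean isOpen() {
        return open;
    }
}
```

In a real service a maintained library would normally take this role; the sketch only shows the state change from closed to open and back.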

Optional

Code quality:
Hexagonal architecture: Structure each module in the hexagonal (ports and adapters) style.

Clean code: Write clean code, cover it with unit tests, and aim for more than 90% coverage.

Implement CI/CD with pipelines that run the tests in parallel.

Set up Git hooks that run the ArchUnit tests to make sure we are maintaining the architecture, and that also run the Sonar coverage checks.

For documentation we have been using devdocs.
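
To make the hexagonal style above concrete, here is a small sketch of one port with one adapter. The names OrderRepository, OrderService, and InMemoryOrderRepository are hypothetical and not taken from any of the projects mentioned; a database-backed adapter would implement the same port.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Port: what the domain needs from persistence, with no framework types.
interface OrderRepository {
    void save(String orderId, int quantity);
    Optional<Integer> findQuantity(String orderId);
}

// Domain service: depends only on the port, never on a concrete adapter.
final class OrderService {
    private final OrderRepository repository;

    OrderService(OrderRepository repository) {
        this.repository = repository;
    }

    void placeOrder(String orderId, int quantity) {
        if (quantity <= 0) {
            throw new IllegalArgumentException("quantity must be positive");
        }
        repository.save(orderId, quantity);
    }

    int quantityOf(String orderId) {
        return repository.findQuantity(orderId).orElse(0);
    }
}

// Adapter: an in-memory implementation, handy for unit tests.
final class InMemoryOrderRepository implements OrderRepository {
    private final Map<String, Integer> rows = new HashMap<>();

    @Override
    public void save(String orderId, int quantity) {
        rows.put(orderId, quantity);
    }

    @Override
    public Optional<Integer> findQuantity(String orderId) {
        return Optional.ofNullable(rows.get(orderId));
    }
}
```

Because the service only sees the port, the in-memory adapter can be swapped for a real one without touching domain code, which is also what makes high unit-test coverage practical.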

Full detail:

Improving performance in a highly scalable Java application and its database involves a combination of architectural decisions, code-level optimizations, and infrastructure tuning. Here's a breakdown of strategies for both:

🔧 Java Application Performance

1. Concurrency and Thread Management

Use non-blocking I/O; we used reactive programming (Project Reactor) in NAV Station and Prisym.
Leverage ExecutorService or ForkJoinPool for parallel processing.
Leverage parallel streams and CompletableFuture (supplyAsync or runAsync).

Avoid thread contention and excessive synchronization.

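import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

ExecutorService executor = Executors.newFixedThreadPool(4);
List<List<String>> batches = ...; // divide your data into batches
for (List<String> batch : batches) {
    executor.submit(() -> processBatch(batch)); // each batch runs on a pool thread
}
executor.shutdown(); // stop accepting new tasks; submitted batches still finish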
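
The CompletableFuture option from the list above can be sketched the same way. This is an illustration under assumptions: AsyncBatchDemo and its processBatch (which just sums string lengths) are placeholders for real work.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

final class AsyncBatchDemo {
    // Hypothetical unit of work: total length of the strings in one batch.
    static int processBatch(List<String> batch) {
        return batch.stream().mapToInt(String::length).sum();
    }

    // Starts one async task per batch, then combines the results.
    static int processAll(List<List<String>> batches) {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            List<CompletableFuture<Integer>> futures = batches.stream()
                    .map(batch -> CompletableFuture.supplyAsync(() -> processBatch(batch), executor))
                    .collect(Collectors.toList());
            // join() waits for each task; all tasks were already started above.
            return futures.stream().mapToInt(CompletableFuture::join).sum();
        } finally {
            executor.shutdown();
        }
    }
}
```

Collecting the futures into a list before joining matters: joining inside the same stream pipeline that creates them would run the batches one at a time.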

2. JVM Tuning

Profile with tools like JVisualVM, YourKit, or Java Flight Recorder, or use Datadog to monitor JVM usage, and tune performance accordingly.
Tune Garbage Collection (GC) based on workload (e.g., G1GC for low pause times).
Set appropriate heap sizes (-Xms, -Xmx) and thread stack sizes.
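
After changing heap or GC flags, it can help to confirm what the running JVM actually picked up. The sketch below uses only standard JDK APIs; the class name JvmSettingsReport is made up for this example.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class JvmSettingsReport {
    // Maximum heap the JVM will try to use (reflects -Xmx), in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    // Names of the active garbage collectors as reported by the JVM.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Max heap (MiB): " + maxHeapMiB());
        System.out.println("Collectors: " + collectorNames());
    }
}
```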

3. Efficient Data Structures and Algorithms

Use appropriate collections (ConcurrentHashMap, ArrayList, etc.).
Synchronized list:
List<String> list = Collections.synchronizedList(new ArrayList<>());
Avoid unnecessary object creation and boxing/unboxing.
Minimize memory footprint by reusing objects (e.g., object pools).
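
As a small illustration of choosing a concurrent collection, the sketch below counts words from several threads with ConcurrentHashMap.merge, which updates each key atomically. The class ConcurrentCounter and its method are hypothetical names.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class ConcurrentCounter {
    // Counts occurrences of each word using a pool of worker threads.
    static Map<String, Integer> countInParallel(List<String> words, int threads) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String word : words) {
            // merge is atomic per key, so no external lock is needed
            pool.execute(() -> counts.merge(word, 1, Integer::sum));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("counting did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counts;
    }
}
```

With a plain HashMap the same loop could lose updates or corrupt the map, which is the reason to prefer ConcurrentHashMap for shared state.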

4. Caching

Use in-memory caches like Caffeine, Ehcache, or Guava Cache.
For distributed caching, use Redis or Hazelcast.
Cache expensive computations and frequently accessed data.
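
To show the idea behind a bounded in-memory cache without pulling in a library, here is a least-recently-used cache built on the JDK's LinkedHashMap. It is a stand-in for illustration only and is not the Caffeine, Ehcache, or Guava API; the class name LruCache is an assumption, and the class is not thread-safe.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache: evicts the least recently accessed entry
// once the configured capacity is exceeded. Not thread-safe.
final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        super(16, 0.75f, true); // true = iterate in access order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

The caching libraries named above add what this sketch lacks, such as thread safety, expiry policies, and hit-rate statistics.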

5. Asynchronous Processing

Use message queues (Kafka, RabbitMQ) for decoupling and async workflows.
Offload heavy tasks to background workers.
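
The queue-plus-worker pattern can be shown in a single process with a BlockingQueue. This is only an in-memory analogy for a broker such as Kafka or RabbitMQ, not their client APIs; BackgroundWorkerDemo and the STOP marker are invented for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class BackgroundWorkerDemo {
    private static final String STOP = "__stop__"; // marker that ends the worker

    // Hands tasks to a background worker through a queue and waits for it to drain.
    static List<String> runWorker(List<String> tasks) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        List<String> processed = Collections.synchronizedList(new ArrayList<>());

        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    String task = queue.take(); // blocks until a task arrives
                    if (task.equals(STOP)) {
                        return;
                    }
                    processed.add(task.toUpperCase(Locale.ROOT)); // the "heavy" work
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();

        tasks.forEach(queue::add); // the producer returns immediately after enqueueing
        queue.add(STOP);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return processed;
    }
}
```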

6. Microservices and Load Distribution

Break monoliths into microservices for better scalability.
Use API gateways, service registries, and load balancers.
Within each service, use virtual threads, parallel streams, and CompletableFuture for concurrent work.
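
A minimal virtual-thread sketch, assuming Java 21 or later (where Executors.newVirtualThreadPerTaskExecutor is a final API). VirtualThreadDemo and fetchLength are hypothetical; the sleep stands in for a blocking I/O call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class VirtualThreadDemo {
    // Hypothetical blocking call, e.g. a remote lookup.
    static int fetchLength(String id) {
        try {
            Thread.sleep(10);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return id.length();
    }

    // Runs one virtual thread per id and sums the results.
    static int totalLength(List<String> ids) {
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Future<Integer>> futures = new ArrayList<>();
            for (String id : ids) {
                futures.add(executor.submit(() -> fetchLength(id)));
            }
            int total = 0;
            for (Future<Integer> future : futures) {
                total += future.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }
}
```

Virtual threads suit workloads that spend most of their time blocked on I/O; for CPU-bound work a fixed pool sized to the cores is usually the better fit.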

🗄️ Database Performance

1. Indexing

Create indexes on frequently queried columns.
Use composite indexes for multi-column filters.
Monitor and remove unused indexes to reduce write overhead.
Creating an index in Oracle Database helps improve query performance by allowing the database engine to quickly locate rows. A basic index is created like this:

CREATE INDEX index_name
ON table_name (column1, column2, ...);

2. Query Optimization

Use EXPLAIN PLAN to analyze query performance.
Avoid SELECT *; fetch only needed columns.
Use prepared statements to reduce parsing overhead.

3. Connection Pooling

Use tools like HikariCP or Apache DBCP for efficient connection management.
Tune pool size based on application load.

Why Use HikariCP?

High performance: Minimal overhead and fast connection acquisition.

Lightweight: Small footprint and low latency.

Reliable: Actively maintained and widely adopted (used by Spring Boot by default).

Advanced tuning: Offers fine-grained control over pool behavior.

<dependency>
    <groupId>com.zaxxer</groupId>
    <artifactId>HikariCP</artifactId>
</dependency>

spring.datasource.url=jdbc:mysql://localhost:3306/mydb
spring.datasource.username=user
spring.datasource.password=password
spring.datasource.hikari.maximum-pool-size=10
spring.datasource.hikari.connection-timeout=30000

4. Partitioning and Sharding

Partition large tables to improve query performance.
Use sharding for horizontal scaling in distributed databases.
Partitioning in databases is a technique used to divide large tables into smaller, more manageable pieces while maintaining the logical integrity of the data. It improves performance, scalability, and maintenance—especially in high-volume transactional or analytical systems.

1. Horizontal Partitioning (Sharding)

Splits rows across multiple tables or databases.

Common in distributed systems.

Example: Users from different regions stored in separate databases.

2. Vertical Partitioning

Splits columns into separate tables.

Useful when some columns are accessed more frequently than others.

Example: Separating user profile info from login credentials.

Benefits of Partitioning

✅ Improved Query Performance: Queries scan only relevant partitions.
✅ Faster Maintenance: Easier to archive, delete, or backup partitions.
✅ Parallelism: Queries and operations can run concurrently on different partitions.
✅ Scalability: Supports large datasets without degrading performance.
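
For sharding, the core decision is which shard a given key lives on. A minimal sketch using hash-modulo routing; ShardRouter is a hypothetical name, and real systems often use consistent hashing or a lookup table instead.

```java
final class ShardRouter {
    private final int shardCount;

    ShardRouter(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.shardCount = shardCount;
    }

    // Maps a key to a shard index in [0, shardCount).
    // floorMod keeps the result non-negative even for negative hash codes.
    int shardFor(String key) {
        return Math.floorMod(key.hashCode(), shardCount);
    }
}
```

The same key always lands on the same shard, but changing shardCount remaps most keys, which is the main drawback of plain modulo routing.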

5. Caching Layer

Use Redis or Memcached to cache frequent queries.
Implement read-through or write-through caching strategies.
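
The two strategies can be sketched with a plain map standing in for Redis or Memcached. ReadWriteThroughCache is a hypothetical class; the loader and writer functions represent whatever reads from and writes to the database. The sketch is single-threaded.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Function;

final class ReadWriteThroughCache<K, V> {
    private final Map<K, V> cache = new HashMap<>();
    private final Function<K, V> loader;   // reads a value from the backing store
    private final BiConsumer<K, V> writer; // writes a value to the backing store

    ReadWriteThroughCache(Function<K, V> loader, BiConsumer<K, V> writer) {
        this.loader = loader;
        this.writer = writer;
    }

    // Read-through: on a miss, load from the store and remember the result.
    V get(K key) {
        return cache.computeIfAbsent(key, loader);
    }

    // Write-through: update the store first, then the cache, so they stay in step.
    void put(K key, V value) {
        writer.accept(key, value);
        cache.put(key, value);
    }
}
```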

6. Replication and Read Scaling

Use read replicas to distribute read load.
Ensure replication lag is monitored and managed.
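
One simple way to spread reads is to send writes to the primary and rotate reads across replicas. The sketch below only picks a target name; ReplicaRouter and the node names are assumptions, and it deliberately ignores replication lag.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class ReplicaRouter {
    private final String primary;
    private final List<String> replicas;
    private final AtomicInteger next = new AtomicInteger();

    ReplicaRouter(String primary, List<String> replicas) {
        this.primary = primary;
        this.replicas = List.copyOf(replicas);
    }

    // All writes go to the primary.
    String forWrite() {
        return primary;
    }

    // Reads rotate round-robin over the replicas; fall back to the primary if none exist.
    String forRead() {
        if (replicas.isEmpty()) {
            return primary;
        }
        int index = Math.floorMod(next.getAndIncrement(), replicas.size());
        return replicas.get(index);
    }
}
```

Reads that must see a just-written value would still need to go to the primary, since a replica may lag behind.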

7. Batching and Pagination

Batch inserts/updates to reduce transaction overhead.
Use pagination for large result sets to avoid memory issues.
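
The mechanics of both ideas can be shown on an in-memory list. This is a sketch with hypothetical names (Chunks, batches, page); against a database the same split would feed JDBC batch statements, and the page window would become a LIMIT/OFFSET or keyset query.

```java
import java.util.ArrayList;
import java.util.List;

final class Chunks {
    // Splits items into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> batches(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> result = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            result.add(new ArrayList<>(items.subList(start, end)));
        }
        return result;
    }

    // Returns one zero-based page; an out-of-range page yields an empty list.
    static <T> List<T> page(List<T> items, int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        long from = (long) pageIndex * pageSize; // long avoids int overflow
        if (from >= items.size()) {
            return new ArrayList<>();
        }
        int to = (int) Math.min(from + pageSize, items.size());
        return new ArrayList<>(items.subList((int) from, to));
    }
}
```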

🧪 Monitoring & Profiling Tools

Java: JProfiler, JMC, Prometheus + Grafana, Micrometer
Database: pg_stat_statements (PostgreSQL), Performance Schema (MySQL), AWR (Oracle)
