Designing a Login System with OTP: A Distributed Systems Perspective
Designing authentication systems is a common topic in backend interviews. One popular scenario is designing a login system with OTP (One-Time Password) verification.
At first glance, the solution seems straightforward: generate an OTP, store it temporarily, and verify it when the user submits it.
However, once we move from a simple design discussion to real-world distributed systems, things become more interesting.
Let’s walk through a typical interview conversation and uncover the deeper architectural concepts behind it.
The Interview Scenario
Imagine an interviewer asks the following question:
Design a login system with OTP verification. The OTP should be valid for 5 minutes.
A common candidate response might look like this:
Candidate:
“I’ll generate an OTP and store it in Redis with a TTL of 5 minutes. When the user submits the OTP, the system reads the stored value and compares it.”
This is a reasonable and practical approach.
Redis works well because it provides:
- Extremely fast reads and writes
- Built-in TTL support
- Distributed access across multiple services
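The candidate's approach can be sketched in a few lines. This is a minimal illustration, not a production implementation: `TtlStore` is a hypothetical stand-in for Redis (its `setex`/`get` mimic the `SETEX`/`GET` commands), so the expiry semantics can be shown without a running server.

```python
import secrets
import time

class TtlStore:
    """Stand-in for a cache with TTL (what Redis SETEX/GET provides).
    In production you would call Redis instead; this dict-based version
    only illustrates the expiry behavior."""

    def __init__(self):
        self._data = {}  # key -> (value, expires_at)

    def setex(self, key, ttl_seconds, value):
        self._data[key] = (value, time.monotonic() + ttl_seconds)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # lazy expiry on access, as Redis does
            return None
        return value

def generate_otp(store, user_id, ttl_seconds=300):
    otp = f"{secrets.randbelow(1_000_000):06d}"  # random 6-digit code
    store.setex(f"otp:{user_id}", ttl_seconds, otp)
    return otp

def verify_otp(store, user_id, submitted):
    stored = store.get(f"otp:{user_id}")
    # constant-time comparison avoids leaking information via timing
    return stored is not None and secrets.compare_digest(stored, submitted)

store = TtlStore()
otp = generate_otp(store, "user123")
print(verify_otp(store, "user123", otp))  # True while the OTP is still live
```

Note the use of `secrets` rather than `random`: OTPs are security-sensitive, so they should come from a cryptographically secure source.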
However, interviewers often push deeper.
The Follow-Up Question
The interviewer might ask:
Why use a distributed cache like Redis? Why not store the OTP in memory?
The typical response is:
- Memory limitations
- Pod restarts might cause data loss
These are valid concerns, but the interviewer may continue.
The Trick Question
Now the interviewer says:
Assume unlimited memory and pods never crash.
At this point, many candidates get confused.
If memory is unlimited and servers never fail, why not just store OTPs in memory?
This is where the real distributed systems problem appears.
The issue is not memory and not crashes.
The real issue is state sharing.
Understanding the Real Problem
Modern backend systems rarely run as a single server.
Instead, applications typically run behind a load balancer with multiple instances (or pods).
For example:
User → Load Balancer → Pod A (Generate OTP)
User → Load Balancer → Pod B (Verify OTP)
Let’s walk through what happens.
- The user requests an OTP.
- The load balancer routes the request to Pod A.
- Pod A generates the OTP and stores it in memory.
So far, everything works.
Now the user enters the OTP and submits the verification request.
However, this request might be routed to Pod B instead of Pod A.
Pod B has no idea about the OTP stored in Pod A’s memory.
The result?
OTP verification fails even though the OTP is correct.
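The failure above can be reproduced in a few lines. In this sketch each `Pod` holds its own private dictionary, mimicking per-instance memory; the pod and user names are illustrative.

```python
import secrets

class Pod:
    """Each application instance keeps its own private in-memory store."""

    def __init__(self, name):
        self.name = name
        self.local_otps = {}  # user_id -> otp, visible only to this pod

    def generate_otp(self, user_id):
        otp = f"{secrets.randbelow(1_000_000):06d}"
        self.local_otps[user_id] = otp
        return otp

    def verify_otp(self, user_id, submitted):
        return self.local_otps.get(user_id) == submitted

pod_a, pod_b = Pod("A"), Pod("B")

# Request 1: the load balancer routes OTP generation to Pod A
otp = pod_a.generate_otp("user123")

# Request 2: verification lands on Pod B, which never saw the OTP
print(pod_b.verify_otp("user123", otp))  # False: correct OTP, wrong pod
print(pod_a.verify_otp("user123", otp))  # True, but the user cannot choose the pod
```

The OTP itself is valid; the verification fails purely because the state lives in the wrong instance's memory.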
Why This Happens
This problem occurs because application instances do not share in-memory state.
Each pod or server instance has its own memory.
So when data is stored locally:
- Other instances cannot access it
- Requests handled by different servers lose context
This creates inconsistencies in distributed systems.
The Core Principle: Stateless Services
Modern microservice architectures follow an important design principle:
Services should be stateless.
Stateless means:
- Each request should be processed independently
- No request should depend on data stored in a specific server’s memory
- Any required state should be stored externally
When a request depends on data created by a previous request, the state must be externalized.
Common external storage options include:
- Databases
- Distributed caches (Redis)
- Message queues
- Object storage
In the OTP system example, Redis works well because:
- It supports TTL
- It’s extremely fast
- It can be accessed by all service instances
This ensures every pod can read the same OTP data.
The Distributed Cache Solution
Let’s see how Redis solves the problem.
Step 1: OTP Generation
User → Load Balancer → Pod A
Pod A generates the OTP and stores it in Redis with a TTL of 5 minutes.
Example:

```
key:   otp:user123
value: 483921
ttl:   5 minutes (300 seconds)
```
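The same write can be expressed directly as Redis commands (`SETEX` stores a value with an expiry in seconds; the annotations are for illustration):

```
SETEX otp:user123 300 "483921"   # store for 300 seconds (5 minutes)
TTL otp:user123                  # remaining lifetime in seconds
GET otp:user123                  # returns "483921", or nil once expired
```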
Step 2: OTP Verification
User → Load Balancer → Pod B
Pod B retrieves the OTP from Redis and verifies it.
Since Redis is shared across all instances, the OTP is available regardless of which pod handles the request.
This ensures consistent behavior across the system.
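Reworking the earlier two-pod sketch with an external store makes the fix concrete. Here a plain shared dictionary stands in for Redis (TTL is omitted for brevity); the key point is that both pods read and write the same store instead of their own memory.

```python
import secrets

shared_cache = {}  # stand-in for Redis: one store visible to every pod

class Pod:
    def __init__(self, name, cache):
        self.name = name
        self.cache = cache  # external state instead of local memory

    def generate_otp(self, user_id):
        otp = f"{secrets.randbelow(1_000_000):06d}"
        self.cache[f"otp:{user_id}"] = otp  # in Redis: SETEX otp:user123 300 <otp>
        return otp

    def verify_otp(self, user_id, submitted):
        return self.cache.get(f"otp:{user_id}") == submitted

pod_a = Pod("A", shared_cache)
pod_b = Pod("B", shared_cache)

otp = pod_a.generate_otp("user123")      # generated on Pod A
print(pod_b.verify_otp("user123", otp))  # True: Pod B reads the same cache
```

With the state externalized, the load balancer is free to route each request to any pod.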
Can In-Memory Storage Still Work?
Interestingly, the answer is yes: technically, in-memory storage can work.
But it requires a specific load balancing strategy called sticky sessions.
Sticky Sessions Explained
Sticky sessions (also called session affinity) ensure that all requests from a user are routed to the same server instance.
Example flow:
User → Load Balancer → Pod A
The load balancer then “pins” the session to Pod A.
All future requests from that user go to Pod A.
So when the user submits the OTP:
User → Load Balancer → Pod A
Since both requests go to the same pod, the OTP stored in memory is accessible.
This makes in-memory storage possible.
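One simple way to implement session affinity is to route by a hash of the user's identifier, so the same user deterministically lands on the same pod. This is only one illustration: real load balancers typically pin sessions via cookies or source-IP affinity instead.

```python
import hashlib

pods = ["pod-a", "pod-b", "pod-c"]  # hypothetical instance names

def route_sticky(user_id: str) -> str:
    """Pick a pod by hashing the user id: the same user always lands on
    the same pod, so that pod's local memory is sufficient."""
    digest = hashlib.sha256(user_id.encode()).digest()
    return pods[int.from_bytes(digest, "big") % len(pods)]

first = route_sticky("user123")   # OTP generation request
second = route_sticky("user123")  # OTP verification request
print(first == second)  # True: both requests hit the same pod
```

Note the fragility this introduces: if the pod set changes (scale-up, scale-down, or a crash), the hash mapping shifts and in-flight sessions land on the wrong instance.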
Why Sticky Sessions Are Problematic
Although sticky sessions work, they introduce several architectural drawbacks.
1. Reduced Scalability
Sticky sessions reduce load-balancing flexibility.
Instead of distributing traffic evenly across pods, the load balancer must maintain per-user routing state.
This can concentrate traffic on a few instances and create uneven load distribution.
2. Reduced Elasticity
Modern systems often scale dynamically.
In container orchestration platforms like Kubernetes:
- Pods may be created automatically during high traffic
- Pods may be terminated during low traffic
Sticky sessions make this harder because sessions are tied to specific instances.
3. Failure Handling
If the pod handling a user session crashes, the session state disappears.
Users may be forced to restart the authentication process.
4. Operational Complexity
Sticky sessions introduce routing dependencies that complicate infrastructure configuration.
In highly scalable systems, this is generally avoided.
Why Distributed Caches Are Preferred
Because of these limitations, most production systems rely on distributed caching systems like Redis.
Redis provides several advantages:
Shared Access
All service instances can access the same data.
Built-in Expiration
Redis TTL automatically removes expired OTPs.
High Performance
Redis keeps data in memory, so typical operations complete in well under a millisecond.
Scalability
Redis clusters can scale horizontally.
Fault Tolerance
Redis can be configured with replication and failover.
These properties make Redis ideal for short-lived authentication data like OTPs.
Real Production Considerations
In real-world systems, several additional factors must be considered when designing OTP systems.
Rate Limiting
Prevent users from requesting unlimited OTPs.
OTP Expiry
Ensure OTPs expire automatically to improve security.
Retry Limits
Limit the number of verification attempts.
Security
Never log or expose OTP values.
Monitoring
Track authentication failures and suspicious activity.
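The retry-limit consideration can be sketched with a small counter. This is a simplified, single-process illustration with a hypothetical policy of five attempts: in production the counter would live in Redis (e.g. `INCR` plus `EXPIRE`) so that all pods share it, for exactly the state-sharing reasons discussed above.

```python
class RetryLimiter:
    """Caps OTP verification attempts per user."""

    def __init__(self, max_attempts=5):
        self.max_attempts = max_attempts
        self.failed = {}  # user_id -> failure count (in Redis: INCR + EXPIRE)

    def allow(self, user_id):
        """Is this user still permitted to attempt verification?"""
        return self.failed.get(user_id, 0) < self.max_attempts

    def record_failure(self, user_id):
        self.failed[user_id] = self.failed.get(user_id, 0) + 1

limiter = RetryLimiter(max_attempts=3)
for _ in range(3):
    limiter.record_failure("user123")
print(limiter.allow("user123"))  # False: locked out after three failures
```

A locked-out user would then need to request a fresh OTP or wait for a cooldown, which blunts brute-force guessing of the 6-digit code.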
Key Takeaways
Designing a login system with OTP might appear simple, but it highlights several important distributed system concepts.
Here are the key lessons:
- Stateless services are essential in distributed systems
- Application instances should not rely on local memory for shared data
- Distributed caches help maintain consistency across multiple service instances
- Sticky sessions can work but reduce scalability and elasticity
- Redis is commonly used for temporary authentication data due to its speed and TTL support
Final Thoughts
Interview questions about OTP systems are rarely about OTPs themselves.
They are designed to test your understanding of distributed system design principles.
The real insight is recognizing that the challenge is not memory limitations or server crashes.
The true issue is state sharing across multiple service instances.
Once you understand this, the architectural choice becomes clear:
For scalable and reliable systems, state must be externalized.
That’s why distributed caches like Redis are widely used in production authentication systems.
Navya S
Java developer and blogger. Passionate about clean code, JVM internals, and sharing knowledge with the community.