Designing a Login System with OTP: A Distributed Systems Perspective

Designing authentication systems is a common topic in backend interviews. One popular scenario is designing a login system with OTP (One-Time Password) verification.

At first glance, the solution seems straightforward. Generate an OTP, store it temporarily, and verify it when the user submits it.

However, once we move from a simple design discussion to real-world distributed systems, things become more interesting.

Let’s walk through a typical interview conversation and uncover the deeper architectural concepts behind it.


The Interview Scenario

Imagine an interviewer asks the following question:

Design a login system with OTP verification. The OTP should be valid for 5 minutes.

A common candidate response might look like this:

Candidate:
“I’ll generate an OTP and store it in Redis with a TTL of 5 minutes. When the user submits the OTP, the system reads the stored value and compares it.”

This is a reasonable and practical approach.

Redis works well because it provides:

  • Extremely fast reads and writes
  • Built-in TTL support
  • Distributed access across multiple services
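The candidate’s approach can be sketched in a few lines. The sketch below is illustrative Python: `TtlStore` is a hypothetical in-process stand-in for Redis (in production you would call `SETEX` and `GET` on a Redis client), and `generate_otp` uses the `secrets` module so codes are unpredictable.

```python
import secrets
import time

class TtlStore:
    """In-process stand-in for Redis: values expire after a TTL."""
    def __init__(self):
        self._data = {}  # key -> (value, expires_at)

    def setex(self, key, ttl_seconds, value):
        # Mirrors Redis SETEX: store a value with an expiry time.
        self._data[key] = (value, time.monotonic() + ttl_seconds)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # expired, behave like Redis TTL
            return None
        return value

def generate_otp():
    """Return a 6-digit OTP as a zero-padded string."""
    return f"{secrets.randbelow(1_000_000):06d}"

def verify_otp(store, user_key, submitted):
    stored = store.get(user_key)
    return stored is not None and stored == submitted

store = TtlStore()
otp = generate_otp()
store.setex("otp:user123", 300, otp)  # valid for 5 minutes
```

After the TTL elapses, `store.get("otp:user123")` returns `None` and verification fails, exactly as an expired Redis key would.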

However, interviewers often push deeper.


The Follow-Up Question

The interviewer might ask:

Why use a distributed cache like Redis? Why not store the OTP in memory?

The typical response is:

  • Memory limitations
  • Pod restarts might cause data loss

These are valid concerns, but the interviewer may continue.


The Trick Question

Now the interviewer says:

Assume unlimited memory and pods never crash.

At this point, many candidates get confused.

If memory is unlimited and servers never fail, why not just store OTPs in memory?

This is where the real distributed systems problem appears.

The issue is neither memory limits nor crashes.

The real issue is state sharing.


Understanding the Real Problem

Modern backend systems rarely run as a single server.

Instead, applications typically run behind a load balancer with multiple instances (or pods).

For example:

User → Load Balancer → Pod A (Generate OTP)
User → Load Balancer → Pod B (Verify OTP)

Let’s walk through what happens.

  1. The user requests an OTP.
  2. The load balancer routes the request to Pod A.
  3. Pod A generates the OTP and stores it in memory.

So far, everything works.

Now the user enters the OTP and submits the verification request.

However, this request might be routed to Pod B instead of Pod A.

Pod B has no idea about the OTP stored in Pod A’s memory.

The result?

OTP verification fails even though the OTP is correct.
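The failure mode above can be reproduced with two plain dictionaries, one per pod. The names `pod_a_memory` and `pod_b_memory` are illustrative; the load balancer is simulated by simply sending the second request to the other pod.

```python
# Each pod keeps its own private in-memory store.
pod_a_memory = {}
pod_b_memory = {}

def verify_on_pod(pod_memory, key, submitted):
    # A pod can only see its own local memory.
    return pod_memory.get(key) == submitted

# 1. The load balancer routes the OTP request to Pod A.
pod_a_memory["otp:user123"] = "483921"

# 2. The verification request lands on Pod B instead.
result = verify_on_pod(pod_b_memory, "otp:user123", "483921")
# result is False: Pod B has never seen this OTP.
```

The same call against `pod_a_memory` would succeed, which is exactly the inconsistency: correctness now depends on which pod the load balancer happens to pick.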


Why This Happens

This problem occurs because application instances do not share in-memory state.

Each pod or server instance has its own memory.

So when data is stored locally:

  • Other instances cannot access it
  • Requests handled by different servers lose context

This creates inconsistencies in distributed systems.


The Core Principle: Stateless Services

Modern microservice architectures follow an important design principle:

Services should be stateless.

Stateless means:

  • Each request should be processed independently
  • No request should depend on data stored in a specific server’s memory
  • Any required state should be stored externally

When a request depends on data created by a previous request, the state must be externalized.

Common external storage options include:

  • Databases
  • Distributed caches (Redis)
  • Message queues
  • Object storage

In the OTP system example, Redis works well because:

  • It supports TTL
  • It’s extremely fast
  • It can be accessed by all service instances

This ensures every pod can read the same OTP data.


The Distributed Cache Solution

Let’s see how Redis solves the problem.

Step 1: OTP Generation

User → Load Balancer → Pod A

Pod A generates the OTP and stores it in Redis with a TTL of 5 minutes.

Example:

key: otp:user123
value: 483921
ttl: 5 minutes

Step 2: OTP Verification

User → Load Balancer → Pod B

Pod B retrieves the OTP from Redis and verifies it.

Since Redis is shared across all instances, the OTP is available regardless of which pod handles the request.

This ensures consistent behavior across the system.
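The same two-pod flow succeeds once the store is shared. In this sketch a plain dictionary stands in for the Redis instance that both pods connect to (TTL handling omitted for brevity; the previous examples show it):

```python
# One shared store (Redis in production); every pod holds a reference to it.
shared_store = {}

def pod_generate(store, user_key, otp):
    # Pod A: write to the shared store, not to local memory.
    store[user_key] = otp

def pod_verify(store, user_key, submitted):
    # Pod B: read from the same shared store.
    return store.get(user_key) == submitted

pod_generate(shared_store, "otp:user123", "483921")     # handled by Pod A
ok = pod_verify(shared_store, "otp:user123", "483921")  # handled by Pod B
# ok is True: any pod can verify the OTP.
```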


Can In-Memory Storage Still Work?

Interestingly, the answer is yes — technically it can work.

But it requires a specific load balancing strategy called sticky sessions.


Sticky Sessions Explained

Sticky sessions (also called session affinity) ensure that all requests from a user are routed to the same server instance.

Example flow:

User → Load Balancer → Pod A

The load balancer then “pins” the session to Pod A.

All future requests from that user go to Pod A.

So when the user submits the OTP:

User → Load Balancer → Pod A

Since both requests go to the same pod, the OTP stored in memory is accessible.

This makes in-memory storage possible.
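Sticky routing is often implemented by hashing a stable client identifier so the same user always maps to the same instance. A hypothetical sketch (real load balancers typically use cookies or consistent hashing, and the pod names here are made up):

```python
import hashlib

PODS = ["pod-a", "pod-b", "pod-c"]

def route(user_id):
    """Deterministically map a user to one pod (session affinity)."""
    digest = hashlib.sha256(user_id.encode()).digest()
    return PODS[digest[0] % len(PODS)]

# Both the OTP request and the verification request from the same
# user hash to the same pod, so that pod's local memory suffices.
first = route("user123")
second = route("user123")
# first == second always holds for a fixed pod list.
```

Note the caveat baked into that last comment: the mapping is only stable while the pod list is fixed. Add or remove a pod and many users are remapped, losing their in-memory state.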


Why Sticky Sessions Are Problematic

Although sticky sessions work, they introduce several architectural drawbacks.

1. Reduced Scalability

Sticky sessions reduce load balancing flexibility.

Instead of distributing traffic evenly across pods, the load balancer must maintain per-user routing state.

This can concentrate load on whichever pods happen to hold the most active sessions.


2. Reduced Elasticity

Modern systems often scale dynamically.

In container orchestration platforms like Kubernetes:

  • Pods may be created automatically during high traffic
  • Pods may be terminated during low traffic

Sticky sessions make this harder because sessions are tied to specific instances.


3. Failure Handling

If the pod handling a user session crashes, the session state disappears.

Users may be forced to restart the authentication process.


4. Operational Complexity

Sticky sessions introduce routing dependencies that complicate infrastructure configuration.

In highly scalable systems, this is generally avoided.


Why Distributed Caches Are Preferred

Because of these limitations, most production systems rely on distributed caching systems like Redis.

Redis provides several advantages:

Shared Access

All service instances can access the same data.

Built-in Expiration

Redis TTL automatically removes expired OTPs.

High Performance

Redis operations are extremely fast.

Scalability

Redis clusters can scale horizontally.

Fault Tolerance

Redis can be configured with replication and failover.

These properties make Redis ideal for short-lived authentication data like OTPs.


Real Production Considerations

In real-world systems, several additional factors must be considered when designing OTP systems.

Rate Limiting

Prevent users from requesting unlimited OTPs.

OTP Expiry

Ensure OTPs expire automatically to improve security.

Retry Limits

Limit the number of verification attempts.

Security

Never log or expose OTP values.

Monitoring

Track authentication failures and suspicious activity.
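Several of these checks reduce to simple counters and comparisons. A minimal sketch, with assumptions labeled: the limits (3 OTP requests per window, 5 verification attempts) are illustrative, the counters would live in Redis (e.g. `INCR` with a TTL) rather than in process memory, and `hmac.compare_digest` is used so the comparison does not leak timing information.

```python
import hmac

MAX_REQUESTS_PER_WINDOW = 3   # illustrative rate limit on OTP generation
MAX_VERIFY_ATTEMPTS = 5       # illustrative retry limit on verification

request_counts = {}   # user -> OTPs requested in the current window
attempt_counts = {}   # user -> failed verification attempts

def may_request_otp(user):
    # Rate limiting: cap how many OTPs a user can request per window.
    count = request_counts.get(user, 0) + 1
    request_counts[user] = count
    return count <= MAX_REQUESTS_PER_WINDOW

def verify(user, stored_otp, submitted):
    # Retry limit: refuse further attempts once the budget is spent.
    attempts = attempt_counts.get(user, 0)
    if attempts >= MAX_VERIFY_ATTEMPTS:
        return False  # locked out until the OTP expires
    # Constant-time comparison avoids leaking the OTP via timing.
    if hmac.compare_digest(stored_otp, submitted):
        return True
    attempt_counts[user] = attempts + 1
    return False
```

Note that `verify` never logs or returns the stored OTP itself, only a boolean, which matches the security point above.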


Key Takeaways

Designing a login system with OTP might appear simple, but it highlights several important distributed system concepts.

Here are the key lessons:

  • Stateless services are essential in distributed systems
  • Application instances should not rely on local memory for shared data
  • Distributed caches help maintain consistency across multiple service instances
  • Sticky sessions can work but reduce scalability and elasticity
  • Redis is commonly used for temporary authentication data due to its speed and TTL support

Final Thoughts

Interview questions about OTP systems are rarely about OTPs themselves.

They are designed to test your understanding of distributed system design principles.

The real insight is recognizing that the challenge is not memory limitations or server crashes.

The true issue is state sharing across multiple service instances.

Once you understand this, the architectural choice becomes clear:

For scalable and reliable systems, state must be externalized.

That’s why distributed caches like Redis are widely used in production authentication systems.

Navya S

Java developer and blogger. Passionate about clean code, JVM internals, and sharing knowledge with the community.
