gRPC
In Kumo, gRPC serves primarily as the business-facing RPC layer. It is deliberately not the highest-throughput transport in the system; it is the most interoperable and controllable one.
This page explains where gRPC fits, what problems it solves, and why it is intentionally placed at the pipeline control layer.
1. What gRPC is really for
In real production systems, the entry RPC layer is not a hot data path. It is a control plane for request flows.
This layer typically handles:
- Authentication and authorization
- Request routing
- Quota and rate limiting
- A/B testing and gray release
- Fan-out to multiple backend services
- Aggregation of multiple responses
- Timeout and retry policies
- Metadata propagation (trace id, user id, locale, etc.)
- Observability (metrics, tracing, logging)
None of these are about raw throughput. They are about controlling and shaping traffic.
gRPC is designed exactly for this layer.
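Quota and rate limiting, for example, usually comes down to a small piece of state per caller. A minimal token-bucket sketch in plain C++ (class name and parameters are illustrative, not Kumo APIs):

```cpp
#include <algorithm>
#include <cassert>

// Illustrative token bucket: `rate` tokens refill per second, up to `burst`.
// A request is admitted if a whole token is available, otherwise rejected.
class TokenBucket {
 public:
  TokenBucket(double rate, double burst)
      : rate_(rate), burst_(burst), tokens_(burst) {}

  // `now_sec` is injected (seconds since some epoch) so the logic
  // stays deterministic and easy to test.
  bool Allow(double now_sec) {
    tokens_ = std::min(burst_, tokens_ + (now_sec - last_sec_) * rate_);
    last_sec_ = now_sec;
    if (tokens_ >= 1.0) {
      tokens_ -= 1.0;
      return true;
    }
    return false;
  }

 private:
  double rate_;
  double burst_;
  double tokens_;
  double last_sec_ = 0;
};
```

In a gRPC server, logic like this would typically live in an interceptor, keyed by a caller identity extracted from request metadata.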
2. Why gRPC is not about extreme QPS
It is common to see claims like "hundreds of thousands of QPS per node". For real business systems, such numbers are mostly unrealistic.
A typical service node has:
- Locks and mutexes
- Memory allocation
- Serialization (protobuf)
- TLS
- Tracing
- Logging
- Cache access
- Thread scheduling
- Business logic
Even very well-optimized production services usually run at:
- 50% to 70% CPU utilization
- 5k to 30k QPS per node
Above that, safety margins collapse. Any traffic spike, failover, or hot shard can bring the system down.
This is especially true for systems using:
- Raft
- Single-threaded state machines
- Ordered logs
- Transactional writes
In these systems, QPS is often fundamentally limited by single-threaded paths.
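To make that limit concrete: if every operation must pass through one single-threaded path, the per-operation service time alone caps the node's QPS, no matter how many cores are idle. A back-of-the-envelope sketch (the latency numbers are illustrative assumptions, not Kumo measurements):

```cpp
#include <cassert>

// If one thread spends `us_per_op` microseconds on each operation,
// that path cannot exceed 1,000,000 / us_per_op operations per second,
// regardless of how many other cores the node has.
constexpr long MaxQpsSingleThread(long us_per_op) {
  return 1'000'000 / us_per_op;
}
```

A 50 µs apply step, for instance, caps the path at 20k QPS, squarely inside the 5k-30k range above.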
gRPC is built with these real constraints in mind. It prioritizes correctness, observability, and control, not paper benchmarks.
3. Where gRPC fits in Kumo
In Kumo, gRPC is used at the business and integration boundary:
Clients
  |
  |  (gRPC)
  v
Business API Layer
  |
  |  (fan-out, routing, policies)
  v
High-QPS internal services (KV, search, storage, compute)
This layer:
- Accepts requests from many languages
- Talks to many downstream systems
- Applies routing and policy
- Controls traffic
It is not the place to squeeze the last bit of QPS. It is the place to keep the system stable, observable, and evolvable.
4. Why gRPC is preferred here
gRPC brings things that are extremely hard to re-implement correctly:
- Deadline propagation
- Request cancellation
- Structured metadata
- Standard error model
- Streaming
- Backpressure
- Load balancing hooks
- Code generation for many languages
- Deep ecosystem support
These are not optional in a real distributed system. They are required to build a controllable pipeline.
If you do not use gRPC, you will eventually rebuild most of them yourself.
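Deadline propagation is the canonical example: a request carries one absolute end-to-end deadline, and every hop derives its remaining budget from it instead of using fixed per-hop timeouts, so a slow upstream automatically shrinks what downstream calls may spend. A plain-C++ sketch of the idea (gRPC does this for you via ClientContext deadlines; the names here are illustrative):

```cpp
#include <cassert>
#include <chrono>

// One absolute deadline travels with the request; each hop asks how much
// budget is left rather than applying its own independent timeout.
struct Deadline {
  std::chrono::steady_clock::time_point at;

  std::chrono::milliseconds Remaining(
      std::chrono::steady_clock::time_point now) const {
    auto left = std::chrono::duration_cast<std::chrono::milliseconds>(at - now);
    return left.count() > 0 ? left : std::chrono::milliseconds(0);
  }

  bool Expired(std::chrono::steady_clock::time_point now) const {
    return Remaining(now).count() == 0;
  }
};
```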
5. Multi-language ecosystem
The business layer is usually where:
- Python
- Java
- Go
- Node.js
- C++
- Rust
all need to connect.
gRPC is one of the few RPC systems with first-class, production-grade support across all major languages.
This makes it ideal for:
- Public APIs
- Microservices
- Integration with external systems
- Tooling and automation
6. C++ integration model
In C++, gRPC pulls in several components:
- protobuf
- the gRPC core and C++ runtime
- transport dependencies (c-ares, a TLS stack, etc.)
This can be painful to integrate manually.
In Kumo, this is handled by kmpkg, which provides:
- Prebuilt gRPC
- Protobuf
- TLS stack
- Consistent versions
So the complexity is centralized and standardized.
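With the toolchain in place, consuming gRPC from CMake reduces to the standard config-mode packages. A sketch (target names follow upstream gRPC's CMake exports; the executable and source names are illustrative):

```cmake
find_package(protobuf CONFIG REQUIRED)
find_package(gRPC CONFIG REQUIRED)

add_executable(greeter_server server.cc greeter.pb.cc greeter.grpc.pb.cc)
target_link_libraries(greeter_server
    PRIVATE gRPC::grpc++ protobuf::libprotobuf)
```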
7. Minimal C++ example
Service definition:
syntax = "proto3";

service Greeter {
  rpc SayHello (HelloRequest) returns (HelloReply);
}

message HelloRequest {
  string name = 1;
}

message HelloReply {
  string message = 1;
}
Server:
// gRPC headers plus the code generated from the .proto above
// (generated header names depend on the proto file name).
#include <grpcpp/grpcpp.h>
#include "greeter.grpc.pb.h"

class GreeterServiceImpl final : public Greeter::Service {
  grpc::Status SayHello(grpc::ServerContext* context,
                        const HelloRequest* request,
                        HelloReply* reply) override {
    reply->set_message("Hello " + request->name());
    return grpc::Status::OK;
  }
};
Client:
auto channel = grpc::CreateChannel("localhost:50051",
                                   grpc::InsecureChannelCredentials());
auto stub = Greeter::NewStub(channel);

HelloRequest req;
req.set_name("kumo");

HelloReply resp;
grpc::ClientContext ctx;
grpc::Status status = stub->SayHello(&ctx, req, &resp);
if (!status.ok()) {
  // In real code, map status.error_code() to a retry / fallback policy.
}
This is enough to demonstrate the model. In real systems, most of the value comes from metadata, deadlines, and streaming.
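For instance, attaching a deadline and tracing metadata to the call above is a few lines on the ClientContext, set before the RPC is issued (the header key and value are illustrative):

```cpp
#include <chrono>
#include <grpcpp/grpcpp.h>

// gRPC propagates the deadline downstream automatically; metadata
// travels with the request as HTTP/2 headers.
void Prepare(grpc::ClientContext& ctx) {
  ctx.set_deadline(std::chrono::system_clock::now() +
                   std::chrono::milliseconds(200));
  ctx.AddMetadata("x-trace-id", "abc123");  // illustrative key/value
}
```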
8. Summary
gRPC in Kumo is not chosen because it is the fastest RPC.
It is chosen because:
- It provides the best ecosystem
- It provides the best control primitives
- It works across languages
- It integrates well with observability and policy layers
It is the right tool for the pipeline control layer.
High-QPS, data-path services should use more specialized transports and engines. Business-facing and integration layers should use gRPC.