Sometimes we need to build "event-based" applications, such as a chat, an app showing the current price of a cryptocurrency, or an event-driven microservices architecture. In these cases, we have to decide which communication model fits best in terms of performance and scalability for a specific use case. Basically, we have three models: push, polling and long polling. Let's see the conceptual differences between them.
In the push model, the client opens a connection to the server and waits for messages. The connection is kept open, and the server pushes each message to the client as soon as it receives it.
This model is the most "real-time" of the three: messages go to the client as soon as the server receives them. Precisely because of that, it is not the most scalable one. The server does not know whether the client is ready to consume new messages, so a slow or busy client can be overwhelmed, and the resulting backlog slows down the server.
The push model is appropriate when the number of clients/consumers and messages is limited. One example is a chat application like WhatsApp, where there are few clients per group or "channel" (in fact, WhatsApp uses this model). RabbitMQ and Redis Pub/Sub are some technologies that use the push model.
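To make the push model more concrete, here is a minimal in-process sketch. It is not how RabbitMQ or Redis Pub/Sub work internally; `PushBroker`, `subscribe` and `publish` are hypothetical names, and a plain callback stands in for the open connection.

```python
from typing import Callable, List


class PushBroker:
    """Hypothetical in-process broker that pushes messages to subscribers."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[str], None]] = []

    def subscribe(self, on_message: Callable[[str], None]) -> None:
        # Stands in for the open connection: the server keeps a handle to the client.
        self._subscribers.append(on_message)

    def publish(self, message: str) -> None:
        # The server delivers right away, without asking whether the client is ready.
        for on_message in self._subscribers:
            on_message(message)


received: List[str] = []
broker = PushBroker()
broker.subscribe(received.append)  # the client only registers once, then waits
broker.publish("hello")
broker.publish("world")
print(received)
```

Note that `publish` calls every subscriber directly, so a slow callback would hold up delivery to everyone else, which is the scalability issue described above.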
Unlike push, polling is a request/response model. The client makes a request to the server and immediately gets a response, with or without a new message. This model wastes bandwidth when there are no new messages and responses contain no data. Also, opening and closing connections is expensive. Although this model does not scale well, it is the easiest one to implement and can be useful in some scenarios where scalability and performance are not a concern at all.
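The cost of empty responses is easy to see in a small sketch. Again, this is an in-process illustration under simplifying assumptions: `PollingServer` and `poll_repeatedly` are hypothetical names, and a method call stands in for each request/response round trip.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple


class PollingServer:
    """Hypothetical server that answers every poll immediately."""

    def __init__(self) -> None:
        self._queue: Deque[str] = deque()

    def publish(self, message: str) -> None:
        self._queue.append(message)

    def poll(self) -> Optional[str]:
        # Respond right away: the next message, or None when there is nothing new.
        return self._queue.popleft() if self._queue else None


def poll_repeatedly(server: PollingServer, attempts: int) -> Tuple[List[str], int]:
    """Poll a fixed number of times; return the messages and the number of empty responses."""
    messages: List[str] = []
    empty_responses = 0
    for _ in range(attempts):
        response = server.poll()
        if response is None:
            empty_responses += 1  # a wasted round trip
        else:
            messages.append(response)
    return messages, empty_responses


server = PollingServer()
server.publish("price: 100")
server.publish("price: 101")
messages, empty_responses = poll_repeatedly(server, attempts=5)
print(messages, empty_responses)
```

With two messages and five polls, three of the five responses carry no data. A real client would also sleep between polls, trading latency for fewer wasted requests.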
Long polling is a combination of the two previous models: the client makes a request to the server, but the request is kept open until a "useful" response is returned. What do I mean by a "useful" response? Well, the problem with polling is empty responses, which waste bandwidth and server resources. A useful response is one that actually carries new messages.
With long polling, the client tells the server: "Hey, reply to me when there is a new message; I will wait for it." This way, the server knows that this specific client is ready to receive messages and can send them to it.
Two things are interesting here. First, once the response is returned, the client/server connection is closed, and the client is responsible for making another request to the server when it is ready to handle new messages. Second, the server can be configured to achieve better performance: for instance, it could be configured to send data to clients only when there are three new messages or when the pending messages add up to 25 KB or more. This way, message handling is optimised and client/server communication is more efficient. This model can be seen as a server-side, local polling model.
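The two points above can be sketched with a condition variable. This is a minimal in-process illustration, not Kafka's implementation: `LongPollingServer`, `min_messages` and `timeout` are hypothetical names, the threshold counts messages rather than bytes, and a blocking method call stands in for the held-open request.

```python
import threading
from typing import List


class LongPollingServer:
    """Hypothetical server that holds a poll open until enough messages are pending."""

    def __init__(self, min_messages: int = 3) -> None:
        self._min_messages = min_messages
        self._pending: List[str] = []
        self._cond = threading.Condition()

    def publish(self, message: str) -> None:
        with self._cond:
            self._pending.append(message)
            self._cond.notify_all()  # wake up any waiting poll so it can re-check

    def poll(self, timeout: float) -> List[str]:
        with self._cond:
            # Hold the request until the threshold is reached or the timeout expires.
            self._cond.wait_for(
                lambda: len(self._pending) >= self._min_messages, timeout=timeout
            )
            # Return whatever is pending (possibly nothing, if the wait timed out).
            batch, self._pending = self._pending, []
            return batch


server = LongPollingServer(min_messages=3)


def producer() -> None:
    for i in range(3):
        server.publish(f"msg-{i}")


thread = threading.Thread(target=producer)
thread.start()
batch = server.poll(timeout=5.0)   # returns once three messages are pending
thread.join()
empty = server.poll(timeout=0.05)  # nothing new: returns an empty batch after the timeout
print(batch, empty)
```

After each `poll` returns, the client decides when to call it again, so the server never sends data to a client that has not asked for it. The timeout keeps a request from hanging forever when traffic is low.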
Kafka is an example of a technology that implements the long polling model, and that is one of the reasons why it scales better than RabbitMQ when there is a massive number of messages and clients. It is also worth mentioning Server-Sent Events (SSE), a protocol that starts like long polling, with a client request that the server holds open. The difference is that the connection is not closed after a message is sent to the client but is kept open, which brings it closer to the push model.
As you saw, I described these three patterns without tying them to any protocol or specific technology because they are, in fact, protocol-agnostic; the technologies mentioned are only examples.