RTP Packets
As a real-time transport protocol, RTP provides end-to-end (E2E)
transmission services and is widely used in streaming-related
communication and entertainment services, such as live online
broadcasting, web TV, and video conferencing. An RTP packet consists of
a packet header and a payload. Figure 1 shows the format of an RTP
packet header.
Figure 1 RTP packet header
The fields are defined as follows:
- V: Indicates the RTP version, occupying two bits. The current RTP version is 2.
- P: Indicates whether the packet is padded, occupying one bit. If the value is 1, one or more padding octets that are not part of the payload are appended to the end of the packet.
- X: Indicates the extension flag, occupying one bit. If the value is 1, exactly one extension header follows the fixed RTP packet header.
- CC: Indicates the contributing source (CSRC) count, occupying four bits. It specifies the number of CSRC identifiers that follow the fixed header.
- M: Indicates the marker flag, occupying one bit. Its meaning depends on the payload type. For video, this field marks the end of a frame. For audio, this field marks the start of a talkspurt.
- PT: Indicates the payload type, occupying seven bits. It identifies
the format of the payload, for example, GSM audio (PT 3) or JPEG video (PT 26).
- Sequence number: Indicates the sequence number of an RTP packet,
occupying 16 bits. The sequence number increases by 1 for each
packet sent. The receiver uses the sequence number to calculate the
packet loss rate and out-of-order rate, reorder the packets,
and restore data.
- Timestamp: Indicates the sampling instant of the first octet of the
RTP packet payload, occupying 32 bits. The receiver uses the timestamp
to calculate the jitter and implement synchronization control.
- Synchronization source (SSRC) identifier: Identifies the synchronization source of the packet stream, occupying 32 bits.
- Contributing source (CSRC) identifier: Identifies a source that contributes to the payload of the packet, occupying 32 bits. The header carries as many CSRC identifiers as the CC field specifies.
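The field list above can be turned into a small header parser. The following Python sketch is illustrative only and assumes the standard RFC 3550 layout: a 12-byte fixed header in network byte order, with V, P, X, and CC packed into the first byte, M and PT packed into the second byte, and the CSRC identifiers (the last field in the list) following the SSRC. The names RtpHeader and parse_rtp_header and the sample packet are hypothetical and are not taken from any product or library.

```python
import struct
from dataclasses import dataclass
from typing import List


@dataclass
class RtpHeader:
    version: int          # V, 2 bits
    padding: bool         # P, 1 bit
    extension: bool       # X, 1 bit
    csrc_count: int       # CC, 4 bits
    marker: bool          # M, 1 bit
    payload_type: int     # PT, 7 bits
    sequence_number: int  # 16 bits
    timestamp: int        # 32 bits
    ssrc: int             # 32 bits
    csrcs: List[int]      # CC entries, 32 bits each


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Parse the fixed RTP header and the CSRC list (hypothetical helper)."""
    if len(packet) < 12:
        raise ValueError("RTP fixed header needs at least 12 bytes")
    # "!" selects network byte order: two bytes, one 16-bit, two 32-bit fields.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    cc = b0 & 0x0F
    end = 12 + 4 * cc
    if len(packet) < end:
        raise ValueError("packet too short for the CSRC list")
    csrcs = list(struct.unpack(f"!{cc}I", packet[12:end]))
    return RtpHeader(
        version=b0 >> 6,
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        csrc_count=cc,
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
        csrcs=csrcs,
    )


# Made-up sample: V=2, P=0, X=0, CC=0, M=1, PT=26, sequence number 1,
# timestamp 160, SSRC 0x12345678.
sample = bytes([0x80, 0x9A, 0x00, 0x01]) + struct.pack("!II", 160, 0x12345678)
header = parse_rtp_header(sample)
```

The sketch stops at the CSRC list. It does not interpret the extension header announced by X or strip the padding announced by P, so a caller would still need to handle both before reading the payload.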