Transmission over finite buffer channels and deletion channels
The main manifestation of errors in networks are due to packet losses at a finite-buffer queue. We studied the information-carrying capacity of finite queues, allowing for inter-packet coding, but without coding in packet timing. In the presence of side-information about which packets were lost (such as through a packet sequence number mechanism), this problem is reduced to an erasure channel with significant memory and the capacity in this case is directly related to the long-term loss probability of the queue. We showed that even when there is significant memory in the erasures, feedback does not increase the capacity.
Transport protocols such as TCP use sequence numbers in the packet (strictly speaking, in the TCP header) to detect lost packets. Packets sent by the source are numbered sequentially. The destination can infer the loss of one or several packets from the sequence number of the packet following the loss. The sequence number uses up a certain number of bits (32 bits in the case of TCP/IP) to detect lost packets and to request retransmission of those packets. We can view it as a code that converts the deletion channel into an erasure channel.
A fundamental question therefore arises: if we do not assume a-priori the existence of sequence numbers, what is the capacity of the resulting channel? This question naturally leads to the deletion channel, which essentially differs from the erasure channel in that the destination receives no explicit symbol indicating loss of a packet. Instead, the received sequence of symbols is shorter than the original sequence, with deleted symbols simply removed, i.e. only a subsequence of the transmitted symbols is received. The characterization of capacity of the deletion channel is a long-standing open question since the 1960s. We provided the tightest lower bound to this problem since that time. Our work in revived interest in this problem, as evidenced by several follow-up pieces of work and recently Drinea and Mitzenmacher have improved our lower bounds. One of the surprising results we found was that the achievable rate in deletion channels differs from that of erasure channels by a small amount (at most 1 bit). This suggests that the use of sequence numbers to detect deletions is quite inefficient, though it makes the job of coding for such errors much simpler. In doing this analysis we studied questions on common subsequences between Markov processes, a topic perhaps of independent interest. More recently we have developed the tightest known upper bounds for the binary deletion channel.