software solutions
Computer Science » Digital Communications
Link characteristics bandwidth delay (light travels at 0.7c in fibre) attenuation Telephone network basic service two way voice low end-end delay guarantee of (bandwidth and buffer) resources to an accepted call connected by a virtual circuit overall design fully connected core simple routing hierarchically allocated telephone number space end system must be powered from the exchange generally analogue instead of two wires for transmit and receive circuits, use the same circuit has sidetone (local) we want this to stop people shouting and echo (far end) some received signal is transmitted back need expensive echo cancellers for long distance calls dialling pulse dialling: send pulse per unary coded digit tone dialling: dual tone multifrequency Pressing a single key (such as '1' ) will send a sinusoidal tone of the two frequencies (697 and 1209 hertz (Hz)) transmission multiplexing analogue band limit call to 3.4kHz and frequency shift was used to run one cable to two homes, was problem when wanting to upgrade to broadband digital convert voice to samples (8 bits/sample, 8000 samples/sec, call 64 kb/s) interleave samples from different calls use 256 logarithmically spaced quantization levels (as dB is logarithmic) TDM trunk works faster rate than inputs interleave samples (serve all inputs in time it takes one sample to arrive) multiplexed trunks can be multiplexed on faster trunls syandards Digital Signalling (in US) SONET defined to standardise but was implemented a little differently in US Link (physical media of transmission) cost is in installation, not the link builders install cat 5, fibre and coax to every room over-provision long distance links, unless undersea Fibre: < 10 Pbps capacity little attenuation generally only thermal noise types: multimode cheap (uses LEDs) suffers inter symbol interference short distance (a couple km) single mode (only one solution to wave equation) expensive (uses lasers) long distance (hundreds km) satellites geosynchronous 36000km away propagation delay of 250ms low earth orbit appear to move in the sky need more handover is tricky Switching two parts to the system switch (data plane) switch controller (control plane) Signalling controller does not touch voice samples manages network (routing, billing etc.) switch controllers are separate from the switching hardware have to keep track of the state at each endpoint (each phone, whether idle, dial tone, talking, etc.) Cellular communication Establishing cellular communication: there aren't enough frequencies to give each mobile a permanent frequency temporal reuse: if mobile is off, don't assign frequency to it spatial reuse: mobiles in non-adjacent cells can use the same frequency how complete a call to a mobile? need to know where it is how deal with moving mobile phone? need to send packet sooner if cell is further from base station hand off between base stations? Issues with telephone system critical systems (power grid control) built on assumptions what about changing to allow multimedia stuff? Internet administered by a collection of entities key technical innovations packets self descriptive data (data + metadata) better than samples we have to know where it is going can't store it store and forward metadata allows us to forward packets when we want (e.g. bundle of letters to one PO) allows efficient use of critical resources problems hard to control delay switches need buffer memory convergence of flow leads to congestion since have IP layer, can have different underlying layers and link lots of different networks addressing IP address made up of network number and host number network number early system: Class A, B, C networks class A: 8 bit network number class B: 16 bit network number class C: 24 bit network number CIDR use a netmask all routers in the internet know the mask for a subnet mask only know what it means inside the LAN routing need to know next hop for each network on the internet normally just keep local ones, otherwise send to default router reduces routing table at expense of possible non-optimal routing endpoint control use a dumb network do as much as possible at the edge higher layer (TCP) deals with network defects run over any lower layers, but no QoS need to to end to end checks challenges IP address space shortage decentralisation makes reliability next to impossible cannot guarantee that route even exists anyone can join no uniform solution for accounting/billing non-optimal routing multimedia needs QoS Internet has no way to identify streams routers not required to cooperate ATM networks design goals end to end QoS high bandwidth scalable manageable cost effective concepts virtual circuits actual circuits (POTS) have drawbacks idle users consume bandwidth quantization of link capacity carry only identifier in the header (saves space) connection needs to be pre-established also we need to translate IDs at each switch point all packets must follow same route easier to look up in hardware use signalling to separate control from data fixed sized packets pros simpler buffering hardware simpler scheduling easier to build parallel switches cons overhead for sending small amounts of data segmentation and reassembly cost last cell normally wastes bandwidth small packet size balance packetization delay (time to fill a packet) vs. overhead of packet data/header set at 48 bit for data, 5 bit for header each connection gives information about transmission (average rate, max peak rate, max peak length) can trade off worst case delay against speed of output trunk need more buffering than originally thought integrated service voice, video and data on the same network hardware support for switching different data types do by: signalling admission control resource reservation lots of bandwidth challenges QoS (defined but not used) scaling competition from other LAN techs (esp. Ethernet as it is cheap because of volume of people using it) interoperation with IP (vast, fast growing non-ATM structure) connection vs connectionless, etc. protocol is a set of rules and formats that govern communication between communicating peers (Customer A and Customer B) set of valid messages meaning of each message tells us syntax of a message (fields/format etc.) semantics of a message (e.g. a nack means received a corrupted file) actions to take on receipt of a message (e.g. on nack retransmit) can view protocol as providing a service some protocols rely on other ones (layered underneath it) e.g. reliable file transfer relies on packet transfer terminology Service Access Point (SAP): interface between upper and lower layer Service Data Unit (SDU): packets hander to a lower layer by an upper Protocol Data Unit (PDU): packets exchanged between peer entities PDU = SDU + optional header/trailer protocol stack allows implementation/information hiding allows upper layers to share lower layer functionality actually there is a tension between abstraction and good performance e.g. with TCP assuming packet loss always implies congestion have to leak some information to get good run time ISO Open System Interconnect protocol defined by: reference model: formally defines a layer/service etc. service architecture: describes services provided by the layer and the service access point protocol architecture: set of protocols which implement the service architecture Physical layer moving bits on physically connected end systems prescribes coding scheme to represent a bit shapes and sizes of connectors bit-level synchronisation Datalink layer have a frame (bits that belong together) beginning and end markers have idle markers for when a frame is not being transmitted done in software but on the custom hardware (network card) e.g. Ethernet since this is broadcast, need addresses and contention resolution done by MAC Network layer creates abstraction of one end to end link from several Datalink layer links allows any two endpoints to have an end to end link in datagram networks provides routing and data forwarding in connection oriented networks have a data and control plane for each of the layers (i.e. separate stack for data and control layers) e.g. IP Transport layer creates abstraction of error control, flow control and multiplexing multiplexing applications onto same end to end link (using port numbers) e.g. TCP or UDP Session layer creates abstraction of full duplex, expedited data delivery and session synchronisation expedited data delivery: put a later packet in front of outgoing data used in phone network, if a call is started and the person hangs up, to cancel database lookups etc. synchronisation allow users to place markers in the data stream and roll back to a pre-specified mark e.g. chief clerk of shipping, can do expedited delivery by courier, provide duplex etc. Presentation layer abstracts data representation differences, e.g. endian-ness deals with application data can do encryption too Application layer set of application useful protocols e.g. FTP, SMTP, HTTP System design every design decision is a trade-off goal maximise a set of performance metrics given a set of resource restrictions constrained vs. unconstrained resources e.g. high end PC with 28.8 modem constrained resource is link bandwidth CPU and memory are unconstrained common resources time (response/throughput etc.) space (memory/bandwidth etc.) computation money (cost of components/limits customers willing to pay, etc.) labor social constrains (standards/backwards compatibility etc.) balanced system one where all resources are simultaneously bottlenecked normally you have one bottleneck, remove it and the bottleneck is then something else... standard trade off techniques multiplexing: time and space for money pipelining batching: reduced overhead for longer worst case response time and increased throughput optimising the common case: 80% of time spent in 20% of code spend labour optimising that bit of code hierarchy: scalability for communication expense abstraction separating data and control: overhead of per packet metadata for per-packet QoS Naming and addressing both need to be globally unique names long, variable length human readable difficult to parse addresses shorter, fixed length easy for computer to process/use efficiently in routing name resolution done by servers (single PoF) naming we want Lookup, Create Rename Update and Delete of names to be scalable and efficient hierarchical naïve version: ask other authorities if name is unique recursive decomposition of the namespace decompose into mutually exclusive partitions scales arbitrarily Domain Name Service (DNS) name server which is distributed a name server is responsible for (an authoritative server for) a set of domains can delegate some responsibility to a child root servers are replicated if a local server cannot answer a query it asks its parent replies are cached and timed out addressing use hierarchical schemes globally unique names aggregation to reduce size of routing tables at expense of non-optimum routing in the telephone network variable length addresses assumes local scope unless use pointer to other scope (e.g. 01223) in the internet IPv4 network number and host number, defined by mask in different ways (oldest-youngest) A, B and C class networks classless routing by clustering addresses within a network contiguous C blocks referred to as a larger address block due using a CIDR mask must be known by all internet routers is similar to subnet mask DHCP computer broadcasts discover to subnet DHCP servers offer IP addresses host picks one and sends request to a particular server selected server sends ack when the host is finished with the address it sends a release the IP address has a lease which limits the time it is valid NAT multiplex network layer onto transport layer (i.e. use TCP and UDP ports) generally use non-routable addresses, e.g. 192.168.0.0/16 IPv6 extends address from 32 bit to 128 bit multiple levels of aggregation several types of multicast anycast (there is a set of endpoints to which a packet can go, but unlike multicast it only goes to one of them) ATM Network Service Access Point address variable length (7-20 bytes) Datalink layer addresses MAC addresses hierarchically allocated (when piece of hardware is made, indicates manufacturer etc.) not hierarchical deployed ARP to resolve in point to point LAN need ARP server susceptible to spoofing by posing as the network gateway Routing process of finding path from source to every destination in the network a routing protocol sets up a routing table in routers and switch controllers node makes local choice on global topology routing protocol needs to intelligently summarize relevant information about (complex) global state requirements minimise the routing table space minimise number and frequency of control messages want to avoid black holes: traffic vanishes loops: traffic stuck within a network oscillations: paths cycle between alternatives use optimal path choices centralised vs distributed routing source based vs. hop by hop stochastic vs. deterministic stochastic spreads load, avoiding oscillations, but misorders single vs. multiple path state dependent vs. state independent telephone network three level hierarchy, with fully connected core Central office Local Exchange Wide area core only major routing decision is at toll switch if have to use two hop path then which? features stable load can predict pairwise delay throughout the day hence can choose optimal routes in advance extremely reliable switches can assume a chosen route is available (can't do on Internet) single organisation controls the core highly connected connections require resources (but all need the same) costs requires reliability in every component and a fully connected core Dynamic Non-Hierarchical Routing (DNHR) divide the day into around ten periods in each period, each toll switch is assigned a primary one-hop path and a list of alternative two-hop paths drop only if the primary and all the alternatives are unavailable bursts of high activity can lead to metastability overflow into another node's direct path (can rule out by disallowing) measures traffic once a week Trunk status map routing (TSMR) similar to DNHR but updates measurements once an hour or so, if they change “significantly” Real Time Network Routing (RTNR) decentralised each toll switch maintains a list of lightly loaded links one switch asks another for its list intersection of source and destination lists gives set of lightly loaded paths links(A) = C, D, E links(B) = D, F, G hence ADB lightly loaded and good alternative path very effective in practice (used in the core) Distance vector routing node sends Distance Vector to all its neighbours that says what the node's best idea of distance to every node in the network, from it advantages distributed adapts to traffic changes and link failures suitable for networks with multiple administrative entities problem: count to infinity, can deal with using DV has path vector triggered updates (faster count to infinity) Link state routing router knows entire network topology, computes shortest path itself link state packet (LSP) cost of link between itself and its neighbour, e.g. (from A) {A; B; 4} control flood the network with LSPs store LSPs if one comes in that is new send it out on all routes besides incoming one network with E edges will copy at most 2E times use sequence numbers to tell if a LSP is new sequence numbers run out/don't know which to use on boot up solutions: ageing put a timeout value in the header and control flood notification of the event so that routers that don't know the time can do it too on booting a router has to wait until all the old LSPs are purged and refreshed need a unique start sequence number a is older than b if if a router gets an older LSP tell sender about newer LSP recovering from a partition the two nodes on either side of the re-established link co-ordinate to update databases Detecting router failure send periodic very small “hello” packets on timeout flood information about the dead router Links costs static metrics set all link costs to 1(min hop routing) give weights inversely proportional to capacity low weight value links will be preferred dynamic metrics NB if the dynamic cost depends on load, can have oscillations ARPAnet original (first cut) cost proportional to length of router queue problems + resolutions queue length averaged over small time (spikes cause re-routing) average queue length over longer time wide dynamic range of path costs (high cost paths ignored) restrict dynamic range queue length assumed to predict future loads (remember earlier NB) depend also on intrinsic link capacity no restriction on successively reported costs make restrictions all tables computer simultaneously (low cost link flooded) stagger table computations Hierarchical routing concept divide network into set of domains gateways connect domains computers within domain unaware of outside computers gateways only know about other gateways three level hierarchy in address network number subnet number host number core advertises routes to networks only gateways talk to backbone to find best next-hop to every other network in the Internet if a domain has multiple gateways external records tell hosts in a domain which to pick to reach a host in another domain contains distance from gateway to external node summary records tell backbone which gateway to use to reach an internal node contains distance from gateway to internal node Internet has three levels of routing backbone, connecting autonomous systems (AS): exterior gateway protocol within AS: interior gateway protocol within LAN exterior gateway protocols between mutually untrusting routers must tell trusted border gateway what paths are allowed different ASs will use different cost metrics: convert or just ignore Exterior Gateway Protocol nominally DV, costs are 128 reachable 255 unreachable allow admins to pick neighbours to peer with Border Gateway Protocol path vector DV annotated with entire path guaranteed to be loop free can use non-tree backbone topologies interior gateway protocols Routing Information Protocol distance vector cost metric is hop count infinity = 16 exchange DVs every 30s Open Shortest Path First link state uses areas to route packets hierarchically within ASs uses designated routers to reduce the number of endpoints Exterior Gateway Protocol (EGP) ATM routing Private Network-Network Interface link state many levels of hierarchy switch controllers at each level form a peer group group has a group leader leaders are members of the next higher level group leaders summarise information about group to tell higher level peers all records received by leader are flooded to lower level LSPs can be annotated with pre-link QoS metrics switch controller uses this to compute source routes for call-setup packets routing in a broadcast LAN: how decide if a packet is meant for inside or outside the LAN Internet solution: all hosts on LAN have same subnet address routers periodically send out advertisements (including TTL, etc.) redirection if a router's next hop is another router on the same LAN, host gets a redirect message multicast routing packet sent by any member of a group is received by all uses: resource location internet TV associates a group of senders with a group of receivers don't need to know each others' identities can create when sender starts sending or receiver starts receiving even if no-one else there multicast groups have their own class D addresses problems: which groups are currently active how to express interest in joining a group discovering the set of receivers in a group delivering data to members of the group expanding ring search used to find the closest multicast server which is not heavily loaded send out multicast packets increasing the TTL each time multicast packet will only be forwarded if it reaches server within the TTL, and that server is not heavily loaded hence discovering local resources first flavours point to multipoint multipoint to multipoint A wants to talk to B, G, H and I B wants to talk to A, G, H and I could approximate point to multipoint with set of point to points but AC, BC would carry packet three times, for G, H, I could do with two point to multipoint multicast groups but overhead of establishing two groups do with multipoint to multipoint send at most one multicast packet per link optimal tree provides shortest path from sender to every receiver difficulties sources dynamically join and leave need to be able to do so without explicitly notifying receiver leaves of tree are often members of broadcast LAN: want to exploit broadcast capability Class D to MAC address translation insert the bottom 23 bits of the multicast IP address into the 48 bit MAC address the top 25 MAC bits are a fixed pattern which indicate that this is a multicast packet network cards know to listen for layer 2 packets with these 25 top bits network card filter out those packets from the multicast groups of which the computer is a member Internet Group Management Protocol if a LAN has no members can not send to it (prune it) router periodically broadcasts query message hosts broadcast reply with list of groups they are interested in reply after random timeout if someone else has expressed interest you don't need to reply wide area multicast assume each endpoint is a router which can use IGMP to determine if it wants subscription goal: deliver packets to all routers on the path to a subscribing receiver solution 1 controlled flooding need state to see if a packet is new solution 2: reverse path forwarding (RPF) flood at node A forward packet labelled from S iff that packet arrived on the link which is the next hop on the shortest path to S solution 3 don't send packet downstream if you are not on the shortest path from the downstream router to the source potential confusion of downstream router has choice of shortest paths to source in RPF, B and C still get packets even though they don't need them router tells parent in tree to stop forwarding rejoining router C sends join(group, A) to B or periodically broadcast a packet and C refrains from pruning but some routers do not support multicast virtual links (IP in IP) between multicast enabled routers need shortest paths (for RPF) only using these virtual links Distance Vector Multicast Routing Protocol (DVMRP) similar to Routing Information Protocol flood and prune explicit join messages Multicast extension to OSPF (MOSPF) routers flood group membership information with LSPs each router computer shortest path tree that only includes multicast-capable routers complex have a statically configured core tree co-ordinate multicast with core router host sends join request to core router routers along path mark incoming interface for forwarding eliminates the pre-source and per-group prune records at each router (how?) Protocol Independent Multicast (PIM) dense mode PIM == DVMRP sparse mode PIM similar to CBT can switch from CBT to shortest path tree core is actually a “rendez-vous” point, as more than one leaf router switches to different “rendez-vous” on timeout routing vs. policy routing routing: forward packet on 'best' path policy routing: routes chosen on policy directives regarding things like source and destination address QoS time of day charging and accounting multiple metrics advertise multiple costs per link (e.g. delay and monetary cost) routers construct multiple shortest path trees problems all routers must use the same rules remote routers may misinterpret policy provider selection assume single service provider provides almost all the path choose policy by choosing provider crankback router returns packet if no next hop with sufficient QoS can be found in ATM networks do just for the call setup packet in internet have to do for every packet routing vs. forwarding routing: knowing how to get packets from source to destination have to maintain routes forwarding moving a packet from an input port to an output port make a forwarding decision based on route information getting the packet to the output port fast in IP packet arrives check destination address look up in routing table queue packet to output port mobile routing location: where is the host? routing: how get packets to it? telephone network remote MTSO tells home MTSO to phone: call forwarded to remote MTSO to closest base from phone: call forwarded to home MTSO from closest base new MTSOs can be added as load increases in the internet but outgoing traffic does not go through home and need to use tunnels to forward data Use registration packets instead of slotted ALOHA passed on to home address agent Old care-of-agent forwards packets to new care-of-agent until home address agent learns of change problems security mobile and home address agent share a common secret checked before forwarding packets to COA Error control CRCs detect all single bit errors almost all two bit errors any odd number of errors all bursts up to M where generator length is M longer bursts with probability 2-m implementation hardware: with shift register software: one's complement pre-compute remainders for 16 bit words add remainders to running sum want to touch every byte only once packet errors loss uncorrectable bit errors/overflow in buffer duplication insertion solutions per connection incarnation number assign port numbers serially assign sequence numbers serially wait one Mean Packet Loss after boot up (30s-2min) flushes old packets from the network 3 way handshake C -> S : SYN(c) S -> C : SYN-ACK ACK(c+1) SYN(s) C -> S :ACK ACK(s+1) reordering sequence numbers incremented for non-retransmitted packets ACKs carry cumulative sequence number size of sequence number > max time can have an ACK * bit rate > number of packets that can be reordered > number of packets that can be lost maximum packet lifetime time to live (TTL) NACKs aren't very useful take up bandwidth, have to timeout anyway timeout if no ACK then resend SYN-attack forge unused IP address send a long sequence of SYN packets to server up to four minutes' resource allocation per SYN while server retires its SYN-ACKs static scheme know RTT before hand dynamic scheme measure RTT timeout is a function of measured RTTs old TCP scheme a = 1 -> a = 0 -> no history new TCP scheme error control windows retransmission go back N selective wastes header space (bitmap of received packet sequence numbers) fast retransmit assume cumulative ACKs if see 1, 2, 3, 3, 3, retransmit (n+1) = 4 SMART cumulative sequence number and for packet sequence number Flow control considerations send at rate which receiver and network can process simplicity overhead scaling fairness stability usually transport layer classification open loop call setup source describes desired flow rate network admits call data transmission user sends within parameter range network polices users closed loop source monitors available service rate sends at that rate due to speed of light delay, some errors occur first generation ignore network state, only match receiver second generation responsive to different types of state state measurement explicit network tells its current rate more overhead implicit endpoint tries to figure out by looking at network control: two methods transmission window used for error control and flow control rate: directly control the rate no coupling of flow control and error control point of control: two methods hop by hop first generation flow control at each link end to end sender matches all the servers on its path hybrid source asks for minimum rate can send more if bandwidth available traffic descriptors traffic envelope use for traffic contract input to regulator input to policer requirements representativity adequately describes flow so that network does not reserve too little or too much resource verifiability verify that descriptor holds preservability doesn't change inside the network usability easy to describe and use for admission control peak rate fixed size packet network: minimum inter packet spacing variable size packet network: highest rate over all intervals of a particular duration sensitive to extremes average rate jumping window over consecutive intervals of length t, only a bits sent regulator reinitialises every interval moving window over all intervals of length t only a bits sent regulator forgets packets sent more than t seconds ago linear bounded arrival process number of bits transmitted < rt + s r = long term rate leaky bucket regulator for LBAP three way trade off between burst limit s loss rate long term rate r large bursts mess it up on-off send on, off signals stop and wait send and wait for ACK before sending next static window can send multiple packets per RTT (whilst waiting for ACK) bandwidth delay product/optimal window size w > b*R w is packet size of window b is bottleneck bit rate R is RTT b changes, static doesn't work DECbit flow control every packet has a specific bit in the header this bit is set by any of the intermediate servers if a queue has built up sink copies the bit to ACK if bit set then source reduces window size oscillates around optimal state additive increase, multiplicative decrease TCP flow control implicit dynamic window end to end similar to DECbit without the router support window starts at 1 exponential increase, “slow start” doubles every RTT each ACK results in window increase by one linear increase, “congestion avoidance” increase by one every RTT window increases by one when number of ACKS = window size loss detected with timeout or fast retransmit Tahoe: both cases, drop window to one Reno: timeout: drop window to one fast retransmit: drop window to half previous size weaknesses loss doesn't necessarily imply congestion congestion doesn't necessarily imply self blame NETwork Block Transfer (NETBLT) rate based flow control scheme separates error control from flow control, losses and retransmissions do not affect the flow rate application data sent as a series of buffers if received rate < sending rate multiplicatively decrease packet pair improves on NETBLT assume all bottlenecks serve packets in the same order ATM forum End to End Rate baled flow Control (EERC) similar to DECbit, but send whole cell instead of one bit Resource Management cell source sends periodically (once every 32 cells) has rate request each server fills in RM cell with current share, if less comparison to DECbit sources know exact rate non-zero initial cell rate RM cells sitting in the data path is a mess general points flow control is easier with RR (round robin?) scheduling explicit schemes are more robust hop by hop schemes are more responsive but more complex separation of flow control and error control is good rate based schemes are inherently unstable Common protocols plesiochronous hierarchy “nearly synchronous” tight control of deviation from synchrony output runs a bit faster always overhead identifies bits from a particular stream if a stream runs faster use overhead to identify it incompatible hierarchies around the world cannot switch bundles of calls synchronous digital hierarchy requires atomic clocks SDH 9 rows, 90 columns each payload container serviced in 125 microseconds one byte = one call (works out, given 1000/125 payloads/second at rate 64kbps) all overhead is in the headers IP properties unreliable best effort end to end fragmentation if a packet reaches a link with a smaller Maximum Transmission Unit than the size of the packet can break the packet down into fragments and transmit those reassembly happens at the receiving host can set a bits to indicate this packet is a fragment do not fragment this packet error multiplication: if error in any fragment have to retransmit whole packet traceroute router sends ICMP error packet if decrement TTL to 0 source sends packets with increasing TTL ICMP used to send error messages destination unreachable redirect TTL exceeded fragmentation needed, but “don't fragment” bit set TCP multiplexed duplex connection-oriented reliable flow-controlled byte-stream HTTP request/response (stateless connection) protocol is simple browser is complex (and usually buggy) cookies can violate statelessness e.g. one cookie for a website and two windows with two shopping trolleys contending ATM (datalink layer) connection oriented in-sequence unreliable QoS guaranteed routing information made up of virtual path identifier (VPI) virtual circuit identifier (VCI) virtual paths identified by the VPI all VCIs for the same VP share path and resource reservation set of VPs define a virtual private network (not the “tunnelled encrypted network over LAN”) ATM Adaptation Layer (AAL – network layer/rest of the stack) AAL 1 for synchronous apps timestamps, clocking, sequencing constant bit rate forward error correcting AAL 3/4 create a packet of whatever size, put on a frame header and trailer then break down and put in ATM packets with a per cell AAL header error detection, segmentation and reassembly AAL 5 violates layering to be efficient SSCOP similar to TCP no ACKs sender polls, receiver sends status, including ACK and window size poll interval is key variable IP over ATM treat ATM as a link-level technology problems ATM is connection oriented different addressing schemes ATM is point to point while IP assumes broadcast IP encapsulation in ATM put entire IP packet in AAL5 frame is multi-protocol encapsulation designate an ATM host as an ARP server need to use inverse ARP to get the layer 3 address of the client (which is needed to make the connection) creating an ATM based IP subnet if all hosts on an ATM are on the same IP subnet then broadcast reaches all (causes congestion) partition into logical IP subnets this can make longer paths between hosts which are actually attached at ATM level, as if it's outside your logical subnet you have to send your packets via the router next hop routing next hope server stores IP to ATM translations independent of logical subnet boundaries like DNS resolving multicast addresses maintain a set of endpoints that correspond to a particular class D address Multicast Address Resolution Server keeps this LAN emulation used to make IP think it's working over Ethernet or token ring translate from MAC address to ATM address need to emulate broadcast Cells in Frame (CIF) previous solutions required ATM host-adapter card can use Ethernet card: encapsulate AAL5 frame in Ethernet header on point to point Ethernet link CIF attachment device at the other end decapsulates and injects frame into ATM network holding time problem to send an IP packet have to open an ATM connection – when close it? locality principle, more packets are likely soon after some others have been sent optimal solution depends on pricing policy and packet arrival characteristics use measurement based heuristics close connection if expected cost exceeds expected benefit Protocol Implementation depends on structure environment structure partitioning of functionality between user and kernel trade-offs customisability security performance separation of layer processing (interface) strategies single context ... tasks upcalls environment data copy cost interrupt overhead context switch time memory access latency cache effects rules of thumb fine tune inner loops beware of data touching send largest packets possible use hardware Multiple access problem how should participants coordinate actions so that the number of messages exchanged per second is maximised the time spent waiting for a chance to speak is minimised contexts for the problem broadcast transmission medium colliding messages are garbled main contexts wired broadcast LAN wireless packet radio cellular telephony satellite communication solution choose base technology choose how to allocate limited transmission resource to contending users choices centralised vs. distributed design moderator? Centralised: one is master, other are slaves distributed: all are peers circuit mode vs. packet mode do stations send streams or bursts? circuit mode – allocate resources to stream packet mode – contend for resources with each packet constraints in real life propagation is not inverse square law spectrum scarcity radio link properties error proneness multipath interference hidden terminals parameter 'a' number of packets sent by a source before the farthest station receives the first bit performance metrics normalised throughput: fraction of link capacity used to carry non-retransmitted packets mean delay: time station has to wait before it successfully transmits a packet stability with a heavy load is all the time spent on resolving contentions? if infinite uncontrolled stations share a link then instability is guaranteed fairness (arbitrary definition) base technologies frequency division multiple access (FDMA) each station has its own frequency band separated by guard bands number of frequencies is limited reduce transmitter powers and reuse frequencies in non-adjacent cells transmission division multiple access (TDMA) send on same frequency at different times need time synchronisation pros can give different amounts of bandwidth to different users can switch off power when it's someone else's slot cons synchronisation overhead multipath interference on wireless links code division multiple access (CDMA) frequency hopping send at a different frequency in each time slot direct sequence convert a single bit into a code pros hard to spy immune from narrow band noise all cells can use all frequencies cons implementation complexity need large contiguous frequency band for direct sequence method to convert wireless medium to duplex channel frequency division duplex (FDD) uplink and downlink use different frequencies time division duplex (TDD) uplink and downlink use the same frequency at different times centralised schemes slave can only transmit when master permits natural fit in some solutions wireless LAN: base station is the only station that can see everyone cellular telephony: base station the only one capable of high transmission power pros simple cons single POF (and need to have re-election procedure) master involved in every transfer (causes overhead for shall transmissions) circuit mode when station wants to transmit sends message to master using packet mode master allocates resources to slave which uses until is done no contention during transfer also known as reservation based schemes when 'a' is large can't used distributed schemes (too many collisions) e.g. satellite links stations contend for packet slots for reservation messages master then allocates main channel based on the reservation messages polling master asks each station if it wants to send can be reasonably efficient – ask those who have been sending more frequently probing stations are numbered with consecutive logical addresses master does binary search to locate next active station distributed schemes decentralised polling each station assigned a slot it can use if nothing to send the slot is wasted need a protocol to get time synchronisation decentralised probing probe in a pre-established order works well if 'a' is small works poorly if many active stations or if all active stations are on the substation CSMA(carrier sense multiple access) an improvement on the more simple distributed multiple access schemes check whether the medium is active before sending a packet don't have to wait for a time slot or permission from a master if the medium is free then can transmitter if a collision happens then detect and resolve it works well when 'a' (the number of packets sent by a source before the farthest station receives the first bit) is small possible solutions to the collision problem p-persistent on idle, transmit with probability p (hard to choose) if too p is small then wasted time if p is large then more collisions exponential backoff on a collision choose a timeout randomly from a range which doubles each time don't need to choose a p CSMA/CD the transmitter has some scheme to detect if there has been a collision on detection method wait until medium is free start transmitting check for collision after each bit transmitted if detected transmit a jam signal waits for time interval set by exponential backoff CSMA/CA used in wireless LANS can't detect collision because the transmitter overwhelms co-located receiver (think radio: shouting very loud then listening for a whisper – cannot be listening for the whisper of someone else whilst shouting) need explicit acks have to reduce collisions as collisions are more expensive method check to see if medium is busy if not then wait interframe spacing set a contention timer to an interval randomly chosen in the range [1,CW] on timeout send packet and wait if no ack assume packet lost try again with a doubled CW if another station transmits while counting down, freeze CW and unfreeze after the other station finishes transmission this only works when every station can receive transmissions from all other stations hidden terminal: some stations in the area cannot hear transmissions from others, though the base can exposed terminal: some, but not all, stations can hear transmissions from stations outside the local area methods of dealing with hidden terminals busy tone multiple access (BTMA) use separate frequency to send out a “busy tone” when receiving multiple access collision avoidance BTMA requires split frequency band (more complicated receivers) instead use explicit messages to indicate when receiver is busy transmitter sends Request To Sent message if receiver is free it will send back Clear To Send if transmitter receives a CTS it sends the data Ethernet uses CSMA/CD with exponential backoff on collision place jam on wire small 'a' so time wasted in collision is minimised maximum packet size 1500 bytes given two clocks at the same frequency they will drift this is the max no of bits which can be safely sent without missampling do so don't have to implement phase locked loop two part bridge is technically a switch getting faster: not “how to do” but “how to do cheap enough for PCs” to avoid needing QoS guarantees can just increase bandwidth doesn't work past 100 Gbps pros easy to set up requires no configuration (has built in unique hardware address) can segment LANs to reduce load cons at heavy loads users see large delays due to backoff non-deterministic service big overhead on small packers token passing improve on polling token gives right to transmit if don't want to transmit pass on the token prevents starvation when receive a data packet always pass it on (unless you were the sender) if you are the receiver set the ACK bit pros medium access protocol is simple and explicit don't need to carrier sense/synchronise etc. no collisions cons token is single POF all stations must communicate stations must actively monitor the network normally need one station as a monitor Fiber Distributed Data Interface dual counter-rotating rings supports both realtime and non-realtime traffic ALOHA just send the packet if there's no ack try again after waiting a random time no backoff pros useful when 'a' is large simple cons at high loads collisions are very frequent sudden burst of traffic can lead to instability Slotted ALOHA use discrete timeslots (can only start transmitting at the start of a slot) decreases possibility of contention reservation ALOHA contend for reservation minislots using slotted aloha stations independently examine reservation requests and come to consistent conclusions pros supports circuit and packet mode transfer works with large 'a' simple cons cannot pre-empt hogs Switching packet vs. circuit switches packets have headers samples don't (just an identifier) connection oriented vs. connectionless connection oriented need call setup setup handled in the control place goals maximise capacity (subject to cost/reliability constraints) capacity of switch is maximum rate at which it can move information minimise call blocking only for circuit switch circuit switch must reject a call if can't find a path for samples minimise packet loss only for packet switching packet switch must reject a packet if can't find a buffer to store it don't reorder packets for N identical inputs, trunk runs at N times the speed of one input demultiplexer similar neither mux nor demux need addressing information always in same place on trunk so know which is which channel's by time division mux inverse multiplexing – take high bit rate stream and scatter it across multiple trunks circuit switching move 8-bit sample from input port to output port generally inputs trunks are multiplexed (rather than 200 000 individual inputs) multiplexed trunks carry frames (sets of samples) extract sample from frame, depending on position in frame, switch to output call blocking internal blocking: there is an output slot but no path output blocking: no output slot available transit switch can put sample in one of several slots going to desired next hope reduces output blocking Time Division Switching when demultiplexing position in frame determines position in output trunk write n bytes from incoming trunk in order into n memory slots read slots out in a permuted order limited by read/write speed of memory Space Division Switching each sample takes a different path through the switch depending on its destination for multiplexed input need switching schedule multistage crossbar can save crosspoints if a crosspoint can attach to more than one input have more than one crossbar in the system you can have more than one path to the output use TSI for first stage, with crossbar for the middle and final stages delay samples so that they arrive at the right time for the space division switch's schedule time-space-time switching use TSI for first and final stages, with crossbar in the middle replace n input * k output space switch by TSI switch that takes n-slot input frame and switches it to k-slot output frame packet switching datagram: lookup based on the entire destination address cell: lookup based on VCI repeaters: physical level bridges: datalink layer based on MAC addresses discover who is on which side by listening routers network level participate in routing protocols application level gateways: treat entire network as single hop deciding on output port for a datagram (in circuit switching can do using table lookup as have stored during call setup) standard solution: use a trie a sorted tree data structure can improve performance by caching more recently used addresses move common entries to a higher level (i.e. match on longer strings) blocking (internal or output) must buffer or drop deal with by over provisioning: internal links much faster than inputs buffers at input and output switches first generation basically like having a load of network cards in a PC second generation port mapping intelligence in line cards bottleneck: bus or ring the self routing fabric provides parallel paths contention is for the output buffers potential for unlimited scaling switch fabrics is for transferring from input to output ignoring scheduling and buffering ways of doing crossbar need to compute schedule on the fly (unless fixed size in known arrival pattern) buffered crossbar broadcast packets are tagged with an output port number and sent to all ports need to match N stages in parallel at each output port only useful as part of a switch or in a small one switched fabric element if 0, send packet to upper output else to lower output if both packets to same output buffer or drop properties NxN switch with bxb elements has logbN elements self routing recursive can be synchronous or asynchronous (i.e. can use for variable sized packets) Banyan avoiding blocking check path is available before sending or sort by output merge sort in hardware remove duplicates recirculate to the beginning or run output of trap to multiple Banyans remove gaps perform a perfect shuffle split set in half and interleave i.e. 1 2 3 4 5 6 7 8 -> 1 5 2 6 3 7 4 8 packet size small fixed size easy to build large parallel fabrics more overhead – overhead dominates at high speeds variable size can fragment or build asynchronous fabric hence variable size not hurt too much buffer placement need buffers to match input rate to service rate need arbiter to say which queue's packet for output K can go head of line blocking if a 1 had been chosen from a different input then the 3 on this output could go through the fabric but is trapped multiple queues per input arbiter must choose which queue an input port sends but multicast is usually pathological for this output queueing output queues have to run faster than the trunk speed knockout principle use to reduce speed required drop extra packets, fairly distributing loss between inputs buffered fabric buffers in each switch element pros only need to be as many times faster than trunk speed as the degree of the fan in if you use hardware back pressure you can reduce the buffering requirements if the arbiter knows the buffers are full it can prevent a certain output number from entering the fabric cons costly hard scheduling multicast switches do in hardware port mapper has a list of outputs incoming packet must be copied to these output ports sub-problems generating and distributing copies implicit multiple outputs enabled after placing packet on shared bus bus/ring based, crossbar or broadcast switches explicit copy a packet at the switch elements use copy network header (VCI) translation for copies normally can do VCI translation at input or output for multicast input maps VCI to output ports output swaps the VCI Integrated services problems with IP data transfer no recognition of flows connectionless – no signals forwarding per datagram no examination of type of traffic, no priorities e.g. file transfer vs. VoIP routing no fixed path, so no fixed QoS scheduling in the routers FCFS – no examination of type of traffic requirements for an Integrated Services Network QoS service-level packet handling traffic description policing application flow description Service interface common data structures and parameters signalling protocol admission control check request can be honoured scheduling packet classification prioritisation of traffic queue management Traffic and QoS parameters generally a network is three level structure access network local low multiplexing and volume of traffic distribution network medium volume low multiplexing core network high multiplexing and volume of traffic mixed traffic in the network routers use FCFS queues no knowledge of traffic patterns or applications traffic is often aggregated in complicated ways in the core packet size large good for general data “router friendly” slows real time traffic small good for real time data less end-to-end delay (defined using: propagation transmission (data rate) buffering application specific processing) better tolerance to loss less jitter (defined using: jitter is variation in end to end delay causes FIFO queuing dynamic path changes, etc. only critical in TCP if it is a first order effect critical to voice, Video over IP etc. less efficient due to overhead/data ration not “router friendly” end to end loss non real-time retransmit (as in TCP) real-time forward error correction media specific “fill in” at the receiver reordering can be seen as loss causes packet dropping at routers traffic violations excessive load link/router quality end to end data rate short term changes: during the course of a flow long term changes: during the course of a day goodput – residual throughput after allowing for overheads (inc. retransmission) affected by protocol behaviour (TCP congestion control) dynamic routing congestion aggregation link multiplexing elastic (non-real time) applications asynchronous: e.g. email interactive: http, ftp, etc. inelastic (real time) applications streaming voice non interactive end to end delay and jitter not important data rate and loss very important real time voice QoS parameters for the internet delay cannot set a maximum, but can measure it (Dmax) jitter cannot set end-to-end jitter value loss not really QoS parameter for IP networks linked to data rate packet size restrictions may be used by routers for buffer allocation etc. data rate – specify using leaky bucket parameters B: bucket size in Bytes L: leak rate in Bytes/sec data poured into bucket maximum latency is B/L (i.e. how long it would take from the top of the bucket) token bucket parameters b: bucket size in Bytes r: bucket rate in Bytes/sec p: peak rate in Bytes/sec bucket fills with tokens at rate r presence of a token allows a data transmission transmission is allowed at a rate up to p real time media flows audio low loss preferable data rate basically fixed for certain applications video loss must be low data rate dependent on frame size colour depth frame rate encoding QoS services and application level service interfaces INTSERV (an integrate services network architecture) this is an architecture used to guarantee QoS on networks is a fine-grained QoS system concept: every router in the system implements IntServ every application that requires some kind of guarantees has to make a reservation "Flow Specs" describe what the reservation is for (describe service semantics) "RSVP" is the underlying mechanism to signal it across the network service templates/“flow specs”, consists of TSpec: allowed traffic pattern, normally bucket parameters RSpec: service request specification policing: check that the TSpec is adhered to if violated can change the packet handling (degrade service etc.) RSVP used for signalling resource requests between user, server and network receiver oriented: receiver of a flow initiates and maintains the resource reservation for that flow every 30s or so each server which can send QoS data sends out a message to see if any clients are interested if a client is interested it sends a message back indicating what it wants and what guarantees it needs each router along the way decides if it can deal with the resource request if not send back a cancellation message if so keeps soft state of reservation options routers then police according to that policy, dropping the flow after a timeout is flow based (vs. class based) – treat each flow differently problems lots of state being held lots of refresh messages clogging up the system applications and routers need to be RSVP aware - what about legacy applications? DIFFSERV is a coarse-grained QoS system is class based (vs. flow based) data packet placed in one of a limited number of traffic classes doesn't differentiate network traffic based on the requirements of an individual flow, but on the class of a packet each traffic class can be managed differently the packets are assigned to classes by the network (i.e. the ISP) this ensures user cannot always ask for highest QoS without paying also means that the end system is unaware that or how this happens good for legacy applications per hop behaviours (traffic classes) expedited forwarding: low loss, low latency traffic assured forwarding: has varying levels of drop precedence problems going from one DIFFSERV cloud to another (one ISP to another), one cloud's gold packet may be another's bronze no standards for per hop behaviour types or how to treat them INTSERV and DIFFSERV? INTSERV is good for per-flow reservations DIFFSERV is good for aggregate, per customer/application/group etc. run INTSERV reservations within DIFFSERV flows? RTP is carried in UDP packets supports multicast natively uses RTCP to permit application level control/signalling Real Time Control Protocol (RTCP) uses signalling channel to send data about the flow this will affect how the sender and receiver act as entities is not directly applicable to specific packets it provides QoS info for deciding how to shape the flow loss, delay, jitter user information etc. application level signalling user to network Telco network separate data and signalling path IP signalling information carried on the data path RSVP carried in IP packets along data path need aggregated signalling towards the core, e.g. using INTSERV with DIFFSERV? i.e. can't keep individual user to user signalling call/session set-up use to exchange each's capabilities application level signalling supported by network summary need QoS mechanisms for UP per flow INTSERV RSVP does not scale well customer/provider services DIFFSERV Scheduling and queue management traditional routing behaviour per packet, no idea of flows generally FCFS, no priority traffic this basic behaviour won't support QoS, need to modify it the input lines have no input buffering uses policy to classify packets different classifications go in different output buffer for each output scheduler decides which buffer to schedule next FCFS in this scheme use a default classifier queue packets to outputs in order they arrive service a packet as soon as possible after it arrives work conservation a system is work conserving if it is not idle when there is work to do (i.e. packets waiting) we cannot give all flows a lower delay than they would get under FCFS and we have the equation: where ρn = λn . μn ρn is the mean link utilisation qn is the mean delay due to scheduler λn is the mean packet rate (packets per second) μn is the mean per-packet service rate (seconds) C is a constant non work conserving schedulers these are permitted to be idle even if packets are available wait until a packet is eligible for transmission advantages permits traffic smoothing for flows makes downstream traffic more predictable less jitter possibly can get away with less buffer space disadvantages higher end to end delay complex to implement requirements of a scheduler simple to implement it needs to be implemented on high speed networks want little complexity so can have a fast (hardware) implementation fairness protect any flow from the misbehaviour of any other local fairness generally implies global fairness performance bounds on jitter, delay, loss ... can we do it so it's deterministic, provable, per flow? Admission control if it is required it needs to be fast, simple and efficient fairness: max-min fair share criteria this is an algorithm to provide fairness to protect flows from each other algorithm find the flow with the smallest demand for the resource, from a pool of n flows give it as much as it wants (no more), up to R/n where R is the remaining resource available i.e. flows which can't have as much as they want get an equal share of the resource available remove the flow from the pool of flows, decrement the amount of resource available go to 1 mn = min(xn , Mn ) 1 <= n <= N where C is the total capacity of the resource mn is the actual resource allocated to flow n xn is the resource demanded by flow n Mn is the potential resource available to n you can do weighted max-min share parameters in a scheduler number of priority levels how to serve the different levels especially to prevent starvation of lower levels is jitter/delay control for some levels this could make it non work conserving aggregation: will priority be decided per flow? per application? per user? how to service within a given level's queue possible schemes simple priority queueing k queues queue (k+1) has higher priority than queue k generalised processor sharing weighted max-min share serves an infinitesimally small amount of data from flow i visits flows round-robin is unimplementable but used as a standard relative fairness: fairness of a scheduler with respect to other flows it is servicing absolute fairness: fairness of a scheduler compared to GPS for the same flow weighted round robin different flows will have different mean packet sizes so to help make more fair divide the weighting coefficient by the mean packet size this can introduce unfairness if the mean packet size is estimated incorrectly service is fair over long time-scales deficit round robin each queue has a (deficit) counter dc which is initially zero scheduler goes through the flows in order attempts to serve the queue if size <= quantum + dc if it does serve it then it sets dc = quantum + dc – size otherwise it sets dc = quantum + dc the quantum is a 'credit' which builds up each round it is normally the maximum expected packet size weighted fair queueing is a GPS emulation estimates the time the packet would have completed under bit by bit (i.e. discrete) GPS packets are tagged with this finish number the smallest finish number across the queues is served first buffer drop policy drop packets already queued in order of decreasing finishing number can provide “best effort” queueing guaranteed data rate and deterministic end to end delay is only a queueing mechanism (still needs a scheduler) link capacity is shared there is class based allocation a node can “borrow” spare capacity from a parent queue management packet dropping policies drop from tail drop from head better for TCP random drop fair if all sources are behaving flush queue (drop all packets) intelligent drop needs lots of higher level state information reactions of end systems to packet drops non-real time – TCP backs off real time – UDP application level control flow required real time data flows may not be able to adapt hence this hurts flows which do adapt (as they have to adapt for the non-adaption of other flows as well as for the general congestion) congestion control random early detection method monitor the average queue size and drop packets based on statistical probabilities if the buffer is almost empty, all incoming packets are accepted as the queue grows, the probability for dropping an incoming packet grows too when the buffer is full, the probability has reached 1 and all incoming packets are dropped any time a packet is dropped the sender is notified and hopefully slows down this process should reduce the cycle of the router becoming full (the network is congested) all incoming packets are dropped all incoming flows reduce flow rate drastically the network becomes underutilised all flows build up again at roughly the same rate repeat explicit congestion notification works using IP, only if both ends signal that they will use it on congestion a router sets a bit in the IP packet in the ACK the received indicates this congestion the received must act (in TCP terms) as if the packet was dropped (in terms of reducing flow, not in terms of retransmission) we really need adaptive sources issues fairness and protection scalability granularity speed of operation flow adaptation aggregation signalling router state ECN? Routing for integrated services routing requirements for integrated services multiparty communications multicast multi-user games etc. support for QoS in routing QoS based routing traditional (telephone) routing the destination determines the path there is only one optimal path to the destination properties of QoS routing multiple paths possible different paths have different QoS properties routing updates include QoS parameter information can do in IP header not widely used – no global agreements there are lots of metrics available to determine QoS, depending on what you want possible metrics minimum delay path maximum throughput path maximum reliability path minimum cost path if the router is having to deal with choosing a path for a packet based on its indicated preferences there will be a lot of overheads larger router updates (as QoS parameter information is passed in updates, as above) more state is stored in routers more processing per packet there can be advantages to staying on the same route, even if a “better” one becomes available, and there are a couple methodologies to do this route pinning ensures that, whilst there are still packets coming from A to B, in any flow, that they all follow the same route this can cause congestion (as it defies normal routing ideas of taking the best route for each packet) path pinning ensures that all packets from A to B in flow F go along the same route if a new flow starts it may take another route this can be problematic if there is a long lived flow and the route has since become a much worse choice intra domain QoS routing can use agreed single/multiple metrics need to indicate any disruptions to QoS along a path need to support best effort traffic is really a research issue inter domain QoS routing needs to be scalable QoS routing shouldn't be very dynamic don't want lots of updates shouldn't constrain intra-domain QoS routing QoS routing for multicast has to support widely dispersed groups dynamic membership changes must scale across domains is really hard... Traffic management this is a set of policies and mechanisms that permit a network to efficiently satisfy a diverse range of service requests needed for next generation data networks economic principals utility function it is assumed that there is a function from a given quality of service to a utility utility in this case is “level of satisfaction” this function is determinable by observing how rational users act e.g. a file transfer u is utility from file transfer S is satisfaction from a infinitely fast transfer a is rate at which satisfaction decreases with time t is transfer time u = S – a . t where NB, if t > S/A then the user is worse off (negative utility) social welfare social welfare is maximised when some combination (normally summation) of all the utility functions is maximised efficiency: a network is efficient when increasing the utility of one user must necessarily decrease the utility of another envy-free: a network is envy-free if no user would trade places with another (NB better performance costs more) attempt to maximise social welfare, envy freeness and still make a profit a single network which provides separate QoS is better than separate networks for each QoS can share unused capacity lowering delay of delay-sensitive traffic increases welfare will the person downloading program really notice an extra couple seconds, verses VoIP normally welfare increases more than linearly with an increase in capacity applying the principals it's better to give 5% of the traffic low delay than all the traffic slightly lower delay the big question if we took the money we spent on attempting to control the traffic and just blindly increased capacity, would we be better off? traffic models to do the above we need a traffic model of how the users behave can either measure or guess telephone traffic models time between calls is an exponential distribution (Poisson process) length of calls is an exponential with a heavy tail (significant numbers of calls are very long) internet traffic models LAN connections differ from WAN connections many parameters are heavy-tailed (a few calls are responsible for most of the traffic) traffic classes traffic classes encompass both user requirements what the network service offers e.g. telnet requires low bandwidth and low delay basic classes guaranteed service utility is zero unless an application gets a minimum level of QoS that QoS could be in one or more of bandwidth, delay, loss, jitter ... typically synchronous and real-time best effort typically asynchronous and indifferent to real time e.g. email ATM guaranteed service subclasses constant bit rate mean and peak rate are the same variable bit rate long term average with occasional bursts try to minimise delay e.g. constant quality variable bandwidth audio available bit rate user gets whatever is available unspecified bit rate like ABR but without feedback time scales actions are taken at different rates during a call some once per call (e.g. resource requests) some every few RTTs (e.g. feedback flow control) some very rapidly (e.g. policing) traffic management mechanisms need to deal with a range of traffic classes at a range of time scales mechanisms renegotiation static per-call descriptors doesn't make sense for many traffic sources e.g. if the network serves a real time video stream at slightly less than then agreed average rate it will need a large amount of buffering space to keep up renegotiating service rate every 10 sec gets the bandwidth requirement to nearly average rate note that renegotiation does cost in terms of overhead renegotiated constant bit rate (RCBR) service all traffic sent as constant bit rate CBR is renegotiated if necessary buffers are at the edge of the network is easy to price signalling this is a source signalling its utility function to the network, in two parts how to carry the message (transport) how to interpret it (semantics) how the signalling works setups are sent and acknowledged per hop resources are reserved tentatively is hard due to complex services, feature interaction, performance tradeoffs... resource translation the application wants an end-to-end quality this needs to be translated into per-hope requirements two pass approach forward: attempt to get as high a QoS at each hop as possible reverse: relax the actual resources reserved so that it fits the required end to end quality signalling on the Internet: RSVP the main motivation behind this is to efficiently support multipoint multicast with resource reservations multicast reservation styles naïve (source initiated) source contacts each receiver in turn intelligent multicast (merge replies) the source needs to know all receivers and the rate at which they can absorb there are two messages per link of the spanning tree doesn't scale naïve multipoint multicast two messages per source per link can't share resources among multicast groups RSVP receiver initiated state on reservation is kept per group, not pre connection admission control constant bit rate admission control simple on failure try again best effort admission control trivial variable bit rate admission control the peak rate differs from the average rate need to decide how much over the average rate to book trade off between wasting bandwidth or risking dropping packets in a burst options peak rate admission control reserve at the connection's peak rate pros simple zero fluid delay and zero loss cons wastes bandwidth worst case admission control source is described using average rate burst size use weighted fair queueing or rate controlled discipline to reserve bandwidth at an average rate pros may use less bandwidth than with peak rate can get an end-to-end delay guarantee cons complex admission with statistical guarantees as the number of calls increases, the probability that multiple sources send a burst decreases the sum of connection rates is increasingly smooth with sufficient sources all can be assumed to be arriving at their average rate traffic from a source is sent to a buffer which is drained at rate e if a burst is sent, its delay goes up for too large a burst packets are dropped equivalent bandwidth: the rate at which we need to drain the buffer so that the probability of loss is less than i and the delay in leaving the buffer is less than d many sources will generally share a buffer admission control method in this case a request from a source arrives use its performance requirements and network state to assign it an equivalent bandwidth the sum of equivalent bandwidths should be less than the link capacity pros can trade off a small loss probability for a large decrease in bandwidth reservation can obtain delay bounds cons assumes uncorrelated sources hairy mathematics measurement based admission for traffic which cannot easily describe itself a metric is obtained by measuring actual average load the user gives a peak that it expects if average + peak < capacity then admit problems: assumes past performance is indicative of future performance when to 'forget' about the past? peak load pricing service providers want to avoid overloading use all available capacity traffic has a periodic daily distribution showing peaks “price” can be used to move some traffic to off peak times “price” is a signal to the user from the provider about network preferences in deciding pricing have to create utility function to show the weight the network places on using capacity and on revenue the weight the user places on cost and on lack of overload rational users help the system by helping themselves capacity planning this is the design of modifications to network topology link capacity routing choices to most efficiently use existing resources or to alleviate long term congestion approach measure the network during its busy hour measure traffic from end points to end points we're just modifying the internal network, not end points find the busiest hour by measuring in general a week or so add a “fudge factor” to this rate for future growth create traffic matrix this is the number of bits sent from each source to each destination assume this pattern will predict future behaviour decide topology, which depends upon: connectivity between arbitrary points geographical considerations – some links may be easier to build than others existing capacity assign capacity assign sufficient capacity to carry the busy hour's traffic however routes are decided dynamically using link and congestion status it is easier to assign capacities if routing is static and links are always up (telephone network) some open problems resource translation we can do this for bandwidth doing it for delay is hard can either translate delay into an equivalent bandwidth obtain per hop delay bounds renegotiation static descriptors aren't that good for dynamic traffic renegotiation is not free as there s signalling overhead problems: when to renegotiate? admission control? what to do on renegotiation failure? measurement based admission over what time interval to measure the average? peak load pricing how to choose prices for on and off peak when peak times should end capacity planning simultaneously choosing topology, link capacity and routing metrics as routing and link capacity interact smart vs. dumb getting low delay for VoIP means giving low delay for all traffic but if we took the money spent on traffic management and increased capacity we'd get the lower delay? Miscellanea scaling/complexity all mechanisms added to IP have some cost we would like this cost to be O(c) normally it's O(n) in number of end systems number of routers per flow queues there will be a data structure for every active pair of senders/receivers per class queues data structure per class edge systems have to do policing and shaping may get O(log(n)) stability traffic, both data and control, should be stable we need to do a theoretic analysis to figure out if it is oscillatory should still converge or be bounded design to cause stability end to end congestion system: damped feedback routing systems: randomised timers we don't want to do alternate path routing and admission control on the same timescales mobile issues wireless links throughput, delay and loss can vary on wireless links offering hard QoS is hard it may be coordinatable if different layer two media can cooperate and do handoff properly patterns mobile access patterns may be different from fixed ones (we don't know) mobile multicast with a moving source or sink could be complex (need to rebuild the tree) resources some QoS approaches are based on the network being essentially underloaded expedited forwarding and assured forwarding only work for VoIP if it's a small percentage of the traffic this isn't the case on many wireless links pricing you have to charge for QoS how much are users prepared to pay? how to stop the rich from buying all the bandwidth and resell it at a different price? normally access fee can cover the cost of infrastructure the bill is an incentive scheme to stop users hogging capacity varying the price with network conditions – congestion pricing is theoretically optimal generally too complex for users predictable bills are often more important than cheapest fare provisioning users don't like being refused access we need to dimension the network to balance user satisfaction and revenue levels implementation novelties active networks allows the packet to provide code to execute on routers and nodes in the network was investigated as it was becoming hard to deploy new services in telephony and IP networks due to the scale of the infrastructure can reprogram the switch on the fly (possibly per packet) multi protocol label switching (MPLS) getting from the source to the destination(s) as fast as possible not quite the same as QoS provisioning is intended to replace ATM (similar technology) flows are identified and labelled label switching packet enters label switching router (LSR) destination and new label looked up in table new label applied, sent to output port labels packets that share the same next hop share the same local label that label is determined by a downstream label switching router routing information is used to distribute labels piggy back the label information onto existing protocols label stacking it's like source routing the LSRs along the way pop off the next label on the stack makes it even faster performance issues router architectures fast route table lookup fast packet classification (for QoS) better address aggregation traffic engineering (differentiated services) faster boxes or smarter software? (soft vs. dumb) if you push something out of one part of the architecture, it will show up somewhere else
statistical multiplexing
Protocol Layering
do routes depend on current network state (e.g. delay)
lollipop
![]()
![]()
![]()
efficiency issues
multicast tree
pruning
Core Based Tree (CBT)
flood and prune good for dense groups (only need a few prunes)
each cell phone has a global ID that it tells remote MTSO when turned on (using slotted ALOHA up-channel)
Very similar to mobile telephony
loops
![]()
![]()
![]()
![]()
![]()
![]()
s is burst limit
resolving IP addresses to ATM addresses
partitions:




in basic CSMA there is no collision detection in the transmitter, it is detected as a frame error (indistinguishable from any other) in the receiver and some error control protocol is used
multiplexor

crossbar

time space switching
need to find longest prefix match
back pressure: if switch fabric doesn't have buffering, prevent from entering until path available

third generation
buffer the crosspoints
switched fabric element fabrics
if two packets want to go to the same output there is output blocking
input buffering (queueing)
solution:
![]()
simple router schematic:
![]()

Class based queueing