Ana səhifə

Computer Science 461 Final Exam Friday March 23, 2008 7: 30-10: 30pm This test has eight (8) questions. Put your name on every page


Yüklə 124.5 Kb.
tarix24.06.2016
ölçüsü124.5 Kb.
NAME:
Login name:



Computer Science 461

Final Exam

Friday March 23, 2008

7:30-10:30pm

This test has eight (8) questions. Put your name on every page, and write out and sign the Honor Code pledge before turning in the test.


Please look through all of the questions at the beginning to help in pacing yourself for the exam. The exam has 100 points and lasts for 180 minutes.
``I pledge my honor that I have not violated the Honor Code during this examination.''


Question

Score

1 (10 points)




2 (10 points)




3 (15 points)




4 (10 points)




5 (15 points)




6 (15 points)




7 (20 points)




8 (5 points)




Total




QUESTION 1: Weighting it Out (10 points)
This question explores how to set the (configurable) link weights in link-state routing protocols like OSPF and IS-IS inside a single Autonomous System (AS) to achieve AS-wide goals.
1a) How should the network operators set the link weights if their goal is to minimize the number of hops each packet traverses to reach its destination? (2 points)
All the weights should be set to the same value (e.g., “1”).
1b) How should the operators set the link weights to minimize the end-to-end delay the traffic experiences? Assume the network is lightly loaded, so queuing delay is insignificant. (2 points)
Propagation delay is the only source of delay. As such, the weight of each link should be set proportional to its length (i.e., its propagation delay).
1c) In the picture below, the nodes are routers, the edges are links, and the integers correspond to the link weight on each direction of the link. (That is, the link a-b and the link b-a both have weight 10.) Put arrows on the edges to show the shortest path from every node to the destination node d. That is, show the “sink tree” leading to node d. (2 points)


Nodes a and g take a direct path to d. The other nodes go left until they reach a, d, or g.
1d) Suppose the link f-e is overloaded with traffic. Identify a single weight change (on just one link) that would divert traffic from source f to destination d away from the f-e edge without affecting the path between any other source-destination pairs. Avoid any reliance on how routers choose between multiple paths with the same (smallest) cost. (4 points)
The h-i link should be changed to 2. That makes the cost of path f-i-h-g-d decrease from 28 to 21, which is less than the cost 22 of the path f-e-d.

QUESTION 2: Route Views (10 POINTS)

This question explores the role of routing policy on BGP routing. In the picture below, the nodes are Autonomous Systems (ASes). Dotted lines correspond to peer-peer relationships, and solid arrows correspond to provider-customer relationships, with the arrow pointing from provider to customer. Every node prefers customer-learned routes over routes through peer ASes, and peer-learned routes over routes through provider ASes. Among routes through the same kind of neighbor, each AS prefers shorter AS paths over longer ones. An AS does not export routes learned from one peer or provider to other peers or providers.


2a) Draw all of the paths to destination AS d. That is, draw the “sink tree” leading to d. Circle any ASes that do not have any route to d. (2 points)

The paths are b-a-d, g-d, and h-e-d. Nodes c, f , and i cannot reach d.
2b) What path does e take to reach i? Suppose f decides to terminate the peering relationship with AS e. Can AS e still reach destination i? If so, what is the new path? (2 points)
Node e takes the path e-f-i. With the e-f link removed, the path is e-b-c-f-i. The path e-h-i is not used because e’s customer h would not advertise a peer-learned route to its provider.

2c) In the original picture from 2a, suppose ASes a and b provide a “dump” of all BGP routes they learn for every destination to a public measurement repository. Suppose a researcher extracts all links from all AS paths in those “dumps” and uses them to construct a view of the AS-level topology of the network. Draw a picture of the resulting inferred topology. (3 points)
The graph is missing the peering edges d-e, d-f, g-h, and h-i. Node a learns only customer-learned routes from its customer d and peer b, allowing it to learn a-d, a-d-g, a-b, a-b-e, and a-b-e-h. Node b learns only customer learned routes from its two peers a and b, and its customer c, allowing it to learn the routes that a learned, plus b-e and b-e-h from e, as well as b-c, b-c-f, and b-c-f-i from c.
2d) What is the minimum set of ASes that must provide “dumps” of every AS path they learn for every edge in the graph to be visible in at least one dump? List the ASes. (3 points)
Two nodes: b and h (or e and h). Node h has paths h-g (to g), h-i (to i), h-e (to e). It learns the paths h-e-d, h-e-f, h-e-b, h-e-b-a, and h-e-b-c from e. However, since e is using the path e-d to reach d and the path e-f to reach f, e does not announce the paths e-b-a-d or e-b-c-f, even though e does learn those paths. As such, the dump for node h reveals all edges except a-d and c-f, which are in a routing-table dump at either e or b.

QUESTION 3: Casting Party (15 points)
Anycasting is a network addressing and routing scheme whereby data are routed to the “nearest” or “best” destination as viewed by the routing topology. This allows multiple servers (replicating the same content) in different locations to have the same IP address. In the Internet today, this is typically implemented by announcing the same address block in BGP from many locations. The routers then forward packets toward whatever route is “best” (from their viewpoint) for reaching that destination address block.
3a) The operator of an authoritative DNS server (such as, say, root nameserver K) may provide access to a single named server in multiple locations spread throughout the Internet. In addition to having more machines to handle the high request volume, give two reasons why the operator may want these server machines located in different parts of the Internet. (2 points)
(i) reliability (in case there is an outage at one location), (ii) lower user-perceived latency (by directing requests to nearby servers), and (iii) lower network load (by taking shorter paths).

3b) Why might two packets sent by the same DNS client to the same anycast address reach different DNS server replicas in different location? Why is this acceptable? (3 points)


If a link fails in the network, the resulting routing change may make a path to a different server replica more attractive than the path to the original replica. This is acceptable because each DNS request packet is independent, and all DNS servers have the same information. There is no client-specific state maintained at the server.

3c) Why is having the packets go to different replicas unacceptable for some other applications? For example, why might this approach cause a Web download to fail? (3 points)


In contrast, HTTP is a stateful protocol and transfers span multiple packets. If (say) the Web client’s TCP ACK packets go to a different Web server than the original request, the original server will not receive the ACKs and will think the data packets were not received. The new server will receive unsolicited ACK packets and would likely respond with a TCP RST.

3d) Replicated Web sites typically use DNS to direct different clients to different server replicas, e.g., returning different IP addresses for www.cnn.com. Give two reasons why CNN’s DNS server might respond to DNS queries for www.cnn.com with different IP addresses. (4 points)


(i) proximity (directing each client to a nearby server), (ii) load balancing (directing a client to a lightly-loaded server), and (iii) customized content (directing different clients, e.g., from different geographic regions, to different versions of the Web site).

3e) Services using IP anycast typically have a large TTL value on their DNS entries. Explain why large TTLs are acceptable. Services (like Web sites) using DNS to select servers typically need to use small TTL on their DNS entries. Explain why small TTLs are necessary. (3 points)


Anycast automatically provides failover when a server fails, and recovery when a server comes back up, so there is little need to change the name-to-address mappings.
In contrast, for DNS-based server selection, the client will cache a name-to-address mapping even after the corresponding Web server has failed. Small TTLs are needed for reasonable failover times. Also, when DNS-based redirection is used for load balancing, small TTLs are useful for giving finer-grain control over server selection.

QUESTION 4: Think Locally (10 points)
This question explores local-area network technologies.
4a) Identify whether the following statements are true or false. Circle your answer. (2 points)

An Ethernet switch maintains a mapping of IP addresses to MAC addresses: true false
An end host maintains a mapping of IP addresses to MAC addresses: true false

4b) Running tcpdump (or wireshark) as root on your own computer will allow you to see some traffic that was neither sent nor received by you. Assume the switches already know how to forward traffic to each MAC address, so that all unicast traffic is delivered without trigging any flooding. List two things that your packet monitor could discover about other users on your LAN. (2 points)


DHCP queries reveal when the computer comes on the network, revealing when a person is in the building. ARP queries reveal the address of other hosts the computer communicates with (e.g., via Skype). Broadcast-based discovery protocols reveal when the computer is, say, using iTunes or accessing a printer.

4c) The guest lecture on SEATTLE discussed ways to make Ethernet scale better. Explain the techniques used to convert ARP and DHCP queries from flooding to unicast. (2 points)


A broadcast ARP query (to learn the MAC address associated with a given IP address) sent by a host is converted by the attached Ethernet switch into a unicast query to the switch associated with the hash of the IP address.
A broadcast DHCP query sent by a host is converted by the attached Ethernet switch into a unicast query, either based on the hash of “DHCP” or based on a static configuration of the DHCP server addresses.

4d) SEATTLE uses a one-hop DHT (running directly on the Ethernet switches) to store host-location information, and a link-state routing protocol amongst the switches. How do the DHT and the link-state protocol react when a link fails? When a router fails? In both cases, assume that the network topology remains “connected” – i.e., every router can still reach every other “up” router. (4 points)


DHT, link failure: no change

DHT, router failure: remapping of the portion of the hash ring handled by the failed router

Link-state protocol: recomputes shortest paths with the link or router (and incident links) removed

QUESTION 5: Internet Pathology Report (15 points)
5a) Suppose a router along the path through the Internet has an MTU “mismatch” and wrongly drops packets larger than (say) 1000 bytes. Why would simply running traceroute fail to reveal the offending router? (4 points)
Traceroute packets are typically small, so they would be unaffected by the error.

5b) Suppose a university with address block 12.1.0.0/16 has a link connected to AT&T, where the AT&T router forwards packets destined to 12.1.0.0/16 to the university router. Suppose the university router has three forwarding entries: 12.1.1.0/24 out the link to the math department, 12.1.2.0/24 out the link to the CS department, and a “default route” for 0.0.0.0/0 pointing to the AT&T router. Suppose a host in the rest of the Internet sends a packet destined to 12.1.57.109. What would happen to that packet? What could be done to prevent it? (4 points)


The packets would loop between the AT&T and university routers, because AT&T would forward the packet to the university (using the route for 12.1.0.0/16) and the university would forward the packet back to AT&T (using the default route 0.0.0.0/0). The university should configure a “null route” to drop all packets matching 12.1.0.0/16 to prevent this. See http://www.nanog.org/mtg-0602/gao.html for details about this issue, and the security vulnerabilities associated with it.

5c) Recently, Comcast was discovered to be blocking BitTorrent traffic by injecting (forged) TCP RST packets to BitTorrent clients to interrupt ongoing connections between (say) Alice and Bob, where Alice is a customer on the Comcast network. Explain how Alice and Bob could collect measurements to verify that the network is injecting forged packets. (3 points)


Alice and Bob could each run a packet monitor, like tcpdump or wireshark, and count the number of TCP RST packets. If Alice receives more than Bob sends, some packet were likely forged. See http://www.eff.org/wp/packet-forgery-isps-report-comcast-affair for details.

5d) Consider a network with four ASes speaking BGP. ASes 1, 2, and 3 are trying to select routes to a destination in AS d. Each AS prefers a two-hop path through their clockwise neighbor over a direct path (i.e., AS 2 prefers “2 3 d” over “2 d”), and the ASes do not export any other routes (e.g., AS 3 does not export “3 d” to AS 1, and AS 2 would only export “2 d” to AS 1). What decisions will the ASes make? How will the system behave? (4 points)





The system oscillates. Suppose initially that all three nodes pick a direct route (i.e., 1-d, 2-d, and 3-d). Upon learning a two-hop route from its clockwise neighbor, a node switches from the direct route to the two-hop route. For example, upon learning the route 1-2-d, node 1 switches from 1-d to 1-2-d, forcing 3 to use 3-d, but then that makes 2 want to switch to 2-3-d and withdraw 2-d, forcing 1 to go back to using 1-d, and the process repeats indefinitely. See http://portal.acm.org/citation.cfm?coll=GUIDE&dl=GUIDE&id=508332 for details.

QUESTION 6: Scalability of Peer-to-Peer Systems (15 POINTS)

This question explores the scalability of peer-to-peer systems, compared to client-server systems, for live streaming applications, such as Internet television. The server wants to deliver a video stream to six (6) receivers, where each receiver receives the stream live at the same streaming rate. A higher streaming rate corresponds to higher video quality, so the system should be designed to stream the video at the highest possible rate. The server has a link connected to the Internet with 30 Kb/second uploading bandwidth. All receivers have bidirectional links to the Internet–with 50 Kb/second uploading bandwidth and 50 Kb/second downloading bandwidth. Assume the core of the Internet has ample bandwidth.


6a) What is the maximum streaming rate that the system can support in a client-server configuration (i.e., where the receivers do not upload to each other)? That is, what is the rate of the stream received by each of the six receivers? (3 points)
The server can upload six copies of the stream at 30/6 = 5 Kb/second each.
6b) Now, suppose the receivers can upload to each other. A receiver can upload a stream of rate r only if he receives the same stream of rate r from another receiver or the server. For example, streaming trees shown as follows:



The trees achieve a total streaming rate of 8 Kb/second to four receivers, with two trees of streaming rates 5 and 3 respectively. The server transmits two copies of the stream at rate 5 and two at rate 3, consuming a total of 16 Kb/second of upload bandwidth. In our problem, with the upload and download capacities listed at the beginning of the question, what is the maximum streaming rate that can be achieved? Draw a set of trees (one or more trees) that achieve your streaming rate. You need not worry about minimizing the depths of the trees. (5 points)


A chain from nodes from the server through receivers 1, 2, 3, 4, 5, 6 can deliver the stream at 30 Kb/second.
6c) Now, suppose the receivers have a smaller uploading capacity of just 10 Kb/second. What is the maximum streaming rate that can be achieved in this case? Draw one or multiple streaming trees that support your streaming rate. (7 points)
Three trees at 5 Kbps also work with:
Tree 1: server transmits 5 to 1 (who transmits to 3 and 4) and 2 (who transmits to 5 and 6)

Tree 2: server transmits 5 to 3 (who transmits to 1 and 2) and 4 (who transmits to 5 and 6)

Tree 3: server transmits 5 to 5 (who transmits to 1 and 2) and 6 (who transmits to 3 and 4)
This ensures the server transmits 30 Kb/second and each receiver transmits 10 Kb/second. There are other valid trees besides the three listed above..

QUESTION 7: Torrential Downpour (20 points)
7a) Suppose there are many peers transferring chunks of the same large file. Let u denote a peer’s total uploading rate to other peers, and d denote its total downloading rate from other peers. What is a good approximation of the ratio u/d in BitTorrent? Why? (2 points)
Approximately 1, due to tit for tat.


7b) BitTorrent does not work well for unpopular files. Why? (3 points)
The aggregate upload bandwidth in the system is much lower, since fewer peers are in the swarm. There is also a higher risk of starvation as peers who have already downloaded the file leave the system.

7c) The author of BitTorrent claims that “incentives build robustness in BitTorrent.” However, a peer can still cheat in BitTorrent, i.e., downloading entire files without uploading any chunks to any other peers. How is this possible? (4 points)
A selfish peer could download only from the seed or through the “optimistic unchoking” process where other peers give him a chance.

7d) Suppose you are nice enough so you don’t want to totally free-ride in BitTorrent as suggested in (7c). But you can still game BitTorrent to make your download faster. Suppose you can see six peers in the system, each has a total uploading bandwidth (in Kbits/second) as indicated in the second column. Each of them exchanges chunks from other five other peers, who offer them the downloading rates listed as follows.





Uploading bandwidth

downloading rate from other peers (in Kbits/second)

Peer 1

10

15

12

9

7

3

Peer 2

30

23

15

6

4

2

Peer 3

5

9

8

7

2

1

Peer 4

45

20

12

6

3

1

Peer 5

15

45

23

11

8

7

Peer 6

20

23

22

18

9

8

For each of the six peers, at what rate would they upload to each of the five peers they are peering with? Assume these six peers run the conventional BitTorrent protocol. (2 points)


The six peers each divide their upload bandwidth evenly amongst their five peers. So, they upload at 2, 6, 1, 9, 3, and 4, respectively.
7e) For each of the six peers, what is the minimum rate would you have to upload data to ensure that the peer would want to upload data to you? Round to the nearest integer. (3 points)
The goal is to be in the top four by sending just slightly larger than the 4th largest peer. Rounding down to the download rate of the 4th largest peer, that corresponds to the second-to-last column in the table. That is, 7, 4, 2, 3, 8, 9.

7f) You are connected to the Internet with a 9 Kb/second uploading bandwidth and unlimited download bandwidth. Which peers should you upload to, and at what rate, to maximize the download rate you achieve? What download rate do you achieve? Show your work. (6 points)


The upload rate you must expend is a “cost”, and the download rate you achieve a “value”, where you have 9 units of currency to spend. (Theoretically speaking, this is a “knapsack problem.”)
Costs: 7, 4, 2, 3, 8, 9

Values: 2, 6, 1, 9, 3, 4
The last first, fifth, and sixth peers are too expensive. Using peers 2, 3, and 4, costs 9 units of upload bandwidth (4 + 2 + 3) and brings 16 units of download bandwidth (6 + 1 + 9). So, uploading to peers 2, 3, and 4, at rates 4, 2, and 3, respectively, is the best strategy, and achieves a download bandwidth of 16 Kb/second.

QUESTION 8: Google, Unplugged (5 points)
The Google search engine is optimized for interactive use. Users enter search terms and get reasonably good responses in less than a second, and then either revise their searches or download the recommended Web pages. Suppose Google wants to extend their service to support users with sporadic access to the Internet, via a motorcycle that makes daily visits to bring information to/from the village on CDs or USB drives. Briefly summarize the key ingredients of a design for “Google, Unplugged”. There is no one “right” answer.
There are many good answers here. Batch queries, perhaps with some identifier per user for privacy when returning query results. Return more query results, as well as the contents of the pages, perhaps following links several levels deep. Return results for common misspellings. Return results for popular requests and sites. Do a more computationally expensive (and hence time consuming) search for better results, since responding in “less than a second” is not important. Identify popular search topics for the region. Allow queries by cell phone or PDA. Return responses in local language. Provide all of Wikipedia (including differences since the last time Wikipedia was downloaded). Etc.





Verilənlər bazası müəlliflik hüququ ilə müdafiə olunur ©atelim.com 2016
rəhbərliyinə müraciət