forked from dileepramesh/Reliable-UDP
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME.txt
283 lines (215 loc) · 12 KB
/
README.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
README for CSE 533 Assignment 2 (Network Programming)
-----------------------------------------------------
Authors: Dileep Ramesh (108028179)
Manikantan Subramanian (108231320)
In this assignment, we have implemented a reliable file transfer service in a
client-service paradigm on top of an existing UDP connection. TCP like
features have been implemented on top of this existing UDP connection to make
the file transfer ordered and reliable.
Following are the some of the important changes that we have implemented as
part of this assignment:
1. Maintaining multiple listening sockets on the Server
-------------------------------------------------------
- The server creates multiple UDP sockets to listening for incoming
connections, one for each interface. The get_ifi_info() routine has been
used to obtain information related to each interface. This is stored in an
array of structures of the type conn_info_t which is defined as follows:
typedef struct conn_info_s {
char ifname[IFNAME_LEN];
int sock;
struct sockaddr_in ip;
struct sockaddr_in netmask;
struct sockaddr_in subnet;
} conn_info_t;
There is one instance of this structure for every interface on the system.
The server listens on all these interfaces for incoming connections.
Additional changes have been introduced to the get_ifi_info() routine to get
the netmask associated with the interface address as well.
- The get_ifi_info() routine issues an ioctl and gets all the addresses
associated with an interface in a structure of type ifi_info. This
includes the primary unicast address (ifi_addr), broadcast address
(ifi_brdaddr) and the destination address (ifi_dstaddr). We use only
ifi_addr and bind this address to the corresponding listening socket.
2. Modifications done to the RTT library functions in rtt.c and unprtt.h
------------------------------------------------------------------------
- Timers for detecting timeouts and retransmitting lost packets and
acknowledgements have been implemented using the rtt_* apis provided in
rtt.c. Since timeout values are in the sub-second range, we have used
itimers instead of alarm().
- The functions rtt_init(), rtt_ts(), rtt_start() and rtt_stop() have been
modified to measure time in microseconds rather than seconds. This
modification has been done as the measured RTT in the compserv and sbpub
machines were in the sub-second range.
- All the rtt_* functions have been modified to compute and store RTT and
Smoothed RTT values as integers rather than floats. As part of this, struct
rtt_info has been modified to store rtt_rtt, rtt_srtt, rtt_rto and rtt_base
as type int rather than float.
3. Flow Control and Congestion Control implementation
-----------------------------------------------------
Window Size
-----------
The sender and the receiver window size is specified in the server.in and
client.in files respectively. On the server, this parameter represents the
maximum size the congestion window can grow to. On the receiver, this
represents the maximum number of packets that can be buffered.
Slow Start
----------
The server will initially be in the slow start phase with a congestion
window size of 1. For every valid acknowledgement received, it increases it's
window by 1 MSS (1 packet of 512 bytes in our case). Thus, for the first
ack, window grows from 1 to 2 and 2 packets are sent in the next iteration.
Both the acks will be received in the same RTT initially, so the window
grows from 2 to 4, effectively doubling each time. This continues till the
server either times out on an acknowledgement or the SS threshold is hit
(Initial value is set to the receiver's window size advertised during
connection establishment).
In every iteration, the server tries to send as many packets as possible.
This packet count is the minimum of the following 3 parameters:
- Current free slots in the server's congestion window
- Advertised window size of the receiver
- Number of packet's still pending to be sent in the server's buffer
Congestion Avoidance
--------------------
Whenever the server times out or gets 3 consecutive duplicate
acknowledgements, it moves from slow start to congestion avoidance phase.
In this phase, the congestion window grows linearly rather than
exponentially. The current congestion window size is halved and stored as the
new SS threshold. Whenever the congestion window size grows beyond the SS
threshold, the window size is increased linearly instead of exponential
increase. This is implemented by counting the number of valid acks and
increasing the window by 1 MSS only when 'congestion-window size' number of
vaalid acknowledgements are received.
Handling Cumulative acks
------------------------
When packet drops are triggered, the server receives duplicate
acknowledgements for lost packets. When the lost packets are retransmitted,
the client may acknowledge multiple packets at once. This comes as a
cumulative ack to the server. Here, we have 2 cases:
- All the acked packets are in the congestion window. In this case, we
simply slide our window by updating the start aand end pointers. We also
update the window size accordingly based on the current congestion state.
- The cumulative ack is greater than the highest numbered packet in the
sender's congestion window. This happens if the server as shrunk his
congestion window and the receiver then sends a cumulative ack. In this
case, all the packets in the congestion window are acknowledged. So, we
drain the congestion window and start from the next unacknowledged packet.
Handling duplicate acknowledgements: Fast Retransmit and Fast Recovery
----------------------------------------------------------------------
When a packet is lost and the server times out, we half the current window
size and store it as the new threshold. The server moves back to slow start
phase with window size of 1. It moves to congestion avoidance once the
window size grows beyond the new SS threshold.
When the server receives 3 duplicaate acks, it stops its timer and
retransmits the lost packet immediately. This is the fast retransmit phase.
After the retransmit, it again shrinks its window by half and stores this
new size as the updated SS threshold. However, this time, instead of going
back to the slow start phase, it directly starts from the congestion
avoidance phase (starting window size = SS threshold value). This conforms
with the implementation of fast recovery in TCP.
Persist Mode implementation on the server
-----------------------------------------
Whenever the client's advertised window becomes full (this happens when the
consumer thread sleeps for a longer duration), it sends the next
acknowledgement with a window size of 0. When the server receives an
acknowledgement with window size 0, it goes to the persist state. It starts
the persist timer (5 second timer) and waits for the window to open up. When
the timer expires, it sends a probe message to the client. If the client's
window has openedup, it sends a duplicate acknowledgement with a non-zero
window size. Otherwise it still advertises a window size of 0. As long as
the server gets acknowledgements with window size 0, it stays in the persist
mode by restarting the persist timer. The moment it gets an acknowledgement
with a non-zero window value, it comes out of the persist mode and resumes
sending packets from the next sequence number.
Timer implementation
--------------------
In the actual TCP implementation, each segment is timed separately. However,
in our implementation, we maintain one timer for the whole congestion
window. This timer is always associated with the first unacknowledged packet
in the sender's congestion window.
Implementation of Congestion window and Advertised window
---------------------------------------------------------
Both the congestion window and advertised window are implemented as circular
buffers with the following 4 parameters: Start, End, Free and Size
cwnd_start: Points to the first unacknowledged packet. Updated whenever a
valid ack is received.
cwnd_end: Points to the first free slot. Updated whenever a packet is sent.
cwnd_free: Represents the number of free slots
cwnd_size: Current size of the congestion window.
advw_start: Points to the first unread packet. Updated whenever the consumer
packet drains the advertised window.
advw_end: Points to the first free slot in the advertised window. Updated by
the RUDP thread whenever it receives a packet and the window is
not full.
advw_free: Represents the number of free slots (Advertised window sent as
part of each ack)
advw_size: Size of the advertised window
Structure of the message header
-------------------------------
Every packet sent by the server has the following header:
+-----------------+
| MSG_TYPE |
|-----------------|
| SEQUENCE |
|-----------------|
| TIMESTAMP |
|-----------------|
| WINDOW SIZE |
+-----------------+
| |
| DATA |
. .
.-----------------.
MSG_TYPE: is one of the following:
#define MSG_TYPE_CONNECTION_PORT 2
#define MSG_TYPE_FILE_DATA 3
#define MSG_TYPE_FIN 5
#define MSG_TYPE_WINDOW_PROBE 6
#define MSG_TYPE_ERROR_INVALID_FILE 7
SEQUENCE: is incremented for every new packet sent
TIMESTAMP: represents the time when the packet was sent. This is used to compute RTT
which in turn is used to determine the waiting time for timeout.
Following are the MSG_TYPE used by client
#define MSG_TYPE_FILENAME 1
#define MSG_TYPE_ACKNOWLEDGEMENT 4
SEQUENCE: represents the next packet the client is expecting
4. Clean termination of the program after file transfer
-------------------------------------------------------
Server Implementation
When the server gets the acknowlegdement for the last data packet,
it sends out a control packet containing the message type MSG_TYPE_FIN.
This message type is interpreted by the client as the end of file transfer.
The server process ( child ) keeps re-transmitting the FIN Message until it receives an ACK from the client
or the number of retransmission exceeds the threshold
On reciieving the Ack for the FIN packet the server( child ) process exits. which triggers the SIGCHLD to be delivered to the parent.
The Message format looks as below -
+-----------------+
| MSG_TYPE_FIN |
|-----------------|
| SEQUENCE ( 0 ) |
|-----------------|
| TIMESTAMP |
|-----------------|
| WINDOW SIZE(0) |
+-----------------+
The parent handles the SIGCHLD signal and determines the process-id (waitpid) of the the child and
then purges the record associated with that connection from its table
Client Implementation :
On Recieving the MSG_TYPE_FIN message from the server, the client sends the ACK to the server.
In order to ensure reliable delivery of ACK message to the server, the Client starts a timer for 5 seconds and .
If the client receives a re-transmitted FIN message, it sends the Acknowledgement and re-starts the timer and moves to TIME_WAIT state
The client exits, when the timer expires.
5. Aditional notes
------------------
- On a local machine, we were able to transfer a 1.5GB file in 40 minutes with
0% drop probability and 1000 window size. Not able to test this on Solaris
due to memory constraints.
- On the solaris machine, we were able to transfer a 360MB file in 50 minutes
with 0% drop probability and 1000 window size.
- On the solaris machine, we were able to transfer a 360MB file in 50 minutes
with 0% drop probability and 1000 window size.
- On the solaris machine, we were able to transfer a 10MB file in 25 minutes
with 25% drop probability and 1000 window size.
- For aiding testing, we are additionally dumping the received file contents
in a separate file. By taking a diff of the input and output file, we are
ensuring that the file transfer was indeed successful.
END OF FILE