I uploaded a new version (V1.5) of the Dynamic Port Scanner (DPS), a reliable spoofed source IP port scanner. The major addition to this new version is the multi-threading capability which makes the scanning process faster.
Link: The home page of the DPS project
Link: Download dps-v1.5.tar.gz
"The sole idea of the Dynamic Port Scanner (DPS) is to provide a reliable spoofed source IP port scanner. The spoofed source IP is dynamically generated at run time and it varies for every scan packet; every scan packet carries a random spoofed source IP. Traditionally, a port scan with a spoofed source IP has been considered unreliable due to the fact that reply packets would not reach back the scanning system. However, the technique used by DPS ensures the reliability of such spoofed scan. This technique is based on the integration of ARP Poisoning into port scanning to achieve the desired result. The spoofed IP addresses used by DPS during a scanning process fall within the range of the local subnet. Thus, DPS is best suited for internal scanning."
The speed of Internet communications is of equal importance to both Internet engineers and end users. Internet users generally want communication that is fast enough to carry out their transactions comfortably. The speed of the Internet depends on many factors, which can be classified into two categories:
a. Hardware factors
b. Software factors
Hardware factors include the physical link and its transfer rate, the CPU processing speed of Internet devices (e.g. routers, switches, servers), and the CPU processing speed of the end user's PC. Software factors are the algorithms that Internet applications use to perform network operations; in other words, there are different network programming techniques that can be used to optimize the speed of network communications. Software cannot push the communication speed beyond what the hardware allows, so it falls to the network application developer to choose the techniques that make the most of the available devices and links.
Conventionally, the time needed for a packet to travel from source A to destination B and back from B to A is called the RTT (Round-Trip Time). TCP contains algorithms that estimate the RTT between the client and the server by timing sent segments against their acknowledgments, starting with the three-way handshake, so that the sender knows how long to wait for an acknowledgment. In a sequential communication, the client sends the next piece of data only after it receives the acknowledgment for the previous one. We will take DNS query/response as a case study to illustrate the point. Let's say a client wants to resolve three hostnames: server-A, server-B, and server-C. An application that performs sequential transactions sends a query for server-A and waits for the response; once it receives it, it sends the query for server-B; and once it receives that response, it sends the query for server-C and finally receives the last response. If, for instance, the time from sending a query until receiving its response is 180 milliseconds, the total time to resolve the three hostnames is 3 x 180 = 540 milliseconds.
If we divide the RTT into six time periods, we can draw the following diagram that corresponds to every query-response pair:
Time 0:
REQUEST1__________________->
<-__________________________
Time 1:
_________REQUEST1_________->
<-__________________________
Time 2:
__________________REQUEST1->
<-__________________________
Time 3:
___________________________->
<-_________________REPLY___1
Time 4:
___________________________->
<-________REPLY___1_________
Time 5:
___________________________->
<-REPLY__1__________________
However, by making fuller use of the link, we can reduce the total time needed to resolve the three hostnames. This is done by sending further queries before the responses to the previous queries have arrived. The client first sends query-1, followed directly by query-2 and query-3. The server processes the queries and sends the replies at the same rate. This can be demonstrated by the following diagram:
Time 0:
REQUEST1__________________->
<-__________________________
Time 1:
REQUEST2 REQUEST1_________->
<-__________________________
Time 2:
REQUEST3 REQUEST2 REQUEST1->
<-__________________________
Time 3:
_________REQUEST3 REQUEST2->
<-_________________REPLY___1
Time 4:
__________________REQUEST3->
<-________REPLY___1 REPLY___2
Time 5:
___________________________->
<-REPLY__1 REPLY___2 REPLY___3
Time 6:
___________________________->
<-REPLY___2 REPLY___3_______
Time 7:
___________________________->
<-REPLY__3__________________
In this case, the total time to resolve the three hostnames is reduced to 240 milliseconds: the diagram spans eight time periods (Time 0 to Time 7) of 30 milliseconds each. But the question now is: how does a developer optimize the speed of network communication and make his/her application perform transactions faster? There are different network programming techniques for doing so. The four main ones are: a. I/O Multiplexing, b. Non-Blocking I/O, c. Multi-Processing, and d. Multi-Threading. But before discussing those techniques, one has to understand the reason behind the normal sequential communication.
I/O operations are by default blocking: when reading data from a socket or standard input, or when writing data to a socket or standard output, the call stops the process from continuing until the I/O operation is done. In the previous DNS example, the client sends the query using the send() function and then calls recv() to receive the reply. The recv() function blocks the process until an actual data packet is received. The time during which the process is blocked is wasted, because the application cannot perform any other operation in the meantime. The following techniques can be used, individually or in combination, to solve this problem:
[1] I/O Multiplexing
I/O Multiplexing refers to the ability to watch multiple input/output streams at once and then pick the stream that is ready to be read from or written to. When any of the streams is ready, it is reported to the process, which performs the needed operation. Revisiting our DNS example, the developer can multiplex two input streams: a socket stream to read the responses from the server, and a standard input stream to read the hostnames. Whenever the standard input is ready, the program reads the hostname and sends the query. At the same time, whenever the socket input is ready, the program reads the response and processes it. In this case, hostnames can still be read from the standard input and queries sent even though the replies to the previous queries have not been received yet. I/O Multiplexing can be implemented using the select() function call as follows:
/* sockfd is assumed to be an already-open socket to the DNS server */
fd_set rset;
int maxfd;

while (1)
{
    FD_ZERO(&rset);                 /* select() modifies the set, so rebuild it */
    FD_SET(STDIN_FILENO, &rset);    /* standard input: the hostnames */
    FD_SET(sockfd, &rset);          /* socket: the responses */
    maxfd = (STDIN_FILENO > sockfd ? STDIN_FILENO : sockfd) + 1;
    select(maxfd, &rset, NULL, NULL, NULL);
    if (FD_ISSET(STDIN_FILENO, &rset))
    {
        /* read hostnames, and send the queries */
    }
    if (FD_ISSET(sockfd, &rset))
    {
        /* read the response, process it,
           and print the result */
    }
}
[2] Non-Blocking I/O
Non-Blocking I/O refers to changing the default behavior of input/output streams from blocking mode to non-blocking mode, so that a read or write call returns immediately whether or not there is data to read or room to write. Non-blocking mode can also be used when initiating multiple connections with connect(). For example, if a client application wants to initiate three connections (each a TCP three-way handshake) that take 7, 9, and 14 milliseconds to complete, respectively, establishing them sequentially takes 30 milliseconds in total. With a non-blocking connect(), however, the application can initiate the other connections before the first one is established. If the program initiates all three connections simultaneously, the total time is that of the slowest connection, 14 milliseconds, which is roughly 53% less than with sequential connections. Non-blocking mode can be set on an I/O stream programmatically using the fcntl() function as follows:
/* fd is the descriptor to switch to non-blocking mode */
int flags = fcntl(fd, F_GETFL, 0);
fcntl(fd, F_SETFL, flags | O_NONBLOCK);
[3] Multi-Processing (forking)
Multi-Processing refers to the technique by which the main application creates sub-processes (called children) that perform tasks independently of each other and of the main process (called the parent). Instead of using I/O Multiplexing or Non-Blocking I/O, a developer can make the main process fork multiple children with fork() and let each child perform a specific operation. Revisiting our DNS example, the application can create a child for every hostname to be resolved. Similarly, in our multiple-connection example, the application can create a child for every connection to be established. The O.S. handles the concurrent execution of those sub-processes. Creating a child process is done using the POSIX fork() call. This method is the most widely used by Unix clients and servers.
[4] Multi-Threading
Multi-Threading is very similar to Multi-Processing except that threads are lighter than processes. Forking a process is considered expensive and requires additional overhead. With threads, however, the cost is minimal and operations are performed faster. Threads are run concurrently by the O.S. The main application can create multiple threads, each of which performs its own task separately. A thread can be created using the POSIX pthread_create() function call.
Although each of these techniques has its advantages and disadvantages, and each has a context that it fits best, Multi-Threading is generally considered the fastest and least expensive of them.
One of the traditional methods of any exploit is to pass a "shellcode" that spawns a shell. The attacker would then have a remote shell over a TCP stream. A shell can be spawned, not necessarily through a shellcode but in any general way, by executing the following code:
char *str[] = { "/bin/sh", NULL };  /* argv must end with NULL */
execve("/bin/sh", str, NULL);
Executing this will simply spawn an interactive shell. If it is executed locally, the input and output streams are the standard input (stdin) and the standard output (stdout). However, when it is executed remotely, the input and output streams are piped to the TCP stream.
However, what if a user wants to have a local program running and executing commands received remotely, or from an input stream other than stdin? There are various proposed ways to do that. Looking at the exec() family (i.e. execl, execlp, execv, execvp, and execve), these functions do not give the user control over the input or output stream; they return only on failure, with an error status. On the other hand, there is the popen() function, which executes a command while giving the user a FILE stream to either write the command's input or read its output. For example, the following code snippet lets the user execute the command "ls -l" and get the output within the program:
FILE *fp;
char line[256];

fp = popen("ls -l", "r");
while (fgets(line, sizeof line, fp))
    printf("%s", line);
pclose(fp);
Such a piece of code is handy in a program that receives commands remotely from a user, or locally from a file, and pipes the output to wherever the user wants. However, this implementation has one drawback: every command runs in its own short-lived shell, so a command like "cd" has no lasting effect and the user cannot navigate between directories. So, in order to have a fully interactive shell, the command that needs to be executed is the shell itself, i.e. "sh". But now we face another problem: popen() pipes only one direction, so a shell run through popen() still has either its input or its output bound to the program's own stdin or stdout, and we cannot control both ends of the shell through it.
The best way to run a fully interactive shell whose input and output are controlled by the user is to execute "sh" with one of the exec() functions and at the same time take control of its stdin and stdout through pipes; typically, two pipes are needed. The following diagram simplifies the description:
              -----------------
              |               |
--> pipe1 --> >    /bin/sh    > --> pipe2 -->
              |               |
              -----------------
The steps to implement this are as follows:
1] Create two pipes (p1 and p2) and two FILE pointers (ptr1 and ptr2).
2] Fork a child process [pid=fork()].
3] The child process takes control of stdin and stdout by associating them with the reading-end of the first pipe and the writing-end of the second pipe, respectively.
4] The child process spawns a shell (“/bin/sh”) using an execlp() function.
5] The parent process associates the two FILE pointers with the writing-end of the first pipe and with the reading-end of the second pipe, respectively.
6] Anything written to the first FILE pointer is passed as input to the running interactive shell, and any results from the interactive shell can be read from the second FILE pointer.
...
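int p1[2], p2[2];
pid_t pid;
char line[50];
FILE *ptr1, *ptr2;
pipe(p1); pipe(p2);
pid = fork();
if( pid == 0 ){
    close(0);       /* close the stdin */
    dup(p1[0]);     /* reading-end of pipe1 takes
                       control of stdin */
    close(1);       /* close the stdout */
    dup(p2[1]);     /* writing-end of pipe2 takes
                       control of stdout */
    close(p1[1]); close(p2[0]);   /* the child does not use these ends */
    execlp("/bin/sh", "sh", (char *)NULL);
    _exit(127);     /* reached only if execlp() fails */
}
close(p1[0]); close(p2[1]);   /* the parent does not use these ends */
ptr1 = fdopen( p1[1], "w" );  /* writing-end of pipe1 */
ptr2 = fdopen( p2[0], "r" );  /* reading-end of pipe2 */
fprintf( ptr1, "ls -l\n" );   /* the shell runs the command once it sees the newline */
fflush( ptr1 );               /* push the command through the pipe now */
/* fgets() returns NULL only after the shell exits and closes its stdout */
while( fgets( line, sizeof line, ptr2) ) printf("%s", line);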
...
In this case, the user has a fully interactive shell in which "cd" works, and can also control its input and output streams.
References:
- Popen specification
- Fork, Exec and Process control
- Using the UNIX Pipe in C