Saturday, 30 July 2016

Troubleshooting network problems using tshark

How does tshark work?

When a packet arrives at the network card, its destination MAC address is checked against yours. If it matches, the card raises an interrupt, which is handled by the network driver's interrupt service routine.

Subsequently, the received data is copied to a memory block defined in the kernel, and from there it is processed by the corresponding protocol stack and delivered to the appropriate application in user space. In parallel, when tshark is capturing traffic, the network driver sends a copy of the packets to a kernel subsystem called the Packet Filter, which filters the desired packets and stores them in a buffer. These packets are received by dumpcap (in user space), whose main job is to write them to a file in libpcap format so that tshark can read them. As new packets arrive, dumpcap appends them to the same capture file and notifies tshark so that they can be processed.

My objective is to give you a brief tutorial on finding network performance problems, for example those caused by bandwidth exhaustion. We will use tshark to find out which hosts generate the most traffic and what type of data they are sending.

List all the network interfaces - tshark -D

Capture traffic from network interface and write to file -
#tshark -i <interface> -w traffic.pcap

How to capture and analyze traffic using tshark?

1. Determine which IPs in your VLAN (IPADDRESS/NETMASK) could be misusing the network. The command below lists the IP conversations for that range. By default the list is sorted by total number of frames, so it gives a quick idea of the heavy talkers.

#tshark -r traffic.pcap -q -z "conv,ip,ip.addr==74.125.130.0/24"

================================================================================
IPv4 Conversations
Filter:ip.addr==74.125.130.0/24
                                               |       <-      | |       ->      | |     Total     |   Rel. Start   |   Duration   |
                                               | Frames  Bytes | | Frames  Bytes | | Frames  Bytes |                |              |
74.125.130.102       <-> 10.0.2.15                105     27191     129     21393     234     48584   112.306444555       260.0255
74.125.130.95        <-> 10.0.2.15                 34      3395      36     11639      70     15034   263.618378290       108.6899
74.125.130.93        <-> 10.0.2.15                 32      3601      37     11601      69     15202   109.882120656       177.0934
================================================================================
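
If you would rather rank the conversations by bytes than by frames, the table can be post-processed with standard tools. The following is a minimal sketch, assuming the column layout shown above (field 2 is "<->" and field 9 is the total byte count); top_talkers is a helper name made up for this example, and other tshark versions may format the byte columns differently.

```shell
# Rank conversations by total bytes, largest first.
# Assumes the column layout shown above: field 2 is "<->", field 9 is total bytes.
# top_talkers is a hypothetical helper name, not part of tshark.
top_talkers() {
    awk '$2 == "<->" { print $9, $1, $3 }' | sort -rn
}

# Usage (needs tshark and the capture file):
#   tshark -r traffic.pcap -q -z "conv,ip,ip.addr==74.125.130.0/24" | top_talkers
```

Each output line holds the total bytes followed by the two endpoint addresses.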

2. From the above output we know that 74.125.130.102 is the host on the 74.125.130.0/24 network generating the most traffic.

You can create another pcap file containing only the traffic to and from that machine (74.125.130.102):

#tshark -r traffic.pcap -Y "ip.addr==74.125.130.102" -w ip.pcap
# capinfos ip.pcap | grep "Number\|time:"
Number of packets:   234
Start time:          Fri Jul 29 20:37:12 2016
End time:            Fri Jul 29 20:41:32 2016

3. Check that the host is not breaking any of your network policies. Suppose only HTTP and HTTPS are allowed: the command below lists the host's connections to destination ports other than 80 (HTTP) and 443 (HTTPS).

#tshark -o column.format:'"Source","%s","Destination","%d", "dstport", "%uD","Protocol", "%p"' -r ip.pcap -Y "ip.src == 74.125.130.102 && ! dns && tcp.dstport != 80 && tcp.dstport != 443"  | sort -u

74.125.130.102 -> 10.0.2.15    43536 TCP
74.125.130.102 -> 10.0.2.15    43536 TLSv1.2
74.125.130.102 -> 10.0.2.15    43540 TCP
74.125.130.102 -> 10.0.2.15    43540 TLSv1.2
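
If the list of allowed ports grows, writing the display filter by hand gets error-prone. Here is a small sketch that builds the same kind of filter from a host and a list of allowed ports; policy_filter and its arguments are names invented for this example.

```shell
# Build a display filter matching non-DNS TCP traffic from a host
# to any destination port that is NOT in the allowed list.
# policy_filter is a hypothetical helper: $1 = host, remaining args = allowed ports.
policy_filter() {
    host=$1
    shift
    filter="ip.src == $host && !dns"
    for port in "$@"; do
        filter="$filter && tcp.dstport != $port"
    done
    printf '%s\n' "$filter"
}

# Usage (needs tshark and the capture file):
#   tshark -r ip.pcap -Y "$(policy_filter 74.125.130.102 80 443)"
```

Called with 74.125.130.102 80 443, it prints the filter used in step 3.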

4. No traffic in my capture violates the policy, but suppose some did: we would then have the IP addresses of those machines and the ports they are connected on. To make sure which service is really behind a port (for example, that the traffic on the FTP port is not coming from some other service), follow the TCP stream of that session.

#tshark -o column.format:'"Source","%s","srcport", "%uS","Destination","%d", "dstport", "%uD","Protocol", "%p"' -r ip.pcap -Y "tcp.dstport == 43536" | head -1
74.125.130.102 443 10.0.2.15    43536 TCP

#tshark -r ip.pcap -q -z  "follow,tcp,ascii,74.125.130.102:443,10.0.2.15:43536,1"
===================================================================
Follow: tcp,ascii
Filter: ((ip.src eq 74.125.130.102 and tcp.srcport eq 443) and (ip.dst eq 10.0.2.15 and tcp.dstport eq 43536)) or ((ip.src eq 10.0.2.15 and tcp.srcport eq 43536) and (ip.dst eq 74.125.130.102 and tcp.dstport eq 443))
195
............sZ..@G"......!s.....?W...$..5......+./.....
.......3.9./.5.
s.youtube.com..........
...............................h2.spdy/3.1.http/1.1..........
===================================================================

5. The TLS handshake in the stream names "s.youtube.com", so it was actually YouTube traffic consuming the bandwidth and slowing down the network.

If you do come across any FTP sessions, troubleshoot them the same way. Additionally, you can list all the files downloaded by the client:

#tshark -r ip.pcap -q -z  "follow,tcp,ascii,74.125.130.102:443,<Destination machine>:21,1" | grep RETR
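
The grep above prints the whole RETR lines. To get a de-duplicated list of just the file names, something like the following should work. It is only a sketch: retr_files is a made-up helper name, and it assumes each FTP command appears on its own line of the follow output, possibly with a trailing carriage return.

```shell
# Extract unique file names from FTP RETR commands read on stdin.
# retr_files is a hypothetical helper; it assumes one FTP command per line.
retr_files() {
    tr -d '\r' |
        awk '$1 == "RETR" { sub(/^[ \t]*RETR[ \t]+/, ""); print }' |
        sort -u
}

# Usage: pipe the output of the "follow,tcp,ascii" command above into retr_files
# instead of into grep RETR.
```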

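6. tshark can also break down the captured traffic by protocol, showing hierarchically the number of frames and bytes associated with each one. Using the capture file, let's see for example the distribution of HTTP and HTTPS traffic for the IP 74.125.130.102. (As written, the filter below also matches HTTP traffic from any other host; use "ip.addr==74.125.130.102 && (ssl || http)" to restrict both protocols to that IP.)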

#tshark -r traffic.pcap -q -z io,phs,"ip.addr==74.125.130.102 && ssl || http"
===================================================================
Protocol Hierarchy Statistics
Filter: ip.addr==74.125.130.102 && ssl || http

eth                                      frames:122 bytes:40644
  ip                                     frames:122 bytes:40644
    tcp                                  frames:122 bytes:40644
      ssl                                frames:122 bytes:40644
        tcp.segments                     frames:2 bytes:2589
          ssl                            frames:2 bytes:2589
===================================================================
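
To read the hierarchy as percentages, the "bytes:" column can be compared against the first row, which is the outermost layer (eth here). A rough sketch, assuming the "name frames:N bytes:N" row format shown above; phs_bytes is a name invented for this example.

```shell
# Print each protocol row as: name, bytes, share of the first row's bytes.
# Assumes rows look like "proto  frames:N bytes:N" as in the output above.
# phs_bytes is a hypothetical helper name, not part of tshark.
phs_bytes() {
    awk '$2 ~ /^frames:/ && $3 ~ /^bytes:/ {
        split($3, b, ":")
        if (!total) total = b[2]    # first row is the outermost layer
        if (total > 0) printf "%s %d %.1f%%\n", $1, b[2], 100 * b[2] / total
    }'
}

# Usage (needs tshark and the capture file):
#   tshark -r traffic.pcap -q -z io,phs,"ip.addr==74.125.130.102 && ssl || http" | phs_bytes
```

Nested rows are counted inside their parents, so the percentages do not add up to 100.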

7. This tells us that SSL accounts for practically all of the traffic. Let's see the IPs associated with that communication.

#tshark -o column.format:'"destination","%d"' -r  traffic.pcap -Y "ip.src ==74.125.130.102 && ssl"| sort -u
10.0.2.15

#whois 10.0.2.15 | grep -i "netname\|netrange"
NetRange:       10.0.0.0 - 10.255.255.255
NetName:        PRIVATE-ADDRESS-ABLK-RFC1918-IANA-RESERVED

With whatever application or other information you gather about the IP addresses and ports, you can create ACLs or iptables rules to deny certain types of traffic, shut down a specific port, limit the bandwidth of some protocols, and so on.
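
As an illustration of the iptables option, the sketch below only prints a candidate rule so you can review it before applying anything. The helper name deny_rule is made up, and the FORWARD chain is an assumption that fits a Linux gateway routing the VLAN; on the client itself you would more likely use OUTPUT.

```shell
# Print (do not run) an iptables rule dropping TCP traffic to one host and port.
# deny_rule is a hypothetical helper: $1 = destination IP, $2 = destination port.
# The FORWARD chain is an assumption; adjust it to where the rule will live.
deny_rule() {
    printf 'iptables -A FORWARD -p tcp -d %s --dport %s -j DROP\n' "$1" "$2"
}

# Usage: review the printed rule, then run it as root if it is what you want.
#   deny_rule 74.125.130.102 443
```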
