Seeing as challenge #27 is now over 4 months old with no #28 in sight, and that I have (what I think is) an interesting idea for a new challenge, I have taken the liberty of creating it myself. As you will see, it will help you develop a somewhat different (but no less important) set of skills than most previous challenges. So without further ado...
Welcome to the 28th Beginners programming challenge.
Most previous challenges asked you to work with text or numeric data. By contrast, this one will ask you to work with binary data. I also wanted to make it relevant, so you will work with real-world data. Namely, you will implement an ARP packet analyser. Don't worry, it's less scary than it sounds. First, some background.
Background: The ARP Protocol
(To simplify the discussion, we only consider two machines that are in the same LAN segment, meaning they can communicate directly, without any router between them.)
Most people know that machines on a network are identified by an IP address. A bit less known is that machines are also identified by a hardware (or MAC) address. You can see it for example with the command ifconfig:
firas@aoba ~ % ifconfig eth1
eth1 Link encap:Ethernet HWaddr 00:25:00:48:09:8c
inet addr:192.168.1.19 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::225:ff:fe48:98c/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:93066 errors:0 dropped:0 overruns:0 frame:38438
TX packets:50786 errors:19 dropped:0 overruns:0 carrier:0
RX bytes:130634360 (130.6 MB) TX bytes:4352298 (4.3 MB)
Here the IP address of my machine is 192.168.1.19, and its hardware address is 00:25:00:48:09:8c. Generally, when a user wants to use the network, they specify only the IP address of the machine they want to communicate with. I am not going to delve into the reason for having two addresses (IP and hardware) in the first place, but the fact is that in order to communicate with another machine on the network using the IP protocol, a machine needs to know both the IP address and the hardware address of that other machine. The IP address is specified by the user, but the hardware address is not, so how does a machine obtain the hardware address of the machine it wants to communicate with? This is where ARP (the Address Resolution Protocol) kicks in.
Remember that my machine has IP address 192.168.1.19 and hardware address 00:25:00:48:09:8c; let's call it machine A. Suppose I want to send an IP packet to the machine with IP address 192.168.1.1; let's call it machine B. First, machine A needs to acquire the hardware address of machine B. In order to do that, it simply sends a packet to every other machine on the network, saying in effect: "Hi, I am 00:25:00:48:09:8c, I have IP address 192.168.1.19, and I would like to know who here has IP address 192.168.1.1." Assuming there actually is a machine on the network that has IP address 192.168.1.1, it will reply with a packet stating its hardware address, saying in effect: "Hi, I am 38:46:08:d1:83:97, and it is I who has IP address 192.168.1.1." Machine A then has all the information it needs to send an IP packet to machine B. It will also store the hardware address of machine B in its ARP table, so that it does not have to perform an ARP lookup every time. You can see your machine's ARP table with the aptly named arp command:
firas@aoba ~ % arp -n
Address HWtype HWaddress Flags Mask Iface
192.168.1.1 ether 38:46:08:d1:83:97 C eth1
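To make the exchange described above concrete, here is a toy sketch in Python. It is only a model of the idea, not how an operating system actually implements ARP: the LAN dictionary and the functions broadcast_request and resolve are names made up for illustration, and no real packet is sent.

```python
# Toy model of ARP resolution; real ARP is handled by the operating system.
# Hypothetical LAN: maps each IP address to the hardware address of its owner.
LAN = {
    "192.168.1.19": "00:25:00:48:09:8c",  # machine A
    "192.168.1.1": "38:46:08:d1:83:97",   # machine B
}

arp_table = {}  # machine A's cache: IP address -> hardware address


def broadcast_request(target_ip):
    """Simulate 'who has target_ip?': the owner, if any, answers with its MAC."""
    return LAN.get(target_ip)


def resolve(target_ip):
    """Return the hardware address for target_ip, asking the LAN only if needed."""
    if target_ip in arp_table:
        return arp_table[target_ip]   # cache hit, no request is sent
    mac = broadcast_request(target_ip)
    if mac is not None:
        arp_table[target_ip] = mac    # remember the answer for next time
    return mac


print(resolve("192.168.1.1"))   # first call: asks the (simulated) LAN
print(resolve("192.168.1.1"))   # second call: answered from arp_table
print(resolve("192.168.1.99"))  # nobody has this address, so None
```

The dictionary arp_table plays the role of the table printed by arp -n: after the first lookup it holds one entry, for 192.168.1.1.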
The format of an ARP packet is defined in several RFCs, but the Wikipedia article (especially section 2, Packet structure) will be sufficient for this task. ARP packets are encapsulated in Ethernet frames, so you will also need the Ethernet frame Wikipedia article.
Your task is simply to write a program that will read a copy of an ARP packet, and print the information it contains, such as whether the packet is a request or reply packet, and the addresses of the two machines involved. Sample request and reply packets are available in the attached archive. The files request.bin and reply.bin are the raw request and reply packets, that your program will take as input. The files suffixed .hexdump are hexadecimal dumps of the corresponding packets in text format, for easier visualisation. (If you would like to capture packets yourself, see post #4 below.)
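If you have never handled binary data before, the following sketch shows the basic tools in Python. It deliberately does not parse a packet (that part is left to you); it only shows how to read a file as raw bytes and how to turn a few bytes into readable text. The helper names read_packet, format_mac and format_ipv4 are hypothetical, and the sample bytes are typed in by hand rather than taken from the archive.

```python
def read_packet(path):
    """Read a whole file as raw bytes; note the 'rb' (read binary) mode."""
    with open(path, "rb") as packet_file:
        return packet_file.read()


def format_mac(raw):
    """Format 6 raw bytes as a colon-separated hardware address."""
    return ":".join(f"{byte:02x}" for byte in raw)


def format_ipv4(raw):
    """Format 4 raw bytes as a dotted-decimal IPv4 address."""
    return ".".join(str(byte) for byte in raw)


# Bytes typed in by hand for illustration; they are not read from the archive.
mac_bytes = bytes([0x00, 0x25, 0x00, 0x48, 0x09, 0x8c])
ip_bytes = bytes([192, 168, 1, 19])

print(format_mac(mac_bytes))   # 00:25:00:48:09:8c
print(format_ipv4(ip_bytes))   # 192.168.1.19

# Multi-byte numbers in network packets are big-endian (most significant first).
print(int.from_bytes(b"\x08\x06", "big"))  # 2054, i.e. 0x0806
```

Slicing a bytes object (for example data[0:6]) gives you another bytes object, so helpers like these can be applied to any part of a packet once you know where each field lives.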
Before you start coding, you should get familiar with the structure of an ARP packet (and of an Ethernet frame that contains one). To that end you can simply look at the hexdumps in your favourite text editor, or, even better, open them in Wireshark. Wireshark is available in the Ubuntu repositories (package wireshark). Simply run it, click File > Import, and open the hexdump file of your choice, keeping all the other options at their default values. Examining the packets in Wireshark will let you see exactly where in the packet each piece of information is stored.
You can assume that all input packets are correctly formatted. Also, I have included the encapsulating Ethernet frame only to make opening the packets in Wireshark easier; you can just skip over the Ethernet data in your program.
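As a rough sketch of what skipping over the Ethernet data can look like in Python: an Ethernet frame without a VLAN tag starts with a 14-byte header (6 bytes of destination address, 6 bytes of source address, 2 bytes of EtherType), and the EtherType for ARP is 0x0806. The sketch assumes the sample files start directly at the destination address and carry no VLAN tag, which you should confirm in Wireshark. The function name split_ethernet_frame and the hand-made frame are for illustration only.

```python
import struct

ETHERNET_HEADER_LENGTH = 14   # 6 (destination) + 6 (source) + 2 (EtherType)
ETHERTYPE_ARP = 0x0806


def split_ethernet_frame(frame):
    """Return (destination, source, ethertype, payload) for an untagged frame."""
    # "!" means network (big-endian) byte order; "6s" is 6 raw bytes and
    # "H" is an unsigned 16-bit integer.
    destination, source, ethertype = struct.unpack(
        "!6s6sH", frame[:ETHERNET_HEADER_LENGTH])
    return destination, source, ethertype, frame[ETHERNET_HEADER_LENGTH:]


# Hand-made frame for illustration: broadcast destination, machine A as source,
# and 28 zero bytes standing in for the ARP packet itself.
frame = (bytes.fromhex("ffffffffffff") + bytes.fromhex("00250048098c")
         + bytes.fromhex("0806") + bytes(28))

destination, source, ethertype, payload = split_ethernet_frame(frame)
print(ethertype == ETHERTYPE_ARP)  # True
print(len(payload))                # 28
```

Everything after the first 14 bytes is the payload, which is where the ARP packet begins.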
Cookie points will be awarded for the following extras:
- Drop the assumption that all packets are correctly formatted, and handle incorrect packets gracefully.
- Also print the information contained in the Ethernet frames.
- Make your program support both input formats (raw and hexdump in the same format as in the provided archive).
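For the first extra, one possible approach (a sketch, not the expected solution) is to check the input before reading any field, and to report problems with a readable message rather than a traceback. The sketch relies on two facts: an untagged Ethernet header is 14 bytes, and an ARP packet for Ethernet and IPv4 addresses is 28 bytes. The names PacketError, check_packet and describe are hypothetical, and a real check would also look at the fields inside the ARP packet.

```python
class PacketError(Exception):
    """Raised when the input cannot be a complete ARP-over-Ethernet packet."""


ETHERNET_HEADER_LENGTH = 14
ARP_IPV4_LENGTH = 28  # size of an ARP packet for Ethernet and IPv4 addresses


def check_packet(data):
    """Raise PacketError with a readable message if data is obviously malformed."""
    minimum = ETHERNET_HEADER_LENGTH + ARP_IPV4_LENGTH
    if len(data) < minimum:
        raise PacketError(
            f"packet is {len(data)} bytes long, expected at least {minimum}")
    # The EtherType occupies bytes 12 and 13 of an untagged Ethernet header.
    ethertype = int.from_bytes(data[12:14], "big")
    if ethertype != 0x0806:
        raise PacketError(f"EtherType is 0x{ethertype:04x}, not 0x0806 (ARP)")


def describe(data):
    """Return 'ok' or an error message instead of crashing on bad input."""
    try:
        check_packet(data)
    except PacketError as error:
        return f"error: {error}"
    return "ok"


print(describe(b"\x00" * 10))
# error: packet is 10 bytes long, expected at least 42
```

The length test uses "at least" rather than "exactly" because short Ethernet frames may be padded with extra bytes after the ARP packet.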
Any overly obfuscated code will be immediately disqualified, regardless of the programmer's skill. Please remember that these challenges are for beginners and therefore the code should be easily readable and well commented.
Any non-beginner entries will not be judged. Please use common sense when posting code examples. Please do not give beginners a copy paste solution before they have had a chance to try this for themselves.
If you require any help with this challenge please do not hesitate to come and chat to the development focus group. We have a channel on irc.freenode.net #ubuntu-beginners-dev
Or you can PM me.