IPv6 support in programming libraries

I've been looking into what it would take to get several of our applications IPv6-enabled. In some cases, it's trivial. In others, it's going to be hell. This post is about how several programming languages have added IPv6 support to their standard libraries. Well, it's not really a post. It's more of a rant. Because I'm angry. I'm angry that two of the most popular languages appear to have put no forethought into future-proofing their networking libraries.

PHP and Perl are excellent examples of how not to add IPv6 support to a library. The status of IPv6 in both systems is an utter mess that will take years to fix.

We have many apps written in PHP and Perl, which we will eventually need to IPv6-enable. Given the abysmal state of IPv6 in these languages, I'm not optimistic about success. Critically, it is not possible to write applications in PHP or Perl that use a single API and will work on IPv4-only, IPv6-only and IPv4/IPv6 dual-stacked hosts. That limitation is crippling.

Non-programmers can skip to the last paragraph. What follows is techy.

Let's look at PHP. It got IPv6 support in version 5, but its gethostbyname() function still doesn't support IPv6. Of course, the documentation doesn't mention this fact. Instead, users should use dns_get_record(). (As an aside, why does dns_get_record() still support A6 records, which were deprecated years ago?) Alternatively, developers may use PEAR's Net_DNS class to resolve names. However, there appear to be cases where Net_DNS_Resolver::query() can fail to use IPv6 to connect to a DNS server (if using UDP and the "enhanced sockets library").

Then there are the IP address utility libraries: Net_IPv4 and Net_IPv6. First question: Why are there two of them? Why is there not a single network address library that can handle both formats? But it gets worse: The two libraries have different APIs. There is considerable functional overlap between them -- there are functions to apply a netmask to an address, parse CIDR-formatted addresses, etc -- but the function prototypes look nothing alike. For example, some of Net_IPv4::parseAddress()'s abilities are duplicated in Net_IPv6::getNetmask() and Net_IPv6::removeNetmaskSpec(). Both Net_IPv4 and Net_IPv6 have functions to verify if a string is a valid address, but they have different names: Net_IPv4::validateIP() vs Net_IPv6::checkIPv6(). There are other examples of this sort of mismatch.
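For contrast, here is a minimal sketch of what a single, family-agnostic address API can look like. It uses the ipaddress module from present-day Python's standard library, not anything in PEAR. The helper names is_valid_address() and parse_cidr() are made up for this illustration.

```python
import ipaddress


def is_valid_address(text):
    """Return True if text is a valid IPv4 or IPv6 address literal."""
    try:
        ipaddress.ip_address(text)
    except ValueError:
        return False
    return True


def parse_cidr(text):
    """Split CIDR notation into (network address, prefix length, IP version).

    strict=False masks off any host bits instead of rejecting them.
    """
    net = ipaddress.ip_network(text, strict=False)
    return str(net.network_address), net.prefixlen, net.version


if __name__ == "__main__":
    # The same two calls serve both address families.
    for literal in ("192.0.2.1", "2001:db8::1", "not-an-address"):
        print(literal, is_valid_address(literal))
    for cidr in ("192.0.2.130/25", "2001:db8::1/32"):
        print(cidr, parse_cidr(cidr))
```

The point is that the caller never has to ask which family an address belongs to before picking a function to call.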

I'll only mention briefly that there are two other modules for validating IP(v4) addresses: Net_CheckIP and Net_CheckIP2. Both of them only handle IPv4 and have different function names from the equivalent functions in Net_IPv4 and Net_IPv6. This only serves to increase programmer confusion and encourage the development of non-IPv6-capable code.

And that's just the low-level code. There are still the higher-level libraries for dealing with IMAP, SMTP, HTTP, etc. I haven't begun an audit of PEAR's Networking library to check for IPv4-only code. I'm too afraid.

Perl isn't any better. For reasons that escape me, Perl decided to fork its network libraries into IPv4-only and IPv6 halves. Specifically, there are Socket (and its object-oriented cousin, IO::Socket::INET) and Socket6 (and IO::Socket::INET6). Of course this means that developers have to change their code to get IPv6 support. Almost no OSes bundle Socket6 or IO::Socket::INET6, so developers are loath to make such changes. (Kudos to Mac OS X 10.5 and Ubuntu 7.10 for supporting them.)

Because the modules are separate, you have to write a bunch of ugly code. When your code starts, you check if you have one of the IPv6 socket modules installed. If so, set a flag. Then every single time you need to talk to the network, you check this flag and branch, either to Socket6 or Socket. This is a) ugly b) inefficient c) unlikely to be adopted by most programmers d) all of the above. Check the source to the Net::DNS module for an example. I am not looking forward to submitting patches that do this. But someone is going to have to.

Contrast this with Java, Python and C, which have sane, clean, standardized APIs that support both IPv4 and IPv6. In Java, it's unlikely that developers will have to change code. In Python, the changes are usually two or three lines. In C, it's slightly more lines than that, but generally not much more. I'll elaborate more on these languages in a later post. Readers at Penn State may want to look at my IPv6 programming notes in the ITS Wiki (which is IPv6-enabled).
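To make the Python case concrete, here is a rough sketch of a protocol-independent client built on socket.getaddrinfo(). The function names candidate_endpoints() and connect_any() are invented for this example. The timeout default is an arbitrary assumption.

```python
import socket


def candidate_endpoints(host, port):
    """List (family, sockaddr) pairs for host, IPv4 and IPv6 alike."""
    infos = socket.getaddrinfo(host, port, socket.AF_UNSPEC, socket.SOCK_STREAM)
    return [(family, sockaddr) for family, _type, _proto, _canon, sockaddr in infos]


def connect_any(host, port, timeout=5.0):
    """Try each resolved address in turn; return the first connected socket."""
    last_error = None
    for family, sockaddr in candidate_endpoints(host, port):
        try:
            # The family comes from the resolver, so no IPv4/IPv6 branch here.
            sock = socket.socket(family, socket.SOCK_STREAM)
        except OSError as exc:
            last_error = exc
            continue
        try:
            sock.settimeout(timeout)
            sock.connect(sockaddr)
            return sock
        except OSError as exc:
            last_error = exc
            sock.close()
    raise last_error or OSError("no usable address for %r" % (host,))
```

Nothing in the calling code names an address family. The resolver decides which families to offer, and the loop simply tries them in order. The standard library's socket.create_connection() wraps the same idea, so in practice application code can often be even shorter than this.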


6 Comments

till said:

Hey Derek,

First off, thanks for all your feedback. :-) And I agree - consistency is not always a major feature of loose languages such as PHP.

But why this angry rant? Why not open a feature request and let people know?

In regard to Net::CheckIP2 - consistency is met because I follow Net::CheckIP. The "2" hints that the package succeeds another (primarily because of PHP5), so the objective is to offer a drop-in replacement which provides the same API. Not to reinvent the wheel.

Regardless of this, I have been working on support in Net::CheckIP2 for validating IPv6 since I discovered the news yesterday. I haven't checked it in because I did not get it to work (decompression is a bit tricky).

Only yesterday did they actually make progress on IPv6 (sorry, sounds a bit ignorant) because since then the root servers serve some IPv6 zones which enables clients to directly connect without a tunnel.

At least IMHO only yesterday it became somewhat obvious that IPv6 will be used some day. When this day is - no one knows. At least I couldn't figure it out.

In regard to cross-language consistency - we can revisit this when PEAR2 comes around. I'm looking forward to making it all better. Maybe you want to contribute as well. Let me know!

Derek Morr said:

I will likely end up submitting patches to fix some of these issues. But many of them are architectural -- it's the APIs themselves that are broken. For example, why are there three incompatible ways to check if an IP address is valid? But you can't just pick one to be "the standard" and remove the others; that would break backwards compatibility.

As for the root DNS servers' support for IPv6 - I agree this was a long time coming. But it's been possible to query for AAAA records and to issue queries over IPv6 for years. This is possible without the root servers supporting IPv6. The C interfaces to do this have been standardized since 2000; see getaddrinfo() and getnameinfo() in POSIX. Many operating systems work fine with only IPv6 addresses in /etc/resolv.conf (I'm sitting on such a machine now, in fact).

I wrote a command-line SMTP client in Perl for testing mailservers [1] that I have wanted to IPv6-enable for a long time. It supports STARTTLS through IO::Socket::SSL and that's indeed a problem - this module doesn't support v6 and I could find no easy (not even moderately difficult) way to add IPv6 support to IO::Socket::SSL.
[1] http://www.logix.cz/michal/devel/smtp/

OTOH Python is not innocent either. Its otherwise great SocketServer module doesn't seem to be IPv6 ready.

Derek Morr said:

Right. The lack of extensibility in the basic Perl library API is frustrating.

Python certainly isn't 100% IPv6 ready. But a large chunk of its standard library supports IPv6. The socket, ftplib, httplib, poplib, smtplib, telnetlib, urlparse, and xmlrpclib modules already support it. I've submitted patches for imaplib (Bug 1655) and nntplib (Bug 1664). That's much better out-of-the-box IPv6 support than Perl or PHP.

It is evident that both of them only handle IPv4 and have different function names than the equivalent functions in Net_IPv4 and Net_IPv6.

Derek Morr said:

Yes, but my point is that there shouldn't be separate libraries for IPv4 and IPv6. Having separate libraries means that a programmer needs to write address-family-specific logic into their applications. With better API design, this doesn't need to happen.
