[erlang-bugs] possibly incorrect search order in inet:gethostbyname_tm/4

Raimo Niskanen raimo+erlang-bugs@REDACTED
Mon Jan 25 17:10:16 CET 2010


I have tossed the suggestion around, and it bounced a little...

The reason we introduced the native resolver was
to be bug-compatible with the underlying OS.

This means we cannot let IP string detection
override the native resolver. There is even
a short circuit in the code today that prevents
inet:gethostbyname from falling back to the IP
string alternative after the native resolver
has failed.

This leaves us some other alternatives:
1) To extend the 'file' or the 'dns' lookup
   methods with IP string detection.
2) To do IP string detection only when the 'native'
   lookup method is not used, but to do it first instead
   of last as today, within the inet:gethostbyname
   lookup method wrapper itself.
3) To introduce a new lookup method, e.g. 'ipstring', that the
   user can insert in the lookup chain wherever suitable.

I do not like 1) since it would make either of those
lookup methods impure, harder to test and harder to explain.

I prefer 2) because its behaviour would feel fairly natural.

However, 3) is the purest but would require manual
configuration in more cases.

But 2) wins...
At least until someone convinces me otherwise.
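
To make alternative 2) concrete, here is a minimal C sketch of the
decision the lookup wrapper would have to make. It is not the inet.erl
code. The enum and the functions is_ip_string() and ip_string_first()
are hypothetical names made up for this illustration, and inet_pton()
stands in for the IP string detection (for AF_INET it accepts only
4-field dotted decimal).

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical lookup methods, named after the inetrc {lookup,[...]} atoms. */
enum lookup_method { LOOKUP_NATIVE, LOOKUP_FILE, LOOKUP_DNS };

/* Strict IP string detection (an assumption of this sketch): inet_pton()
 * returns 1 only for a well-formed address of the requested family. */
static bool is_ip_string(const char *name, int af)
{
    unsigned char buf[sizeof(struct in6_addr)]; /* large enough for both families */

    assert(name != NULL);
    return inet_pton(af, name, buf) == 1;
}

/* Alternative 2): decide whether the wrapper should treat `name` as an
 * IP string before running any lookup method. If 'native' is anywhere in
 * the chain, the answer is no and the OS resolver keeps the final say. */
static bool ip_string_first(const char *name, int af,
                            const enum lookup_method *chain, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (chain[i] == LOOKUP_NATIVE)
            return false;
    }
    return is_ip_string(name, af);
}
```

With a chain containing only LOOKUP_DNS, ip_string_first("10.0.0.2",
AF_INET, ...) is true, so the name server is never asked. With
LOOKUP_NATIVE in the chain it is false, and the string is passed on
to the lookup methods as before.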



On Wed, Jan 20, 2010 at 11:17:37PM +0800, Chaos Wang wrote:
> Raimo Niskanen wrote:
> >On Wed, Jan 20, 2010 at 12:22:08AM +0800, Chaos Wang wrote:
> >  
> >>Cool~
> >>
> >>Sorry for responding so late. I was digging into some glibc source code...
> >>
> >>The following are my findings (IPv4 only, on Linux). And I totally agree 
> >>with you that the safest form to be considered an IPv4 address is 
> >>the standard IPv4 dotted-decimal notation without a trailing dot.
> >>    
> >
> >But I think the best way would be to adopt the Solaris/Linux behaviour.
> >Perhaps even also the hex/octal notation...
> >  
> Either way is OK to me, as long as it can recognize addresses in 
> standard IPv4 dotted-decimal notation correctly ;-)
> >  
> >>The libc APIs I can think of that parse an IPv4 address string into 
> >>in_addr form are:
> >>   * inet_addr() (deprecated)
> >>   * inet_aton()
> >>   * inet_pton()
> >>   * gethostbyname() and gethostbyname_r() (obsolete, but used by 
> >>     the inet_gethost program)
> >>   * gethostbyname2() and gethostbyname2_r() (GNU extension)
> >>   * getaddrinfo()
> >>
> >>In all these functions, a string with a trailing dot is not considered 
> >>an IPv4 address.
> >>
> >>inet_aton() (and deprecated inet_addr()) recognize IPv4 numbers-and-dots 
> >>notation: every dotted number in the address can be in decimal, octal or 
> >>hexadecimal. And the address can also be written in shorthand:
> >>
> >>   a - means treat a as 32 bits
> >>   a.b - means treat b as 24 bits
> >>   a.b.c - means treat c as 16 bits
> >>
> >>inet_pton() is like inet_aton(), but without all the hexadecimal, octal 
> >>(with the exception of 0) and shorthand. So it only recognizes standard 
> >>IPv4 dotted-decimal notation.
> >>
> >>gethostbyname() (also gethostbyname2() and the *_r variants) uses 
> >>__nss_hostname_digits_dots() to identify an IP address. This function 
> >>calls inet_aton() to parse an IPv4 address, except that it refuses to 
> >>accept any non-digit characters. So the hexadecimal form of IPv4 
> >>addresses can't be recognized by it.
> >>
> >>getaddrinfo() uses inet_aton() to recognize an IPv4 address. So they are 
> >>equivalent in IPv4 address parsing.
> >>    
> >
> >I just found out you have almost created the Solaris man page for 
> >inet_pton:
> >http://www.s-gms.ms.edus.si/cgi-bin/man-cgi -> search command inet_pton
> >  
> Ah, truly a coincidence :-)
> Manpages in Solaris are more thorough than in Linux, indeed.
> >  
> >>The program I used to test these APIs is in the attachments.
> >>
> >>Reference locations (in glibc-2.9):
> >>   * resolv/inet_addr.c implements inet_aton(), inet_addr()
> >>   * resolv/inet_pton.c implements inet_pton()
> >>   * sysdeps/posix/getaddrinfo.c implements getaddrinfo()
> >>   * nss/getXXbyYY.c implements gethostbyname*()
> >>   * nss/getXXbyYY_r.c implements gethostbyname*_r()
> >>   * nss/digits_dots.c implements __nss_hostname_digits_dots()
> >>
> >>Raimo Niskanen wrote:
> >>    
> >>>I have done some research on my own...
> >>>
> >>>These are the ones that succeed (and other numbers
> >>>within the ranges, of course):
> >>>
> >>>Linux, FreeBSD, Solaris:
> >>>		AF_INET
> >>>"127.0.0.1"	->	127.0.0.1
> >>>"192.168.1"	->	192.168.0.1
> >>>"10.1"		->	10.0.0.1
> >>>"17"		->	0.0.0.17
> >>>"192.168.65535"	->	192.168.255.255
> >>>"10.16777215"	->	10.255.255.255
> >>>"4294967295"	->	255.255.255.255
> >>>		AF_INET6
> >>>"127.0.0.1"	->	::ffff:127.0.0.1
> >>>"192.168.1"	->	::ffff:192.168.0.1
> >>>"10.1"		->	::ffff:10.0.0.1
> >>>"17"		->	::ffff:0.0.0.17
> >>>"192.168.65535"	->	::ffff:192.168.255.255
> >>>"10.16777215"	->	::ffff:10.255.255.255
> >>>"4294967295"	->	::ffff:255.255.255.255
> >>>"::127.0.0.1"	->	::127.0.0.1
> >>>"::"		->	::
> >>>
> >>>FreeBSD (addendum):
> >>>		AF_INET
> >>>"127.0.0.1."	->	127.0.0.1
> >>>
> >>>OpenBSD:
> >>>		AF_INET
> >>>"127.0.0.1"	->	127.0.0.1
> >>>"127.0.0.1."	->	127.0.0.1
> >>>		AF_INET6
> >>>"::127.0.0.1"	->	::127.0.0.1
> >>>"::"		->	::
> >>>
> >>>For IPv6 addresses there seems to be consensus: if it
> >>>parses as an IPv6 address according to the specifications
> >>>I recall, that IPv6 address is returned, except that OpenBSD
> >>>does not accept an IPv4 string when requesting an IPv6
> >>>address while the others do.
> >>>
> >>>For IPv4 addresses Linux, FreeBSD and Solaris regard
> >>>many numeric strings as IPv4 addresses while OpenBSD
> >>>requires a 4-field dotted decimal. Both BSDs accept
> >>>a trailing dot for 4-field dotted decimal, while
> >>>Linux and Solaris regard a trailing dot as proof
> >>>that the string is an absolute hostname.
> >>>
> >>>Conclusions:
> >>>
> >>>The least common denominator (and the most common case)
> >>>would be to regard 4-field dotted decimal [0..255]
> >>>with no trailing dot as an IPv4 string.
> >>>
> >>>The most widespread behaviour would be the Linux, Solaris
> >>>and FreeBSD (except for trailing dot) behaviour. And since
> >>>OpenBSD is not the origin of Erlang/OTP and has little
> >>>importance in the community, it would probably be
> >>>the most sensible behaviour.
> >>>
> >>>The current inet_parse:ipv4_address/1 needs to be
> >>>augmented to handle the "192.168.65535",
> >>>"10.16777215" and "4294967295" IPv4 strings.
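
The numbers-and-dots forms in the tables above follow one rule: every
field before the last is one byte, and the last field fills all the
remaining bytes. A minimal C sketch of that rule follows. It is not
the inet_parse code; parse_numbers_and_dots() is a hypothetical name,
and as a simplification it reads every field as decimal, whereas
inet_aton() also accepts octal and hexadecimal fields.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse "a", "a.b", "a.b.c" or "a.b.c.d" (decimal fields only) into a
 * host-byte-order address. Returns false for anything else, including
 * a trailing dot. */
static bool parse_numbers_and_dots(const char *s, uint32_t *out)
{
    uint32_t addr = 0;
    int fields = 0;
    int bits;

    assert(s != NULL && out != NULL);
    for (;;) {
        char *end;
        unsigned long val;

        if (*s < '0' || *s > '9')   /* no sign, space or empty field */
            return false;
        errno = 0;
        val = strtoul(s, &end, 10);
        if (errno == ERANGE || val > 0xffffffffUL)
            return false;
        fields++;

        if (*end == '.') {
            /* A field followed by a dot is exactly one byte, and
             * there can be at most three such fields. */
            if (fields > 3 || val > 0xff)
                return false;
            addr = (addr << 8) | (uint32_t)val;
            s = end + 1;
            continue;
        }
        if (*end != '\0')
            return false;

        /* The last field fills the remaining 32, 24, 16 or 8 bits. */
        bits = 8 * (5 - fields);
        if (bits < 32 && val >= (1UL << bits))
            return false;
        *out = (bits == 32) ? (uint32_t)val
                            : ((addr << bits) | (uint32_t)val);
        return true;
    }
}
```

Under these assumptions "192.168.65535" yields 0xc0a8ffff, that is
192.168.255.255, while "127.0.0.1." and "10.16777216" are rejected.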
> >>>
> >>>I'll toss the suggestion around internally and see if and when
> >>>we can make such a change into the Linux/Solaris behaviour...
> >>>
> >>>/ Raimo
> >>>
> >>>
> >>>
> >>>On Mon, Jan 18, 2010 at 04:07:24PM +0100, Raimo Niskanen wrote:
> >>> 
> >>>      
> >>>>On Mon, Jan 18, 2010 at 10:57:53AM +0800, Chaos Wang wrote:
> >>>>   
> >>>>        
> >>>>>Hi all,
> >>>>>
> >>>>>inet:gethostbyname_tm/4 always tries the specified DNS resolution 
> >>>>>methods first, and checks whether the given domain name is an IPv4/v6 
> >>>>>address only when all previous tries have failed. So even if a string 
> >>>>>containing a valid IP address is specified as the domain name to be 
> >>>>>resolved, it still has to go through all the resolution methods before 
> >>>>>finding out at last that it is already an IP address.
> >>>>>
> >>>>>This would cause serious problems if the 'dns' resolution method is 
> >>>>>specified in some corporate internal networks, in which all unknown 
> >>>>>domain names (including IPv4/v6 address strings treated as domain 
> >>>>>names) are resolved to the same portal server address. Only the 
> >>>>>'native' resolution method can be used in such an environment, because 
> >>>>>the libc DNS resolving API first checks whether the domain name is an 
> >>>>>IP address.
> >>>>>
> >>>>>For example, in my working network, the resolving results with 
> >>>>>{lookup,[native]} specified in the kernel inetrc are as follows:
> >>>>>
> >>>>>  > inet:getaddr("www.google.com", inet).    % real domain name, resolvable at DNS server
> >>>>>  {ok,{64,233,189,99}}
> >>>>>  > inet:getaddr("10.0.0.2", inet).          % treated-as-domain IP address, not resolvable at DNS server
> >>>>>  {ok,{10,0,0,2}}
> >>>>>
> >>>>>But with {lookup,[dns]} specified in the kernel inetrc, the results become:
> >>>>>
> >>>>>  > inet:getaddr("www.google.com", inet).    % real domain name, resolvable at DNS server
> >>>>>  {ok,{64,233,189,99}}
> >>>>>  > inet:getaddr("10.0.0.2", inet).          % treated-as-domain IP address, resolved to portal server address by DNS server
> >>>>>  {ok,{115,124,17,136}}   % Oops...
> >>>>>
> >>>>>IMHO the search order in inet:gethostbyname_tm/4 should be changed to: 
> >>>>>first check whether the domain name is already an IP address, then 
> >>>>>try all specified domain resolution methods.
> >>>>>
> >>>>>Thanks!
> >>>>>     
> >>>>>          
> >>>>Hi!
> >>>>
> >>>>You make a good case for changing the resolving order. I am almost
> >>>>on your side, there is just one little detail...:
> >>>>
> >>>>Historically, portal server fake IP addresses have not been an issue
> >>>>for inet_res (the DNS resolver). Instead, it has had to balance between
> >>>>the RFCs and what actually is done in product networks.
> >>>>
> >>>>It is not impossible for inet_res to be in an environment where
> >>>>the default domain is foo.bar and a lookup for "17" is supposed
> >>>>to return the IP address for the host 17.foo.bar. Now "17" is
> >>>>not a DNS label according to RFC 1035 section 2.3.1 but that
> >>>>is only a "Preferred name syntax".
> >>>>
> >>>>Today it is more unlikely. But the question still is:
> >>>>when can you safely assume the lookup string at hand is
> >>>>an IP address and not a host name?
> >>>>
> >>>>The existing function inet_parse:ipv4_address is probably
> >>>>too forgiving since it translates "17" -> {0,0,0,17},
> >>>>"17.18" -> {17,0,0,18}, "17.18.19" -> {17,18,0,19}
> >>>>and "17.18.19.20" -> {17,18,19,20}, all from ancient
> >>>>praxis or even standards.
> >>>>
> >>>>IPv6 addresses are more clear cut since any IPv6 address must contain
> >>>>at least two colons and that is very unlikely for a host name.
> >>>>
> >>>>Can you strengthen your case by finding out more what it takes for
> >>>>libc DNS to be convinced the lookup string is an IPv4 address? 
> >>>>
> >>>>   
> >>>>        
> >>>>>chaoslawful
> >>>>>
> >>>>>
> >>>>>________________________________________________________________
> >>>>>erlang-bugs mailing list. See http://www.erlang.org/faq.html
> >>>>>erlang-bugs (at) erlang.org
> >>>>>     
> >>>>>          
> >>>>-- 
> >>>>
> >>>>/ Raimo Niskanen, Erlang/OTP, Ericsson AB
> >>>>
> >>>>   
> >>>>        
> >>> 
> >>>      
> >
> >  
> >>#include <sys/socket.h>
> >>#include <netinet/in.h>
> >>#include <arpa/inet.h>
> >>#include <netdb.h>
> >>#include <stdio.h>
> >>#include <errno.h>
> >>#include <string.h>
> >>
> >>void use_inet_addr(const char *name);
> >>void use_inet_aton(const char *name);
> >>void use_inet_pton(const char *name, int af);
> >>void use_gethostbyname(const char *name);
> >>void use_gethostbyname_r(const char *name);
> >>void use_gethostbyname2(const char *name, int af);
> >>void use_gethostbyname2_r(const char *name, int af);
> >>void use_getaddrinfo(const char *name);
> >>
> >>int main(int argc, char *argv[])
> >>{
> >>	if(argc != 2) {
> >>		printf("Usage: %s <IPv4 addr>\n", argv[0]);
> >>		return 1;
> >>	}
> >>
> >>	use_inet_addr(argv[1]);
> >>	use_inet_aton(argv[1]);
> >>	use_inet_pton(argv[1], AF_INET);
> >>	use_gethostbyname(argv[1]);
> >>	use_gethostbyname_r(argv[1]);
> >>#if defined(_BSD_SOURCE) || defined(_SVID_SOURCE)
> >>	use_gethostbyname2(argv[1], AF_INET);
> >>	use_gethostbyname2_r(argv[1], AF_INET);
> >>#endif
> >>	use_getaddrinfo(argv[1]);
> >>
> >>	return 0;
> >>}
> >>
> >>void use_inet_addr(const char *name)
> >>{
> >>	in_addr_t addr = inet_addr(name);
> >>
> >>	printf("inet_addr: ");
> >>	if(addr == INADDR_NONE) {
> >>		printf("failed (possibly 255.255.255.255)\n");
> >>	} else {
> >>		struct in_addr in;
> >>		in.s_addr = addr;
> >>		printf("%s\n", inet_ntoa(in));
> >>	}
> >>}
> >>
> >>void use_inet_aton(const char *name)
> >>{
> >>	struct in_addr in;
> >>
> >>	printf("inet_aton: ");
> >>	if(inet_aton(name, &in)) {
> >>		printf("%s\n", inet_ntoa(in));
> >>	} else {
> >>		printf("failed\n");
> >>	}
> >>}
> >>
> >>void use_inet_pton(const char *name, int af)
> >>{
> >>	struct in_addr in;
> >>	
> >>	printf("inet_pton: ");
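> >>	/* inet_pton() returns 1 on success, 0 for an unparsable string
> >>	 * and -1 for an unsupported address family */
> >>	if(inet_pton(af, name, &in) == 1) {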
> >>		printf("%s\n", inet_ntoa(in));
> >>	} else {
> >>		printf("failed\n");
> >>	}
> >>}
> >>
> >>void use_gethostbyname(const char *name)
> >>{
> >>	struct hostent *h = gethostbyname(name);
> >>
> >>	printf("gethostbyname: ");
> >>	if(!h) {
> >>		printf("failed (%s)\n", hstrerror(h_errno));
> >>	} else {
> >>		if(h->h_addrtype != AF_INET) {
> >>			printf("failed (invalid address type)\n");
> >>		} else {
> >>			char **pp = h->h_addr_list;
> >>			while(*pp != NULL) {
> >>				struct in_addr *p = (struct in_addr*)(*pp);
> >>				printf("%s\n", inet_ntoa(*p));
> >>				++pp;
> >>			}
> >>		}
> >>	}
> >>}
> >>
> >>void use_gethostbyname_r(const char *name)
> >>{
> >>	int rc;
> >>	char buf[8192];
> >>	struct hostent h;
> >>	struct hostent *rp;
> >>	int myerrno;
> >>
> >>	printf("gethostbyname_r: ");
> >>	rc = gethostbyname_r(name, &h, buf, sizeof(buf), &rp, &myerrno);
> >>	if(rc == ERANGE) {
> >>		printf("failed (buffer too small)\n");
> >>	} else if(!rc) {
> >>		if(!rp) {
> >>			printf("no address found\n");
> >>		} else {
> >>			char **pp = h.h_addr_list;
> >>			while(*pp != NULL) {
> >>				struct in_addr *p = (struct in_addr*)(*pp);
> >>				printf("%s\n", inet_ntoa(*p));
> >>				++pp;
> >>			}
> >>		}
> >>	} else {
> >>		printf("failed (%s)\n", hstrerror(myerrno));
> >>	}
> >>}
> >>
> >>#if defined(_BSD_SOURCE) || defined(_SVID_SOURCE)
> >>
> >>void use_gethostbyname2(const char *name, int af)
> >>{
> >>	struct hostent *h = gethostbyname2(name, af);
> >>
> >>	printf("gethostbyname2: ");
> >>	if(!h) {
> >>		printf("failed (%s)\n", hstrerror(h_errno));
> >>	} else {
> >>		if(h->h_addrtype != AF_INET) {
> >>			printf("failed (invalid address type)\n");
> >>		} else {
> >>			char **pp = h->h_addr_list;
> >>			while(*pp != NULL) {
> >>				struct in_addr *p = (struct in_addr*)(*pp);
> >>				printf("%s\n", inet_ntoa(*p));
> >>				++pp;
> >>			}
> >>		}
> >>	}
> >>}
> >>
> >>void use_gethostbyname2_r(const char *name, int af)
> >>{
> >>	int rc;
> >>	char buf[8192];
> >>	struct hostent h;
> >>	struct hostent *rp;
> >>	int myerrno;
> >>
> >>	printf("gethostbyname2_r: ");
> >>	rc = gethostbyname2_r(name, af, &h, buf, sizeof(buf), &rp, &myerrno);
> >>	if(rc == ERANGE) {
> >>		printf("failed (buffer too small)\n");
> >>	} else if(!rc) {
> >>		if(!rp) {
> >>			printf("no address found\n");
> >>		} else {
> >>			char **pp = h.h_addr_list;
> >>			while(*pp != NULL) {
> >>				struct in_addr *p = (struct in_addr*)(*pp);
> >>				printf("%s\n", inet_ntoa(*p));
> >>				++pp;
> >>			}
> >>		}
> >>	} else {
> >>		printf("failed (%s)\n", hstrerror(myerrno));
> >>	}
> >>}
> >>
> >>#endif
> >>
> >>void use_getaddrinfo(const char *name)
> >>{
> >>	int rc;
> >>	struct addrinfo hints;
> >>	struct addrinfo *resp;
> >>
> >>	printf("getaddrinfo: ");
> >>
> >>	memset(&hints, 0, sizeof(hints));
> >>	hints.ai_family = AF_INET;
> >>	hints.ai_flags = AI_ADDRCONFIG | AI_PASSIVE;
> >>	hints.ai_socktype = SOCK_STREAM;
> >>	
> >>	rc = getaddrinfo(name, NULL, &hints, &resp);
> >>	if(rc) {
> >>		printf("failed (%s)\n", gai_strerror(rc));
> >>	} else {
> >>		struct addrinfo *rp;
> >>		
> >>		for(rp = resp; rp != NULL; rp = rp->ai_next) {
> >>			struct sockaddr_in *addr = (struct sockaddr_in*)(rp->ai_addr);
> >>			printf("%s\n", inet_ntoa(addr->sin_addr));
> >>		}
> >>
> >>		freeaddrinfo(resp);
> >>	}
> >>}
> >>
> >>
> >>    
> >
> >  
> >>    
> >
> >  
> 

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB

