Squeezing Blood from a Stone or Getting Type Information from a C Language Compiler

The C language and UNIX comprise a long tradition of using the worst programming tools ever created, then making worthless remarks on the difficulty of using them, as this intelligent idiot once wrote:

    Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?
    --Brian Kernighan ``Kernighan's Law''

Ada is a real programming language, and Ada 1995 includes real facilities for interfacing with other languages: COBOL, FORTRAN, and C. Unfortunately, the C language is still deeply unpleasant to bind, because many more semi-standard types are used in practice than Ada's Interfaces.C package provides. As the C language lacks anything resembling a package system, I must instead pore over many directories and files with asinine names, conventions, and nestings in an attempt to learn the base definitions.

I've seen no tool that simplifies this stupid task. Calling upon GCC with the -E option is intended to stop compilation after the text replacement phase, which should enable one to see the definitions of certain values such as macros, but this fails as often as it succeeds for some reason or another. Seeing such values is only part of the problem, however, as the definitions of types aren't revealed by this method. GNU GLOBAL and cscope are both tools ostensibly meant for the task, but I can't get them to work. This insanity shouldn't be surprising to me, and isn't, but it's still bothersome. Trying to use such shitty tools makes me appreciate even more the simple Common Lisp functions TYPE-OF, DESCRIBE, INSPECT, and also the distinction between the functions MACROEXPAND and MACROEXPAND-1.
It occurred to me lately how my POSIX_UDP_Garbage Ada library, the interface entirely shrouded by my Usable_Datagram_Package library, could be improved still without involving C language code directly: An automated method of determining the few macro values and type definitions needed would suffice to make it easier to port to systems besides GNU/Linux, which likely only needs to be done a few times.

I need but the following macro values, with only the last deviating at all from what I've already seen: AF_INET, SOCK_DGRAM, MSG_TRUNC. That last value also deviates in acceptable use; both GNU/Linux and FreeBSD permit it to be used to get the true length of a packet, whereas OpenBSD seems not to do so. The easy way to handle this discrepancy makes the value zero on such a system, because the asinine way in which the C language fakes having optional parameters, with bits in a flag word, permits ignoring values in this way.

After looking at the available tools, I got the idea to write a simple C language program which will reveal this information, but the problem of base type definitions is much harder. I need the three following types, at a minimum, and the third is a structure: ssize_t, socklen_t, sockaddr. The second type is an asinine type introduced by POSIX in a vain and late attempt at abstraction, by replacing a plain integer parameter type with a type that's almost certainly an unsigned integer underneath. The first type exists for want of a single additional value, pathetically: the -1 that signals an error. The structure type is constrained to be sixteen octets in every implementation I've checked, even to the point some use more exact types for the fields; a similarly asinine type for the first field is called sa_family_t, and actually varies between implementations in the most frustrating way.
The BSD implementations of the sockaddr type introduce a useless additional field sa_len, which is of course always set to zero despite its name supposedly referring to a length, whereas the GNU/Linux implementation uses for its first field a single type twice the size of sa_len. Amusingly, this field difference matters not under big endian systems, where the zeroed sa_len occupies the same octet as the high octet of the wider field. Unfortunately, little endian systems are the preference of UNIX hackers; when given a good choice and a bad choice, the true UNIX hacker always picks the latter without any hesitation.

In attempting to write this incredibly simple program, I learned again of one addition to the 2011 C language standard, a keyword named _Generic that provides a little less than half of the mechanism a real language would have. This keyword allows one to write an expression that varies based on types given to it, roughly like TYPECASE in Common Lisp but with a laughable name; however, the types must be wholly enumerated, because the C language lacks anything like TYPE-OF. Regardless, this pathetic mechanism suffices for my purpose, and I was able to write a program that somewhat does what I want.

This program suffices not for solving the entire problem, given the other type assumptions, but even C language programs break if their many assumptions about the sizes of types be violated in any way. Amusing articles have been written about the consequences of changing the size of the ``int'' type. This program could be improved in trivial ways, with which I likely won't bother. I'm interested in learning of any better ways to achieve this extremely simple result, but won't be holding my breath.
This small, and almost entirely useless, program is placed in the public domain, as if that matters:

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/socket.h>

    int main (void)
    {
      /* These exist only so that _Generic may inspect their types. */
      ssize_t s;
      socklen_t t;

      if (2 == SOCK_DGRAM) printf("SOCK_DGRAM = 2\n");
      else printf("SOCK_DGRAM /= 2\n");

      if (2 == AF_INET) printf("AF_INET = 2\n");
      else printf("AF_INET /= 2\n");

      if (32 == MSG_TRUNC) printf("MSG_TRUNC = 32\n");
      else printf("MSG_TRUNC /= 32\n");

      if (_Generic(s, long: 1, default: 0)) printf("ssize_t is long\n");
      else printf("ssize_t isn't long\n");

      if (_Generic(t, unsigned int: 1, default: 0)) printf("socklen_t is unsigned integer\n");
      else printf("socklen_t isn't unsigned integer\n");

      /* Exit with zero only if every assumption holds. */
      return !((2 == SOCK_DGRAM)
               && (2 == AF_INET)
               && (32 == MSG_TRUNC)
               && (_Generic(s, long: 1, default: 0))
               && (_Generic(t, unsigned int: 1, default: 0)));
    }