Go home operator new[], you’re drunk

So, you can allocate zero-length arrays in C++, and, unsurprisingly, dereferencing the returned pointer is undefined behavior:
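
For concreteness, a minimal illustration:

int* p = new int[0];   // allocating a zero-length array: perfectly legal
// *p = 1;             // ...but dereferencing the result is undefined behavior
delete[] p;            // the (empty) allocation still has to be released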

Unlike malloc(3), however, the C++ standard requires the returned pointer to be non-null (sec. 3.7.4.1 of the C++11 standard):

If the request succeeds, the value returned shall be a non-null
pointer value (4.10) p0 different from any previously returned value
p1, unless that value p1 was subsequently passed to an operator
delete. The effect of dereferencing a pointer returned as a request
for zero size is undefined.

Personally I think that “this can return whatever, including null” is a lot more of a warn-off than “this will always return non-null, but don’t touch it”.

Forcing TLS1+ in Python’s urllib2 on OSX

The recently-announced POODLE SSLv3 downgrade vulnerability is probably prompting you to update at the very least your client-side TLS applications to negotiate only TLS1 and above. If you’re using Python’s urllib2, however, you may discover that you have little immediate control over the supported TLS dialects. Worse, if you’re on OSX you will discover that urllib2 requests are rejected by servers that have been patched to support only TLS1+. Misery!

The problem appears to stem from the implementation of the SSLv23 compatibility protocol used by Python on OSX, which in wire traces leads with an SSL2 version in the ClientHello, and fails to retry with a higher version if rejected by the server. The upshot is that turning off SSLv3 support is going to break Python client applications that use urllib2 on OSX.

Despite the lack of an explicit interface to do so, forcing TLS1+ support in urllib2 is a reasonably straightforward matter of overriding the default HTTPSHandler and its underlying httplib.HTTPSConnection:
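
A sketch of that approach, assuming Python 2.x (the class names TLS1Connection and TLS1Handler are illustrative, not part of urllib2; ssl.PROTOCOL_TLSv1 pins TLSv1 specifically, which is as far as the ssl module of that era allows):

import httplib
import socket
import ssl
import urllib2

class TLS1Connection(httplib.HTTPSConnection):
    """HTTPSConnection variant that refuses to negotiate anything below TLSv1."""
    def connect(self):
        # Mirrors HTTPSConnection.connect(), but pins the protocol version
        # instead of relying on the SSLv23 compatibility handshake.
        sock = socket.create_connection((self.host, self.port), self.timeout)
        if getattr(self, '_tunnel_host', None):
            self.sock = sock
            self._tunnel()
        self.sock = ssl.wrap_socket(sock, self.key_file, self.cert_file,
                                    ssl_version=ssl.PROTOCOL_TLSv1)

class TLS1Handler(urllib2.HTTPSHandler):
    """HTTPSHandler that hands out TLS1Connection objects."""
    def https_open(self, req):
        return self.do_open(TLS1Connection, req)

# Install globally so plain urllib2.urlopen() picks up the new handler.
urllib2.install_opener(urllib2.build_opener(TLS1Handler()))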

The implementation is somewhat unsatisfying: HTTPSConnection is not designed for extension and this approach requires duplicating the default connect method’s functionality, but it does work.

Log n isn’t constant


Reading memory allocator blog posts (because this is how I prefer to spend Saturday nights), I came across this well-put observation by Jason Evans:

In essence, my initial failure was to disregard the difference between a O(1) algorithm and a O(lg n) algorithm. Intuitively, I think of logarithmic-time algorithms as fast, but constant factors and large n can conspire to make logarithmic time not nearly good enough.

It rewards contemplation.

C++ Thread Pools

The thread pool pattern is convenient for handling problems that exhibit a fair amount of unsynchronized concurrency, e.g. systems that dispatch unrelated requests to workers for handling or that periodically spawn background tasks. C++ thread pool library offerings are relatively sparse: the choices are roughly the lightweight but incomplete and unmaintained proto-boost thread pool (not actually part of the Boost library), or taking a dependency on the much larger and fully-featured libdispatch (an open-source port of Grand Central Dispatch). Recently, my employer Maginatics released our own lightweight, header-only thread pool library, which you can find here: https://github.com/maginatics/threadpool
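
For context, here is a minimal sketch of the pattern itself (deliberately not the Maginatics threadpool API, just the classic queue-plus-workers shape): workers block on a shared queue and run whatever std::function tasks have been submitted.

#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

class SimpleThreadPool {
public:
    explicit SimpleThreadPool(std::size_t num_threads) {
        for (std::size_t i = 0; i < num_threads; ++i) {
            workers_.emplace_back([this] {
                for (;;) {
                    std::function<void()> task;
                    {
                        std::unique_lock<std::mutex> lock(mutex_);
                        cv_.wait(lock, [this] { return stop_ || !tasks_.empty(); });
                        if (stop_ && tasks_.empty()) {
                            return;  // drained and shutting down
                        }
                        task = std::move(tasks_.front());
                        tasks_.pop();
                    }
                    task();  // run outside the lock
                }
            });
        }
    }

    ~SimpleThreadPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_ = true;
        }
        cv_.notify_all();
        for (std::thread& worker : workers_) {
            worker.join();
        }
    }

    // Enqueue an unrelated unit of work for some worker to pick up.
    void submit(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        cv_.notify_one();
    }

private:
    std::vector<std::thread> workers_;
    std::queue<std::function<void()>> tasks_;
    std::mutex mutex_;
    std::condition_variable cv_;
    bool stop_ = false;
};

// Usage: SimpleThreadPool pool(4); pool.submit([] { /* handle a request */ });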

Example static locals workaround

Re: previously. Happily, MSVC 2013 is going to support thread-safe initialization of locally scoped static variables. In the meanwhile, you can work around this shocking deficiency most of the time by doing the compiler’s work manually via double-checked locking and taking advantage of BSS zero initialization. Here’s a convenience macro that defines and initializes a locally scoped static safely in MSVC 2012 and prior. Note that for this to work, the type must have a trivial constructor, or it’s turtles all the way down. Thanks to Andrew Gaul for suggesting some cleanups over a previous iteration of this code.
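
A rough sketch of the general technique, not the original macro (SAFE_LOCAL_STATIC is a hypothetical name, and this version heap-allocates the instance rather than holding it by value); it assumes the Win32 Interlocked APIs and MSVC’s default acquire/release semantics for volatile reads and writes:

#include <windows.h>

// The pointer and guard below are zero-initialized at load time (BSS), so
// no compiler-generated guard code is required for them; the instance is
// then constructed exactly once under double-checked locking.
#define SAFE_LOCAL_STATIC(TYPE, NAME)                                        \
    static TYPE* NAME = 0;                 /* zero-initialized in BSS */     \
    static volatile LONG NAME##_state = 0; /* 0 = raw, 1 = busy, 2 = done */ \
    if (NAME##_state != 2) {                                                 \
        if (InterlockedCompareExchange(&NAME##_state, 1, 0) == 0) {          \
            NAME = new TYPE();             /* we won the race: construct */  \
            InterlockedExchange(&NAME##_state, 2);                           \
        } else {                                                             \
            /* another thread is constructing; wait for it to finish */      \
            while (NAME##_state != 2) {                                      \
                SwitchToThread();                                            \
            }                                                                \
        }                                                                    \
    }

// Usage, in place of "static Foo singleton; return &singleton;":
Foo* foo() {
    SAFE_LOCAL_STATIC(Foo, singleton);
    return singleton;
}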

Static initialization and thread safety

In a recent code review, I found some globally-accessible state that was incorrectly being initialized more than once. I suggested that the state be made truly global, and because this is C++ and because I have been bitten rather hard before, I recommended using the construct-on-first-use idiom, e.g.:

// Never invoke this method in a destructor kthx.
Foo* foo() {
    static Foo singleton;
    return &singleton;
}

Another colleague, however, wondered whether and to what extent this code is thread safe. In C++, block-scope variables with static storage duration are initialized the first time control passes through their declaration, but I’d never given much thought to the multi-threaded context, because, embarrassingly, Marshall Cline has never led me astray. Cursory googling, however, turned up disturbing evidence to the contrary. Refer to the article for details, but basically the author claims that compilers will emit thread-unsafe initialization code, and furthermore that this is “required by the C++ standard” (presumably C++03 at the time).

I couldn’t double-check the requirement claim (C++ standards, it turns out, are free as in speech, not beer), and despite my previous life I couldn’t bring myself to dredge up a 2004-vintage compiler and experiment directly. Besides, it’s 2012 and we’re all working with contemporary compilers (right?), so I asked the C++11 draft standard. From Section 6.7, paragraph 4, which describes initialization of block-scope variables with static storage duration:

Otherwise such a variable is initialized the first time control passes
through its declaration; such a variable is considered initialized upon
the completion of its initialization. If the initialization exits by
throwing an exception, the initialization is not complete, so it will be
tried again the next time control enters the declaration. If control
enters the declaration concurrently while the variable is being
initialized, the concurrent execution shall wait for completion of the
initialization.

The standard is pretty emphatically on the thread-safe side. But what do actual compilers do? Consider the following toy program:

struct Foo {
    Foo() { foo_ = 11; } 
    int foo_; 
}; 

Foo* getFoo() { 
    static Foo singleton; 
    return &singleton; 
} 

Compiling this code under GCC 4.6.1 & disassembling getFoo yields the following:

400614: push %rbp 
400615: mov %rsp,%rbp 
400618: mov $0x601040,%eax 
40061d: movzbl (%rax),%eax 
400620: test %al,%al 
400622: jne 40064b 
400624: mov $0x601040,%edi 
400629: callq 400500 <__cxa_guard_acquire@plt> 
40062e: test %eax,%eax 
400630: setne %al 
400633: test %al,%al 
400635: je 40064b 
400637: mov $0x601048,%edi 
40063c: callq 400680 <_ZN3FooC1Ev> 
400641: mov $0x601040,%edi
400646: callq 400520 <__cxa_guard_release@plt> 
40064b: mov $0x601048,%eax
400650: pop %rbp 

The constructor for the block-scoped static variable singleton is protected by a guard variable (at $0x601040) and runs within a critical section (__cxa_guard_release sets the guard variable once initialization completes), fulfilling the requirements of the spec. Whatever the case was in 2004, block-scoped statics in contemporary C++ do not pose a risk for concurrent programs. This doesn’t mean they shouldn’t be used cautiously, however: the spec leaves recursive (re-entrant) initialization undefined, and only requires that the implementation’s own locking not introduce deadlock. If your initialization routines do re-enter themselves, though, I submit that you have a more serious problem.

Edit: The above discussion in no way applies to VC 2012 (and previous versions), for which initializing a local variable with static storage duration remains manifestly unsafe, as I have recently discovered to my vast dismay.