The Linux Page

OpenSSL error: <no libname>, <no funcname>, <no reason>, [no details] in ssl_lib.c

Foggy train tracks that divide as threads in a process do.

Problem

While working on a project that creates many threads, each handling many messages over secure sockets, I ran into an interesting error (I broke the error up over multiple lines so it's easier to read than a one-liner with a scrollbar!):

# OpenSSL error
06/11/18 14:31:39:error: tcp_client_server.cpp/bio_log_errors(367):
       OpenSSL: [336236705/20|169|161]:[<no libname>]:[<no
       funcname>]:[<no reason>]:[ssl_lib.c]:[1963]:[(no details)]

# My library error
06/11/18 14:31:39:error: safe_thread.h/run(468): a standard
       exception occurred in thread "ticker-service":
       failed creating an SSL_CTX server object

How can an error like this happen once in a while, but not all the time? You would think that if SSL_CTX can't be created once, my code must be all broken and it should always fail.

The fact is that OpenSSL is multi-thread aware, but it's not thread-safe by default, at least in the 1.0.x releases where the callbacks described here exist. This is because OpenSSL has no internal means to lock a mutex or semaphore. Instead it relies on you to provide a callback that does that dirty work for it.

Notice how useful the OpenSSL error is: it has no clue what happened. It reports absolutely no library name, no function name, and no reason, although it does know the source file where the problem happened (ssl_lib.c on line 1963).

The codes 336236705/20, 169, and 161 don't help much either! This is why I'm writing this post, in an attempt to help others who get a similar no info error in their logs.
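The numbers do at least decode mechanically: 20, 169, and 161 are the library, function, and reason fields packed inside 336236705. Here is a minimal sketch of that split, assuming the OpenSSL 1.0.x layout (8 bits of library, 12 bits of function, 12 bits of reason; newer releases pack the code differently). The struct and function names are mine, not OpenSSL's. In real code the ERR_GET_LIB(), ERR_GET_FUNC(), and ERR_GET_REASON() macros do this for you.

```cpp
#include <cstdint>

// Hypothetical helper type for this sketch, not part of OpenSSL.
struct openssl_error_fields
{
    int lib;
    int func;
    int reason;
};

// Split a packed error code, assuming the OpenSSL 1.0.x layout:
// bits 24-31 = library, bits 12-23 = function, bits 0-11 = reason.
constexpr openssl_error_fields split_error_code(std::uint32_t code)
{
    return openssl_error_fields{
        static_cast<int>((code >> 24) & 0xFF),
        static_cast<int>((code >> 12) & 0xFFF),
        static_cast<int>(code & 0xFFF)
    };
}

// The code from the log above splits into the three numbers next to it.
static_assert(split_error_code(336236705u).lib == 20, "library field");
static_assert(split_error_code(336236705u).func == 169, "function field");
static_assert(split_error_code(336236705u).reason == 161, "reason field");
```

Library 20 is ERR_LIB_SSL, so the number at least confirms that the failure comes from the SSL layer.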

Note 1: once I had the lock in place, starting my application triggered over 117,000 lock/unlock calls just to get initialized properly (no clients connected yet). Pretty impressive!

Note 2: the OpenSSL documentation actually says that the library is likely to crash if you don't initialize these two callbacks properly. It really is important, and I think a little heuristic in the code to detect that it is being used in a multithreaded environment could help. At the same time, we all like super fast libraries, don't we?

Solution

The solution is to set up two callbacks. In most cases only one is necessary, but the docs say you should set both of these functions:

void locking_function(int mode, int n, const char *file, int line);
void threadid_func(CRYPTO_THREADID *id);

The locking_function() accepts four parameters as follows:

  • mode: whether to LOCK or UNLOCK and whether the LOCK is exclusive (WRITE) or sharable (READ)
  • n: the lock number; OpenSSL uses many locks to reduce the chance of one thread blocking the others for too long
  • file: a pointer to the source file calling the callback
  • line: the line number at which the call is happening

The threadid_func() accepts one parameter as follows:

  • id: a pointer where the thread identifier is to be saved

You have to make sure to at least initialize the locking_function() before creating your first thread.

There are three important points on implementing these two functions in a C++ application.

Setting Up the Callbacks

Somehow, you must make sure to call the setup before you create any thread.

If you wrapped the thread implementation in some way, you should be able to do that in your wrapper. Otherwise, do it somewhere in your main() function or in one of the initialization functions it calls.

The two calls are simple here:

CRYPTO_set_locking_callback(&openssl_lock);
CRYPTO_THREADID_set_callback(&openssl_threadid);

However, openssl_lock() requires you to have an array of mutexes. One way is to initialize the array like this:

// a global pointer, null by default
//
std::mutex *       g_mutex = nullptr;

// the initialization code
//
g_mutex = new std::mutex[CRYPTO_num_locks()];

If you are sure that the array can be created in a global variable, then go for it. It just needs to exist by the time the callback gets called. So be careful!
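One way to keep that guarantee explicit is to let a smart pointer own the array and to go through a small accessor that checks the index. This is only a sketch under my own assumptions: the names openssl_mutex_setup() and openssl_mutex() are hypothetical, and the lock count is passed in so the code compiles without OpenSSL (in the real thing you would pass CRYPTO_num_locks()).

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>

namespace
{
// Null until openssl_mutex_setup() runs; freed automatically at exit.
std::unique_ptr<std::mutex[]> g_openssl_mutexes;
std::size_t                   g_openssl_mutex_count = 0;
}

// Call once, before the first thread is created.
// lock_count stands in for CRYPTO_num_locks().
void openssl_mutex_setup(int lock_count)
{
    assert(lock_count > 0);
    if(g_openssl_mutexes == nullptr)
    {
        // a second call is a no-op so locks in use are never replaced
        g_openssl_mutexes.reset(new std::mutex[lock_count]);
        g_openssl_mutex_count = static_cast<std::size_t>(lock_count);
    }
}

// Return lock number n, checking that the array exists and n is valid.
std::mutex & openssl_mutex(int n)
{
    assert(n >= 0 && static_cast<std::size_t>(n) < g_openssl_mutex_count);
    return g_openssl_mutexes[n];
}
```

The setup itself is not thread safe, which is fine as long as it runs before any thread exists, as required above.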

The Locking Callback Function

The locking callback function needs to use the 'n' parameter it receives to know which lock is being LOCKed or UNLOCKed.

What is happening is defined in the 'mode' parameter. Here I have a standard mutex which can either be locked exclusively or unlocked. If you want to support a semaphore or some other type of mutex that can be locked non-exclusively, then you will want to take the CRYPTO_READ and CRYPTO_WRITE flags into account.

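void openssl_lock(int mode, int n, char const * file, int line)
{
    // file and line are only useful for debugging, they are unused here
    //
    static_cast<void>(file);
    static_cast<void>(line);

    if((mode & CRYPTO_LOCK) != 0)
    {
        // At this point we don't have a READ vs READ/WRITE so
        // we just lock and that's it
        //
        g_mutex[n].lock();
    }
    else
    {
        // anything that is not a LOCK is an UNLOCK
        //
        g_mutex[n].unlock();
    }
}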
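For reference, here is a sketch of what honoring the READ and WRITE flags could look like with std::shared_mutex, assuming C++17. The k_* constants are stand-ins I define so the sketch compiles without OpenSSL; they mirror the CRYPTO_LOCK, CRYPTO_UNLOCK, CRYPTO_READ, and CRYPTO_WRITE values of the 1.0.x headers, and real code should use the macros themselves. The function names are hypothetical too.

```cpp
#include <cassert>
#include <memory>
#include <shared_mutex>

// Stand-ins for the flags in <openssl/crypto.h> (1.0.x values).
constexpr int k_lock   = 1;  // CRYPTO_LOCK
constexpr int k_unlock = 2;  // CRYPTO_UNLOCK
constexpr int k_read   = 4;  // CRYPTO_READ
constexpr int k_write  = 8;  // CRYPTO_WRITE

std::unique_ptr<std::shared_mutex[]> g_rw_mutex;
int                                  g_rw_mutex_count = 0;

// lock_count stands in for CRYPTO_num_locks().
void openssl_rw_setup(int lock_count)
{
    assert(lock_count > 0);
    g_rw_mutex.reset(new std::shared_mutex[lock_count]);
    g_rw_mutex_count = lock_count;
}

void openssl_rw_lock(int mode, int n, char const * /*file*/, int /*line*/)
{
    assert(n >= 0 && n < g_rw_mutex_count);
    assert((mode & (k_lock | k_unlock)) != 0);
    assert((mode & (k_read | k_write)) != 0);

    // READ means several threads may hold the lock at the same time
    bool const shared = (mode & k_read) != 0;
    if((mode & k_lock) != 0)
    {
        if(shared) g_rw_mutex[n].lock_shared();
        else       g_rw_mutex[n].lock();
    }
    else
    {
        // the unlock call carries the same READ/WRITE flag as the lock
        if(shared) g_rw_mutex[n].unlock_shared();
        else       g_rw_mutex[n].unlock();
    }
}
```

Whether this is faster than a plain mutex depends on how often OpenSSL asks for read locks in your workload, so measure before switching.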

In my example I ignore the file and line parameters. They can be useful if you get a deadlock or a similar error and want to see what happens around that time. Logging them on every call, however, is likely to flood your log file given the number of lock/unlock calls mentioned above. That being said, if you are having problems, you may need some logs just to see that the function gets called properly.
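A middle ground between logging everything and logging nothing is to remember only the last call site per lock and look at it from a debugger when things hang. The following is a sketch with hypothetical names (lock_site, openssl_debug_setup(), openssl_debug_lock()), a stand-in constant for CRYPTO_LOCK, and the assumption that the file pointer refers to a string literal that stays valid.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

constexpr int k_lock = 1;  // stand-in for CRYPTO_LOCK (1.0.x value)

// Where a lock was last taken; file is assumed to be a string literal.
struct lock_site
{
    char const * file;
    int          line;
};

std::unique_ptr<std::mutex[]> g_dbg_mutex;
std::vector<lock_site>        g_last_site;

// lock_count stands in for CRYPTO_num_locks().
void openssl_debug_setup(int lock_count)
{
    assert(lock_count > 0);
    g_dbg_mutex.reset(new std::mutex[lock_count]);
    g_last_site.assign(static_cast<std::size_t>(lock_count),
                       lock_site{ nullptr, 0 });
}

void openssl_debug_lock(int mode, int n, char const * file, int line)
{
    assert(n >= 0 && static_cast<std::size_t>(n) < g_last_site.size());
    if((mode & k_lock) != 0)
    {
        g_dbg_mutex[n].lock();
        // written while holding lock n, so writers never race
        g_last_site[n] = lock_site{ file, line };
    }
    else
    {
        // clear the entry before releasing so a held lock always has one
        g_last_site[n] = lock_site{ nullptr, 0 };
        g_dbg_mutex[n].unlock();
    }
}
```

When the process deadlocks, printing g_last_site in the debugger shows which locks are held and where they were taken, without a single log line.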

The Thread ID Callback Function

As mentioned in the documentation, the thread ID callback has a default implementation, so in effect you are not required to set it. The default is used when you don't install your own thread ID callback.

The way the default implementation works is by using the pointer to errno as the unique identifier (i.e. the identifier does not need to be any specific identifier, it just needs to be unique for each thread.)

So the default function looks something like this:

void openssl_threadid(CRYPTO_THREADID * id)
{
    CRYPTO_THREADID_set_pointer(id, &errno);
}
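If the idea of using an address as a thread identifier feels odd, it is easy to check that &errno really differs between threads that are alive at the same time. The sketch below uses only the standard library; the function name is hypothetical. It keeps all the threads alive until each one has recorded its address, because an address can be reused once a thread has exited.

```cpp
#include <cassert>
#include <cerrno>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <set>
#include <thread>
#include <vector>

// Start `count` threads, keep them all alive at once, and return how
// many distinct &errno addresses they observed.
std::size_t count_distinct_errno_addresses(std::size_t count)
{
    assert(count > 0);

    std::mutex               m;
    std::condition_variable  cv;
    std::set<void const *>   addresses;
    std::size_t              arrived = 0;

    std::vector<std::thread> threads;
    for(std::size_t i = 0; i < count; ++i)
    {
        threads.emplace_back([&]() {
            std::unique_lock<std::mutex> guard(m);
            addresses.insert(&errno);   // this thread's own errno
            ++arrived;
            cv.notify_all();
            // stay alive until every thread has registered its address
            cv.wait(guard, [&]() { return arrived == count; });
        });
    }
    for(auto & t : threads)
    {
        t.join();
    }
    return addresses.size();
}
```

Calling count_distinct_errno_addresses(8) should return 8, one address per live thread.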

On my end, I used the thread identifier as returned by gettid(). Note, however, that gettid() is often not defined in the C library. If you run into that problem, I'm including a definition right here too (I still have to use it in 16.04).

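#include <sys/syscall.h>
#include <unistd.h>

// only needed when the C library does not provide gettid() itself
//
inline pid_t gettid()
{
    return static_cast<pid_t>(syscall(SYS_gettid));
}

void openssl_threadid(CRYPTO_THREADID * id)
{
    CRYPTO_THREADID_set_numeric(id, gettid());
}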

I think the default implementation is likely the better choice because you avoid a syscall(). At least, getting the pointer to errno should not generate a syscall() on Intel x86 processors (Pentium, Core, Xeon, etc.). The errno variable is thread-local, and thread-local storage is addressed through a segment register (GS on 32-bit x86 Linux, FS on x86-64).

If you need really high speed, I suggest you test with a profiler and see how both implementations behave.
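Short of a full profiler run, a crude timing loop already gives a first idea. This is a Linux-only sketch with hypothetical function names; it times the raw syscall(SYS_gettid) against taking the address of errno, and the volatile sinks are only there to keep the compiler from removing the loops. Treat the numbers as rough hints, not as a benchmark.

```cpp
#include <cassert>
#include <cerrno>
#include <chrono>
#include <cstdint>
#include <sys/syscall.h>
#include <unistd.h>

// Sinks that keep the measured work from being optimized away.
volatile long           g_tid_sink = 0;
volatile std::uintptr_t g_errno_sink = 0;

// Run f() `iterations` times and return the elapsed nanoseconds.
template<typename F>
long long time_calls(int iterations, F f)
{
    assert(iterations > 0);
    auto const start = std::chrono::steady_clock::now();
    for(int i = 0; i < iterations; ++i)
    {
        f();
    }
    auto const stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(
                                            stop - start).count();
}

// Cost of asking the kernel for the thread identifier.
long long time_gettid_syscall(int iterations)
{
    return time_calls(iterations, []() {
        g_tid_sink = syscall(SYS_gettid);
    });
}

// Cost of taking the address of the thread-local errno.
long long time_errno_pointer(int iterations)
{
    return time_calls(iterations, []() {
        g_errno_sink = reinterpret_cast<std::uintptr_t>(&errno);
    });
}
```

Comparing time_gettid_syscall(1000000) with time_errno_pointer(1000000) on your own hardware tells you whether the difference matters for your application.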