Loading...

Finding and fixing leaks in our nodejs addon

9 years ago

nodejs is a great platform reaching a certain degree of maturity. We generally have a lazy approach toward API support with quasardb: if no one is asking for it, don’t implement it.

And one day it happened: a customer asked for a nodejs driver!

Discovering node and writing my first addon

The documentation gets you started and answers your most basic question such as “how do I build a C++ object?” or “How do I call a method?”.

For the rest you will have to read the v8 documentation and read examples in order to understand how you wrap objects, manage lifetime and all those v8 oddities.

I think the biggest odditity with v8, which is due to the garbage collector, is the decorellation between a C++ scope and the actual scope of an object. This is well explained here.

There is a library called nan that helps with wrapping object but it is a little bit too messy to my taste. 500 lines of C++ 11 later we had our own little wrapper and all was good.

The twist about this project is that I didn’t know anything about Javascript when I got started. I didn’t even know how to write a unit test and what the idioms were. All I knew was the alert function but I suspected it might not be enough.

Fortunately examples on the web are extremely abundant and the node build system was very nice to work with. Of all APIs so far it was the fastest to reach a “builds on all platforms” status.

Captain, we have a leak in a plasma injector!

Because Javascript is garbage collected in short tests leaks will be invisible. It’s perfectly fine for memory to grow and then retract that’s not the sign of a memory issue.

Fortunately our customers are awesome and we got a very precise bug report about a memory leak caused by our addon. Since other addons used by this customer had years of production behind them it was reasonable to assume we were responsible.

Reproducing the bug

The leak could have been caused either by:

  • Not releasing memory allocated by the low-level quasardb API
  • Invalid nodejs/v8 usage

The best was to use an API that didn’t keep alive buffers, for that we used the tagging API that transforms the result into an array of strings instead of a blob get that uses the nodejs Buffer object whose lifetime may exceed the callback.

By looping long enough we should observe whether or not memory grows.

var qdb = require('../build/Debug/qdb.node');
var c = new qdb.Cluster('qdb://127.0.0.1:2836');

function getter(t, i)
{
if (i < 50000)
{
t.getEntries(function(err, entries) {
if (err != 0)
{
console.log("could not get entries");
return;
}

entries = null;

getter(t, i + 1);
});
}
}

c.connect(function()
{
console.log("connected successfully");

// get tags en masse, should not leak
var t = c.tag('dasTag');

getter(t, 0);
},
function(err)
{ // error
console.log("cannot connect: %s", err);
});

Visual Studio gave us the following graph:

For five minutes I thought it was my stocks portfolio and was super happy, then I realized I was looking at the wrong window and became super sad.

In one click I got a list of culprits:

That was odd. When I looked at bindCallback, I had the following code:

bool bindCallback(qdb_request & req)
{
auto callback = _eater.eatCallback();

if (callback.second)
{
req.callback.Reset(v8::Isolate::GetCurrent(), callback.first);
}

return callback.second;
}

Callback being an v8::Persistent<v8::Function> object.

The enclosing qdb_request object was properly destroyed and callback was a member of that object. What was going on?

v8 oddity

That leak was an honest mistake. Remember what I said about object scope? Well that’s it with a twist.

Persistent handles are meant to be kept alive for a time exceeding the scope of a function, which is the case for a callback. To signal that the callback is no longer needed you use the Reset method.

In C++ smart pointer automatically call reset on destruction, but a Persistent handle is not a smart pointer.

I was properly destroying the qdb_request object when the callback was finished, but the destructor of v8::Persistent does not call the Reset method by default, resulting in a leak.

The fix was therefore to add an explicit call to Reset:

~qdb_request(void)
{
callback.Reset();
}

Get me my broker, my stocks underperform!

Wrong windows again, this time it’s good news. Wait. That’s no good news at all!

We can see we leak less but leaking comes in only acceptable quantity and that quantity is zero.

We could see that we were no longer leaking a persistent handle:

RTFM

That time this was just a stupid mistake caused by my misunderstanding of libuv. I didn’t understand that you have to manage the lifetime of a work object and I never deleted the uv_work_t object.

That’s 100% my fault but still I think there is a design issue with the API.

Typical code looks like this:

uv_work_t * work = new uv_work_t();

// init work

uv_queue_work(uv_default_loop(), work, cb1, cb2);

Ownership of the work object is unclear. Who owns the work object? Does the API deletes it after it’s done? Do I have to delete it in my callback?

I understood my mistake when I saw examples using statically allocated objects (which is a very bad idea in asynchronous code by the way).

Nevertheless I think the API could benefit from this change:

uv_work_t * work = uv_allocate_work();

uv_queue_work(uv_default_loop(), work, cb1, cb2);

If libuv creates the work object for you it then has the ability to know if it has been allocated by the API or not (with a canary in the structure for example).
It would also permit optimizations such as memory pools of work items for maximum performance and debug checks to see, for example, if an incorrectly initialized work item is being passed to uv_queue_work.

Users would still be able to pass static allocated or user managed work items and the API would not attempt to release them as they would not have the proper canary.

Fix validation

So how does it look now?

Absolutely no increase after two minutes of runtime, seems that we fixed our memory leaks!

Feel free to share your thoughts and don’t hesitate to give quasardb a try!

Top