Written by Steffen Müller

Understanding the Problem

The Perl diagnostics list has the following useful bit to say about the warning:

Perl went to decrement the reference count of a scalar to see if it would go to 0, and discovered that it had already gone to 0 earlier, and should have been freed, and in fact, probably was freed. This could indicate that SvREFCNT_dec() was called too many times, or that SvREFCNT_inc() was called too few times, or that the SV was mortalized when it shouldn't have been, or that memory has been corrupted.

Let’s pull that apart, as it references a number of implementation details of Perl. The current implementation of Perl 5 uses a memory management scheme based on reference counting. Every basic value (a scalar, technically a pointer to an SV struct) has a slot that tracks the number of times it is referenced by anything else. If you create a new reference to an SV, you increment this so-called refcount. After giving up your reference to the SV, you need to decrement the counter. As soon as this refcount reaches zero, Perl frees the memory associated with that SV and in turn decrements the refcounts on all other SVs that the soon-to-be-ex-SV holds a reference to. SvREFCNT_inc() and SvREFCNT_dec() are the Perl (C) API macros that do just that.
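To make the bookkeeping concrete, here is a minimal sketch of such a scheme in plain C. It is a toy model and not Perl's implementation: toy_sv, toy_new(), toy_inc(), toy_dec() and the freed flag are made-up names that merely play the roles of an SV, SvREFCNT_inc() and SvREFCNT_dec().

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for an SV (hypothetical, NOT Perl's real struct). */
typedef struct toy_sv {
    unsigned       refcnt;  /* how many references point at this value */
    struct toy_sv *child;   /* one value this one references, or NULL */
    int           *freed;   /* demo hook: set to 1 when the value is destroyed */
} toy_sv;

/* A new value starts out with one reference: the one handed to the caller. */
static toy_sv *toy_new(toy_sv *child, int *freed)
{
    toy_sv *sv = malloc(sizeof *sv);
    if (sv == NULL)
        abort();
    sv->refcnt = 1;
    sv->child  = child;   /* takes over the caller's reference to child */
    sv->freed  = freed;
    *freed     = 0;
    return sv;
}

/* Plays the role of SvREFCNT_inc(): somebody new points at sv. */
static toy_sv *toy_inc(toy_sv *sv)
{
    sv->refcnt++;
    return sv;
}

/* Plays the role of SvREFCNT_dec(): a reference to sv is given up. */
static void toy_dec(toy_sv *sv)
{
    assert(sv->refcnt > 0);  /* decrementing past zero is the bug this article is about */
    if (--sv->refcnt > 0)
        return;
    /* Last reference gone: release what we point at, then ourselves. */
    if (sv->child != NULL)
        toy_dec(sv->child);
    *sv->freed = 1;
    free(sv);
}

/* Returns 1 if everything was destroyed exactly when expected. */
int toy_refcount_demo(void)
{
    int inner_freed, outer_freed;
    toy_sv *inner = toy_new(NULL, &inner_freed);
    toy_sv *outer = toy_new(inner, &outer_freed);  /* outer now owns inner */
    toy_sv *alias = toy_inc(outer);                /* refcnt of outer: 2 */

    toy_dec(outer);                                /* refcnt of outer: 1 */
    if (outer_freed || inner_freed)
        return 0;                                  /* freed too early */

    toy_dec(alias);                                /* 0: frees outer, then inner */
    return outer_freed && inner_freed;
}
```

Calling toy_refcount_demo() from a main() is enough to try it out. Note how dropping the last reference to the outer value cascades into the inner one, which is the "in turn decrements" part described above.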

If you call SvREFCNT_inc() one time too many or call SvREFCNT_dec() one time too few, then the SV and everything it references will leak because they never get destroyed until the global destruction phase of the perl VM. If you do the opposite (too many SvREFCNT_dec() or too few SvREFCNT_inc() calls) then the refcount on an SV drops to zero prematurely and it is freed even though it is still referenced by data structures. Alas, those are left in blissful unawareness of the pending doom by invalid memory access.

The aforementioned warning is generated by Perl when it detects that the refcount on an SV is being decremented and that refcount was already zero. The refcount being zero to begin with, of course, means that it’s actually no longer a valid SV in use by Perl! Since the memory segment where the SV used to live is no longer used to store the original SV (see below for a twist), the memory might have been reused for storing a different SV. If so, then you may not see the warning about the scalar that had the originally bad memory management. Perl knows nothing about your intentions and happily decrements the refcount on the new resident of your favorite slot of memory. Eventually that just means the refcount of the new SV will drop to 0 prematurely as well. Rinse-repeat until you manage to corrupt memory and warn about it before Perl gets a chance to reuse the memory. Fun times.

Astute C programmers will now bring up the memory debugging tool of the day (my favorite is and remains valgrind/memcheck) that ought to deal with this problem quite effectively by pinpointing invalid access to freed memory. I wish it were that easy! The scheme is broken twofold: For one, the above action at a distance with memory reuse already means that perfectly valid code can contain the invalid access. But more importantly, perl’s internal memory management makes this all the more likely to happen. Perl uses slab allocation to avoid going to the OS for each and every SV it creates since many OS malloc implementations are deficient. The parts of the SV structs that hold the refcounts are allocated in such a slab (called arena in the Perl sources) that is typically as large as one page of memory and holds however many items of the same size that fit. Perl uses a list of unused elements in the slab to efficiently "allocate" and "deallocate" SVs. This is great for performance and for avoiding memory fragmentation. But for debugging the above memory problems, it exacerbates the action at a distance through frequent reuse of the slots in those slabs.
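The interplay between a free list and a stray decrement can be sketched with a toy arena. This is a deliberately simplified model and not Perl's actual arena code: slot, arena_init(), slot_new(), slot_dec() and the warnings counter are hypothetical names, and the counter merely stands in for the warning Perl would print.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_SLOTS 4

/* Toy SV head (hypothetical): just a refcount and a free-list link. */
typedef struct slot {
    unsigned     refcnt;
    struct slot *next_free;
} slot;

static slot  arena[ARENA_SLOTS];  /* the "slab" */
static slot *free_list;           /* unused slots, most recently freed first */
static int   warnings;            /* counts "attempt to free unreferenced" events */

void arena_init(void)
{
    free_list = NULL;
    warnings  = 0;
    for (int i = ARENA_SLOTS - 1; i >= 0; i--) {
        arena[i].refcnt    = 0;
        arena[i].next_free = free_list;
        free_list          = &arena[i];
    }
}

/* "Allocate" by popping the head of the free list. */
slot *slot_new(void)
{
    slot *s = free_list;
    assert(s != NULL);            /* the toy arena never grows */
    free_list = s->next_free;
    s->refcnt = 1;
    return s;
}

/* Give up one reference; "deallocate" by pushing onto the free list. */
void slot_dec(slot *s)
{
    if (s->refcnt == 0) {         /* already free: this is where the warning fires */
        warnings++;
        return;
    }
    if (--s->refcnt == 0) {
        s->next_free = free_list;
        free_list    = s;
    }
}

/* One decrement too many, before the slot is reused: returns the warning count. */
int stale_dec_before_reuse(void)
{
    arena_init();
    slot *a = slot_new();
    slot_dec(a);                  /* legitimate: slot goes back to the free list */
    slot_dec(a);                  /* bug: detected, because the refcount is already 0 */
    return warnings;
}

/* Same bug, after the slot was reused: returns 1 if the new owner got hurt silently. */
int stale_dec_after_reuse(void)
{
    arena_init();
    slot *a = slot_new();
    slot_dec(a);                  /* legitimate */
    slot *b = slot_new();         /* the free list hands out the very same slot */
    int same_slot = (a == b);
    slot_dec(a);                  /* bug: no warning, but b's refcount drops to 0 */
    return same_slot && warnings == 0 && b->refcnt == 0;
}
```

In the first scenario the extra decrement is caught on the spot. In the second, the innocent value b pays for it and nothing is reported until some perfectly correct code later releases b.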

If you fall victim to the eponymous warning, a search of the internet will find quite a number of cases of exasperated fellow sufferers asking for help from the experts. Alas, there is no one true debugging recipe that will lead to resolution in all cases and the few specific hints that do exist generally require building a special copy of perl for debugging.

Rigging Your Perl

There are a number of configure-time options, with varying degrees of coverage in the Perl documentation, that help you build a copy of perl that avoids some of the action-at-a-distance problems outlined above. The basic (*nix) recipe for building your own Perl is as follows (assuming you’re in a checkout of the perl git repository or have an unpacked release tarball):

$ sh Configure -des -Dusedevel

$ make

$ make test

The -d -e -s options basically mean "don't ask me any questions and use sane defaults for everything!". The -Dusedevel option just means that if you're building from a git clone, Configure shouldn't whine about building a development version of perl. It's basically the "yes, I really mean it" option to prevent people from deploying unreleased versions of Perl in production. To build and test your perl more quickly on a multi-core machine, you can sprinkle some -j magic:

$ sh Configure -des -Dusedevel

$ TEST_JOBS=5 make -j5 test

That will compile and test with five parallel jobs. If you want to install the new perl into a specific location, then add -Dprefix=/home/you/mydebugperl and run make install. Moving towards a more debugging-enabled perl, for starters, you want to include debugging symbols in your output and possibly disable the C compiler's optimizations, so add -Doptimize="-g3 -O0" to the Configure invocation. This will come in handy for locating problems in the actual perl sources when staring at valgrind output. The 3 after -g makes gcc include macro definitions in the debugging information, so that a debugger can expand macros. Next up is building a perl with its own debugging facilities enabled: Add -DDEBUGGING. Putting it all together so far, we get:

$ sh Configure -des -Dprefix=/home/you/mydebugperl -Dusedevel \
    -Doptimize="-g3 -O0" -DDEBUGGING

All you’ve achieved so far, of course, is obtaining a copy of perl that is massively slower than your production perl (probably at least an order of magnitude) and won’t really help you debug your refcount problem just yet! Having a perl like that handy is your typical stepping stone for perl-core debugging and so far, all of these steps are well-documented elsewhere.

The perlhacktips document explains a number of more intricate options for memory debugging. The PERL_DESTRUCT_LEVEL section is of particular interest. It turns out that perl by default doesn’t bother cleaning up its memory slabs when it’s done. It generally lets the OS do it (I believe because that has less overhead). Setting the environment variable PERL_DESTRUCT_LEVEL during program execution makes perl be more pedantic. That's important to make such issues visible to tools like valgrind in the first place:

$ PERL_DESTRUCT_LEVEL=2 perl your_buggy_program.pl

If you had the opposite problem of Attempt to free..., that is, SVs with too high refcounts, then you would start getting notifications about leaking scalars with this setup. The next steps of getting to the bottom of that involve the -DDEBUG_LEAKING_SCALARS, -DDEBUG_LEAKING_SCALARS_FORK_DUMP, and -DDEBUG_LEAKING_SCALARS_ABORT Configure options. These are mostly documented in perlhacktips.

To make it easier to pick up on the suspected refcount issues, we can piggy-back on an option intended for the Purify tool: -Accflags=-DPURIFY (read: add "-DPURIFY" to the C compiler options). With this C define, perl will avoid using slabs for allocating SVs, which ought to improve your chances of picking up on weird behavior. On top of that, we can ask Perl to overwrite derelict memory areas with a known pattern (0xEF) so that stale data from previously freed SVs doesn't mask errors. Akin to the Purify option, this is a C compiler define, so we now get: -Accflags="-DPURIFY -DPERL_POISON".
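Why poisoning helps is easiest to see in a small sketch. The following is a toy illustration only: toy_body, toy_alloc(), toy_release() and stale_read() are hypothetical names, and the single static slot stands in for memory that has been handed back but not yet reused (so that reading it in the demo stays well-defined C).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define POISON_BYTE 0xEF   /* the fill pattern mentioned above */

/* Toy value body (hypothetical): two 32-bit fields, no padding. */
typedef struct {
    uint32_t refcnt;
    uint32_t payload;
} toy_body;

/* One-slot "pool": keeps the memory valid so the demo can inspect it. */
static toy_body pool_slot;

toy_body *toy_alloc(uint32_t payload)
{
    pool_slot.refcnt  = 1;
    pool_slot.payload = payload;
    return &pool_slot;
}

/* Hand the slot back. With poison != 0 the old contents are wiped first. */
void toy_release(toy_body *b, int poison)
{
    assert(b != NULL);
    if (poison)
        memset(b, POISON_BYTE, sizeof *b);
    /* Without poisoning, the old contents simply linger in the slot. */
}

/* Simulates buggy code that keeps using a value after giving it up. */
uint32_t stale_read(int poison)
{
    toy_body *b = toy_alloc(1234);
    toy_release(b, poison);
    return b->payload;   /* stale access through a dangling pointer */
}
```

Without poisoning, stale_read(0) still returns the perfectly plausible 1234 and the bug goes unnoticed. With poisoning, stale_read(1) returns 0xEFEFEFEF, which is hard to mistake for real data when it shows up in a debugger or a dump.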

Just to put all the tools together, this is how I built my perl in the end (the -Dcc and -Dld settings allow me to use ccache with gcc for faster repeated compilation):

$ sh Configure -Doptimize="-g3 -Wall -Wextra -O2" -DDEBUGGING -Dusedevel \
    -Dprefix=/home/you/mydebugperl -Dcc=ccache\ gcc\ -g3 -Dld=gcc \
    -Uusethreads -de -DPERL_TRACK_MEMPOOL -DDEBUG_LEAKING_SCALARS_FORK_DUMP \
    -DDEBUG_LEAKING_SCALARS -Accflags="-DPURIFY -DPERL_POISON" \
    -DDEBUG_LEAKING_SCALARS_ABORT

My Unreferenced Scalar

A perl equipped with all of the above debugging tools has made my debugging work much easier on many occasions. Usually, a subset of the facilities outlined above is enough. Alas, in this case, only stepping through my code carefully, dumping the addresses of many SVs to locate the one that was being prematurely freed, allowed me to track down the one stray statement that spoiled my day so thoroughly:

SvREFCNT_dec(*fetched_sv);

The code was using a hash access that creates the element it tries to fetch if it doesn’t exist (hv_common's HV_FETCH_LVALUE|HV_FETCH_JUST_SV mode) but falsely assumed an implicit refcount increase as part of that operation. Needless to say, I killed it with fire and then proceeded to perform an intricate victory dance.
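The shape of that mistake can be modeled in a few lines of plain C. This is a toy with hypothetical names (toy_elem, toy_hash, toy_fetch_lvalue(), toy_elem_dec()), not the Sereal code and not Perl's hash API. The only thing it takes from the story above is the ownership rule: the fetch creates the element on demand, the container holds the one and only reference, and the caller merely borrows the pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Toy element (hypothetical): in_use == 0 stands in for "has been freed". */
typedef struct {
    unsigned refcnt;
    int      in_use;
} toy_elem;

/* One-entry "hash" with inline storage, so the demo never touches freed memory. */
typedef struct {
    toy_elem *value;
    toy_elem  storage;
} toy_hash;

/* Fetch the element, creating it if it is missing. The single reference
 * belongs to the hash; the caller only gets a borrowed pointer. */
static toy_elem *toy_fetch_lvalue(toy_hash *h)
{
    if (h->value == NULL) {
        h->storage.refcnt = 1;
        h->storage.in_use = 1;
        h->value = &h->storage;
    }
    return h->value;
}

static void toy_elem_dec(toy_elem *e)
{
    assert(e->refcnt > 0);
    if (--e->refcnt == 0)
        e->in_use = 0;            /* stands in for freeing the element */
}

/* Returns 1 if the hash still holds a live element after the caller is done. */
int hash_entry_alive_after(int buggy)
{
    toy_hash h = { NULL, { 0, 0 } };
    toy_elem *fetched = toy_fetch_lvalue(&h);

    if (buggy)
        toy_elem_dec(fetched);    /* gives up a reference the caller never owned */

    return h.value->in_use;
}
```

With buggy set to 0 the entry survives. With buggy set to 1 the hash is left pointing at a dead element, and the extra decrement only surfaces later, when the hash itself tries to release its entry.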

The actual flaw had only been uncovered by running many, many millions of fuzz tests against our Perl/XS implementation of the Sereal deserialization library to harden it against attacks. But that is a topic for another day.