
Tony Rowan for AppSignal

Originally published at blog.appsignal.com

A Deep Dive into Memory Leaks in Ruby

In the first part of this two-part series on memory leaks, we looked at how Ruby manages memory and how Garbage Collection (GC) works.

You might be able to afford powerful machines with more memory, and your app might restart often enough that your users don't notice, but memory usage matters.

Allocation and Garbage Collection aren't free. If you have a leak, you spend more and more time on Garbage Collection instead of doing what you built your app to do.

In this post, we'll look deeper into the tools you can use to discover and diagnose a memory leak.

Let's continue!

Finding Leaks in Ruby

Detecting a leak is simple enough. You can use GC, ObjectSpace, and the RSS graphs in your APM tool to watch your memory usage increase. But just knowing you have a leak is not enough to fix it. You need to know where it is coming from. Raw numbers can't tell you that.
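
To make those raw numbers concrete, here is a minimal sketch that uses only Ruby's built-in GC and ObjectSpace modules. The `memory_snapshot` helper and the `$leaked` global are hypothetical names chosen for this illustration, not part of any library.

```ruby
# Hypothetical helper: collect a few raw counters from the Ruby VM.
def memory_snapshot
  GC.start # collect garbage first so the counters reflect live objects
  {
    live_slots: GC.stat(:heap_live_slots),
    live_strings: ObjectSpace.count_objects[:T_STRING]
  }
end

$leaked = [] # a global keeps everything pushed into it alive

before = memory_snapshot
10_000.times { $leaked << "A" + "B" + "C" }
after = memory_snapshot

puts "Live slots:   #{before[:live_slots]} -> #{after[:live_slots]}"
puts "Live strings: #{before[:live_strings]} -> #{after[:live_strings]}"
```

The counters go up, but nothing in them says which line of code is responsible.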

Fortunately, the Ruby ecosystem has some great tools to attach context to those numbers. Two are memory_profiler and derailed_benchmarks.

memory_profiler in Ruby

The memory_profiler gem offers a very simple API and a detailed (albeit a little overwhelming) allocated and retained memory report that includes the classes of objects that are allocated, their size, and where they were allocated. It's straightforward to add to our leaky program.

    # leaky.rb

    require "memory_profiler"

    an_array = []

    report = MemoryProfiler.report do
      11.times do
        1000.times { an_array << "A" + "B" + "C" }
        puts "Array is #{an_array.size} items long"
      end

      GC.start
    end

    report.pretty_print

Running this outputs a report that looks similar to the following.

    Total allocated: 440072 bytes (11001 objects)
    Total retained:  440072 bytes (11001 objects)

    allocated memory by gem
    -----------------------------------
        440072  other

    allocated memory by file
    -----------------------------------
        440072  ./leaky.rb

    allocated memory by location
    -----------------------------------
        440000  ./leaky.rb:9
            72  ./leaky.rb:10

    allocated memory by class
    -----------------------------------
        440000  String
            72  Thread::Mutex

    allocated objects by gem
    -----------------------------------
         11001  other

    allocated objects by file
    -----------------------------------
         11001  ./leaky.rb

    allocated objects by location
    -----------------------------------
         11000  ./leaky.rb:9
             1  ./leaky.rb:10

    allocated objects by class
    -----------------------------------
         11000  String
             1  Thread::Mutex

    retained memory by gem
    -----------------------------------
        440072  other

    retained memory by file
    -----------------------------------
        440072  ./leaky.rb

    retained memory by location
    -----------------------------------
        440000  ./leaky.rb:9
            72  ./leaky.rb:10

    retained memory by class
    -----------------------------------
        440000  String
            72  Thread::Mutex

    retained objects by gem
    -----------------------------------
         11001  other

    retained objects by file
    -----------------------------------
         11001  ./leaky.rb

    retained objects by location
    -----------------------------------
         11000  ./leaky.rb:9
             1  ./leaky.rb:10

    retained objects by class
    -----------------------------------
         11000  String
             1  Thread::Mutex

    Allocated String Report
    -----------------------------------
         11000  "ABC"
         11000  ./leaky.rb:9

    Retained String Report
    -----------------------------------
         11000  "ABC"
         11000  ./leaky.rb:9

There is a lot of information here, but generally, the allocated objects by location and retained objects by location sections can be the most useful when looking for leaks. These are the file locations that allocate objects, ordered by the number of allocated objects.

  • allocated objects are all objects allocated (created) within the report block.
  • retained objects are objects that have not been garbage collected by the end of the report block. We forced a GC run before the end of the block so we could see the leaked objects more clearly.
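
To get a feel for what a 'by location' section represents, here is a minimal sketch built on the standard library's objspace extension rather than on memory_profiler itself. It records where each object was allocated and groups the objects we still hold by that location. The variable names are illustrative only.

```ruby
require "objspace"

retained = []
by_location = nil

# Record the file and line of every allocation made inside this block.
ObjectSpace.trace_object_allocations do
  1000.times { retained << "A" + "B" + "C" }

  # Group the strings we still hold by the place that allocated them.
  by_location = retained.group_by do |string|
    file = ObjectSpace.allocation_sourcefile(string)
    line = ObjectSpace.allocation_sourceline(string)
    "#{file}:#{line}"
  end
end

by_location.each do |location, strings|
  puts "#{strings.size} retained strings allocated at #{location}"
end
```

Every retained string traces back to the same line, which is the kind of signal the retained objects by location section surfaces.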

Be careful about trusting the retained object counts. They depend heavily on what portion of the leaking code is within the report block.

For example, if we move the declaration of an_array into the report block, we might be fooled into thinking the code isn't leaky.

    # leaky.rb

    require "memory_profiler"

    report = MemoryProfiler.report do
      an_array = []

      11.times do
        1000.times { an_array << "A" + "B" + "C" }
        puts "Array is #{an_array.size} items long"
      end

      GC.start
    end

    report.pretty_print

The top of the resulting report won't report many retained objects (just the report itself).

    Total allocated: 529784 bytes (11002 objects)
    Total retained:  72 bytes (1 objects)

derailed_benchmarks in Ruby

The derailed_benchmarks gem is a suite of very useful tools for all kinds of performance work, primarily aimed at Rails apps. For finding leaks, we want to look at perf:mem_over_time, perf:objects, and perf:heap_diff.

These tasks work by sending curl requests to a running app, so we can't add them to our little leaky program. Instead, we'll need to set up a small Rails app with an endpoint that leaks memory, then install derailed_benchmarks on that app.

    # Create a rails app with no database
    rails new leaky --skip-active-record --minimal

    ## Add derailed benchmarks
    cd leaky
    bundle add derailed_benchmarks

    # config/routes.rb
    Rails.application.routes.draw do
      root "leaks#index"
    end

    # app/controllers/leaks_controller.rb
    class LeaksController < ApplicationController
      def index
        1000.times { $an_array << "A" + "B" + "C" }
        render plain: "Array is #{$an_array.size} items long"
      end
    end

    # config/initializers/array.rb
    $an_array = []

You should now be able to boot the app with bin/rails s. You'll be able to curl an endpoint that leaks on each request.

    $ curl http://localhost:3000
    Array is 1000 items long
    $ curl http://localhost:3000
    Array is 2000 items long

We can now use derailed_benchmarks to see our leak in action.

perf:mem_over_time

This will show us memory use over time (similarly to how we watched the memory growth of our leaky script with watch and ps).

Derailed will boot the app in production mode, repeatedly hit an endpoint (/ by default), and report the memory usage. If it never stops growing, we have a leak!

    $ TEST_COUNT=10000 DERAILED_SKIP_ACTIVE_RECORD=true \
      bundle exec derailed exec perf:mem_over_time
    Booting: production
    Endpoint: "/"
    PID: 4417
    104.33984375
    300.609375
    455.578125
    642.69140625
    751.6953125

Note: Derailed will boot the Rails app in production mode to perform the tests. By default, it will also require rails/all first. Since we don't have a database in this app, we need to override this behavior with DERAILED_SKIP_ACTIVE_RECORD=true.

We can run this benchmark against different endpoints to see which ones (if any) leak.
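
The idea behind this kind of check can be sketched in a few lines of plain Ruby. This is not how Derailed is implemented. It is a rough illustration that assumes a Unix-like system with a `ps` command, and `rss_mib` and `always_growing?` are hypothetical helpers named for this example.

```ruby
# Resident set size of a process in MiB, read via `ps` (assumes a Unix-like system).
# Returns nil when `ps` is unavailable or prints nothing usable.
def rss_mib(pid = Process.pid)
  kib = `ps -o rss= -p #{pid}`.to_i
  kib.positive? ? kib / 1024.0 : nil
rescue SystemCallError
  nil
end

# Hypothetical heuristic: memory that rises on every sample suggests a leak.
def always_growing?(samples)
  samples.each_cons(2).all? { |earlier, later| later > earlier }
end

$an_array = []

samples = 5.times.map do
  100_000.times { $an_array << "A" + "B" + "C" } # stand-in for hitting a leaky endpoint
  rss_mib
end.compact

puts samples.map { |mib| mib.round(2) }.join("\n")
puts "Looks like a leak!" if samples.size > 1 && always_growing?(samples)
```

A real app's memory is noisier than this, so treat steady growth over many requests as the signal rather than any single pair of samples.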

perf:objects

The perf:objects task uses memory_profiler under the hood so the produced report will look familiar.

    $ TEST_COUNT=10 DERAILED_SKIP_ACTIVE_RECORD=true \
      bundle exec derailed exec perf:objects
    Booting: production
    Endpoint: "/"
    Running 10 times
    Total allocated: 2413560 bytes (55476 objects)
    Total retained:  400000 bytes (10000 objects)

    # The rest of the report...

This report can help narrow down where your leaked memory is being allocated. In our example, the report's last section, the Retained String Report, tells us exactly what our problem is.

    Retained String Report
    -----------------------------------
         10000  "ABC"
         10000  /Users/tonyrowan/playground/leaky/app/controllers/leaks_controller.rb:3

We've leaked 10,000 strings containing "ABC" from the LeaksController on line 3. In a non-trivial app, this report would be significantly larger and contain strings that you want to retain (query caches, etc.), but this and the other 'by location' sections should help you narrow down your leak.

perf:heap_diff

The perf:heap_diff benchmark can help if the report from perf:objects is too complex to see where your leak is coming from.

As the name suggests, perf:heap_diff produces three heap dumps and calculates the difference between them. It creates a report that includes the types of objects retained between dumps and the location that allocated them.

    $ DERAILED_SKIP_ACTIVE_RECORD=true bundle exec derailed exec perf:heap_diff
    Booting: production
    Endpoint: "/"
    Running 1000 times
    Heap file generated: "tmp/2022-06-15T11:08:28+01:00-heap-0.ndjson"
    Running 1000 times
    Heap file generated: "tmp/2022-06-15T11:08:28+01:00-heap-1.ndjson"
    Running 1000 times
    Heap file generated: "tmp/2022-06-15T11:08:28+01:00-heap-2.ndjson"

    Diff
    ====

    Retained STRING 999991 objects of size 39999640/40008500 (in bytes) at: /Users/tonyrowan/playground/leaky/app/controllers/leaks_controller.rb:3
    Retained STRING 2 objects of size 148/40008500 (in bytes) at: /Users/tonyrowan/.asdf/installs/ruby/3.1.2/lib/ruby/gems/3.1.0/gems/derailed_benchmarks-2.1.1/lib/derailed_benchmarks/tasks.rb:265
    Retained STRING 1 objects of size 88/40008500 (in bytes) at: /Users/tonyrowan/.asdf/installs/ruby/3.1.2/lib/ruby/gems/3.1.0/gems/derailed_benchmarks-2.1.1/lib/derailed_benchmarks/tasks.rb:266
    Retained DATA 1 objects of size 72/40008500 (in bytes) at: /Users/tonyrowan/.asdf/installs/ruby/3.1.2/lib/ruby/3.1.0/objspace.rb:87
    Retained IMEMO 1 objects of size 40/40008500 (in bytes) at: /Users/tonyrowan/.asdf/installs/ruby/3.1.2/lib/ruby/3.1.0/objspace.rb:88
    Retained IMEMO 1 objects of size 40/40008500 (in bytes) at: /Users/tonyrowan/.asdf/installs/ruby/3.1.2/lib/ruby/gems/3.1.0/gems/derailed_benchmarks-2.1.1/lib/derailed_benchmarks/tasks.rb:259
    Retained IMEMO 1 objects of size 40/40008500 (in bytes) at: /Users/tonyrowan/.asdf/installs/ruby/3.1.2/lib/ruby/gems/3.1.0/gems/derailed_benchmarks-2.1.1/lib/derailed_benchmarks/tasks.rb:260
    Retained FILE 1 objects of size 8432/40008500 (in bytes) at: /Users/tonyrowan/.asdf/installs/ruby/3.1.2/lib/ruby/gems/3.1.0/gems/derailed_benchmarks-2.1.1/lib/derailed_benchmarks/tasks.rb:266

    Run `$ heapy --help` for more options

You can also read Tracking a Ruby memory leak in 2021 to understand better what's going on.

The report points us exactly where we need to go for our leaky baby app. At the top of the diff, we see 999,991 retained string objects allocated from the LeaksController on line 3.
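
To see roughly what a heap diff involves, here is a simplified sketch that uses the standard library's `ObjectSpace.dump_all`. It compares two dumps instead of three and is not the code Derailed runs. The `dump_heap`, `parse_object`, and `heap_diff` helpers and the file names are assumptions made for this example.

```ruby
require "objspace"
require "json"
require "set"
require "tmpdir"

# Write every live object to an ndjson file and return the file's path.
def dump_heap(name)
  path = File.join(Dir.tmpdir, "#{name}-#{Process.pid}.ndjson")
  GC.start
  File.open(path, "w") { |file| ObjectSpace.dump_all(output: file) }
  path
end

# Parse one dump line, skipping anything that is not valid JSON.
def parse_object(line)
  JSON.parse(line)
rescue JSON::ParserError
  nil
end

# Count objects present in the newer dump but not the older one, by allocation site.
def heap_diff(older_path, newer_path)
  known = Set.new
  File.foreach(older_path) do |line|
    object = parse_object(line)
    known << object["address"] if object && object["address"]
  end

  counts = Hash.new(0)
  File.foreach(newer_path) do |line|
    object = parse_object(line)
    next if object.nil? || object["address"].nil? || known.include?(object["address"])

    counts["#{object['type']} at #{object['file']}:#{object['line']}"] += 1
  end
  counts.sort_by { |_site, count| -count }
end

ObjectSpace.trace_object_allocations_start # record file and line for new objects
$an_array = []

before = dump_heap("heap-0")
1000.times { $an_array << "A" + "B" + "C" } # the simulated leak
after = dump_heap("heap-1")
ObjectSpace.trace_object_allocations_stop

diff = heap_diff(before, after)
diff.first(3).each { |site, count| puts "Retained #{count} objects: #{site}" }
File.delete(before, after)
```

The allocation site with the most new objects should be the line that pushes strings into the global array, mirroring the top entry of the diff above.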

Leaks in Real Ruby and Rails Apps

Hopefully, the examples we've used so far have never been put into real-life apps — I hope no one intends to leak memory!

In non-trivial apps, memory leaks can be much harder to track down. Retained objects are not always bad — a cache with garbage collected items would not be of much use.

All leaks have something in common, though. Somewhere, a root-level object (a class, a global, etc.) holds a reference, directly or indirectly, to the leaked objects.

One common example is a cache without a limit or an eviction policy. By definition, this will leak memory since every object put into the cache will remain forever. Over time, this cache will occupy more and more of the memory of an app, with a smaller and smaller percentage of it actually in use.

Consider the following code that fetches a high score for a game. It's similar to something I've seen in the past. This is an expensive request, and we can easily bust the cache when it changes, so we want to cache it.

    class Score < ApplicationRecord
      # Returns the user's high score for a game, memoized in a class-level hash.
      def self.user_high_score(game, user)
        @scores ||= {}

        if (score = @scores["#{game.id}:#{user.id}"])
          score
        else
          # Sort descending so that `first` is the highest score.
          Score.where(game: game, user: user).order(score: :desc).first.tap do |high_score|
            @scores["#{game.id}:#{user.id}"] = high_score
          end
        end
      end

      # Saves a new score and refreshes the cached high score if it was beaten.
      def self.save_score(game, user, raw_score)
        score = create!(game: game, user: user, score: raw_score)

        if raw_score > user_high_score(game, user).score
          @scores["#{game.id}:#{user.id}"] = score
        end
      end
    end

The @scores hash is completely unchecked. It will grow to hold every single high score for every user, which is not ideal if you have a lot of either.

In a Rails app, we would probably want to use Rails.cache with a sensible expiry (a memory leak in Redis is still a memory leak!) instead.

In a non-Rails app, we want to limit the hash size, evicting the oldest or least recently used items. LruRedux is a nice implementation.
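
To show what limiting the hash size means in practice, here is a minimal sketch of a least-recently-used cache in plain Ruby. It is not LruRedux's API. `BoundedCache` is a hypothetical class written for this illustration, and it relies on Ruby hashes preserving insertion order.

```ruby
# A small least-recently-used cache built on Hash insertion order.
class BoundedCache
  def initialize(max_size)
    @max_size = max_size
    @store = {}
  end

  # Return the cached value, computing and storing it on a miss.
  def fetch(key)
    if @store.key?(key)
      value = @store.delete(key) # remove so re-inserting marks it most recent
    else
      value = yield
      @store.shift if @store.size >= @max_size # evict the least recently used entry
    end
    @store[key] = value
  end

  def size
    @store.size
  end

  def key?(key)
    @store.key?(key)
  end
end

cache = BoundedCache.new(2)
cache.fetch("game1:user1") { 100 }
cache.fetch("game1:user2") { 250 }
cache.fetch("game1:user1") { 0 }   # hit: returns 100 and refreshes the entry
cache.fetch("game1:user3") { 75 }  # evicts "game1:user2", the least recently used

puts cache.size # => 2
```

Because the cache can never hold more than `max_size` entries, its memory use stays bounded however many users or games there are. This sketch is not thread-safe, so a real app would want a lock or an existing library.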

A more subtle version of this leak is a cache with a limit, but whose keys are of arbitrary size. If the keys themselves grow, so too will the cache. Usually, you won't hit this. But, if you're serializing objects as JSON and using that as a key, double-check that you're not serializing things that grow with usage as well — such as a list of a user's read messages.

Circular References

Circular references can be garbage collected. Garbage Collection in Ruby uses the "Mark and Sweep" algorithm. During their presentation introducing variable width allocation, Peter Zhu and Matt Valentine-House gave an excellent explanation of how this algorithm works.

Essentially, there are two phases: marking and sweeping.

  • In the marking phase, the garbage collector starts at root objects (classes, globals, etc.), marks them, and then looks at their referenced objects. It then marks all of the referenced objects. Referenced objects that are already marked are not looked at again. This continues until there are no more objects to look at, i.e., all referenced objects have been marked.
  • The garbage collector then moves on to the sweeping phase. Any object not marked is cleaned up.

Therefore, objects that still hold references to one another can be cleaned up. As long as no root object references an object, directly or through a chain of other objects, it will be collected. In this way, clusters of objects with circular references can still be garbage collected.
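
The two phases can be sketched as a toy simulation. This is an illustration of the idea only, not how Ruby's garbage collector is implemented. `Node`, `mark`, and `sweep` are hypothetical names for this example.

```ruby
# A toy object that only knows which other objects it references.
class Node
  attr_reader :name, :references

  def initialize(name)
    @name = name
    @references = []
  end
end

# Marking: walk outwards from the roots, visiting each object once.
def mark(roots)
  marked = {}.compare_by_identity
  pending = roots.dup
  until pending.empty?
    node = pending.pop
    next if marked[node] # already marked, so not looked at again

    marked[node] = true
    pending.concat(node.references)
  end
  marked
end

# Sweeping: anything that was not marked is garbage.
def sweep(heap, marked)
  heap.reject { |node| marked[node] }
end

root = Node.new("root")
kept = Node.new("kept")
a = Node.new("a")
b = Node.new("b")

root.references << kept
kept.references << root # a cycle that is reachable from the root
a.references << b
b.references << a       # a cycle that nothing else points to

heap = [root, kept, a, b]
garbage = sweep(heap, mark([root]))

puts garbage.map(&:name).inspect # => ["a", "b"]
```

Both pairs form cycles, but only the pair that no root can reach is swept.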

Application Performance Monitoring: The Event Timeline and Allocated Objects Graph

As mentioned in the first part of this series, any production-level app should use some form of Application Performance Monitoring (APM).

Many options are available, including rolling your own (only recommended for larger teams). One key feature you should get from an APM is the ability to see the number of allocations an action (or background job) makes. Good APM tools will break this down, giving insight into where allocations come from — the controller, the view, etc.

This is often called something like an 'event timeline.' Bonus points if your APM allows you to write custom code that further breaks down the timeline.
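
As a crude stand-in for the per-event allocation counts an APM reports, the sketch below measures how many objects a block allocates using `GC.stat`. The `count_allocations` helper and the `$leak` global are hypothetical names for this example, and a real APM does considerably more than this.

```ruby
# Hypothetical helper: how many objects did the block allocate?
def count_allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

$leak = []

create = count_allocations { 1000.times { $leak << Object.new } }
fetch = count_allocations { $leak.sample(100) }

puts "create_leaks allocated #{create} objects"
puts "fetch_leaks allocated #{fetch} objects"
```

Splitting a request into named chunks like this makes it clear which part is responsible for most of the allocations.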

Consider the following code for a Rails controller.

    class LeaksController < ApplicationController
      before_action :leak

      def index
        @leaks = $leak.sample(100)
      end

      private

      def leak
        1000.times { $leak << Leak.new }
      end
    end

When reported by an APM, the 'event timeline' might look something like the following screenshot from AppSignal.

[Screenshot: Bare Event Timeline]

This can be instrumented so we can see which part of the code makes the allocations in the timeline. In real apps, it is probably going to be less obvious from the code 😅

    class LeaksController < ApplicationController
      before_action :leak

      def index
        Appsignal.instrument('leak.fetch_leaks') do
          @leaks = $leak.sample(100)
        end
      end

      private

      def leak
        return unless params[:leak]

        Appsignal.instrument('leak.create_leaks') do
          1000.times { $leak << Leak.new }
        end
      end
    end

Here's an example of an instrumented event timeline, again from AppSignal:

[Screenshot: Instrumented Event Timeline]

Knowing where to instrument can often be difficult to grasp. There's no substitute for really understanding your application's code, but there are some signals that can serve as 'smells'.

If your APM surfaces GC runs or allocations over time, you can look for spikes to see if they match up with certain endpoints being hit or certain running background jobs. Here's another example from AppSignal's Ruby VM magic dashboard:

[Screenshot: Allocations]

By looking at allocations in this way, we can narrow down our search when looking into memory problems. This makes it much easier to use tools like memory_profiler and derailed_benchmarks efficiently.

Read about the latest additions to AppSignal's Ruby gem, like allocation and GC stats tracking.

Wrapping Up

In this post, we dived into some tools that can help find and fix memory leaks, including memory_profiler, derailed_benchmarks (with its perf:mem_over_time, perf:objects, and perf:heap_diff tasks), and the event timeline and allocated objects graph in AppSignal.

I hope you've found this post, alongside part one, useful in diagnosing and sorting out memory leaks in your Ruby app.

Read more about the tools we used:

Additional detailed reading:

Happy coding!

P.S. If you'd like to read Ruby Magic posts as soon as they get off the press, subscribe to our Ruby Magic newsletter and never miss a single post!
