Fragmentation Analysis

ngm
Posts: 10
Joined: Thu Sep 26, 2013 4:09 pm

Fragmentation Analysis

Post by ngm » Mon Oct 07, 2013 3:17 pm

Hello Andreas,

I've been getting familiar with the profiler and it seems pretty capable to me.

I've got a question regarding memory fragmentation: what are the best techniques to analyze it?

What I've figured out so far is to watch the real-time monitor while taking Gen #0 snapshots and comparing Total bytes and Live bytes.

I also took a look at the native memory's overhead node.

As far as our scenario in particular goes, we have large byte arrays that go to the LOH. Our WinForms 32-bit CLR 2.0 process runs out of memory even though we haven't spotted any memory leaks so far. It sometimes goes OOM at around 600 MB, but sometimes as low as around 300 MB, which points me towards fragmentation.

As a side note, when the process has a commit size of, say, 600 MB, it usually sits like that for a while. If I take a full snapshot, it instantly drops below 200 MB.

It seems like the GC is not aggressive enough. Would it be better to switch to the server GC? The process is running on Windows Server 2008 R2 with several cores.
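For reference, this is the switch I mean; as far as I know, in a standalone .NET 2.0 process the server GC can only be enabled through app.config:

    <!-- app.config: opt the process into the server GC -->
    <configuration>
      <runtime>
        <gcServer enabled="true"/>
      </runtime>
    </configuration>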

Thank you a million!

- ngm

ngm
Posts: 10
Joined: Thu Sep 26, 2013 4:09 pm

Re: Fragmentation Analysis

Post by ngm » Tue Oct 08, 2013 12:21 am

Just to put some numbers from the profiler into perspective:

Total bytes: 22 MB
Live bytes: 16 MB

Gen #0 GCs: 628
Gen #1 GCs: 531
Gen #2 GCs: 179

Physical Memory
Total: 453 MB (562 MB with profiler data)
Normal heap: 11 MB

Large heap: 753 KB
Overhead/unused: 23 MB
Unreachable instances: 278 MB

Committed Memory
Total: 721 MB (833 MB with profiler data)
Normal heap: 11 MB
Large heap: 753 KB
Overhead/unused: 25 MB
Unreachable instances: 350 MB

Total managed heaps usage: 76% of private committed memory
Managed heaps utilization: 1%
Available virtual memory: 1,026 MB
Largest allocatable block: 240 MB

Large heap usage: 73% of private committed memory
Large heap utilization: 0%
Large heap fragments: 35
Wasted large heap memory (small gaps): 4 KB (0% of large heap memory)

Thanks,

- ngm

Andreas Suurkuusk
Posts: 1028
Joined: Wed Mar 02, 2005 7:53 pm

Re: Fragmentation Analysis

Post by Andreas Suurkuusk » Tue Oct 08, 2013 4:11 pm

The numbers you presented in your second post do not indicate any significant memory fragmentation. They do, however, show that you have a lot of unreachable instances, especially in the large object heap. The .NET runtime analyzes the allocation pattern of the process and adjusts the garbage collection to minimize the performance impact and memory overhead.

In your case, I assume that you have allocated a set of large instances (a few hundred MB), and then allocated a lot of small instances. This could, for instance, happen when you open a large document (which creates a set of large instances) and then work with the document (which may create many short-lived, UI-related instances). When you work with the document, all newly created instances are GCed during a gen #0 collection, but none of the larger instances would be collected even during a full GC, and thus the garbage collector learns that it's very efficient to perform frequent gen #0 collections and very few full collections. When the document is closed, all large instances become unreachable, but no full GC will be triggered to collect them. If you expect to recreate a new set of large instances, this should not be a problem. The runtime will perform a full GC when running low on memory, and the memory will be reused. In some cases it might also be justified to call GC.Collect to reduce the memory usage (after releasing a large amount of long-lived instances).
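To illustrate the last point, here is a minimal sketch (the class and field names are placeholders, not code from your application):

    using System;

    // Hypothetical sketch: after dropping the references to a large,
    // long-lived set of instances, an explicit full GC reclaims the
    // memory immediately instead of waiting for the runtime.
    class DocumentHost
    {
        private object _currentDocument;      // holds the large instances

        public void CloseDocument()
        {
            _currentDocument = null;          // large instances become unreachable

            GC.Collect();                     // full, blocking collection
            GC.WaitForPendingFinalizers();    // let finalizers release resources
            GC.Collect();                     // collect objects freed by finalizers
        }
    }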

If you allocate very large objects in a 32-bit process (e.g. > 100 MB), you don't need very much memory fragmentation to get an OOM exception, even though 300 MB of committed memory seems very low for an OOM. In your example, you have 35 fragments in the large object heap, so this will limit the size of your large objects; but, still, the largest allocatable block is 240 MB.

Switching to the server GC will most likely not help, as the memory overhead is often higher when the server GC is used. Instead, I recommend that you try to optimize your memory allocation pattern. If possible:
  • Try to avoid allocating differently sized large objects.
  • Split the large objects into smaller "chunks", e.g. using some paging algorithm (see the sketch below).
  • Clean up your "old" large objects (i.e. remove the references) before creating a new set of large objects.
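As an illustration only (a minimal sketch, not code from the profiler; the names and the 64 KB chunk size are my assumptions), a large buffer can be split into fixed-size segments so that no single allocation reaches the ~85,000-byte LOH threshold:

    using System;
    using System.Collections.Generic;

    // Hypothetical sketch: stores data in 64 KB segments, so every
    // allocation stays well below the LOH threshold (~85,000 bytes).
    class ChunkedBuffer
    {
        private const int ChunkSize = 64 * 1024;
        private readonly List<byte[]> _chunks = new List<byte[]>();

        public ChunkedBuffer(long totalSize)
        {
            for (long offset = 0; offset < totalSize; offset += ChunkSize)
            {
                int size = (int)Math.Min(ChunkSize, totalSize - offset);
                _chunks.Add(new byte[size]);
            }
        }

        // Index into the buffer as if it were one contiguous array.
        public byte this[long index]
        {
            get { return _chunks[(int)(index / ChunkSize)][(int)(index % ChunkSize)]; }
            set { _chunks[(int)(index / ChunkSize)][(int)(index % ChunkSize)] = value; }
        }
    }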
Best regards,

Andreas Suurkuusk
SciTech Software AB

ngm
Posts: 10
Joined: Thu Sep 26, 2013 4:09 pm

Re: Fragmentation Analysis

Post by ngm » Tue Oct 08, 2013 5:50 pm

Andreas,

Thank you so much for the detailed explanation.

Andreas Suurkuusk wrote:
When the document is closed, all large instances become unreachable, but no full GC will be triggered to collect them. If you expect to recreate a new set of large instances, this should not be a problem. The runtime will perform a full GC when running low on memory, and the memory will be reused. In some cases it might also be justified to call GC.Collect to reduce the memory usage (after releasing a large amount of long-lived instances).
Reusing memory when it's running low is exactly what I expected, but it is not happening. Even though all large instances are unreachable, i.e. eligible for collection, the GC doesn't kick in on the next large allocation; instead an OOM is thrown, bringing the process down.

That is what brought me to the fragmentation conclusion in the first place.

My allocation pattern is that the process is usually very idle; every 15 minutes there's a request which allocates 1 to 100 byte arrays, totalling 5 MB to 250 MB, in a very tight loop, assigning the arrays to fields. The profiler didn't spot any reachable instances of those arrays in the snapshots. The only reachable ones appeared when I took the snapshot while the OOM was being thrown, or while the request was being dispatched, which I think shouldn't count at all.
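Roughly, the pattern looks like this (a simplified sketch; the names and sizes are made up):

    // Each request replaces the previous arrays; the old ones become
    // unreachable but stay on the LOH until a full GC runs.
    class RequestHandler
    {
        private byte[][] _buffers;                // fields holding the large arrays

        public void HandleRequest(int count, int arraySize)
        {
            var buffers = new byte[count][];
            for (int i = 0; i < count; i++)
                buffers[i] = new byte[arraySize]; // >= 85,000 bytes goes to the LOH
            _buffers = buffers;                   // the previous set becomes garbage
        }
    }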

What about memory pressure?
Is it possible that the GC doesn't do a full collection because the pattern doesn't require it, but once the pressure is high (allocations/sec vs. available memory) it kicks in, yet cannot keep up with that pressure? Especially considering it's the workstation GC in concurrent mode, where a Gen #2 (LOH) collection can be performed without pausing. In other words, is it possible that there's enough unreachable memory (which would become free after a collection) and the largest allocatable block fits the allocation, but an OOM is still thrown?
Does this reasoning make sense?

Thanks once again.

- ngm

Andreas Suurkuusk
Posts: 1028
Joined: Wed Mar 02, 2005 7:53 pm

Re: Fragmentation Analysis

Post by Andreas Suurkuusk » Tue Oct 08, 2013 6:42 pm

The garbage collector should not be affected by the number of allocations per second. If there's no available memory, the runtime should, as far as I know, perform a full GC, even when a lot of allocations take place in several threads.

As I understand it, you are reproducing the memory problem while running the process under the profiler. Is that correct? When running under the profiler, the concurrent GC is not enabled, so that should not affect this problem.

How did you collect a snapshot at the time of the OOM exception? Did you run under the debugger or did you attach to the process? Were you able to see how big the instance being allocated was when the OOM error occurred?

Maybe you can try to enable peak snapshot collection and see if you get some better information.
Best regards,

Andreas Suurkuusk
SciTech Software AB

ngm
Posts: 10
Joined: Thu Sep 26, 2013 4:09 pm

Re: Fragmentation Analysis

Post by ngm » Tue Oct 08, 2013 8:56 pm

I'm using nmpcore.exe, since the application with the memory issues is running on a client's site.

Can I enable peak snapshot collection with it?

I tried with the following arguments:

nmpcore.exe /r /peaksnapshot+ /sf "C:\MemSessions\DumpSession.prfsession" /p "exepath"

But I don't see a peak snapshot in the saved session.

- ngm

ngm
Posts: 10
Joined: Thu Sep 26, 2013 4:09 pm

Re: Fragmentation Analysis

Post by ngm » Wed Oct 09, 2013 4:54 am

The peak snapshot is a really nice feature indeed. If I could make nmpcore.exe take a peak snapshot, that would be even better.

As a side note, I have noticed that when my application is opened for the first time (without serving any requests where LOH byte arrays are involved), it has a breakdown like this:

Total bytes: 3 MB
Live bytes: 3 MB

Private
Managed heaps: 6 MB
Normal heap: 4 MB
Large heap: 1 MB

Available virtual memory: 1.5 GB
Largest allocatable block: 612 MB

Large heap usage: 2% of private committed memory
Large heap utilization: 49%
Large heap fragments: 2
Wasted large heap memory (small gaps): 0 KB (0% of large heap memory)
[Attachment: Start.png (Start Snapshot)]
All my snapshots are Gen #0.

Once it starts processing requests, Total bytes and Live bytes start to diverge: typically up to 25 MB total vs. as low as 5 MB live for the smaller byte arrays, and up to 250 MB total vs. as low as 30 MB live for the larger ones. The largest allocatable block is then usually around 250 MB.
[Attachment: RequestsGraph.png (Requests Graph)]
[Attachment: AfterSingleRequest.png (After single large request)]

ngm
Posts: 10
Joined: Thu Sep 26, 2013 4:09 pm

Re: Fragmentation Analysis

Post by ngm » Wed Oct 09, 2013 5:07 am

Gen #2 GC kicks in aggressively on the next request, with 2 to 10 collections, ultimately lowering memory sometimes to as low as 5 MB of both total and live bytes, but only a 438 MB largest allocatable block is left. Even if I take a Gen #1 snapshot, which brings me back to 3 MB live and 3 MB total, the largest allocatable block is still 438 MB and not the initial 612 MB.
[Attachment: AfterRequests.png (After several small and large requests)]
[Attachment: StartVsRequests.png (Start application vs. after several requests)]

Interestingly enough, in a different session from another box where the application is running, I found that although snapshots 7 and 8 have very similar memory usage, their largest allocatable block differs by over 430 MB:
[Attachment: Snapshot7vsSnapshot8.png (Snapshot 7 vs. Snapshot 8)]
- ngm

Andreas Suurkuusk
Posts: 1028
Joined: Wed Mar 02, 2005 7:53 pm

Re: Fragmentation Analysis

Post by Andreas Suurkuusk » Wed Oct 09, 2013 8:28 pm

The peak snapshot feature does not work correctly in the current version of NmpCore, but we have now corrected this. Thanks for pointing out this error. We plan to publish a maintenance release of .NET Memory Profiler 4.6 tomorrow, and it will include the fix for peak snapshots in NmpCore.

The address space of a 32-bit process is very limited when you try to allocate large objects. A 600 MB block can easily be fragmented by a lot of things, such as loaded libraries or minor allocations. How the address space is used can differ between machines, depending on factors such as the version of the .NET runtime, other libraries, and possibly drivers.

As I mentioned previously, I would recommend that you try to allocate smaller memory blocks if possible. In earlier versions of the profiler, we allocated large blocks of memory in the profiled process. This could cause OOM exceptions even if only 1 GB of memory was committed. We redesigned the parts of the profiler that created large blocks and tried to keep the largest block smaller than 1 MB. After this redesign, we usually don't see an out-of-memory error until close to 2 GB is committed (in a process with a 2 GB address space).

Since you provided the "/r" argument to NmpCore, I assume that you are profiling a pre-.NET 4.0 process. If possible, I would recommend that you run under .NET 4.0 or later, as I believe that Microsoft has improved the large object heap in later versions. .NET Framework 4.5.1 also includes an option to compact the large object heap, which might help with your problem.
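For reference, under .NET 4.5.1 the one-time LOH compaction can be requested like this (a minimal sketch):

    using System;
    using System.Runtime;

    // Ask the next blocking full GC to also compact the large object
    // heap; the setting automatically resets after that collection.
    GCSettings.LargeObjectHeapCompactionMode =
        GCLargeObjectHeapCompactionMode.CompactOnce;
    GC.Collect();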
Best regards,

Andreas Suurkuusk
SciTech Software AB

ngm
Posts: 10
Joined: Thu Sep 26, 2013 4:09 pm

Re: Fragmentation Analysis

Post by ngm » Wed Oct 16, 2013 6:14 pm

Andreas,

Thank you for looking into it.

I used the profiler for the last 7 days of my trial and it was beyond helpful. I tried Red Gate's profiler as well; while it's got a nice UI, it's far behind yours in terms of in-depth memory analysis.

I knew it was a capable tool the moment the Wintellect guys recommended it :)

Anyway, I just bought a license for myself. I'll be advocating it to my employer as well.

Thank you once again for this jewel of a tool.

- ngm

Andreas Suurkuusk
Posts: 1028
Joined: Wed Mar 02, 2005 7:53 pm

Re: Fragmentation Analysis

Post by Andreas Suurkuusk » Thu Oct 17, 2013 9:36 pm

I'm glad that you like .NET Memory Profiler and find it helpful. If you need any additional support, don't hesitate to contact us using this forum or by sending an e-mail to support@scitech.se.
Best regards,

Andreas Suurkuusk
SciTech Software AB

