I have an application that loops through several thousand data rows, using loops within loops, in order to upgrade the data contained in the database.
Basically it loops through a datatable of documents, returning a datatable of worksheets for each one; it then loops through each of those, returning a datatable of worklayers, which it loops through in turn to get all the objects that require updating.
This one database we're trying to update has 450 documents, which works out to about 1000 or so worklayers, each containing hundreds of shapes to be upgraded. So it's a lot of looping.
Each loop, though, calls the next one in its own function, and each datatable is only ever a local variable, which should go out of scope before the next loop.
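Roughly, the structure is something like the following (a minimal sketch of the nested-loop pattern described above; the method and column names, and the stubbed Get* data-access helpers, are made up for illustration):

```csharp
using System.Data;

class Upgrader
{
    public int ShapesUpgraded; // counts UpgradeShape work done, for illustration

    public void UpgradeDocuments(DataTable documents)
    {
        foreach (DataRow document in documents.Rows)
            UpgradeWorksheets(GetWorksheets(document));
    }

    // Each child datatable is a local in its own method, so it should
    // become collectable once the method returns.
    void UpgradeWorksheets(DataTable worksheets)
    {
        foreach (DataRow worksheet in worksheets.Rows)
            UpgradeWorkLayers(GetWorkLayers(worksheet));
    }

    void UpgradeWorkLayers(DataTable workLayers)
    {
        foreach (DataRow layer in workLayers.Rows)
            UpgradeShapes(GetShapes(layer));
    }

    void UpgradeShapes(DataTable shapes)
    {
        foreach (DataRow shape in shapes.Rows)
            ShapesUpgraded++; // the real app updates the shape here
    }

    // Stub data source: one child row per parent, purely so the sketch compiles.
    static DataTable OneRowTable()
    {
        var t = new DataTable();
        t.Columns.Add("Id", typeof(int));
        t.Rows.Add(1);
        return t;
    }
    DataTable GetWorksheets(DataRow document) => OneRowTable();
    DataTable GetWorkLayers(DataRow worksheet) => OneRowTable();
    DataTable GetShapes(DataRow layer) => OneRowTable();
}
```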
After getting to about 120 or so documents it was using over 1 GB of memory, and it pretty much locked up at 2 GB after about 200 documents. I seem to have cut the memory usage down to almost a quarter of that by using;
after each document loop.
However, it's still not satisfactory. Running the app with Vista's Performance Monitor, it seems that after 100 or so documents over 10,000 classes are loaded, and about the same number of assemblies. I'm thinking this could be a problem, but I don't really have any experience dealing with this kind of thing.
The 10,000 classes, I assume, would be each object that is getting updated, but is there any way I can find out what the loaded assemblies are? I'm not sure whether those classes are being disposed and collected, though I assume they would be; however, it is my understanding that assemblies aren't unloaded until the app closes, which I'm assuming could use a lot of memory.
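One way to see what assemblies are loaded is to dump them from inside the app itself with the standard AppDomain API (a minimal sketch; writing to the console is just one way to inspect the list):

```csharp
using System;

class LoadedAssemblyDump
{
    static void Main()
    {
        // Enumerate every assembly loaded into the current AppDomain.
        // Dynamically generated serialization assemblies show up here too,
        // so a steadily growing list points at runtime code generation.
        foreach (var asm in AppDomain.CurrentDomain.GetAssemblies())
            Console.WriteLine(asm.FullName);
    }
}
```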
Also, using the memory profiler, the total usage only ever gets to about 20 MB and always drops right back down to the live usage of about 8 MB after each GC.
The unmanaged-resources live usage, however, continues to climb for the entire duration the app is running, yet the memory usage the profiler shows is nowhere near as high as what Task Manager shows.
Any help or suggestions would be greatly appreciated.
Do you see a lot of classes when you collect a snapshot with .NET Memory Profiler? If you can locate a class that seems to be dynamically created, investigate that class's type details. Even though it might not have any live instances, you should be able to get some information from the allocation call stacks. Hopefully you will be able to see why the class (and assembly) is created.
There is a memory overhead introduced by the profiler, since it needs to keep track of all instances, assemblies, classes, etc. If the number of classes keeps increasing, the memory usage of the profiler will also increase. The profiler also avoids presenting its own memory usage, and therefore the memory usage presented by the profiler will be lower than the memory usage presented by Vista.
SciTech Software AB
I have updated the app somewhat with the help of our SQL guy. Basically I got him to create a stored proc to do what my loops were doing. So now I just get one dataset with a table with 70,000 rows and cycle through that.
Running the memory profiler gives a rather different result now. For one, it is now showing the unmanaged resources, which is odd because it wasn't before (running 64-bit Vista); I previously had to use a different XP machine to get that data.
Anyway, it no longer seems to be the unmanaged resources that are going out of control, and the value it shows for the managed total bytes is more or less what Task Manager is showing, so this makes more sense now than it did before.
The number one culprit by a long way is System.String, with over a million instances and a live-instance size of 300 MB. And this is only at shape 1,500 of 73,000.
DataRow has 365,000 instances but only 23 MB.
There is quite a lot of XML serialization and deserialization going on, which I have found can cause problems due to the way the .NET Framework dynamically generates serialization assemblies, so I'm working on the workarounds for that.
The app got to about 5,000 or so shapes using over 1 GB of RAM and then stopped responding. So there is definitely still some work to be done.
From http://msdn.microsoft.com/en-us/library ... lizer.aspx
Now we just create an XmlSerializer as a public variable when the application loads and use that one each time.

Dynamically Generated Assemblies
To increase performance, the XML serialization infrastructure dynamically generates assemblies to serialize and deserialize specified types. The infrastructure finds and reuses those assemblies. This behavior occurs only when using the following constructors: XmlSerializer(Type) and XmlSerializer(Type, String).
If you use any of the other constructors, multiple versions of the same assembly are generated and never unloaded, which results in a memory leak and poor performance. The easiest solution is to use one of the previously mentioned two constructors. Otherwise, you must cache the assemblies in a Hashtable, as shown in the following example.
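A sketch of that caching approach (assuming, for illustration, the XmlSerializer(Type, XmlRootAttribute) constructor, which is one of the non-caching ones; the class and key format are made up):

```csharp
using System;
using System.Collections;
using System.Xml.Serialization;

static class SerializerCache
{
    // XmlSerializer(Type) and XmlSerializer(Type, String) cache their
    // generated assemblies internally; the other constructors do not,
    // so we memoize those serializers ourselves to avoid leaking one
    // generated assembly per construction.
    private static readonly Hashtable cache = Hashtable.Synchronized(new Hashtable());

    public static XmlSerializer Get(Type type, XmlRootAttribute root)
    {
        string key = type.FullName + ":" + root.ElementName;
        var serializer = (XmlSerializer)cache[key];
        if (serializer == null)
        {
            // Generated assembly is created once per key, then reused.
            serializer = new XmlSerializer(type, root);
            cache[key] = serializer;
        }
        return serializer;
    }
}
```

Calling SerializerCache.Get with the same type and root element repeatedly then returns the same instance instead of generating a new assembly each time.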