How to diagnose and fix memory problems

If you have a memory corruption issue and you get a report for it, the report is only an indication that you have a problem. You won't be able to fix the problem from the report alone. Why? Because any such report only records that the problem occurred somewhere else, some time ago. In this respect it is similar to memory leaks, which we have already discussed. Nobody can inspect every CPU instruction and ask: "is this instruction going to corrupt my memory?" That's why all checks are performed from time to time, at certain checkpoints. Besides, only specific data can be validated automatically. Take the memory leak case: the checkpoints are calls to the memory manager's routines, and the verified data are its internal structures and freed memory. But even in that simplest case the memory manager does not scan the entire memory pool on each request; it limits the check to the single memory block in question. This is the usual trade-off between speed and functionality.
So, having a report, you know that there is a problem, but you don't know where it is. You have a chance to locate it in the case of memory leaks, but not in the case of memory corruption. That's because a leak report contains references to the relevant code, while the code references in a corruption report are unrelated to the cause. The real culprit can sit a million instructions away, in space and time, from the code that crashed because of it, and nothing points back to it. That's why the very first thing you should do (whether you have a report or just a crash or hang) is to try to reproduce the problem. Sometimes this is easy; sometimes it is possible but hard; and often it is not possible at all.
If you've managed to reproduce the problem, then you have a very simple case. Just debug your application as much as you want. The most useful tools here are memory breakpoints. The general strategy is simple: find a moment when the memory is already allocated but not yet corrupted. Place a breakpoint on the memory (yes, Delphi's debugger can do it; we won't discuss it here, so please refer to other resources or to Delphi's help) and run your application. As soon as this breakpoint fires, you have found the culprit of the memory corruption. Make yourself at home and take your time: analyze the call stack, variables, and so on. The situation is under your control.
In short: the main task here is to locate the problem (assuming you can reproduce it at all). Below we discuss different methods that you can use to locate it. Some of them can always be used, in both debug and release builds. Others are only applicable to debug builds.
If you aren't able to solve the problem (either you can't reproduce it, or you can reproduce it but can't locate it), then the only option is to use passive methods. These are things that aren't aimed at your particular issue, but rather help you improve your code. After the improvements you may be able to diagnose the problem, or the problem may go away without you doing anything specific about it. For example, if your code is a chaotic mix of calls to totally unrelated routines without the slightest sign of logic, you can spend half a year looking for the cause (and still not find it). Or you can spend a few months refactoring and improving your code, and then hunt down and fix not only this problem but also other issues that you spot because your code has become much clearer.
Locating the problem (active methods)

First of all, you should analyze what your problem can be. There are two main cases here: dynamic memory (the heap) or the stack. Depending on the answer, you use methods for the heap or for the stack. For example, a debugging memory manager can help you with memory corruption in the heap, but it can do nothing about stack corruption. If you aren't sure which case you have, just use all the methods.
1. Using a debugging memory manager (heap). A debugging memory manager is any memory manager that provides additional features for debugging memory problems. Searching for memory leaks and searching for memory corruption bugs use a very similar approach. In EurekaLog, these checks are enabled on the memory problems page. Other options may affect the results too, but the options on that page are the primary ones for memory corruption checks. Just enable the additional options and run your application until the debugging memory manager catches a problem.
Please note that EurekaLog uses lightweight methods, which are fast and can be used on end-user machines. However, these methods may not be enough for local debugging. In that case you should use specialized heavy-duty debugging tools such as FastMM, SafeMM, AQTime, etc. They will slow down your application a lot, so you cannot use them on end-user machines, but you can stress-test your application locally, on a developer's machine.
2. Enabling debugging options (stack and heap). We mentioned this before too. The main option here is "Range check errors", which lets you catch out-of-range errors in array-based data structures (note that this option has a bug in old Delphi versions). Besides this option, you may want to disable inlining and optimization, both to simplify debugging and to rule out bugs introduced by these features. Unfortunately, the Delphi compiler does not have a more generic option for checking the stack's state, as some other compilers do.
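As a minimal sketch of what range checking buys you (the program and variable names are made up for illustration), the console program below writes one element past the end of a local array. With {$R+} the bad index should raise ERangeError right at the faulty statement; with {$R-} the same write would silently overwrite whatever sits next to the array on the stack.

```pascal
program RangeCheckDemo;
{$APPTYPE CONSOLE}
{$R+}  // "Range check errors" on: array indexes are verified at run time
{$O-}  // optimization off, to keep debugging straightforward

uses
  SysUtils;  // System.SysUtils in Delphi XE2 and later

var
  Buf: array[0..3] of Integer;
  I: Integer;
begin
  try
    // Off-by-one bug: the last iteration uses index 4, which is out of range.
    for I := 0 to 4 do
      Buf[I] := I;
    Writeln('No error detected (range checking is probably off)');
  except
    on E: ERangeError do
      Writeln('Caught at the faulty statement: ', E.ClassName);
  end;
end.
```

The same directives can also be set per project in the compiler options instead of in the source.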
3. Forced checkpoints (stack and heap). As already mentioned, any report about a memory problem describes only the moment of detection, not the problem itself. You must locate the problem. But how can you do it? Obviously, you need to find a point when the problem has not occurred yet (memory is not corrupted) and a point when memory is already corrupted. The problem then sits somewhere between those two points. Each of these moments is a checkpoint. By moving (or creating) checkpoints, you can shrink the area of code containing the problem until you locate it. Sometimes checkpoints are created automatically. For example, a debugging memory manager validates a memory block each time one of its routines is called for this block. For the stack, a checkpoint can be the exit from a routine: if you successfully leave the routine, then the return address wasn't damaged, so there was no stack corruption (at least not of that type). If the automatically created checkpoints aren't sufficient to locate the problem (or they aren't created at all), then you need to create them manually.
For the heap, we have an option to force a check manually. With EurekaLog, you can call the CheckHeap routine. Calling it forces the memory manager to scan the entire memory pool for corruption (obviously, only the consistency of internal information and headers can be validated, not the data inside memory blocks). By putting calls to this function around your code, you create explicit checkpoints. Start by calling it periodically. Once you have found a problem between two calls, move them closer to each other until you locate the problem.
You can also switch to SafeMM for even more debug control.
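The checkpoint idea can be sketched as follows. This is only an illustration under stated assumptions: HeapCheck is a hypothetical hook that you would point at your memory manager's full-heap check (for example, a parameterless wrapper around CheckHeap; its unit and exact signature are not shown here, so check your tool's documentation), and LoadData, ProcessData and SaveData are placeholders for your own suspect code.

```pascal
program CheckpointDemo;
{$APPTYPE CONSOLE}

type
  // Hypothetical hook type: a parameterless routine that validates the heap.
  THeapCheckProc = procedure;

var
  // Assumption: assign your own wrapper around the memory manager's check.
  HeapCheck: THeapCheckProc = nil;

// Explicit checkpoint: logs the tag, then forces a heap validation.
// The last tag logged before the corruption report marks the end of
// the interval that contains the culprit.
procedure Checkpoint(const Tag: string);
begin
  Writeln('checkpoint: ', Tag);
  if Assigned(HeapCheck) then
    HeapCheck;
end;

// Placeholders for the code under suspicion.
procedure LoadData;
begin
end;

procedure ProcessData;
begin
end;

procedure SaveData;
begin
end;

begin
  Checkpoint('start');
  LoadData;
  Checkpoint('after LoadData');
  ProcessData;
  Checkpoint('after ProcessData');
  SaveData;
  Checkpoint('after SaveData');
end.
```

If the check first fails at 'after ProcessData', the damage happened inside ProcessData; add checkpoints inside that routine and repeat until the interval is small enough to inspect by hand.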
4. Debug checks (stack and heap). It's not always possible to use or set checkpoints as discussed in the previous item. For example, nobody else can check the consistency of your own data: automated tools can check only their own information, not yours. That's why you may need to validate your data manually. It's simple: just place as many checks as you want around your code. Put Assert calls everywhere. Check everything that you are able to check. Once you have found a problem between two Assert calls, move them closer to each other, just like in the checkpoint case. As soon as you have narrowed the gap enough to obtain the address of the corrupted memory, you're done. Just run your application until the moment before the problem and put a memory breakpoint on this address (see also below).
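A minimal sketch of such manual validation is shown below. The TOrder record, its guard fields and the GuardValue constant are invented for this example; the idea is simply to surround your own data with known values and to Assert both the guards and the data's invariants before and after suspect code.

```pascal
program AssertDemo;
{$APPTYPE CONSOLE}
{$C+}  // assertions enabled (the default); with {$C-} Assert calls are removed

uses
  SysUtils;  // turns a failed Assert into an EAssertionFailed exception

const
  GuardValue = $5AFEC0DE;  // arbitrary, easy to recognize in a memory dump

type
  TOrder = record
    HeadGuard: Integer;              // known value before the payload
    Count: Integer;
    Items: array[0..7] of Integer;
    TailGuard: Integer;              // known value after the payload
  end;

procedure InitOrder(var O: TOrder);
begin
  FillChar(O, SizeOf(O), 0);
  O.HeadGuard := GuardValue;
  O.TailGuard := GuardValue;
end;

// Validates everything we know about a TOrder; Where names the call site.
procedure CheckOrder(const O: TOrder; const Where: string);
begin
  Assert(O.HeadGuard = GuardValue, 'Head guard damaged ' + Where);
  Assert(O.TailGuard = GuardValue, 'Tail guard damaged ' + Where);
  Assert((O.Count >= 0) and (O.Count <= Length(O.Items)),
    'Count out of bounds ' + Where);
end;

var
  Order: TOrder;
begin
  InitOrder(Order);
  CheckOrder(Order, 'after init');
  // ... suspect code that works with Order goes here ...
  CheckOrder(Order, 'after suspect code');
  Writeln('Order is consistent');
end.
```

When a guard check fails, the address of the damaged field (for example @Order.TailGuard) is exactly the address to watch with a memory breakpoint on the next run.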
5. Avoiding local variables (stack). Since we don't have many tools for the stack, you can move the problem elsewhere by avoiding local variables: try using global variables (just for the test, of course) or, better yet, put all local variables into a record that you allocate dynamically on the heap. This moves the problem to another area, where we have some handy tools (your favorite debugging memory manager).
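A sketch of this transformation, with made-up names (Process, TProcessVars): the routine's former locals are gathered into a record allocated with New, so an overrun of Buffer now damages a heap block that the debugging memory manager can inspect, instead of the stack frame.

```pascal
program HeapLocalsDemo;
{$APPTYPE CONSOLE}

type
  // The former local variables of Process, gathered into one record.
  PProcessVars = ^TProcessVars;
  TProcessVars = record
    Buffer: array[0..63] of Byte;
    Total: Integer;
  end;

function Process: Integer;
var
  V: PProcessVars;  // only this pointer stays on the stack
  I: Integer;       // a for-loop counter must remain a simple local variable
begin
  New(V);           // the data now lives in a heap block
  try
    FillChar(V^, SizeOf(V^), 0);
    for I := Low(V^.Buffer) to High(V^.Buffer) do
    begin
      V^.Buffer[I] := I;
      Inc(V^.Total, V^.Buffer[I]);
    end;
    Result := V^.Total;
  finally
    Dispose(V);     // the memory manager gets a chance to validate the block
  end;
end;

begin
  Writeln(Process);
end.
```

This is a temporary diagnostic change; once the bug is found, the variables can go back to being ordinary locals.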
6. Problems with threads (heap). Multithreading usually does not affect stacks, but it can be the cause of many hard-to-detect problems with global or heap data (not multithreading by itself, but rather synchronization errors). Debugging multithreaded applications is a large and complex topic, so it won't be discussed here; please see other resources.
7. Memory breakpoints (stack and heap). If you have found the specific address of the memory that was corrupted, things become much easier. All you need to do now is use memory breakpoints. Memory breakpoints are a handy feature of Delphi's debugger that lets you put a breakpoint on memory, just like you do for code. A memory breakpoint triggers when some code accesses the watched memory. Use Delphi's help to learn the details of how to use them.
So, you have a memory address. Run your application until the moment when this memory becomes available (allocated). It should be in a valid state at this moment. Place a memory breakpoint on it and let your application run. When the breakpoint fires, check the code that triggered it. You'll find the culprit eventually.
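To get an address to watch in the first place, it can help to log it at the moment the memory is allocated and still valid. The sketch below uses an invented TSession class; it only prints the address and size of one field, and setting the actual memory breakpoint on that address is done in the debugger as described in Delphi's help.

```pascal
program AddressDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  // Hypothetical object whose UserId field gets overwritten later on.
  TSession = class
  public
    UserId: Integer;
  end;

var
  Session: TSession;
begin
  Session := TSession.Create;
  try
    Session.UserId := 42;
    // The memory is allocated and valid here: stop on a normal code
    // breakpoint at this line and add a memory breakpoint for the address.
    Writeln(Format('UserId is stored at %p, %d bytes',
      [@Session.UserId, SizeOf(Session.UserId)]));
    // ... the rest of the program, including the suspect code, runs here ...
  finally
    Session.Free;
  end;
end.
```

Keep in mind that the address is only meaningful for the current run; after a restart the block may be allocated elsewhere, so the address has to be obtained again.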
If you weren't able to solve your problem with the methods above, then the only thing left is:
Prevention of problems with memory (passive methods)

1. Avoid low-level code. It's simple: scan all your code, looking for calls to low-level routines (which aren't type-safe and therefore have a high chance of corrupting memory). Double-check every use. Replace low-level code with a high-level counterpart wherever you can. It's better to do it slowly but safely and correctly than to do it fast but incorrectly.
2. Check that your code is Unicode-ready. The most common error is confusing length and size, i.e. the size of a buffer in characters and the size of a buffer in bytes.
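The length/size distinction can be illustrated with a small sketch (the names are made up). Length counts characters, SizeOf counts bytes, and routines such as Move and FillChar expect bytes. On a Unicode Delphi, where SizeOf(Char) is 2, passing Length(S) to Move copies only half of the string; the opposite mistake, passing a byte count where a character count is expected, overruns the buffer.

```pascal
program LengthVsSizeDemo;
{$APPTYPE CONSOLE}

var
  S: string;
  Buf: array[0..15] of Char;
begin
  S := 'Hello';
  FillChar(Buf, SizeOf(Buf), 0);    // SizeOf(Buf) is the size in bytes

  // Capacity check in characters; keeps one slot for the terminating #0.
  Assert(Length(S) < Length(Buf));

  // Wrong on Unicode Delphi: Move(S[1], Buf, Length(S)) copies too few bytes.
  // Right: convert the character count into a byte count.
  Move(S[1], Buf, Length(S) * SizeOf(Char));

  Writeln('Characters: ', Length(S),
    ', bytes: ', Length(S) * SizeOf(Char));
  Writeln(PChar(@Buf[0]));
end.
```

Writing the byte count as Length(S) * SizeOf(Char) keeps the code correct regardless of the character size the compiler uses.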
3. Use wrappers. Move all API code into a separate unit or class that you can validate as a single entity. By keeping suspicious or potentially troublesome code in one place, you reduce the search area and simplify the rest of the code.
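As a sketch of such a wrapper (the unit and function names are invented, and the example is Windows-only), the unit below hides the buffer handling of the GetModuleFileName API behind a function that returns a plain string. Callers never see the raw buffer, so the only place where a size mistake can happen is this one small routine.

```pascal
unit SafeApi;

interface

// Returns the full path of the current module as an ordinary string.
function GetCurrentModuleFileName: string;

implementation

uses
  Windows, SysUtils;  // Winapi.Windows, System.SysUtils in XE2 and later

function GetCurrentModuleFileName: string;
var
  Buf: array[0..MAX_PATH] of Char;
  Len: DWORD;
begin
  // The API expects the buffer size in characters, hence Length, not SizeOf.
  Len := GetModuleFileName(HInstance, Buf, Length(Buf));
  if Len = 0 then
    RaiseLastOSError;
  // A result that fills the whole buffer means the path was truncated.
  if Len >= DWORD(Length(Buf)) then
    raise Exception.Create('Module path does not fit into the buffer');
  SetString(Result, Buf, Integer(Len));
end;

end.
```

The rest of the application calls GetCurrentModuleFileName and works only with type-safe strings, which also makes this unit a natural target for focused review and testing.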
4. Code review by another developer. It's a well-known fact that your eyes see only what your brain wants to see. That's why it's a good idea to give your code to a colleague: sometimes they can immediately spot a problem that you couldn't solve for hours or days.
Actually, this section could go on forever. There are many books that tell you how to write good-quality code, and they do so in more detail than we can here. That's why we won't list anything further and will just give you one piece of advice: read "smart" books. Consider the text above only as a short example. You can improve yourself and your code by reading books and blogs. Many problems will become easier to spot, or may disappear altogether.