Boring Intro
Our retail software is built on top of Visual Basic (yuck!), and the main server replicates with 7 other servers that hold sub-databases.
The main server is a quad-core IBM X3400 with 6GB of RAM running Windows Server 2003 Enterprise x64. This is a new system that replaced an old server that crashed a while back. The old server ran on Windows 2000 Server with 4GB of RAM, but the data was stored on a Storage Area Network (SAN).
After moving to the new box, the dude "administrating" it kept restarting it on a daily basis saying that the server was "too slow" and that it didn't have enough memory (!!!). Later on, one of the people at the implementing company said that the server wasn't good.
I could let you live if you say a server that I built is slow, but saying it isn't good ... you just dug your own grave, dude... I sent one angry email to both the administrator and the no-good fella, CCed to my manager, accusing them of meaningless restarts and of claiming, without ANY proof, that my server isn't good and that it's a hardware fault.
Of course, because I was right, neither of them replied, but they sneakily purchased extra RAM (2GB) behind my back, which is another big no-no. They didn't install the RAM and were looking for someone (other than me) to do it for them.
Technical Part
It took me 5 minutes to identify the problem: during replication, or at times when many files are open on the server, Windows caches these open files in RAM (that's the System Cache's job).
According to the Help of Windows Task Manager: The System Cache shows the current physical memory used to map pages of open files.
Of course, Windows being itself, there's no direct way to tweak this properly, and after some searching I found a registry value that can balance the usage of the System Cache and free up some RAM.
Go to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\lanmanserver\parameters and look for a REG_DWORD value named Size, which can have the following settings:
1 = Minimize Memory Used.
2 = Balance.
3 = Maximize Throughput for File Sharing and Maximize Throughput for Network Applications.
The default, as you might have guessed, is 3. I changed it to 2, but it had no effect until a server restart (typical).
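For anyone who would rather script the check than click through regedit, here is a minimal sketch in Python. I made the change by hand, so treat this as an illustration only: the helper names (describe_size, read_size, write_size) are made up for this example, it assumes a Python interpreter is available on the Windows box, and writing the value needs administrator rights. Only the key path, the value name Size and the three settings come from the steps above.

```python
import sys

# Key path and value name as described above (relative to HKEY_LOCAL_MACHINE).
KEY_PATH = r"SYSTEM\CurrentControlSet\Services\lanmanserver\parameters"
VALUE_NAME = "Size"

# The three documented settings for Size.
SIZE_MEANINGS = {
    1: "Minimize Memory Used",
    2: "Balance",
    3: "Maximize Throughput for File Sharing / Network Applications",
}


def describe_size(value):
    """Return a human-readable label for a Size setting (hypothetical helper)."""
    return SIZE_MEANINGS.get(value, "unknown setting ({})".format(value))


def read_size():
    """Read the current Size value. Works on Windows only."""
    import winreg  # standard library, but only present on Windows
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
        value, _value_type = winreg.QueryValueEx(key, VALUE_NAME)
    return value


def write_size(new_value):
    """Set Size to 1, 2 or 3. Needs admin rights; a restart is still required."""
    if new_value not in SIZE_MEANINGS:
        raise ValueError("Size must be 1, 2 or 3")
    import winreg
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH, 0,
                        winreg.KEY_SET_VALUE) as key:
        winreg.SetValueEx(key, VALUE_NAME, 0, winreg.REG_DWORD, new_value)


if __name__ == "__main__" and sys.platform == "win32":
    current = read_size()
    print("Current Size = {} ({})".format(current, describe_size(current)))
    # Uncomment to switch to "Balance", then restart the server:
    # write_size(2)
```

Run as-is, it only reads and prints the current setting; the write is left commented out on purpose so nothing changes by accident.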
I fired off another email saying that I had fixed the problem with a registry key, then went there and took the 2GB of RAM for another server.
Why didn't this happen on the old server? I'm guessing that the SAN's cache was handling it properly and/or Windows 2000 Server was configured from the start to be a database host, which the new 2003 box apparently wasn't.