A while back it was time to replace an old Windows NT server. This server had a couple of functions, but the one I cared about was that it held over 500,000 (yep, half a million) Word documents. These documents were saved by the client side of an application I was supporting. The server side ran on HP-UX, and I thought it would be best to consolidate all the data on the one server, thus ridding myself of having to rely on a Microsoft server for the application to run.
No sweat, thought I, I'll just put a Samba share up and move all the files to that. Piece of cake. Well, I transferred the files and gave it a whirl. It took my HP-UX box about five minutes to serve up one file, and during that five minutes the CPU was pegged at 100%. Not good, since the old NT box could do the open in about one second without working up a sweat.
I then realised that I had walked into one of the problems of layering CIFS on top of a system that does not natively support all of its concepts. In my case, the Samba code was being killed by looping through the file list to find all the possible file matches for each request. Since CIFS is case-insensitive, Samba has to do case-insensitive matches. Samba does this well, but when you want one arbitrary file out of 500,000, you have a lot of computing to do.
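To make the cost concrete, here is a minimal Python sketch of a case-insensitive lookup on a case-sensitive filesystem. It is not Samba's code; the function name `find_case_insensitive` is made up for illustration. The point is only that, without an index, every request has to walk the directory listing until a name matches.

```python
import os


def find_case_insensitive(directory, wanted):
    """Scan `directory` for an entry equal to `wanted`, ignoring case.

    Returns (on_disk_name, entries_scanned); on_disk_name is None when
    nothing matches. The work grows with the number of directory entries.
    """
    target = wanted.lower()
    scanned = 0
    for entry in os.listdir(directory):
        scanned += 1
        if entry.lower() == target:
            return entry, scanned
    return None, scanned
```

A miss is the worst case: all 500,000 names get compared before the function can answer "not found".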
Recently Andrew Tridgell gave an interview about this very problem with Samba and what he is planning to put in 4.0 to try and solve some of these issues.
I solved my problem by hacking the Samba code. I did two things, one per patch, as described below.
I've corresponded with people who would like to move CAD files to a Samba share and have run into the same problem, so I'm putting my code out for people to look at. The patches are for 2.2.5. If I had had more time, I would have made the list of directories a per-share parameter. I might get to that if I get any requests for it.

Not doing the case-insensitive 8.3 stuff works for most cases, as long as your Windows client does not store a file as 12345678.DOC and then reopen it as 12345678.doc. In my case, I controlled the client, so this was not a problem.

Patch1 touches the trans2 code and makes Samba use the native filesystem code to look for an exact match, returning it if found. If the subdirectory is "form_sav", it abandons the search when no exact match is found. Patch2 changes the dir.c code to return no matches when browsing the subdirectory "form_sav". Change form_sav to whatever you want it to be.
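For readers who want the idea without reading the patches, the logic can be sketched in Python roughly as follows. This is an illustration, not the patch itself (the real changes are C code inside Samba). The names `EXACT_ONLY_DIRS`, `resolve`, and `list_for_browsing` are hypothetical, and the sketch assumes the underlying filesystem is case-sensitive.

```python
import os

# Hypothetical stand-in for the directory name hard-coded in the patches.
EXACT_ONLY_DIRS = {"form_sav"}


def _is_exact_only(directory, exact_only_dirs):
    # Compare only the last path component, e.g. ".../form_sav".
    return os.path.basename(os.path.normpath(directory)) in exact_only_dirs


def resolve(directory, wanted, exact_only_dirs=EXACT_ONLY_DIRS):
    """Return the on-disk name for `wanted`, or None if it cannot be found."""
    # Step 1: let the native filesystem look for an exact match (cheap).
    if os.path.lexists(os.path.join(directory, wanted)):
        return wanted
    # Step 2: in the listed directories, give up instead of scanning.
    if _is_exact_only(directory, exact_only_dirs):
        return None
    # Step 3: elsewhere, fall back to the slow case-insensitive scan.
    target = wanted.lower()
    for entry in os.listdir(directory):
        if entry.lower() == target:
            return entry
    return None


def list_for_browsing(directory, exact_only_dirs=EXACT_ONLY_DIRS):
    """Return nothing when browsing a listed directory, a full listing otherwise."""
    if _is_exact_only(directory, exact_only_dirs):
        return []
    return sorted(os.listdir(directory))
```

The trade-off is the one described above: inside a listed directory, a client that asks for a name in the wrong case simply gets "not found", so this only suits setups where the client always uses the stored name.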
If you have similar needs for Samba, email me (firstname.lastname@example.org) and I would be glad to see what I can whip up for your particular case.