Creating Small Win32 Executables - Fast Builds
This paper focuses on creating small executables (*.exe) and dynamically linked libraries (*.dll), and on keeping project build times short, using Microsoft Visual C++ 6.0.

  Table of Contents
Preface
Creating Fast Builds
Creating Small Applications
    Getting Started
    Compiler Options
    Linker Options
    Executables
    Dynamically Linked Libraries
Limitations and Functions
    Buffer-Manipulation Routines
References and Links

  Preface
This paper assumes that the reader owns a copy of Microsoft Visual C++ and knows how to create regular applications and dynamically linked libraries.

The main idea behind making your application smaller is to link to as few libraries as possible. Libraries like MFC are obviously not going to make your application smaller. Some libraries are bigger than others. Library stubs created for external DLLs are almost harmless, while other libraries such as CLIB actually add code to your final application. The less code an application contains, the smaller it will be.

This paper assumes the project settings are currently set to the default Microsoft Visual C++ Release configuration.

An application compiled with the default Release configuration set to Minimize Size came to 57,344 bytes. After tweaking the configuration settings as listed in this paper, the same application compiled to 21,504 bytes. This was accomplished without any third-party utility.

NOTE

The resulting executable or dynamically linked library will be smaller than a standard Release build but will in most cases be slower. If you seek fast executables, use the standard Release build with the Maximize Speed compiler option.


  Creating Fast Builds
When working on large projects it may be tedious to recompile the entire project every time you make a change in the main header file. The solution to this is to use a Precompiled Header. With a Precompiled Header you can have one header file that is included in all files, but that is compiled only once and reused throughout the project.

To create a precompiled header you will first need a main header file (e.g. main.h) and a blank source file (e.g. pch.cpp). In the header file include the main Windows headers and other headers that will be, for the most part, global to your application. Include this file in every source file you create. Then in the new source file do nothing but include the main header file. After you have done so your main header file should look something like this:

//**********************************************************************
// File: main.h					Last Modified:	03/01/01
//						Modified by:	PM
//	
// Purpose:      The main header file used within the project.
//
// Developed By: Piotr Mintus 2001
//**********************************************************************

#ifndef _MAIN_H_
#define _MAIN_H_

#include <windows.h>
#include <multimon.h>
#include <crtdbg.h>

#endif  // _MAIN_H_
Your new source file should look something like this:

//**********************************************************************
// File: pch.cpp				Last Modified:	03/01/01
//						Modified by:	PM
//	
// Purpose:	 Pre-compiled header
//
// Developed By: Piotr Mintus 2001
//**********************************************************************

#define COMPILE_MULTIMON_STUBS
#define WIN32_LEAN_AND_MEAN		// exclude some windows headers

#include "main.h"

Add these files to your project. Select Project -> Settings from the main menu, click on the C/C++ tab, and choose Precompiled Headers from the Category combo box. Highlight the project in the left-hand tree view. Select Automatic use of precompiled headers and in the Through header edit box write the name of the main header. Then select the new source file from the tree view and choose Create precompiled header file (.pch). In the Through header edit box write the name of the main header file.

Another way to shorten compile time is to limit the number of headers you include, so the compiler has less to parse every time it compiles your files. Defining WIN32_LEAN_AND_MEAN before including windows.h excludes a few header files that windows.h would otherwise include. This only improves compile time and should not affect the size of your final application.

  Creating Small Applications

WARNING

The following examples remove the default entry point function when all default libraries are removed. The default entry point handles the initialization of the C run-time library (calling the _CRT_INIT function), and executes C++ constructors for static objects. It will now be up to the application to perform these tasks.

Getting Started

To get started on creating small applications a new configuration should be added to the main project to not break the current working Debug and Release configurations. To create a new configuration select Configurations... from the Build menu. Click the Add button. The Add Project Configuration dialog will be displayed. In the Configuration edit box type the new name of the configuration. In the Copy Settings from combo box select the Win32 Release configuration.

After adding the new configuration, it must be selected as the current configuration. To do so select Set Active Configuration... from the Build menu. The Set Active Project Configuration dialog will be displayed. Select the new configuration and click OK.

Now that the configuration is set we may alter the compiler and linker options without worrying about breaking the project.


Compiler Options

To make this configuration identifiable to the preprocessor it is recommended that you add a global preprocessor definition. To do so add the following option in the Project Options under C/C++ settings:
/D _TINYCONFIG
One of the options to change here is to set the optimizations to Minimize Size. To do so change the following option in the Project Options under C/C++ settings:
/O1
The only other option to consider is the structure member alignment. The default alignment is 8 bytes. Changing it to 1 byte packs the structure members tightly in memory, although access to members that are no longer naturally aligned will be slower. To change the structure member alignment add the following line to the Project Options under C/C++ settings, where n is 1, 2, 4, 8, or 16:
/Zp[n]
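
The effect of the alignment setting can be seen on a single structure. The following is a minimal sketch, assuming a compiler that supports #pragma pack (Visual C++ does); it applies 1-byte packing to one structure only, which is what /Zp1 does for every structure in the project. The structure and helper names are hypothetical and exist only for illustration.

```c
#include <stddef.h>

/* Default packing: the compiler pads 'value' to its natural alignment. */
struct DefaultPacked {
    char tag;
    int  value;
};

/* 1-byte packing: the same layout rule as /Zp1, limited to this struct. */
#pragma pack(push, 1)
struct TightPacked {
    char tag;
    int  value;
};
#pragma pack(pop)

/* Hypothetical helpers that expose the two sizes for inspection. */
static size_t DefaultPackedSize(void) { return sizeof(struct DefaultPacked); }
static size_t TightPackedSize(void)   { return sizeof(struct TightPacked); }
```

The packed structure occupies exactly sizeof(char) + sizeof(int) bytes, while the default layout is at least as large because of the padding inserted after tag.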


Linker Options

The main objective is to get rid of every library we do not need, and that means CLIB. Without CLIB you cannot use the standard C library, so the code will no longer be portable across operating systems. To remove all default libraries, select Project -> Settings from the main menu, click on the Link tab, and check the Ignore all default libraries check box. Because this removes all standard C functions, we need alternatives from the Win32 API; luckily they exist (see the function list below). Finally, remove any library that you do not need from the Object/library modules edit box.

To disable the use of all default libraries add the following line to the Project Options under linker settings:
/NODEFAULTLIB
Incremental linking must also be disabled by adding the following line:
/INCREMENTAL:NO
Microsoft Visual C++ 6.0 uses a default file alignment of 4KB rather than 512 bytes. Reducing it to 512 bytes shrinks the application significantly, although a file alignment of 4KB loads faster on Windows 98 and reduces file swapping. To change the alignment to 512 bytes add the following line to the Project Options under linker settings:
/OPT:NOWIN98
An alternative is to set the alignment explicitly with /ALIGN, which controls the alignment of each section in the program's address space:
/ALIGN:512
This parameter, however, causes the following linker warning:
LINK : warning LNK4108: /ALIGN specified without /DRIVER or /VXD; image may not run
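
To see why the alignment has such an effect on file size, recall that each section is padded up to the next multiple of the alignment. The following is a rough sketch of that arithmetic; the helper names and the section sizes used in the note below are hypothetical, and headers are ignored.

```c
/* Round 'size' up to the next multiple of 'alignment' (a power of two). */
static unsigned long AlignUp(unsigned long size, unsigned long alignment)
{
    return (size + alignment - 1) & ~(alignment - 1);
}

/* Total raw size of 'count' sections once each is padded to 'alignment'. */
static unsigned long PaddedTotal(const unsigned long *sizes, int count,
                                 unsigned long alignment)
{
    unsigned long total = 0;
    int i;

    for (i = 0; i < count; i++)
        total += AlignUp(sizes[i], alignment);
    return total;
}
```

For three sections of 1200, 300, and 40 bytes, a 4KB alignment pads the total to 12,288 bytes, while a 512-byte alignment pads it to only 2,560 bytes.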
Merging sections results in a smaller overall application. To view the names of the sections in your application use the DUMPBIN tool with the /SUMMARY parameter. For instance, you can merge the read-only data (.rdata) section into the code (.text) section with /MERGE:.rdata=.text. To merge sections add the following line to the Project Options under linker settings:
/MERGE:[from]=[to]
If you changed the default name of the entry point you will need to tell the linker the new name of the entry point. Add the following line to the Project Options under linker settings:
/ENTRY:[function name]


Executables

When we remove the default libraries we also remove a hidden function that Microsoft Visual C++ provides. This function serves as the entry point to an executable. The function that Microsoft Visual C++ uses as the entry point is named WinMainCRTStartup. To make life easier this article will use the same name for the entry point. If you wish to change the entry point function name see Linker Options above.

The application will have to supply its own entry point that calls WinMain. The entry point prototype is as follows:
int WINAPI WinMainCRTStartup(void);
Unlike WinMain this function does not take any parameters, so it is up to the application to provide the information when calling WinMain. The four parameters that WinMain takes are: HINSTANCE hInstance, HINSTANCE hPrevInstance, LPSTR lpCmdLine, int nCmdShow. All of these can easily be retrieved through the standard Win32 API. The hInstance parameter can be retrieved by calling GetModuleHandle. The hPrevInstance parameter is ignored in Win32. The lpCmdLine parameter can be retrieved by calling GetCommandLine; note that GetCommandLine returns the full command line including the program name, whereas the default startup code passes WinMain only the arguments that follow it. The nCmdShow parameter is a bit more complex, but all it requires is one call to GetStartupInfo. The main file of your application should now look something like the following:
int WINAPI WinMain(HINSTANCE hInstance,HINSTANCE hPrevInstance,LPSTR szCmdLine,int iCmdShow)
{
	...
	return 0;
}

#ifdef _TINYCONFIG
int WINAPI WinMainCRTStartup(void)
{
	STARTUPINFO				startInfo={sizeof(STARTUPINFO),0};

	GetStartupInfo(&startInfo);

	return WinMain(GetModuleHandle(NULL),NULL,GetCommandLine(),
		((startInfo.dwFlags & STARTF_USESHOWWINDOW)?startInfo.wShowWindow:SW_SHOWDEFAULT));
}
#endif
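
Because GetCommandLine returns the program name as the first token, an application that parses szCmdLine may want to step past it first. The following is a simplified sketch of such a helper that uses no C library functions; the name SkipProgramName is hypothetical, and the quoting rules are deliberately reduced to the common cases of a quoted or unquoted program path.

```c
/* Return a pointer to the first argument after the program name.
   Handles a program name wrapped in double quotes as well as a
   bare name terminated by a space or tab. */
static const char *SkipProgramName(const char *cmd)
{
    if (*cmd == '"') {
        cmd++;                              /* opening quote */
        while (*cmd && *cmd != '"')
            cmd++;                          /* quoted program path */
        if (*cmd == '"')
            cmd++;                          /* closing quote */
    } else {
        while (*cmd && *cmd != ' ' && *cmd != '\t')
            cmd++;                          /* unquoted program path */
    }

    while (*cmd == ' ' || *cmd == '\t')
        cmd++;                              /* whitespace before arguments */

    return cmd;
}
```

In an ANSI build the entry point could then pass SkipProgramName(GetCommandLine()) to WinMain instead of the raw command line.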


Dynamically Linked Libraries

When we remove the default libraries we also remove a hidden function that Microsoft Visual C++ provides. This function serves as the entry point to the dynamically linked library. The function that Microsoft Visual C++ uses as the entry point is named _DllMainCRTStartup. To make life easier this article will use the same name for the entry point. If you wish to change the entry point function name see Linker Options above.

The application will have to supply its own entry point. The entry point prototype is as follows:
BOOL WINAPI _DllMainCRTStartup(HINSTANCE hinstDLL,DWORD fdwReason,LPVOID lpReserved);
As you can see, the entry point function takes exactly the same parameters as DllMain. At first glance getting rid of the DllMain function seems like the logical thing to do. Unfortunately, the Debug and Release configurations require DllMain to exist. A solution is shown below. An alternative would be to create only a DllMain and change the ENTRY linker option for this configuration to DllMain.
BOOL WINAPI DllMain(HINSTANCE hinstDLL,DWORD fdwReason,LPVOID lpReserved)
{
	switch( fdwReason )
	{
		...
	}
	
	return 1;
}

#ifdef _TINYCONFIG
BOOL WINAPI _DllMainCRTStartup(HINSTANCE hinstDLL,DWORD fdwReason,LPVOID lpReserved)
{
	return DllMain(hinstDLL,fdwReason,lpReserved);
}
#endif


  Limitations and Functions
There are a few limitations to removing default libraries from your application. The most obvious is that the standard C functions are no longer available. A somewhat less obvious limitation is that some 64-bit operations are no longer available: operators such as multiplication, division, and modulus on 64-bit integers rely on helper routines in the run-time library. To get around this limitation the Win32 functions Int64ShllMod32, Int64ShraMod32, and Int64ShrlMod32 can be used to write those operators. Writing these operators is beyond the scope of this paper.
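
A complete set of operators is out of scope, but the general idea can be illustrated with multiplication. The following is a rough sketch, not the technique named above: it builds a 64-bit product from 32-bit arithmetic only, so no run-time helper is required. The types u32 and U64 and both function names are hypothetical, and unsigned int is assumed to be 32 bits wide.

```c
typedef unsigned int u32;               /* assumed to be 32 bits wide */

typedef struct { u32 lo, hi; } U64;     /* hypothetical 64-bit value */

/* Full 32x32 -> 64 bit product built from 16-bit partial products. */
static U64 Mul32x32(u32 a, u32 b)
{
    u32 a0 = a & 0xFFFF, a1 = a >> 16;
    u32 b0 = b & 0xFFFF, b1 = b >> 16;
    u32 p00 = a0 * b0, p01 = a0 * b1, p10 = a1 * b0, p11 = a1 * b1;
    /* bits 16..47 of the result, before the carry is split off */
    u32 mid = (p00 >> 16) + (p01 & 0xFFFF) + (p10 & 0xFFFF);
    U64 r;

    r.lo = (p00 & 0xFFFF) | (mid << 16);
    r.hi = p11 + (p01 >> 16) + (p10 >> 16) + (mid >> 16);
    return r;
}

/* Low 64 bits of a 64x64 product; the cross terms only affect 'hi'. */
static U64 Mul64(U64 a, U64 b)
{
    U64 r = Mul32x32(a.lo, b.lo);

    r.hi += a.lo * b.hi + a.hi * b.lo;
    return r;
}
```

Division and modulus can be built along similar lines with a shift-and-subtract loop, at a noticeable cost in speed.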

When getting rid of all default libraries you get rid of all standard C functions. Therefore functions such as malloc, free, memset, memcpy, etc. are no longer available. The following table is a partial list of the Win32 equivalents:

Functions                     CLIB                               Win32
Buffer-Manipulation Routines  memcpy, memmove, memset            see Buffer-Manipulation Routines below
Memory-Allocation Routines    malloc, calloc, realloc, free      GlobalAlloc, GlobalReAlloc, GlobalFree
String-Manipulation Routines  sprintf, strcat, strcmp, strcpy,   wsprintf, lstrcat, lstrcmp, lstrcpy,
                              strlen, strupr, strlwr             lstrlen, CharUpper, CharLower


Buffer-Manipulation Routines

There are Win32 Buffer-Manipulation Routines that are not documented but exist in kernel32.dll. Win32 functions such as FillMemory, CopyMemory, MoveMemory, and ZeroMemory are actually macros that use the CLIB equivalents. After removing all default libraries we no longer have CLIB, so we must use an alternative. RtlFillMemory, RtlMoveMemory, and RtlZeroMemory are also macros that use CLIB, although functions with these names do exist in kernel32.dll as Win32 API calls. Even though they exist they are not documented, and there may well be a reason for that. These functions may be missing in some versions of Windows and are not supported by Microsoft. A solution is to write your own Buffer-Manipulation Routines.

The following is a header file that contains the source code to FillMemory, MoveMemory, CopyMemory, and ZeroMemory.
//**********************************************************************
// File: buffer.h				Last Modified:	03/02/01
//						Modified by:	PM
//	
// Purpose:      Custom Buffer-Manipulation Routines
//
// Developed By: Piotr Mintus 2001
//**********************************************************************

#ifndef _BUFFER_H_
#define _BUFFER_H_

#ifdef FillMemory
#undef FillMemory
#endif
#ifdef ZeroMemory
#undef ZeroMemory
#endif
#ifdef CopyMemory
#undef CopyMemory
#endif
#ifdef MoveMemory
#undef MoveMemory
#endif

__forceinline 
void FillMemory(void *dest,unsigned __int8 c,unsigned __int32 count)
{
	unsigned __int32	size32=count>>2;
	unsigned __int32	fill=(unsigned __int32)c*0x01010101;	// replicate c into all four bytes
	unsigned __int32	*dest32=(unsigned __int32*)dest;

	switch( (count-(size32<<2)) )
	{
	case 3:	((unsigned __int8*)dest)[count - 3] = c;
	case 2:	((unsigned __int8*)dest)[count - 2] = c;
	case 1:	((unsigned __int8*)dest)[count - 1] = c;
	}

	while( size32-- > 0 )
		*(dest32++) = fill;

}  /* FillMemory */

#define ZeroMemory(dest,count) FillMemory(dest,0,count)

__forceinline 
void CopyMemory(void *dest,const void *src,unsigned __int32 count)
{
	unsigned __int32	size32=count>>2;
	unsigned __int32	*dest32=(unsigned __int32*)dest;
	unsigned __int32	*src32=(unsigned __int32*)src;

	// copy the 1-3 trailing bytes that do not fill a whole 32-bit word
	switch( (count-(size32<<2)) )
	{
	case 3:	((unsigned __int8*)dest)[count-3] = 
		((unsigned __int8*)src)[count-3];
	case 2:	((unsigned __int8*)dest)[count-2] = 
		((unsigned __int8*)src)[count-2];
	case 1:	((unsigned __int8*)dest)[count-1] = 
		((unsigned __int8*)src)[count-1];
	}

	while( size32-- > 0 )
		dest32[size32] = src32[size32];

}  /* CopyMemory */

__forceinline 
void MoveMemory(void *dest,const void *src,unsigned __int32 count)
{
	unsigned __int32	size32=count>>2,i;
	unsigned __int32	*dest32=(unsigned __int32*)dest;
	unsigned __int32	*src32=(unsigned __int32*)src;

	if( dest > src )
	{
		// dest is above src: copy the 1-3 trailing bytes from the end
		// downwards so an overlapping source is not overwritten
		for(i=count;i>(size32<<2);i--)
			((unsigned __int8*)dest)[i-1] = 
				((unsigned __int8*)src)[i-1];

		while( size32-- > 0 )
			dest32[size32] = src32[size32];
	}
	else
	{
		for(i=0;i<size32;i++)
			*(dest32++) = *(src32++);

		switch( (count-(size32<<2)) )
		{
		case 3:	((unsigned __int8*)dest)[count-3] = 
			((unsigned __int8*)src)[count-3];
		case 2:	((unsigned __int8*)dest)[count-2] = 
			((unsigned __int8*)src)[count-2];
		case 1:	((unsigned __int8*)dest)[count-1] = 
			((unsigned __int8*)src)[count-1];
		}
	}

}  /* MoveMemory */

#endif  // _BUFFER_H_
Now you can freely use FillMemory, MoveMemory, CopyMemory, and ZeroMemory within your application.

NOTE

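The preceding source code uses the __forceinline keyword, which favors speed by expanding each routine at every call site. If small size is desired over speed, remove the keyword and compile the routines as normal functions. In that case keep the definitions in a single source file, so that they are not defined more than once when the header is included from several files.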
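
The direction test at the top of MoveMemory exists because the source and destination buffers may overlap. The following is a minimal byte-wise sketch of that principle, without the 32-bit word optimization used in buffer.h; the name TinyMoveBytes is hypothetical.

```c
/* Byte-wise move that stays correct when the two buffers overlap. */
static void TinyMoveBytes(void *dest, const void *src, unsigned int count)
{
    unsigned char       *d = (unsigned char *)dest;
    const unsigned char *s = (const unsigned char *)src;

    if (d > s) {
        /* dest is above src: copy from the last byte down, so bytes
           of src are read before they can be overwritten */
        while (count-- > 0)
            d[count] = s[count];
    } else {
        /* dest is at or below src: copy from the first byte up */
        while (count-- > 0)
            *d++ = *s++;
    }
}
```

Moving five bytes of "abcdefgh" two positions up yields "ababcdeh", and moving five bytes two positions down yields "cdefgfgh"; a plain forward copy would corrupt the first case.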



  References and Links
The following are references used by this paper and additional links:

Creating Small Win32 Exe's
Further general information on creating smaller Win32 executables.
The #WinProg FAQ
Frequently asked questions for the #WinProg EFNet IRC channel
PECompact
PECompact is a utility that compresses Windows 9x/NT4/w2k portable executables (EXE, DLL, SCR, OCX, etc..) significantly while leaving them 100% functional.
MASM32
MASM32 is a working development environment for programmers who want to write 32 bit Microsoft Assembler MASM



Piotr Mintus
March 3rd, 2001