
HLSL code editing in Notepad++

There are several HLSL syntax-highlighting add-ins for Visual Studio out there, but if you prefer to use the great Notepad++ to author or edit your shaders, my fellow DirectX MVP Matt Pettineo has written a Notepad++ language definition that lets you do exactly that.

You just need to download the HLSL.xml file from his Google Drive account, and in Notepad++ click Language -> Define your language... -> Import, then select the downloaded file. After restarting Notepad++, you'll find a new entry in the Language menu, like this:


By clicking on that new item when you load an HLSL file, you'll get the following result:


According to him, it even supports SM 5.0 profiles.

Great job, Matt! :)

DirectX Control Panel and D3D Debug Output in D3D 9.x/10.x/11.x for Windows 7, 8 and 8.1

Debugging D3D applications can be a pain, but sometimes it's absolutely necessary if you want to know what's going on inside your D3D application (error codes don't give much information without the debug output).
However, things have changed quite a bit in the latest versions of Windows (8.1), Visual Studio (2013) and DirectX (11.2). The following video explains some of the changes related to D3D debugging, the DirectX Control Panel, and how the new infrastructure works:

You can also access the content in the form of slides.
Keep in mind that some DirectX features are no longer distributed with the DirectX SDK, but with the Windows SDK. So we will try to cover every case you could face when trying to activate the debug output in D3D: working on Windows 7 with the old DirectX SDK (June 2010), working on Windows 7 or Windows 8 with the new Windows SDK, or working on the latest Windows 8.1 with its own Windows SDK.

The New DirectX Control Panel

We will need to deal with it to enable D3D debugging and to manage other settings, so the first thing is to learn to tell the old one (June 2010 DirectX SDK) apart from the new ones (Windows SDK). It's easy: the new ones only include one tab (Direct3D 10.x/11.x):
Old Control Panel (DirectX SDK June 2010). Location: C:\Program Files (x86)\Microsoft DirectX SDK (June 2010)\Utilities\bin\x64 (or x86)
New DX Control Panel (Windows SDK). Location: C:\Windows\System32
image image

So, if you are developing for D3D 10.x or 11.x, use the new one, as the old one won't have any effect. If you are still using D3D9 and the old DX SDK (June 2010), grab the old one.
Note: See the above video to learn about new features in the panel, like the "Feature level limit".

Windows 7

D3D 9.x

If you are still developing with D3D9, you should honestly consider moving forward. But if you can't, and you need to enable debugging in your app, just use the OLD Control Panel described above: navigate to the Direct3D 9 tab, select "Use Debug Version of Direct3D 9", and turn the Debug Output Level up to "More", as depicted in the following image:
image
That should force your DirectX applications to use the debug version of the DirectX libraries, so you should immediately start seeing debug output in Visual Studio.

Managed D3D9 applications (SlimDX, SharpDX and similar wrappers)

If you are developing in C#, keep in mind that you will also need to enable the "Enable native code debugging" flag under the Debug tab of your main project's properties in Visual Studio. Otherwise, the native debug output cannot get through to the Output window.
image

D3D 10.x / 11.x

Important Note: The components needed for debugging D3D 10.x and 11.x are no longer installed with the old DirectX SDK (June 2010). To get them you need to install the Windows 8 SDK (even if you are on Windows 7). If you don't have them, creating the device with the "debug" flag will fail (see below for more info). One easy way to check whether you have the components is to look for the NEW DX Control Panel in C:\Windows\System32.

Activating the debug output in D3D 10.x / 11.x is a bit different, as settings are handled per application (you need to add your exe to a list in the control panel and set a specific configuration for it there). To do so, follow these steps:
  1. Open the NEW DirectX Control Panel and navigate to the Direct3D 10.x / 11 tab
  2. Click "Edit List" to add your exe to the list of applications controlled by the DX panel
  3. In the window that pops up (below), click the dots "…" and navigate to your exe file. Then click "Ok".
image
  4. Back in the main tab, choose the configuration you want (you probably want "Force On" to force debug output), and mute any message types you don't want to see
Once your exe is on the list of apps the Control Panel manages, the next step is to make sure your D3D device connects to the debug layer of DirectX.
You can find more info here, but basically what you need to do is create your device with creation flags that include the D3D11_CREATE_DEVICE_DEBUG flag.
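As a sketch of what that looks like in managed code (SharpDX is shown here; the class and flag names come from the SharpDX.Direct3D11 API, and the snippet needs a Windows machine with the debug layer installed to actually run):

```csharp
using SharpDX.Direct3D;
using SharpDX.Direct3D11;

static class DebugDeviceExample
{
    public static Device CreateDebugDevice()
    {
        // D3D11_CREATE_DEVICE_DEBUG maps to DeviceCreationFlags.Debug in SharpDX.
        // Device creation will throw here if the debug layer components
        // are not installed (see the note above about the Windows 8 SDK).
        return new Device(DriverType.Hardware, DeviceCreationFlags.Debug);
    }
}
```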

Managed D3D 10.x /11.x applications (SlimDX, SharpDX and similar wrappers)

Just like with D3D9, when developing in C# you should remember to enable the "Enable native code debugging" flag under the Debug tab of your main project's properties in Visual Studio. Otherwise, the native debug output cannot get through to the Output window (see above for more info).

Windows 8.x + Windows SDK

This part covers the case when working in Windows 8.x with the newer versions of the Windows SDK.

D3D 9.x

Debugging D3D9 applications on Windows 8 should work exactly the same as on Windows 7. Of course, the new Windows SDK doesn't include tools to configure D3D9, so you should install the June 2010 DX SDK to get the OLD control panel. I couldn't verify this works, as all my machines are updated to Windows 8.1, so any feedback here is really welcome.
What I can tell you is that, unfortunately, D3D9 debugging seems to be disabled in Windows 8.1. If you open the OLD DX Control Panel, you will see that all the debug parts of the D3D 9 tab are grayed out. I tried by all means to bring it back, with no luck, so if you manage to enable it, please let me know.

D3D 10.x / 11.x

Enabling debug output for D3D 10.x and 11.x is pretty much the same as on Windows 7, except this time you need to use the NEW version of the DX Control Panel, located in C:\Windows\System32 instead of the usual DXSDK folders.
Also, remember to create your devices with the D3D11_CREATE_DEVICE_DEBUG creation flag (as described above), and if you develop in C#, remember to enable the "Enable native code debugging" option in your main project.

Troubleshooting

  • The application works but I get no debug output: If you are on D3D9, make sure you activated the debug libraries in the old DX Control Panel. Also, if you work in C#, make sure to enable the "Enable native code debugging" option. If you work with D3D 10/11, make sure you created the device with the D3D11_CREATE_DEVICE_DEBUG flag, and don't forget to add your app to the list of programs managed by the DX Control Panel. In all cases, always use the appropriate DX Control Panel (see above).
  • In D3D 10.x / 11.x, the application fails while trying to create the device with the DEBUG creation flag: This usually happens when you don't have the correct SDK installed. If you are on Windows 7 or Windows 8, make sure you install the Windows 8 SDK. If you are on the latest Windows 8.1, you should install its own Windows 8.1 SDK, as it's not compatible with the 8.0 SDK. One easy way to check whether you have the components is to look for the NEW DX Control Panel in C:\Windows\System32.

Realtime, screen-space local reflections, using C# and SharpDX

The following video shows my own implementation of the technique "Real Time Local Reflections (RLR)" used by Crytek in CryEngine3, and described here.

This particular implementation works with a non-deferred rendering system, and it's adapted to work particularly well with planar surfaces like roads (which is what we mostly use it for here at Simax).

The process basically does a texture lookup for the reflections as usual, but instead of a cubemap it uses a simple texture (a copy of the previous back-buffer). It also needs a copy of the previous frame's depth buffer, to raymarch looking for the appropriate sample. The steps are the following:

  1. Start from the screen position of the pixel you are shading
  2. Move along the direction of the reflected (and projected to screen space) normal
  3. At each step, take a sample of the depth buffer and look for a hit. If found, use the sample of the back-buffer at the same offset. If not, move one step forward until you are out of the texture bounds
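The steps above can be sketched in HLSL roughly like this (illustrative only: resource and parameter names such as gPrevDepth, gPrevBackBuffer or reflDirSS are placeholders, not taken from the original implementation):

```hlsl
Texture2D gPrevDepth;       // copy of the previous frame's depth buffer
Texture2D gPrevBackBuffer;  // copy of the previous back-buffer
SamplerState gPointSampler;

float4 TraceReflection(float3 posSS, float3 reflDirSS)
{
    // posSS / reflDirSS: pixel position and reflected normal, already
    // projected to screen space (xy = texture coords, z = depth)
    const int MAX_STEPS = 64;
    float3 p = posSS;

    for (int i = 0; i < MAX_STEPS; i++)
    {
        p += reflDirSS * (1.0 / MAX_STEPS);

        // Out of the back-buffer bounds: no information available, bail out
        if (p.x < 0 || p.x > 1 || p.y < 0 || p.y > 1)
            break;

        float sceneDepth = gPrevDepth.SampleLevel(gPointSampler, p.xy, 0).r;

        // Hit: the ray went behind the geometry stored in the depth buffer
        if (p.z >= sceneDepth)
            return gPrevBackBuffer.SampleLevel(gPointSampler, p.xy, 0);
    }
    return float4(0, 0, 0, 0); // miss: caller should fade the reflection out
}
```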

Cons

It has a lot of downsides, as the amount of information present in a single texture is very limited. One key aspect is to fade the effect out when you approach the limits of the back-buffer, and when the reflection vector faces the viewer (and therefore doesn't hit the back-buffer). That way, you avoid hard edges in the reflection.

Another limitation is its compatibility with multisampling. The problem is that you need a copy of the depth buffer, and if it's multisampled, you need to resolve it to a single-sampled resource. Resolving the depth buffer from a multisampled resource is not a trivial task, and on DX10-only graphics cards it seems to be impossible (apart from doing it manually).

The ResolveSubresource method does a good job with back-buffers, but it doesn't work with depth buffers (I haven't tried in DX11 yet). Another option is to move to DX 10.1 and pass the depth buffer to the shader as a multisampled resource, using the Texture2DMS type introduced in DX 10.1. It allows passing multisampled resources to shaders, so the resolve can be done in the shader.
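A minimal sketch of such a manual resolve in the shader (SM 4.1+ / DX 10.1; the resource name and sample count are assumptions for illustration):

```hlsl
// Texture2DMS.Load takes integer pixel coordinates and a sample index;
// the declared sample count (4 here) must match the actual resource.
Texture2DMS<float, 4> gDepthMS;

float ResolveDepthMin(int2 pixel)
{
    float d = 1.0;
    [unroll]
    for (int s = 0; s < 4; s++)
        d = min(d, gDepthMS.Load(pixel, s));
    return d; // closest of the 4 samples
}
```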

Pros

The major advantage of this method is speed. By grabbing only the previous back-buffer, you can add reflections to almost any object in your scene. Of course, the shader used to draw is slower than a simple one, but that's nothing compared with the cost of rendering multiple cube-maps or other methods.

Also, despite its cons, it does a pretty convincing job in certain cases. Wet roads, water shaders and similar effects are a perfect fit: when you are driving in a simulator, the angle of incidence on the road, and therefore the reflection vector, fits well with the back-buffer projection.

Another implementation of the technique can be found here. I haven't tried it, but it seems to work too…

Cheers!

Projecting a 3D Vector to 2D screen space, with automatic viewport clipping (DirectX, SlimDX or XNA)

Many times, you will need to know the 2D screen coordinates of a 3D world position. DirectX already includes methods to perform vector projections, taking into account the needed World, View and Projection matrices, as well as the viewport scaling. However, those methods do not include viewport clipping as an additional feature.

Viewport clipping can be a tricky matter, and sometimes, you will need to rely on algorithms like the Sutherland-Hodgman algorithm, or the refined version specifically developed for 2D viewports: the Cohen-Sutherland algorithm. Those methods are especially appropriate when you are already dealing with 2D coordinates, or if you need to know the extra points or polygons generated when clipping is performed.

In our case, however, we will only focus on finding the closest in-screen coordinates that correspond to an off-screen point, without dealing with any extra geometry or polygon subdivision. It's also important to note that we will be working with 3D coordinates that go through a projection process (finally yielding 2D coords). This is relevant, as it provides us with additional information we can use, and allows us to jump into the algorithm and perform the clipping in the middle of the projection pipeline, instead of at the end, when the coordinates are already 2D.

Resources like this and this explain very well the processing of vertices in the Direct3D pipeline:

untitled

As you can see, each 3D position travels through different stages and coordinate spaces: model space -> world space -> camera space -> projection space -> clipping space -> homogeneous space -> and finally, screen space.

Evidently, D3D also performs certain types of clipping on vectors, and you can tell from the picture above that clipping is done (surprise!) in clip space. We will try to mimic that behavior…

Note: Transforming coordinates with the MClip matrix, to go from projection space to clip space, should only be done if you want to scale or shift your clipping volume. If you are OK with a clipping volume that matches your render-target viewport (you will be, in most cases), you should leave this matrix as the identity, or simply skip this step. In the algorithm below, this whole step is commented out.

Once our coordinates are in clip space (Xp, Yp, Zp, Wp), we easily perform the clipping by limiting the X and Y values to the range -Wp .. Wp, and Z to the range 0 .. Wp.

After that, we just proceed with the normal vector projection algorithm, and the resulting 2D coordinates will be stuck inside the screen viewport. An extra feature that is nice to have is a simple output variable that tells us whether the coordinates were inside or outside the viewport.

A C# implementation of such an algorithm could be:

public static Vector2 ProjectAndClipToViewport(Vector3 pVector, float pX, float pY,
                                float pWidth, float pHeight, float pMinZ, float pMaxZ,
                                Matrix pWorldViewProjection, out bool pWasInsideScreen)
        {
            // First, multiply by worldViewProj, to get the coordinates in projection space
            Vector4 vProjected = Vector4.Zero;
            Vector4.Transform(ref pVector, ref pWorldViewProjection, out vProjected);

            // Second (OPTIONAL STEP): multiply by the clip matrix if you want to scale
            // or shift the clip volume. If not (most of the time you won't), just leave
            // this part commented out, or set an identity matrix as the clip matrix.
            // The default clip volume parameters (see below) produce an identity clip matrix.

            //float clipWidth = 2;
            //float clipHeight = 2;
            //float clipX = -1;
            //float clipY = 1;
            //float clipMinZ = 0;
            //float clipMaxZ = 1;
            //Matrix mclip = new Matrix();
            //mclip.M11 = 2f / clipWidth;
            //mclip.M12 = 0f;
            //mclip.M13 = 0f;
            //mclip.M14 = 0f;
            //mclip.M21 = 0f;
            //mclip.M22 = 2f / clipHeight;
            //mclip.M23 = 0f;
            //mclip.M24 = 0f;
            //mclip.M31 = 0f;
            //mclip.M32 = 0;
            //mclip.M33 = 1f / (clipMaxZ - clipMinZ);
            //mclip.M34 = 0f;
            //mclip.M41 = -1 -2 * (clipX / clipWidth);
            //mclip.M42 = 1 - 2 * (clipY / clipHeight);
            //mclip.M43 = -clipMinZ / (clipMaxZ - clipMinZ);
            //mclip.M44 = 1f;
            //vProjected = Vector4.Transform(vProjected, mclip);
            
            // Third: Once we have coordinates in clip space, perform the clipping,
            // to leave the coordinates inside the screen. The clip volume is defined by:

            //
            //  -Wp < Xp <= Wp
            //  -Wp < Yp <= Wp
            //  0 < Zp <= Wp
            //
            // If any clipping is needed, then the point was out of the screen.
            pWasInsideScreen = true;
            if (vProjected.X < -vProjected.W)
            {
                vProjected.X = -vProjected.W;
                pWasInsideScreen = false;
            }
            if (vProjected.X > vProjected.W)
            {
                vProjected.X = vProjected.W;
                pWasInsideScreen = false;
            }
            if (vProjected.Y < -vProjected.W)
            {
                vProjected.Y = -vProjected.W;
                pWasInsideScreen = false;
            }
            if (vProjected.Y > vProjected.W)
            {
                vProjected.Y = vProjected.W;
                pWasInsideScreen = false;
            }
            if (vProjected.Z < 0)
            {
                vProjected.Z = 0;
                pWasInsideScreen = false;
            }
            if (vProjected.Z > vProjected.W)
            {
                vProjected.Z = vProjected.W;
                pWasInsideScreen = false;
            }

            // Fourth step: Divide by w, to move from homogeneous coordinates to 3D
            // coordinates again

            vProjected.X = vProjected.X / vProjected.W;
            vProjected.Y = vProjected.Y / vProjected.W;
            vProjected.Z = vProjected.Z / vProjected.W;

            // Last step: Perform the viewport scaling, to get the appropriate coordinates
            // inside the viewport

            vProjected.X = ((float)(((vProjected.X + 1.0) * 0.5) * pWidth)) + pX;
            vProjected.Y = ((float)(((1.0 - vProjected.Y) * 0.5) * pHeight)) + pY;
            vProjected.Z = (vProjected.Z * (pMaxZ - pMinZ)) + pMinZ;

            // Return pixel coordinates as 2D (change this to 3D if you need Z)
            return new Vector2(vProjected.X, vProjected.Y);
        }

Hope it helps!

:)

New XNA 4 book by Kurt Jaegers [Packt Publishing]

Kurt Jaegers has a new book on XNA 4 game development. I'll review it in a few days; for now, I paste here some words from the author himself:

“This book follows the same style as my previous books on 2D game development with XNA, bringing three different 3D games to life. I cover items such as:
- The basic concepts behind 3D graphics and game design
- Generating geometry with triangles
- Converting height map images into terrain
- An introduction to HLSL, including writing shaders that handle lighting and multi-texturing
- Building a 2D button-based interface to overlay on your 3D action
- Implementing skyboxes for full 3D backgrounds”

More info here and here.

The limits of memory

This article is intended as an introduction to memory management in .NET: the limits the runtime and the platform impose on each process, and some tips for dealing with the problems we face when approaching those limits.

Available memory per process

As many of you know, no matter how much RAM a computer has installed, there are several barriers imposed on the amount of memory usable by our applications.

For example, a 32-bit system cannot address more than 4 GB of physical memory, obviously, because 2^32 (two to the power of 32) gives an address space with 4,294,967,296 distinct entries (4 GB). But even when the system has 4 GB of physical memory, our applications will run into a 2 GB barrier imposed by the system.

In these 32-bit environments, each process can access an address space of at most 2 GB, because the system reserves the other 2 GB for applications running in kernel mode (system applications). This default behavior can be changed with the "/3gb" flag in the system's boot.ini, making Windows reserve 3 GB for user-mode applications and 1 GB for the kernel.

Even so, the per-process limit remains at 2 GB, unless we explicitly set a specific flag (IMAGE_FILE_LARGE_ADDRESS_AWARE) in the application header. This combination of flags on x86 systems is commonly called 4GT (4 Gigabyte Tuning).

Something similar happens on 64-bit systems. Although they don't have the same limitation on physical memory, nor the one imposed by reserving addresses for the kernel (so the /3gb flag doesn't apply there), the system also sets a default limit of 2 GB per process, unless the same flag (IMAGE_FILE_LARGE_ADDRESS_AWARE) is set in the application header.
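As a quick sanity check when reasoning about these limits, a .NET process can report its own bitness at runtime (a minimal sketch):

```csharp
using System;

class BitnessCheck
{
    static void Main()
    {
        // IntPtr.Size is 4 in a 32-bit process and 8 in a 64-bit one
        Console.WriteLine("64-bit process: {0}", Environment.Is64BitProcess);
        Console.WriteLine("Pointer size: {0} bytes", IntPtr.Size);
    }
}
```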

Setting the flag: IMAGE_FILE_LARGE_ADDRESS_AWARE
  • For native (C++) applications, setting the flag is easy: just add the /LARGEADDRESSAWARE parameter to the linker options in Visual Studio.
  • For .NET applications:
    1. If compiled for 64-bit, this flag is enabled by default, so they can access an address space of up to 8 TB (depending on the OS)
    2. If compiled for 32-bit, Visual Studio offers no option to set the flag, so we have to do it with the EditBin.exe utility, distributed with Visual Studio, which modifies the application's executable (setting the flag on it).
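Setting the flag with EditBin from a Visual Studio developer command prompt looks like this (the executable name is just a placeholder):

```shell
editbin /LARGEADDRESSAWARE MyApp.exe
```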

The following table, taken from this page, summarizes the virtual-memory address-space limits, depending on the platform and the type of application being developed:

image

This page has much more information about memory limits across OS versions.

The system limits, closer than you think

Memory is cheap these days, but as explained in the previous section, there are plenty of cases where, no matter how much memory we install in the PC, our process will only be able to access 2 GB of it.

On top of that, if your application is developed in .NET, you will find that the runtime itself introduces a significant memory overhead (commonly estimated at around 600-800 MB), so in a typical application it's usual to start hitting OutOfMemoryExceptions at around 1.3 GB of used memory. This blog discusses the subject.

Therefore, if we are not in one of those cases where we can address more than 2 GB, and we are developing in .NET, then regardless of the physical memory installed in the system our real limit will be around 1.3 GB of RAM.

For 99% of everyday applications that is more than enough, but applications that require massive calculations, or that work against databases, will very frequently exceed that limit.

And what is worse…

To complicate things further, having memory available is one thing, and having contiguous blocks of memory available is quite another.

As you all know, as a result of the operating system's memory management, of techniques like paging, and of the creation and destruction of objects, memory gradually becomes fragmented. This means that, even if we have enough memory available, it may be split into many small blocks instead of a single hole of the full size.

Modern operating systems, and the .NET platform itself, try to mitigate this with compaction techniques, and although they notably reduce the problem, they don't eliminate it completely. This thorough article describes in detail the memory management of the .NET Garbage Collector, and the compaction work it performs.

How does fragmentation affect us? A lot: if your application needs to allocate a contiguous 10 MB array, then even if there is still 1 GB of memory available, if memory is heavily fragmented and the system cannot find a contiguous block of that size, you will get an OutOfMemoryException.

In .NET, fragmentation and compaction of objects in memory are closely related to their size. That's why the next section talks a bit about this subject.

Large objects in memory

When allocating memory for a single object, the .NET platform sets certain limits. For example, in .NET 1.0, 2.0, 3.0, 3.5 and 4.0, that limit is 2 GB. On both x86 and x64 platforms, no single object can be larger than that size. It's that simple. Only starting with .NET 4.5 can this limit be exceeded (in x64 processes exclusively). Although honestly, save for very rare exceptions, if you need to allocate more than 2 GB for a single object, you should probably rethink your application's design.

In the .NET world, the Garbage Collector classifies objects into two types: large objects and small objects. It's a pretty coarse division, truth be told, but that's how it is. What does .NET consider a small object? Anything smaller than 85,000 bytes.

When the .NET CLR is loaded, two different portions of memory are reserved: one heap for small objects (also called the SOH, or Small Object Heap) and another for large objects (also called the LOH, or Large Object Heap), and each kind of object is stored in its corresponding heap.

How does all this affect the subject at hand? Simple: compacting large objects is expensive, and as of today, it simply isn't done. Objects considered "large", which go into the LOH, are not compacted (although the development team warns they may do it some day). At most, when two adjacent large objects are freed, they are merged into a single free block of memory, but no object is ever "moved" for compaction purposes.
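A small sketch to see the split in action: LOH objects are collected together with generation 2, so GC.GetGeneration reports them as gen 2 right after allocation (behavior observed on the desktop CLR; the exact 85,000-byte threshold is an implementation detail):

```csharp
using System;

class LohDemo
{
    static void Main()
    {
        byte[] small = new byte[1000];    // goes to the Small Object Heap
        byte[] large = new byte[100000];  // over 85,000 bytes: goes to the LOH

        Console.WriteLine(GC.GetGeneration(small)); // typically 0 (fresh SOH object)
        Console.WriteLine(GC.GetGeneration(large)); // typically 2 (LOH)
    }
}
```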

This fantastic article contains much more information about the LOH and how it works.

C# arrays at the limits of memory

In C#, simple (one-dimensional) arrays are one of the most common ways to consume memory, and you should know that the CLR always allocates them as contiguous blocks of memory. That is, when we instantiate an object of type byte[1024], we are asking the system for a single contiguous 1 KB block, and an OutOfMemoryException will be thrown if no contiguous hole of that size is found.

When an array of more than one dimension is needed, C# offers different options:

Jagged arrays, or arrays of arrays

Declared as byte[][], they are the classic way of implementing multi-dimensional arrays. In fact, in languages like C++, it is the only kind of multi-dimensional array natively supported.

Memory-wise, they behave as a simple array (a single block of memory) in which each element is another simple array (this time of the declared type, also a single block in memory, but separate from the others). So, as far as memory blocks are concerned, an array of type byte[1024][1024] will use 1024 distinct memory blocks (each of 1024 bytes).

Multi-dimensional arrays

C# introduces a new, natively supported kind of array: multi-dimensional arrays. In the 2D case, they are declared as byte[,].

Although they are very comfortable to use (they provide, among other things, methods like GetLength, to get the size of a dimension), and their instantiation is simpler, their memory representation differs from that of jagged arrays: they are stored as a single block of memory, of the total size of the array.
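A minimal sketch of both declarations side by side (the sizes are arbitrary):

```csharp
using System;

class ArrayKinds
{
    static void Main()
    {
        // Jagged: 1 block holding 4 references + 4 separate blocks of 8 bytes each
        byte[][] jagged = new byte[4][];
        for (int i = 0; i < jagged.Length; i++)
            jagged[i] = new byte[8];

        // Multi-dimensional: one single contiguous block of 4*8 bytes
        byte[,] rect = new byte[4, 8];

        Console.WriteLine(jagged.Length);     // 4
        Console.WriteLine(jagged[0].Length);  // 8
        Console.WriteLine(rect.GetLength(0)); // 4
        Console.WriteLine(rect.GetLength(1)); // 8
    }
}
```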

The next section compares both types:

Comparison: [,] vs [][]

The 2D array [,] (stored in a single block):

Advantages:

  • Uses less total memory (it doesn't have to store the references to the n simple arrays)
  • Creation is faster: allocating one large block of memory for a single object is faster than allocating smaller blocks for many objects.
  • Instantiation is simpler: a single line is enough (new byte[128,128]).
  • Provides useful methods, like GetLength, and its usage is cleaner and clearer.

Disadvantages:

  • Finding a single contiguous block of memory for the array can be a problem if it is very large or we are close to the RAM limit.
  • Element access is slower than in jagged arrays (see below)

The jagged array [][] (stored in N blocks):

Advantages:

  • It is easier to find available memory for the array, as it requires n smaller blocks, which, due to fragmentation, is usually more likely than finding a single larger block.
  • Element access is faster than in 2D arrays, thanks to compiler optimizations for handling simple arrays (after all, an array of arrays is composed of many 1D arrays).

Disadvantages:

  • Uses more total memory (it has to store the references to the n simple arrays)
  • Creation is slower, since N blocks of memory must be allocated instead of just one.
  • Instantiation is a bit more annoying, since you have to traverse the array instantiating each of its elements (see the Tip below).
  • It doesn't provide the methods available in 2D arrays, and its usage can be a bit more confusing.

This blog explains this comparison very well.

Conclusion

Each user should choose the kind of array that suits them best, depending on their experience and the specific context. That said, a developer who regularly uses large amounts of memory, and who cares about performance, will tend to always choose jagged arrays (arrays of arrays, [][]).

Tip: generic code to instantiate jagged arrays

Since instantiating an array of arrays is a bit annoying and repetitive (and we already said here that duplicating code is not advisable), the following generic method takes care of that task for you:

        public static T[][] Allocate2DArray<T>(int pWidth, int pHeight)
        {
            T[][] ret = new T[pWidth][];

            // Note: the outer loop must iterate pWidth times (one inner
            // array per outer element), each inner array of length pHeight
            for (int i = 0; i < pWidth; i++)
                ret[i] = new T[pHeight];

            return ret;
        }

Hope it helps!

Properly calculating the diffuse contribution of lights in HLSL Shaders

It's been many years since vertex and pixel shaders came out, and several years too since the fixed pipeline was deprecated, but there are still many questions in the forums asking how to properly calculate the diffuse contribution of lights. This paper has a great tutorial about the issue, and includes a whole shader that mimics the fixed-pipeline behavior. However, we will see here how to perform just the basic calculations, in case you don't need to emulate the full pipeline.
The first thing is to write some D3D9 code that lets you switch between the old fixed pipeline and your own shaders, using the same parameters. That way, you will easily spot any behavioral differences in the light calculations. You can read more about how the D3D9 fixed pipeline calculates lighting on this page.
When writing shaders, people tend to calculate the diffuse contribution like:
Out.Color = (materialAmbient * lightAmbient) + (materialDiffuse * lightDiffuse * dot(Normal, L));
Where L is the vector from the vertex position (in world coordinates) to the light.
Apart from not doing any specular or emissive calculations (which might not be necessary in many cases, depending on your scenario), there are several mistakes in that approach:
1.- You don't want the dot product to return negative values, because it will wrongly black out colors. So you need to clamp it to the 0..1 range, using the saturate intrinsic: saturate(dot(Normal, L))
2.- In order to get the same results as the fixed pipeline, you should include attenuation calculations, because they modify the intensity of light with the distance between the point being lit and the light source. Attenuation (as opposed to what its name suggests) not only attenuates light, but can also increase its intensity in some circumstances. (See below how to properly calculate attenuation factors.)
3.- Once you are calculating attenuation, you should remove the materialDiffuse factor from the previous equation, as you don't want it to be attenuated too. You will apply it later, when the entire lighting contribution is properly calculated and attenuated.
Keeping those 3 things in mind, the final calculation in a vertex shader would be:
    float4 LightContrib = float4(0.f, 0.f, 0.f, 0.f);
    float fAtten = 1.f;

    // 1.- First, we store the total ambient light in the scene (multiplication of material_ambient, light_ambient, and any other global ambient component)
    Out.Color = mMaterialAmbient * mLightAmbient;

    // 2.- Calculate vector from point to Light (both normalized and not-normalized versions, as we might need to calculate its length later)
    float3 pointToLightDif = mLightPos - P;
    float3 pointToLightNormalized = normalize(pointToLightDif);
    
    // 3.- Calculate dot product between world_normal and pointToLightNormalized
    float NDotL = dot(Nw, pointToLightNormalized);        
    if(NDotL > 0)
    {
        LightContrib = mLightDiffuse * NDotL * mLightDivider;     
            
        float LD = length(pointToLightDif);        
        if(LD > mLightRange)
            fAtten = 0.f;
        else
            fAtten = 1.f/(mLightAtt0 + mLightAtt1*LD + mLightAtt2*LD*LD);
        
        LightContrib *= fAtten;
    }
    Out.Color += LightContrib * mMaterialColor;
    Out.Color = saturate(Out.Color);

 

Comparison

First image is the Programmable version. You can just barely tell them apart by the reflections on the windows.
image
Second image is the Fixed Pipeline version (no real time reflections on windows):
image

The importance of the binary encoding of Shaders in DirectX or Silverlight

If you have ever run into a Vertex or Pixel Shader that looks correct, at least on the surface, but still produces errors when compiled, keep in mind that the encoding used to save the text matters.

As you probably know, even if a file contains text, it is stored on your computer’s hard drive as binary data. To do so, one of the many existing methods for transforming text to binary (and vice versa) must be chosen.

If we open a text file with a hex-analysis tool like HxD, we can see that its first bytes determine its encoding. For example, the following illustration shows a file with the header EF BB BF, which indicates that the file uses the UTF-8 encoding (the default encoding in Visual Studio).

image

You can find more information about text file headers here.

Unfortunately, the DirectX shader compiler only accepts certain encodings, and UTF-8 is not among them. So, no matter how correct the shader code is, if we try to compile it we will get the following error (or others, depending on the environment we are in):

“X3000: Illegal character in shader file“

If this happens, we just need to change the encoding the file is saved with, using a simple Visual Studio option (File->Advanced Save Options):

image

Here, we can choose which encoding to use when saving the file. For example, we can pick “Western European (Windows) – Codepage 1252”, a simple ASCII-compatible encoding, so the shader compiler works correctly:

image
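If you prefer to check for the problem programmatically, a minimal C# sketch like the following can detect the UTF-8 BOM described above and strip it from the file (the file path is a hypothetical placeholder; point it at your own shader):

```csharp
using System;
using System.IO;

class BomCheck
{
    static void Main()
    {
        string path = "MyShader.fx"; // hypothetical path: use your own shader file

        byte[] bytes = File.ReadAllBytes(path);

        // The UTF-8 BOM is the 3-byte header EF BB BF shown in the hex dump above
        bool hasUtf8Bom = bytes.Length >= 3 &&
                          bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF;

        if (hasUtf8Bom)
        {
            // Rewrite the file without the BOM so the shader compiler accepts it
            byte[] stripped = new byte[bytes.Length - 3];
            Array.Copy(bytes, 3, stripped, 0, stripped.Length);
            File.WriteAllBytes(path, stripped);
            Console.WriteLine("UTF-8 BOM found and removed.");
        }
    }
}
```

Note that stripping the BOM only helps if the rest of the file is plain ASCII; any non-ASCII character left in a comment or string will still trigger the X3000 error.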

More info:

http://blog.pixelingene.com/2008/07/file-encodings-matter-when-writing-pixel-shaders/

http://www.cplotts.com/2008/08/22/encodings-matter-with-fx-files/

Developing a MatrixStack in pure managed C# code (ready for XNA)

Some time ago, we talked about the possibility of creating your own math library directly in C#, with no native code. If you take enough care, it can be as fast as performing interop with a native one.
Today, we are showing an additional example on this matter: we are going to develop our own fast MatrixStack class, all in safe C# code, with no COM interop.

Why?

I never understood why the MatrixStack class remains an IDisposable COM object. I don’t know what kind of internal optimizations justify it holding disposable resources, but it’s annoying to carry the IDisposable overhead with no need for it.
Besides that, MatrixStacks are used in most cases as simple matrix helpers, to traverse object hierarchies. So, replacing the API MatrixStack with your own should be a piece of cake, and will definitely help if you ever try to port your code to another platform.
Last, but not least, XNA does not have a MatrixStack class, so this C# implementation fits perfectly for everyone who wants to use one there.
In this example, I will be comparing my own class with the SlimDX MatrixStack, which is nothing more than a wrapper over the D3DX matrix stack.

The interface

In order to make the SlimDX stack replacement painless, I will keep the exact same interface in my class (except the COM-related stuff, which is no longer necessary). So, it will have to be something like this:
image

How it works

A MatrixStack basically supplies a mechanism for pushing matrices onto, and popping them off of, a matrix stack. Implementing a matrix stack is an efficient way to track matrices while traversing a transform hierarchy.
So, we can clear the stack to the Identity or to any other matrix, we can operate with the top of the stack, and we can add (push) or remove (pop) nodes (or levels, if you want) to the stack.
Example: for a robot arm hierarchy, we would go like this:

1.- Initialize the stack, and load the matrix of the first node in the hierarchy (the upper arm, for example). Now you can use the Top matrix to draw the upper arm.
2.- Create another level on the stack (Push) for the lower arm, and multiply the lower arm matrix. Use the Top matrix to draw the lower arm.
3.- Create another level on the stack (Push) for the hand, and multiply the hand matrix. Use the Top matrix to draw the hand.
The stack itself does nothing you cannot do with regular matrix multiplications, except that it keeps track of the previous levels you have been creating, so you can go back to an upper node whenever you want. After the previous operations, for instance, performing a Pop would remove the top node of the stack and return to the previous one. This way, the new Top node would represent the lower arm matrix instead of the hand matrix.
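Using the MatrixStack class from this article, the three steps above could be sketched like this (the Draw* methods and the mat* matrices are hypothetical placeholders for your own drawing code and node transforms):

```csharp
MatrixStack stack = new MatrixStack();

// 1.- Load the upper arm matrix and draw with it
stack.LoadMatrix(matUpperArm);
DrawUpperArm(stack.Top);

// 2.- New stack level for the lower arm, combined with its parent's transform
stack.Push();
stack.MultiplyMatrixLocal(matLowerArm);
DrawLowerArm(stack.Top);

// 3.- New stack level for the hand
stack.Push();
stack.MultiplyMatrixLocal(matHand);
DrawHand(stack.Top);

// Going back up: Top becomes the lower arm matrix again, then the upper arm's
stack.Pop();
stack.Pop();
```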

The code

Here is my implementation of the MatrixStack. Please keep in mind that it has not been intensively tested, and might contain errors. Use it at your own risk:
    public class MatrixStack
    {
        /// <summary>
        /// Retrieves the Top node matrix of the stack
        /// </summary>
        public Matrix Top = Matrix.Identity;        
        public object Tag = null;
        private List<Matrix> mStack = new List<Matrix>();
        
        /// <summary>
        ///
        /// </summary>
        public MatrixStack()
        {
            LoadIdentity();
        }
        /// <summary>
        /// Clears the stack and loads the Identity Matrix in the top of the stack
        /// </summary>
        public void LoadIdentity()
        {
            mStack.Clear();
            Top = Matrix.Identity;
        }
        /// <summary>
        /// Clears the Stack, and loads the matrix in the top of the stack
        /// </summary>
        /// <param name="pMat"></param>
        public void LoadMatrix(Matrix pMat)
        {
            mStack.Clear();
            Top = pMat;
        }
        /// <summary>
        /// Adds a new level to the stack, cloning the current TOP matrix of the stack
        /// </summary>
        public void Push()
        {
            mStack.Add(Top);
        }
        /// <summary>
        /// Removes the current TOP matrix of the stack, returning back to the previous one
        /// </summary>
        public void Pop()
        {
            if (mStack.Count > 0)
            {
                Top = mStack[mStack.Count - 1];
                mStack.RemoveAt(mStack.Count - 1);                
            }
        }
        /// <summary>
        /// This method right-multiplies the given matrix to the current matrix (transformation is about the current world origin).
        /// This method does not add an item to the stack, it replaces the current matrix with the product of the current matrix and the given matrix.
        /// </summary>
        /// <param name="pMat"></param>
        public void MultiplyMatrix(Matrix pMat)
        {
            Matrix.Multiply(ref Top, ref pMat, out Top);
        }
        /// <summary>
        /// This method left-multiplies the given matrix to the current matrix (transformation is about the local origin of the object).
        /// This method does not add an item to the stack, it replaces the current matrix with the product of the given matrix and the current matrix.
        /// </summary>
        /// <param name="pMat"></param>
        public void MultiplyMatrixLocal(Matrix pMat)
        {
            Matrix.Multiply(ref pMat, ref Top, out Top);            
        }      
        /// <summary>
        /// Rotates (relative to world coordinate space) around an arbitrary axis.
        /// </summary>
        public void RotateAxis(Vector3 pAxis, float pAngle)
        {
            Matrix tmp;
            Matrix.RotationAxisAngle(ref pAxis, pAngle, out tmp);
            Matrix.Multiply(ref Top, ref tmp, out Top);           
        }
        /// <summary>
        /// Rotates (relative to the object's local coordinate space) around an arbitrary axis.
        /// </summary>
        public void RotateAxisLocal(Vector3 pAxis, float pAngle)
        {
            Matrix tmp;
            Matrix.RotationAxisAngle(ref pAxis, pAngle, out tmp);
            Matrix.Multiply(ref tmp, ref Top, out Top);           
        }
        /// <summary>
        /// Rotates (relative to world coordinate space) the specified Euler Angles
        /// </summary>
        public void RotateYawPitchRoll(float pYaw, float pPitch, float pRoll)
        {
            Matrix tmp;
            Matrix.CreateFromYawPitchRoll(pYaw, pPitch, pRoll, out tmp);
            Matrix.Multiply(ref Top, ref tmp, out Top);            
        }
        /// <summary>
        /// Rotates (relative to the object's local coordinate space) the specified Euler angles
        /// </summary>
        public void RotateYawPitchRollLocal(float pYaw, float pPitch, float pRoll)
        {
            Matrix tmp;
            Matrix.CreateFromYawPitchRoll(pYaw, pPitch, pRoll, out tmp);
            Matrix.Multiply(ref tmp, ref Top, out Top);           
        }
        /// <summary>
        /// Scale the current matrix about the world coordinate origin
        /// </summary>
        public void Scale(float pX, float pY, float pZ)
        {
            Matrix tmp;
            Matrix.CreateScale(pX, pY, pZ, out tmp);
            Matrix.Multiply(ref Top, ref tmp, out Top);
        }
        /// <summary>
        /// Scale the current matrix about the object's local origin
        /// </summary>
        public void ScaleLocal(float pX, float pY, float pZ)
        {
            Matrix tmp;
            Matrix.CreateScale(pX, pY, pZ, out tmp);
            Matrix.Multiply(ref tmp, ref Top, out Top);           
        }
        /// <summary>
        /// Determines the product of the current matrix and the computed translation matrix determined by the given factors (x, y, and z).
        /// </summary>
        public void Translate(float pX, float pY, float pZ)
        {
            Matrix tmp;
            Matrix.CreateTranslation(pX, pY, pZ, out tmp);
            Matrix.Multiply(ref Top, ref tmp, out Top);           
        }
        /// <summary>
        /// Determines the product of the computed translation matrix (from the given x, y, and z factors) and the current matrix.
        /// </summary>
        public void TranslateLocal(float pX, float pY, float pZ)
        {
            Matrix tmp;
            Matrix.CreateTranslation(pX, pY, pZ, out tmp);
            Matrix.Multiply(ref tmp, ref Top, out Top);
        }
    }

It has to be fast

When you start coding your own MatrixStack, you will soon realize that .Net includes a generic collection called Stack&lt;T&gt;. You can use it, although I didn’t. Why?
Because I have separated the management of the Top matrix into a member variable, and for the rest I just preferred to use a simple list to keep track of the previous nodes.
The Top matrix is stored as a member variable so it can be passed by reference to the matrix multiplication methods. The speed increase from avoiding passing a whole matrix by value is significant: in the test below, it was around 40% faster.
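To illustrate why that matters, here is a minimal sketch comparing a by-value and a by-reference signature (Matrix4 is a hypothetical stand-in for the article’s Matrix struct; the 40% figure comes from the article’s own test, not from this sketch):

```csharp
// A 4x4 float matrix struct is 64 bytes: passing it by value copies all
// 16 floats on every call, while "ref"/"out" parameters operate directly
// on the caller's storage.
public struct Matrix4
{
    public float M11, M12, M13, M14,
                 M21, M22, M23, M24,
                 M31, M32, M33, M34,
                 M41, M42, M43, M44;

    // By-value: copies 'a' (64 bytes) in, and another 64 bytes out
    public static Matrix4 Scale(Matrix4 a, float s)
    {
        a.M11 *= s; a.M22 *= s; a.M33 *= s;
        return a;
    }

    // By-reference: no copies; writes straight into the caller's variable
    public static void Scale(ref Matrix4 a, float s)
    {
        a.M11 *= s; a.M22 *= s; a.M33 *= s;
    }
}
```

This is exactly the pattern the class above uses in calls like Matrix.Multiply(ref Top, ref pMat, out Top).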

Test 1 – Reliability

I performed several operations with the matrix stack, testing some of its features by comparing the final Top matrix produced by both a SlimDX MatrixStack and my own. The test operations are:
matrixStack.LoadIdentity();
matrixStack.MultiplyMatrix(Matrix.PerspectiveFovLH(0.8f, 1.6f, 0.1f, 999f));
matrixStack.Translate(10, 10, 10);
matrixStack.Scale(2, 2, 2);
matrixStack.RotateYawPitchRoll(1f, 0f, 0f);
matrixStack.RotateAxis(Vector3.UnitY, 0.75f);
matrixStack.Push();
matrixStack.TranslateLocal(-5, -5, -5);
matrixStack.ScaleLocal(0.1f, 0.1f, 0.1f);
matrixStack.Pop();
matrixStack.MultiplyMatrixLocal(Matrix.RotationZ(1.45f));
The resulting top matrix is:
SlimDX MatrixStack:
  1. [M11:-0.06350367 M12:4.695973 M13:-0.3505643 M14:0]
  2. [M21:0.5231493 M22:0.5700315 M23:2.887983 M24:0]
  3. [M31:18.08297 M32:20 M33:-23.60117 M34:1]
  4. [M41:-0.1968169 M42:0 M43:0.03565279 M44:0]
MyMatrixStack:
  1. {M11:-0.06350368 M12:4.695973 M13:-0.3505643 M14:0}
  2. {M21:0.5231493 M22:0.5700315 M23:2.887982 M24:0}
  3. {M31:18.08297 M32:20 M33:-23.60117 M34:1}
  4. {M41:-0.1968169 M42:0 M43:0.0356528 M44:0}
As you can see, the result is exactly the same.

Test 2 - Speed

Speed is important, so I decided to run the above-mentioned operations 10 million times, to see how long both the SlimDX version and my own code take to complete.
Obviously, if we run in Debug mode (disabling optimizations), there will be a huge performance difference, as the SlimDX dll is already compiled with optimizations. But what happens if we turn all optimizations on when compiling our code?
Here is the result of a small test application:
image
As you can see, the .Net Framework alone is faster than SlimDX, thanks to its optimizations and to the absence of the interop layer.
What happens if we increase the number of iterations to 60 million? The difference is obviously bigger (1.36 seconds faster):
image
Note: This test has been done on an intel i7 CPU at 3.8 Ghz, running on Windows 7 x64 with .Net Framework 4.0.
Note2: SlimDX MatrixStack uses its own Matrix class and operations. My implementation uses my own Matrix implementation, also written in pure C# code.
Conclusion: .Net rocks. Purely native C++ code would be even faster, of course, but if you put into the equation the huge amount of benefits .Net gives you, I really think it’s worth it. Don’t you think?
Cheers !