Wednesday, 30 December 2009

Writing a correct Main Loop for your application

Have you ever wondered how can Windows run more than one application at a time, even in a single-processor machine? The answer is easy. It doesn’t. The point is that it switches from one application to another so fast that you don´t notice the change, and it seems both applications are running in parallel.
To allow this, running Windows applications should not always be running.
It´d be a waste of resources to massively use the processor just to find out that nothing had to be done, don’t you think?. That´d be kind of a psychotic behavior. Windows has it´s protocols and procedures to avoid this, letting applications or threads enter idle states, even for a few milliseconds, and giving then priority to other, non-idling processes.
So, what would happen if your application never enters idle mode? It would bring down the multi-tasking capability of Windows, slowing down every other application than yours. That´s why I say that running applications should not always be running.
Under this point of view, applications can be divided into three groups:
  1. Psychotic applications that should be updated always, i.e. intense computing applications (not so frequent, nowadays)
  2. Applications that should be updated only as a response to some user interaction (most Windows Forms applications)
  3. Applications that sometimes need to be “always updated”, and sometimes not (like a video-game)
As you know, when an application is run, the Main method is invoked, and the O.S gives the application access to the processor and some memory. Apart from thread priority policies, application´s behavior will be one or another depending on what we do in this Main() method:

Case # 1. Compute, compute, compute

The Main method structure is simple: update as fast as you can, and consume as much processor as possible. The Main() method in pseudo-code:
            while (Application is Alive)
            {
                Compute()
            }
This is the typical behavior of old applications in non multi-tasking environments. They consume as much processing power as is available.

Case # 2. Don´t disturb unless absolutely necessary

What should be done here is easy: if events are pending, process them (DoEvents). If not, just relax and sleep, yielding processor to other threads or applications. The exact implementation of such a procedure can be more complex, as we should decide when to enter sleep modes and when not. Luckily, .Net does that for us. The C# implementation you should use is:
            System.Windows.Forms.Application.Run(mForm);
The Run() method takes care of all the things above mentioned (yielding and many other things), and it does it using the standard .Net procedures and protocols.

Case # 3: The mixed approach

Other situations need a mixed solution. For example, a Windows Forms application that sometimes needs to enter a “running” state, where everything has to be updated, like a Windows Forms-based video-game, or a 3D editor where you want to run an animation. Here, updates will be applied when the application is in Running state, but not when it´s not, saving this way CPU (and batteries, if running in a laptop).
So, we want to combine the efficiency of Case # 2, with the possibility of switching to Case # 1 when needed. The correct implementation for this, in C#, is:
       ...
            System.Windows.Forms.Application.Idle += new EventHandler(Application_Idle);
            System.Windows.Forms.Application.Run(mForm);
        }
        static void Application_Idle(object sender, EventArgs e)
        {
            if(Application.RunningState)
            {
                 while(ApplicationStillIdle)
                      Compute();
            }
        }
The Application_Idle event will be fired each time the application is about to enter in Idle state (where all the pending messages for the windows has been processed).
This way, interaction between our application and others (running at the same time) will be correct: when we decide it (activate running state), we will have all the computing power for ourselves, but when it´s not needed (RunningState is off), we will yield it.

How to check if application is still Idle

The Application_Idle event is fired only once, when the application is about to enter idle state. Here, we will have to update computations while it is still Idle, giving priority to Windows Messages if there are some. How to check if the application is still Idle? Easy, we can use a property like the following:
        /// <summary>Checks to see if the application is still idle</summary>
        private bool AppStillIdle
        {
            get
            {
                NativeMethods.Message msg;
                return !NativeMethods.PeekMessage(out msg, IntPtr.Zero, 0, 0, 0);
            }
        }

An extra option to make things even better

We have seen how to activate/deactivate the psychotic (processor consuming) behavior of an application, basing on a state variable, but sometimes it is interesting to also deactivate it when our application does not have the focus (it is not the active application in the O.S.).
Windows controls and forms have a “Focused” property, but that doesn’t help much in this case, as we want to check if the entire application has focus, not an specific form (the main form of the application will loose focus often, when we open a child window, for instance).
Then, how to check if our application is the active application in Windows? Unfortunately, it seems that there is no C# managed code to check that, but we can easily use DLLImport to invoke the “GetForegroundWindow” method, comparing its result to our main form handle:
[System.Runtime.InteropServices.DllImport("user32.dll")]
static extern IntPtr GetForegroundWindow();
 
public bool MyAppFocused
{
    get { return GetForegroundWindow() == this.Handle; }
}
This way, we can put an additional condition to the Compute method, deactivating it also when the application is not focused:
static void Application_Idle(object sender, EventArgs e)
{
   if(Application.RunningState && MyAppFocused)
   {
       while(ApplicationStillIdle)
            Compute();
   }
     }

What should not be done

Some people tend to write the following code when need the mixed solution mentioned:
while (mForm.Created)
{
      System.Windows.Forms.Application.DoEvents();
 
      if(Application.RunningState)
             Compute();
}

Though it works, this approach is not correct, as it doesn’t yield processor to other threads (your application won´t enter idle mode). Use the above Idle event approach instead.

Conclusion

Quality software use the processor when really need it. Adding a few extra lines of code, you software will better share resources with other applications, and will improve laptop or handheld device´s battery life.
Take Care!

Saturday, 19 December 2009

AutoBild estudia los efectos del cansancio y del alcohol al volante con simuladores Simax

En el número de esta semana de AutoBild (nº 214, del 18 al 24 de Diciembre de 2009), se publica la primera parte de un estudio realizado en colaboración con Toyota España y Simax, en el que se analizan los efectos del cansancio y del consumo de alcohol en la conducción.

PortadaAutobild¿Y qué mejor manera para hacerlo que realmente analizar la conducción de voluntarios en condiciones de fatiga reales, y después de haber bebido? Y no nos referimos a 10 minutos de test, sino a 2 horas seguidas conduciendo en un entorno monótono como una autovía, para incrementar todavía más el cansancio y el peligro. Además, y para obtener la mayor cantidad de información posible, hemos comparado sus resultados con los de otros voluntarios, descansados y sobrios.

Por razones obvias, este estudio no puede hacerse en la vía pública, y un circuito cerrado no sirve, ya que tendrían que estar 2 horas dando vueltas a un circuito, sin tráfico, y que nada tiene que ver con una carretera real.

La solución diseñada por AutoBild ha sido utilizar simuladores Simax, cedidos por Toyota España. Nos propusieron la idea y desde el principio nos encantó, así que nos pusimos manos a la obra.

Además de permitir a los conductores circular por una vía real (autopista), e interactuar con tráfico, desde Simax diseñamos un ejercicio en el que, a lo largo de 2 horas, los conductores irían enfrentándose a incidencias de tráfico de dificultad creciente. Desde una simple balsa de agua en la calzada, hasta un desprendimiento de rocas.

Los simuladores Simax monitorizan en todo momento todas las reacciones de los conductores y los vehículos, lo que nos ha permitido extraer conclusiones de lo más interesantes. Tenéis vuestro ejemplar de AutoBild en el quiosco.

¡No os lo perdáis!

La semana que viene se publica la segunda parte del estudio, y las conclusiones finales.

Monday, 7 December 2009

My ListView won´t show groups! Why?

This is a well documented issue, but just in case it helps someone.

If your ListView doesn't show groups, ensure the following (starting with the most obvious, and ending with the probable cause):

  1. Ensure your ListView has, at least, one column
  2. Ensure you have created Groups correctly
  3. Properly assign a group to each Item, through their “Group” property
  4. Set listView.ShowGroups property = True
  5. Ensure you have called System.Windows.Forms.Application.EnableVisualStyles() method at the beginning of your application

ListView groups rely on VisualStyles of the application, what is only available on Windows XP systems and above. So, groups won´t work on anything below Windows XP.

Here, you can find some explanations, like the following:

“ListView groups are available only on Windows XP and Windows Server 2003 when your application calls the Application.EnableVisualStyles method. On earlier operating systems, any code relating to groups has no effect and the groups will not appear. As a result, any code that depends on the grouping feature might not work correctly.”

Cheers!

Monday, 16 November 2009

NO, you are not the center of the universe!

If you are one of those who like to talk about nationalities, or about excluding others because they are foreigners, check this video…

No, you are not the center of the universe. And no, the guy from “wherever you want” (place here things like England, Spain, USA, Argentina, China, Japan, Chad, Congo, Sierra Leona, or what the fuck you want to write there) is not a foreigner. In fact, he is just the guy next door, so It´d great if you start treating him as your brother…

As a friend reminded me a couple of days ago:

You have a Jewish, Asian, or Islamic god. Your car comes from Japan but drinks Arabian black gold. You eat Italian pizza and warm your house up with Algerian Gas. You drink Brazilian or Colombian coffee in the morning, and Belgian beer in the night. You spend your spare time in Morocco, Thailand or France, and you read this words with your “made in China” computer, through a “made in USA” network. You count with Arabian numbers, and talk with Latin words.

Then tell me… How do you dare to call your neighbor a foreigner?

Thursday, 12 November 2009

Combos de controles en DotNetManía

El número 64 de la gran revista DotNetManía publica un artículo que he escrito sobre creación de combos de controles en Windows Forms con un solo click, además de muchas otras cosas muy muy interesantes. Espero os resulte de ayuda. Saludos!

  portadagr

http://www.dotnetmania.com/Articulos/064/index.html

Wednesday, 11 November 2009

Google StreetView-ed !

It was funny some months ago when I found the Google Street View car driving around our workplace. Now it has been uploaded to the web.

My pictures (excuse the quality):

google_low

Google version:


Ver mapa más grande

Jaaaaa… I´m the guy behind the streetlight… !!!

Cheers!

Wednesday, 28 October 2009

Autonomous car driving by V.A.I.L

Autoblog has published today a very interesting video on autonomous driving. It surely worth the time you´ll invest watching it…

 

During the first seconds, you can see some Java code. It appears there because that’s the “yesterday” part of the video. I´m sure that they were just about putting some .Net code in the “tomorrow” chapter. They probably run out of time… ;)

Specially beautiful the close look at the accelerating wheel, at the end of the video.

Hope you enjoy it.

Tuesday, 27 October 2009

Así son los moteros…

Un buen amigo me ha enviado hoy un texto que me ha tenido partiéndome el culo media hora. Seguro que se ha publicado en mil sitios ya, pero me ha hecho tanta gracia que no puedo sino pegarlo aquí.

Así son los moteros.....Sacado de un foro de motos:


Hola a Todos,

No suelo escribir en el foro, pero necesito vuestro consejo en lo que puede ser una decisión crucial.

Desde hace un tiempo sospecho que mi mujer me engaña. Ya sabéis, las cosas típicas como esas llamadas a casa que cuando descuelgas no contesta nadie. Mi mujer sale con amigas últimamente, pero le pregunto sus nombres y me dice que no las conozco, que son amigas del trabajo... Me quedo despierto para verla llegar en taxi, pero ella baja por la calle a pie y a lo lejos escucho un coche alejarse del que se ha bajado a la vuelta de la esquina. ¿Quizás no volvió en taxi...?

El otro día cogí su móvil sólo para mirar la hora y se puso como loca gritándome que nunca cogiera otra vez su móvil y que si la estaba espiando. Nunca he hablado de esto con mi mujer. Creo que en el fondo no quiero saber la verdad, pero anoche ella volvió a salir y decidí que iba a investigar.

Aparqué mi moto cerca del garaje y me escondí detrás de ella de manera que tuviera una buena visión de toda la calle cuando ella llegara a casa. En ese momento, agachado detrás de la moto, me di cuenta de que la tapa de balancines perdía algo de aceite.

Y aquí viene mi pregunta:

¿Reaprieto y me la juego a que se pase de rosca o me lío y cambio la junta?

 

¡Juas!

Thursday, 15 October 2009

The Italian (or Spanish) man who went to Malta… Juas!

Me parto el culo…

 

Tuesday, 6 October 2009

Simax ECO-Driving goes U.K.

Simax is starting to distribute the ECO-Experience Simax Simulator in the U.K. Of course, they drive on the left, so we had to develop new environments, with a British look.

Want it real? Check this…

IQ1lond7lond1  lond8

Simax ECO-Experience allows drivers to improve their driving skills, with a special focus on ecological driving concepts. Simax ECO-Experience has been installed in several 1st line Motor Shows around Europe, where thousands of drivers learned how to save tons of CO2 emissions. To know more:

www.simaxvirt.com

The real simulation experience

Saturday, 3 October 2009

Usuario y Password para configuración del router ADSL de timofónica

Si, como a mi, no te dieron el usuario y password para configurar tu router de telefónica no desesperes. Todos suelen tener los mismos. Aqui va una recopilación:

  • Usuario: admin, Password: admin
  • Usuario: adminttd, Password: adminttd
  • Usuario: 1234, Password: 1234
  • Usuario: user, Password: admin
  • Usuario: admin, Password: 1234
  • etc

Nota: Para acceder a la config de tu router tienes que ir en el navegador web (IE) a una dirección IP concreta, que suele ser: http://192.168.1.1

Espero que te sirva…

Monday, 28 September 2009

¿Y ahora qué? … o de cómo enmierdar un buen software

En los comentarios de este post del blog del Maligno (ver resumen aqui) hablé de tres categorías en las que, en mi opinión, puede dividirse el software. Hoy hablaré de otras dos totalmente diferentes: aplicaciones infinitas, y aplicaciones enmierdadas hasta el infinito.

Hay proyectos que no acaban nunca. Siempre pueden ser mejorados, y te darán trabajo hasta que te mueras, si quieres. Hay otros sin embargo, cuyo recorrido acaba, o al menos debería acabar.

Si un software ha nacido para tostar CDs, pues que tueste CDs. No necesitamos que también nos tueste las rebanadas del pan bimbo de las mañanas. Si una aplicación ha nacido para reproducir mp3, pues que la reproduzca. No necesitamos que venga con un plugin para reproducir un 8 pistas a través del RS232, o un vinilo pasando la uña crecida del dedo meñique por sus surcos (y enchufándonos un USB en el culo).

Y es que parece que ciertos desarrolladores, cuando ya no saben por donde seguir con un producto, en lugar de dedicarse a otra cosa lo convierten en un poupurri de mierda de lo más variopinta.

Un ejemplo: Nero

Sin duda el mejor programa de grabación de CDs/DVDs, reconvertido a todo una suite de aplicaciones chungueras. Tenemos el Nero Cover Designer, el Nero Track Editor, el Nero Publishing Suite, el Nero Hacking Monitor, el Nero Porn Finder y el Nero World Destroyer. Todo en uno. Al menos sus fabricantes han tenido la delicadeza de poner cada funcionalidad en una aplicación diferente (que puedes no instalar), y dejar al Nero original intacto. Creo que gracias a eso sigue siendo tan popular, porque… ¿os imagináis toda esa mierda dentro del interfaz de Nero?

Otro ejemplo: WinAmp.

Dios mío, hasta se me escapa una lagrimilla. ¿Donde está mi WinAmp de toda la vida? ¿Como se puede meter tanta mierda en un mismo saco? WinAmp Search Engine, WinAmp ToolBar, WinAmp Rip&Burn (es esto legal?), WinAmp Video, WinAmp Converters, WinAmp signal processing, …

¡Qué aburrimiento! Me pone de los nervios que los fabricantes de software intenten meterte toda su ponzoña y llenarte el PC de porquerías. Adobe te mete su updater y su downloader a gogó. Dell, más de lo mismo. Apple ni te cuento, con su puñetero Bonjour y su puto iTunes que no quiero ver ni en pintura (y que te obliga a instalar si quieres tener el QuickTime). El antivirus con su toolbar para el browser, yahoo igual, google igual, WinAmp igual, Megaupload igual… ¡5 toolbars que si no te preocupas quedarán instalados en tu sistema!

Y luego dicen que Windows o Internet Explorer va lento. ¡Nos ha jodido!

Todos los programas que instalas, por defecto llevan toda su mierda activada. Tienes que preocuparte proactivamente de desmarcar las opciones que no desees. ¿Tenéis idea de cuantos servicios están corriendo en vuestro PC ahora mismo que son totalmente INNECESARIOS y que restan rendimiento al sistema?

En fin, señores. Que si. Que es posible que un proyecto de software esté acabado. Cuando se vean a ustedes mismos incluyendo funcionalidades de meteorología en su desfragmentador de discos duros, es hora de dejarlo.

¿No creen?

Tuesday, 22 September 2009

Properly scaling Point Sprites in DirectX

When rendering point sprites with DirectX, many people deals with the issue of point sprite scaling. There are many options regarding this, but in the 99% of the cases, people just wants one of the following two things:

1.- Point Sprites with a constant screen-space size

Ver imagen en tamaño completo

This sprites are rendered at the 3D position (projected to the viewport of course), but with a constant size, in pixels. So, if you move the camera towards the point sprite, it feels like if it shrinks, and if you move the camera away from the sprite, it will look like if “growing”.

This effect is sometimes desirable, specially if you are using point sprites to simulate 2D effects, post-processing or camera effects like Lens Flares (on your left).

To setup this kind of Point Sprite rendering in DirectX, you just need to disable PointSprite scaling. This way, you will be able to specify point sizes both with a default value or in each point vertex information. These are the render states you need to modify:

Disable point scaling: D3DRS_POINTSCALEENABLE = false

Set default point size (remember, in pixels): D3DRS_POINTSIZE = XX

2.- Point Sprites with a real 3D size

By contrary to what’s shown in the previous chapter, sometimes you need point sprites to behave like real 3D objects, i.e. when using them to simulate a particle system, or that kind of stuff.

In this case, you don’t want sprites to remain at the same size if you move away from them, as real 3D objects get smaller as we get farther from them and vice versa. It is obvious that we need to change the size of sprites (remember, it’s a 2D size, in pixels), accordingly to our 3D movements, to give the illusion of 3D behavior.

DirectX is already prepared for that, with a built-in system to handle point sprite scaling. You just need to activate it setting the render state D3DRS_POINTSCALEENABLE to true, and setting up some other parameters. If you take a look at the docs, you will find the following:

Point Size Computations

Point size is determined by D3DRS_POINTSCALEENABLE. If this value is set to FALSE, the application-specified point size is used as the screen-space (post-transformed) size. Vertices that are passed to Direct3D in screen space do not have point sizes computed; the specified point size is interpreted as screen-space size.

If D3DRS_POINTSCALEENABLE is TRUE, Direct3D computes the screen-space point size according to the following formula. The application-specified point size is expressed in camera space units.

S s = Vh * S i * sqrt(1/(A + B * D e + C *( D e2 )))

In this formula, the input point size, S i, is either per-vertex or the value of the D3DRS_POINTSIZE render state. The point scale factors, D3DRS_POINTSCALE_A, D3DRS_POINTSCALE_B, and D3DRS_POINTSCALE_C, are represented by the points A, B, and C. The height of the viewport, V h, is the D3DVIEWPORT9 structure's Height member representing the viewport. D e, the distance from the eye to the position (the eye at the origin), is computed by taking the eye space position of the point (Xe, Ye, Ze) and performing the following operation.

D e = sqrt (Xe2 + Y e2 + Z e2)

The maximum point size, Pmax, is determined by taking the smaller of either the D3DCAPS9 structure's MaxPointSize member or the D3DRS_POINTSIZE_MAX render state. The minimum point size, Pmin, is determined by querying the value of D3DRS_POINTSIZE_MIN. Thus the final screen-space point size, S, is determined in the following manner.

  • If Ss > Pmax, then S = P max
  • if S < Pmin, then S = P min
  • Otherwise, S = S s

Well, that´s all. Clear ugh? Just kidding…

Setting the A,B,C scaling parameters for a correct, 3D, point scaling

In order to achieve the behavior you want, you just have to set the following configuration:

D3DRS_POINTSCALE_A = 0.0f
D3DRS_POINTSCALE_B = 0.0f
D3DRS_POINTSCALE_C = 1.0f
D3DRS_POINTSIZE_MIN = 0.0f
D3DRS_POINTSIZE_MAX = 9999.0f

Et voilá ! 3D point sprites…

Ultimas apariciones en prensa de Simax

SimaxReflect_256x128Simax ha aparecido últimamente en prensa y radio, a raíz del galardón en los premios Bancaja, e interesándose especialmente por nuestros productos para formación en conducción ecológica. Ahí van los links:

La razón:

http://www.larazon.es/noticia/conduccion-ecologica-virtual

Punto Radio (con audio de la entrevista):

http://actual.lasprovincias.es/jovenesemprendedores/etiquetas/inaki-ayucar/

Saludos.

Wednesday, 16 September 2009

Dell XPS 630i, the BT Mini-Receiver, download managers, and other crap…

If you are one of the owners of a Dell computer, you´ll probably be happy with it, like me. However, nothing´s perfect, and of course dell is not either.

One of the things I hate most on Earth is that too-usual way of managing download from websites. Every company has it’s own download manager, and in a few months you end up with a dozen of them installed on your computer or on your browser: Adobe Update Manager, Dell Download Manager, Apple Update Manager, Google updater, Aaaaarrgggllll !!!! I just want to download the f**ing file !!!!! Stop selling me your crap !!!!

Well, back to business. If you own an XPS 630i (and probably other Dell models), and you don’t install all the crap that comes with your system (Dell tune up utility, Dell application, Dell recovery, Dell diagnostics, Dell sandwich maker and the Dell fantastasystem), or if you simply lost your original CD drivers, you will probably miss one: the driver for the BlueTooth Mini-Receiver.

It is not initially listed in the default downloads list at www.dell.com for the 630i (a very deficient list, I must say. Is it that difficult to include a small description of the purpose of each application or update?), and believe me, it’s not easy to find.

I also tried to use one of the magnificent Dell helper-applications to find it, but the one it suggested didn’t work. Finally, I found it here, thanks to the people of My630i.

Thanks a lot!

Saturday, 5 September 2009

Ramblings about your own, fast C# Maths library

Why not building your own C# Maths library instead of using the ones included in APIs like DirectX or XNA? That would give you some advantages when developing a game or something similar. Keep reading…
It is late at night. Very late. In fact it is so late, that I should say it’s almost early in the morning… but I´m immersed in a task I have delayed for too much time: extend my own Math library, directly written in C#. What you said? That cannot be fast enough? ha ha… Some FAQs first:

Intro

Why to write my own library?

Because I think it’s good to depend on the Graphics API as less as possible. Why?
  • Because APIs are deprecated from time to time (as happened with Managed DirectX), or evolve, or change, or re-appear in some form (as it seems that is going to happen with some parts of Managed DirectX, via the Windows 7 Code Pack), or you just decide to use another API, or you want your code to be compatible both with XNA and with DirectX.
  • Because I want to be able to debug the inners of my Maths.
  • Because writing your own library (even if you copy most parts of it), you will learn a lot.
  • Because seeing the internals of things, you will realize how much work some tasks take, and you will think it twice before calling them…
Do you need any more reasons?

Wait a minute. Faster code, using no API?

For CPU-related stuff, YES, it is possible.
XNA (or DirectX) makes no magic. They just calculate things as you will do. But of course you must choose the best algorithms for each task.If you do so, you will end up with a code as fast as the one written in the API, but the point is, as you will know how things work in the inners, you will have a much more detailed idea of the cost of things, and you will optimize your code. In addition to that, you will be able to tune things a bit for your specific needs, increasing the speed a bit more.
However, keep in mind that companies like Microsoft have hundreds or thousands of the best engineers working in their products, so if there is a better method of doing something, the API will have it. You need to be that good too!
Let’s see an example of the kind of optimization you need to achieve to make your own Maths library as fast (or faster) than the ones included in the APIs. We will include some matrix operations and make some speed tests with different algorithms.

A note about Affine Transformation Matrices

In 3D graphics applications, transformations are typically built from the combination of elementary matrices (being this elementary matrices: translations, rotations, scales and the identity matrix). Matrices built this way, always result in Affine transformations (matrices with the right column of the matrix being 0,0,0,1 –if using row representation of matrices-).
There are some algorithm optimizations in this article that are only valid for Affine Transformation matrices. You should avoid using them for non-affine matrices. It is the case of Perspective Projection matrices, which DO NOT BELONG to the affine transformations group.

A note about JIT compilation and .NET maths performance

This is a very interesting question: Are local variables faster than member variables?
At first, one could say that they aren’t, as they have to be re-created each time the method is called. BUT, as this interesting blog post points out, while the JIT compiler can en-register local variables, it cannot do the same with member fields. So we should say that YES, THEY ARE, when talking about math calculations. As you can see the in the test, it makes a remarkable difference. So:
If you are developing in .Net and you have a performance critical math. calculation method, you should use local copy of variables to speed up calculations
In fact, if you get a glimpse inside the XNA code, every non-static method inside the Matrix class, which is performance critical, makes a local copy of each member field.

Part #1 Matrix Determinant Calculation

The general Method (Based on Graphics Gems I)

The following is a pretty straight-forward algorithm for calculating the determinant of a matrix. It is mostly designed for non-realtime applications, as it quite slow and uses double-precission. It calculates a 4x4 determinant basing on the 3x3 method, then in the 2x2 and so on. It is a based in an algorithm used in Graphics Gems I for matrix inversion.
        public double Determinant4x4()
        {
            double ans;
            double a1, a2, a3, a4, b1, b2, b3, b4, c1, c2, c3, c4, d1, d2, d3, d4;
 
            a1 = this.M11; b1 = this.M12;
            c1 = this.M13; d1 = this.M14;
 
            a2 = this.M21; b2 = this.M22;
            c2 = this.M23; d2 = this.M24;
 
            a3 = this.M31; b3 = this.M32;
            c3 = this.M33; d3 = this.M34;
 
            a4 = this.M41; b4 = this.M42;
            c4 = this.M43; d4 = this.M44;
 
            ans = a1 * Determinant3x3(b2, b3, b4, c2, c3, c4, d2, d3, d4)
                - b1 * Determinant3x3(a2, a3, a4, c2, c3, c4, d2, d3, d4)
                + c1 * Determinant3x3(a2, a3, a4, b2, b3, b4, d2, d3, d4)
                - d1 * Determinant3x3(a2, a3, a4, b2, b3, b4, c2, c3, c4);
            return ans;
        }      
        public double Determinant3x3(double  a1,double  a2,double  a3,double  b1,double  b2,double  b3,double  c1,double  c2,double  c3 )
        {
            double ans;
            ans = a1 * Determinant2x2(b2, b3, c2, c3)
                - b1 * Determinant2x2(a2, a3, c2, c3)
                + c1 * Determinant2x2(a2, a3, b2, b3);
            return ans;
        }
        public double Determinant2x2( double a,double  b, double c, double d)
        {
            double ans;
            ans = a * d - b * c;
            return ans;
        }

The general Method (real-time applications optimized)

This second version is also a full, general method, but optimized for real-time applications. It is much faster, because of the calculation and because of the grouping of duplicated mults in numXX variables, saving multiplications and adds.
       public float Determinant()
        {
            float num18 = (M33 * M44) - (M34 * M43);
            float num17 = (M32 * M44) - (M34 * M42);
            float num16 = (M32 * M43) - (M33 * M42);
            float num15 = (M31 * M44) - (M34 * M41);
            float num14 = (M31 * M43) - (M33 * M41);
            float num13 = (M31 * M42) - (M32 * M41);
            return ((((M11 * (((M22 * num18) - (M23 * num17)) + (M24 * num16))) - (M12 * (((M21 * num18) -
                   (M23 * num15)) + (M24 * num14)))) + (M13 * (((M21 * num17) - (M22 * num15)) +
                   (M24 * num13)))) - (M14 * (((M21 * num16) - (M22 * num14)) + (M23 * num13))));
        }
This is the one used in XNA or SlimDX. It is the standard de-facto.

Optimization I (faster Memory Accessing)

As stated in the intro, about the performance of local variables vs. member fields, we´ll make an optimization of the above method using this concept:
    public float Determinant()
        {
            float L11 = this.M11;
            float L12 = this.M12;
            float L13 = this.M13;
            float L14 = this.M14;
            float L21 = this.M21;
            float L22 = this.M22;
            float L23 = this.M23;
            float L24 = this.M24;
            float L31 = this.M31;
            float L32 = this.M32;
            float L33 = this.M33;
            float L34 = this.M34;
            float L41 = this.M41;
            float L42 = this.M42;
            float L43 = this.M43;
            float L44 = this.M44;
            float C1 = (L33 * L44) - (L34 * L43);
            float C2 = (L32 * L44) - (L34 * L42);
            float C3 = (L32 * L43) - (L33 * L42);
            float C4 = (L31 * L44) - (L34 * L41);
            float C5 = (L31 * L43) - (L33 * L41);
            float C6 = (L31 * L42) - (L32 * L41);
            return ((((L11 * (((L22 * C1) - (L23 * C2)) + (L24 * C3))) - (L12 * (((L21 * C1) -
                   (L23 * C4)) + (L24 * C5)))) + (L13 * (((L21 * C2) - (L22 * C4)) + (L24 * C6)))) -
                   (L14 * (((L21 * C3) - (L22 * C5)) + (L23 * C6))));
        }
See the test for performance results.

Optimization II (Based on Graphics Gems II, for Affine transformations only)

The following algorithm is a direct translation of a piece of code found in a Affine Matrix Inversion algorithm, inside Graphics Gems II (see in the next post for the full algorithm). It makes only 12 mults and 7 adds. Keep in mind that this method is only valid for affine transformations (see intro for further information):
        public float Determinant()
        {
            return (this.M11 * this.M22 * this.M33) +
                   (this.M12 * this.M23 * this.M31) +
                   (this.M13 * this.M21 * this.M32) -
                   (this.M13 * this.M22 * this.M31) -
                   (this.M12 * this.M21 * this.M33) -
                   (this.M11 * this.M23 * this.M32);
        }

 

The test

So, it’s time to measure this little boys.

Performance

The test was made calculation a 4x4 affine matrix determinant 100 Million times, in a Intel Core 2 Quad Q9450 2.66Ghz with 3 Gb (non multi-threading).
General method (Graphics Gems I, non real-time)
1 minute: 0.9 secs
General method (real-time)
2.6832 secs
Optimization I (memory access)
2.496 secs
Optimization II (affine transformations)
1.404 secs

Reliability

The reliability test was made with two different matrices.There is no surprises here, all the algorithms work well. The following is the matrices used for the tests (affine transformations) and the results (the determinant) calculated by each algorithm.
Matrix 1:
0.7071068 -0.235702276 0.6666667 0.0
0.0 0.9428091 0.33333334 0.0
-0.7071068 -0.235702276 0.6666667 0.0
0.0 0.0 -15.0 1
General method (Graphics Gems I)
1.0000001641611829
General method (real-time)
1.00000012
Optimization I (memory access)
1.00000012
Optimization II (affine transformations)
1.00000012

Matrix 2:
9.848019 0.10061159 0.0 0.0
-3.50578356 0.282625765 0.0 0.0
0.0 0.0 94.22 0.0
0.0 0.0 0.0 1
General method (Graphics Gems I)
295.47639778127973
General method (real-time)
295.4764
Optimization I (memory access)
295.4764
Optimization II (affine transformations)
295.4764

Part #2: Matrix Inverse Calculation

One of the most expensive Matrix operation is calculating the inverse of a matrix. Will start with the general algorithm for this, and study a couple of optimizations later:

The general Method

This algorithm (explained here), is something like this:
        public Matrix Inverse()
        {
            float determinant = this.Determinant();
            if (determinant == 0)
                throw new System.InvalidOperationException("Matrix has zero determinant (singular matrix). Inverse doesn´t exist");
 
            Matrix b = new Matrix();
            b.M11 = M22 * M33 * M44 + M23 * M34 * M42 + M24 * M32 * M43 - M22 * M34 * M43 - M23 * M32 * M44 - M24 * M33 * M42;
            b.M12 = M12 * M34 * M43 + M13 * M32 * M44 + M14 * M33 * M42 - M12 * M33 * M44 - M13 * M34 * M42 - M14 * M32 * M43;
            b.M13 = M12 * M23 * M44 + M13 * M24 * M42 + M14 * M22 * M43 - M12 * M24 * M43 - M13 * M22 * M44 - M14 * M23 * M42;
            b.M14 = M12 * M24 * M33 + M13 * M22 * M34 + M14 * M23 * M32 - M12 * M23 * M34 - M13 * M24 * M32 - M14 * M22 * M33;
 
            b.M21 = M21 * M34 * M43 + M23 * M31 * M44 + M24 * M33 * M41 - M21 * M33 * M44 - M23 * M34 * M41 - M24 * M31 * M43;
            b.M22 = M11 * M33 * M44 + M13 * M34 * M41 + M14 * M31 * M43 - M11 * M34 * M43 - M13 * M31 * M44 - M14 * M33 * M41;
            b.M23 = M11 * M24 * M43 + M13 * M21 * M44 + M14 * M23 * M41 - M11 * M23 * M44 - M13 * M24 * M41 - M14 * M21 * M43;
            b.M24 = M11 * M23 * M34 + M13 * M24 * M31 + M14 * M21 * M33 - M11 * M24 * M33 - M13 * M21 * M34 - M14 * M23 * M31;
 
            b.M31 = M21 * M32 * M44 + M22 * M34 * M41 + M24 * M31 * M42 - M21 * M34 * M42 - M22 * M31 * M44 - M24 * M32 * M41;
            b.M32 = M11 * M34 * M42 + M12 * M31 * M44 + M14 * M32 * M41 - M11 * M32 * M44 - M12 * M34 * M41 - M14 * M31 * M42;
            b.M33 = M11 * M22 * M44 + M12 * M24 * M41 + M14 * M21 * M42 - M11 * M24 * M42 - M12 * M21 * M44 - M14 * M22 * M41;
            b.M34 = M11 * M24 * M32 + M12 * M21 * M34 + M14 * M22 * M31 - M11 * M22 * M34 - M12 * M24 * M31 - M14 * M21 * M32;
 
            b.M41 = M21 * M33 * M42 + M22 * M31 * M43 + M23 * M32 * M41 - M21 * M32 * M43 - M22 * M33 * M41 - M23 * M31 * M42;
            b.M42 = M11 * M32 * M43 + M12 * M33 * M41 + M13 * M31 * M42 - M11 * M33 * M42 - M12 * M31 * M43 - M13 * M32 * M41;
            b.M43 = M11 * M23 * M42 + M12 * M21 * M43 + M13 * M22 * M41 - M11 * M22 * M43 - M12 * M23 * M41 - M13 * M21 * M42;
            b.M44 = M11 * M22 * M33 + M12 * M23 * M31 + M13 * M21 * M32 - M11 * M23 * M32 - M12 * M21 * M33 - M13 * M22 * M31;
 
            // This last operation involves a float * Matrix multiplication
            return (1f / determinant) * b;
        }
You can use any of the methods presented in Part 1 of this article for calculating the Determinant.

Optimization I (memory access and some operation savings)

This method applies the same concept, but using local variables and packing duplicated calculations in local variables too:
        public static Matrix InvertXNALocalValues(Matrix matrix)
        {
            Matrix outMatrix = new Matrix();
            float L1 = matrix.M11;
            float L2 = matrix.M12;
            float L3 = matrix.M13;
            float L4 = matrix.M14;
            float L5 = matrix.M21;
            float L6 = matrix.M22;
            float L7 = matrix.M23;
            float L8 = matrix.M24;
            float L9 = matrix.M31;
            float L10 = matrix.M32;
            float L11 = matrix.M33;
            float L12 = matrix.M34;
            float L13 = matrix.M41;
            float L14 = matrix.M42;
            float L15 = matrix.M43;
            float L16 = matrix.M44;
            float L17 = (L11 * L16) - (L12 * L15);
            float L18 = (L10 * L16) - (L12 * L14);
            float L19 = (L10 * L15) - (L11 * L14);
            float L20 = (L9 * L16) - (L12 * L13);
            float L21 = (L9 * L15) - (L11 * L13);
            float L22 = (L9 * L14) - (L10 * L13);
            float L23 = ((L6 * L17) - (L7 * L18)) + (L8 * L19);
            float L24 = -(((L5 * L17) - (L7 * L20)) + (L8 * L21));
            float L25 = ((L5 * L18) - (L6 * L20)) + (L8 * L22);
            float L26 = -(((L5 * L19) - (L6 * L21)) + (L7 * L22));
            float L27 = 1f / ((((L1 * L23) + (L2 * L24)) + (L3 * L25)) + (L4 * L26));
            float L28 = (L7 * L16) - (L8 * L15);
            float L29 = (L6 * L16) - (L8 * L14);
            float L30 = (L6 * L15) - (L7 * L14);
            float L31 = (L5 * L16) - (L8 * L13);
            float L32 = (L5 * L15) - (L7 * L13);
            float L33 = (L5 * L14) - (L6 * L13);
            float L34 = (L7 * L12) - (L8 * L11);
            float L35 = (L6 * L12) - (L8 * L10);
            float L36 = (L6 * L11) - (L7 * L10);
            float L37 = (L5 * L12) - (L8 * L9);
            float L38 = (L5 * L11) - (L7 * L9);
            float L39 = (L5 * L10) - (L6 * L9);
            outMatrix.M11 = L23 * L27;
            outMatrix.M21 = L24 * L27;
            outMatrix.M31 = L25 * L27;
            outMatrix.M41 = L26 * L27;
            outMatrix.M12 = -(((L2 * L17) - (L3 * L18)) + (L4 * L19)) * L27;
            outMatrix.M22 = (((L1 * L17) - (L3 * L20)) + (L4 * L21)) * L27;
            outMatrix.M32 = -(((L1 * L18) - (L2 * L20)) + (L4 * L22)) * L27;
            outMatrix.M42 = (((L1 * L19) - (L2 * L21)) + (L3 * L22)) * L27;         
            outMatrix.M13 = (((L2 * L28) - (L3 * L29)) + (L4 * L30)) * L27;
            outMatrix.M23 = -(((L1 * L28) - (L3 * L31)) + (L4 * L32)) * L27;
            outMatrix.M33 = (((L1 * L29) - (L2 * L31)) + (L4 * L33)) * L27;
            outMatrix.M43 = -(((L1 * L30) - (L2 * L32)) + (L3 * L33)) * L27;         
            outMatrix.M14 = -(((L2 * L34) - (L3 * L35)) + (L4 * L36)) * L27;
            outMatrix.M24 = (((L1 * L34) - (L3 * L37)) + (L4 * L38)) * L27;
            outMatrix.M34 = -(((L1 * L35) - (L2 * L37)) + (L4 * L39)) * L27;
            outMatrix.M44 = (((L1 * L36) - (L2 * L38)) + (L3 * L39)) * L27;
            return outMatrix;
        }

Optimization II (for Affine Transformations only -Based on the Kevin Wu article, Graphics Gems II-)

The Graphics Gems books series are a must in any graphics or game developer’s bookshelves. You can by the the book I’m talking about here, or get more info about the Graphics Gems series here. Google have some excerpts of this book in the following link, but one page out of each two is missing: Graphics gems II‎ - Página 342.
So, the algorithm purposed by Mr. Wu looks like the following (ported to C#, original in C++):
   public static bool Inverse(Matrix min, Matrix mout)
        {
            float det_1;
            float pos, neg, temp;
            double PRECISION_LIMIT = 1.0e-15;
            // Calculate determinand
            pos = neg = 0.0f;
            temp = min.M11 * min.M22 * min.M33;
            if (temp >= 0.0)
                pos += temp;
            else
                neg += temp;
            temp = min.M12 * min.M23 * min.M31;
            if (temp >= 0.0)
                pos += temp;
            else
                neg += temp;
            temp = min.M13 * min.M21 * min.M32;
            if (temp >= 0.0)
                pos += temp;
            else
                neg += temp;
            temp = -min.M13 * min.M22 * min.M31;
            if (temp >= 0.0)
                pos += temp;
            else
                neg += temp;
            temp = -min.M12 * min.M21 * min.M33;
            if (temp >= 0.0)
                pos += temp;
            else
                neg += temp;
            temp = -min.M11 * min.M23 * min.M32;
            if (temp >= 0.0)
                pos += temp;
            else
                neg += temp;
            det_1 = pos + neg;
 
            // Is the submatrix A singular? then Matrix M has no inverse
            if ((det_1 == 0.0) || (Math.Abs(det_1 / (pos - neg)) < PRECISION_LIMIT))
                throw new System.InvalidOperationException("Matrix has zero determinant (singular matrix). Inverse doesn´t exist");
 
            // Calculate inverse(A) = adj(A) / det(A)
            det_1 = 1.0f / det_1;
            mout.M11 = (min.M22 * min.M33 - min.M23 * min.M32) * det_1;
            mout.M21 = -(min.M21 * min.M33 - min.M23 * min.M31) * det_1;
            mout.M31 = (min.M21 * min.M32 - min.M22 * min.M31) * det_1;
            mout.M12 = -(min.M12 * min.M33 - min.M13 * min.M32) * det_1;
            mout.M22 = (min.M11 * min.M33 - min.M13 * min.M31) * det_1;
            mout.M32 = -(min.M11 * min.M32 - min.M12 * min.M31) * det_1;
            mout.M13 = (min.M12 * min.M23 - min.M13 * min.M22) * det_1;
            mout.M23 = -(min.M11 * min.M23 - min.M13 * min.M21) * det_1;
            mout.M33 = (min.M11 * min.M22 - min.M12 * min.M21) * det_1;
 
            /* Calculate -C * inverse(A) */
            mout.M41 = -(min.M41 * mout.M11 + min.M42 * mout.M21 + min.M43 * mout.M31);
            mout.M42 = -(min.M41 * mout.M12 + min.M42 * mout.M22 + min.M43 * mout.M32);
            mout.M43 = -(min.M41 * mout.M13 + min.M42 * mout.M23 + min.M43 * mout.M33);
 
            /* Fill in last column */
            mout.M14 = mout.M24 = mout.M34 = 0.0f;
            mout.M44 = 1.0f;
            return true;
        }
Please keep in mind that this algorithm is only valid for affine transformations. Of course, some memory accessing optimization could be done here, using local variables as explained in Part 1.

Optimization III (for matrices composed by Rotations and Translations only)

If you know that your matrix is composed by rotations and translations only, you can apply this (even faster) method, which avoids calculating the determinant, and takes advantage of some properties of this kind of matrices. You can read all the info here.
        public Matrix Inverse()
        {
            Matrix r = new Matrix();
            r.M11 = this.M11;
            r.M12 = this.M21;
            r.M13 = this.M31;
            r.M14 = 0f;
 
            r.M21 = this.M12;
            r.M22 = this.M22;
            r.M23 = this.M32;
            r.M24 = 0f;
 
            r.M31 = this.M13;
            r.M32 = this.M23;
            r.M33 = this.M33;
            r.M34 = 0f;
 
            r.M41 = -(r.M11 * this.M41 + r.M21 * this.M42 + r.M31 * this.M43);
            r.M42 = -(r.M12 * this.M41 + r.M22 * this.M42 + r.M32 * this.M43);
            r.M43 = -(r.M13 * this.M41 + r.M23 * this.M42 + r.M33 * this.M43);
            r.M44 = 1f;
            return r;
        }

Additional Optimizations

If you need it, other optimizations could be included in this algorithms, detecting special matrices whose inverse can be calculated even faster. It is the case of diagonal matrices (where only the diagonal should be inverted using 1/value entries), or Orthogonal matrices (whose inverse is equal to its transpose).

The Test

We made a very simple test, with 50.000.000 (50 Million) inverse calculations of affine transformations, in an Intel Core 2 Quad Q9450 2.66Ghz with 3 Gb (non multi-threading).
General method
11.523 secs
Optimization I (memory accessing and op. savings)
5.4912 secs
Optimization II (affine transformations only)
2.7144 secs
Optimization III (rotations and translations only)
1.326 secs

Conclusions

Searching a bit, it is indeed possible to reach the performance of your graphics API and as mentioned above, the advantages of having (and building) your own library are countless.
Take care!