Misusing this blog to write up a personal project again, I thought I’d explain the basics behind a simple eye-tracker I hacked together a while back (this video from August 08 shows my first version):
I’ve had quite a few questions about how to implement this in Python, so I thought I would explain the general process:
Detecting the Face
The first stage is to take the image from the webcam and estimate where the face is. There are good tutorials on using the Haar classifiers that come with OpenCV to detect faces, and I would recommend reading a few of those.
The important thing is that this stage has to be really fast and give a low false positive rate – so I tweaked the parameters until false positives were rare, at the cost of only detecting a face position in about one frame in ten on average. Since the face isn’t going to be moving much, I reused the previous face position whenever no new one was detected. I did not care too much about positional accuracy for this stage, though.
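A minimal sketch of that fallback, not the original code. The `FaceTracker` class and `make_haar_detector` helper are hypothetical names, the `scaleFactor` and `minNeighbors` values are guesses rather than the parameters actually used, and keeping the largest box is an extra assumption:

```python
class FaceTracker:
    """Remember the last detected face and reuse it when detection fails."""

    def __init__(self, detect):
        # `detect` is any callable that takes a frame and returns a sequence
        # of (x, y, w, h) boxes; it may well be empty on most frames.
        self.detect = detect
        self.last_face = None

    def update(self, frame):
        boxes = self.detect(frame)
        if len(boxes) > 0:
            # Assumption: the largest box is the user sitting at the webcam.
            largest = max(boxes, key=lambda b: b[2] * b[3])
            self.last_face = tuple(int(v) for v in largest)
        # Falls back to the previous position when nothing was found.
        return self.last_face


def make_haar_detector(cascade_path):
    """Build a strict Haar face detector (needs the opencv-python package)."""
    import cv2  # imported here so FaceTracker itself has no dependencies

    cascade = cv2.CascadeClassifier(cascade_path)

    def detect(frame):
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # A high minNeighbors gives fewer false positives but more misses.
        return cascade.detectMultiScale(gray, scaleFactor=1.2, minNeighbors=8)

    return detect
```

Usage would be roughly `tracker = FaceTracker(make_haar_detector(path_to_cascade_xml))` followed by `face = tracker.update(frame)` on every webcam frame; `face` stays `None` until the first successful detection.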
Detecting the Eye positions
Speed is (again) very important, but this step also has to be done with a very high level of accuracy for the simple method I used. Since the eyes are expected to be in a certain position on the head, I ran the Haar classifier for eyes over only the top 2/3 of the “face” detected before, which massively reduces the processing time for this stage.
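The cropping itself is only a little box arithmetic. A sketch, assuming boxes are `(x, y, w, h)` tuples as OpenCV returns them; both function names are made up for illustration:

```python
def eye_search_region(face):
    """Return the top two thirds of a face box given as (x, y, w, h)."""
    x, y, w, h = face
    return (x, y, w, (2 * h) // 3)


def to_frame_coords(region, eye_boxes):
    """Shift eye boxes found inside `region` back into full-frame coordinates."""
    rx, ry, _, _ = region
    return [(rx + ex, ry + ey, ew, eh) for (ex, ey, ew, eh) in eye_boxes]


# With a greyscale frame `gray` stored as a NumPy array, the crop would be:
#   x, y, w, h = eye_search_region(face)
#   roi = gray[y:y + h, x:x + w]
# The eye classifier then runs on `roi` only, and to_frame_coords() maps
# its boxes back onto the original frame.
```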
I also found that I regularly got several candidate regions for the eyes (possibly because I used a low-quality classifier), so I enforced some hard-coded rules to try to reduce the number of regions to two, while ensuring that they were in fact eyes being detected.
For example, I checked that the two regions didn’t intersect, and that they were roughly the same size. I also checked how plausible each position was given the previously recorded positions, and so on.
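The first two rules could be sketched like this. It is an illustration rather than the original code: the `tolerance` value and the "closest areas win" tie-break are assumptions, and the check against previous positions is left out.

```python
from itertools import combinations


def intersects(a, b):
    """True if two (x, y, w, h) boxes overlap."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def similar_size(a, b, tolerance=0.5):
    """True if the smaller area is within `tolerance` of the larger one."""
    area_a, area_b = a[2] * a[3], b[2] * b[3]
    return min(area_a, area_b) >= (1.0 - tolerance) * max(area_a, area_b)


def pick_eye_pair(candidates, tolerance=0.5):
    """Return the best (left, right) pair of boxes, or None if none qualifies."""
    best = None
    for a, b in combinations(candidates, 2):
        if intersects(a, b) or not similar_size(a, b, tolerance):
            continue
        # Assumption: prefer the pair whose areas match most closely.
        score = abs(a[2] * a[3] - b[2] * b[3])
        if best is None or score < best[0]:
            # Sorting the tuples orders the pair by x, i.e. left box first.
            best = (score, tuple(sorted((a, b))))
    return None if best is None else best[1]
```

Returning `None` when no pair passes lets the caller fall back to the previous eye positions, in the same way as for the face.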
Analysing the pupils
Once we have the eye positions, we only look at properties of the image around the eyes. (You can see an example image of an eye region being shown in the video – I analysed both, but I only displayed one.)
I really did take a very basic technique here – rather than relying on having a light in front to reflect a white dot in the pupils (which is the standard trick), I simply analyse the normalised moments of the pixel values.
(I just realised that sentence doesn’t sound like “simply” should be included, so here’s a bit more background). Basically, when you’re looking at a distribution (like the distribution of pixel values along the x-axis of an image), the “first moment” of the distribution is the average position, the second (central) moment is the variance, and the third (normalised) moment is the skew.
The variance does not tell us anything about the direction of the pupils, so I used the first and third moments along the X and Y axes as inputs to decide where the eyes are looking.
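One way to compute such features with NumPy is sketched below. How the original code normalised and weighted the pixels isn't stated, so inverting the image (to make the dark pupil the heavy part of the distribution) and dividing the mean by the patch size are assumptions, as are the function names.

```python
import numpy as np


def axis_moments(weights):
    """Mean, variance and skewness of position along one axis.

    `weights` holds one non-negative weight per pixel position; it must not
    be all zero or concentrated on a single position.
    """
    weights = np.asarray(weights, dtype=float)
    positions = np.arange(weights.size)
    total = weights.sum()
    mean = (positions * weights).sum() / total
    centred = positions - mean
    variance = (centred ** 2 * weights).sum() / total
    skew = (centred ** 3 * weights).sum() / total / variance ** 1.5
    return mean, variance, skew


def eye_features(eye_gray):
    """Return [mean_x, skew_x, mean_y, skew_y] for a greyscale eye patch."""
    # Assumption: invert so that dark pixels (the pupil) weigh the most.
    darkness = 255.0 - np.asarray(eye_gray, dtype=float)
    height, width = darkness.shape
    mean_x, _, skew_x = axis_moments(darkness.sum(axis=0))  # column profile
    mean_y, _, skew_y = axis_moments(darkness.sum(axis=1))  # row profile
    # Scale the means to 0..1 so they do not depend on the crop size.
    return np.array([mean_x / (width - 1), skew_x,
                     mean_y / (height - 1), skew_y])
```

For a patch with a dark blob exactly in the middle, the two means come out at 0.5 and both skews at 0; moving the blob left lowers `mean_x`.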
Feeding this into a simple linear learning algorithm worked very well. I did try a few more complicated algorithms, but my quick tests did show that I got almost linear behavior between the moments measured and the (true) pupil position on-screen (probably due to the wonderful accuracy of small angle approximations).
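The post doesn't say which linear algorithm was used, so ordinary least squares stands in for it in the sketch below. It assumes a calibration step in which the user looks at known screen points while the moment features are recorded; `fit_linear` and `predict` are illustrative names.

```python
import numpy as np


def fit_linear(features, targets):
    """Least-squares fit so that targets ~ [features, 1] @ weights.

    features: (n_samples, n_features) moment features from calibration.
    targets:  (n_samples, 2) known on-screen (x, y) positions.
    """
    X = np.asarray(features, dtype=float)
    Y = np.asarray(targets, dtype=float)
    # Append a column of ones so the fit includes a constant offset.
    X1 = np.hstack([X, np.ones((X.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(X1, Y, rcond=None)
    return weights


def predict(weights, feature_row):
    """Map one feature vector to an estimated screen position (x, y)."""
    row = np.append(np.asarray(feature_row, dtype=float), 1.0)
    return row @ weights
```

With four features per frame (first and third moments in X and Y), at least five well-spread calibration points are needed for the fit to be determined; more points average out the noise.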
Question: Open Source Eye Tracking
What astounded me was the simplicity of the project - I think it took me about two evenings to do as much as I had done. Why then does all quality eye-tracking software cost an arm and a leg? Sure, there are algorithms and bits of code out there as part of theses, but they all have to be built from source and none of them really have a nice interface.
In my opinion, it would be a massive boost to the FOSS community if someone would focus on building such a system. Something that integrates screen capture, a key/mouse logger, and eyetracking. Unfortunately my code for this was really a quick hack – and I think it would be better to start from scratch than to re-factor the code (which is why I haven’t made it available yet).
Tags: eyes, eyetracking, opencv, programming, python

Cool, Tim! I’d love to give it a spin….
This sounds really cool.
What’s your email Tim?
Cheers -Luke
Hey,
when I detect eyes using a Haar classifier, I often get 5-6 eye candidate regions. They are almost always right on top, or in the vicinity of the actual eyes. However, this means that I get 2-3 boxes for the left eye and 2-3 boxes for the right eye. Sometimes these boxes are intersecting, sometimes a box completely encapsulates another box.
How did you avoid these cases? Can you let us know? We’d appreciate some example code as well, if you can.
Regards,
@Flint
I found the same issue of getting several corresponding regions.
First, I looked at the average area of the regions, and then removed any which were obviously too large or too small compared to the average.
From this new set, I found they were nearly all boxes that contained each other (for each eye) – I can’t remember if I chose the smallest or the largest box as the one to keep (haven’t got the code on me now) – but that worked quite well.
There were a few other tricks – I couldn’t get two eyes from every frame, but if I found one eye then I updated the information for that eye, and then updated the position approximation when I managed to find the second eye.
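A rough sketch of the first two steps, written from the description above rather than taken from the original code. The `low` and `high` thresholds are guesses, and keeping the smallest box of each nested set is an arbitrary choice since the reply leaves that open:

```python
def drop_size_outliers(boxes, low=0.5, high=2.0):
    """Remove boxes whose area is far from the average area."""
    if not boxes:
        return []
    areas = [w * h for (_, _, w, h) in boxes]
    mean_area = sum(areas) / len(areas)
    return [b for b, a in zip(boxes, areas)
            if low * mean_area <= a <= high * mean_area]


def contains(outer, inner):
    """True if box `inner` lies entirely inside box `outer`."""
    ox, oy, ow, oh = outer
    ix, iy, iw, ih = inner
    return (ox <= ix and oy <= iy
            and ix + iw <= ox + ow and iy + ih <= oy + oh)


def collapse_nested(boxes):
    """Keep only the boxes that do not contain a different box."""
    return [b for b in boxes
            if not any(b != other and contains(b, other) for other in boxes)]
```

Calling `collapse_nested(drop_size_outliers(boxes))` on the 5-6 candidates should leave about one box per eye. Note that a single huge box pulls the average up a lot, so the thresholds may need tuning.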
Have you made this code available?
Gregory Bohuslav
Same as the last guy posted a few months ago:
Is the code available?
Is there a possibility to send me this code?
Greets