
Visualizing Dynamic Images and Eye Movements with CARPE

The DIEM project is an investigation of how people look and see. DIEM has so far collected data from over 250 participants watching 85 different videos. All of our data is freely available for research and non-commercial use under a CC-NC-SA 3.0 Creative Commons license. The data, together with CARPE, will let you visualize where people look during dynamic scene viewing, such as film trailers, music videos, or advertisements. The project was originally conceived and implemented at the University of Edinburgh, and made possible by generous funding from the Leverhulme Trust and the Economic and Social Research Council of the UK. Professor John M. Henderson, Principal Investigator, is now at the Center for Mind and Brain, University of California, Davis.

If you end up using CARPE and find it helpful, please contact us and let us know about your project. We’d be very interested in any extensions or modifications you might make to it.

CARPE, or Computational and Algorithmic Representation and Processing of Eye-movements, lets you visualize eye-movement data in a number of ways.

There are a number of different visualization options:

  • low-level visual features that process the input video to show flicker or edges;
  • heat-maps that show where people are looking;
  • clustered heat-maps that use pattern recognition to find the best model of fixations for each frame;
  • peek-through, which uses the heat-map information to show only the parts of the video where people are looking (see the sketch after this list).
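
To make the heat-map and peek-through ideas concrete, here is a minimal sketch of how one might render both from a single frame's gaze points. This is not CARPE's actual implementation (CARPE is a C++ application); it assumes NumPy and OpenCV, and the gaze coordinates, sigma value, and output file name are made up for illustration:

    import numpy as np
    import cv2

    def gaze_heatmap(gaze_points, width, height, sigma=40):
        """Splat gaze points onto a grid, then Gaussian-blur the result
        into a smooth density map scaled to [0, 1]."""
        heat = np.zeros((height, width), dtype=np.float32)
        for x, y in gaze_points:
            if 0 <= int(x) < width and 0 <= int(y) < height:
                heat[int(y), int(x)] += 1.0
        heat = cv2.GaussianBlur(heat, (0, 0), sigma)
        if heat.max() > 0:
            heat /= heat.max()
        return heat

    def peek_through(frame, heat):
        """Reveal the frame only where the heat-map says people looked."""
        mask = heat[..., np.newaxis]             # broadcast over BGR channels
        return (frame.astype(np.float32) * mask).astype(np.uint8)

    # Hypothetical gaze samples on a synthetic 1280x720 frame:
    frame = np.full((720, 1280, 3), 200, dtype=np.uint8)
    points = [(640, 360), (655, 350), (630, 372), (200, 500)]
    masked = peek_through(frame, gaze_heatmap(points, 1280, 720))
    cv2.imwrite("peek_through_example.png", masked)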

Have a look at a montage of 4 example visualizations, all of which were produced with CARPE.

This post will help you get started. Before we begin, make sure you meet the system requirements:

  • Windows XP or later (not tested on Windows 7)
  • 1+ GB RAM
  • Graphics card with 128+ MB of memory and up-to-date drivers (this is important: CARPE uses the latest graphics-card functionality)

Note that if you are a developer, you may want to play with the source code to get CARPE working on your system, whether OS X or Linux: CARPE uses platform-independent code and libraries throughout, with the exception of Windows-specific file reading and writing.

Installing the dependencies:

CARPE requires a number of dependencies, all contained in the following package: Download [36 MB]. Unzip the package, which is hosted on our Google Code repository, and install each file.

Installing the binary:

CARPE can be installed by downloading and unzipping the package hosted on our Google Code repository: CARPE.7z [80 MB]

If you are having problems unzipping these files, please install a 7z client: http://www.7-zip.org/

CARPE’s folder structure:

/bin - main executable file: CARPE.exe
/bin/data
/bin/data/video - input eye-movement video files
/bin/data/audio - input eye-movement audio files
/bin/data/event_data - input eye-movement tracking data
/bin/data/output - output recorded movie visualization
/bin/data/stats - output GMM heatmap statistics
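
For context on the /bin/data/stats output: the clustered heat-maps are built by fitting a Gaussian Mixture Model (GMM) to each frame's gaze points, as described in the Cognitive Computation paper cited at the end of this post. Below is a rough sketch of that idea, not CARPE's own C++ code; the scikit-learn calls and the BIC-based model selection are our assumptions for illustration:

    import numpy as np
    from sklearn.mixture import GaussianMixture

    def cluster_gaze(points, max_components=4):
        """Fit GMMs with 1..max_components clusters to one frame's gaze
        points and keep whichever the Bayesian Information Criterion
        prefers; the winner's means, covariances, and weights describe
        the gaze clusters for that frame."""
        points = np.asarray(points, dtype=np.float64)
        best, best_bic = None, np.inf
        for k in range(1, max_components + 1):
            if len(points) < k + 1:          # need more samples than clusters
                break
            gmm = GaussianMixture(n_components=k, covariance_type="full",
                                  random_state=0).fit(points)
            bic = gmm.bic(points)
            if bic < best_bic:
                best, best_bic = gmm, bic
        return best

    # Hypothetical gaze samples (x, y) from all viewers on one frame:
    frame_points = [(640, 360), (650, 355), (630, 370), (200, 150), (210, 160)]
    model = cluster_gaze(frame_points)
    print(model.means_, model.weights_)   # the kind of per-frame statistics
                                          # written to /bin/data/stats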

CARPE.7z includes an example video file (/bin/data/video/50_people_brooklyn_1280x720.mp4), example eye-tracking data (/bin/data/event_data/*50_people_brooklyn_1280x720*.txt), and an example audio file (/bin/data/audio/50_people_brooklyn_1280x720.wav).

Initially, the video and audio were merged in a single file. For the purposes of eye-tracking, they have been split in two, video and audio, which ensures sample-accurate measurements. Eye-movement files from a number of participants are available; these were recorded at 1000 Hz and downsampled to the video frame rate.
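
Since a frequent question is how 1000 Hz samples map onto video frames, here is one plausible downsampling scheme as a sketch: average the roughly 33 samples that fall within each frame. This is not necessarily how the DIEM files were produced; the column layout, frame rate, and synthetic data below are assumptions for illustration, so check the event_data files themselves for the actual format:

    import numpy as np

    SAMPLE_RATE = 1000   # eye-tracker samples per second
    FPS = 30             # video frame rate (use your video's actual rate)

    def downsample_gaze(samples, sample_rate=SAMPLE_RATE, fps=FPS):
        """Average the eye samples falling within each video frame,
        yielding one (x, y) gaze estimate per frame."""
        per_frame = int(round(sample_rate / fps))
        n_frames = len(samples) // per_frame
        trimmed = samples[:n_frames * per_frame]
        return trimmed.reshape(n_frames, per_frame, 2).mean(axis=1)

    # Hypothetical stand-in for one participant's 1000 Hz recording;
    # with a real file you might use np.loadtxt(path, usecols=(x_col, y_col)).
    samples = np.random.rand(5000, 2) * [1280, 720]
    gaze_per_frame = downsample_gaze(samples)
    print(gaze_per_frame.shape)   # one (x, y) row per video frame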

Running CARPE

After installing all the dependencies, begin using CARPE by running the executable: /bin/CARPE.exe

A dialog window should appear asking you to open the eye-tracking movie file. Navigate to /bin/data/video/50_people_brooklyn_1280x720.mp4.

CARPE should now load the eye-tracking files in /bin/data/event_data into memory and then display the main window.

Commands

Select any options that you would like (the default settings produce clustered heatmaps), and then press the space bar to begin. Press the space bar or ‘x’ to pause/play the movie. You can also press ‘z’ to step one frame backwards, or ‘c’ to step one frame forwards. Pressing ‘p’ shows or hides the Options panel.

A slider at the bottom of the window lets you jump to any point in the movie with the mouse. You can also click the play/pause button to toggle playback.

Exiting CARPE

Yes, exiting CARPE requires its own section. Always press the ESC key to exit CARPE. If that does not work, you will have to force-quit CARPE by closing the console window. We are still working on resolving this issue.

Download CARPE and start playing with our data hosted at The DIEM Database.

If you use CARPE in your publication, we would appreciate it if you could cite the following work:

Parag K. Mital, Tim J. Smith, Robin Hill, John M. Henderson. “Clustering of Gaze during Dynamic Scene Viewing is Predicted by Motion” Cognitive Computation, Volume 3, Issue 1, pp 5-24, March 2011.

5 Comments

  1. Do I need a camera or IR sensor to track my eyes?
    How do you do it?

  2. Excellent work – very useful – could you give some details as to the required format of the eye data files? E.g., how did you down-sample the 1000 Hz files to the video framerate? And what do the data in the 5th and 9th columns represent, and does the data need to be binocular, etc.?

  3. Hello, I have a question about the setup of this experiment. The database includes a lot of videos with different resolutions and, more importantly, different aspect ratios, but I suppose that all of them were presented on the same monitor.

    So, the question is: were the videos rescaled to fill the whole monitor screen? Or, when the aspect ratio differed from the screen’s, did you show the usual black bars at the sides of the screen, or at the top/bottom?

    Thanks in advance…

  4. I have another question. I am not able to reproduce the heatmap you show in the video. Looking at the eye fixation data, e.g., “harry potter 6 trailer 1280×544 web”, I find that most of the y-coordinates are within the range [400, 500], which would put them near the bottom of the frame. However, the heat map of the video shows the eye fixations mostly in the middle of the frame, which seems more reasonable to me.

