Reading data
Note
Make sure that you have installed the necessary dependencies for running the examples.
Loading recordings
pupil_recording_interface provides a simple interface for loading recordings made with Pupil Capture into Python. Recordings are loaded as xarray Datasets which provide an elegant way of working with multi-dimensional labeled data.
To get started, we use get_test_recording()
to download and cache a
very short example recording. The method returns the path to the cached folder:
>>> import pupil_recording_interface as pri
>>> folder = pri.get_test_recording()
Gaze
You can easily load recorded gaze data with load_gaze()
:
>>> pri.load_gaze(folder)
<xarray.Dataset>
Dimensions: (cartesian_axis: 3, pixel_axis: 2, time: 5160)
Coordinates:
* time (time) datetime64[ns] 2019-10-10T16:43:20.149777889 ... 2019-10-10T16:43:41.262381792
* pixel_axis (pixel_axis) <U1 'x' 'y'
* cartesian_axis (cartesian_axis) <U1 'x' 'y' 'z'
Data variables:
eye (time) int64 2 2 2 2 2 2 2 2 2 2 ... 2 2 2 2 2 2 2 2 2 2
gaze_norm_pos (time, pixel_axis) float64 0.4082 0.3912 ... 0.1286
gaze_point (time, cartesian_axis) float64 nan nan ... 0.1299
eye0_center (time, cartesian_axis) float64 nan nan ... -0.02006
eye1_center (time, cartesian_axis) float64 nan nan ... -0.02153
eye0_normal (time, cartesian_axis) float64 nan nan ... 0.267 0.9269
eye1_normal (time, cartesian_axis) float64 nan nan ... 0.1646 0.9797
gaze_confidence_3d (time) float64 0.8787 0.8769 0.9233 ... 0.9528 0.9528
The gaze dataset contains the following arrays:
eye
: the eye data that produced the mapping, 0 for eye 0 (usually right), 1 for eye 1 (usually left), 2 for binocular mappinggaze_norm_pos
: the two dimensional norm pos, i.e. the normalized position of the gaze in the video frame where (0, 0) is the lower left and (1, 1) is the upper right corner of the framegaze_point
: the three-dimensional gaze point represented in the world camera coordinate system (x right, y down, z forward), in meterseye{0,1}_center
: the center of each eye represented in the world camera coordinate system, in meterseye{0,1}_normal
: the normal of each eye represented in the world camera coordinate systemgaze_confidence_3d
: the confidence of the mapping between 0 and 1
All arrays behave like numpy arrays, (i.e. you can use functions like
np.sum
on them), but will preserve their labels (called coordinates) in
most cases.
If you performed post-hoc gaze mapping, it is also possible to reference an offline gaze mapper by name and load its data:
>>> pri.load_gaze(folder, source='3d Gaze Mapper')
<xarray.Dataset>
Dimensions: (cartesian_axis: 3, pixel_axis: 2, time: 5125)
Coordinates:
* time (time) datetime64[ns] 2019-10-10T16:43:20.278882265 ... 2019-10-10T16:43:41.209933281
* pixel_axis (pixel_axis) <U1 'x' 'y'
* cartesian_axis (cartesian_axis) <U1 'x' 'y' 'z'
Data variables:
eye (time) int64 2 2 2 2 2 2 2 2 2 2 ... 2 2 2 2 2 2 2 2 2 2
gaze_norm_pos (time, pixel_axis) float64 0.551 -0.2074 ... -0.07305
gaze_point (time, cartesian_axis) float64 0.003464 ... 0.03572
eye0_center (time, cartesian_axis) float64 0.005147 ... -0.02
eye1_center (time, cartesian_axis) float64 -0.04123 ... -0.02
eye0_normal (time, cartesian_axis) float64 -0.252 ... 0.9327
eye1_normal (time, cartesian_axis) float64 0.8417 -0.1958 ... 0.8089
gaze_confidence_3d (time) float64 0.9761 0.9586 0.9473 ... 0.9487 0.9207
Finally, it is also possible to merge the data from a 2d and a 3d gaze mapper, in which case the norm pos will come from the 2d mapper and the gaze point from the 3d mapper:
>>> pri.load_gaze(
... folder, source={'2d': '2d Gaze Mapper ', '3d': '3d Gaze Mapper'}
... )
<xarray.Dataset>
Dimensions: (cartesian_axis: 3, pixel_axis: 2, time: 4987)
Coordinates:
* time (time) datetime64[ns] 2019-10-10T16:43:20.278882265 ... 2019-10-10T16:43:41.209933281
* pixel_axis (pixel_axis) <U1 'x' 'y'
* cartesian_axis (cartesian_axis) <U1 'x' 'y' 'z'
Data variables:
eye (time) int64 2 2 2 2 2 2 2 2 2 2 ... 2 2 2 2 2 2 2 2 2 2
gaze_norm_pos (time, pixel_axis) float64 0.4303 0.3921 ... 0.1736
gaze_point (time, cartesian_axis) float64 0.003464 ... 0.03572
eye0_center (time, cartesian_axis) float64 0.005147 ... -0.02
eye1_center (time, cartesian_axis) float64 -0.04123 ... -0.02
eye0_normal (time, cartesian_axis) float64 -0.252 ... 0.9327
eye1_normal (time, cartesian_axis) float64 0.8417 -0.1958 ... 0.8089
gaze_confidence_2d (time) float64 0.9761 0.9586 0.9473 ... 0.9487 0.9207
gaze_confidence_3d (time) float64 0.9761 0.9586 0.9473 ... 0.9487 0.9207
We can get a set of all available gaze mappers for a recording with:
>>> pri.get_gaze_mappers(folder)
{'2d Gaze Mapper ', '3d Gaze Mapper', 'recording'}
Pupils
Pupil data can be loaded in a similar manner with load_pupils()
.
Since the recorded pupil data in the example recording uses the 3d pupil
detector, we need to specify method="3d"
:
>>> pri.load_pupils(folder, method="3d")
<xarray.Dataset>
Dimensions: (cartesian_axis: 3, pixel_axis: 2, time: 5170)
Coordinates:
* time (time) datetime64[ns] 2019-10-10T16:43:20.188713789 ... 2019-10-10T16:43:41.300101757
* pixel_axis (pixel_axis) <U1 'x' 'y'
* cartesian_axis (cartesian_axis) <U1 'x' 'y' 'z'
Data variables:
eye (time) int64 1 0 1 0 1 0 1 0 1 ... 1 0 1 0 1 0 1 0
confidence (time) float64 0.9374 0.9412 0.9036 ... 1.0 0.9771
diameter (time) float64 49.14 43.71 49.01 ... 47.23 39.66
ellipse_angle (time) float64 61.4 85.65 61.37 ... 53.67 76.26
pupil_norm_pos (time, pixel_axis) float64 0.2423 0.663 ... 0.4504
ellipse_center (time, pixel_axis) float64 46.52 64.7 ... 105.5
ellipse_axes (time, pixel_axis) float64 42.82 49.14 ... 39.66
circle_center (time, cartesian_axis) float64 -5.992 ... 81.2
circle_normal (time, cartesian_axis) float64 -0.1717 ... -0.9904
sphere_center (time, cartesian_axis) float64 -3.931 ... 93.08
projected_sphere_center (time, pixel_axis) float64 67.58 101.9 ... 93.74
projected_sphere_axes (time, pixel_axis) float64 173.5 173.5 ... 159.9
diameter_3d (time) float64 5.928 5.864 5.913 ... 5.876 5.194
theta (time) float64 1.175 1.993 1.174 ... 1.354 1.704
phi (time) float64 -1.758 -1.533 ... -1.691 -1.534
model_confidence (time) float64 0.5911 0.851 ... 0.1845 0.6902
circle_radius (time) float64 2.964 2.932 2.956 ... 2.938 2.597
sphere_radius (time) float64 12.0 12.0 12.0 ... 12.0 12.0 12.0
projected_sphere_angle (time) float64 90.0 90.0 90.0 ... 90.0 90.0 90.0
model_birth_timestamp (time) float64 2.286e+03 2.284e+03 ... 2.284e+03
It is also possible to load pupil data that was computed post-hoc by
specifying source="offline"
. The post-hoc data contains 2d pupil data which
we can load with method="2d"
:
>>> pri.load_pupils(folder, source="offline", method="2d")
<xarray.Dataset>
Dimensions: (pixel_axis: 2, time: 5164)
Coordinates:
* time (time) datetime64[ns] 2019-10-10T16:43:20.277472973 ... 2019-10-10T16:43:41.364654779
* pixel_axis (pixel_axis) <U1 'x' 'y'
Data variables:
eye (time) int64 1 0 1 0 1 0 1 0 1 0 1 ... 0 1 0 1 0 1 0 1 0 1 0
confidence (time) float64 0.9622 0.99 0.9273 0.9674 ... 0.99 0.99 0.0
diameter (time) float64 49.46 44.2 49.62 44.2 ... 39.84 47.64 0.0
ellipse_angle (time) float64 62.86 77.64 61.91 78.05 ... 83.34 124.2 -90.0
pupil_norm_pos (time, pixel_axis) float64 0.242 0.6634 0.5005 ... 0.0 1.0
ellipse_center (time, pixel_axis) float64 46.46 64.62 96.1 ... 0.0 0.0
ellipse_axes (time, pixel_axis) float64 43.06 49.46 39.78 ... 0.0 0.0
Calibration markers
Locations of calibration markers that were detected post-hoc can be loaded
with load_markers()
:
>>> pri.load_markers(folder)
<xarray.Dataset>
Dimensions: (pixel_axis: 2, time: 240)
Coordinates:
* time (time) datetime64[ns] 2019-10-10T16:43:20.238371849 ... 2019-10-10T16:43:41.232636929
* pixel_axis (pixel_axis) <U1 'x' 'y'
Data variables:
frame_index (time) int64 0 1 3 4 90 91 92 ... 427 428 429 430 431 513 540
location (time, pixel_axis) float64 202.4 258.9 202.8 ... 822.5 598.0
Caching
Note
This functionality requires the netcdf4
library, see the
dependencies for data export.
All of the above methods accept a cache
argument that will cache the loaded
data in the netCDF format, making subsequent loading significantly faster:
>>> pri.load_gaze(folder, cache=True)
<xarray.Dataset>
Dimensions: (cartesian_axis: 3, pixel_axis: 2, time: 5160)
Coordinates:
* time (time) datetime64[ns] 2019-10-10T16:43:20.149777889 ... 2019-10-10T16:43:41.262381792
* pixel_axis (pixel_axis) object 'x' 'y'
* cartesian_axis (cartesian_axis) object 'x' 'y' 'z'
Data variables:
eye (time) float64 ...
gaze_norm_pos (time, pixel_axis) float64 ...
gaze_point (time, cartesian_axis) float64 ...
eye0_center (time, cartesian_axis) float64 ...
eye1_center (time, cartesian_axis) float64 ...
eye0_normal (time, cartesian_axis) float64 ...
eye1_normal (time, cartesian_axis) float64 ...
gaze_confidence_3d (time) float64 ...
Loading videos
Since video data is rather large, we rarely bulk-load entire video
recordings (although it is possible, see Other video functionality).
Rather, this library provides a VideoReader
class with which
we can go through videos frame by frame. You can get information about the
world camera video with:
>>> reader = pri.VideoReader(folder)
>>> reader.video_info
{'resolution': (1280, 720), 'frame_count': 504, 'fps': 23.987}
With VideoReader.load_raw_frame()
you can retrieve a raw video
frame by index:
>>> reader = pri.VideoReader(folder)
>>> frame = reader.load_raw_frame(100)
>>> frame.shape
(720, 1280, 3)
or by timestamp:
>>> reader = pri.VideoReader(folder)
>>> frame = reader.load_raw_frame(reader.timestamps[100])
>>> frame.shape
(720, 1280, 3)
Here, we used the timestamps
attribute of the reader which contains the
timestamps for each frame to get the timestamp of the frame with index 100.
If you have the matplotlib
library installed, you can show the frame
with imshow()
. Note that you have to reverse the last axis as the frame
is loaded as a BGR image but imshow expects RGB:
>>> import matplotlib.pyplot as plt
>>> plt.imshow(frame[:, :, ::-1])
(Source code, png, hires.png, pdf)
Eye videos can be loaded by specifying the stream
parameter:
>>> eye_reader = pri.VideoReader(folder, stream='eye0')
With return_timestamp=True
you can get the corresponding timestamp for a
frame:
>>> timestamp, frame = eye_reader.load_frame(100, return_timestamp=True)
>>> timestamp
Timestamp('2019-10-10 16:43:21.087194920')
This timestamp can now be used to get the closest world video frame:
>>> frame = reader.load_frame(timestamp)
>>> frame.shape
(720, 1280, 3)
ROI extraction
You can easily extract an ROI around the current gaze point in the image by
specifying the norm_pos
and roi_size
parameters and using the
VideoReader.load_frame()
method:
>>> gaze = pri.load_dataset(folder, gaze='2d Gaze Mapper ')
>>> reader = pri.VideoReader(
... folder, norm_pos=gaze.gaze_norm_pos, roi_size=64
... )
>>> frame = reader.load_frame(100)
>>> frame.shape
(64, 64, 3)
>>> plt.imshow(frame[:, :, ::-1])
(Source code, png, hires.png, pdf)
Other video functionality
Video frames can also be sub-sampled and converted to grayscale with the
subsampling
and color_format
parameters:
>>> reader = pri.VideoReader(
... folder, color_format='gray', subsampling=4.
... )
>>> frame = reader.load_frame(100)
>>> frame.shape
(180, 320)
VideoReader.read_frames()
provides a generator for frames that
can be restricted to an index or timestamp range with the start
and end
parameters:
>>> reader = pri.VideoReader(folder)
>>> reader.read_frames(start=100, end=200)
<generator object VideoReader.read_frames at ...>
The video reader also provides a VideoReader.load_dataset()
method. The method is rather slow as it has to load each frame individually.
You can provide start
and end
timestamps to specify the time frame
of the loaded data:
>>> reader = pri.VideoReader(folder, subsampling=8.)
>>> reader.load_dataset(
... start=reader.user_info['experiment_start'],
... end=reader.user_info['experiment_end'],
... )
<xarray.Dataset>
Dimensions: (color: 3, frame_x: 160, frame_y: 90, time: 22)
Coordinates:
* time (time) datetime64[ns] 2019-10-10T16:43:23.237552881 ... 2019-10-10T16:43:24.175843954
* frame_x (frame_x) int64 0 1 2 3 4 5 6 7 ... 152 153 154 155 156 157 158 159
* frame_y (frame_y) int64 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
* color (color) <U1 'B' 'G' 'R'
Data variables:
frames (time, frame_y, frame_x, color) uint8 8 7 23 9 8 21 ... 5 11 9 7 21
Other files
The recording metadata file created by Pupil Capture (info.player.json
)
can be loaded with load_info()
:
>>> pri.load_info(folder)
{'duration_s': 21.111775958999715,
'meta_version': '2.3',
'min_player_version': '2.0',
'recording_name': '2019_10_10',
'recording_software_name': 'Pupil Capture',
'recording_software_version': '1.16.95',
'recording_uuid': 'e5059604-26f1-42ed-8e35-354198b56021',
'start_time_synced_s': 2294.807856069,
'start_time_system_s': 1570725800.220913,
'system_info': 'User: test_user, Platform: Linux'}
You can also load the user info (user_info.csv
) with
load_user_info()
:
>>> pri.load_user_info(folder)
{'name': 'TEST',
'pre_calibration_start': Timestamp('2019-10-10 16:43:21.220912933'),
'pre_calibration_end': Timestamp('2019-10-10 16:43:22.220912933'),
'experiment_start': Timestamp('2019-10-10 16:43:23.220912933'),
'experiment_end': Timestamp('2019-10-10 16:43:24.220912933'),
'post_calibration_start': Timestamp('2019-10-10 16:43:25.220912933'),
'post_calibration_end': Timestamp('2019-10-10 16:43:26.220912933'),
'height': 1.8}
A recording usually also contains other msgpack-encoded files, for example
the camera intrinsics. These can be loaded with load_object()
>>> pri.load_object(folder / "world.intrinsics")
{'version': 1,
'(1280, 720)':
{'camera_matrix': [[826.1521272145403, 0.0, 627.8343012631811],
[0.0, 769.1786602466223, 359.810742797992],
[0.0, 0.0, 1.0]],
'dist_coefs': [[-0.14119955505495468],
[0.024919584525371758],
[-0.21233916934283625],
[0.22636274811293397]],
'resolution': [1280, 720],
'cam_type': 'fisheye'}}
Data export
Note
Make sure you have installed the necessary dependencies for data export.
Recorded data can also directly be written to disk with
write_netcdf()
:
>>> pri.write_netcdf(folder, gaze='recording', output_folder='.')
This will create a gaze.nc
file in the current folder. This file type can
directly be loaded by xarray:
>>> import xarray as xr
>>> xr.open_dataset('gaze.nc')
<xarray.Dataset>
Dimensions: (cartesian_axis: 3, pixel_axis: 2, time: 5160)
Coordinates:
* time (time) datetime64[ns] 2019-10-10T16:43:20.149777889 ... 2019-10-10T16:43:41.262381792
* pixel_axis (pixel_axis) object 'x' 'y'
* cartesian_axis (cartesian_axis) object 'x' 'y' 'z'
Data variables:
eye (time) float64 ...
gaze_norm_pos (time, pixel_axis) float64 ...
gaze_point (time, cartesian_axis) float64 ...
eye0_center (time, cartesian_axis) float64 ...
eye1_center (time, cartesian_axis) float64 ...
eye0_normal (time, cartesian_axis) float64 ...
eye1_normal (time, cartesian_axis) float64 ...
gaze_confidence_3d (time) float64 ...