Should be possible. From the FFmpeg x11grab docs:
The filename passed as input has the syntax: [hostname]:display_number.screen_number[+x_offset,y_offset]
hostname:display_number.screen_number specifies the X11 display name of the screen to grab from. hostname can be omitted, and defaults to "localhost". The environment variable DISPLAY contains the default display name.
x_offset and y_offset specify the offsets of the grabbed area with respect to the top-left border of the X11 screen. They default to 0.
Check the X11 documentation (e.g. man X) for more detailed information.
Use the xdpyinfo program for getting basic information about the properties of your X11 display (e.g. grep for "name" or "dimensions").
I only have one screen, so I can't test this. From "man X":
The phrase "display" is usually used to refer to a collection
of monitors that share a common set of input devices (keyboard,
mouse, tablet, etc.). Most workstations tend to only have one
display. Larger, multi-user systems, however, frequently have
several displays so that more than one person can be doing
graphics work at once. To avoid confusion, each display on a
machine is assigned a display number (beginning at 0) when the
X server for that display is started. The display number must
always be given in a display name.
Some displays share their input devices among two or more
monitors. These may be configured as a single logical screen,
which allows windows to move across screens, or as individual
screens, each with their own set of windows. If configured
such that each monitor has its own set of windows, each screen
is assigned a screen number (beginning at 0) when the X server
for that display is started. If the screen number is not
given, screen 0 will be used.
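To make the naming concrete, here is a rough sketch of how a display name such as :0.1 breaks down into host, display number and screen number. The function name parse_display is made up for illustration, it assumes the simple [hostname]:display[.screen] form only, and it reports "localhost" for a missing hostname because that is what the FFmpeg docs say x11grab defaults to.

```shell
# Hypothetical helper: split an X11 display name of the form
# [hostname]:display[.screen] into its parts.
# Assumes exactly one ':' separator and no offsets appended.
parse_display() {
    name=$1
    host=${name%%:*}          # text before the first ':' (may be empty)
    rest=${name#*:}           # text after the first ':', e.g. "0.1"
    display=${rest%%.*}       # display number, before the '.'
    case $rest in
        *.*) screen=${rest#*.} ;;   # explicit screen number
        *)   screen=0 ;;            # screen 0 is used when none is given
    esac
    printf 'host=%s display=%s screen=%s\n' "${host:-localhost}" "$display" "$screen"
}
```

For example, parse_display :0.1 prints host=localhost display=0 screen=1, which is the second screen of the first display.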
I assume the command below will create two separate video streams (one per screen) in one output file. The two -map options are needed for that, because by default ffmpeg keeps only one video stream in the output. You didn't mention whether you wanted them combined (side by side, for example) or anything.
ffmpeg -f x11grab -framerate 25 -video_size 1680x1050 -i :0.0 -f x11grab -framerate 25 -video_size 1680x1050 -i :0.1 -map 0 -map 1 -c:v libx264 -crf 0 -preset ultrafast output.mkv
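If you do want the two screens combined side by side, something along these lines might work. It is an untested sketch using ffmpeg's hstack filter, which needs both inputs to have the same height. The function name side_by_side and the SIZE, RATE and DRY_RUN variables are my own placeholders, and the screen names :0.0 and :0.1 are the same guess as above.

```shell
# Sketch: grab screens :0.0 and :0.1 and stack them horizontally.
# SIZE and RATE are assumptions; adjust them to match your monitors.
SIZE=1680x1050
RATE=25

side_by_side() {
    # $1 = output file. Set DRY_RUN=1 to print the command instead of running it.
    set -- ffmpeg \
        -f x11grab -framerate "$RATE" -video_size "$SIZE" -i :0.0 \
        -f x11grab -framerate "$RATE" -video_size "$SIZE" -i :0.1 \
        -filter_complex hstack=inputs=2 \
        -c:v libx264 -crf 0 -preset ultrafast "$1"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

Run DRY_RUN=1 side_by_side output.mkv first to see the exact command, then side_by_side output.mkv to record a single 3360x1050 video.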