Here are some useful notes on using FireWire cameras:
Drivers: see the CMU1394 driver project. Unibrain also has a commercial API.
FireWire cameras (not camcorders!) use a standard called DCAM or IIDC for transmitting the video signal. This specifies a set of predefined formats, which are listed in the table below. The CMU1394 API provides
C1394Camera::SetVideoFrameRate( unsigned long rate )
to set the video frame rate,
C1394Camera::SetVideoMode ( unsigned long mode )
to set the video mode, and
C1394Camera::SetVideoFormat ( unsigned long format )
to set the video format.
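In practice these setters sit inside a fixed sequence of calls. A pseudocode sketch of the usual order follows; the entry-point names other than the three setters above (CheckLink, SelectCamera, InitCamera, StartImageAcquisition, AcquireImage, StopImageAcquisition) are from my reading of the CMU1394 headers, so check them against your version of the API:

```
camera = C1394Camera()
if camera.CheckLink() fails:       // enumerate the bus
    report "no camera found"
camera.SelectCamera(0)             // first camera on the bus
camera.InitCamera()
camera.SetVideoFormat(0)           // Format column of the table below
camera.SetVideoMode(5)             // Mode column: 640 x 480 Y Mono
camera.SetVideoFrameRate(4)        // frame rate number from the rate table
camera.StartImageAcquisition()
loop:
    camera.AcquireImage()          // blocks until a frame arrives
    process the raw frame buffer
camera.StopImageAcquisition()
```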
I have written a C library that wraps this table and sets up and captures from a FireWire camera.
|Format||Mode||Resolution and colour coding|
|0||0||160 x 120 YUV 4:4:4 (24 bits per pixel)|
|0||1||320 x 240 YUV 4:2:2 (16 bits per pixel)|
|0||2||640 x 480 YUV 4:1:1 (12 bits per pixel)|
|0||3||640 x 480 YUV 4:2:2 (16 bits per pixel)|
|0||4||640 x 480 RGB (24 bits per pixel)|
|0||5||640 x 480 Y Mono (8 bits per pixel)|
|0||6||640 x 480 Y Mono16 (16 bits per pixel)|
|1||0||800 x 600 YUV 4:2:2 (16 bits per pixel)|
|1||1||800 x 600 RGB (24 bits per pixel)|
|1||2||800 x 600 Y Mono (8 bits per pixel)|
|1||3||1024 x 768 YUV 4:2:2 (16 bits per pixel)|
|1||4||1024 x 768 RGB (24 bits per pixel)|
|1||5||1024 x 768 Y Mono (8 bits per pixel)|
|1||6||800 x 600 Y Mono16 (16 bits per pixel)|
|1||7||1024 x 768 Y Mono16 (16 bits per pixel)|
|2||0||1280 x 960 YUV 4:2:2 (16 bits per pixel)|
|2||1||1280 x 960 RGB (24 bits per pixel)|
|2||2||1280 x 960 Y Mono (8 bits per pixel)|
|2||3||1600 x 1200 YUV 4:2:2 (16 bits per pixel)|
|2||4||1600 x 1200 RGB (24 bits per pixel)|
|2||5||1600 x 1200 Y Mono (8 bits per pixel)|
|2||6||1280 x 960 Y Mono16 (16 bits per pixel)|
|2||7||1600 x 1200 Y Mono16 (16 bits per pixel)|
|Frame Rate Number||Frame Rate|
|0||1.875 fps|
|1||3.75 fps|
|2||7.5 fps|
|3||15 fps|
|4||30 fps|
|5||60 fps|
|6||120 fps|
|7||240 fps|
Note that Format 0 supports all of these rates (although a given camera might not!), but Frame Rates 0, 1, 6 and 7 are sometimes not available for the modes in Formats 1 and 2. See the DCAM spec [pdf] for details.
In addition there is Format 7, which allows precise control over the video: image size, binning, colour coding and so on. This is really useful for cameras with non-standard sensor sizes and for CMOS cameras, where regions of interest can be specified.
The description in the table above takes some explaining. CCDs and CMOS sensors are monochromatic: by themselves they cannot differentiate colours. To do that, filters need to be added. There are two main ways of doing this:
1. Split the incoming light with a prism onto three separate sensors, one behind a red filter, one behind a green and one behind a blue (the "3-CCD" design).
2. Place a mosaic of tiny red, green and blue filters over a single sensor, so that each pixel records only one colour.
The second option is by far the most common and is known as a Bayer filter. So if our image is 640 x 480 and RGB, does that mean the sensor has 640 x 480 x 3 pixels? Sadly the answer is no. The sensor has 640 x 480 pixels, and the colours are interpolated by looking at the surrounding pixels. You need to be aware of this if you are doing colour image processing.
The human eye is better at detecting grey level (luma) changes than colour changes, and the camera transmission protocols take advantage of this. YUV is a colour space widely used in TV transmission to compress the data. Y is the luma value, and there are two colour or chrominance components (U and V). Since the eye has poor colour resolution, the Y value can be transmitted at a higher data rate than the colour components. So YUV 4:4:4 means that for every four Y components, four U and four V components are transmitted; 4:2:2 means half of the colour data has been discarded; and 4:1:1 means only one U and one V are transmitted for every four luma values. This way the data is compressed.
To get RGB colour back, the data has to be interpolated again - that is two rounds of interpolation of the colour data, so expect errors! If you are doing colour image processing I would avoid the YUV 4:2:2 and 4:1:1 modes. Many cheap webcams appear to transmit in these two modes.