third_party/blink/renderer/modules/imagecapture/README.md
This folder contains the implementation of the W3C Image Capture API. Image Capture was shipped in Chrome M59; please consult the Implementation Status if you think a feature should be available and isn't.
This API is structured around the `ImageCapture` class and a number of
extensions to the `MediaStreamTrack` feeding it (let's call them
`theImageCapturer` and `theTrack`, respectively).
## `takePhoto()` and `grabFrame()`

`takePhoto()` returns the result of a single photographic exposure as a
`Blob` which can be downloaded, stored by the browser, or displayed in an
`<img>` element. This method uses the highest available photographic camera
resolution.
`grabFrame()` returns a snapshot of the live video in `theTrack` as an
`ImageBitmap` object which could (for example) be drawn on a `<canvas>` and
then post-processed to selectively change color values. Note that the
`ImageBitmap` will only have the resolution of the video track, which
will generally be lower than the camera's still-image resolution.
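As a sketch of the two calls (a browser context with an active camera `MediaStreamTrack` is assumed; the helper names `takeStillPhoto` and `grabVideoFrame` are illustrative, not part of the API):

```javascript
// Illustrative wrappers around the standard ImageCapture API. Both assume
// `theTrack` is a live video MediaStreamTrack, e.g. obtained via
// navigator.mediaDevices.getUserMedia({ video: true }).

async function takeStillPhoto(theTrack) {
  const theImageCapturer = new ImageCapture(theTrack);
  // Resolves with a Blob at the camera's full still-image resolution.
  return theImageCapturer.takePhoto();
}

async function grabVideoFrame(theTrack) {
  const theImageCapturer = new ImageCapture(theTrack);
  // Resolves with an ImageBitmap at the (lower) video track resolution.
  return theImageCapturer.grabFrame();
}
```

The resulting `Blob` can then be shown via `URL.createObjectURL()` in an `<img>`, and the `ImageBitmap` drawn with `CanvasRenderingContext2D.drawImage()`.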
(Adapted from the blog post)
The photo-specific options and settings are associated with `theImageCapturer` or
`theTrack` depending on whether a given capability/setting has an immediately
recognisable effect on `theTrack`; in other words, whether it's "live" or not. For
example, changing the zoom level is instantly reflected on `theTrack`,
while enabling red-eye reduction, if available, is not.
| Object | Type | Example |
|---|---|---|
| `PhotoCapabilities` | non-live capabilities | `theImageCapturer.getPhotoCapabilities()` |
| `MediaTrackCapabilities` | live capabilities | `theTrack.getCapabilities()` |
| `PhotoSettings` | non-live settings | `theImageCapturer.takePhoto(photoSettings)` |
| `MediaTrackSettings` | live settings | `theTrack.getSettings()` |
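The split in the table can be sketched in code (browser context assumed; `describeCamera` is a hypothetical helper, while the getter names are from the spec):

```javascript
// Illustrative: gather the live and non-live capability/setting objects.
// `theTrack` is assumed to be a live video MediaStreamTrack.
async function describeCamera(theTrack) {
  const theImageCapturer = new ImageCapture(theTrack);
  return {
    // Non-live, photo-specific: e.g. redEyeReduction, fillLightMode, imageWidth.
    photoCapabilities: await theImageCapturer.getPhotoCapabilities(),
    // Live: e.g. zoom, torch, whiteBalanceMode.
    liveCapabilities: theTrack.getCapabilities(),
    // Current live values; a live setting such as zoom would be changed via
    // theTrack.applyConstraints({ advanced: [{ zoom: 2 }] }).
    liveSettings: theTrack.getSettings(),
  };
}
```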
## Are `takePhoto()` and `grabFrame()` the same?

These methods would not produce the same results, as explained in this issue comment:
Let me reconstruct the conversion steps each image goes through in CrOS/Linux; [...]

a) Live video capture produces frames via `V4L2CaptureDelegate::DoCapture()` [1]. The original data (from the webcam) comes in either YUY2 (a 4:2:2 format) or MJPEG, depending on whether the capture is smaller than 1280x720 or not, respectively.

b) This `V4L2CaptureDelegate` sends the captured frame to a conversion stage to I420 [2]. I420 is a 4:2:0 format, so it has lost some information irretrievably. This I420 format is the one used for transporting video frames to the renderer.

c) This I420 is the input to `grabFrame()`, which produces a JS `ImageBitmap`, unencoded, after converting the I420 into RGBA [3] of the appropriate endianness.

What happens to `takePhoto()`? It takes the data from the webcam in a) and either returns a JPEG `Blob` [4] or converts the YUY2 [5] and encodes it to PNG using the default compression value (6 on a 0-10 scale, IIRC) [6].
IOW:

- for smaller video resolutions:

```
OS -> YUY2 ---> I420 --> RGBA --> ImageBitmap    grabFrame()
        |
        +--> RGBA --> PNG ---> Blob              takePhoto()
```

- and for larger video resolutions:

```
OS -> MJPEG ---> I420 --> RGBA --> ImageBitmap   grabFrame()
         |
         +--> JPG ------------> Blob             takePhoto()
```
Every conversion to I420 loses information, and so does the encoding to PNG. Even a conversion
`RGBA --> I420 --> RGBA` would not produce the original image. (Plus, when you show the `ImageBitmap` and/or `Blob` on an `<img>` or `<canvas>` there are more stages of decoding and even colour correction involved!)
With all that, I'm not surprised at all that the images are not pixel accurate! :-)
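Why the I420 step is irreversible can be shown with a toy round trip in plain JavaScript. This is not Chromium's actual conversion code (which uses libyuv); it only illustrates 4:2:0 chroma subsampling with BT.601-style coefficients:

```javascript
// Toy demonstration: I420 is a 4:2:0 format, i.e. it keeps one U/V (chroma)
// pair per 2x2 block of pixels. Averaging the chroma over the block throws
// information away, so RGB -> 4:2:0 YUV -> RGB cannot recover the original.
function rgbToYuv([r, g, b]) {
  return [
     0.299 * r + 0.587 * g + 0.114 * b,
    -0.169 * r - 0.331 * g + 0.5   * b + 128,
     0.5   * r - 0.419 * g - 0.081 * b + 128,
  ];
}
function yuvToRgb([y, u, v]) {
  return [
    y + 1.402 * (v - 128),
    y - 0.344 * (u - 128) - 0.714 * (v - 128),
    y + 1.772 * (u - 128),
  ].map(c => Math.max(0, Math.min(255, Math.round(c))));
}

// A 2x2 block with very different colours: red, green, blue, white.
const block = [[255, 0, 0], [0, 255, 0], [0, 0, 255], [255, 255, 255]];
const yuv = block.map(rgbToYuv);

// 4:2:0: keep per-pixel Y, but share one averaged U and V across the block.
const avgU = yuv.reduce((sum, p) => sum + p[1], 0) / 4;
const avgV = yuv.reduce((sum, p) => sum + p[2], 0) / 4;
const roundTripped = yuv.map(([y]) => yuvToRgb([y, avgU, avgV]));

const lossy = roundTripped.some((p, i) => p.some((c, j) => c !== block[i][j]));
console.log(lossy); // true: the chroma detail is gone for good
```

For this particular block the averaged chroma is exactly neutral, so the red, green, and blue pixels come back as grey; only the white pixel survives the round trip.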
## Why are `PhotoCapabilities.fillLightMode` and `MediaTrackCapabilities.torch` separated?

Because they are different things: `torch` means flash constantly on/off, whereas
`fillLightMode` means flash always-on/always-off/auto when taking a
photographic exposure.
`torch` lives in `theTrack` because the effect can be seen "live" on it,
whereas `fillLightMode` lives in `theImageCapturer` object because the effect
of modifying it can only be seen after taking a picture.
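The difference in use can be sketched as follows (browser context assumed; `shootWithLight` is a hypothetical helper, but the constraint and setting shapes follow the Media Capture specs):

```javascript
// Illustrative: torch is applied to the live track via constraints, while
// fillLightMode is a per-exposure setting passed to takePhoto().
async function shootWithLight(theTrack) {
  // Visible immediately in the live preview:
  await theTrack.applyConstraints({ advanced: [{ torch: true }] });
  // Only takes effect for the photographic exposure itself:
  const theImageCapturer = new ImageCapture(theTrack);
  return theImageCapturer.takePhoto({ fillLightMode: 'flash' });
}
```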
Image Capture web tests are located in `web_tests/external/mediacapture-image`.