docs/content/docs/a-frame.mdx
import { Tab, Tabs } from 'fumadocs-ui/components/tabs'
A Frame is an in-memory buffer allowing GPU- and CPU-access to individual pixels. It is typically streamed in realtime by a CameraFrameOutput.
See "The Frame Output" for more information about streaming Frames.
Typically, you pass a Frame to "Native Frame Processor Plugins" to avoid touching pixels from JS.
A "Native Frame Processor Plugin" is simply a Nitro Module that uses the Frame type from react-native-vision-camera to run some kind of processing using native Swift/Kotlin/C++ code.
For example, the BarcodeScanner (from react-native-vision-camera-barcode-scanner) is a native plugin to scan barcodes in a Frame.
See "Native Frame Processor Plugins" for more information.
const codeScanner = useBarcodeScanner({
barcodeFormats: ['all-formats']
})
const frameOutput = useFrameOutput({
onFrame(frame) {
'worklet'
// [!code ++]
const barcodes = codeScanner.scanCodes(frame)
console.log(`Scanned ${barcodes.length} barcodes!`)
frame.dispose()
}
})
orientation and isMirrored
The Frame's orientation describes its orientation relative to the output's outputOrientation.
Similarly, isMirrored describes if the Frame is considered to be mirrored, relative to the output's mirrorMode.
Consider both orientation and isMirrored as a "recipe" to get the Frame's intended presentation.
The Camera pipeline does not physically rotate or mirror buffers automatically, as this is computationally expensive.
Instead, it is much more efficient to pass orientation and isMirrored along as metadata so consumers can apply rotation/mirroring logic themselves. For example, react-native-vision-camera-skia transforms a Frame using matrix rotations and mirroring, which runs on the GPU at render time and avoids expensive buffer modifications entirely.
[!TIP] If you need your buffers to be correctly rotated and mirrored already, enable enablePhysicalBufferRotation, which results in orientation always being 'up' and isMirrored always being false, indicating no rotation or mirroring is necessary to get the Frame's intended presentation.
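As a rough sketch of how a consumer might turn this "recipe" into a transform, the following maps orientation and isMirrored to a 2x2 matrix. The `Orientation` union, the degree mapping in `ROTATION_DEGREES`, and the `presentationMatrix` helper are assumptions made for illustration and are not part of react-native-vision-camera's API; in particular, check the rotation direction against the "Orientation" documentation before relying on it.

```typescript
// Hypothetical type for illustration; the real Frame type may differ.
type Orientation = 'up' | 'right' | 'down' | 'left'

// Linear part of a 2D transform, row-major: [a, b, c, d]
type Matrix2 = [number, number, number, number]

// Assumed mapping from orientation to a rotation in degrees.
const ROTATION_DEGREES: Record<Orientation, number> = {
  up: 0,
  right: 90,
  down: 180,
  left: 270,
}

// Standard 2x2 matrix product m * n.
function multiply(m: Matrix2, n: Matrix2): Matrix2 {
  return [
    m[0] * n[0] + m[1] * n[2],
    m[0] * n[1] + m[1] * n[3],
    m[2] * n[0] + m[3] * n[2],
    m[2] * n[1] + m[3] * n[3],
  ]
}

function presentationMatrix(orientation: Orientation, isMirrored: boolean): Matrix2 {
  const radians = (ROTATION_DEGREES[orientation] * Math.PI) / 180
  // Rounding keeps the entries exact for multiples of 90 degrees.
  const cos = Math.round(Math.cos(radians))
  const sin = Math.round(Math.sin(radians))
  // In y-down screen coordinates this matrix rotates clockwise.
  const rotation: Matrix2 = [cos, -sin, sin, cos]
  // Mirroring is modelled as a horizontal flip.
  const mirror: Matrix2 = isMirrored ? [-1, 0, 0, 1] : [1, 0, 0, 1]
  // mirror * rotation: points are rotated first, then flipped.
  return multiply(mirror, rotation)
}
```

With 'up' and false the result is the identity matrix, which matches the case where no rotation or mirroring is needed.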
Most Frame Processing libraries allow setting orientation and mirror settings as flags. For example, MLKit uses UIImageOrientation:
let frame = ...
let mlImage = MLImage(sampleBuffer: frame.sampleBuffer)
switch frame.orientation {
case .up:
// [!code ++]
mlImage.orientation = frame.isMirrored ? .upMirrored : .up
case .down:
// ...
}
[!TIP] See "Orientation" for more information about orientation.
A Frame has a GPU-backed buffer that contains its pixels - their layout is described by the Frame's pixelFormat.
const frame = ...
console.log(frame.pixelFormat) // 'yuv-420-8-bit-full'
While the most commonly known PixelFormat is RGB (often 'rgb-bgra-8-bit'), it is not natively produced by the Camera, which means it requires expensive conversion causing higher latency, more bandwidth, and overall slower performance.
The Camera's native pixel format is typically YUV (often 'yuv-420-8-bit-full') or a vendor-specific variation of it ('private'), which uses ~50% less memory than RGB and requires little to no conversion overhead.
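To make the memory difference concrete, here is a small back-of-the-envelope calculation. It assumes tightly packed buffers without row padding, which real camera buffers do not guarantee, and the helper names are made up for this example.

```typescript
// Bytes for one frame, assuming tightly packed rows (no padding).

function bgraBytes(width: number, height: number): number {
  return width * height * 4 // B, G, R, A at 1 byte each
}

function rgb24Bytes(width: number, height: number): number {
  return width * height * 3 // R, G, B at 1 byte each, no alpha
}

function yuv420Bytes(width: number, height: number): number {
  const luma = width * height // full-resolution Y plane
  // U and V are each subsampled by 2 horizontally and vertically.
  const chroma = 2 * Math.ceil(width / 2) * Math.ceil(height / 2)
  return luma + chroma
}

const exampleWidth = 1920
const exampleHeight = 1080
console.log(bgraBytes(exampleWidth, exampleHeight)) // 8294400
console.log(rgb24Bytes(exampleWidth, exampleHeight)) // 6220800
console.log(yuv420Bytes(exampleWidth, exampleHeight)) // 3110400
```

Under these assumptions, 8-bit YUV 4:2:0 takes half the bytes of 3-byte RGB, and less than half the bytes of 4-byte BGRA.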
[!NOTE] See "Frame Output: Choosing a Pixel Format" for more information about configuring the Pixel Format a CameraFrameOutput streams in.
[!NOTE] See "Pixel Formats Map: Inspecting a Pixel Format" for more information about Pixel Formats and their native counterparts.
A Frame exposes its native, GPU-backed pixel buffer via getPixelBuffer(), which provides zero-copy access into the Frame's actual buffer:
const frame = ...
// [!code ++]
const buffer = frame.getPixelBuffer()
[!WARNING] The buffer is only valid as long as the Frame is valid (see Frame.isValid). Once the Frame is disposed (see dispose()), the buffer must no longer be used.
The Pixel Buffer's layout depends on the Frame's pixelFormat. For example, in 'rgb-bgra-8-bit', pixels are laid out in 8-bit Uints, in the order of [B, G, R, A].
const frame = ...
const buffer = frame.getPixelBuffer()
const pixels = new Uint8Array(buffer)
if (frame.pixelFormat === 'rgb-bgra-8-bit') {
// [!code ++]
const firstPixel = { r: pixels[2], g: pixels[1], b: pixels[0] }
console.log(`First Pixel:`, firstPixel)
}
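Reading a pixel other than the first one requires an offset into the buffer. The sketch below assumes the 'rgb-bgra-8-bit' layout described above; `readBgraPixel` and its `bytesPerRow` parameter are hypothetical, introduced because rows in a native buffer may be padded and therefore longer than `width * 4` bytes.

```typescript
interface Rgba {
  r: number
  g: number
  b: number
  a: number
}

// Hypothetical helper: reads the pixel at (x, y) from a BGRA byte array.
// bytesPerRow is an assumed input; it may exceed width * 4 if rows are padded.
function readBgraPixel(pixels: Uint8Array, x: number, y: number, bytesPerRow: number): Rgba {
  const offset = y * bytesPerRow + x * 4
  return {
    b: pixels[offset],
    g: pixels[offset + 1],
    r: pixels[offset + 2],
    a: pixels[offset + 3],
  }
}

// Synthetic 2x1 image: a blue pixel followed by a red pixel, no row padding.
const syntheticPixels = new Uint8Array([
  255, 0, 0, 255, // B, G, R, A -> blue
  0, 0, 255, 255, // B, G, R, A -> red
])
console.log(readBgraPixel(syntheticPixels, 1, 0, 8)) // { b: 0, g: 0, r: 255, a: 255 }
```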
Some Frames are planar, which means they don't have a contiguous buffer of pixel data but instead use two or more separate buffers for their pixel data.
- A bi-planar YUV Frame has one buffer for the Y plane, and one buffer for the interleaved UV plane.
- A tri-planar YUV Frame has one Y, one U and one V plane.
To get the individual FramePlanes, use Frame.getPlanes():
const frame = ...
// [!code ++]
if (frame.isPlanar) {
// [!code ++]
const planes = frame.getPlanes()
if (planes.length === 2) {
// Y + UV
const yBuffer = planes[0].getPixelBuffer()
const uvBuffer = planes[1].getPixelBuffer()
const yPixels = new Uint8Array(yBuffer)
const uvPixels = new Uint8Array(uvBuffer)
// In an interleaved UV plane, U and V alternate: [U0, V0, U1, V1, ...]
const firstPixel = { y: yPixels[0], u: uvPixels[0], v: uvPixels[1] }
}
} else {
// regular pixel buffer access
}
[!NOTE] If Frame.isPlanar is false, Frame.getPlanes() may return an empty array.
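For a pixel at an arbitrary position in a two-plane Frame, the chroma sample is shared by a 2x2 block of luma samples. The following sketch shows one way to look up and convert such a pixel. It assumes 8-bit 4:2:0 data with U stored before V, full-range BT.601 coefficients, and hypothetical `yBytesPerRow` / `uvBytesPerRow` stride parameters; the actual chroma order and color matrix depend on the device and pixel format.

```typescript
interface Rgb {
  r: number
  g: number
  b: number
}

function clampToByte(value: number): number {
  return Math.max(0, Math.min(255, Math.round(value)))
}

// Full-range BT.601 YUV -> RGB (an assumption; some cameras use BT.709).
function yuvToRgb(y: number, u: number, v: number): Rgb {
  const cb = u - 128
  const cr = v - 128
  return {
    r: clampToByte(y + 1.402 * cr),
    g: clampToByte(y - 0.344136 * cb - 0.714136 * cr),
    b: clampToByte(y + 1.772 * cb),
  }
}

// Hypothetical helper: reads the pixel at (x, y) from separate Y and UV byte arrays.
function readBiPlanarPixel(
  yPixels: Uint8Array,
  uvPixels: Uint8Array,
  x: number,
  y: number,
  yBytesPerRow: number,
  uvBytesPerRow: number
): Rgb {
  const luma = yPixels[y * yBytesPerRow + x]
  // One interleaved (U, V) pair covers a 2x2 block of luma samples.
  const uvOffset = Math.floor(y / 2) * uvBytesPerRow + Math.floor(x / 2) * 2
  return yuvToRgb(luma, uvPixels[uvOffset], uvPixels[uvOffset + 1])
}
```

Doing this per pixel in JS is slow for full frames; it is meant for spot checks, while bulk conversion is better left to native code or the GPU.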
A Frame is a large GPU-buffer of raw pixel data. A 4k RGB Frame (3840 × 2160 at 4 bytes per pixel) is roughly 33MB in memory, so if a CameraFrameOutput streams at 60 FPS, it uses about 2GB/s of bandwidth.
Typically the pipeline uses ring-buffers and avoids any copies to ensure it can run in realtime.
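The size and bandwidth of a stream can be estimated with simple arithmetic. This sketch assumes a 3840 × 2160 Frame at 4 bytes per pixel with no row padding; the function names are illustrative only.

```typescript
function frameSizeBytes(width: number, height: number, bytesPerPixel: number): number {
  return width * height * bytesPerPixel
}

function streamBytesPerSecond(bytesPerFrame: number, fps: number): number {
  return bytesPerFrame * fps
}

const fourKFrameBytes = frameSizeBytes(3840, 2160, 4)
const fourKStreamBytes = streamBytesPerSecond(fourKFrameBytes, 60)

console.log(fourKFrameBytes) // 33177600 (about 33 MB)
console.log(fourKStreamBytes) // 1990656000 (about 2 GB/s)
```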
To let the pipeline know that a buffer can be re-used, you need to dispose a Frame once you are done with it, ideally as quickly as possible:
const frameOutput = useFrameOutput({
onFrame(frame) {
'worklet'
try {
// ..any processing
} finally {
// [!code ++]
frame.dispose()
}
}
})
After a Frame has been disposed, it is no longer valid (Frame.isValid == false).
Any Pixel Buffers or FramePlanes are also no longer valid and must not be used anymore.
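If some pixel data has to outlive the Frame, one option is to copy the bytes out before disposing. The `copyPixels` helper below is a hypothetical sketch that relies only on standard typed-array behavior; it gives up the zero-copy benefit, so it is best reserved for small regions or occasional snapshots.

```typescript
// Hypothetical helper: copies the bytes of a buffer into independent memory.
// Uint8Array.prototype.slice() allocates a new backing ArrayBuffer, so the
// copy no longer refers to the original buffer.
function copyPixels(buffer: ArrayBuffer): Uint8Array {
  return new Uint8Array(buffer).slice()
}

// Stand-in for a Frame's pixel buffer in this example.
const sourceBuffer = new ArrayBuffer(4)
new Uint8Array(sourceBuffer).set([10, 20, 30, 40])

const snapshot = copyPixels(sourceBuffer)

// Later changes to (or release of) the source do not affect the copy.
new Uint8Array(sourceBuffer).fill(0)
console.log(Array.from(snapshot)) // [ 10, 20, 30, 40 ]
```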