scientific-skills/flowio/references/api_reference.md
FlowIO is a Python library for reading and writing Flow Cytometry Standard (FCS) files. It supports FCS versions 2.0, 3.0, and 3.1 with minimal dependencies.
pip install flowio
Supports Python 3.9 and later.
The primary class for working with FCS files.
FlowData(fcs_file,
         ignore_offset_error=False,
         ignore_offset_discrepancy=False,
         use_header_offsets=False,
         only_text=False,
         nextdata_offset=None,
         null_channel_list=None)
Parameters:
- fcs_file: File path (str), Path object, or file handle
- ignore_offset_error (bool): Ignore offset errors (default: False)
- ignore_offset_discrepancy (bool): Ignore offset discrepancies between HEADER and TEXT sections (default: False)
- use_header_offsets (bool): Use HEADER section offsets instead of TEXT section (default: False)
- only_text (bool): Only parse the TEXT segment, skip DATA and ANALYSIS (default: False)
- nextdata_offset (int): Byte offset for reading multi-dataset files
- null_channel_list (list): List of PnN labels for null channels to exclude

File Information:
- name: Name of the FCS file
- file_size: Size of the file in bytes
- version: FCS version (e.g., '3.0', '3.1')
- header: Dictionary containing HEADER segment information
- data_type: Type of data format ('I', 'F', 'D', 'A')

Channel Information:
- channel_count: Number of channels in the dataset
- channels: Dictionary mapping channel numbers to channel info
- pnn_labels: List of PnN (short channel name) labels
- pns_labels: List of PnS (descriptive stain name) labels
- pnr_values: List of PnR (range) values for each channel
- fluoro_indices: List of indices for fluorescence channels
- scatter_indices: List of indices for scatter channels
- time_index: Index of the time channel (or None)
- null_channels: List of null channel indices

Event Data:
- event_count: Number of events (rows) in the dataset
- events: Raw event data as bytes

Metadata:
- text: Dictionary of TEXT segment key-value pairs
- analysis: Dictionary of ANALYSIS segment key-value pairs (if present)

as_array(preprocess=True)
Return event data as a 2-D NumPy array.
Parameters:
- preprocess (bool): Apply gain, logarithmic, and time scaling transformations (default: True)

Returns: 2-D NumPy array of event data (rows = events, columns = channels)
Example:
flow_data = FlowData('sample.fcs')
events_array = flow_data.as_array() # Preprocessed data
raw_array = flow_data.as_array(preprocess=False) # Raw data
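Conceptually, turning the DATA segment into a 2-D array means unpacking a flat run of values and grouping them by channel count. The sketch below shows that idea with the standard library only, assuming single-precision float data ($DATATYPE 'F'). The function name `decode_float_events` is hypothetical and is not part of FlowIO.

```python
import struct

def decode_float_events(data: bytes, channel_count: int, byteord: str = "1,2,3,4"):
    """Unpack a float32 DATA segment into a list of event rows (illustrative only)."""
    # '1,2,3,4' is little-endian; anything else is treated as big-endian here
    endian = "<" if byteord == "1,2,3,4" else ">"
    value_count = len(data) // 4  # 4 bytes per single-precision value
    flat = struct.unpack(f"{endian}{value_count}f", data[: value_count * 4])
    # Group the flat sequence into rows of `channel_count` values, one row per event
    return [list(flat[i:i + channel_count])
            for i in range(0, value_count, channel_count)]

# Two events with three channels each, packed little-endian
raw = struct.pack("<6f", 1.0, 2.0, 3.0, 4.0, 5.0, 6.0)
rows = decode_float_events(raw, channel_count=3)
print(rows)  # [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
```

In practice `as_array()` does this work for you; the sketch is only meant to make the rows = events, columns = channels layout concrete.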
write_fcs(filename, metadata=None)
Export the FlowData instance as a new FCS file.
Parameters:
- filename (str): Output file path
- metadata (dict): Optional dictionary of TEXT segment keywords to add/update

Example:
flow_data = FlowData('sample.fcs')
flow_data.write_fcs('output.fcs', metadata={'$SRC': 'Modified data'})
Note: Exports as FCS 3.1 with single-precision floating-point data.
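Because the export uses single-precision floats, values that were held in double precision are rounded on write. The following standard-library sketch illustrates the size of that effect; `to_float32` is a hypothetical helper written for this example, not a FlowIO function.

```python
import struct

def to_float32(value: float) -> float:
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]

original = 0.1
stored = to_float32(original)
print(stored == original)              # False: 0.1 is not exactly representable in float32
print(abs(stored - original) < 1e-8)   # True: the rounding error is small but non-zero
print(to_float32(262143.0))            # 262143.0: integers below 2**24 survive exactly
```

For typical cytometry ranges this rounding is negligible, but it is worth keeping in mind when comparing a written file against the original array for exact equality.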
read_multiple_data_sets(fcs_file,
                        ignore_offset_error=False,
                        ignore_offset_discrepancy=False,
                        use_header_offsets=False)
Read all datasets from an FCS file containing multiple datasets.
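Parameters:
The parameters have the same meaning as the corresponding FlowData constructor parameters (there is no nextdata_offset).

Returns: list of FlowData instances, one per dataset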
Example:
from flowio import read_multiple_data_sets
datasets = read_multiple_data_sets('multi_dataset.fcs')
print(f"Found {len(datasets)} datasets")
for i, dataset in enumerate(datasets):
    print(f"Dataset {i}: {dataset.event_count} events")
create_fcs(filename,
           event_data,
           channel_names,
           opt_channel_names=None,
           metadata=None)
Create a new FCS file from event data.
Parameters:
- filename (str): Output file path
- event_data (ndarray): 2-D NumPy array of event data (rows=events, columns=channels)
- channel_names (list): List of PnN (short) channel names
- opt_channel_names (list): Optional list of PnS (descriptive) channel names
- metadata (dict): Optional dictionary of TEXT segment keywords

Example:
import numpy as np
from flowio import create_fcs
# Create synthetic data
events = np.random.rand(10000, 5)
channels = ['FSC-A', 'SSC-A', 'FL1-A', 'FL2-A', 'Time']
opt_channels = ['Forward Scatter', 'Side Scatter', 'FITC', 'PE', 'Time']
create_fcs('synthetic.fcs',
           events,
           channels,
           opt_channel_names=opt_channels,
           metadata={'$SRC': 'Synthetic data'})
Generic warning class for non-critical issues.
Warning raised when PnE values are invalid during FCS file creation.
Base exception class for FlowIO errors.
Raised when there are issues parsing an FCS file.
Raised when the HEADER and TEXT sections provide different byte offsets for data segments.
Workaround: Use ignore_offset_discrepancy=True parameter when creating FlowData instance.
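To make the discrepancy concrete: the FCS HEADER stores six right-justified 8-byte ASCII offsets starting at byte 10 (TEXT, DATA, and ANALYSIS start/end), while the TEXT segment repeats the DATA offsets in $BEGINDATA and $ENDDATA. The sketch below compares the two using only the standard library. The function names are hypothetical, and the comparison is a simplification (for example, large files may legitimately store 0 in the HEADER and rely on TEXT); it is not FlowIO's actual check.

```python
def parse_header_offsets(header: bytes) -> dict:
    """Read the six 8-byte ASCII offset fields from a 58-byte FCS HEADER."""
    names = ["text_start", "text_end", "data_start", "data_end",
             "analysis_start", "analysis_end"]
    offsets = {}
    for i, name in enumerate(names):
        field = header[10 + 8 * i: 18 + 8 * i].decode("ascii").strip()
        offsets[name] = int(field) if field else 0  # blank fields are treated as 0
    return offsets

def data_offsets_disagree(header_offsets: dict, text: dict) -> bool:
    """Compare HEADER DATA offsets with $BEGINDATA / $ENDDATA from TEXT."""
    text_pair = (int(text["$BEGINDATA"]), int(text["$ENDDATA"]))
    header_pair = (header_offsets["data_start"], header_offsets["data_end"])
    return header_pair != text_pair

# Build a synthetic 58-byte HEADER: version, four spaces, six offset fields
header = b"FCS3.1    " + b"".join(
    f"{n:>8d}".encode("ascii") for n in (58, 1023, 1024, 5023, 0, 0)
)
offsets = parse_header_offsets(header)
print(offsets["data_start"], offsets["data_end"])  # 1024 5023
print(data_offsets_disagree(offsets, {"$BEGINDATA": "1024", "$ENDDATA": "5024"}))  # True
```

A one-byte mismatch like the one above is the kind of situation where ignore_offset_discrepancy=True or use_header_offsets=True becomes relevant.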
Raised when attempting to read a file with multiple datasets using the standard FlowData constructor.
Solution: Use read_multiple_data_sets() function instead.
FCS files consist of four segments: HEADER, TEXT, DATA, and ANALYSIS (optional).
- $BEGINDATA, $ENDDATA: Byte offsets for DATA segment
- $BEGINANALYSIS, $ENDANALYSIS: Byte offsets for ANALYSIS segment
- $BYTEORD: Byte order (1,2,3,4 for little-endian; 4,3,2,1 for big-endian)
- $DATATYPE: Data type ('I'=integer, 'F'=float, 'D'=double, 'A'=ASCII)
- $MODE: Data mode ('L'=list mode, most common)
- $NEXTDATA: Offset to next dataset (0 if single dataset)
- $PAR: Number of parameters (channels)
- $TOT: Total number of events
- PnN: Short name for parameter n
- PnS: Descriptive stain name for parameter n
- PnR: Range (max value) for parameter n
- PnE: Amplification exponent for parameter n (format: "f1,f2", where f1 is the number of logarithmic decades and f2 is the linear value at channel 0; scaled value = f2 * 10^(f1 * x / PnR))
- PnG: Amplification gain for parameter n

FlowIO automatically categorizes channels as scatter, fluorescence, or time channels.
Access indices via:
- flow_data.scatter_indices
- flow_data.fluoro_indices
- flow_data.time_index

When calling as_array(preprocess=True), FlowIO applies gain scaling, logarithmic scaling, and time scaling.

To access raw, unprocessed data, call as_array(preprocess=False).
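The scaling steps can be sketched per value as follows. This follows the standard FCS conventions (log scaling from PnE and PnR, linear gain from PnG, time scaling from $TIMESTEP) and is an illustration under those assumptions, not FlowIO's actual implementation. The function names and the fallback of treating an invalid "f1,0" as "f1,1" are assumptions made for this example.

```python
def scale_value(x, pne=(0.0, 0.0), pnr=1024.0, png=1.0):
    """Convert one raw channel value to scale using PnE, PnR and PnG (sketch)."""
    decades, log0 = pne
    if decades > 0:
        if log0 == 0:
            log0 = 1.0  # assumed convention for an invalid "f1,0" pair
        # Logarithmic amplifier: map the channel number onto `decades` decades
        return log0 * 10 ** (decades * x / pnr)
    # Linear amplifier: undo the gain
    return x / png

def scale_time(x, timestep=1.0):
    """Convert a raw time-channel value to seconds using $TIMESTEP (sketch)."""
    return x * timestep

print(scale_value(512, pne=(4.0, 1.0), pnr=1024.0))  # 100.0 (halfway through 4 decades)
print(scale_value(1000, png=2.0))                    # 500.0
print(scale_time(40, timestep=0.25))                 # 10.0
```

Comparing `as_array()` with `as_array(preprocess=False)` on a real file is the reliable way to see which of these steps actually applied to a given channel.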
- Use only_text=True when only metadata is needed
- Use read_multiple_data_sets() if unsure about dataset count
- Pass ignore_offset_discrepancy=True when the HEADER and TEXT offsets disagree

For advanced flow cytometry analysis including compensation, gating, and GatingML support, consider using the FlowKit library alongside FlowIO. FlowKit provides higher-level abstractions built on top of FlowIO's file parsing capabilities.
from flowio import FlowData
# Read FCS file
flow = FlowData('experiment.fcs')
# Print basic info
print(f"Version: {flow.version}")
print(f"Events: {flow.event_count}")
print(f"Channels: {flow.channel_count}")
print(f"Channel names: {flow.pnn_labels}")
# Get event data
events = flow.as_array()
print(f"Data shape: {events.shape}")
from flowio import FlowData
flow = FlowData('sample.fcs', only_text=True)
# Access metadata
# FlowIO stores TEXT keys lowercased and without the leading '$'
print(f"Acquisition date: {flow.text.get('date', 'N/A')}")
print(f"Instrument: {flow.text.get('cyt', 'N/A')}")
# Channel information
for i, (pnn, pns) in enumerate(zip(flow.pnn_labels, flow.pns_labels)):
    print(f"Channel {i}: {pnn} ({pns})")
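The PnN labels printed above are also what a scatter/fluorescence/time split can be derived from. The sketch below uses a simple name-based heuristic to show the idea. It is a hypothetical rule written for illustration, and FlowIO's own categorization may differ in detail, so prefer scatter_indices, fluoro_indices, and time_index when working with a FlowData instance.

```python
def categorize_channels(pnn_labels):
    """Split channel indices into scatter, fluorescence and time (heuristic sketch)."""
    scatter, fluoro, time_index = [], [], None
    for i, label in enumerate(pnn_labels):
        name = label.lower()
        if name.startswith(("fsc", "ssc")):
            scatter.append(i)      # forward / side scatter
        elif name == "time":
            time_index = i         # at most one time channel is expected
        else:
            fluoro.append(i)       # everything else is treated as fluorescence
    return scatter, fluoro, time_index

labels = ["FSC-A", "SSC-A", "FL1-A", "FL2-A", "Time"]
print(categorize_channels(labels))  # ([0, 1], [2, 3], 4)
```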
import numpy as np
from flowio import create_fcs
# Generate or process data
data = np.random.rand(5000, 3) * 1000
# Define channels
channels = ['FSC-A', 'SSC-A', 'FL1-A']
stains = ['Forward Scatter', 'Side Scatter', 'GFP']
# Create FCS file
create_fcs('output.fcs',
           data,
           channels,
           opt_channel_names=stains,
           metadata={
               '$SRC': 'Python script',
               '$DATE': '19-OCT-2025'
           })
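The $DATE value above follows the FCS dd-mmm-yyyy convention. Rather than hard-coding the string, it can be generated from a `datetime.date`. The helper below is a small hypothetical utility (not part of FlowIO) that uses a fixed month table so the result does not depend on the system locale.

```python
from datetime import date

_MONTHS = ("JAN", "FEB", "MAR", "APR", "MAY", "JUN",
           "JUL", "AUG", "SEP", "OCT", "NOV", "DEC")

def fcs_date(d: date) -> str:
    """Format a date as dd-mmm-yyyy, the layout used by the $DATE keyword."""
    return f"{d.day:02d}-{_MONTHS[d.month - 1]}-{d.year:04d}"

print(fcs_date(date(2025, 10, 19)))  # 19-OCT-2025
```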
from flowio import read_multiple_data_sets
# Read all datasets
datasets = read_multiple_data_sets('multi.fcs')
# Process each dataset
for i, dataset in enumerate(datasets):
    print(f"\nDataset {i}:")
    print(f"  Events: {dataset.event_count}")
    print(f"  Channels: {dataset.pnn_labels}")
    # Get data array
    events = dataset.as_array()
    mean_values = events.mean(axis=0)
    print(f"  Mean values: {mean_values}")
from flowio import FlowData
# Read original file
flow = FlowData('original.fcs')
# Get event data
events = flow.as_array(preprocess=False)
# Modify data (example: apply custom transformation)
events[:, 0] = events[:, 0] * 1.5 # Scale first channel
# Note: Currently, FlowIO doesn't support direct modification of event data
# For modifications, use create_fcs() instead:
from flowio import create_fcs
create_fcs('modified.fcs',
           events,
           flow.pnn_labels,
           opt_channel_names=flow.pns_labels,
           metadata=flow.text)
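Passing the whole TEXT dictionary forwards structural keywords (byte offsets, event counts, per-parameter entries) that described the original file and may no longer match the new one. How create_fcs treats such keys is not stated here, so one cautious option is to forward only descriptive keywords. The filter below is a hypothetical helper sketched for that purpose. The keyword set is an assumption, and keys are normalized so that both '$TOT' and 'tot' styles are handled.

```python
import re

# Keywords that describe file layout rather than the experiment (assumed set)
_STRUCTURAL = {"beginanalysis", "endanalysis", "begindata", "enddata",
               "beginstext", "endstext", "byteord", "datatype", "mode",
               "nextdata", "par", "tot"}
_PARAM_KEY = re.compile(r"^p\d+[a-z]$")  # per-parameter keys such as P1N, P2R, P3E

def descriptive_keywords(text: dict) -> dict:
    """Return a copy of `text` without structural or per-parameter keywords."""
    kept = {}
    for key, value in text.items():
        norm = key.lower().lstrip("$")
        if norm in _STRUCTURAL or _PARAM_KEY.match(norm):
            continue
        kept[key] = value
    return kept

sample = {"$BEGINDATA": "1024", "$P1N": "FSC-A", "$SRC": "x", "cyt": "y", "$TOT": "5"}
print(descriptive_keywords(sample))  # {'$SRC': 'x', 'cyt': 'y'}
```

The result could then be passed as the metadata argument instead of flow.text.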