python/3_FrameworkInterop/customTensorFlowKernel/README.md
Learn how to add a custom GPU operation to TensorFlow using cuda.core with tf.py_function. This sample implements a custom ReLU operation (y = max(0, x)) for rapid prototyping of GPU operations.
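When prototyping a kernel for y = max(0, x), it can help to have a CPU reference to compare against. The following is a minimal sketch using NumPy only; `relu_reference` and `matches_reference` are hypothetical helper names introduced for illustration and are not part of the sample.

```python
import numpy as np


def relu_reference(x):
    # CPU reference for y = max(0, x), applied element-wise.
    return np.maximum(x, 0.0)


def matches_reference(candidate, size=1000, seed=0, atol=1e-6):
    # Feed the same random float32 input to the candidate and to the
    # reference, then compare the two results element-wise.
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(size).astype(np.float32)
    got = np.asarray(candidate(x))
    return bool(np.allclose(got, relu_reference(x), atol=atol))
```

Any callable that accepts a NumPy array and returns something array-like can be passed as `candidate`, so a GPU-backed implementation could be checked the same way once its result has been copied back to the host.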
Q: How do I add a custom GPU op to TensorFlow?
A: Use tf.py_function to wrap cuda.core kernels:
tf.py_function to call from TensorFlow
@tf.custom_gradient
cd python/3_FrameworkInterop/customTensorFlowKernel
pip install -r requirements.txt
python customTensorFlowKernel.py
python customTensorFlowKernel.py --size 1000000
import tensorflow as tf
from customTensorFlowKernel import custom_relu
# Simple usage
x = tf.random.normal([100], dtype=tf.float32)
y = custom_relu(x)
# In a Keras model
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128),
    tf.keras.layers.Lambda(custom_relu),
    tf.keras.layers.Dense(10)
])
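The wrapping pattern named earlier (tf.py_function plus @tf.custom_gradient) can be sketched as follows. This is an illustration under stated assumptions, not the sample's actual implementation: the two `_host` functions are hypothetical NumPy stand-ins for the place where a cuda.core kernel launch would go, and `sketch_relu` is a made-up name chosen to avoid confusion with the sample's `custom_relu`.

```python
import numpy as np
import tensorflow as tf


def _relu_forward_host(x):
    # Hypothetical stand-in for a cuda.core kernel launch: receives an
    # eager tensor from tf.py_function and returns max(0, x) as a NumPy array.
    return np.maximum(x.numpy(), 0.0)


def _relu_backward_host(x, dy):
    # Hypothetical stand-in for a backward kernel: passes the upstream
    # gradient through where x > 0 and zeroes it elsewhere.
    return dy.numpy() * (x.numpy() > 0.0)


@tf.custom_gradient
def sketch_relu(x):
    # Forward pass: run arbitrary Python code from inside TensorFlow.
    y = tf.py_function(_relu_forward_host, inp=[x], Tout=x.dtype)
    y.set_shape(x.shape)  # tf.py_function does not carry static shape info

    def grad(dy):
        # Backward pass: supplied explicitly, because TensorFlow cannot
        # differentiate through the NumPy code in the forward function.
        dx = tf.py_function(_relu_backward_host, inp=[x, dy], Tout=x.dtype)
        dx.set_shape(x.shape)
        return dx

    return y, grad
```

A quick way to exercise both passes is to call `sketch_relu` on a small constant tensor inside a `tf.GradientTape` and inspect the gradient of the summed output. The explicit `set_shape` calls are a design choice to restore the static shape that downstream layers may rely on.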
This sample is for rapid prototyping. For production: