Back to Deeplearning4j

Refactor nd4j to Centralize Offset Storage and Introduce OpaqueNDArray

ADRs/0028 - Offset centralization.md

1.0.0-M2.13.7 KB
Original Source

Refactor nd4j to Centralize Offset Storage and Introduce OpaqueNDArray

Status

Proposed

Proposed by: Adam Gibson Oct 24,2024

Context

The current nd4j codebase stores offsets in multiple locations, including opaque data buffers and data buffers. This scattered approach leads to inconsistencies, potential bugs, and difficulties in maintenance. Additionally, the current method of passing NDArray components from Java to C++ involves manually unpacking various elements, which is error-prone and cumbersome.

Proposal

We propose to refactor the nd4j codebase to centralize offset storage within NDArrays and introduce a new OpaqueNDArray type for improved Java-C++ interoperability. The key features of this proposal include:

  1. Moving all offset information into NDArray objects, removing them from opaque data buffers and data buffers.
  2. Introducing an OpaqueNDArray type in C++, which is an alias for NDArray*.
  3. Updating the Java-C++ interop layer to use OpaqueNDArray for passing NDArray information.
  4. Refactoring existing code to use the new centralized offset storage and OpaqueNDArray.
  5. Updating documentation and coding standards to reflect these changes.

Example of the proposed change:

// Before void someOperation(void* dataBuffer, sd::LongType* shapeBuffer, sd::LongType offset, ...) { // Manual unpacking and offset handling }

// After typedef NDArray* OpaqueNDArray;

void someOperation(OpaqueNDArray array) { // Access all necessary information through the NDArray object sd::LongType offset = array->offset(); // ... }

Consequences

Advantages

  • Improves code consistency and reduces the risk of offset-related bugs.
  • Simplifies the Java-C++ interop by encapsulating NDArray information.
  • Reduces the likelihood of errors from manual unpacking of NDArray components.
  • Makes the codebase more maintainable and easier to understand.

Disadvantages

  • Requires a significant refactoring effort across the nd4j codebase.
  • May introduce temporary bugs during the transition if not done carefully.
  • Could potentially impact performance if not optimized properly.
  • Might require updates to external code that interacts with nd4j.

Risks

  • Risk of introducing bugs during the refactoring process, especially in complex operations.
  • Potential for decreased performance if the new offset access methods are not efficiently implemented.
  • May cause confusion for developers who are accustomed to the current system.
  • Could potentially break existing code that relies on the current offset storage mechanism.

Action Items

  1. Develop a comprehensive guide for implementing and using the new offset storage system and OpaqueNDArray.
  2. Create a set of unit tests to verify the correctness of offset handling and OpaqueNDArray usage.
  3. Update the team's coding standards to include guidelines on using the new offset storage and OpaqueNDArray.
  4. Conduct a pilot implementation on a small, isolated part of the codebase.
  5. Schedule the refactoring process, prioritizing the most frequently used operations.
  6. Update all relevant documentation, including comments in the code and API documentation.
  7. Implement benchmarks to compare performance before and after the changes.
  8. Conduct thorough code reviews and testing for each refactored section.
  9. Plan for a grace period to allow developers to familiarize themselves with the new system.
  10. Monitor and address any issues or feedback arising from the new offset storage and OpaqueNDArray usage.
  11. Consider creating a static analysis tool to ensure consistent usage of the new system across the codebase.
  12. Update any build scripts or configuration files that may be affected by the changes.
  13. Prepare a migration guide for users of nd4j to update their code to work with the new system.