contrib/nd4j-log-analyzer/README.md
ND4J Log Analyzer is a Java agent that records ND4J operation executions in an H2 database and indexes a specified DeepLearning4J codebase. It helps identify regressions between versions of DeepLearning4J by comparing the execution patterns and performance metrics of ND4J operations.
Clone the repository:
git clone https://github.com/your-repo/nd4j-log-analyzer.git
Navigate to the project directory:
cd nd4j-log-analyzer
Build the project using Maven:
mvn clean package
To use the ND4J Log Analyzer, you need to inject it as a Java agent into your DeepLearning4J application. Use the following VM arguments when running your Java application:
-DsourceCodeIndexerPath=/path/to/your/deeplearning4j/codebase -javaagent:/path/to/nd4j-log-analyzer-1.0-SNAPSHOT.jar
Example:
-DsourceCodeIndexerPath=/home/user/Documents/GitHub/deeplearning4j/ -javaagent:/home/user/Documents/GitHub/deeplearning4j/contrib/nd4j-log-analyzer/nd4j-log-analyzer/target/nd4j-log-analyzer-1.0-SNAPSHOT.jar
Make sure to replace the paths with the appropriate locations on your system.
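To confirm that both VM arguments actually reached the JVM, a small check like the following can be run with the same arguments. This is a minimal sketch: the class name `AgentArgsCheck` and its helper are hypothetical and not part of the Log Analyzer. It relies only on the standard `RuntimeMXBean.getInputArguments()` call.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper class, not part of the ND4J Log Analyzer.
public class AgentArgsCheck {

    /** Returns true if any JVM argument starts with the given prefix. */
    static boolean hasArgWithPrefix(List<String> jvmArgs, String prefix) {
        for (String arg : jvmArgs) {
            if (arg.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself, not to the application's main method.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();

        boolean agentAttached = hasArgWithPrefix(jvmArgs, "-javaagent:");
        boolean indexerPathSet = hasArgWithPrefix(jvmArgs, "-DsourceCodeIndexerPath=");

        System.out.println("-javaagent present: " + agentAttached);
        System.out.println("-DsourceCodeIndexerPath present: " + indexerPathSet);
    }
}
```

The prefix check only shows that some agent is attached; comparing the full argument against the expected JAR path would be a stricter test.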
The agent uses two main configuration options:
sourceCodeIndexerPath: The path to the DeepLearning4J codebase you want to index.
javaagent: The path to the compiled ND4J Log Analyzer JAR file.
The H2 database is automatically created and managed by the agent. It contains two main tables:
OpLogEvent: Stores information about ND4J operation executions.
SourceCodeLine: Stores indexed source code information (created only if sourceCodeIndexerPath is provided).
OpLogEvent table.
SourceCodeLine table.
The ND4J Log Analyzer includes a StackTraceCodeFinder utility that helps locate the relevant source code lines for recorded operations. Key features include:
String rootDirectory = "/path/to/your/deeplearning4j/codebase";
StackTraceElement[] stackTrace = Thread.currentThread().getStackTrace(); // or any previously captured stack trace
String sourceCodeLine = StackTraceCodeFinder.getFirstLineOfCode(rootDirectory, stackTrace);
This utility is used internally by the Log Analyzer to associate recorded operations with their corresponding source code lines.
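To illustrate the general idea of mapping a stack frame back to a line of source, here is a minimal sketch. It is not the actual StackTraceCodeFinder implementation: the class `SourceLineLookupSketch`, its methods, and the lookup strategy (convert the class name to a relative `.java` path, search for it under the root, read the reported line) are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Optional;
import java.util.stream.Stream;

// Hypothetical sketch; the real StackTraceCodeFinder may work differently.
public class SourceLineLookupSketch {

    /** Maps a frame's class to a relative source path, e.g. a.b.C$Inner -> a/b/C.java. */
    static String relativeSourcePath(StackTraceElement frame) {
        String className = frame.getClassName();
        int innerIdx = className.indexOf('$');
        if (innerIdx >= 0) {
            className = className.substring(0, innerIdx); // nested classes live in the outer file
        }
        return className.replace('.', '/') + ".java";
    }

    /** Returns the trimmed source line for the frame, if a matching file exists under root. */
    static Optional<String> findLine(Path root, StackTraceElement frame) throws IOException {
        if (frame.getLineNumber() < 1) {
            return Optional.empty(); // native or synthetic frames carry no usable line number
        }
        String suffix = "/" + relativeSourcePath(frame);
        try (Stream<Path> files = Files.walk(root)) {
            Optional<Path> match = files
                    .filter(Files::isRegularFile)
                    .filter(p -> p.toString().replace('\\', '/').endsWith(suffix))
                    .findFirst();
            if (!match.isPresent()) {
                return Optional.empty();
            }
            List<String> lines = Files.readAllLines(match.get());
            int idx = frame.getLineNumber() - 1; // stack traces use 1-based line numbers
            return idx < lines.size() ? Optional.of(lines.get(idx).trim()) : Optional.empty();
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        // Print the first frame of the current stack that resolves to a file under root.
        for (StackTraceElement frame : Thread.currentThread().getStackTrace()) {
            Optional<String> line = findLine(root, frame);
            if (line.isPresent()) {
                System.out.println(frame + " -> " + line.get());
                break;
            }
        }
    }
}
```

Walking the whole tree for every frame is slow on a large codebase; building an index of file paths once would be the more practical design.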
The JsonComparisonReport compares operation logs between different runs or versions of your DeepLearning4J application. It reports differences in ND4J operations, which helps detect regressions or unexpected changes in behavior.
Key features of the JsonComparisonReport include:
To use the JsonComparisonReport, run it as a standalone Java application:
java org.nd4j.interceptor.data.JsonComparisonReport <directory1> <directory2>
Where:
<directory1> is the path to the first set of JSON log files
<directory2> is the path to the second set of JSON log files to compare against
The tool will generate two types of reports for each epsilon value defined in InterceptorEnvironment.EPSILONS:
comparison_report_<epsilon>.json: A detailed report of all differences found
earliest_difference_<epsilon>.json: Information about the first difference encountered in the execution flow
These reports can be used to identify and analyze discrepancies between different runs or versions of your DeepLearning4J application.
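The role of the epsilon value can be shown with a small tolerance-based comparison. The sketch below is an assumption about the general technique, not the tool's actual logic: `EpsilonDiffSketch`, `firstDifference`, and the sample epsilon values are hypothetical and are not taken from InterceptorEnvironment.EPSILONS.

```java
// Hypothetical sketch of an epsilon-tolerant comparison; not the tool's real code.
public class EpsilonDiffSketch {

    /**
     * Returns the index of the first element pair whose absolute difference
     * exceeds epsilon, or -1 if the arrays agree within the tolerance.
     * A length mismatch counts as a difference at the shorter array's length.
     */
    static int firstDifference(double[] a, double[] b, double epsilon) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            boolean bothNaN = Double.isNaN(a[i]) && Double.isNaN(b[i]);
            if (a[i] == b[i] || bothNaN) {
                continue; // exact match, including equal infinities and NaN pairs
            }
            // Written as !(x <= eps) so a NaN difference is reported as a mismatch.
            if (!(Math.abs(a[i] - b[i]) <= epsilon)) {
                return i;
            }
        }
        return a.length == b.length ? -1 : n;
    }

    public static void main(String[] args) {
        double[] run1 = {1.0, 2.0, 3.0};
        double[] run2 = {1.0, 2.0005, 3.1};
        double[] epsilons = {1e-2, 1e-4}; // illustrative values only
        for (double eps : epsilons) {
            System.out.println("epsilon=" + eps + " first difference at index "
                    + firstDifference(run1, run2, eps));
        }
    }
}
```

A looser epsilon hides small numerical drift and surfaces only larger divergences, which is why one report per epsilon value can be useful.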
The JsonReport is a utility tool that generates JSON files for each unique operation name from the recorded ND4J operations. This tool is useful for exporting the collected data in a format that's easy to analyze or compare using other tools.
Key features of the JsonReport include:
To use the JsonReport, run it as a standalone Java application:
java org.nd4j.interceptor.data.JsonReport <path_to_oplog.db>
Where:
<path_to_oplog.db> is the path to the H2 database file containing the recorded operations
The tool will create a new directory called "jsonReports" (or clear it if it already exists) and generate JSON files for each unique operation name found in the database.
For each unique operation name, a JSON file will be created in the "jsonReports" directory. The file name will be <operation_name>.json. Each file contains:
These JSON files can be used for further analysis, comparison between different runs, or as input for other tools like the JsonComparisonReport.
java org.nd4j.interceptor.data.JsonReport path/to/your/oplog.db
java org.nd4j.interceptor.data.JsonComparisonReport path/to/jsonReports1 path/to/jsonReports2
After running your DeepLearning4J application with the agent, you can query the H2 database to analyze the recorded operations. The InterceptorPersistence class provides several methods for data analysis:
Get all unique operation names:
Set<String> uniqueOpNames = InterceptorPersistence.getUniqueOpNames(filePath);
Filter operations by name:
List<OpLogEvent> filteredEvents = InterceptorPersistence.filterByOpName(filePath, opName);
Group operations by source code line:
Map<String, List<OpLogEvent>> groupedEvents = InterceptorPersistence.groupedByCodeSortedByEventId(logEvents);
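The grouping step can be pictured with a self-contained sketch. The `Event` class below is a hypothetical stand-in for OpLogEvent (its real fields may differ), and `groupByCodeLine` only illustrates the idea suggested by the method name: group events by source line and keep each group ordered by event id.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the InterceptorPersistence implementation.
public class GroupByCodeLineSketch {

    /** Stand-in for OpLogEvent with only the fields this example needs. */
    static final class Event {
        final long eventId;
        final String opName;
        final String codeLine;

        Event(long eventId, String opName, String codeLine) {
            this.eventId = eventId;
            this.opName = opName;
            this.codeLine = codeLine;
        }
    }

    /** Groups events by source line; each group is ordered by ascending event id. */
    static Map<String, List<Event>> groupByCodeLine(List<Event> events) {
        List<Event> sorted = new ArrayList<>(events);
        sorted.sort(Comparator.comparingLong((Event e) -> e.eventId));
        // LinkedHashMap keeps groups in order of their earliest event.
        Map<String, List<Event>> grouped = new LinkedHashMap<>();
        for (Event e : sorted) {
            grouped.computeIfAbsent(e.codeLine, k -> new ArrayList<>()).add(e);
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
                new Event(3, "add", "INDArray c = a.add(b);"),
                new Event(1, "mmul", "INDArray y = w.mmul(x);"),
                new Event(2, "add", "INDArray c = a.add(b);"));
        for (Map.Entry<String, List<Event>> entry : groupByCodeLine(events).entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue().size() + " event(s)");
        }
    }
}
```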
You can also use your preferred SQL client or the H2 Console to connect to the database and run custom queries.
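A custom query can also be issued from plain JDBC. The sketch below makes several assumptions that should be checked against your setup: the H2 driver is on the classpath, the database accepts the conventional `sa` user with an empty password, and the table is reachable under the unquoted name OpLogEvent. The class `OpLogQuerySketch` and its `jdbcUrl` helper are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical sketch; requires the H2 JDBC driver on the classpath at runtime.
public class OpLogQuerySketch {

    /** Builds an H2 file URL; H2 expects the database path without the ".mv.db" suffix. */
    static String jdbcUrl(String dbPath) {
        String base = dbPath.endsWith(".mv.db")
                ? dbPath.substring(0, dbPath.length() - ".mv.db".length())
                : dbPath;
        // IFEXISTS=TRUE makes H2 fail instead of silently creating an empty database.
        return "jdbc:h2:file:" + base + ";IFEXISTS=TRUE";
    }

    public static void main(String[] args) throws SQLException {
        // Use an absolute path; the default here is only a placeholder.
        String url = jdbcUrl(args.length > 0 ? args[0] : "/path/to/oplog.db");
        // Assumption: user "sa" with an empty password; adjust if the agent uses other credentials.
        try (Connection conn = DriverManager.getConnection(url, "sa", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM OpLogEvent")) {
            if (rs.next()) {
                System.out.println("Recorded operations: " + rs.getLong(1));
            }
        }
    }
}
```

Only the table name comes from this README; column names are not documented here, so the example sticks to a row count.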
For comparing results between different runs or versions, use the JsonComparisonReport utility as described above.
To export the recorded operations to JSON for further analysis or comparison, use the JsonReport utility as described above.
Use the InterceptorPersistence.listTables() method to check the existing tables in the database.
Verify that sourceCodeIndexerPath is correct and that the StackTraceCodeFinder can access the necessary files.
Contributions to the ND4J Log Analyzer are welcome! Please submit pull requests or open issues on the GitHub repository.
This project is licensed under the Apache License 2.0. See the LICENSE file for details.