The thing with Lustre is that it has the ability, directly in Lustre, to report I/O on that file system per Slurm job, as reported through a Lustre extension to the proc filesystem. The problem has always been: "Should there not be a standard way to do this in the kernel?" Kernel is the operative term. If there can be some standardization there, it's possible to make this work with tools like LSF in a way that is sustainable. Maintaining one-offs for Lustre is not scalable.
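For context, here is a minimal sketch of what consuming that per-job data could look like, assuming Lustre jobstats has been enabled (e.g. `lctl set_param jobid_var=SLURM_JOB_ID`) and that this runs on a Lustre server where job_stats files are readable under /proc. The glob path and the exact stats layout are assumptions and vary by Lustre version, so treat this as illustrative only:

```python
import glob
import re
from collections import defaultdict

# Assumed location of OST job_stats files on an OSS node; varies by version.
JOB_STATS_GLOB = "/proc/fs/lustre/obdfilter/*/job_stats"

JOB_RE = re.compile(r"job_id:\s*(\S+)")
BYTES_RE = re.compile(r"(read_bytes|write_bytes):.*?sum:\s*(\d+)")

def collect_job_io():
    """Return {job_id: {"read_bytes": n, "write_bytes": n}} summed over OSTs."""
    totals = defaultdict(lambda: {"read_bytes": 0, "write_bytes": 0})
    for path in glob.glob(JOB_STATS_GLOB):
        with open(path) as fh:
            job_id = None
            for line in fh:
                m = JOB_RE.search(line)
                if m:
                    job_id = m.group(1)
                    continue
                m = BYTES_RE.search(line)
                if m and job_id is not None:
                    totals[job_id][m.group(1)] += int(m.group(2))
    return dict(totals)

if __name__ == "__main__":
    for job, io in sorted(collect_job_io().items()):
        print(f"job {job}: read={io['read_bytes']} write={io['write_bytes']}")
```

The point stands, though: this only works because Lustre implements it; nothing equivalent exists at the kernel level for arbitrary file systems.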
This has been previously discussed in an older enhancement request.
The main challenge is that accurately tracking disk usage and I/O statistics per job is not practical in the general case.
For example, a job can read or write to any directory or file, local or remote. I/O to a local block device can be trapped (with some overhead), but a read of a remote file may be served out of cache, so it never generates network I/O. In the example from Grid Engine's documentation, it reports usage only within a TMPDIR dynamically created for the job (a sketch of that approach follows below).
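A minimal sketch of that TMPDIR-scoped approach, assuming the scheduler exports the per-job scratch directory in the TMPDIR environment variable (as Grid Engine does). Note it measures occupancy under that one directory at scan time, not cumulative I/O, which is exactly its limitation:

```python
import os

def tmpdir_usage_bytes(tmpdir: str) -> int:
    """Sum apparent file sizes under the job's TMPDIR, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(tmpdir):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                pass  # file removed mid-scan; jobs mutate TMPDIR constantly
    return total

# Usage from a job wrapper, polled periodically:
# print(tmpdir_usage_bytes(os.environ["TMPDIR"]))
```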
Similarly, while cgroups do provide I/O accounting (the io controller's io.stat file in cgroup v2), it is only correct for block devices, i.e. local disks; it does not report accurate usage for shared (NFS) file systems.
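To make that limitation concrete, here is a small sketch that parses a cgroup v2 io.stat file; the job cgroup path is a placeholder, since schedulers place jobs in their own cgroup hierarchies. Every line keys on a block device major:minor number, which is precisely why NFS traffic, being network I/O rather than block I/O, never shows up here:

```python
import os

def read_cgroup_io_stat(cgroup_dir: str) -> dict:
    """Parse cgroup v2 io.stat, where each line looks like:
    8:0 rbytes=1459200 wbytes=314773504 rios=192 wios=353 dbytes=0 dios=0
    Returns {"8:0": {"rbytes": ..., "wbytes": ..., ...}, ...}."""
    stats = {}
    with open(os.path.join(cgroup_dir, "io.stat")) as fh:
        for line in fh:
            if not line.strip():
                continue
            device, *fields = line.split()
            stats[device] = {k: int(v) for k, v in (f.split("=") for f in fields)}
    return stats

# Hypothetical job cgroup path; real paths depend on the scheduler's setup:
# print(read_cgroup_io_stat("/sys/fs/cgroup/lsf/job.12345"))
```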
If you are using IBM Storage Scale as your file system, then there is an LSF integration that will report true I/O from the job to the backend storage.