IBM Data and AI Ideas Portal for Customers



Status Future consideration
Workspace Spectrum Conductor
Components Future Version
Created by Guest
Created on Oct 12, 2021

Add support for spark.worker.cleanup or its alternative

Some of our Spark applications run for a long time and produce a large volume of shuffle data. We use the SparkCleanup service to clean up shuffle data; however, it sometimes deletes shuffle files that long-running applications still need, which causes failures. If shuffle files could be cleaned up from within the Spark application itself, this issue would be resolved.

Needed By Month
  • Guest
    Oct 19, 2021

    We have already tuned the SparkCleanup service to match our longest-running job. The problem arises when the workload is unexpectedly large: because SparkCleanup runs centrally and has no awareness of the status of Spark jobs, it can delete files that are still needed, which causes failures.
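To make the failure mode above concrete, here is a minimal sketch contrasting an age-only cleanup decision with one that also knows which applications are still running. Everything in it is hypothetical and for illustration only: the `ShuffleDir` record, both functions, the application IDs, and the paths are assumptions, and nothing in this thread indicates that SparkCleanup exposes such an interface.

```python
from dataclasses import dataclass

@dataclass
class ShuffleDir:
    path: str          # hypothetical shuffle directory path
    owner_app_id: str  # hypothetical ID of the application that wrote it
    age_mins: float    # minutes since the directory was last modified

def age_only_candidates(dirs, retention_mins):
    """Central cleanup with no job awareness: age is the only criterion."""
    return [d.path for d in dirs if d.age_mins > retention_mins]

def job_aware_candidates(dirs, retention_mins, running_app_ids):
    """Same age rule, but directories of running applications are skipped."""
    return [
        d.path
        for d in dirs
        if d.age_mins > retention_mins and d.owner_app_id not in running_app_ids
    ]

dirs = [
    ShuffleDir("/tmp/blockmgr-a", "app-1", age_mins=900),  # app-1 still running
    ShuffleDir("/tmp/blockmgr-b", "app-2", age_mins=900),  # app-2 finished
    ShuffleDir("/tmp/blockmgr-c", "app-3", age_mins=30),   # too young either way
]

print(age_only_candidates(dirs, retention_mins=720))
print(job_aware_candidates(dirs, retention_mins=720, running_app_ids={"app-1"}))
```

In this sketch the age-only rule selects the directory of the still-running `app-1` once the job outlives the retention window, which is the situation described above; the job-aware variant leaves it alone.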

  • Admin
    Steve Haertel
    Oct 19, 2021

    Here is the list of environment variables that can be set in the SparkCleanup service profile: https://www.ibm.com/docs/en/spectrum-conductor/2.5.0?topic=groups-configuring-cleanup-instance

  • Admin
    Steve Haertel
    Oct 19, 2021

    Please use the SparkCleanup service environment variable LOCAL_DIR_RETENTION_IN_MINS, which should be set to a value longer than the runtime of your longest-running application. The SparkCleanup service uses this value to determine whether the shuffle blockmgr-* files are old enough to be deleted.
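As a rough illustration of the advice above, the following sketch derives a retention value from the longest expected application runtime and shows an age check of the kind described. It is not SparkCleanup's actual implementation: the function names, the safety factor, and the use of a modification timestamp are assumptions made for the example. Only the variable name LOCAL_DIR_RETENTION_IN_MINS comes from this thread.

```python
import math

def retention_minutes(longest_app_runtime_mins, safety_factor=1.5):
    """Hypothetical helper: pick a retention longer than the longest application.

    The safety factor is an assumption; choose a margin that fits your workload.
    """
    return math.ceil(longest_app_runtime_mins * safety_factor)

def old_enough_to_delete(mtime_epoch_secs, now_epoch_secs, retention_mins):
    """Assumed rule: a directory is eligible once its age exceeds the retention."""
    age_mins = (now_epoch_secs - mtime_epoch_secs) / 60.0
    return age_mins > retention_mins

# Example: the longest application is expected to run for 8 hours (480 minutes).
retention = retention_minutes(longest_app_runtime_mins=480)
print(f"LOCAL_DIR_RETENTION_IN_MINS={retention}")
```

A larger margin reduces the risk of deleting files that are still in use, at the cost of holding shuffle data on disk for longer.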