IBM Data and AI Ideas Portal for Customers


This portal is to open public enhancement requests against products and services offered by the IBM Data & AI organization. To view all of your ideas submitted to IBM, create and manage groups of Ideas, or create an idea explicitly set to be either visible by all (public) or visible only to you and IBM (private), use the IBM Unified Ideas Portal (https://ideas.ibm.com).


Shape the future of IBM!

We invite you to shape the future of IBM, including product roadmaps, by submitting ideas that matter to you the most. Here's how it works:


Search existing ideas

Start by searching and reviewing ideas and requests to enhance a product or service. Take a look at ideas others have posted, and add a comment, vote, or subscribe to updates on them if they matter to you. If you can't find what you are looking for, post a new idea.


Post your ideas

Post ideas and requests to enhance a product or service. Take a look at ideas others have posted and upvote them if they matter to you:

  1. Post an idea

  2. Upvote ideas that matter most to you

  3. Get feedback from the IBM team to refine your idea


Specific links you will want to bookmark for future use

Welcome to the IBM Ideas Portal (https://www.ibm.com/ideas) - Use this site to find out additional information and details about the IBM Ideas process and statuses.

IBM Unified Ideas Portal (https://ideas.ibm.com) - Use this site to view all of your ideas, create new ideas for any IBM product, or search for ideas across all of IBM.

ideasibm@us.ibm.com - Use this email to suggest enhancements to the Ideas process or request help from IBM for submitting your Ideas.

IBM Employees should enter Ideas at https://ideas.ibm.com


Batch Mode Inferencing for Enterprise-Grade Scalable AI Workflows and Data Processing with WatsonX


The WatsonX AI platform currently lacks support for batch mode inferencing, which is essential for handling large-scale data processing tasks required by our enterprise customers. Many of our users have identified the need to perform inferencing on multiple inputs simultaneously, often involving a significant volume of text data that needs to be processed quickly and efficiently.

For example, one of our customers operates in an environment where they frequently encounter limitations when trying to process large batches of texts for inferencing, even with a Bring Your Own Model (BYOM) instance. These limitations stem from network constraints, model configuration issues, or the need to perform pre/post-processing steps on a high volume of data before it can be fed into WatsonX AI for inference. This not only slows down their operations but also forces them to rely on technical workarounds that are often cumbersome and inefficient.

Implementing batch mode inferencing would directly address these pain points by enabling users to send large batches of texts or other types of data in a single request, thereby improving processing speed and reducing operational bottlenecks. This feature would not only enhance the usability of WatsonX AI but also open up new opportunities for customers to leverage its capabilities on an unprecedented scale.
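To make the request shape concrete, here is a minimal sketch of what a single batch inference payload might look like. Every field name below (`model_id`, `inputs`, `parameters`) is an illustrative assumption, not an existing WatsonX API; the absence of such an endpoint is exactly what this idea proposes to fix.

```python
import json

# Hypothetical batch request: many inputs, one call. Field names are
# assumptions for illustration only -- no such WatsonX endpoint exists today.
batch_request = {
    "model_id": "my-byom-model",  # assumed identifier for a BYOM instance
    "inputs": [
        "First document to process",
        "Second document to process",
        "Third document to process",
    ],
    # Parameters applied uniformly across all items in the batch
    "parameters": {"max_new_tokens": 200, "temperature": 0.0},
}

payload = json.dumps(batch_request)
```

Compared with three individual calls, a single payload like this amortizes network overhead and lets the service schedule the whole batch efficiently.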

Implementation Ideas/Options:

  1. API Level Implementation:
    • Introduce a new endpoint or method in the WatsonX AI API that allows users to send multiple inputs (e.g., texts, images, or other data types) as part of a single batch request. This would enable customers to process large amounts of data without having to make multiple individual calls.
    • Optional: This could also be supported in streaming mode, so that results for completed batch items are returned as soon as they are ready rather than only after the whole batch finishes.
    • Optional: Provide an option for users to specify parameters such as model settings, pre/post-processing steps, and output formats that apply uniformly across all items in the batch.
    • Optional: Support resiliency by allowing some batch items to fail while continuing to process the rest of the items.
  2. UI Level Implementation:
    • Create a dedicated section within the WatsonX AI user interface where users can upload or drag-and-drop multiple files (e.g., text files, CSVs, or other data types) for batch processing. This would streamline workflows for customers who frequently handle large-scale inferencing tasks and want to avoid coding or navigating complex APIs manually.
    • Offer a visual indicator showing the progress of each item in the batch as well as any errors or warnings during processing.
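The per-item resiliency option in item 1 can be sketched as follows. This is a toy illustration of the behavior being requested, not an implementation of any WatsonX internals; `run_batch`, `BatchItemResult`, and `toy_infer` are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class BatchItemResult:
    index: int
    output: Optional[str]  # inference result, or None if the item failed
    error: Optional[str]   # error message, or None on success

def run_batch(inputs: List[str], infer: Callable[[str], str]) -> List[BatchItemResult]:
    """Run inference on every item, recording per-item failures
    instead of aborting the whole batch."""
    results = []
    for i, text in enumerate(inputs):
        try:
            results.append(BatchItemResult(i, infer(text), None))
        except Exception as exc:
            results.append(BatchItemResult(i, None, str(exc)))
    return results

# Toy stand-in for a real model call: fails on empty input.
def toy_infer(text: str) -> str:
    if not text:
        raise ValueError("empty input")
    return text.upper()

results = run_batch(["alpha", "", "gamma"], toy_infer)
```

Here the empty second item fails, but the first and third are still processed, and the caller gets a per-item record of outputs and errors — the same information the proposed UI progress indicator would surface.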

 

I believe this enhancement would be a valuable addition to WatsonX, offering both immediate benefits to current users and future growth opportunities. I would welcome the opportunity to discuss this proposal further and explore its feasibility and timeline for implementation.

Needed By Yesterday (Let's go already!)