Resource allocation limits: Set GPU or resource limits for each watsonx.ai project to ensure efficient resource usage across multiple projects. This will prevent any one project from monopolizing resources and ensure that other projects can run smoothly.
Dynamic GPU resource preemption & scheduling: Allow high-priority training jobs to preempt and acquire GPU resources from lower-priority jobs. This ensures that critical workloads are processed first and helps optimize the overall usage of resources based on the importance and urgency of tasks.
Enhanced UI for visual management: Provide a user-friendly interface that allows users to visualize and manage GPU allocation, task priority, and real-time resource usage. Features like drag-and-drop or slider controls will make resource management easy and accessible without deep technical knowledge.
Multi-level resource allocation strategy: Implement different resource allocation strategies based on task needs. For instance, urgent tasks can be allocated higher-priority resources, while longer-running models can be assigned lower-priority resources, balancing the overall workload.
Automated resource management: Automatically adjust resource allocation based on the priority and importance of the training jobs. This will enable efficient, real-time resource distribution, ensuring that deep learning tasks always have access to the most suitable resources.
Manual GPU adjustment during training: Allow the CPD admin to manually adjust the GPU allocation while a training job is running. This flexibility gives admins more control over the allocation of resources based on real-time needs, ensuring optimal performance during training.
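To make the first two requests concrete, here is a minimal Python sketch of how per-project GPU limits and priority-based preemption could interact. It is only an illustration of the requested behavior under simplifying assumptions, not existing watsonx.ai functionality: `Job`, `GpuScheduler`, `project_limits`, and `submit` are hypothetical names, GPUs are treated as interchangeable, and a preempted job is simply removed from the running set rather than checkpointed and requeued.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    project: str
    gpus: int
    priority: int  # hypothetical convention: larger number = more important


@dataclass
class GpuScheduler:
    total_gpus: int
    project_limits: dict[str, int]  # hypothetical per-project GPU cap
    running: list[Job] = field(default_factory=list)

    def used(self, project: str | None = None) -> int:
        """GPUs in use, overall or for a single project."""
        return sum(j.gpus for j in self.running
                   if project is None or j.project == project)

    def submit(self, job: Job) -> list[Job]:
        """Start `job` and return the jobs preempted to make room.

        Raises RuntimeError if the project limit or cluster capacity
        does not allow the job to start.
        """
        limit = self.project_limits.get(job.project, 0)
        if self.used(job.project) + job.gpus > limit:
            raise RuntimeError(
                f"{job.project} would exceed its limit of {limit} GPUs")

        free = self.total_gpus - self.used()
        victims: list[Job] = []
        # Walk running jobs from least to most important and mark
        # strictly lower-priority ones until enough GPUs are free.
        for cand in sorted(self.running, key=lambda j: j.priority):
            if free >= job.gpus:
                break
            if cand.priority < job.priority:
                victims.append(cand)
                free += cand.gpus
        if free < job.gpus:
            raise RuntimeError("not enough GPUs even after preemption")

        for v in victims:
            self.running.remove(v)
        self.running.append(job)
        return victims


sched = GpuScheduler(total_gpus=8, project_limits={"llm": 6, "vision": 4})
sched.submit(Job("finetune", "llm", gpus=6, priority=1))
preempted = sched.submit(Job("urgent-eval", "vision", gpus=4, priority=5))
print([j.name for j in preempted])  # prints ['finetune']
```

In this sketch the project limit is checked before any preemption is considered, so a project can never exceed its cap even when it submits a high-priority job. A real implementation would also need to decide what happens to preempted jobs (pause, checkpoint, or requeue), which the request above leaves open.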
Background of my client:
My client is one of Taiwan's most important research institutions. Their primary projects focus on military applications and AI solutions for central and local governments. They have procured a large number of AI servers and GPUs and are preparing to establish a national-level computing center and AI training platform. They have multiple teams working on AI research, including LLMs, machine learning, and deep learning. I have been engaged with them for nearly 6 months and initially received very positive feedback on watsonx.ai. However, when it comes to deep learning capabilities and GPU management, they have found several limitations that do not meet their requirements.
Specifically, they have identified issues such as:
The lack of dynamic manual GPU resource allocation, which prevents users from adjusting GPU distribution in real time.
The inability to predefine GPU resource limits at the project level through the user interface.
Due to these shortcomings in GPU management and deep learning capabilities, our platform appears less comprehensive than those of other vendors. As a result, they are actively considering alternative solutions from competitors for their deep learning workloads.
Impact and Significance:
If we successfully sell watsonx.ai to this client, it would be a big milestone for IBM’s AI platform in Taiwan’s government sector. This achievement would not only strengthen IBM’s influence in government AI applications but also establish a solid foundation of trust for expanding into other government agencies in the future.
Needed By | Quarter |