Thank you for taking the time to provide your ideas to IBM. We truly value our relationship with you and appreciate your willingness to share details about your experience, your recommendations, and ideas.
IBM has evaluated the request and has determined that it cannot be implemented at this time, has been open for an extended time without gaining community support, or does not align with our current strategy or roadmap. If you would prefer that IBM re-evaluate this decision, please open a new Idea.
This tool needs an EDIT button. I'll have Sun Yi log that.
Giving this more thought, it would be interesting to see how this works in the context of a database. In that case, you could combine the lshosts and lsload structures and then use database constructs. I'm not sure whether SQLite tables can span threads the way, say, MySQL or MariaDB can, but it's something to think about. A database with X connections lets developers focus on SQL instead of memory table APIs, though that does not mean you have to use them.
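For what it's worth, here is a minimal sketch of that idea using SQLite's C API, with an in-memory database standing in for the lshosts/lsload data. The table and column names are my own illustrations, not LSF's actual fields. (As far as I know, SQLite's default "serialized" threading mode does allow a single connection to be shared across threads, though per-thread connections are the more common arrangement.)

```c
/* Minimal sketch, assuming SQLite's C API: an in-memory database stands
 * in for the lshosts/lsload data so a host down-select can be written
 * as SQL.  Table and column names here are illustrative, not LSF's. */
#include <stdio.h>
#include <sqlite3.h>

static int print_host(void *unused, int ncol, char **vals, char **cols)
{
    (void)unused; (void)ncol; (void)cols;
    printf("matched host: %s\n", vals[0]);
    return 0;
}

int main(void)
{
    sqlite3 *db;
    char *err = NULL;

    if (sqlite3_open(":memory:", &db) != SQLITE_OK)
        return 1;

    const char *ddl =
        "CREATE TABLE hosts(name TEXT PRIMARY KEY, type TEXT, maxmem INT);"
        "CREATE TABLE loads(name TEXT PRIMARY KEY, r15s REAL, mem INT);"
        "INSERT INTO hosts VALUES('hostA','X86_64',256000);"
        "INSERT INTO loads VALUES('hostA',0.3,190000);";
    if (sqlite3_exec(db, ddl, NULL, NULL, &err) != SQLITE_OK) {
        fprintf(stderr, "%s\n", err);
        sqlite3_free(err);
        sqlite3_close(db);
        return 1;
    }

    /* The equivalent of a resource-requirement select, expressed as a join. */
    sqlite3_exec(db,
        "SELECT h.name FROM hosts h JOIN loads l ON h.name = l.name "
        "WHERE h.maxmem > 10000 AND l.mem > 10000;",
        print_host, NULL, NULL);

    sqlite3_close(db);
    return 0;
}
```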
I gave the SELECT phase some more thought. In the 'merge' style SQL WHERE context, but handled with hash-based memory tables, you should always keep track of how many rows each matching resource has. Then, when preparing to down-select hosts, reorder the select (where it is legal to do so) to place the resource with the smallest number of matching hosts on the left and work your way right. This is likely what databases do under the covers to handle a SQL WHERE anyway. A sketch of that ordering is below.
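To make that concrete, here is a rough sketch assuming a plain AND chain where reordering is legal: sort the predicates by how many hosts each one matches (smallest first), then intersect left to right so the candidate set shrinks as quickly as possible. The struct and function names are illustrative, not anything in LSF.

```c
/* Sketch: order the AND-ed resource predicates by how many hosts match
 * each one (ascending), then intersect left to right so the candidate
 * set shrinks as fast as possible.  Types and names are illustrative. */
#include <stdlib.h>

struct pred {
    const char *name;                    /* e.g. "mem > 10000" */
    int         nmatch;                  /* rows in that predicate's memory table */
    int       (*matches)(int hostidx);   /* membership test for one host */
};

static int by_nmatch(const void *a, const void *b)
{
    const struct pred *pa = a, *pb = b;
    return pa->nmatch - pb->nmatch;
}

/* cand[] holds candidate host indexes; returns the new count. */
static int down_select(int *cand, int ncand, struct pred *p, int np)
{
    qsort(p, np, sizeof *p, by_nmatch);          /* smallest table first */
    for (int i = 0; i < np; i++) {
        int kept = 0;
        for (int j = 0; j < ncand; j++)
            if (p[i].matches(cand[j]))
                cand[kept++] = cand[j];
        ncand = kept;                            /* the set only ever shrinks */
        if (ncand == 0)
            break;                               /* nothing left to test */
    }
    return ncand;
}
```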
LSF should also have an option to push back on questionable syntax, something like "select[resourcea && resourceb || resourcec]". Even though this is not strictly incorrect, it suggests the user either is not thinking it through or is relying on left-to-right processing. Just a thought.
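A check like that could be fairly cheap. Here is a hypothetical sketch that simply flags a select expression mixing && and || at the same parenthesis depth; it is not LSF's parser, only an illustration of the kind of warning meant here.

```c
/* Sketch: warn when an expression mixes && and || at the same
 * parenthesis depth, e.g. "a && b || c", since the author probably
 * meant explicit grouping.  Not LSF's actual parser, just the idea. */
#include <stdio.h>

static int mixed_ops_at_top_level(const char *expr)
{
    int depth = 0, saw_and = 0, saw_or = 0;
    for (const char *p = expr; *p; p++) {
        if (*p == '(') depth++;
        else if (*p == ')') depth--;
        else if (depth == 0 && p[1]) {
            if (p[0] == '&' && p[1] == '&') saw_and = 1;
            if (p[0] == '|' && p[1] == '|') saw_or = 1;
        }
    }
    return saw_and && saw_or;
}

int main(void)
{
    const char *rr = "resourcea && resourceb || resourcec";
    if (mixed_ops_at_top_level(rr))
        fprintf(stderr, "warning: '%s' mixes && and || without parentheses; "
                        "relying on left-to-right evaluation\n", rr);
    return 0;
}
```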
Also, before the scheduler allocates threads, and I think this is important, it should look for similar resource requirements and process each of the major patterns into its own memory table.
That way, when a bucket is processed, it can check whether its resource requirements match a pattern and, if so, take the pre-processed results from the pattern-matching memory table instead of reprocessing.
For example, if you have 10,000 buckets, and all 10,000 have the following:
"select[type == any && health == ok && mem > 10000 [ && blah ]]"
Then the matching results of the select, up to the blah, should be placed in a table at the beginning of the pass, before dispatching threads to complete the allocations for each bucket. A sketch of that kind of prefix cache is below.
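Something like the following could serve as that pattern table: the normalized common prefix of the select string is the key, and the value is the host set computed once during the pre-pass. The names and the fixed-size array are illustrative only.

```c
/* Sketch: cache the result of a shared select prefix so 10,000 buckets
 * that all start with the same clause evaluate it only once.  The key
 * is the normalized prefix text; the value is the matching host set. */
#include <string.h>

#define MAX_PATTERNS 128

struct pattern_entry {
    char  prefix[256];    /* e.g. "type == any && health == ok && mem > 10000" */
    int  *hosts;          /* hosts matching the prefix, filled in the pre-pass */
    int   nhosts;
};

static struct pattern_entry patterns[MAX_PATTERNS];
static int npatterns;

/* Returns the precomputed host set for a prefix, or NULL if this
 * prefix was not seen during the pattern sweep.  In the bucket path,
 * a hit means only the trailing "&& blah" part still needs evaluating,
 * and only against the cached host set. */
static struct pattern_entry *lookup_pattern(const char *prefix)
{
    for (int i = 0; i < npatterns; i++)
        if (strcmp(patterns[i].prefix, prefix) == 0)
            return &patterns[i];
    return NULL;
}
```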
So, in summary, we have this in the scheduler loop:
- Create tables for each "dynamic" resource
- Order the resource requirements by the number of matching rows, least to the left and most to the right
- Run one sweep to build the list of patterns and mark them in a mapping table
- Use X threads to process each of the patterns into its own table
- For each bucket, dispatch a group of threads up to X, maintaining X until all buckets are processed

Then, inside the bucket processing:
- From the pre-ordered group of resource requirements, search for pre-processed lists and decompose the bucket data as required
- For resources that don't match a pattern, obtain the results via a query of shared memory
- Process the select string and make the allocation.

A rough sketch of these phases follows below.
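Here is a rough, self-contained sketch of those phases with a pool of X worker threads. The phase functions are empty stubs; this shows only the shape of the loop described above, not LSF's scheduler.

```c
/* Sketch of the scheduler-loop phases with a pool of X worker threads.
 * Phase functions are stubs; all names are illustrative. */
#include <pthread.h>
#include <stdint.h>

#define X 32   /* worker threads per phase */

static void build_dynamic_resource_tables(void)  { /* phase 1: tables per dynamic resource */ }
static void order_predicates_by_cardinality(void){ /* phase 2: smallest table leftmost     */ }
static long sweep_buckets_for_patterns(void)     { return 0; /* phase 3: dedupe patterns   */ }
static void *evaluate_pattern(void *arg) { (void)arg; return NULL; } /* phase 4: pattern -> table */
static void *process_bucket(void *arg)   { (void)arg; return NULL; } /* phase 5: select + allocate */
static long nbuckets;

/* Run a worker over items [0, nitems) in waves of at most X threads. */
static void run_in_waves(void *(*worker)(void *), long nitems)
{
    pthread_t tid[X];
    for (long i = 0; i < nitems; i += X) {
        long n = (nitems - i < X) ? nitems - i : X;
        for (long j = 0; j < n; j++)
            pthread_create(&tid[j], NULL, worker, (void *)(intptr_t)(i + j));
        for (long j = 0; j < n; j++)
            pthread_join(tid[j], NULL);
    }
}

static void scheduler_pass(void)
{
    build_dynamic_resource_tables();                 /* phase 1 */
    order_predicates_by_cardinality();               /* phase 2 */
    long npatterns = sweep_buckets_for_patterns();   /* phase 3 */
    run_in_waves(evaluate_pattern, npatterns);       /* phase 4: patterns -> memory tables */
    run_in_waves(process_bucket, nbuckets);          /* phase 5: per-bucket select/allocate */
}

int main(void) { scheduler_pass(); return 0; }
```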
I have no visibility into what the algorithm is today; these are just my thoughts on how you can:
- Avoid performing a down-select of hosts more than once
- Minimize wasted cycles
- Parallelize the selection/allocation processing time
- Make the best use of memory
That's enough for this morning.
Man, all that nice formatting was lost. I was going to add another RFE, but instead, I'm just going to type it in here.
LIM fidelity should also be included, by having the ls_info, ls_host, and ls_load structures in shared memory and accessible to MBD/MBSCHED. Since the shmem API supports locks, there is no reason MBD could not use these structures directly. A sketch of that layout is below.
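As a sketch of what that might look like, the following maps a named shared-memory segment holding per-host load records and guards it with a process-shared mutex, so a LIM-side writer and an MBD-side reader can work on the same table. The segment name, record layout, and first-creator initialization handling are all simplified assumptions, not LSF's real structures.

```c
/* Sketch: a named shared-memory segment of per-host load records,
 * guarded by a process-shared mutex.  Names and sizes are illustrative;
 * coordinating which process initializes the lock is glossed over. */
#include <fcntl.h>
#include <pthread.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define SHM_NAME  "/lim_load_table"   /* hypothetical segment name */
#define MAX_HOSTS 10000

struct host_load {
    char  name[64];
    float r15s;
    int   mem_mb;
};

struct load_table {
    pthread_mutex_t  lock;            /* PTHREAD_PROCESS_SHARED */
    int              nhosts;
    struct host_load host[MAX_HOSTS];
};

static struct load_table *open_table(void)
{
    int fd = shm_open(SHM_NAME, O_CREAT | O_RDWR, 0600);
    if (fd < 0)
        return NULL;
    if (ftruncate(fd, sizeof(struct load_table)) < 0)
        return NULL;
    struct load_table *t = mmap(NULL, sizeof *t, PROT_READ | PROT_WRITE,
                                MAP_SHARED, fd, 0);
    close(fd);
    if (t == MAP_FAILED)
        return NULL;

    /* Simplification: the creator would do this exactly once. */
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_mutex_init(&t->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    return t;
}

/* Writer side (LIM): update one host's row under the table lock. */
static void update_host(struct load_table *t, const char *name, float r15s, int mem_mb)
{
    pthread_mutex_lock(&t->lock);
    for (int i = 0; i < t->nhosts; i++) {
        if (strcmp(t->host[i].name, name) == 0) {
            t->host[i].r15s = r15s;
            t->host[i].mem_mb = mem_mb;
            break;
        }
    }
    pthread_mutex_unlock(&t->lock);
}

int main(void)
{
    struct load_table *t = open_table();
    if (!t)
        return 1;
    t->nhosts = 1;
    strcpy(t->host[0].name, "hostA");
    update_host(t, "hostA", 0.30f, 190000);
    return 0;
}
```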
In addition, if every LIM host update or registration were handled in a thread that registers/updates the shmem tables, you could increase the fidelity and frequency of LIM updates, allowing, say, a 30-second update interval for a 10,000-node cluster.
Doing the math, that would mean roughly 333 updates per second for the master LIM, and if the master LIM used 33 threads, that would be about 10 per second per thread, which is not a big number. Using shmem and locking, each transaction would take about 40 ns or so, which means the "effective" rate for a 10,000-node cluster with 33 threads could be even higher.
If 40 ns is in fact the per-host time required to update the shmem database, we could get a theoretical update frequency of twice a second. Now, that's a bit much, but that would be the upside, the upper limit on how short the sampling interval could be taken. Scary shit.
All that is needed is to measure (TIMEIT()) the per-host time to make the database update, and then do the math. Based on that time, you can calculate the upper limit on the number of threads you could theoretically support, since these updates lock at the table level (a named shared-memory segment). Math is fun. A small timing sketch is below.
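Along those lines, here is a small, self-contained timing sketch: clock_gettime() stands in for TIMEIT(), the critical section is only a placeholder for the real per-host shmem update, and the output compares one writer's sustainable rate against the roughly 333 updates/s needed for 10,000 hosts every 30 seconds.

```c
/* Sketch of the "do the math" step: time a stand-in for one locked
 * per-host table update, then compute how many updates per second a
 * single writer sustains versus the target rate (10,000 hosts every
 * 30 s ~= 333 updates/s).  The critical section is a placeholder. */
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

#define NHOSTS   10000
#define INTERVAL 30.0            /* desired cluster-wide refresh period, seconds */
#define TRIALS   1000000

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static char row[128];            /* stand-in for one host record */

int main(void)
{
    struct timespec t0, t1;
    char update[128] = "hostA 0.30 190000";

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < TRIALS; i++) {
        pthread_mutex_lock(&lock);
        memcpy(row, update, sizeof row);     /* the per-host update */
        pthread_mutex_unlock(&lock);
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs       = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    double per_update = secs / TRIALS;       /* seconds per locked update */
    double max_rate   = 1.0 / per_update;    /* updates/s for one writer */
    double needed     = NHOSTS / INTERVAL;   /* ~333 updates/s */

    printf("per-update cost            : %.1f ns\n", per_update * 1e9);
    printf("one writer max             : %.0f updates/s\n", max_rate);
    printf("needed (%d hosts / %.0f s) : %.0f updates/s\n", NHOSTS, INTERVAL, needed);
    printf("shortest full-cluster sweep: %.3f ms\n", NHOSTS * per_update * 1e3);
    return 0;
}
```

Since the lock is table-level, the measured single-writer rate is effectively the ceiling regardless of how many threads are added; the thread count mainly hides the network and processing cost around the critical section.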