Use this portal to open public enhancement requests against products and services offered by the IBM Data & AI organization. To view all of your ideas submitted to IBM, create and manage groups of ideas, or create an idea explicitly set to be either visible by all (public) or visible only to you and IBM (private), use the IBM Unified Ideas Portal (https://ideas.ibm.com).
This RFE's Headline was changed after submission to reflect the headline of an internal request we were already considering, but will now track here.
After discussions with Larry we better understand the requirement. We will schedule a patch for this.
Hi Bill,
I've tested your suggestion and it doesn't work. Have you tested it?
Sorry it took me some time, because we have a custom esub and the -W option is required. I've tested this on my test LSF cluster.
1. I've created regular user reservation:
[lsfadmin@mgmt2 etc]$ brsvadd -o -n 64 -m node3-20 -u lsfadmin -b 2015:04:23:11:00 -e 2015:04:23:13:00 -N t1
Reservation t1 is created
[lsfadmin@mgmt2 etc]$ brsvs
RSVID TYPE USER NCPUS RSV_HOSTS TIME_WINDOW
t1 user lsfadmin 0/64 node3-20:0/64 4/23/11/0-4/23/13/0
2. I submitted a job as user lsfadmin (the same user as in the reservation), with a -We value that overlaps the reservation start time.
[lsfadmin@mgmt2 configdir]$ bsub -q lsftest -m node3-20 -n 1 -We 10 sleep 60
Initializing program...
Thu Apr 23 10:58:16 2015
ProgType =
Read job specification...
Thu Apr 23 10:58:16 2015
***Reading Job Command File***
***Parsing Job Command File***
LSB_SUB_QUEUE = "lsftest"
LSB_SUB3_RUNTIME_ESTIMATION = 600
LSB_SUB_COMMANDNAME = "sleep"
LSB_SUB_COMMAND_LINE = "sleep 60"
LSB_SUB_HOSTS = "node3-20"
LSB_SUB_NUM_PROCESSORS = 1
LSB_SUB_MAX_NUM_PROCESSORS = 1
***Environment***
Applying Mt. Sinai Options...
Thu Apr 23 10:58:16 2015
User lsfadmin specifies queue "lsftest"
Job <837> is submitted to queue <lsftest>.
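The "Applying Mt. Sinai Options..." lines in the transcript above come from the site's custom esub. For readers unfamiliar with the mechanism, here is a minimal sketch of how an esub reads and modifies submission options; the policy shown (defaulting the run time estimate) is hypothetical, not the site's actual script.

```shell
#!/bin/sh
# Minimal esub sketch (illustrative only). At submission time LSF runs
# esub with the job's options available in $LSB_SUB_PARM_FILE as
# name=value lines (the LSB_SUB_* values seen in the transcript above).
# Changes are written to $LSB_SUB_MODIFY_FILE.

. "${LSB_SUB_PARM_FILE:-/dev/null}" 2>/dev/null

# Hypothetical policy: if the submitter gave no run time estimate (-We),
# set a default 10-minute estimate so the scheduler can plan around
# reservations. LSB_SUB3_RUNTIME_ESTIMATION is in seconds.
if [ -z "$LSB_SUB3_RUNTIME_ESTIMATION" ] && [ -n "$LSB_SUB_MODIFY_FILE" ]; then
    echo 'LSB_SUB3_RUNTIME_ESTIMATION=600' >> "$LSB_SUB_MODIFY_FILE"
fi

# Exit 0 to accept the job; exiting with $LSB_SUB_ABORT_VALUE rejects it.
exit 0
```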
3. The job won't run because of the reservation, even though node3-20 is empty.
[lsfadmin@mgmt2 configdir]$ bjobs -lp 837
Job <837>, User <lsfadmin>, Project , Application , Status <PEND>, Queue <lsftest>, Job Priority <50>, Command <sleep 60>
Thu Apr 23 10:58:16: Submitted from host <mgmt2>, CWD <...h/minerva_test/configdir>, Specified Hosts <node3-20>;
RUNTIME
10.0 min of mgmt2bq
PENDING REASONS:
Unable to reach slave batch server: node24-48;
Not enough slots or resources for whole duration of the job: node3-20;
Not specified in job submission: node28-1, mgmt2, mgmt3;
SCHEDULING PARAMETERS:
r15s r1m r15m ut pg io ls it tmp swp mem
loadSched - - - - - - - - - - -
loadStop - - - - - - - - - - -
cpu_mhz healthy gbytesin gmbytesin gbytesout gmbytesout gopens
loadSched - - - - - - -
loadStop - - - - - - -
gcloses greads gwrites grdir giupdate gbytesin_orga gmbytesin_orga
loadSched - - - - - - -
loadStop - - - - - - -
gbytesout_orga gmbytesout_orga ngpus ngpus_shared ngpus_excl_t
loadSched - - - - -
loadStop - - - - -
ngpus_excl_p
loadSched -
loadStop -
RESOURCE REQUIREMENT DETAILS:
Combined: select[type == any] order[!-slots:-maxslots] rusage[mem=2000.00] same[model] affinity[core(1)*1]
[lsfadmin@mgmt2 configdir]$ bhosts -l node3-20
HOST node3-20
STATUS CPUF JL/U MAX NJOBS RUN SSUSP USUSP RSV DISPATCH_WINDOW
ok 60.00 - 64 0 0 0 0 0 -
CURRENT LOAD USED FOR SCHEDULING:
r15s r1m r15m ut pg io ls it tmp swp mem slots cpu_mhz
Total 0.0 0.0 0.0 29% 0.0 132 0 1198 870G 0M 240.1G 64 1400.0
Reserved 0.0 0.0 0.0 0% 0.0 0 0 0 0M 0M 0M - 0.0
Please, let me know if I am missing something. I need a working solution for this problem.
The other question: do I need to provide -U reservation_id for the job to run during the reservation? Is it possible to omit it, so that users don't need to keep track of reservations, and if a reservation is valid the job will simply run in it? We are using IBM Platform LSF Standard 9.1.2.0.
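For context, associating a job with a reservation is done explicitly with bsub -U. A sketch of the two-step flow, reusing the reservation name and hosts from the test above (values are illustrative):

```shell
# Create a user reservation named t1 on node3-20, as in the test above.
brsvadd -n 64 -m node3-20 -u lsfadmin \
        -b 2015:04:23:11:00 -e 2015:04:23:13:00 -N t1

# Without -U, the job competes only for non-reserved slots; with -U t1
# it is allowed to use the reserved slots during the time window.
bsub -U t1 -q lsftest -m node3-20 -n 1 sleep 60
```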
Thanks,
Sveta Mazurkova.
Hi Bill,
Sorry for the very late reply. Can you please reopen this PMR? Unfortunately I haven't received an e-mail with your questions, and I wonder why.
Please, check historical PMR on this issue. [PMR 77813,7TD,000]
Summarizing your suggestion: a user can submit a job with -We, and the job may start before a normal user reservation even if the -We time overlaps with the reservation, and it will continue to run regardless of the reservation's start time. Is that correct? I am going to test it now.
Thanks,
Sveta.
Sveta, I'm not sure I understand your request.
There are two run limit parameters: -We (run time estimate) and -W (hard run limit). The run time estimate lets the user specify their best guess, and they won't be penalised if they are wrong.
When trying to fit a job in before a system reservation, the scheduler will use -We if specified. So if someone wants to take the risk and try to get their job in before the reservation starts, they could do bsub -We 1 a.out; if it finishes in time then great, and if not, it will get killed.
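As a sketch of that backfill attempt, assuming a reservation starting at 11:00 and a submission shortly before it (times and binary name are illustrative):

```shell
# -We gives the scheduler a run time estimate in minutes. Submitted at
# 10:58 against an 11:00 reservation, a 1-minute estimate lets the job
# be backfilled into the gap before the reservation becomes active.
# If the estimate turns out to be wrong and the job runs into an
# exclusive system reservation, the job is killed.
bsub -We 1 ./a.out

# By contrast, -W is a hard run limit: the job is killed unconditionally
# once the limit is reached, reservation or not.
bsub -W 1 ./a.out
```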
By definition, a system reservation is exclusive, so jobs that haven't completed by the time it becomes active will be killed.
If you don't want them killed, the simplest solution is to create a normal user reservation rather than a system reservation: jobs will continue past the start of the reservation. Then, when you really want to do maintenance, you can either kill them explicitly, or submit a job that fills the reservation, which will force those jobs to be killed or requeued.
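The maintenance workflow described above could look like the following sketch (reservation name, times, and the maintenance script are hypothetical):

```shell
# 1. Create a normal user reservation instead of a system reservation.
#    Jobs already running on the hosts may continue past its start time.
brsvadd -n 64 -m node3-20 -u lsfadmin \
        -b 2015:05:01:08:00 -e 2015:05:01:12:00 -N maint

# 2. When maintenance actually starts, either kill the stragglers on
#    the reserved host explicitly (job ID 0 means all matching jobs)...
bkill -m node3-20 0

# ...or submit a job into the reservation that fills it, forcing the
# remaining jobs to be killed/requeued.
bsub -U maint -n 64 -m node3-20 ./maintenance.sh
```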
Regards,
Bill McMillan, Global Product Portfolio Manager for the IBM Platform LSF Family
Creating a new RFE based on Community RFE #65036 in product Platform LSF.