IBM Data and AI Ideas Portal for Customers



Status Delivered
Workspace Spectrum LSF
Created by Guest
Created on Jan 27, 2015

Add functionality to LSF Advance Reservations

Allow LSF jobs to start running before a reservation begins and to continue running during the reservation, if the reservation permits these jobs.

  • Guest
    Oct 4, 2016

    This RFE's Headline was changed after submission to reflect the headline of an internal request we were already considering, but will now track here.

  • Guest
    May 6, 2015

    After discussions with Larry we better understand the requirement. We will schedule a patch for this.

  • Guest
    Apr 23, 2015

    Hi Bill,

    I've tested your suggestion and it doesn't work. Have you tested it?
    Sorry it took me some time; we have a custom esub and the -W option is required. I tested on my test LSF cluster.

    1. I created a regular user reservation:
    [lsfadmin@mgmt2 etc]$ brsvadd -o -n 64 -m node3-20 -u lsfadmin -b 2015:04:23:11:00 -e 2015:04:23:13:00 -N t1
    Reservation t1 is created
    [lsfadmin@mgmt2 etc]$ brsvs
    RSVID TYPE USER NCPUS RSV_HOSTS TIME_WINDOW
    t1 user lsfadmin 0/64 node3-20:0/64 4/23/11/0-4/23/13/0

    2. I submitted a job as user lsfadmin (the same user as in the reservation) with a -We estimate that overlaps the reservation start time.
    [lsfadmin@mgmt2 configdir]$ bsub -q lsftest -m node3-20 -n 1 -We 10 sleep 60

    Initializing program...
    Thu Apr 23 10:58:16 2015
    ProgType =
    Read job specification...
    Thu Apr 23 10:58:16 2015
    ***Reading Job Command File***
    ***Parsing Job Command File***
    LSB_SUB_QUEUE = "lsftest"
    LSB_SUB3_RUNTIME_ESTIMATION = 600
    LSB_SUB_COMMANDNAME = "sleep"
    LSB_SUB_COMMAND_LINE = "sleep 60"
    LSB_SUB_HOSTS = "node3-20"
    LSB_SUB_NUM_PROCESSORS = 1
    LSB_SUB_MAX_NUM_PROCESSORS = 1
    ***Environment***
    Applying Mt. Sinai Options...
    Thu Apr 23 10:58:16 2015
    User lsfadmin specifies queue "lsftest"
    Job <837> is submitted to queue .

    3. The job won't run because of the reservation, even though node3-20 is empty.
    [lsfadmin@mgmt2 configdir]$ bjobs -lp 837

    Job <837>, User , Project , Application , Status
    , Queue , Job Priority <50>, Command 60>
    Thu Apr 23 10:58:16: Submitted from host , CWD h/minerva_test/configdir>, Specified Hosts ;
    RUNTIME
    10.0 min of mgmt2bq
    PENDING REASONS:
    Unable to reach slave batch server: node24-48;
    Not enough slots or resources for whole duration of the job: node3-20;
    Not specified in job submission: node28-1, mgmt2, mgmt3;

    SCHEDULING PARAMETERS:
    r15s r1m r15m ut pg io ls it tmp swp mem
    loadSched - - - - - - - - - - -
    loadStop - - - - - - - - - - -

    cpu_mhz healthy gbytesin gmbytesin gbytesout gmbytesout gopens
    loadSched - - - - - - -
    loadStop - - - - - - -

    gcloses greads gwrites grdir giupdate gbytesin_orga gmbytesin_orga
    loadSched - - - - - - -
    loadStop - - - - - - -

    gbytesout_orga gmbytesout_orga ngpus ngpus_shared ngpus_excl_t
    loadSched - - - - -
    loadStop - - - - -

    ngpus_excl_p
    loadSched -
    loadStop -

    RESOURCE REQUIREMENT DETAILS:
    Combined: select[type == any] order[!-slots:-maxslots] rusage[mem=2000.00] same[model] affinity[core(1)*1]

    [lsfadmin@mgmt2 configdir]$ bhosts -l node3-20
    HOST node3-20
    STATUS CPUF JL/U MAX NJOBS RUN SSUSP USUSP RSV DISPATCH_WINDOW
    ok 60.00 - 64 0 0 0 0 0 -

    CURRENT LOAD USED FOR SCHEDULING:
    r15s r1m r15m ut pg io ls it tmp swp mem slots cpu_mhz
    Total 0.0 0.0 0.0 29% 0.0 132 0 1198 870G 0M 240.1G 64 1400.0
    Reserved 0.0 0.0 0.0 0% 0.0 0 0 0 0M 0M 0M - 0.0


    Please let me know if I am missing something. I need a working solution for this problem.

    My other question: I currently need to provide -U reservation_id for a job to run during the reservation. Is it possible to omit it, so that users don't need to keep track of reservations and a job simply runs in a reservation if the reservation is valid for it? We are using IBM Platform LSF Standard 9.1.2.0.
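The timing in the test above can be checked with a few lines of arithmetic. This is only a sketch on the values shown in the transcript (submission at 10:58:16, a 600-second run estimate, and the `brsvs` window `4/23/11/0-4/23/13/0`); the helper names `parse_time_window` and `overlaps` are hypothetical, the year is passed in because the `brsvs` output does not show it, and nothing here describes how the LSF scheduler is implemented.

```python
from datetime import datetime, timedelta


def parse_time_window(window: str, year: int):
    """Parse a 'month/day/hour/minute-month/day/hour/minute' window
    (the layout shown in the brsvs TIME_WINDOW column above)."""
    def parse_point(text: str) -> datetime:
        month, day, hour, minute = (int(part) for part in text.split("/"))
        return datetime(year, month, day, hour, minute)

    start_text, end_text = window.split("-")
    return parse_point(start_text), parse_point(end_text)


def overlaps(job_start, job_end, rsv_start, rsv_end) -> bool:
    """True if the job's estimated run interval intersects the reservation."""
    return job_start < rsv_end and job_end > rsv_start


# Values taken from the transcript; the year 2015 comes from the brsvadd call.
rsv_start, rsv_end = parse_time_window("4/23/11/0-4/23/13/0", year=2015)
submitted = datetime(2015, 4, 23, 10, 58, 16)
estimated_end = submitted + timedelta(seconds=600)  # LSB_SUB3_RUNTIME_ESTIMATION

print(estimated_end.time(), overlaps(submitted, estimated_end, rsv_start, rsv_end))
# prints: 11:08:16 True
```

The estimated end (11:08:16) falls inside the reservation window, which is consistent with the pending reason "Not enough slots or resources for whole duration of the job" reported for node3-20.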

    Thanks,

    Sveta Mazurkova.

  • Guest
    Apr 22, 2015

    Hi Bill,

    Sorry for the very late reply. Can you please reopen this PMR? Unfortunately, I didn't receive the e-mail with your questions, and I wonder why.
    Please check the historical PMR on this issue. [PMR 77813,7TD,000]

    Summarizing your suggestion: a user can submit a job with -We, and the job may start before a normal user reservation even if its -We time overlaps the reservation, and it will continue to run regardless of the reservation start time. Is that correct? I am going to test it now.

    Thanks,

    Sveta.

  • Guest
    Feb 11, 2015

    Sveta, I'm not sure I understand your request.

    There are two run limit parameters, -We (run estimate) and -W (hard runlimit). The run estimate parameter allows the user to specify their best guess, but they won't get penalised for it if they are wrong.

    When trying to fit a job in before a system reservation, the scheduler will use -We if specified. So if someone wants to take the risk and try to get their job in before the reservation starts, they could run bsub -We 1 a.out; if it finishes in time, great, and if not, it will get killed.

    By definition, a system reservation is exclusive, so jobs that haven't completed by the time it becomes active will be killed.

    If you don't want them killed, the simplest solution would be to create a normal user reservation, rather than system reservation - jobs will continue past the start of the reservation. And when you really want to do maintenance, you can either explicitly kill them, or submit a job that will fill the reservation which will force those jobs to be killed/requeued.
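The behaviour described in this comment can be summarised as a small decision model. This is a simplified sketch of the explanation above, not LSF's actual scheduling code; the function names `fits_before` and `fate_at_reservation_start` and the example times are hypothetical.

```python
from datetime import datetime, timedelta


def fits_before(now: datetime, run_estimate: timedelta, rsv_start: datetime) -> bool:
    """Model of the fit check: the job is expected to end, by its
    run estimate (-We), no later than the reservation start."""
    return now + run_estimate <= rsv_start


def fate_at_reservation_start(rsv_type: str, still_running: bool) -> str:
    """Model of what happens to a job when the reservation becomes active."""
    if not still_running:
        return "finished"
    if rsv_type == "system":
        return "killed"          # system reservations are exclusive
    if rsv_type == "user":
        return "keeps running"   # jobs continue past the reservation start
    raise ValueError(f"unknown reservation type: {rsv_type!r}")


# Hypothetical example: reservation at 12:00, job submitted at 11:50 with -We 1.
rsv_start = datetime(2015, 1, 1, 12, 0)
print(fits_before(datetime(2015, 1, 1, 11, 50), timedelta(minutes=1), rsv_start))
print(fate_at_reservation_start("system", still_running=True))
print(fate_at_reservation_start("user", still_running=True))
```

The run estimate only influences whether the job is considered to fit; what happens to a job that overruns depends on the reservation type.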

    Regards,
    Bill McMillan, Global Product Portfolio Manager for the IBM Platform LSF Family

  • Guest
    Jan 29, 2015

    Creating a new RFE based on Community RFE #65036 in product Platform LSF.