IBM Data and AI Ideas Portal for Customers



Status: Delivered
Workspace: Spectrum Symphony
Components: Version 7.1
Created by: Guest
Created on: Feb 17, 2017

[Integration Requirement] Possibility to select different memory allocation libraries

Summary:

Provide the ability to select different memory allocation libraries (implementing different allocation/de-allocation algorithms) for the SSM via an application profile entry. Proposed candidates: jemalloc and tcmalloc. These become necessary in scenarios where performance and resource consumption need to be tuned, or where small allocations cause severe memory fragmentation and subsequently lead to rapid resource starvation. In such scenarios the grid administrator needs to be in full control and able to take mitigating or corrective action.

It is probably unlikely, and perhaps even undesirable, for the compiled allocators themselves to be shipped as part of a Symphony release, so the profile switch might instead point to a path on the local file system (or even a share) where the library can be found. This would also let customers stay current, since they can compile the library version they need and obtain outside support for it.
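
As a rough illustration of how such a profile switch could take effect, the sketch below assumes the allocator is injected through LD_PRELOAD, the standard dynamic-linker mechanism that both jemalloc and tcmalloc support. The environment variable name, the launcher, and the binary path are hypothetical and not Symphony's actual implementation.

    /* Hypothetical launcher sketch: read an allocator path (as it might come
     * from an application profile entry) and inject it via LD_PRELOAD before
     * starting the service binary.  SSM_ALLOCATOR_LIB and the binary path are
     * illustrative only; Symphony's real mechanism may differ. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(int argc, char *argv[])
    {
        (void)argc;

        /* e.g. /opt/allocators/libjemalloc.so.2 or .../libtcmalloc.so.4 */
        const char *alloc_lib = getenv("SSM_ALLOCATOR_LIB");

        if (alloc_lib != NULL && *alloc_lib != '\0') {
            /* The dynamic linker will resolve malloc/free from this library
             * first, transparently replacing the default glibc allocator. */
            if (setenv("LD_PRELOAD", alloc_lib, 1) != 0) {
                perror("setenv");
                return 1;
            }
        }

        /* Hand over to the real service binary (path is illustrative). */
        execv("/opt/ibm/spectrumcomputing/ssm", argv);
        perror("execv");
        return 1;
    }

Because LD_PRELOAD works at the dynamic-linker level, the SSM binary itself would not need to be rebuilt for each allocator.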

Details:
The standard implementation of glibc's malloc/free is based on the ptmalloc2 allocator algorithm:

http://malloc.de/en/
https://www.gnu.org/software/libc/manual/html_mono/libc.html

malloc/free are just POSIX interfaces, to be implemented by the C library of the corresponding OS, or by any application that wants to be POSIX compatible.

https://www.researchgate.net/profile/Rivalino_Matias_Jr/publication/224222636_The_mechanics_of_memory-related_software_aging/links/0deec5288c31c2a166000000.pdf

Of course, most applications use the allocator provided by their OS's standard library, in this case glibc's ptmalloc2-based malloc/free. For general-purpose applications, this choice usually poses no problem.
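
Because malloc/free are only an interface, the implementation behind them can be swapped underneath an unmodified binary. A minimal sketch, assuming glibc (which exports __libc_malloc/__libc_free as public symbols): an LD_PRELOAD shim that interposes both calls and simply counts allocations.

    /* Minimal allocator shim (sketch): interposes malloc/free and forwards to
     * glibc's underlying implementation while counting calls.  glibc-specific.
     * Build with: gcc -shared -fPIC -o libshim.so shim.c
     * Run with:   LD_PRELOAD=./libshim.so <some binary>                        */
    #include <stddef.h>
    #include <stdio.h>

    extern void *__libc_malloc(size_t size);   /* glibc's real malloc */
    extern void  __libc_free(void *ptr);       /* glibc's real free   */

    static unsigned long alloc_count;          /* not atomic; fine for a demo */

    void *malloc(size_t size)
    {
        alloc_count++;
        return __libc_malloc(size);
    }

    void free(void *ptr)
    {
        __libc_free(ptr);
    }

    /* Report once when the process exits. */
    __attribute__((destructor))
    static void report(void)
    {
        fprintf(stderr, "shim: %lu malloc calls\n", alloc_count);
    }

jemalloc and tcmalloc rely on exactly this kind of symbol interposition when they are preloaded, which is why no application change is needed to switch allocators.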

Long-running, multi-threaded applications, however, require special attention with respect to process lifetime and “software aging”, both of which are well-researched topics. Because that research is readily available (IEEE and its journals are accessible to any serious engineering department), it can be expected that software aging and multi-threaded process lifetime are taken into account when designing such a service (i.e. the SSM). Part of that design is the explicit choice of an allocator, or more generally, the explicit choice of the right tool for the job at hand.

Since the minimum allocation size of ptmalloc2 is 32 bytes (on 64-bit systems), any smaller allocation carries overhead to fill the gap, which leads to fragmentation. jemalloc, on the other hand, is optimized to reduce fragmentation and to handle small allocations well. It is used by prominent software such as Firefox and PostgreSQL, and throughout many server applications at Facebook:

http://highscalability.com/blog/2015/3/17/in-memory-computing-at-aerospike-scale-when-to-choose-and-ho.html
https://facebook.com/notes/facebook-engineering/scalable-memory-allocation-using-jemalloc/480222803919/
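
That per-request overhead is easy to observe with glibc's malloc_usable_size() extension; on a typical 64-bit glibc, a 13-byte request (the SSM-style allocation discussed below) reports roughly 24 usable bytes inside a 32-byte chunk, though the exact figures depend on the glibc version and platform.

    /* Small probe (glibc-specific): show how much space a 13-byte request
     * actually occupies.  Build with: gcc -o probe probe.c                  */
    #include <malloc.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        void *p = malloc(13);                     /* a typical small request */
        if (p == NULL)
            return 1;
        printf("requested 13 bytes, usable %zu bytes\n", malloc_usable_size(p));
        free(p);
        return 0;
    }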

Google's tcmalloc is optimized for sheer speed. It handles small allocations well, but by default it is reluctant to release freed memory back to the OS. This is by design: returning memory to the OS costs extra CPU cycles, and merely marking it as reusable fits the bill for many high-performance applications and scenarios.

As you can see, there is plenty of choice even among just these three allocators. With a little research, a tailored allocator choice can be made based on the nature of one's own application. If the application's behavior is highly variable, the choice of allocator can be handled via a parameter switch or by providing distinct binaries with different allocator implementations, all subject to testing and verification, of course.
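
Whichever switch mechanism is used, a small sanity check can confirm which allocator actually ended up live in the process. The probe below is only a sketch and assumes the commonly exported symbols mallctl (jemalloc) and tc_malloc (gperftools tcmalloc).

    /* Allocator detection probe (sketch): look up allocator-specific symbols
     * in the running process.  Build with: gcc -o which_alloc which_alloc.c -ldl */
    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <stdio.h>

    int main(void)
    {
        if (dlsym(RTLD_DEFAULT, "mallctl") != NULL)
            puts("jemalloc appears to be active");
        else if (dlsym(RTLD_DEFAULT, "tc_malloc") != NULL)
            puts("tcmalloc appears to be active");
        else
            puts("likely the libc default allocator (e.g. glibc ptmalloc2)");
        return 0;
    }

Run it with and without LD_PRELOAD pointing at the candidate library to verify that the switch took effect.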

Let's summarize some facts we gathered about the SSM through the course of this journey:

• It is a long-running, multi-threaded (> 20 threads) network server application
• It makes a large number of predominantly small allocations (13 bytes each) and presumably frees them after use

So, armed with some general knowledge of memory allocators, how would we (or the SSM's creators) go about choosing the right one? Let's compare the allocators:
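
In the absence of measurements from the SSM itself, one hedged way to ground such a comparison is a toy workload that loosely mirrors the pattern above: run it once per allocator (unchanged, then with LD_PRELOAD pointing at jemalloc, then at tcmalloc) and compare wall time and resident set size. The thread and iteration counts below are arbitrary.

    /* Toy allocation churn (sketch): many threads making small, short-lived
     * allocations, loosely mimicking the SSM pattern described above.
     * Build with: gcc -o churn churn.c -lpthread                              */
    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define THREADS    20
    #define ITERATIONS 1000000L

    static void *churn(void *arg)
    {
        (void)arg;
        for (long i = 0; i < ITERATIONS; i++) {
            void *p = malloc(13);       /* the 13-byte requests noted above */
            if (p == NULL)
                return NULL;
            free(p);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t tid[THREADS];

        for (int i = 0; i < THREADS; i++)
            pthread_create(&tid[i], NULL, churn, NULL);
        for (int i = 0; i < THREADS; i++)
            pthread_join(tid[i], NULL);

        puts("done");
        return 0;
    }

Run under a tool such as /usr/bin/time -v, the differences in maximum resident set size and elapsed time give a first, rough indication of which allocator suits this pattern.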

Please follow the proposal above and provide jemalloc and tcmalloc via an application profile switch, as requested in this RFE. From a customer perspective, this would let us flexibly decide which allocator to run on the SSM side for each application and/or cluster environment. After all, there is potentially a significant performance gain to be realized, and/or resource usage to be optimized in general (see “software aging”).

For IBM, this would provide a substantial competitive advantage: the product would “go the extra mile” to squeeze out every drop of available performance, if wanted and configured by the customer, and thus stay true to the HPC spirit.

Implementing such a feature should be straightforward, given thorough testing and support.

  • Guest
    Dec 13, 2020

    Mark Status: The enhancement has been delivered and has been included in roadmap releases since Symphony 7.2.

  • Guest
    Mar 15, 2017

    This RFE's headline was changed after submission to reflect the headline of an internal request we were already considering, which we will now track here.