IBM Data and AI Ideas Portal for Customers

Status Not under consideration
Created by Guest
Created on Mar 9, 2021

Arabic names parsing

We are looking for a configurable option to re-parse tokens based on culture and confidence.

Arabic names are not parsed completely. For example, when the two names below are compared, "Muhammad Hafiz" is compared with "Hafiz" and "KHAN" with "KHAN", and the result comes back as no match.

("Muhammad Hafiz KHAN","Hafiz KHAN")

The IBM suggestion to manually configure HAFIZ to always be treated as a GN (given name) does not sound feasible, for two reasons.

- First, it is not reasonable to ask the client to configure all possible middle names.
- Second, assume the client follows your advice: what happens if he then tries to search for "Muhammad Hafeeze KHAN" (the middle name spelled differently)?
Hafeeze is not configured anywhere, so he will again get GN: "Muhammad" SN: "Hafeeze KHAN".
Now he will have two missing stems, one in the GN and one in the SN (surname), and won't get a hit…
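
To make the second objection concrete, here is a minimal sketch of how a manual override list behaves with a spelling variant. It is an illustration only: `GIVEN_NAME_OVERRIDES` and `parse_with_overrides` are hypothetical names, not the product's configuration format or API, and the fallback of attaching an unconfigured middle token to the SN simply mirrors the behaviour described above.

```python
# Hypothetical illustration; not the product's parser or configuration format.
GIVEN_NAME_OVERRIDES = {"HAFIZ"}  # tokens manually configured to always be GN


def parse_with_overrides(full_name, overrides=GIVEN_NAME_OVERRIDES):
    """Split a three-token name into (GN, SN) using a manual override list."""
    first, middle, last = full_name.split()
    if middle.upper() in overrides:
        # Configured middle token: attach it to the given name.
        return (f"{first} {middle}", last)
    # Unconfigured middle token: it stays with the surname (assumed fallback).
    return (first, f"{middle} {last}")


print(parse_with_overrides("Muhammad Hafiz KHAN"))
print(parse_with_overrides("Muhammad Hafeeze KHAN"))
```

The exact-match lookup covers only the spelling that was configured, so every transliteration variant would need its own entry.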

  1. The client wants to be able to get both parses, e.g.

When he provides "Muhammad Hafiz KHAN"

  • GN:"Muhammad Hafiz" SN: "KHAN"

  • GN:"Muhammad" SN: "Hafiz KHAN"

So he can get a hit when comparing with either:

  • "Muhammad KHAN"

  • "Hafiz KHAN"

  • Currently, even if we set the re-parse threshold high enough for the name to be re-parsed, we cannot guarantee to the client that he will get the second parse of the name.
    That is because the second parse is returned only when its confidence is higher than the confidence of the first parse.
    I think it is reasonable to return the second parse along with its confidence and let us decide (via internal configuration, of course) what to do with it.
    I think that when middle names are involved, especially in three-token names, it makes sense to always do the following:

    • If the middle name in the first parse was attached to the GN return it in the SN.

    • If the middle name in the first parse was attached to the SN return it in the GN.

Then return both parses, each with its confidence.
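
The requested behaviour for three-token names can be sketched as follows. This is a rough illustration under stated assumptions, not the product's implementation: the `Parse` type, `alternate_parse`, and `both_parses` are hypothetical names, and the confidence values are placeholders that the real parser would supply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Parse:
    """One candidate parse of a name (hypothetical structure)."""
    given_name: str
    surname: str
    confidence: float


def alternate_parse(first_parse, alt_confidence):
    """Move the middle token of a three-token name to the other field."""
    gn = first_parse.given_name.split()
    sn = first_parse.surname.split()
    if len(gn) == 2 and len(sn) == 1:
        # Middle name was attached to the GN: return it in the SN.
        gn, sn = gn[:1], gn[1:] + sn
    elif len(gn) == 1 and len(sn) == 2:
        # Middle name was attached to the SN: return it in the GN.
        gn, sn = gn + sn[:1], sn[1:]
    else:
        raise ValueError("expected a three-token name split as 2+1 or 1+2")
    return Parse(" ".join(gn), " ".join(sn), alt_confidence)


def both_parses(first_parse, alt_confidence):
    """Return the first parse and its alternate, each with its confidence."""
    return [first_parse, alternate_parse(first_parse, alt_confidence)]


# Placeholder confidences; real values would come from the parser.
first = Parse("Muhammad", "Hafiz KHAN", 0.6)
for parse in both_parses(first, 0.4):
    print(parse)
```

The alternate parse is returned even though its confidence is lower than the first one, which leaves the decision about how to use it to the caller's own configuration.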

Needed by Date Apr 3, 2021
  • Guest
    Mar 9, 2021

    Thank you, Ronen, for the submission. We will investigate and follow up here with a status.