Asyncio coroutine returns <_GatheringFuture pending> instead of actual values

I am trying to replace values in one dataframe with values from another. The two dataframes are results and results2, and I want each value in results2 to be replaced by its closest match in results. They look like this:

results:

   products
0, pizza
1, ketchup
2, salami
3, anchovy
4, pepperoni
5, marinara
6, olive
7, sausage
8, cheese
9, bbq sauce
10, stuffed crust

results2:

   products
0, salaaaami
1, kechap
2, lives
3, ppprn
4, pizzas
5, marinara
6, sauce de bbq
7, marinara sauce
8, chease
9, sausages
10, crust should be stuffed

Both dataframes are fairly large, so calculating the Levenshtein distance for every pair was taking forever. I therefore tried using asyncio as follows:

import asyncio
import operator

async def levenshteinDistance(s1, s2):
    if len(s1) > len(s2):
        s1, s2 = s2, s1

    distances = range(len(s1) + 1)
    for i2, c2 in enumerate(s2):
        distances_ = [i2+1]
        for i1, c1 in enumerate(s1):
            if c1 == c2:
                distances_.append(distances[i1])
            else:
                distances_.append(1 + min((distances[i1], distances[i1 + 1], distances_[-1])))
        distances = distances_
    return distances[-1]

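async def closest_match(string, matchings):
    scores = {}
    for m in matchings:
        # levenshteinDistance is a coroutine function, so its result must be awaited
        scores[m] = 1 - await levenshteinDistance(string, m)

    # max() returns a plain string here, so there is nothing to await
    return max(scores.items(), key=operator.itemgetter(1))[0]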

results2.products = asyncio.gather(*[closest_match(string, results.products.values) 
                    if string not in results.products else string 
                    for string in results2.products])
results2

Adding or removing await in the asyncio.gather call (as suggested by various answers on SO) makes no difference: the result is still <_GatheringFuture pending> rather than the actual values, so nothing is actually waited for. I also tried a separate caller function (as seen in some answers) that awaits the results before returning them:

async def caller(results, results2):
    return await asyncio.gather(*[closest_match(string, results.products.values) 
                    if string not in results.products else string 
                    for string in results2.products]) 

# and changing the last lines to:
results2.products = caller(results, results2)
results2

But now it only returns <coroutine object caller at 0x000001FE173742B0>, so it is still not waiting and just hands back the coroutine itself. What am I missing?
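
For reference, the behaviour described above can be reproduced in a few lines: calling a coroutine function only creates a coroutine object, and an event loop has to run it before any value comes back. This is a minimal sketch, not a drop-in fix. The double() coroutine and the input list are hypothetical stand-ins for closest_match and the dataframe column. It uses only calls that exist on Python 3.6.

```python
import asyncio

async def double(x):
    # Hypothetical stand-in coroutine; any `async def` behaves the same way.
    return 2 * x

async def caller(values):
    # gather() is awaited inside a coroutine, so it resolves to a list of results.
    return await asyncio.gather(*[double(v) for v in values])

# Calling the coroutine function only builds a coroutine object; nothing runs yet.
coro = caller([1, 2, 3])
is_coroutine = asyncio.iscoroutine(coro)
print(is_coroutine)  # True

# An event loop has to drive the coroutine to completion.
# new_event_loop() and run_until_complete() are available on Python 3.6,
# unlike asyncio.run().
loop = asyncio.new_event_loop()
try:
    doubled = loop.run_until_complete(coro)
finally:
    loop.close()

print(doubled)  # [2, 4, 6]
```

The value returned by run_until_complete() is an ordinary list, in the same order as the inputs, so it could be assigned to a dataframe column directly.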

P.S. I am running Python 3.6, so I cannot use asyncio.create_task() or asyncio.run().
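
As a point of comparison, the matching logic can be checked without asyncio at all. The distance computation never awaits any I/O, so wrapping it in coroutines would not by itself make it faster. The sketch below uses plain functions, with plain lists standing in for the two dataframes. The names levenshtein_distance, noisy and cleaned are illustrative and not from the original code.

```python
def levenshtein_distance(s1, s2):
    # Same dynamic-programming recurrence as in the question, as a plain function.
    if len(s1) > len(s2):
        s1, s2 = s2, s1
    distances = list(range(len(s1) + 1))
    for i2, c2 in enumerate(s2):
        new_distances = [i2 + 1]
        for i1, c1 in enumerate(s1):
            if c1 == c2:
                new_distances.append(distances[i1])
            else:
                new_distances.append(
                    1 + min(distances[i1], distances[i1 + 1], new_distances[-1])
                )
        distances = new_distances
    return distances[-1]

def closest_match(string, candidates):
    # The smallest edit distance wins; ties go to the earliest candidate.
    return min(candidates, key=lambda c: levenshtein_distance(string, c))

# Sample values taken from the results / results2 listings above.
products = ["pizza", "ketchup", "salami", "anchovy", "pepperoni", "marinara",
            "olive", "sausage", "cheese", "bbq sauce", "stuffed crust"]
noisy = ["chease", "pizzas", "marinara"]

known = set(products)  # membership test on the values themselves
cleaned = [s if s in known else closest_match(s, products) for s in noisy]
print(cleaned)  # ['cheese', 'pizza', 'marinara']
```

The membership test is done on a set of the values. With a pandas Series, `string in series` checks the index labels rather than the values, so the equivalent check there would be against `series.values`.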



Read more here: https://stackoverflow.com/questions/64959097/asyncio-coroutine-returns-gatheringfuture-pending-instead-of-actual-values

Content Attribution

This content was originally published by Hamza at Recent Questions - Stack Overflow, and is syndicated here via their RSS feed. You can read the original post over there.
