[Python], How to calculate 'service response time' per conversation id

Question

I would like to calculate the average ‘service response time’ per conversation id as a variable in a dataframe (in minutes).

The ‘service response time’ is calculated by the difference between the ‘created_at’ variable from Y and X in minutes:

X = the first row where owner_type == "User" and is_interaction == 1.

Y = the first row after X where owner_type == "Agent" and owner_id != 1.

Update:

id	owner_type	owner_id	conversation_id	message	created_at	is_interaction
260943	Agent	1	26276	a	01/03/2022 15:00
265544	Agent	1	26276	b	05/03/2022 12:01
266749	User	153263	26276	c	05/03/2022 15:49	1
266750	User	153263	26276	d	05/03/2022 15:49	1
266753	Agent	14	26276	e	05/03/2022 15:51
267003	Agent	1	26276	f	06/03/2022 12:01
268900	User	153263	26276	g	06/03/2022 17:01	1
268904	Agent	1	26276	h	07/03/2022 12:00
271141	Agent	1	26276	i	09/03/2022 12:00
271725	User	153263	26276	j	09/03/2022 13:01	1
271728	User	153263	26276	k	09/03/2022 13:01	1
271727	Agent	10	26277	l	09/03/2022 13:01
272085	Agent	1	26276	m	10/03/2022 12:01

Any ideas on how to calculate this?

Update:

The resulted output should look like this:

You should replace the column name "Average Response Time (in minutes)" for "srt" in the dataframe. Ignore the "Average" in this column name, because it’s not and the "Date" column if not needed.

Best regards,

Milan Passchier

Asked By: Milan Passchier

||

Source

Answer 1

Update 04.11.2022

If you had a unique ID for each event, then it would be easier. And more: ‘X = the first row where owner_type == "User" and is_interaction == 1.’ This is not the first row at all, but the last one before ‘Y = the first row after X where owner_type == "Agent" and owner_id != 1’.

I offer two options. In both cases, the created_at column is converted to the desired format using pd.to_datetime, and a ‘srt’ column is created with empty values. Explicit loc indexing is used.

In the first one, the main logic is in list comprehensions (they are many times faster than a loop).

More:

First, a list bbb is created, in which the condition is checked at each iteration:

if df.loc[i, 'owner_type'] == 'User' and df.loc[i, 'is_interaction'] == 1

if it is met, then the iteration number is written and the my_func function is called, which is fed the iteration number and ‘conversation_id’. The function takes a dataframe by slice starting from i to the last one. Finds a row with ‘Agent’ that does not equal 1 and has the same ‘conversation_id’. The first available line is taken:

m = aaa.index[0]

If there are no such strings, then the function returns -1.

Thus, we get the list bbb, in which the User indexes are on the left and Agent on the right.

In the fff list, the last lines where the Agent index stops repeating are copied.

Further in the loop, with the help of the selected indices, the necessary lines are filled using loc.

code list comprehensions:

import numpy as np
import pandas as pd


df['created_at'] = pd.to_datetime(df['created_at'], errors='raise')
df['srt'] = np.nan

def my_func(i, id):
    m = -1
    aaa = df[i:]
    aaa = aaa[(df.loc[i:, 'conversation_id'] == id) & (df.loc[i:, 'owner_type'] == 'Agent')
              & (df.loc[i:, 'owner_id'] != 1)]

    if len(aaa) > 0:
        m = aaa.index[0]

    return m


bbb = np.array([[i, my_func(i, df.loc[i, 'conversation_id'])]
                for i in range(len(df)) if df.loc[i, 'owner_type'] == 'User' and df.loc[i, 'is_interaction'] == 1])
fff = [bbb[i] for i in range(len(bbb) - 1) if (bbb[i, 1] != bbb[i + 1:, 1]).all() == True and bbb[i, 1] != -1]

if len(bbb) > 1 and bbb[-1, 1] != -1:
    fff.append(bbb[-1])

fff = np.array(fff)

for i in fff:
    df.loc[i[0], 'srt'] = (df.loc[i[1], 'created_at'] - df.loc[i[0], 'created_at']) / np.timedelta64(1, 'm')

print(df)

The solution is where all the logic is in a loop.

import numpy as np
import pandas as pd
    
df['created_at'] = pd.to_datetime(df['created_at'], errors='raise')
df['srt'] = np.nan


ferst_time_user = 0
cid = 0
ind = 0

for i in range(len(df)):
    if df.loc[i, 'owner_type'] == 'User' and df.loc[i, 'is_interaction'] == 1:
        ferst_time_user = df.loc[i, 'created_at']
        cid = df.loc[i, 'conversation_id']
        ind = i

    if ferst_time_user != 0 and df.loc[i, 'conversation_id'] == cid and df.loc[i, 'owner_type'] == 'Agent' and df.loc[i, 'owner_id'] != 1:
        df.loc[ind, 'srt'] = (df.loc[i, 'created_at'] - ferst_time_user) / np.timedelta64(1, 'm')
        ferst_time_user = 0
        ind = 0
        cid = 0

print(df)

Output

        id owner_type  owner_id  conversation_id message          created_at  
0   260943      Agent         1            26276       a 2022-01-03 15:00:00   
1   265544      Agent         1            26276       b 2022-05-03 12:01:00   
2   266749       User    153263            26276       c 2022-05-03 15:49:00   
3   266750       User    153263            26276       d 2022-05-03 15:49:00   
4   266753      Agent        14            26276       e 2022-05-03 15:51:00   
5   267003      Agent         1            26276       f 2022-06-03 12:01:00   
6   268900       User    153263            26276       g 2022-06-03 17:01:00   
7   268904      Agent         1            26276       h 2022-07-03 12:00:00   
8   271141      Agent         1            26276       i 2022-09-03 12:00:00   
9   271725       User    153263            26276       j 2022-09-03 13:01:00   
10  271728       User    153263            26276       k 2022-09-03 13:01:00   
11  271727      Agent        10            26277       l 2022-09-03 13:01:00   
12  272085      Agent         1            26276       m 2022-10-03 12:01:00   

    is_interaction  srt  
0              NaN  NaN  
1              NaN  NaN  
2              1.0  NaN  
3              1.0  2.0  
4              NaN  NaN  
5              NaN  NaN  
6              1.0  NaN  
7              NaN  NaN  
8              NaN  NaN  
9              1.0  NaN  
10             1.0  NaN  
11             NaN  NaN  
12             NaN  NaN

Answered By: inquirer

[Python], How to calculate 'service response time' per conversation id

Question:

Answers: