How to generate a table from arp command output using pandas

Question:

I have arp command output, obtained with Paramiko, in a table format as shown below:

stdin, stdout, stderr = client.exec_command("arp")
opt = stdout.read().decode('ascii')
print(opt)

Address                 HWtype  HWaddress           Flags Mask    Iface(Ports)
0.00.00.00              ether       (incomplete)       C               eth0.1
0.00.00.00              ether       (incomplete)       C               eth0.2
0.00.00.00              ether   00:00:00:00:00:00      C             eth0.001(2)
0.00.00.00              ether   00:00:00:00:00:00      C             eth0.002(6)

Now I want to read this table with pandas and display only a few columns. I tried the code below, but it is not working for me.

    opt = io.StringIO(opt)
    df = pd.read_table(opt, sep=r'\s{2,}', usecols=[0,2,4], engine='python')
    print(df)

     Address              HWaddress         Iface(Ports)
0      00:00:00:00           eth0.1          None
1      00:00:00:00           eth0.2          None
2      00:00:00:00  00:00:00:00:00:00   eth0.001(2)
3      00:00:00:00  00:00:00:00:00:00   eth0.002(6)

As the DataFrame output above shows, wherever the input contains the string '(incomplete)', that value is replaced by the Iface(Ports) column data. Can anyone please help me with this?

Asked By: demo


Answers:

Do you mind doing some string parsing?

d = '''

Address                 HWtype  HWaddress           Flags Mask    Iface(Ports)
0.00.00.00              ether       (incomplete)       C               eth0.1
0.00.00.00              ether       (incomplete)       C               eth0.2
0.00.00.00              ether   00:00:00:00:00:00      C             eth0.001(2)
0.00.00.00              ether   00:00:00:00:00:00      C             eth0.002(6)

'''

Removing empty lines

dd = [x for x in d.split('\n') if x != '']

Structuring data

data = []
for line in dd:
    data.append([x for x in line.split(' ') if x != ''])

Create dataframe

import pandas as pd

df = pd.DataFrame(data)
df.columns = df.iloc[0]
df = df[1:]

Sample output

0     Address HWtype          HWaddress Flags         Mask Iface(Ports)
1  0.00.00.00  ether       (incomplete)     C       eth0.1         None
2  0.00.00.00  ether       (incomplete)     C       eth0.2         None
3  0.00.00.00  ether  00:00:00:00:00:00     C  eth0.001(2)         None
4  0.00.00.00  ether  00:00:00:00:00:00     C  eth0.002(6)         None
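
In the sample output above, the header "Flags Mask" is split into two column names, so the interface values end up under Mask and Iface(Ports) is left as None. One possible way around this, sketched here as an assumption rather than as part of the original answer, is to split each line on runs of two or more spaces so that "Flags Mask" stays a single header. This only holds if every field in every row is non-empty and the columns are separated by at least two spaces; an empty cell would still shift that row.

```python
import re
import pandas as pd

text = '''Address                 HWtype  HWaddress           Flags Mask    Iface(Ports)
0.00.00.00              ether       (incomplete)       C               eth0.1
0.00.00.00              ether       (incomplete)       C               eth0.2
0.00.00.00              ether   00:00:00:00:00:00      C             eth0.001(2)
0.00.00.00              ether   00:00:00:00:00:00      C             eth0.002(6)
'''

# Split every non-empty line on runs of 2+ whitespace characters,
# so a header with a single space ("Flags Mask") is kept as one field.
rows = [re.split(r'\s{2,}', line.strip())
        for line in text.splitlines() if line.strip()]

# First row is the header, the remaining rows are the data.
df = pd.DataFrame(rows[1:], columns=rows[0])

print(df[['Address', 'HWaddress', 'Iface(Ports)']])
```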
Answered By: srinath

I think it will be simpler to use read_fwf() (fixed-width format)

df = pd.read_fwf(io.StringIO(text))

and later select columns

df = df[['Address','HWaddress', 'Iface(Ports)']]

Full example code:

text = '''Address                 HWtype  HWaddress           Flags Mask    Iface(Ports)
0.00.00.00              ether       (incomplete)       C               eth0.1
0.00.00.00              ether       (incomplete)       C               eth0.2
0.00.00.00              ether   00:00:00:00:00:00      C             eth0.001(2)
0.00.00.00              ether   00:00:00:00:00:00      C             eth0.002(6)
'''

import pandas as pd
import io

df = pd.read_fwf(io.StringIO(text))

print('--- full ---')
print(df)

df = df[['Address','HWaddress', 'Iface(Ports)']]

print('--- selected ---')
print(df)

Result:

--- full ---
      Address HWtype          HWaddress Flags  Mask Iface(Ports)
0  0.00.00.00  ether       (incomplete)     C   NaN       eth0.1
1  0.00.00.00  ether       (incomplete)     C   NaN       eth0.2
2  0.00.00.00  ether  00:00:00:00:00:00     C   NaN  eth0.001(2)
3  0.00.00.00  ether  00:00:00:00:00:00     C   NaN  eth0.002(6)
--- selected ---
      Address          HWaddress Iface(Ports)
0  0.00.00.00       (incomplete)       eth0.1
1  0.00.00.00       (incomplete)       eth0.2
2  0.00.00.00  00:00:00:00:00:00  eth0.001(2)
3  0.00.00.00  00:00:00:00:00:00  eth0.002(6)

EDIT

After getting the DataFrame you can modify the data using pandas functions, e.g. .apply(). The values are strings, so you can use string functions or regex.

def convert(item):
    if '(' in item:
        item = item.split('(')[1].split(')')[0]
    return item

df['Iface(Ports)'] = df['Iface(Ports)'].apply(convert)

print(df)

Result:

      Address          HWaddress Iface(Ports)
0  0.00.00.00       (incomplete)       eth0.1
1  0.00.00.00       (incomplete)       eth0.2
2  0.00.00.00  00:00:00:00:00:00            2
3  0.00.00.00  00:00:00:00:00:00            6
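
As a regex-based alternative to the convert() function above, the same replacement could be sketched with the vectorized .str.extract() accessor. The small DataFrame below is rebuilt by hand for illustration only, and the pattern assumes the port is whatever appears between the first pair of parentheses.

```python
import pandas as pd

# Hand-built stand-in for the 'Iface(Ports)' column from the result above.
df = pd.DataFrame({'Iface(Ports)': ['eth0.1', 'eth0.2', 'eth0.001(2)', 'eth0.002(6)']})

# Capture the text inside the parentheses; rows without parentheses give NaN.
ports = df['Iface(Ports)'].str.extract(r'\(([^)]*)\)', expand=False)

# Keep the original value wherever no port was found.
df['Iface(Ports)'] = ports.fillna(df['Iface(Ports)'])

print(df)
```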
Answered By: furas