How to generate a table from arp command output using pandas
Question:
I have the output of the arp command, executed with Paramiko, in table format as shown below:
stdin, stdout, stderr = client.exec_command("arp")
opt = stdout.read().decode('ascii')
print(opt)
Address          HWtype  HWaddress           Flags Mask  Iface(Ports)
0.00.00.00       ether   (incomplete)        C           eth0.1
0.00.00.00       ether   (incomplete)        C           eth0.2
0.00.00.00       ether   00:00:00:00:00:00   C           eth0.001(2)
0.00.00.00       ether   00:00:00:00:00:00   C           eth0.002(6)
Now I want to read this table with pandas and display only a few columns. I tried the code below, but it is not working for me.
opt = io.StringIO(opt)
df = pd.read_table(opt, sep=r'\s{2,}', usecols=[0,2,4], engine='python')
print(df)
       Address          HWaddress  Iface(Ports)
0  00:00:00:00             eth0.1          None
1  00:00:00:00             eth0.2          None
2  00:00:00:00  00:00:00:00:00:00   eth0.001(2)
3  00:00:00:00  00:00:00:00:00:00   eth0.002(6)
As the DataFrame output above shows, wherever the table has the '(incomplete)' string, it is replaced by the Iface(Ports) column data. Can anyone please help me with this?
Answers:
Do you mind doing some string parsing?
import pandas as pd

d = '''
Address          HWtype  HWaddress           Flags Mask  Iface(Ports)
0.00.00.00       ether   (incomplete)        C           eth0.1
0.00.00.00       ether   (incomplete)        C           eth0.2
0.00.00.00       ether   00:00:00:00:00:00   C           eth0.001(2)
0.00.00.00       ether   00:00:00:00:00:00   C           eth0.002(6)
'''
Removing empty lines
lines = [x for x in d.split('\n') if x != '']
Structuring data
data = []
for line in lines:
    # split on single spaces and drop the empty strings left by runs of spaces
    data.append([x for x in line.split(' ') if x != ''])
Create dataframe
df = pd.DataFrame(data)
df.columns = df.iloc[0]
df = df[1:]
Sample output
0     Address  HWtype          HWaddress  Flags         Mask  Iface(Ports)
1  0.00.00.00   ether       (incomplete)      C       eth0.1          None
2  0.00.00.00   ether       (incomplete)      C       eth0.2          None
3  0.00.00.00   ether  00:00:00:00:00:00      C  eth0.001(2)          None
4  0.00.00.00   ether  00:00:00:00:00:00      C  eth0.002(6)          None
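In the sample output above the interface names land under Mask, because each data row has one field fewer than the header. A minimal sketch of one way to compensate is shown here. It assumes that a row which is exactly one field short is missing its Mask value, which holds for this sample but may not hold for every arp table. The sample text and the variable names are illustrative only.

```python
import pandas as pd

# Illustrative sample; the Mask column is blank in every data row.
raw = """
Address          HWtype  HWaddress           Flags Mask  Iface(Ports)
0.00.00.00       ether   (incomplete)        C           eth0.1
0.00.00.00       ether   00:00:00:00:00:00   C           eth0.001(2)
"""

lines = [line for line in raw.splitlines() if line.strip()]
header = lines[0].split()

rows = []
for line in lines[1:]:
    fields = line.split()
    # Assumption: a row that is one field short is missing its Mask value,
    # so an empty string is inserted at the Mask position.
    if len(fields) == len(header) - 1:
        fields.insert(header.index("Mask"), "")
    rows.append(fields)

df = pd.DataFrame(rows, columns=header)
print(df[["Address", "HWaddress", "Iface(Ports)"]])
```

Passing the header as columns= also avoids the extra step of promoting the first row to column names.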
I think it will be simpler to use read_fwf() (fixed-width format)
df = pd.read_fwf(io.StringIO(text))
and later select columns
df = df[['Address','HWaddress', 'Iface(Ports)']]
Full example code:
text = '''Address          HWtype  HWaddress           Flags Mask  Iface(Ports)
0.00.00.00       ether   (incomplete)        C           eth0.1
0.00.00.00       ether   (incomplete)        C           eth0.2
0.00.00.00       ether   00:00:00:00:00:00   C           eth0.001(2)
0.00.00.00       ether   00:00:00:00:00:00   C           eth0.002(6)
'''
import pandas as pd
import io
df = pd.read_fwf(io.StringIO(text))
print('--- full ---')
print(df)
df = df[['Address','HWaddress', 'Iface(Ports)']]
print('--- selected ---')
print(df)
Result:
--- full ---
      Address  HWtype          HWaddress  Flags  Mask  Iface(Ports)
0  0.00.00.00   ether       (incomplete)      C   NaN        eth0.1
1  0.00.00.00   ether       (incomplete)      C   NaN        eth0.2
2  0.00.00.00   ether  00:00:00:00:00:00      C   NaN   eth0.001(2)
3  0.00.00.00   ether  00:00:00:00:00:00      C   NaN   eth0.002(6)
--- selected ---
      Address          HWaddress  Iface(Ports)
0  0.00.00.00       (incomplete)        eth0.1
1  0.00.00.00       (incomplete)        eth0.2
2  0.00.00.00  00:00:00:00:00:00   eth0.001(2)
3  0.00.00.00  00:00:00:00:00:00   eth0.002(6)
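To tie this back to the Paramiko output in the question, the same read_fwf() call could be wrapped in a small helper. This is only a sketch: arp_to_frame is a hypothetical name, and the sample string stands in for the decoded stdout. It is built with fixed-width formatting so that the columns are guaranteed to line up, which is what read_fwf() relies on when it infers the column boundaries.

```python
import io

import pandas as pd


def arp_to_frame(output, columns=None):
    """Parse column-aligned arp output with read_fwf (hypothetical helper)."""
    # Column boundaries are inferred from whitespace shared by all rows,
    # so the blank Mask column becomes NaN instead of shifting later columns.
    df = pd.read_fwf(io.StringIO(output))
    return df if columns is None else df[columns]


# Stand-in for stdout.read().decode('ascii'); widths are chosen for illustration.
row_format = "{:<17}{:<8}{:<20}{:<6}{:<6}{}\n"
records = [
    ("Address", "HWtype", "HWaddress", "Flags", "Mask", "Iface(Ports)"),
    ("0.00.00.00", "ether", "(incomplete)", "C", "", "eth0.1"),
    ("0.00.00.00", "ether", "00:00:00:00:00:00", "C", "", "eth0.001(2)"),
]
sample = "".join(row_format.format(*record) for record in records)

full = arp_to_frame(sample)
table = arp_to_frame(sample, ["Address", "HWaddress", "Iface(Ports)"])
print(table)
```

If the real output is not column-aligned (for example, if whitespace was collapsed somewhere along the way), the inference has nothing to work with and explicit parsing is the safer route.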
EDIT
After getting the DataFrame you can modify the data using pandas functions, e.g. .apply(). The values are strings, so you can use string functions or regex.
def convert(item):
    # keep only the text between the parentheses, e.g. 'eth0.001(2)' -> '2'
    if '(' in item:
        item = item.split('(')[1].split(')')[0]
    return item
df['Iface(Ports)'] = df['Iface(Ports)'].apply(convert)
print(df)
Result:
      Address          HWaddress  Iface(Ports)
0  0.00.00.00       (incomplete)        eth0.1
1  0.00.00.00       (incomplete)        eth0.2
2  0.00.00.00  00:00:00:00:00:00             2
3  0.00.00.00  00:00:00:00:00:00             6
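The regex route mentioned above could look roughly like this. It is a sketch that uses the vectorized Series.str.extract() instead of .apply(), and the small DataFrame is a stand-in for the selected columns. The pattern captures whatever sits between the parentheses, and rows without parentheses keep their original value, mirroring convert().

```python
import pandas as pd

# Stand-in for the Iface(Ports) column of the DataFrame above.
df = pd.DataFrame(
    {"Iface(Ports)": ["eth0.1", "eth0.2", "eth0.001(2)", "eth0.002(6)"]}
)

# Capture the text between '(' and ')'; rows without a match become NaN.
ports = df["Iface(Ports)"].str.extract(r"\(([^)]*)\)", expand=False)

# Fall back to the original value where nothing was captured.
df["Iface(Ports)"] = ports.fillna(df["Iface(Ports)"])
print(df)
```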