st_ino from os.stat in Python gets unexpectedly altered if output to a file

Question:

I have tried to research whether this is expected behavior, but I haven’t found anything. Maybe I’m not using the right search terms. I use os.stat in Python to capture file attributes, and I have noticed some strange behavior with st_ino. I am using Python 3.10 on Linux. When I write st_ino to a file, the value somehow gets changed. Here is an example:

import os
import xlsxwriter

directory = "/mnt/user/other"
workbook = xlsxwriter.Workbook('os.walk.file_attributes.xlsx')
worksheet = workbook.add_worksheet()
headers = ['File Name', 'Size (bytes)', 'Creation Time', 'Last Modified Time', 'Last Access Time', 'Inode Links', 'Inode Number str', 'Inode Number']
for i, header in enumerate(headers):
    worksheet.write(0, i, header)

row = 1

for root, dirs, files in os.walk(directory):
    for file in files:
        filepath = os.path.join(root, file)
        statinfo = os.stat(filepath)
        worksheet.write(row, 0, file)
        worksheet.write(row, 1, statinfo.st_size)
        worksheet.write(row, 2, statinfo.st_ctime)
        worksheet.write(row, 3, statinfo.st_mtime)
        worksheet.write(row, 4, statinfo.st_atime)
        worksheet.write(row, 5, statinfo.st_nlink)
        worksheet.write(row, 6, str(statinfo.st_ino))
        worksheet.write(row, 7, statinfo.st_ino)
        row += 1
workbook.close()
print("File attributes saved to os.walk.file_attributes.xlsx")

If you run this code and look at the last column in the xlsx file it creates, the inode numbers are all wrong. I had many repeated inode numbers for files of different sizes that aren’t hard links. That should not be the case. In the column before that, however, I first converted the value to a string, and that seems to be correct. When I print to the screen, statinfo.st_ino and str(statinfo.st_ino) are identical, as they should be. For some reason the value gets changed when output to a file unless it is stringified. I first noticed this because I was using shelve to save time on testing and got inconsistent results when I loaded the shelved data. That’s when I tried the code above to see what the issue was. I couldn’t find any mention of this unexpected behavior in the Python docs. Is it simply a matter of needing to stringify an int before writing it to a file? I know that is the case when using write, but that errors out and explicitly tells you so, and I figured that was inherent to the write method and not necessarily to ints. Has anyone else come across this, or does anyone have an explanation as to why this happens?

Asked By: user2328273


Answers:

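I am on Linux, so the inode numbers are 18 digits long. It seems the fault is with Excel and not with Python, as @user2357112 had suggested. Excel keeps only 15 significant digits for numeric cells, so it was rounding to 15 digits and replacing the last 3 digits with zeros. That is why different files appeared to share an inode number, and why the column written as a string was unaffected. Thank you all.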
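For anyone who wants to see the effect without opening Excel, here is a minimal sketch using only the standard library. The inode values are made up for illustration, and the helper names (survives_as_double, keep_15_digits, safe_cell_value) are hypothetical, not part of os or xlsxwriter. I am also assuming the extra digits are simply zeroed; Excel may round the 15th digit instead, but distinct inodes collapse either way.

```python
# Two made-up 18-digit inode numbers that differ only in their last 3 digits.
ino_a = 123456789012345100
ino_b = 123456789012345200


def survives_as_double(n: int) -> bool:
    """Return True if n converts to a 64-bit float and back unchanged."""
    return int(float(n)) == n


def keep_15_digits(n: int) -> int:
    """Zero every digit after the 15th significant one.

    Assumption: this approximates what a spreadsheet shows for a long
    integer. It truncates; a real application may round instead.
    """
    digits = len(str(abs(n)))
    if digits <= 15:
        return n
    scale = 10 ** (digits - 15)
    return (n // scale) * scale


def safe_cell_value(n: int):
    """Return n unchanged if it fits in 15 digits, otherwise its string form."""
    return n if abs(n) < 10**15 else str(n)


# Python ints are exact, so the two values are different here...
print(ino_a == ino_b)
# ...but they are no longer exact once stored as a double...
print(survives_as_double(ino_a))
# ...and they collapse to the same number once only 15 digits are kept.
print(keep_15_digits(ino_a) == keep_15_digits(ino_b))
# Writing the string form keeps every digit.
print(safe_cell_value(ino_a))
```

In the script from the question, passing str(statinfo.st_ino) to worksheet.write (as column 6 already does) is the simplest way to keep the full value.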

Answered By: user2328273