ValueError: I/O operation on closed file (works on local machine but not on Google Colab)
Question:
I have some CSV files in a folder. I defined a function to read a column from each CSV file, multiply its values by a factor, find the maximum, and print it out.
I'd like the output to be written to a text file.
The code works fine on my local machine, but when I run it on Google Colab it produces an error and seems to keep running without stopping:
Exception in callback BaseAsyncIOLoop._handle_events(17, 1)
handle: <Handle BaseAsyncIOLoop._handle_events(17, 1)>
Traceback (most recent call last):
File "/usr/lib/python3.7/asyncio/events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "/usr/local/lib/python3.7/dist-packages/tornado/platform/asyncio.py", line 122, in _handle_events
handler_func(fileobj, events)
File "/usr/local/lib/python3.7/dist-packages/tornado/stack_context.py", line 300, in null_wrapper
return fn(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/zmq/eventloop/zmqstream.py", line 451, in _handle_events
self._handle_recv()
File "/usr/local/lib/python3.7/dist-packages/zmq/eventloop/zmqstream.py", line 480, in _handle_recv
self._run_callback(callback, msg)
File "/usr/local/lib/python3.7/dist-packages/zmq/eventloop/zmqstream.py", line 434, in _run_callback
callback(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/tornado/stack_context.py", line 300, in null_wrapper
return fn(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/ipykernel/kernelbase.py", line 283, in dispatcher
return self.dispatch_shell(stream, msg)
File "/usr/local/lib/python3.7/dist-packages/ipykernel/kernelbase.py", line 239, in dispatch_shell
sys.stdout.flush()
ValueError: I/O operation on closed file.
Where did it go wrong, and how can it be corrected?
from google.colab import drive
drive.mount('/content/drive')

import pandas as pd
import numpy as np
import glob, sys

folder = "/content/drive/My Drive/Data folder/"

def to_cal(file_name, times):
    df['Result'] = df['Unit Price'] * times
    print(file_name, df['Result'].max())
    return

files = glob.glob(folder + "/*.csv")

with open(folder + 'output (testing).txt', 'a') as outfile:
    sys.stdout = outfile
    for f in files:
        df = pd.read_csv(f)
        file_name = f.replace(folder, "")
        to_cal(file_name, 10)
outfile.close()
Answers:
I ran it on Colab, and the full error message shows something very interesting: sys.stdout.flush(). That confirms the problem is caused by sys.stdout = outfile.

On your local computer you probably run it as a Python script, so every run starts a fresh interpreter with a new sys.stdout, and closing it causes no problem. On Colab (and probably in other Python shells) the same interpreter keeps running the whole time, so once one execution closes sys.stdout, later executions can no longer use it.
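If you do need to redirect output temporarily, a safer pattern than reassigning sys.stdout by hand (a sketch, not part of the original answer) is contextlib.redirect_stdout, which restores the original stream automatically when the block exits:

```python
import contextlib
import io

buf = io.StringIO()
with contextlib.redirect_stdout(buf):
    print("captured")       # goes into buf, not the console
print("back to normal")      # sys.stdout is already restored here
```

The same context manager also accepts an open file object, and because the original stream is put back on exit, the notebook kernel's own flushes keep working.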
If you want to redirect print() to a file, it is better to use
print(..., file=outfile)
Or write it in the normal way:
text = '{} {}\n'.format(file_name, df['Result'].max())
outfile.write(text)
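Putting that advice together, a corrected version of the whole loop might look like the sketch below. The temporary folder and the sample CSV are stand-ins for the questioner's Drive data, so the script is runnable anywhere; the key point is that output goes to the file via print(..., file=outfile) and sys.stdout is never touched:

```python
import glob
import os
import tempfile

import pandas as pd

# Stand-in for the questioner's Drive folder: a temp dir with one sample CSV.
folder = tempfile.mkdtemp()
pd.DataFrame({'Unit Price': [1.5, 2.0, 3.25]}).to_csv(
    os.path.join(folder, 'sample.csv'), index=False)

def to_cal(df, times):
    """Return the maximum of 'Unit Price' multiplied by `times`."""
    return (df['Unit Price'] * times).max()

out_path = os.path.join(folder, 'output.txt')
with open(out_path, 'a') as outfile:
    for f in glob.glob(os.path.join(folder, '*.csv')):
        df = pd.read_csv(f)
        file_name = os.path.basename(f)
        # Write to the file explicitly instead of redirecting sys.stdout.
        print(file_name, to_cal(df, 10), file=outfile)

with open(out_path) as fh:
    print(fh.read(), end='')  # sample.csv 32.5
```

Passing the DataFrame into to_cal also removes the reliance on the global df from the original code, and dropping the redundant outfile.close() lets the with block manage the file's lifetime.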