Import of large CSV file using Pandas – Avoid truncated output

Question:

Is there a way to import a large CSV file into PyCharm using pandas? No matter what I do, the output shown in the Run terminal is severely truncated, which makes it hard to select or clean the data.

Any suggestions would be appreciated.

Asked By: Wyatt_Earp


Answers:

pandas provides options that control how a DataFrame is displayed:

  • pd.options.display.width
  • pd.options.display.max_columns
  • pd.options.display.max_rows

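By default, pandas displays a truncated table if the DataFrame has more rows/columns than max_rows/max_columns. (When running in a terminal, max_columns defaults to 0, which means pandas auto-detects how many columns fit in the terminal width.)
You can raise these limits as needed. Here’s a sample session showing the output before and after.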

>>> import pandas as pd
>>> from random import random

>>> df = pd.DataFrame({
...     f'c{col_no}': [random() for _ in range(100)] 
...     for col_no in range(15)
... })

>>> pd.options.display.max_columns, pd.options.display.max_rows
(0, 60)
>>> df
          c0        c1        c2  ...       c12       c13       c14
0   0.871826  0.415696  0.962756  ...  0.036385  0.405643  0.807471
1   0.531463  0.516149  0.811182  ...  0.588035  0.015000  0.447855
2   0.703785  0.793341  0.019570  ...  0.374489  0.057472  0.590761
3   0.762984  0.171603  0.127855  ...  0.357097  0.013220  0.132322
4   0.991035  0.113433  0.840822  ...  0.113895  0.707505  0.457993
..       ...       ...       ...  ...       ...       ...       ...
95  0.438203  0.465847  0.287558  ...  0.236885  0.495121  0.115823
96  0.612054  0.709875  0.217789  ...  0.569730  0.779009  0.429083
97  0.396499  0.017465  0.075139  ...  0.032245  0.955732  0.708767
98  0.096672  0.227434  0.347087  ...  0.841708  0.031055  0.689640
99  0.123338  0.199680  0.284335  ...  0.328187  0.362656  0.379024

>>> pd.options.display.width = 200
>>> pd.options.display.max_columns = 15
>>> pd.options.display.max_rows = 100
>>> df
          c0        c1        c2        c3        c4        c5        c6        c7        c8        c9       c10       c11       c12       c13       c14
0   0.871826  0.415696  0.962756  0.337541  0.798125  0.641710  0.060606  0.268195  0.033646  0.713952  0.999305  0.266091  0.036385  0.405643  0.807471
1   0.531463  0.516149  0.811182  0.517024  0.907563  0.098621  0.486572  0.105661  0.233740  0.442899  0.882617  0.491250  0.588035  0.015000  0.447855
2   0.703785  0.793341  0.019570  0.656947  0.771691  0.163144  0.739283  0.775620  0.454568  0.739937  0.376440  0.783414  0.374489  0.057472  0.590761
3   0.762984  0.171603  0.127855  0.347233  0.681083  0.469366  0.074852  0.327360  0.583786  0.570660  0.918842  0.140252  0.357097  0.013220  0.132322
4   0.991035  0.113433  0.840822  0.198988  0.117649  0.148605  0.173794  0.126979  0.322275  0.766880  0.011601  0.918334  0.113895  0.707505  0.457993
5   0.027492  0.441665  0.015462  0.425986  0.876837  0.041831  0.385929  0.622585  0.893251  0.207410  0.126994  0.540103  0.132818  0.320651  0.135680
6   0.364498  0.777506  0.571290  0.463168  0.372986  0.727358  0.286281  0.060411  0.091997  0.599882  0.914836  0.713235  0.769993  0.912143  0.973625
7   0.021097  0.271388  0.903971  0.347351  0.255841  0.020190  0.307909  0.189683  0.635788  0.932846  0.740916  0.657532  0.347275  0.677888  0.027598
8   0.594859  0.905407  0.767936  0.929833  0.048191  0.084725  0.967413  0.183815  0.758094  0.686023  0.087515  0.512909  0.942502  0.858353  0.855532
9   0.899373  0.681138  0.546424  0.809373  0.174588  0.691135  0.755386  0.590502  0.161688  0.711284  0.918817  0.579863  0.599287  0.280585  0.691854
10  0.471923  0.523145  0.918165  0.406063  0.095486  0.972089  0.724117  0.231671  0.200418  0.733166  0.019452  0.128490  0.524909  0.895029  0.584772
[... the remaining rows are printed as well]

Reference: Options and settings – pandas
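
If you would rather not change the display settings for the whole session, pd.option_context can lift the limits just for one block. The following is a minimal sketch, not the only way to do it: the CSV content is made up and read from an in-memory buffer as a stand-in for your real file, and the width of 1000 is an arbitrary value chosen to be wide enough for this example.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a large file on disk:
# an "id" column plus 30 value columns, 100 rows.
header = "id," + ",".join(f"c{i}" for i in range(30))
rows = [
    f"{row}," + ",".join(str(row * 30 + col) for col in range(30))
    for row in range(100)
]
csv_text = header + "\n" + "\n".join(rows)

# With a real file you would pass its path to read_csv instead.
df = pd.read_csv(io.StringIO(csv_text))

# Lift the limits only inside this block; None means "no limit".
with pd.option_context(
    "display.max_rows", None,
    "display.max_columns", None,
    "display.width", 1000,  # assumed wide enough for this example
):
    full_text = repr(df)

# Outside the block the previous settings apply again.
default_text = repr(df)

print(full_text)
```

Here full_text contains every row and column, while default_text is still truncated because the settings revert when the with block ends. If you changed an option globally instead, pd.reset_option("display.max_rows") restores that option's default.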


In PyCharm, you can also use SciView to explore a DataFrame.

Click ‘View as DataFrame’ in the ‘Variables View’ (right panel) of the Python Console.
[Screenshot: PyCharm Python Console]

The DataFrame will open in the ‘SciView’ panel.
[Screenshot: PyCharm SciView]

Answered By: lunalcni