dataframe to long format

Question:

I have the following df:

tz.head()

    state   2004    2005    2006    2007    2008    2009    2010    2011    2012    2013    2014    2015
0   AL      5.7     4.5     4.0     4.0     5.7     11.0    10.5    9.6     8.0     7.2     6.8     6.1
1   AK      7.5     6.9     6.6     6.3     6.7     7.7     7.9     7.6     7.1     6.9     6.9     6.5
2   AZ      5.0     4.7     4.2     3.9     6.2     9.9     10.4    9.5     8.3     7.7     6.8     6.1
3   AR      5.7     5.2     5.2     5.3     5.5     7.8     8.2     8.3     7.6     7.3     6.1     5.2
4   CA      6.2     5.4     4.9     5.4     7.3     11.2    12.2    11.7    10.4    8.9     7.5     6.2

I would like to change it so that looks like this:

year    state   unemployment
2004    AL  5.7
2005    AL  4.5
2006    AL  4.0
2007    AL  4.0
2008    AL  5.7
2009    AL  11.0
2010    AL  10.5
2011    AL  9.6
2012    AL  8.0
2013    AL  7.2
2014    AL  6.8
2015    AL  6.1
2004    AK  7.5
2005    AK  6.9
2006    AK  6.6
2007    AK  6.3
2008    AK  6.7
2009    AK  7.7
2010    AK  7.9
2011    AK  7.6
2012    AK  7.1
2013    AK  6.9
2014    AK  6.9
2015    AK  6.5

The reason is that I have a df that is similarly shaped and I need to merge the two dfs. I have recently had similar df shaping issues that I have been unable to find simple quick solutions to with python. Does anyone know how to solve this kind of problem?

Asked By: Collective Action

||

Answers:

You can use melt:

print pd.melt(df,id_vars=['state'],var_name='year', value_name='unemployment')

   state  year  unemployment
0     AL  2004           5.7
1     AK  2004           7.5
2     AZ  2004           5.0
3     AR  2004           5.7
4     CA  2004           6.2
5     AL  2005           4.5
6     AK  2005           6.9
7     AZ  2005           4.7
8     AR  2005           5.2
9     CA  2005           5.4
10    AL  2006           4.0
11    AK  2006           6.6
12    AZ  2006           4.2
13    AR  2006           5.2
14    CA  2006           4.9
15    AL  2007           4.0
16    AK  2007           6.3
17    AZ  2007           3.9
18    AR  2007           5.3
19    CA  2007           5.4
20    AL  2008           5.7
21    AK  2008           6.7
22    AZ  2008           6.2
23    AR  2008           5.5
24    CA  2008           7.3
25    AL  2009          11.0
26    AK  2009           7.7
27    AZ  2009           9.9
28    AR  2009           7.8
29    CA  2009          11.2
30    AL  2010          10.5
31    AK  2010           7.9
32    AZ  2010          10.4
33    AR  2010           8.2
34    CA  2010          12.2
35    AL  2011           9.6
36    AK  2011           7.6
37    AZ  2011           9.5
38    AR  2011           8.3
39    CA  2011          11.7
40    AL  2012           8.0
41    AK  2012           7.1
42    AZ  2012           8.3
43    AR  2012           7.6
44    CA  2012          10.4
45    AL  2013           7.2
46    AK  2013           6.9
47    AZ  2013           7.7
48    AR  2013           7.3
49    CA  2013           8.9
50    AL  2014           6.8
51    AK  2014           6.9
52    AZ  2014           6.8
53    AR  2014           6.1
54    CA  2014           7.5
55    AL  2015           6.1
56    AK  2015           6.5
57    AZ  2015           6.1
58    AR  2015           5.2
59    CA  2015           6.2
Answered By: jezrael

Besides melt, pandas also provides wide_to_long, which is “Less flexible but more user-friendly than melt.” according to the docs.

# Adding prefixes here to get nice column names afterwards
df = df.add_prefix('unemployment')
df.rename(columns={'unemploymentstate': 'state'}, inplace=True)

# Reshaping to long/tidy format
pd.wide_to_long(df, stubnames='unemployment', i='state', j='year')

        unemployment
state   year    
AL  2004    5.7
AK  2004    7.5
AZ  2004    5.0
AR  2004    5.7
CA  2004    6.2
AL  2005    4.5
...
Answered By: rvf
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.