Python dict comprehension: retrieve values from one dataframe column where another column matches

Question:

I have a dataframe with two columns, ["country"] and ["city"], which give each country and its cities.

I need to create a dict using a dict comprehension, with the country as the key and a list of its city/cities as the value (some countries have only one city, others many).

I'm able to define the keys and create a list, but every city in the dataframe appears in the values. I am not able to express the condition that a value's country must match the key:

Dic = {k: list(megacities["city"]) for k,f in megacities.groupby('country')}
for k in Dic:
    print("{}:{}\n".format(k, Dic[k]))

Part of the output that I receive is:

Argentina:['Tokyo', 'Jakarta', 'Delhi', 'Manila', 'São Paulo', 'Seoul', 'Mumbai', 'Shanghai', 'Mexico City', 'Guangzhou', 'Cairo', 'Beijing', 'New York', 'Kolkāta', 'Moscow', 'Bangkok', 'Dhaka', 'Buenos Aires', 'Ōsaka', 'Lagos', 'Istanbul', 'Karachi', 'Kinshasa', 'Shenzhen', 'Bangalore', 'Ho Chi Minh City', 'Tehran', 'Los Angeles', 'Rio de Janeiro', 'Chengdu', 'Baoding', 'Chennai', 'Lahore', 'London', 'Paris', 'Tianjin', 'Linyi', 'Shijiazhuang', 'Zhengzhou', 'Nanyang']

Bangladesh:['Tokyo', 'Jakarta', 'Delhi', 'Manila', 'São Paulo', 'Seoul', 'Mumbai', 'Shanghai', 'Mexico City', 'Guangzhou', 'Cairo', 'Beijing', 'New York', 'Kolkāta', 'Moscow', 'Bangkok', 'Dhaka', 'Buenos Aires', 'Ōsaka', 'Lagos', 'Istanbul', 'Karachi', 'Kinshasa', 'Shenzhen', 'Bangalore', 'Ho Chi Minh City', 'Tehran', 'Los Angeles', 'Rio de Janeiro', 'Chengdu', 'Baoding', 'Chennai', 'Lahore', 'London', 'Paris', 'Tianjin', 'Linyi', 'Shijiazhuang', 'Zhengzhou', 'Nanyang']

Brazil:['Tokyo', 'Jakarta', 'Delhi', 'Manila', 'São Paulo', 'Seoul', 'Mumbai', 'Shanghai', 'Mexico City', 'Guangzhou', 'Cairo', 'Beijing', 'New York', 'Kolkāta', 'Moscow', 'Bangkok', 'Dhaka', 'Buenos Aires', 'Ōsaka', 'Lagos', 'Istanbul', 'Karachi', 'Kinshasa', 'Shenzhen', 'Bangalore', 'Ho Chi Minh City', 'Tehran', 'Los Angeles', 'Rio de Janeiro', 'Chengdu', 'Baoding', 'Chennai', 'Lahore', 'London', 'Paris', 'Tianjin', 'Linyi', 'Shijiazhuang', 'Zhengzhou', 'Nanyang']

The expected output would be:

Argentina:['Buenos Aires']

Bangladesh:['Dhaka']

Brazil:['São Paulo', 'Rio de Janeiro']

What syntax should I use to establish that condition for the value in the dict comprehension?

Lastly, the dataframe:

city    city_ascii  lat     lng     country     iso2    iso3    admin_name  capital     population  id
0   Tokyo   Tokyo   35.6839     139.7744    Japan   JP  JPN     Tōkyō   primary     39105000    1392685764
1   Jakarta     Jakarta     -6.2146     106.8451    Indonesia   ID  IDN     Jakarta     primary     35362000    1360771077
2   Delhi   Delhi   28.6667     77.2167     India   IN  IND     Delhi   admin   31870000    1356872604
3   Manila  Manila  14.6000     120.9833    Philippines     PH  PHL     Manila  primary     23971000    1608618140
4   São Paulo   Sao Paulo   -23.5504    -46.6339    Brazil  BR  BRA     São Paulo   admin   22495000    1076532519
5   Seoul   Seoul   37.5600     126.9900    South Korea     KR  KOR     Seoul   primary     22394000    1410836482
6   Mumbai  Mumbai  19.0758     72.8775     India   IN  IND     Mahārāshtra     admin   22186000    1356226629
7   Shanghai    Shanghai    31.1667     121.4667    China   CN  CHN     Shanghai    admin   22118000    1156073548
8   Mexico City     Mexico City     19.4333     -99.1333    Mexico  MX  MEX     Ciudad de México    primary     21505000    1484247881
9   Guangzhou   Guangzhou   23.1288     113.2590    China   CN  CHN     Guangdong   admin   21489000    1156237133
10  Cairo   Cairo   30.0444     31.2358     Egypt   EG  EGY     Al Qāhirah  primary     19787000    1818253931
11  Beijing     Beijing     39.9040     116.4075    China   CN  CHN     Beijing     primary     19437000    1156228865
12  New York    New York    40.6943     -73.9249    United States   US  USA     New York    NaN     18713220    1840034016
13  Kolkāta     Kolkata     22.5727     88.3639     India   IN  IND     West Bengal     admin   18698000    1356060520
14  Moscow  Moscow  55.7558     37.6178     Russia  RU  RUS     Moskva  primary     17693000    1643318494
15  Bangkok     Bangkok     13.7500     100.5167    Thailand    TH  THA     Krung Thep Maha Nakhon  primary     17573000    1764068610
16  Dhaka   Dhaka   23.7289     90.3944     Bangladesh  BD  BGD     Dhaka   primary     16839000    1050529279
17  Buenos Aires    Buenos Aires    -34.5997    -58.3819    Argentina   AR  ARG     Buenos Aires, Ciudad Autónoma de    primary     16216000    1032717330
18  Ōsaka   Osaka   34.7520     135.4582    Japan   JP  JPN     Ōsaka   admin   15490000    1392419823
19  Lagos   Lagos   6.4500  3.4000  Nigeria     NG  NGA     Lagos   minor   15487000    1566593751
20  Istanbul    Istanbul    41.0100     28.9603     Turkey  TR  TUR     İstanbul    admin   15311000    1792756324
21  Karachi     Karachi     24.8600     67.0100     Pakistan    PK  PAK     Sindh   admin   15292000    1586129469
22  Kinshasa    Kinshasa    -4.3317     15.3139     Congo (Kinshasa)    CD  COD     Kinshasa    primary     15056000    1180000363
23  Shenzhen    Shenzhen    22.5350     114.0540    China   CN  CHN     Guangdong   minor   14678000    1156158707
24  Bangalore   Bangalore   12.9791     77.5913     India   IN  IND     Karnātaka   admin   13999000    1356410365
25  Ho Chi Minh City    Ho Chi Minh City    10.8167     106.6333    Vietnam     VN  VNM     Hồ Chí Minh     admin   13954000    1704774326
26  Tehran  Tehran  35.7000     51.4167     Iran    IR  IRN     Tehrān  primary     13819000    1364305026
27  Los Angeles     Los Angeles     34.1139     -118.4068   United States   US  USA     California  NaN     12750807    1840020491
28  Rio de Janeiro  Rio de Janeiro  -22.9083    -43.1964    Brazil  BR  BRA     Rio de Janeiro  admin   12486000    1076887657
29  Chengdu     Chengdu     30.6600     104.0633    China   CN  CHN     Sichuan     admin   11920000    1156421555
30  Baoding     Baoding     38.8671     115.4845    China   CN  CHN     Hebei   NaN     11860000    1156256829
31  Chennai     Chennai     13.0825     80.2750     India   IN  IND     Tamil Nādu  admin   11564000    1356374944
32  Lahore  Lahore  31.5497     74.3436     Pakistan    PK  PAK     Punjab  admin   11148000    1586801463
33  London  London  51.5072     -0.1275     United Kingdom  GB  GBR     London, City of     primary     11120000    1826645935
34  Paris   Paris   48.8566     2.3522  France  FR  FRA     Île-de-France   primary     11027000    1250015082
35  Tianjin     Tianjin     39.1467     117.2056    China   CN  CHN     Tianjin     admin   10932000    1156174046
36  Linyi   Linyi   35.0606     118.3425    China   CN  CHN     Shandong    NaN     10820000    1156086320
37  Shijiazhuang    Shijiazhuang    38.0422     114.5086    China   CN  CHN     Hebei   admin   10784600    1156217541
38  Zhengzhou   Zhengzhou   34.7492     113.6605    China   CN  CHN     Henan   admin   10136000    1156183137
39  Nanyang     Nanyang     32.9987     112.5292    China   CN  CHN     Henan   NaN     10013600    1156192287

Many thanks!

Asked By: Joseph

Answers:

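Since you are iterating over the groupby, you need to take the cities from each group (f) rather than from the whole dataframe. Note that unique() drops duplicates and returns a NumPy array; append .tolist() if you need a plain list: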

Dic = {k: f['city'].unique() for k,f in megacities.groupby('country')}
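
To see why the original comprehension repeats every city, note that its value expression, list(megacities["city"]), never uses the group f. Here is a minimal, self-contained sketch of the difference. It assumes a hypothetical five-row sample of the dataframe with only the city and country columns, and adds .tolist() so the values are plain lists:

```python
import pandas as pd

# Hypothetical sample: five rows taken from the question's dataframe,
# keeping only the two columns that matter here.
megacities = pd.DataFrame({
    "city": ["São Paulo", "Dhaka", "Buenos Aires", "Rio de Janeiro", "Tokyo"],
    "country": ["Brazil", "Bangladesh", "Argentina", "Brazil", "Japan"],
})

# Original attempt: the value ignores the group f,
# so every country is mapped to the full city column.
wrong = {k: list(megacities["city"]) for k, f in megacities.groupby("country")}

# Fix: read the city column from f, which holds only that country's rows.
right = {k: f["city"].unique().tolist() for k, f in megacities.groupby("country")}

print(wrong["Brazil"])  # all five cities
print(right["Brazil"])  # only the Brazilian cities
```

Iterating over a groupby yields (key, sub-dataframe) pairs, so any filtering by country has already been done by the time f reaches the value expression.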
Answered By: Rahul K P

Try (here df is your megacities dataframe):

d = {i: g["city"].to_list() for i, g in df.groupby("country")}
print(d)

Prints:

{
    "Argentina": ["Buenos Aires"],
    "Bangladesh": ["Dhaka"],
    "Brazil": ["São Paulo", "Rio de Janeiro"],
    "China": [
        "Shanghai",
        "Guangzhou",
        "Beijing",
        "Shenzhen",
        "Chengdu",
        "Baoding",
        "Tianjin",
        "Linyi",
        "Shijiazhuang",
        "Zhengzhou",
        "Nanyang",
    ],
    "Congo (Kinshasa)": ["Kinshasa"],
    "Egypt": ["Cairo"],
    "France": ["Paris"],
    "India": ["Delhi", "Mumbai", "Kolkāta", "Bangalore", "Chennai"],
    "Indonesia": ["Jakarta"],
    "Iran": ["Tehran"],
    "Japan": ["Tokyo", "Ōsaka"],
    "Mexico": ["Mexico City"],
    "Nigeria": ["Lagos"],
    "Pakistan": ["Karachi", "Lahore"],
    "Philippines": ["Manila"],
    "Russia": ["Moscow"],
    "South Korea": ["Seoul"],
    "Thailand": ["Bangkok"],
    "Turkey": ["Istanbul"],
    "United Kingdom": ["London"],
    "United States": ["New York", "Los Angeles"],
    "Vietnam": ["Ho Chi Minh City"],
}
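
If a dict comprehension is not a hard requirement, the same mapping can probably be built without one by aggregating each group into a list and converting the result to a dict. This is an alternative technique, not the comprehension the question asks for. It is sketched here on a hypothetical five-row sample, with df standing in for the full dataframe:

```python
import pandas as pd

# Hypothetical sample of the dataframe (city and country columns only).
df = pd.DataFrame({
    "city": ["São Paulo", "Dhaka", "Buenos Aires", "Rio de Janeiro", "Tokyo"],
    "country": ["Brazil", "Bangladesh", "Argentina", "Brazil", "Japan"],
})

# Collect each country's cities into a list, then turn the
# resulting Series (index: country, values: lists) into a dict.
d = df.groupby("country")["city"].agg(list).to_dict()

print(d["Brazil"])
```

By default groupby sorts the keys, which is why the countries come out in alphabetical order; passing sort=False to groupby keeps them in order of first appearance instead.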
Answered By: Andrej Kesely