Pandas DataFrames: Create new rows with calculations across existing rows
How can I create new rows from an existing DataFrame by grouping by certain fields (in the example "Country" and "Industry") and applying some math to another field (in the example "Field" and "Value")?
Source DataFrame
import pandas as pd

df = pd.DataFrame({'Country': ['USA','USA','USA','USA','USA','USA','Canada','Canada'],
                   'Industry': ['Finance', 'Finance', 'Retail',
                                'Retail', 'Energy', 'Energy',
                                'Retail', 'Retail'],
                   'Field': ['Import', 'Export','Import',
                             'Export','Import', 'Export',
                             'Import', 'Export'],
                   'Value': [100, 50, 80, 10, 20, 5, 30, 10]})
    Country Industry    Field   Value
0   USA     Finance     Import  100
1   USA     Finance     Export  50
2   USA     Retail      Import  80
3   USA     Retail      Export  10
4   USA     Energy      Import  20
5   USA     Energy      Export  5
6   Canada  Retail      Import  30
7   Canada  Retail      Export  10
Target DataFrame
Net = Import - Export
    Country Industry    Field   Value
0   USA     Finance     Net     50
1   USA     Retail      Net     70
2   USA     Energy      Net     15
3   Canada  Retail      Net     20
python pandas dataframe
asked 9 hours ago by Lorenz (edited 8 hours ago by Scott Boston)
5 Answers
There are quite possibly many ways. Here's one using groupby and unstack:
(df.groupby(['Country', 'Industry', 'Field'], sort=False)['Value']
   .sum()
   .unstack('Field')
   .eval('Import - Export')
   .reset_index(name='Value'))
  Country Industry  Value
0     USA  Finance     50
1     USA   Retail     70
2     USA   Energy     15
3  Canada   Retail     20
answered 9 hours ago by coldspeed (edited 5 hours ago)
By far the best answer. The unstack followed by eval is a really nice trick, better than the second groupby and get_group I would have done.
– BallpointBen
8 hours ago
@BallpointBen eval and query are personal favourites of mine from the API. I've made attempts to popularise their use, but their usage is not completely understood. I have a QnA here, if you are interested.
– coldspeed
8 hours ago
Works like a charm. Thank you very much. Very small comment: there is a closing bracket missing in the last line.
– Lorenz
5 hours ago
@Lorenz Oops... fixed, thanks!
– coldspeed
5 hours ago
@coldspeed Actually I think there's a better way; see my answer. unstack is expensive because it reshapes. Using the structure of the first groupby is more efficient, although it takes two lines.
– BallpointBen
3 hours ago
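For readers unsure what unstack and eval each contribute, here is a minimal sketch of the same chain split into named steps. The intermediate names (totals, wide, net, result) are illustrative and not from the answer, and the final assign(Field='Net') is an added assumption so that the result matches the target layout from the question.

```python
import pandas as pd

df = pd.DataFrame({
    'Country': ['USA'] * 6 + ['Canada'] * 2,
    'Industry': ['Finance', 'Finance', 'Retail', 'Retail',
                 'Energy', 'Energy', 'Retail', 'Retail'],
    'Field': ['Import', 'Export'] * 4,
    'Value': [100, 50, 80, 10, 20, 5, 30, 10],
})

# Step 1: one total per (Country, Industry, Field) combination.
totals = df.groupby(['Country', 'Industry', 'Field'])['Value'].sum()

# Step 2: move the 'Field' index level into the columns,
# giving one 'Export' and one 'Import' column per (Country, Industry) row.
wide = totals.unstack('Field')

# Step 3: eval resolves 'Import' and 'Export' as column names and returns a Series.
net = wide.eval('Import - Export')

# Step 4: flatten the index again and restore a Field column (added assumption).
result = net.reset_index(name='Value').assign(Field='Net')
print(result)
```

Printing wide on its own shows one row per (Country, Industry) pair with Export and Import side by side, which is what lets eval refer to them by name.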
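If I understand correctly, you can put the Import and Export rows on a shared (Country, Industry) index and subtract them:
indexed = df.set_index(['Country', 'Industry'])
# Both slices carry the same (Country, Industry) index, so the subtraction lines up row by row.
Newdf = (indexed.loc[indexed.Field == 'Import', 'Value']
         - indexed.loc[indexed.Field == 'Export', 'Value']).reset_index().assign(Field='Net')
Newdf
  Country Industry  Value Field
0     USA  Finance     50   Net
1     USA   Retail     70   Net
2     USA   Energy     15   Net
3  Canada   Retail     20   Net
Alternatively, with pivot_table:
(df.pivot_table(index=['Country', 'Industry'], columns='Field', values='Value', aggfunc='sum')
   .diff(axis=1)    # columns are sorted as Export, Import, so this is Import - Export
   .dropna(axis=1)  # drop the all-NaN Export column left behind by diff
   .rename(columns={'Import': 'Value'})
   .reset_index())
Field Country Industry  Value
0      Canada   Retail   20.0
1         USA   Energy   15.0
2         USA  Finance   50.0
3         USA   Retail   70.0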
answered 8 hours ago by Wen-Ben (edited 7 hours ago)
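The diff(axis=1) step only gives Import - Export because the pivoted columns happen to sort as Export, Import. A possible variant, sketched here as an assumption rather than as part of the answer, names the two columns explicitly so the sign does not depend on column order. The names wide and net are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Country': ['USA'] * 6 + ['Canada'] * 2,
    'Industry': ['Finance', 'Finance', 'Retail', 'Retail',
                 'Energy', 'Energy', 'Retail', 'Retail'],
    'Field': ['Import', 'Export'] * 4,
    'Value': [100, 50, 80, 10, 20, 5, 30, 10],
})

# One row per (Country, Industry), one column per Field value.
wide = df.pivot_table(index=['Country', 'Industry'], columns='Field',
                      values='Value', aggfunc='sum')

# Subtract the named columns directly instead of relying on their order.
net = ((wide['Import'] - wide['Export'])
       .rename('Value')
       .reset_index()
       .assign(Field='Net'))
print(net)
```

Because no NaN column is created along the way, there is nothing to drop afterwards.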
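You can use GroupBy.diff(), then recreate the Field column, and finally drop the NaN rows with DataFrame.dropna. This relies on each (Country, Industry) group listing its Import row directly before its Export row: diff() then yields Export - Import, which is negated to give Import - Export.
# diff() is NaN on each group's first row and Export - Import on the second; negate it to get Import - Export
df['Value'] = -df.groupby(['Country', 'Industry'])['Value'].diff()
df['Field'] = 'Net'
df.dropna(inplace=True)
df.reset_index(drop=True, inplace=True)
print(df)
  Country Industry Field  Value
0     USA  Finance   Net   50.0
1     USA   Retail   Net   70.0
2     USA   Energy   Net   15.0
3  Canada   Retail   Net   20.0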
answered 8 hours ago by Erfan
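If the rows cannot be trusted to arrive as Import followed by Export, one order-independent alternative is a sign-weighted sum. This is a different technique from the diff() approach above, sketched under the assumption that Field only ever holds 'Import' or 'Export'. The names sign and net are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Country': ['USA'] * 6 + ['Canada'] * 2,
    'Industry': ['Finance', 'Finance', 'Retail', 'Retail',
                 'Energy', 'Energy', 'Retail', 'Retail'],
    'Field': ['Import', 'Export'] * 4,
    'Value': [100, 50, 80, 10, 20, 5, 30, 10],
})

# Imports count positively, exports negatively (assumes no other Field values).
sign = df['Field'].map({'Import': 1, 'Export': -1})

# Summing the signed values per group gives Import - Export regardless of row order.
net = (df.assign(Value=df['Value'] * sign)
         .groupby(['Country', 'Industry'], sort=False)['Value']
         .sum()
         .reset_index()
         .assign(Field='Net'))
print(net)
```

Unlike the in-place version, this leaves df untouched and keeps the sign of a negative net value.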
You can do it this way to add those rows to your original dataframe:
(df.set_index(['Country', 'Industry', 'Field'])
   .unstack()['Value']
   .eval('Net = Import - Export')
   .stack().rename('Value').reset_index())
Output:
   Country Industry   Field  Value
0   Canada   Retail  Export     10
1   Canada   Retail  Import     30
2   Canada   Retail     Net     20
3      USA   Energy  Export      5
4      USA   Energy  Import     20
5      USA   Energy     Net     15
6      USA  Finance  Export     50
7      USA  Finance  Import    100
8      USA  Finance     Net     50
9      USA   Retail  Export     10
10     USA   Retail  Import     80
11     USA   Retail     Net     70
answered 8 hours ago by Scott Boston
Thanks - actually, I wanted to append it to the original df. So, nice trick to do this all in one command.
– Lorenz
5 hours ago
coldspeed's answer was a slightly better fit to my overall code. Took from your code how you appended the result to the original df. Very tight result, though. Pity that I cannot accept two answers. But thanks again!
– Lorenz
3 hours ago
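The comments mention appending the Net rows to the original df. One way that might look, as a sketch under assumptions rather than the code Lorenz actually used, is to build the Net rows separately and combine them with pd.concat. The names net and combined are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Country': ['USA'] * 6 + ['Canada'] * 2,
    'Industry': ['Finance', 'Finance', 'Retail', 'Retail',
                 'Energy', 'Energy', 'Retail', 'Retail'],
    'Field': ['Import', 'Export'] * 4,
    'Value': [100, 50, 80, 10, 20, 5, 30, 10],
})

# Build the Net rows with the groupby/unstack/eval chain.
net = (df.groupby(['Country', 'Industry', 'Field'])['Value'].sum()
         .unstack('Field')
         .eval('Import - Export')
         .reset_index(name='Value')
         .assign(Field='Net'))

# concat aligns on column names, so the differing column order in net is harmless.
combined = pd.concat([df, net], ignore_index=True)

# Optional: group each (Country, Industry) block together for readability.
combined = (combined.sort_values(['Country', 'Industry', 'Field'])
                    .reset_index(drop=True))
print(combined)
```

This keeps the original eight rows as they are and adds four Net rows, twelve in total.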
This answer takes advantage of the fact that pandas puts the group keys in the MultiIndex of the resulting Series, so each Field value can be selected with xs. (If there were only one group key, you could use loc.)
>>> s = df.groupby(['Country', 'Industry', 'Field'])['Value'].sum()
>>> s.xs('Import', axis=0, level='Field') - s.xs('Export', axis=0, level='Field')
Country  Industry
Canada   Retail      20
USA      Energy      15
         Finance     50
         Retail      70
Name: Value, dtype: int64
answered 3 hours ago by BallpointBen
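The xs result is a Series indexed by (Country, Industry). A sketch of turning it into the target layout follows, using Series.sub with fill_value=0 so that a group with only one of the two fields is still reported. The extra Canada/Energy row is hypothetical and added only to show that case, and treating a missing side as zero is an assumption that may not suit every dataset.

```python
import pandas as pd

df = pd.DataFrame({
    'Country': ['USA'] * 6 + ['Canada'] * 3,
    'Industry': ['Finance', 'Finance', 'Retail', 'Retail',
                 'Energy', 'Energy', 'Retail', 'Retail', 'Energy'],
    # The last row is hypothetical: an Import with no matching Export.
    'Field': ['Import', 'Export'] * 4 + ['Import'],
    'Value': [100, 50, 80, 10, 20, 5, 30, 10, 7],
})

s = df.groupby(['Country', 'Industry', 'Field'])['Value'].sum()

# Cross-sections on the 'Field' level, each indexed by (Country, Industry).
imports = s.xs('Import', level='Field')
exports = s.xs('Export', level='Field')

# fill_value=0 treats a missing Import or Export as zero instead of producing NaN.
net = imports.sub(exports, fill_value=0)

# Flatten to the Country / Industry / Field / Value layout from the question.
target = (net.reset_index(name='Value')
             .assign(Field='Net')[['Country', 'Industry', 'Field', 'Value']])
print(target)
```

With a plain subtraction instead of sub(..., fill_value=0), the Canada/Energy group would come out as NaN.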